Research Scientist · senior · machine learning · IC · 4+ yrs · Bachelor's · Remote · Posted Mar 13, 2026

About this role

GitHub is hiring a senior-level Research Scientist in its machine learning function; the position is remote. The posting calls out experience with Python, TypeScript, LLMs, and Git, plus 4+ years of relevant work. Listed education preference: a bachelor's degree or equivalent.

Role: Research Scientist
Function: machine learning
Level: senior
Track: Individual contributor
Location: United States
Work mode: Remote
Experience: 4+ years
Education: Bachelor's degree
Department: Engineering
Posted: Mar 13, 2026
AI Summary
Design and build next-generation LLM evaluation frameworks for code generation, reasoning, and agentic workflows at GitHub Copilot. Lead model quality strategy, develop scalable metrics and benchmarking systems, and mentor researchers. Requires deep LLM evaluation expertise, strong engineering instincts, and 4-8+ years of data science experience, depending on degree level.

Job description

from GitHub careers
About GitHub

GitHub is the world’s leading platform for agentic software development — powered by Copilot to build, scale, and deliver secure software. Over 180 million developers, including more than 90% of the Fortune 100 companies, use GitHub to collaborate, and more than 77,000 organisations have adopted GitHub Copilot.

Locations

This role is remote within the United States.

Overview

At GitHub, we’re building the next generation of AI‑powered developer experiences. We’re looking for a Staff Applied Researcher with deep expertise in Large Language Model (LLM) evaluation, LLM agents, strong engineering instincts, and a bias for action to help shape the future of GitHub Copilot and our AI platform.

This is a high‑impact role where you will design evaluation systems that directly influence how millions of developers experience AI every day.

Responsibilities

Lead Model Quality & Evaluation

  • Design next‑generation evaluation frameworks for code generation, reasoning, safety, multimodal tasks, and agentic workflows.

  • Develop scalable automatic metrics, LLM‑judge systems, reward models, and human‑in‑the‑loop evaluation pipelines.

  • Establish high‑signal, repeatable methodologies that influence product decisions across GitHub AI.

Drive Applied Research & Engineering

  • Build and optimize evaluation tooling, datasets, benchmarking systems, and experimentation pipelines.

  • Create and onboard new benchmarks for the hardest tasks facing coding agents.

This is an excerpt; the full job description is on GitHub careers.