Research Engineer (Agentic Models)
JetBrains · Amsterdam, Netherlands | Belgrade, Serbia | Berlin, Germany | Limassol | London, United Kingdom | Madrid, Spain | Munich, Germany | Paphos | Prague, Czech Republic | Remote (Germany) | Warsaw, Poland | Yerevan · JCP Core Machine Learning
About this role
JetBrains is hiring a mid-level Research Scientist for a remote position in its machine learning function. The posting calls out experience with Python, Kubernetes, Airflow, and PyTorch.
- Role: Research Scientist
- Function: machine learning
- Level: mid
- Track: Individual contributor
- Employment: Full-time
- Location: Amsterdam, Netherlands | Belgrade, Serbia | Berlin, Germany | Limassol | London, United Kingdom | Madrid, Spain | Munich, Germany | Paphos | Prague, Czech Republic | Remote (Germany) | Warsaw, Poland | Yerevan
- Work mode: Remote
- Department: JCP Core Machine Learning
Job description (from JetBrains careers)
At JetBrains, code is our passion. Ever since we started, back in 2000, we’ve been striving to make the strongest, most effective developer tools on earth. Today, AI-powered assistance and agents are becoming a core part of how developers work in our IDEs.
We’re building multi-step coding agents that can understand large codebases, plan changes, call tools, and iterate with the user. As a Research Engineer in the Agentic Models team, you’ll be responsible for the models, training loops, and evaluation pipelines that power these agents.
You’ll work at the intersection of post-training (both SFT and RL-style) and product-driven evaluation, using our distributed GPU and MapReduce clusters to ship models into JetBrains products.
As part of our team, you will:
- Design, implement, and maintain SFT and RL post-training pipelines for multi-step coding agents.
- Train and adapt LLMs for agent workflows, including planning, tool use, and multi-step interactions inside JetBrains IDEs.
- Build and extend evaluation and simulation environments where coding agents can act, be measured, and compared on realistic developer tasks.
- Design evaluation frameworks and metrics for agent behavior, analyze traces and logs, and close the loop from evaluation back into training, data, and reward design.
- Analyze training and evaluation results to propose and implement improvements to model architectures, training recipes, and datasets.