Machine Learning Research Scientist, Post-Training
Scale AI · San Francisco, CA | Seattle, WA | New York City, NY · Research
About this role
Scale AI is hiring a mid-level AI Research Scientist in its machine learning function, based in San Francisco, CA; Seattle, WA; or New York City, NY. The posting calls out experience with LLMs, deep learning, reinforcement learning, and machine learning. Compensation is listed at $252,000–$315,000 per year.
- Role: AI Research Scientist
- Function: Machine learning
- Level: Mid
- Track: Individual contributor
- Employment: Full-time
- Location: San Francisco, CA | Seattle, WA | New York City, NY
- Department: Research
Job description
from Scale AI careers
Scale works with the industry’s leading AI labs to provide high-quality data and accelerate progress in GenAI research. We are looking for Research Scientists and Research Engineers with expertise in LLM post-training (SFT, RLHF, reward modeling). This role will focus on optimizing data curation and evaluation to enhance LLM capabilities in both text and multimodal modalities.
In this role, you will develop novel methods to improve the alignment and generalization of large-scale generative models. You will collaborate with researchers and engineers to define best practices in data-driven AI development. You will also partner with top foundation model labs to provide both technical and strategic input on the development of the next generation of generative AI models.
You will:
- Research and develop novel post-training techniques, including SFT, RLHF, and reward modeling, to enhance LLM core capabilities in both text and multimodal modalities.
- Design and experiment with new approaches to preference optimization.
- Analyze model behavior, identify weaknesses, and propose solutions for bias mitigation and model robustness.
- Publish research findings in top-tier AI conferences.
Ideally you’d have:
- Ph.D. or Master's degree in Computer Science, Machine Learning, AI, or a related field.
- Deep understanding of deep learning, reinforcement learning, and large-scale model fine-tuning.
- Experience with post-training techniques such as RLHF, preference modeling, or instruction tuning.