Machine Learning Engineer
Together AI · San Francisco, CA · Engineering
Mid-level · Individual contributor · 5+ yrs · Bachelor's
$160,000 – $220,000 USD per year
About this role
Together AI is hiring a mid-level Machine Learning Engineer based in San Francisco, CA. The posting calls for expert-level skill in one or more of Python, Go, Rust, or C/C++, familiarity with the LLM inference ecosystem, and 5+ years of relevant experience. Listed education preference: a bachelor's degree or equivalent industry experience. Compensation is listed at $160,000–$220,000 per year.
- Role: Machine Learning Engineer
- Function: Machine learning
- Level: Mid
- Track: Individual contributor
- Employment: Full-time
- Location: San Francisco, CA
- Experience: 5+ years
- Education: Bachelor's degree
- Department: Engineering
AI Summary
Develop production inference and fine-tuning systems for LLMs at scale. Requires 5+ years building high-performance, distributed systems with expertise in LLM inference frameworks (vLLM, SGLang, TRT) and proficiency in Python, Go, Rust, or C/C++.
Job description
from Together AI careers
About the Role
Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models ranging from simple models up to the largest LLMs.
Requirements
- 5+ years of experience writing high-performance, well-tested, production-quality code
- Bachelor’s degree in computer science or equivalent industry experience
- Familiarity with the LLM inference ecosystem, including frameworks and engines (e.g. vLLM, SGLang, TRT, ...)
- Demonstrated experience in building large-scale, fault-tolerant distributed systems such as storage, search, and computation
- Expert-level programming skills in one or more of Python, Go, Rust, or C/C++
- Experience implementing runtime inference services at scale or similar
Responsibilities
- Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale
- Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
- Analyze and improve efficiency, scalability, and stability of various system resources
- Conduct design and code reviews
- Create services, tools, and developer documentation
- Create testing frameworks for robustness and fault-tolerance
- Participate in an on-call rotation to respond to critical incidents as needed
This is an excerpt; the full job description is on Together AI careers.