Engineering Manager, Agent Prompts & Evals
Anthropic · San Francisco, CA | New York City, NY · Engineering & Design - Product
About this role
Anthropic is hiring an Engineering Manager in its software engineering function, based in San Francisco, CA or New York City, NY. The posting calls out experience with LLMs, prompt engineering, CI/CD, and system design. Compensation is listed at $320,000–$405,000 per year.
- Role: Engineering Manager
- Function: software engineering
- Level: manager
- Track: hybrid
- Employment: Full-time
- Location: San Francisco, CA | New York City, NY
- Department: Engineering & Design - Product
Job description
from Anthropic careers
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
Anthropic is looking for an Engineering Manager to lead the Agent Prompts & Evals team. This team owns the infrastructure that lets Anthropic ship model and prompt changes with confidence — the eval frameworks, system prompt pipelines, and regression-detection systems that every model launch depends on.
When a new Claude model is ready to ship, this team is the one answering “is it actually better in our products?” When a product team wants to change how Claude behaves, this team owns the tooling that tells them whether they broke something. It’s a platform team whose platform is model behavior itself.
The team sits deliberately at the seam between product engineering and research. You’ll partner closely with other evals groups across the company on shared infrastructure and methodology, with product teams who are shipping features on top of Claude, and with the TPMs and research PMs driving model launches. The pace is set by the model release cadence, and the team operates as both a platform owner and a hands-on partner during launch periods.