Safeguards Enforcement Analyst, Safety Evaluations
Anthropic · Remote | San Francisco, CA · Safeguards (Trust & Safety)
About this role
Anthropic is hiring a mid-level Security Analyst for a remote position. The posting calls out experience with SQL, incident response, and OpenAI. Compensation is listed at $230,000–$270,000 per year.
- Role: Security Analyst
- Function: security
- Level: mid
- Track: Individual contributor
- Employment: Full-time
- Location: Remote | San Francisco, CA
- Work mode: Remote
- Department: Safeguards (Trust & Safety)
Job description
from Anthropic careers
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
Anthropic's Safeguards team is responsible for enforcing our policies, protecting users, and ensuring our platform is not misused. As a Safeguards Enforcement Analyst focused on Safety Evaluations, you'll play a central role in ensuring our models meet safety and policy standards before and after launch. You'll run and monitor evaluations, drive mitigations when issues surface, coordinate the creation of new evals, and help build the processes and documentation that allow the team to scale this work over time.
This role requires someone who is detail-oriented, comfortable navigating ambiguity, and capable of coordinating across teams to break new ground and drive work to completion. This work is deeply cross-functional — you'll partner closely with policy experts, Safeguards engineering teams, and many other stakeholders throughout the organization to ensure our evaluations are comprehensive and current, and that findings translate into meaningful improvements to model behavior.