Program Manager (mid-level, operations, individual contributor)
$150,000 – $175,000
USD per year

About this role

Together AI is hiring a mid-level Program Manager in the operations function based in San Francisco, CA. The posting calls out experience with Data Structures. Compensation is listed at $150,000–$175,000 per year.

Role: Program Manager
Function: operations
Level: mid
Track: Individual contributor
Employment: Full-time
Location: San Francisco, CA
Department: Product

Job description

from Together AI careers

About the Role

Together AI runs one of the most demanding GPU fleets in the industry. Keeping that fleet healthy - every node online, every GPU performing, every datacenter transition running on schedule - is operationally complex and genuinely high-stakes. We're looking for a Junior TPM to own that operational reality.

This is not a coordination or status-reporting role. You will own the end-to-end node lifecycle - from the moment a node goes down through repair, return, and re-integration - and you'll drive the cross-functional work to close every gap as fast as possible. You'll manage datacenter bring-ups, hunt down GPU utilization loss, and build the processes and dashboards that make our fleet operations more visible and accountable over time.

The environment moves fast and doesn't always come with a clear playbook. Much of what you'll work on is genuinely novel - you'll be figuring things out alongside engineers who are building at the frontier. If that sounds like an obstacle, this isn't the right role. If it sounds like the best possible way to learn, keep reading.

Responsibilities

  • Own the end-to-end node lifecycle - from failure through repair, return, and re-integration - across provider ticketing, internal tooling, and the state machine that governs each stage

This is an excerpt. Read the full job description on Together AI careers.