Software Engineer, Hardware Health
OpenAI · San Francisco, CA · Scaling
About this role
OpenAI is hiring a mid-level Software Engineer based in San Francisco, CA. The posting calls out experience with Python, SQL, Bash, Linux. Compensation is listed at $250,000–$445,000 per year.
- Role
- Software Engineer
- Function
- software engineering
- Level
- mid
- Track
- Individual contributor
- Employment
- Full-time
- Location
- San Francisco, CA
- Department
- Scaling
- Posted
- May 11, 2026
More roles at OpenAI
Job description
from OpenAI careersAbout the Team
The Hardware Health and Observability team owns the end-to-end health lifecycle of OpenAI’s global compute fleet.
Our mission is to maximize healthy, usable compute across accelerator vendors, generations, cloud providers, and regions through reliable health signals, automated remediation, and scalable operational tooling.
We build the systems that observe, detect, remediate, and verify hardware issues across GPUs, CPUs, networking, and platform infrastructure, enabling frontier model training and inference workloads to run reliably at hyperscale. We are the last line of defense for the success of OAI’s production and research workloads.
About the Role
On the Hardware Health and Observability team, you’ll build critical infrastructure that keeps OpenAI’s largest compute clusters healthy and operational at scale.
Even small numbers of unhealthy systems can impact large-scale training and inference workloads. This team focuses on minimizing downtime, improving fleet efficiency, and ensuring compute resources remain continuously available to researchers and product teams.
Engineers on this team own problems end-to-end, from defining health signals and debugging failures to building automated remediation systems that operate across millions of GPUs globally.
In this role, you will:
Define and maintain health signals across GPUs, CPUs, networking, and platform infrastructure.
Build and evolve health checks that detect, remediate, and verify failures at scale.
This is an excerpt. Read the full job description on OpenAI careers →