Senior Manager, Site Reliability Engineering
Nvidia · Santa Clara, CA
About this role
Nvidia is hiring a senior-level Technical Lead in the software engineering function, based in Santa Clara, CA. The posting calls out experience with DevOps, ITIL, Configuration Management, and Observability.
- Role: Technical Lead
- Function: software engineering
- Level: senior
- Track: hybrid
- Employment: Full-time
- Location: Santa Clara, CA
- Posted: Apr 20, 2026
Job description
from Nvidia careers
For over 25 years, NVIDIA has been at the forefront of transforming computer graphics, PC gaming, and accelerated computing, driven by a legacy of continuous innovation and exceptional talent. We are now leveraging the immense potential of AI to usher in the next era of computing, where our GPUs power the "brains" of computers, robots, and autonomous vehicles that can comprehend the world. This pioneering work demands vision, innovation, and the world's best talent. Join our diverse and supportive environment, where NVIDIANs are inspired to excel and make a profound global impact.
NVIDIA is seeking a Senior Manager of Site Reliability Engineering to lead and reshape how IT operations function at scale. This role goes beyond traditional service management to build AI-powered systems that enhance reliability, speed, and employee experience. We offer an outstanding opportunity to lead and refine Incident, Problem, and Change Management into an intelligent, automated operating model using observability, AI insights, and orchestration. This leader will apply strong operational execution with an SRE attitude, facilitating the move from reactive processes to predictive and autonomous operations.
What you’ll be doing
Manage the full lifecycle of Incident, Problem, and Change Management as a 24×7 operational function, ensuring high reliability and minimal business disruption.
This is an excerpt. Read the full job description on Nvidia careers →