Principal Software Engineer, Tech Leadership · Posted May 18, 2026

About this role

Nvidia is hiring a principal-level Software Engineer based in Santa Clara, CA. The posting calls for experience with Python, Kubernetes, Linux, and distributed systems.

Role: Software Engineer
Function: Software engineering
Level: Principal
Track: Tech leadership
Employment: Full-time
Location: Santa Clara, CA
Posted: May 18, 2026

Job description

from Nvidia careers

NVIDIA DGX Cloud is scaling GPU infrastructure across internal, partner, and cloud environments. We are looking for Principal Software Engineers to help shape the technical direction for production engineering, Kubernetes-based operations, automation, and reliability across large-scale GPU clusters.

This role is for senior technical leaders who can define architecture, lead through influence, build critical systems, and turn ambiguous infrastructure problems into durable software and operating models.

What you’ll be doing:

  • Define and execute the technical strategy for DGX Cloud cluster operations, building the automation, GitOps, and Day 2 reliability needed to operate large-scale GPU clusters across NVIDIA Cloud Partners (NCPs) and on-prem environments.

  • Lead design and implementation of systems for cluster lifecycle, validation, repair, upgrades, observability, and readiness.

  • Establish patterns for Kubernetes-based GPU cluster operations across partner and on-prem environments.

  • Identify and eliminate operational toil through software, APIs, automation, and agent-assisted workflows.

  • Set technical standards for production readiness, SLOs, incident response, handoff gates, and operational acceptance.

  • Mentor engineers and influence platform, infrastructure, storage, networking, security, and workload teams.

What we need to see:

  • 15+ years of experience building and operating large-scale distributed systems or cloud infrastructure.

  • Deep experience with Kubernetes, Linux, infrastructure automation, and production operations.

  • Strong programming experience in Go, Python, or similar.

This is an excerpt. The full job description is available on Nvidia careers.