Senior · Software Engineering · Infrastructure Engineer · IC · 5+ yrs · Bachelor's · Posted Apr 20, 2026

Skills

DevOps, Observability, Python, Java, Kubernetes, Kafka, Spark, LLMs, Prometheus, Distributed Systems, System Design, Data Modeling, ETL, Machine Learning, Performance Optimization, Backend Development, Cloud Computing, OpenTelemetry

About this role

Nvidia is hiring a senior-level Infrastructure Engineer in the software engineering function, based in Santa Clara, CA. The posting calls for experience with DevOps, Observability, Python, and Java, plus 5+ years of relevant work. Listed education preference: a bachelor's degree or equivalent.

Role: Infrastructure Engineer
Function: software engineering
Level: senior
Track: Individual contributor
Employment: Full-time
Location: Santa Clara, CA
Experience: 5+ years
Education: Bachelor's degree
Posted: Apr 20, 2026

AI Summary

Design and scale high-throughput observability platforms handling metrics, logs, and traces across distributed AI/HPC environments. Build backend telemetry services, develop OpenTelemetry collectors, and optimize time-series pipelines. Requires 5+ years of backend/distributed systems experience, strong Python/Go/Java skills, and hands-on observability architecture knowledge.

Job description

from Nvidia careers

NVIDIA is a pioneer in accelerated computing, known for inventing the GPU and driving breakthroughs in gaming, computer graphics, high-performance computing, and artificial intelligence. Our technology powers everything from generative AI to autonomous systems, and we continue to shape the future of computing through innovation and collaboration. Within this mission, our team, Managed AI Superclusters (MARS), builds and scales the infrastructure, platforms, and tools that enable researchers and engineers to develop the next generation of AI/ML systems. By joining us, you’ll help design solutions that power some of the world’s most advanced computing workloads.

Observability is at the heart of this transformation. We are looking for a strong AI & HPC Observability Engineer to build and scale next-generation Observability and Telemetry platforms. You will design and develop high-throughput, reliable telemetry pipelines and modern data infrastructure. This role requires solid distributed systems fundamentals, production-grade coding, and a passion for operational excellence.

What You Will Be Doing:

Design and scale observability platforms handling high-volume metrics, logs, and traces across distributed environments
Build high-performance backend services for telemetry ingestion, processing, and routing
Develop and extend OpenTelemetry collectors, processors, exporters, and instrumentation libraries
Build and optimize metrics pipelines using large-scale time-series storage systems
Design and operate real-time and batch telemetry pipelines using streaming and distributed data technologies
This is an excerpt. Read the full job description on Nvidia careers →


Senior AI and HPC Observability Engineer
