Senior Staff Engineer – Training Software Release & Performance Infrastructure
AMD · Hyderabad, India · Engineering
About this role
AMD is hiring a senior-level Infrastructure Engineer in the software engineering function, based in Hyderabad, India. The posting calls out experience with Docker, PyTorch, LLMs, and deep learning.
- Role: Infrastructure Engineer
- Function: software engineering
- Level: senior
- Track: Tech leadership
- Location: Hyderabad, India
- Department: Engineering
- Posted: Mar 27, 2026
Job description
from AMD careers
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.
THE ROLE:
AMD is looking for a highly motivated senior individual contributor to help build and scale a training software release and performance validation capability in Hyderabad. You will own critical pieces of infrastructure and execution for training software stack releases, nightly performance validation, and regression triage for large-scale AI workloads on AMD Instinct™ accelerators.
THE PERSON:
This role is pivotal to ensuring the quality, stability, and performance competitiveness of AMD’s AI training software ecosystem across PyTorch, JAX, Megatron-LM, Torchtitan, and related frameworks. You will work hands-on across CI, benchmarking, automation, triage, and cross-stack debugging spanning frameworks, ROCm components, kernels, drivers, and compilers.