Systems Development Engineer, Research Compute Platform, Fauna
Amazon · New York City, NY · Systems, Quality, & Security Engineering
About this role
Amazon is hiring a mid-level Systems Engineer in the operations function, based in New York City, NY. The posting calls out experience with Python, Bash, CUDA, and AWS. Compensation is listed at $142,300–$192,400 per year.
- Role: Systems Engineer
- Function: operations
- Level: mid
- Track: Individual contributor
- Employment: Full-time
- Location: New York City, NY
- Department: Systems, Quality, & Security Engineering
- Posted: May 13, 2026
Job description
From Amazon careers

We are seeking a Systems Development Engineer to own the research compute platform for Fauna Robotics. You will build and operate the physical and virtual infrastructure that our ML scientists use to train reinforcement learning policies for real robots, from fleet provisioning and job scheduling to cloud burst capacity and environment reproducibility. This role requires both strong systems engineering fundamentals and genuine comfort working alongside researchers. The ideal candidate is as happy diagnosing a GPU thermal fault as they are designing a job scheduler, and treats “the scientist’s training run just works” as the north star for everything they build.

Key job responsibilities
- Own on-prem GPU compute end-to-end: provisioning, imaging, driver and CUDA management, monitoring, failure diagnosis, hardware RMA, and capacity planning
- Build and operate a job scheduling layer (Slurm, Ray, SkyPilot, or equivalent) so scientists submit training runs without managing individual machines
- Design and implement the bridge between on-prem and cloud compute
- Partner directly with ML scientists to triage training issues, profile workloads, identify bottlenecks, and advise on how to structure training for the hardware at hand

About the team
Fauna Robotics, an Amazon company, is building capable, safe, and genuinely delightful robots for everyday…