About this role

Together AI is hiring a senior Infrastructure Engineer for its software engineering function, based in San Francisco, CA. The posting asks for experience with Go, CUDA, AWS, and GCP, plus 5+ years of relevant work. Compensation is listed at $160,000–$230,000 per year.

Role: Infrastructure Engineer
Function: Software engineering
Level: Senior
Track: Individual contributor
Employment: Full-time
Location: San Francisco, CA
Experience: 5+ years
Department: Engineering

AI Summary

Design and build high-performance backend services for AI cloud infrastructure, managing hardware virtualization, storage provisioning, and Kubernetes/Slurm clusters. Requires 5+ years of backend development in Golang, deep Kubernetes and VM/hypervisor expertise, and experience with distributed microservice architectures.

More roles at Together AI

Senior Backend Engineer, Inference Platform

San Francisco, CA · senior

Python TypeScript Rust

Senior Data Engineer

San Francisco, CA · senior

Python TypeScript Java

Senior Developer Productivity Engineer

San Francisco, CA · senior

Python JavaScript TypeScript

Senior Machine Learning Engineer, Voice AI

San Francisco, CA · senior

Python CUDA Serverless

Senior Network Engineer

San Francisco, CA · senior

Python AWS GCP

Job description

from Together AI careers

About the Role

Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle, combining the fastest LLM inference engine with state-of-the-art AI cloud infrastructure.

As a Senior AI Infrastructure Engineer, you will play a key role in building the next-generation AI cloud platform: a highly available, global, blazing-fast cloud infrastructure that virtualizes cutting-edge ML hardware (GB200s/GB300s, BlueField DPUs) and gives state-of-the-art ML practitioners self-serve AI cloud services, such as on-demand and managed Kubernetes and Slurm clusters. This platform serves both our internal SaaS products (inference, fine-tuning) and our external cloud customers, spanning dozens of data centers across the world.

Responsibilities

Design, build, and maintain performant, secure, and highly available backend services/operators that run in our data centers and automate hardware management, such as InfiniBand partitioning, in-DC parallel storage provisioning, and VM provisioning.
Design and build out the IaaS software layer for a new GB200 data center with thousands of GPUs.
Work on a global multi-exabyte high-performance object store, serving massive datasets for pretraining.
Build advanced observability stacks for our customers with automated node lifecycle management for fault-tolerant distributed pretraining.
Perform architecture and research work for decentralized AI workloads.
Work on the core, open-source Together AI platform.

This is an excerpt from the full job description on Together AI careers.


Senior Software Engineer - Together Cloud Infrastructure