Software Engineer (mid-level, IC) · Posted Oct 11, 2025
$180,000 – $360,000 USD per year

About this role

Baseten is hiring a mid-level Software Engineer based in San Francisco, CA. The posting calls out experience with Kubernetes, LLMs, distributed systems, and API development. Compensation is listed at $180,000–$360,000 per year.

Role: Software Engineer
Function: Software engineering
Level: Mid
Track: Individual contributor
Employment: Full-time
Location: San Francisco, CA
Department: EPD
Posted: Oct 11, 2025

Job description

from Baseten careers

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to to ship AI products.

THE ROLE:

Baseten’s Model Performance (MP) team is responsible for ensuring the models running on our platform are fast, reliable, and cost‑efficient. As part of this team, you’ll focus on Model APIs, the infrastructure powering our hosted API endpoints for the latest open‑source models. This work spans distributed systems, model serving, and developer experience. You’ll join a small, high‑impact team operating at the intersection of product, model performance, and infra, helping to define how developers interact with AI models at scale.

RESPONSIBILITIES:

  • Design, build, and operate the Model APIs surface with a focus on advanced inference capabilities: structured outputs (JSON mode, grammar-constrained generation), tool/function calling, and multi-modal serving

  • Profile and optimize TensorRT-LLM kernels, analyze CUDA kernel performance, implement custom CUDA operators, tune memory allocation patterns for maximum throughput, and optimize communication patterns across multi-GPU setups

    This is an excerpt. Read the full job description on Baseten careers →