About this role

Baseten is hiring a mid-level Software Engineer based in San Francisco, CA. The posting calls out experience with Python, CUDA, Kubernetes, and Docker. Compensation is listed at $180,000–$360,000 per year.

Role: Software Engineer
Function: software engineering
Level: mid
Track: Individual contributor
Employment: Full-time
Location: San Francisco, CA
Department: EPD
Posted: Mar 28, 2024

More roles at Baseten

Software Engineer, Model Performance Systems

San Francisco, CA · mid

Python LLMs Deep Learning

Solution Architect

San Francisco, CA · mid

Machine Learning Spark LLMs

Content Engineer

San Francisco, CA · mid

Python JavaScript SQL

Software Engineer - Baseten for Labs

San Francisco, CA · mid

TypeScript React Kubernetes

Software Engineer - AI Enablement

San Francisco, CA · mid

LLMs Machine Learning AI Agents

Job description

from Baseten careers

ABOUT BASETEN

Baseten powers mission-critical inference for the world's most dynamic AI companies, like Cursor, Notion, OpenEvidence, Abridge, Clay, Gamma, and Writer. By uniting applied AI research, flexible infrastructure, and seamless developer tooling, we enable companies operating at the frontier of AI to bring cutting-edge models into production. We're growing quickly and recently raised our $300M Series E, backed by investors including BOND, IVP, Spark Capital, Greylock, and Conviction. Join us and help build the platform engineers turn to when shipping AI products.

THE ROLE

Are you passionate about advancing the application of artificial intelligence? We are looking for a Software Engineer focused on ML performance to join our dynamic team. This role is ideal for someone who thrives in a fast-paced startup environment and is eager to make significant contributions to the exciting field of LLM Inference. If you are a backend engineer who thrives on making things faster and is excited about open-source ML models, we look forward to your application.

EXAMPLE INITIATIVES

You'll get to work on these types of projects as part of our Model Performance team:

RESPONSIBILITIES

Implement, refine, and productionize cutting-edge techniques (quantization, speculative decoding, KV cache reuse, chunked prefill, and LoRA) for ML model inference and infrastructure.

This is an excerpt; the full job description is available on Baseten careers.


Software Engineer - Model Performance
