Site Reliability Engineer (mid-level, individual contributor, software engineering)

About this role

Cerebras Systems is hiring a mid-level Site Reliability Engineer in the software engineering function, based in Sunnyvale, CA or Toronto, Canada. The posting calls out experience with Python, C, C#, and LLMs.

Role: Site Reliability Engineer
Function: software engineering
Level: mid
Track: Individual contributor
Employment: Full-time
Location: Sunnyvale, CA | Toronto, Canada
Department: Performance
AI Summary
Mid-level Site Reliability Engineer characterizing and optimizing performance and reliability of AI models on Cerebras' wafer-scale hardware. Analyzes ML workloads across hardware/software layers, develops solutions to improve reliability and power efficiency, and influences next-generation architecture design through rigorous workload analysis.

Job description

from Cerebras Systems careers

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.

Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.

About The Role
Join Cerebras as a Performance & Reliability Engineer within our innovative Co-Design and Next Generation Team. Our groundbreaking CS-3 system has set new benchmarks in high-performance ML training and inference solutions. It leverages a dinner-plate-sized chip with 44 GB of on-chip memory to surpass traditional hardware capabilities. This role focuses on characterizing and optimizing the performance and reliability of state-of-the-art AI models running on Cerebras' breakthrough hardware.
This is an excerpt; the full job description is on the Cerebras Systems careers site.