Staff Site Reliability Engineer – Automation and Platform
Cerebras Systems · Remote (United States) | Sunnyvale, CA | Toronto, Canada · AI Cloud
About this role
Cerebras Systems is hiring a staff-level Site Reliability Engineer in its software engineering function. The position is remote. The posting calls out experience with C#, LLMs, CI/CD, and Prometheus.
- Role: Site Reliability Engineer
- Function: Software engineering
- Level: Staff
- Track: Tech leadership
- Employment: Full-time
- Location: Remote (United States) | Sunnyvale, CA | Toronto, Canada
- Work mode: Remote
- Department: AI Cloud
Job description
From Cerebras Systems careers
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order-of-magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
About the Role
We are building a high-performance SRE function to support one of the world’s fastest-growing AI inference services, powered by the Wafer-Scale Engine (WSE). This team will help deliver world-class, ultra-reliable inference infrastructure for leading model builders such as OpenAI and other frontier labs.