Operations Engineer, Fleet Reliability
CoreWeave · New York City, NY | Bellevue, WA | Sunnyvale, CA | Richmond, VA · Technology
About this role
CoreWeave is hiring a mid-level Site Reliability Engineer in its software engineering function, based in New York City, NY; Bellevue, WA; Sunnyvale, CA; or Richmond, VA. The posting calls out experience with Python, C, Bash, and Spring. Compensation is listed at $83,000–$110,000 per year.
- Role: Site Reliability Engineer
- Function: Software engineering
- Level: Mid
- Track: Individual contributor
- Employment: Full-time
- Location: New York City, NY | Bellevue, WA | Sunnyvale, CA | Richmond, VA
- Department: Technology
Job description
from CoreWeave careers
What You'll Do:
The Fleet Reliability Operations team is responsible for the day-to-day provisioning, management and uptime of CoreWeave’s ever-expanding fleet of server nodes. Playing a central role in CoreWeave’s growth strategy, this team is on the front line for configuration, updates and remote troubleshooting of our highest tier of supercomputing clusters and of the networking, delivery platforms and tools they depend on. You will be in a daily battle with the forces of entropy to maximize the number of nodes CoreWeave can deliver to customers.
We are seeking curious, creative and persistent problem solvers to join our Fleet Reliability Operations team to help us drive batches of server nodes through our provisioning and validation processes while efficiently and effectively troubleshooting node or cluster problems as they arise. This individual will join a team of committed engineers working to deploy nodes as fast as they can be racked and turned on.