Ceph Cluster Development Engineer (C++ Focus)
Fortinet · Santa Clara, CA · Software Development
About this role
Fortinet is hiring a mid-level Software Engineer based in Santa Clara, CA. The posting calls for experience with Python, C++, Kubernetes, and Ansible. Compensation is listed at $179,000–$219,000 per year.
- Role: Software Engineer
- Function: Software engineering
- Level: Mid
- Track: Individual contributor
- Employment: Full-time
- Location: Santa Clara, CA
- Department: Software Development
- Posted: Nov 10, 2025
Job description
From Fortinet careers
We are seeking a highly skilled Ceph Cluster Development & Operations Engineer with strong expertise in C++ systems programming to design, extend, and maintain enterprise-scale Ceph distributed storage clusters. The role involves deep development in Ceph core subsystems (RADOS, OSD, RGW, MDS), performance optimization, and operational excellence across multi-site, multi-zone architectures.
You will work closely with system architects, SREs, and cloud infrastructure teams to ensure the reliability, scalability, and security of mission-critical storage systems deployed across multiple data centers and Kubernetes environments.
Key Responsibilities
- Design, build, and operate large-scale Ceph clusters, including RADOS, RGW, and RBD.
- Contribute to or extend Ceph core components written in C++ (e.g., OSD, RGW, librados, BlueStore, MGR modules).
- Profile and optimize performance across network, disk I/O, and replication layers (PG placement, CRUSH rules, BlueStore tuning).
- Develop automation and tooling for cluster lifecycle management (deployment, upgrades, scaling, failover, and recovery).
- Integrate Ceph with Kubernetes (via Rook-Ceph, CSI drivers) and CI/CD pipelines for continuous delivery.
- Implement and validate multi-site replication and disaster recovery architectures for high availability.
- Develop and maintain secure storage solutions using dm-crypt, KMS integration, and CephX authentication.
- Build observability pipelines using Prometheus, Grafana, and custom exporters for metrics and health analytics.
- Write and maintain SOPs, automation scripts, and system documentation to support production-grade operations.