Site Reliability Engineer
Recorded Future · Gothenburg, Sweden · MA R&D
About this role
Recorded Future is hiring a mid-level Site Reliability Engineer for its software engineering function, based in Gothenburg, Sweden. The posting calls out experience with AWS, Kubernetes, Terraform, and MongoDB.
- Role: Site Reliability Engineer
- Function: Software engineering
- Level: Mid
- Track: Individual contributor
- Employment: Full-time
- Location: Gothenburg, Sweden
- Department: MA R&D
Job description
from Recorded Future careers
With 1,000+ intelligence professionals serving over 1,900 clients worldwide, Recorded Future is the world’s most advanced, and largest, intelligence company!
Recorded Future is seeking a highly motivated and experienced Site Reliability Engineer (SRE) to join our growing team. In this role, you will be instrumental in ensuring the reliability, scalability, and performance of our critical systems. You will work closely with development teams to build and maintain robust infrastructure, implement automation, and foster a culture of operational excellence. This position requires a strong understanding of cloud environments, observability, and infrastructure as code principles.
What You'll Do:
- Ensure the platform's performance, capacity, scalability, reliability, resiliency, security, compliance, support, and cost efficiency, and ensure it meets its SLAs, SLOs, RPOs, and RTOs, either directly or in collaboration with other teams.
- Make systemic improvements, both proactively and in response to recurring issues.
- Perform comprehensive root cause analysis for outages.
- Design, implement, and maintain scalable and reliable infrastructure on AWS.
- Develop and manage observability solutions using tools such as Grafana, ELK (Elasticsearch, Logstash, Kibana), and Prometheus to monitor system health and performance.
- Automate infrastructure provisioning and configuration using Terraform and Chef.
- Participate in a 24/7 on-call rotation to respond to and resolve production incidents.
- Collaborate with engineering teams to ensure applications are designed for high availability and resilience.