Overview
Location: Fully remote.
Schedule: Full-time, with European time zone availability.
Job Purpose
As a Senior DevOps Engineer, you will play a key role in designing, building, and maintaining scalable Platform-as-a-Service (PaaS) solutions. Your expertise in runtime environments, troubleshooting, and cloud infrastructure will be essential in shaping the next Runtime for Data and Product Engineering. You will collaborate closely with the team to ensure reliability, observability, and automation, using tools such as AWS, Terraform, Grafana, Prometheus, and Kibana. In a culture of shared ownership, you will contribute to a high-performing and resilient platform.
Experience & Qualifications
- Extensive experience with runtime environments and infrastructure management.
- Strong troubleshooting skills, with the ability to diagnose and resolve complex system issues.
- Proven experience with AWS and Terraform for cloud infrastructure automation.
- Hands-on expertise in Observability and Alerting, using tools such as Grafana, Prometheus, and Kibana.
- Commitment to shared ownership, actively contributing to the team’s mission and fostering a culture of accountability.
- Advanced English (written and spoken) for effective communication with cross-functional teams and stakeholders.
- Excellent problem-solving skills, with the ability to translate business requirements into scalable technical solutions.
Technical Skills
- Kubernetes (5+ years): Proven expertise in cluster management, networking, and security.
- AWS (5+ years): Hands-on experience with core services (EC2, S3, VPC, IAM, RDS, Lambda), scalability, and cost optimization.
- Monitoring & Alerting (Grafana, Prometheus, Kibana) (5+ years): Strong ability to set up dashboards and alerts and to analyze system performance.
- Docker (4+ years): Skilled in container image management, deployment strategies, and best practices.
- Terraform (4+ years): Proficient in IaC, automation workflows, and best practices for environment provisioning.
- Debugging & Troubleshooting (4+ years): Adept at diagnosing complex issues, performing root cause analysis, and optimizing system performance.
- Python (2+ years): Scripting, automation, and basic tool development.
- GitHub & GitHub Actions (3+ years): Familiarity with version control workflows, CI/CD pipelines, and repository management.
- REST API: Basic understanding of HTTP methods, status codes, and integration with third-party services.
- Go: Ability to read and write basic Go code for debugging or contributing to Go-based tools.