Job
- Level: Senior
- Job field: IT, DevOps, Back End
- Employment: Full-time
- Contract type: Permanent employment
- Location: Berlin
- Work model: Hybrid, Onsite
Job summary
In this position, you will develop the observability strategy, optimize logging, metrics, and tracing, lead large-scale reliability initiatives, and improve the incident management processes on our platform.
Your role in the team
- We are looking for a Senior Site Reliability Engineer to join the Core Reliability & Observability team in Platform Engineering.
- Your mission will be to shape Doctolib's observability strategy and ensure our platform remains reliable, debuggable, and scalable at European scale.
- You will work in a feature team developing logging, metrics, tracing, and alerting capabilities, contributing directly to supporting 400,000 health professionals and 80 million patients in their daily healthcare journey.
- Working in the tech team at Doctolib means building innovative products and features to improve the daily lives of care teams and patients.
- Your responsibilities include but are not limited to:
- Lead the observability strategy across the platform, with an emphasis on building scalable, developer-friendly logging and tracing capabilities.
- Identify and lead large-scale cross-cutting reliability initiatives, including improvements to our incident detection, response, and postmortem analysis capabilities.
- Take part in the on-call rotation, and actively contribute to improving our on-call experience by refining alerting, reducing noise, and ensuring actionable telemetry.
What we expect from you
Qualifications
- You'll be a great fit if you:
- Have a solid understanding of containerization and orchestration technologies (Docker and Kubernetes).
- Have a strong understanding of Helm for managing Kubernetes manifests and ArgoCD for GitOps workflows.
- Have deep expertise in observability tooling and architecture, such as: Logging: Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Logstash, Vector; Tracing: OpenTelemetry or proprietary APMs; Metrics: Prometheus, Thanos, Datadog, or equivalent.
- Have proficiency in at least one programming language (Ruby, Python, Go, Java, etc.) and a deep understanding of infrastructure as code principles.
- Like troubleshooting performance issues in complex environments.
- Are fluent in English.
- It would be fantastic if you:
- Have worked in a high-growth tech environment.
Experience
- Have solid hands-on experience (3+ years) on a large-scale production platform.
- Have proven experience with cloud platforms such as AWS, Azure or Google Cloud.
- Have experience with monitoring and observability tools.
- Have experience contributing to open-source observability projects.
- Are passionate about developer experience and platform engineering.
What we offer
- Free comprehensive health insurance (basic package) for you and your children.
- 25 days of paid vacation per year, plus up to 14 days of RTT.
- Free mental health and coaching services through our partner Moka.care.
- Work from abroad for up to 10 days per year thanks to our flexibility days policy.
- Lunch vouchers (Swile card) worth €8.50 per working day, with €4.50 covered by Doctolib.
- A subsidy from the works council to refund part of the membership fee for a sports club or a creative class.
- 50% reimbursement of your public transport subscription.
- Parent Care Program: receive one additional month of leave on top of the legal parental leave.
- Enrollment in Doctolib's long-term employee value sharing plan called DoctoGrowth.
- For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support.
- Relocation support in case of international mobility.
- Access to the best AI tools for coding, development and dedicated training.
About your employer
Doctolib
Doctolib is a leading startup in the digital health industry that provides a user-friendly platform for managing medical appointments. The solution makes communication between doctors and patients more efficient.
Description
- Company type: Startup
- Work model: Hybrid, Onsite
- Industry: Healthcare, social services