




Summary: Seeking an Observability Engineer to lead and evolve monitoring capabilities across complex hybrid environments, drawing on deep technical expertise and leadership skills.

Highlights:

1. Lead and evolve monitoring and observability capabilities
2. Design and implement observability solutions using various tools
3. Provide technical leadership and mentorship to junior engineers

**About the Role**

**Hybrid position based in our Guadalajara offices.**

We are seeking a highly skilled and experienced Observability Engineer to lead and evolve our monitoring and observability capabilities across complex, hybrid environments. This role requires deep technical expertise in tools such as Grafana, Site24x7, and Azure Monitor, and a strong understanding of Azure Cloud and the Windows and Linux operating systems. The ideal candidate will also bring leadership experience, guiding teams in implementing scalable observability strategies that drive performance, reliability, and resilience.

**Key Responsibilities**

* Design and implement observability solutions using Grafana, Site24x7, Azure Monitor, and other relevant tools.
* Develop and maintain dashboards, alerts, and metrics to monitor infrastructure, applications, and services across cloud and on-prem environments.
* Collaborate with cross-functional teams to define SLIs/SLOs and improve system reliability and performance.
* Lead initiatives to enhance telemetry, logging, and tracing across distributed systems.
* Provide technical leadership and mentorship to junior engineers and cross-functional teams.
* Troubleshoot and resolve complex issues in real time, leveraging deep knowledge of OS internals (Windows/Linux) and cloud infrastructure.
* Drive adoption of best practices in monitoring, incident response, and post-mortem analysis.
* Stay current with industry trends and emerging technologies in observability and resilience engineering.

**Required Qualifications**

* 5+ years of experience in observability, monitoring, or site reliability engineering.
* Expert-level proficiency in Grafana (including custom dashboards, integrations, and alerting).
* Hands-on experience with Site24x7, Azure Monitor, and other observability platforms.
* Strong understanding of Azure Cloud architecture, services, and deployment models.
* Deep technical knowledge of Windows and Linux operating systems, including performance tuning and diagnostics.
* Experience with scripting and automation (e.g., PowerShell, Bash, Python).
* Familiarity with containerized environments (Docker, Kubernetes) is a plus.
* Excellent communication and team leadership skills, with a track record of mentoring and guiding technical teams.

**Preferred Qualifications**

* Certifications in Azure or related technologies.
* Experience with OpenTelemetry, Prometheus, ELK stack, or similar tools.
* Background in resilience engineering or incident management frameworks.

Job Type: Full-time

Pay: $85,000.00 - $95,000.00 per month

Work Location: In person


