Principal Service Reliability Applications Developer
Negotiable Salary
Indeed
Full-time
Onsite
No experience limit
No degree limit
C. A las Cumbres 121A, Col Benito Juarez, Residencial Cordilleras, 45020 Zapopan, Jal., Mexico
Description

Own and scale mission-critical ERP/SaaS services while building intelligent, cloud-native capabilities. This role requires an SRE mindset combined with AI/ML expertise and strong application engineering skills across public and private cloud environments.

**Key Responsibilities**

- End-to-end service ownership: design for telemetry, security, resiliency, scalability, and performance; lead sizing/architecture; drive service health reviews and process simplification.
- Incident management and prevention: lead postmortems/RCAs, coordinate fixes, define repair items, and implement data-driven prevention and continuous improvement.
- AI/ML and GenAI delivery: design and integrate solutions with LLMs, RAG, agentic workflows, and conversational AI; build low-latency model serving and retraining pipelines.
- Application engineering: develop performant microservices for distributed, containerized, cloud-native systems.
- Automation: eliminate toil by automating operational workflows, recovery procedures, code delivery, and configuration management; build internal tools and reusable scripts/services to accelerate delivery and reduce errors.
- Observability: define and implement monitoring, logging, alerting, and tracing strategies; establish SLOs/SLIs/error budgets; improve diagnostics and performance visibility for rapid triage.
- Cross-functional collaboration: partner with product, operations, and data teams to translate requirements into secure, scalable solutions; communicate effectively with technical and non-technical stakeholders.

**Minimum Qualifications**

- BS/MS in Computer Science or related field; 10+ years of software engineering in cloud environments.
- Strong in distributed systems/microservices using Java/Python; SQL/data modeling; Python for AI/automation.
- SRE/DevOps expertise: systems and networking fundamentals, application security, observability, performance analysis, and incident response.
- Proven SDLC excellence: code quality, reviews, version control, CI/CD, testing, and release engineering.
- Excellent written and verbal communication; English fluency.

**Preferred/Technical Skills**

- AI/ML/GenAI: experience with foundational models, RAG, agentic architectures; model deployment, optimization, monitoring, and retraining.
- Cloud and containers: experience with containerization, orchestration, and resilient, fault-tolerant microservices.
- Observability: hands-on experience designing dashboards, alerts, traces, logs, and metrics; defining SLOs/SLIs and error budgets; on-call readiness and runbook quality.
- Operations: performance tuning across Java/Python and SQL for large-scale enterprise applications; strong Linux/Unix expertise; capacity planning and reliability reviews.
- Automation and scripting: proficiency in scripting to automate operational workflows, build tooling, and CI/CD tasks (e.g., shell scripting, Python, configuration-as-code, task runners).
- Familiarity with enterprise ERP applications and standard DevOps tooling and practices.

Source: Indeed
Juan García
Indeed · HR

Company

Indeed
© 2025 Servanan International Pte. Ltd.