Lead AI Platform Engineer
Indeed
Full-time
Onsite
No experience limit
No degree limit
79Q22222+22
Description

Summary: Seeking an experienced Lead Platform Engineer to drive AI infrastructure development, optimization, and scalability for leveraging cutting-edge AI technologies.

Highlights:

1. Drive development and optimization of enterprise AI infrastructure
2. Empower teams to leverage cutting-edge AI technologies
3. Lead multi-agent orchestration and build developer tools

We are looking for a dynamic and experienced **Lead Platform Engineer** to join our AI Enablement team. In this role, you will drive the development, optimization, and scalability of our enterprise AI infrastructure, empowering teams across the organization to leverage cutting-edge AI technologies.

**Responsibilities**

* Develop and maintain the proprietary agentic AI platform
* Manage LiteLLM as the central AI gateway with optimized routing, cost control, load balancing, and failover
* Implement and oversee robust monitoring and observability solutions using Prometheus, Grafana, and OpenTelemetry
* Design and optimize Retrieval-Augmented Generation (RAG) pipelines, including document ingestion, chunking, embeddings, and vector stores
* Develop RAG solutions on GCP and Azure with managed AI services and vector databases
* Deploy and manage AI services on Kubernetes (AKS, GKE) with automated infrastructure tools like Terraform, Helm, and GitOps
* Implement CI/CD pipelines using Jenkins, Opsera, and GitHub Actions
* Ensure system security and compliance standards are consistently met
* Enable multi-agent orchestration, building SDKs, APIs, and developer documentation
* Develop MCP servers for tool integrations and support autonomous workflows across teams

**Requirements**

* 5+ years of experience in platform engineering or DevOps with a strong infrastructure background
* 2+ years of hands-on experience in AI/ML or LLM platform development and support
* Expertise in Kubernetes and CI/CD tools alongside cloud platforms like GCP or Azure
* Proficiency in programming languages such as Python and/or TypeScript
* Background in automating infrastructure provisioning with tools like Terraform, Helm, and GitOps
* Competency in observability tooling such as Prometheus, Grafana, and OpenTelemetry
* Understanding of Retrieval-Augmented Generation (RAG) methodologies and vector databases
* Advanced proficiency in English (B2+/C1)

**Nice to have**

* Expertise in LangChain, LlamaIndex, or agent frameworks
* Familiarity with LiteLLM, MCP, and Backstage solutions
* Capability to optimize costs for LLM workloads
* Experience in building and maintaining enterprise-scale AI platforms

Source: Indeed
Juan García
Indeed · HR

Company

Indeed
© 2025 Servanan International Pte. Ltd.