




**Key Responsibilities**

* Design, implement, and automate ML lifecycle workflows using tools like **MLflow**, **Kubeflow**, **Airflow**, and **OCI Data Science Pipelines** (see the MLflow sketch below).
* Build and maintain **CI/CD pipelines** for model training, validation, and deployment using **GitHub Actions**, **Jenkins**, or **Argo Workflows**.
* Collaborate with data engineers to deploy models within **modern data lakehouse architectures** (e.g., **Apache Iceberg**, **Delta Lake**, **Apache Hudi**).
* Integrate machine learning frameworks such as **TensorFlow**, **PyTorch**, and **scikit-learn** into distributed environments like **Apache Spark**, **Ray**, or **Dask**.
* Operationalize model tracking, versioning, and drift detection using **DVC**, model registries, and ML metadata stores (see the drift-detection sketch below).
* Manage **infrastructure as code (IaC)** using tools like **Terraform**, **Helm**, or **Ansible** to support dynamic GPU/CPU training clusters.
* Configure real-time and batch data ingestion and feature transformation pipelines using **Kafka**, **GoldenGate**, and **OCI Streaming** (see the Kafka consumer sketch below).
* Collaborate with DevOps and platform teams to implement robust **monitoring**, **observability**, and **alerting** with tools like **Prometheus**, **Grafana**, and the **ELK Stack** (see the metrics sketch below).
* Support **AI governance** by enabling model explainability, audit logging, and compliance mechanisms aligned with enterprise data and security policies.

**Required Qualifications**

* Bachelor’s or Master’s degree in **Computer Science**, **Data Science**, or a related technical discipline.
* **5–8 years** of experience in **ML engineering**, **DevOps**, or **data platform engineering**, with at least **2 years in MLOps** or model operations.
* Proficiency in **Python**, particularly for automation, data processing, and ML model development.
* Solid experience with **SQL** and distributed query engines (e.g., **Trino**, **Spark SQL**).
* Deep expertise in **Docker**, **Kubernetes**, and cloud-native container orchestration tools (e.g., **OCI Container Engine**, **EKS**, **GKE**).
* Working knowledge of **open-source data lakehouse frameworks** and **data versioning** tools (e.g., **Delta Lake**, **Apache Iceberg**, **DVC**).
* Familiarity with model deployment strategies, including **batch**, **real-time inference**, and **edge deployments** (see the inference endpoint sketch below).
* Experience with **CI/CD pipelines** (GitHub Actions, GitLab CI, Jenkins) and **MLOps frameworks** (Kubeflow, MLflow, Seldon Core).
* Competence in implementing monitoring and logging systems (e.g., **Prometheus**, **ELK Stack**, **Datadog**) for ML applications.
* Strong understanding of **cloud platforms** (OCI, AWS, GCP) and **IaC tools** (Terraform, CloudFormation).

**Preferred Qualifications**

* Experience integrating AI workflows with **Oracle Data Lakehouse**, **Databricks**, or **Snowflake**.
* Hands-on experience with orchestration tools like **Apache Airflow**, **Prefect**, or **Dagster**.
* Exposure to **real-time ML systems** using **Kafka** or **Oracle Stream Analytics**.
* Understanding of **vector databases** (e.g., **Oracle 23ai Vector Search**).
* Knowledge of **AI governance**, including model explainability, auditability, and reproducibility frameworks.

**Soft Skills**

* Strong **problem-solving** skills and an automation-first mindset.
* Excellent **cross-functional communication**, especially when collaborating with data scientists, DevOps, and platform engineering teams.
* A collaborative and **knowledge-sharing** attitude with good documentation habits.
* Passion for **continuous learning**, especially in AI/ML tooling, open-source platforms, and data engineering innovation.
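
**Illustrative Sketches**

The short Python sketches below illustrate the kind of day-to-day work described above. Each is a minimal example under stated assumptions, not production code; all hostnames, experiment names, topic names, and model names are hypothetical placeholders.

First, a minimal MLflow experiment-tracking and model-registry sketch, assuming a reachable MLflow tracking server with a registry-capable backend (the URI and names are invented for illustration):

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical tracking server; point this at your own MLflow deployment.
mlflow.set_tracking_uri("http://mlflow.example.internal:5000")
mlflow.set_experiment("churn-model")  # hypothetical experiment name

X, y = make_classification(n_samples=1_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 8}
    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)

    mlflow.log_params(params)
    mlflow.log_metric("accuracy", accuracy_score(y_test, model.predict(X_test)))

    # Registering the model gives downstream CI/CD stages a versioned
    # artifact to validate and deploy.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn-classifier")
```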
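Next, a drift-detection sketch using one simple, widely used statistic, the Population Stability Index (PSI), to compare a training feature distribution against live traffic; the thresholds and synthetic data are illustrative only:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI between training (expected) and live (actual) feature values.

    Common rule of thumb: < 0.1 stable, 0.1-0.25 moderate drift, > 0.25 significant drift.
    """
    # Derive bin edges from the training distribution so both samples are
    # compared on the same grid; fold live outliers into the end bins.
    edges = np.histogram_bin_edges(expected, bins=bins)
    actual = np.clip(actual, edges[0], edges[-1])

    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)

    # Clip proportions to avoid division by zero and log(0) on empty bins.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)

    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 10_000)  # stand-in for a training feature
live_feature = rng.normal(0.3, 1.1, 10_000)   # stand-in for shifted production traffic
print(f"PSI: {population_stability_index(train_feature, live_feature):.3f}")
```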
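A streaming-ingestion sketch using the `confluent-kafka` Python client (one common choice; the broker address, group id, and topic name are placeholders), consuming raw events that a real pipeline would transform into features:

```python
import json

from confluent_kafka import Consumer

# Hypothetical broker, consumer group, and topic names.
consumer = Consumer({
    "bootstrap.servers": "kafka.example.internal:9092",
    "group.id": "feature-ingest",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["raw-events"])

try:
    while True:
        msg = consumer.poll(1.0)  # wait up to 1s for the next message
        if msg is None:
            continue
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue

        event = json.loads(msg.value())
        # A real pipeline would compute features here and write them to a
        # feature store or lakehouse table; printing keeps the sketch minimal.
        print(event)
finally:
    consumer.close()
```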
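A metrics sketch with the official `prometheus_client` library, exposing prediction-count and latency metrics that Prometheus can scrape and Grafana can chart; the metric names and stubbed model are invented:

```python
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("model_predictions_total", "Predictions served", ["model_version"])
LATENCY = Histogram("model_inference_latency_seconds", "Inference latency in seconds")

@LATENCY.time()  # records each call's duration in the histogram
def predict(features):
    time.sleep(random.uniform(0.01, 0.05))  # stand-in for real model inference
    return 1

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        predict([0.1, 0.2])
        PREDICTIONS.labels(model_version="v1").inc()
```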
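Finally, an inference endpoint sketch for the real-time deployment strategy, using FastAPI (a popular serving framework, not named in this posting); the hard-coded weights are a stand-in for a model loaded from a registry at startup:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Features(BaseModel):
    values: list[float]

# Stand-in for a real model; a production service would load a registered
# model (e.g., from MLflow) once at startup.
WEIGHTS = [0.5, -0.2, 0.1]

@app.post("/predict")
def predict(features: Features) -> dict:
    # Toy linear score over the submitted feature vector.
    score = sum(w * x for w, x in zip(WEIGHTS, features.values))
    return {"score": score}

# Run with: uvicorn main:app --port 8080  (assuming this file is main.py)
```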


