




**Key responsibilities**

* Architect, design, and optimize scalable big data solutions for batch and real-time processing.
* Develop and maintain ETL/ELT pipelines to ingest, transform, and synchronize data from diverse sources.
* Integrate data from cloud applications, on-prem systems, APIs, and streaming platforms into centralized data repositories.
* Implement and manage **data lake** and **data warehouse** solutions on cloud infrastructure.
* Ensure **data consistency, quality, and compliance** with governance and security standards.
* Collaborate with data architects, data engineers, and business stakeholders to align integration solutions with organizational needs.

**Core qualifications**

* **Programming:** Proficiency in **Python, Java, or Scala** for big data processing.
* **Big Data Frameworks:** Strong expertise in **Apache Spark**, Hadoop, Hive, Flink, or Kafka.
* **Data Platforms:** Hands-on experience with data modeling, data lakes (**Delta Lake**, Iceberg, Hudi), and data warehouses (**Snowflake**, Redshift, BigQuery).
* **ETL/ELT Development:** Expertise with tools such as Informatica, Talend, SSIS, Apache NiFi, dbt, or custom Python-based frameworks.
* **APIs & Integration:** Strong hands-on experience with REST, SOAP, and GraphQL APIs, and with integration platforms (MuleSoft, Dell Boomi, SnapLogic).
* **Data Pipelines:** Proficiency in batch and real-time integration (Kafka, AWS Kinesis, Azure Event Hubs, GCP Pub/Sub).
* **Databases:** Deep knowledge of SQL (Oracle, PostgreSQL, SQL Server) and NoSQL (MongoDB, Cassandra, DynamoDB) systems.

**Preferred experience**

* Expertise with at least one major cloud platform (AWS, Azure, GCP).
* Experience with managed data services such as AWS EMR/Glue, GCP Dataflow/Dataproc, or Azure Data Factory.
* Familiarity with containerization (Docker) and orchestration (Kubernetes).
* Knowledge of CI/CD pipelines for data engineering.
* Experience with OCI and Oracle Database (including JSON/REST and sharding) and/or Oracle microservices tooling.

**How we’ll assess**

* Systems design interview: architect a scalable service; justify data models, caching, and failure handling.
* Coding exercise: implement and optimize a core algorithm/data-structure problem; discuss trade-offs.
* Code review: evaluate readability, testing, error handling, and security considerations.
* Practical discussion: walk through a past end-to-end project, metrics/SLOs, incidents, and learnings.


