ML Engineer - Large Language Model (LLM) Training

Computrabajo
Full-time
Onsite
No experience limit
No degree limit
Merida, Yucatan, Mexico

Description

Job Summary:
We are seeking an AI specialist passionate about algorithm optimization and Spanish-language text processing to help build an innovative technology solution.

Key Highlights:
1. Foundational role with direct impact on architectural decisions
2. High autonomy and advanced technical challenge
3. Dedicated infrastructure and hardware for model training

We are assembling a select team to build an innovative technology solution in Latin America. We develop frontier AI focused on Spanish-language text processing, addressing a real-world problem with customers ready for implementation. If you are passionate about algorithm optimization, code debugging, and watching a loss curve finally converge, you are the person we are looking for.

You will design and execute the full pipeline: corpus preparation, Continual Pre-Training (CPT), Supervised Fine-Tuning (SFT), RLHF, quantization, and deployment on proprietary hardware. We offer an environment with high autonomy, responsibility, and an advanced technical challenge.

Key Responsibilities:
Prepare and tokenize large-scale Spanish-language text datasets.
Perform Continual Pre-Training on open-source base models using dedicated GPU infrastructure.
Conduct supervised fine-tuning (SFT) using LoRA and QLoRA within the HuggingFace and TRL ecosystems.
Design and operate RLHF and DPO pipelines with domain annotators.
Quantize the final model for on-premise deployment using GGUF and MLX on specific hardware.
Build retrieval-augmented generation (RAG) systems on pgvector.
Design rigorous evaluation metrics for model validation.

Essential Requirements:
Advanced proficiency in Python.
Proven hands-on experience with PyTorch and HuggingFace Transformers.
Experience fine-tuning LLMs in production environments (SFT, LoRA, QLoRA).
Fluent command-line use of Linux environments.
Experience managing large volumes of data (ETL processes, tokenization, and pipelines).
Native or advanced operational (C2) proficiency in Spanish for text evaluation.

Desirable Requirements:
Knowledge of MLX for Apple Silicon.
Experience with RLHF, DPO, and Reward Modeling.
Familiarity with tools such as Unsloth, DeepSpeed, or FSDP.
Knowledge of quantization techniques: GGUF, GPTQ, AWQ.
Experience with pgvector or vector databases.
Familiarity with llama.cpp and Ollama.

We Offer:
Competitive salary commensurate with demonstrated technical expertise.
Benefits exceeding statutory requirements.
100% remote work arrangement.
Dedicated infrastructure and hardware for model training.
Foundational role with direct impact on architectural decisions.
Flexible working hours based on objective achievement.

Selection Process:
Per platform policy, please apply directly via the button on this portal. Make sure your profile is up to date and that your attached information includes your portfolio or links to relevant code repositories (e.g., fine-tuning or training projects). Demonstrable code will be evaluated in the early stages of the process.

Requirements:
Minimum Education: Higher Education – Specialization
Experience: 6 years
Languages: Spanish, English
Age: 30 years or older
Knowledge Areas: Self-supervision, Databases, Spanish, Hardware, Artificial Intelligence, Technology Solutions

Source: Computrabajo
Mateo García
Computrabajo

Company

Computrabajo
