
Senior Machine Learning Engineer - LLM Evaluation / Task Creation (India-Based)

JOB SUMMARY

India
Posted on 1/18/2026

Roles

ML/AI Engineer

Skills & Technologies

Languages: Python
ML/AI: TensorFlow, PyTorch, MLflow
Big Data: Airflow
Cloud/DevOps: AWS, Azure, GCP, Docker

Job details

Role Description

Mercor is hiring on behalf of a leading AI research lab to bring on highly skilled Machine Learning Engineers with a proven record of building, training, and evaluating high-performance ML systems in real-world environments.

In this role, you will design, implement, and curate high-quality machine learning datasets, tasks, and evaluation workflows that power the training and benchmarking of advanced AI systems.

This position is ideal for engineers who have excelled in competitive machine learning settings such as Kaggle, possess deep modelling intuition, and can translate complex real-world problem statements into robust, well-structured ML pipelines and datasets.

You will work closely with researchers and engineers to develop realistic ML problems, ensure dataset quality, and drive reproducible, high-impact experimentation. Candidates should have 3+ years of applied ML experience or a strong record in competitive ML, and must be based in India. Ideal applicants are proficient in Python, experienced in building reproducible pipelines, and familiar with benchmarking frameworks, scoring methodologies, and ML evaluation best practices.

Responsibilities

Frame unique ML problems that enhance the ML capabilities of LLMs.
Design, build, and optimise machine learning models for classification, prediction, NLP, recommendation, or generative tasks.
Run rapid experimentation cycles, evaluate model performance, and iterate continuously.
Conduct advanced feature engineering and data preprocessing.
Implement adversarial testing, model robustness checks, and bias evaluations.
Fine-tune, evaluate, and deploy transformer-based models where necessary.
Maintain clear documentation of datasets, experiments, and model decisions.
Stay updated on the latest ML research, tools, and techniques to push modelling capabilities forward.

Required Qualifications

At least 3 years of full-time experience in machine learning model development
Technical degree in Computer Science, Electrical Engineering, Statistics, Mathematics, or a related field
Demonstrated competitive machine learning experience (Kaggle, DrivenData, or equivalent)
Evidence of top-tier performance in ML competitions (Kaggle medals, finalist placements, leaderboard rankings)
Strong proficiency in Python, PyTorch/TensorFlow, and modern ML/NLP frameworks
Solid understanding of ML fundamentals: statistics, optimisation, model evaluation, architectures
Experience with distributed training, ML pipelines, and experiment tracking
Strong problem-solving skills and algorithmic thinking
Experience working with cloud environments (AWS/GCP/Azure)
Exceptional analytical, communication, and interpersonal skills
Ability to clearly explain modelling decisions, tradeoffs, and evaluation results
Fluency in English

Preferred / Nice to Have

Kaggle Grandmaster, Master, or multiple Gold Medals
Experience creating benchmarks, evaluations, or ML challenge problems
Background in generative models, LLMs, or multimodal learning
Experience with large-scale distributed training
Prior experience in AI research, ML platforms, or infrastructure teams
Contributions to technical blogs, open-source projects, or research publications
Prior mentorship or technical leadership experience
Published research papers (conference or journal)
Experience with LLM fine-tuning, vector databases, or generative AI workflows
Familiarity with MLOps tools: Weights & Biases, MLflow, Airflow, Docker, etc.
Experience optimising inference performance and deploying models at scale

Why Join

Gain exposure to cutting-edge AI research workflows, collaborating closely with data scientists, ML engineers, and research leaders shaping next-generation AI systems.
Work on high-impact machine learning challenges while experimenting with advanced modelling strategies, new analytical methods, and competition-grade validation techniques.
Collaborate with world-class AI labs and technical teams operating at the frontier of forecasting, experimentation, tabular ML, and multimodal analytics.
Flexible engagement options (30–40 hrs/week or full-time), ideal for ML engineers eager to apply Kaggle-level problem solving to real-world, production-grade AI systems.
Fully remote and globally flexible, optimised for deep technical work, async collaboration, and high-output research environments.
