Columbia University

Machine Learning Engineer

🇺🇸 Hybrid - US

🕑 Full-Time

💰 $175K - $200K

💻 Software Engineering

🗓️ November 5th, 2025

Python CI/CD TensorFlow

Edtech.com's Summary

Columbia University is hiring a Machine Learning Engineer for its Department of Biomedical Informatics. The role focuses on developing, validating, and deploying advanced AI models, particularly generative AI, to analyze complex biomedical and clinical data while collaborating with a multidisciplinary team to create scalable AI solutions that enhance diagnosis and patient care.

Highlights
  • Develop, validate, and deploy AI and machine learning models for biomedical and clinical data analysis.
  • Collaborate with clinicians, researchers, software engineers, and informaticians to design scalable AI solutions.
  • Maintain and fine-tune foundational models using techniques like prompt engineering and instruction tuning.
  • Build data pipelines for ingestion, cleaning, and preprocessing of biomedical datasets.
  • Use programming languages such as Python and ML libraries including PyTorch, TensorFlow, and scikit-learn.
  • Experience with cloud platforms (Azure, AWS, GCP), version control (Git), and CI/CD pipelines preferred.
  • Required qualifications include a Master's degree in computer science or related field with at least 2 years of relevant experience.
  • Preferred qualifications include a PhD and knowledge of clinical data processing.
  • Salary range is $175,000 to $200,000, depending on experience and other factors.
  • Position requires excellent analytical, communication, and collaboration skills and may offer flexible and hybrid work options.

Machine Learning Engineer Full Description

  • Job Type: Officer of Administration
  • Bargaining Unit:
  • Regular/Temporary: Regular
  • End Date if Temporary:
  • Hours Per Week: 35
  • Standard Work Schedule: Monday - Friday
  • Building: PH-20
  • Salary Range: $175,000.00 - $200,000.00
The salary of the finalist selected for this role will be set based on a variety of factors, including but not limited to departmental budgets, qualifications, experience, education, licenses, specialty, and training. The above hiring range represents the University's good faith and reasonable estimate of the range of possible compensation at the time of posting.
 

Position Summary

Columbia University’s Department of Biomedical Informatics is advancing innovation at the intersection of healthcare and artificial intelligence. We are seeking a highly skilled and motivated Machine Learning Engineer with expertise in generative AI. The successful candidate will play a central role in developing, validating, and deploying advanced AI models to analyze and interpret complex biomedical and clinical data. This position involves collaborating with a multi-disciplinary team of clinicians, academic researchers, software engineers, and informaticians to translate research into scalable, production-ready AI solutions that improve diagnosis, treatment planning, and patient care.

As a Machine Learning Engineer, you will play a crucial role in translating research into practice by developing and deploying cutting-edge machine learning models to address healthcare challenges. You will be responsible for the entire lifecycle of AI projects, from data ingestion through foundational model fine-tuning, model development, and validation to deployment. You will work collaboratively with cross-functional teams, including clinicians, academic researchers, software engineers, informaticians, and product managers, to design and implement scalable AI solutions.

Subject to business needs, we may support flexible and hybrid work arrangements. Options will be discussed during the interview process.

Responsibilities

  • Collaborate with cross-functional teams to analyze biomedical and clinical datasets—including free text, structured data, and multi-modal inputs—and design and implement scalable AI solutions.
  • Maintain, fine-tune, and validate foundational models for clinical data, applying methods such as in-context learning, prompt engineering, and instruction tuning.
  • Develop robust data pipelines for data ingestion, cleaning, and preprocessing.
  • Build, evaluate, and optimize AI/ML models (foundation-based or bespoke) tailored to specific healthcare tasks.
  • Develop and maintain machine learning pipelines for production deployment.
  • Contribute to continuous monitoring of model performance and support model retraining to ensure accuracy, fairness, and clinical reliability.
  • Write, validate, and execute code across local and cloud-based environments (CPU/GPU).
  • Create clear and informative reports and visualizations to summarize data, results, and performance metrics.
  • Actively participate in stakeholder meetings, presentations, and discussions.
  • Follow best practices in documentation, code repositories, containers/environments, and version control.
  • Mentor junior engineers and data scientists as appropriate.
  • Stay abreast of emerging methods and tools in AI, NLP, and healthcare technology to ensure Columbia’s solutions remain cutting edge.
  • Perform other related duties and special projects as assigned.

Minimum Qualifications

Master's degree in computer science, informatics or related field, and/or equivalent in education and experience, with at least 2 years’ related experience.

Preferred Qualifications

PhD degree in computer science, informatics or related field, and/or equivalent in education and experience, with at least 1 year of related work experience.

Other Requirements

  • Experience working with foundational models, prompt engineering, instruction tuning, and in-context learning.
  • Experience optimizing training pipelines for scaling novel foundation model architectures to extreme datasets and model sizes over appropriate compute resources.
  • Strong programming skills in Python and experience with machine learning libraries (e.g., PyTorch, TensorFlow, scikit-learn) as well as visualization packages.
  • Experience with cloud platforms (e.g., Azure, Databricks, AWS, GCP) is a plus.
  • Experience with version control systems (e.g., Git) and continuous integration/continuous deployment (CI/CD) pipelines.
  • Experience working with clinical data is a plus.
  • Excellent analytical and problem-solving skills.
  • Strong communication and collaboration skills.

 

Equal Opportunity Employer / Disability / Veteran

Columbia University is committed to the hiring of qualified local residents.