McGraw Hill

Lead Site Reliability Engineer

🇺🇸 Remote - US

🕑 Full-Time

💰 $140K - $155K

💻 Software Engineering

🗓️ January 6th, 2026

Terraform K-12 CI/CD

Edtech.com's Summary

McGraw Hill LLC is hiring a Lead Site Reliability Engineer. This role involves leading a team of six engineers to support and enhance the reliability, scalability, and performance of K–12 digital learning platforms, using expertise in AWS, Terraform, and observability tools to maintain cloud infrastructure and collaborate with cross-functional teams.

Highlights
  • Lead a team of 6 Site Reliability Engineers supporting production infrastructure and services.
  • Manage backlog, sprint planning, and track team velocity.
  • Ensure reliability, uptime, security, cost efficiency, and performance of services.
  • Define and monitor Service Level Objectives (SLOs) for application workloads.
  • Plan on-call rotations and reduce alert fatigue.
  • Forecast seasonal growth and capacity requirements.
  • Mentor engineers and promote professional development.
  • Collaborate with development, CyberSecurity, and FinOps teams on risk mitigation and cost reduction.
  • Design, troubleshoot, and optimize highly-distributed, cloud-based production systems.
  • Maintain infrastructure-as-code and monitoring practices with tools like Terraform, AWS (ECS, RDS, EKS, IAM, CloudWatch), GitHub Actions, New Relic, and Datadog.
  • Participate in on-call rotation and resolve operational issues.
  • Support agile development processes including code reviews.
  • Requires 5+ years experience in SRE, DevOps, or Software Engineering for enterprise applications.
  • Strong problem-solving and root cause analysis skills with a systems engineering mindset.
  • Remote position open to candidates authorized to work in the U.S.
  • Annual salary range of $140,000-$155,000; compensation varies by experience and location with additional medical and other benefits available.

Lead Site Reliability Engineer Full Description

Overview

Impact the Moment

Could your creative thinking build the future? A Lead Site Reliability Engineer at McGraw Hill makes a difference for learners and educators across the world. Our team needs individuals with new ideas who connect with people in innovative ways.

How can you make an Impact?

McGraw Hill, a leading provider of digital educational resources and content, is seeking a Lead Site Reliability Engineer to lead a team of 6 Engineers for our Digital Platform Group in supporting our K-12 learning platforms. These platforms serve millions of students and educators nationwide, and you'll play a key role in ensuring their reliability, scalability, and performance. Working closely with engineering and product teams, you'll leverage your expertise in AWS, Terraform, and observability tools to drive automation, enhance resiliency, and maintain the health of our cloud-based infrastructure.

This is a remote position open to applicants authorized to work for any employer within the United States.

What you will be doing:

  • Lead a 6-member SRE team supporting production infrastructure and services
  • Manage backlog, sprint planning, and team velocity
  • Own reliability, uptime, security, cost, and performance of services
  • Define and monitor SLOs for application workloads
  • Plan on-call rotations and work to reduce alert fatigue
  • Forecast seasonal growth and capacity requirements
  • Mentor engineers and foster professional growth
  • Report status and issues to leadership monthly
  • Partner with development teams
  • Collaborate with CyberSecurity on risk mitigation
  • Collaborate with FinOps on cost reduction
  • Design and troubleshoot highly-distributed, cloud-based production systems
  • Maintain infrastructure-as-code and monitoring-as-code practices
  • Improve system resiliency through failure injection and chaos testing
  • Participate in on-call rotation and resolve operational issues
  • Optimize existing systems for performance and cost
  • Ensure telemetry provides visibility into application performance
  • Support agile development practices and code reviews

We're looking for someone with:

  • 5+ years of experience in SRE, DevOps, or Software Engineering roles supporting enterprise applications
  • Strong problem-solving, triage, and root cause analysis skills with a systems engineering mindset
  • Deep expertise in the AWS ecosystem, with hands-on experience across core services, primarily ECS, RDS, EKS, IAM, CloudWatch, and networking configurations
  • Expertise with Terraform for managing and automating scalable cloud infrastructure
  • Skill with CI/CD pipelines (e.g., GitHub Actions) and managing end-to-end software delivery lifecycles
  • Strong familiarity with telemetry and observability tools (e.g., New Relic, Datadog), including querying logs and metrics for performance monitoring

Why work for us?

The work you do at McGraw Hill will be work that matters. We are collectively designing content that will build the future of education. Play your part and experience a sense of fulfillment that will inspire you to even greater heights.

The pay range for this position is $140,000 to $155,000 annually; however, base pay offered may vary depending on job-related knowledge, skills, experience, and location. Additionally, a full range of medical and/or other benefits may be provided, depending on the position offered.

McGraw Hill recruiters always use an "@mheducation.com" or "@careers.mheducation.com" email address or contact candidates through our Applicant Tracking System, iCIMS. Any variation of these email domains should be considered suspicious. Additionally, McGraw Hill recruiters and authorized representatives will never request sensitive information by email.

50146

McGraw Hill uses an automated employment decision tool (AEDT) to assist in the screening process by recommending candidates with "like skills" based on resume and job data. To request an alternative screening process, please select "Opt-Out" when asked to "Consent to use of Automated Employment Decision Tools" during the application.