McGraw Hill

Lead Site Reliability Engineer

🇨🇦 Remote - CA 🕑 Full-Time 💰 $140K - $167K 💻 Software Engineering 🗓️ May 29th, 2026
Python Golang Kubernetes

Edtech.com's Summary

McGraw Hill LLC is hiring a Lead Site Reliability Engineer to build and manage reliable, scalable core infrastructure services that support millions of students and educators. The role involves leading cross-functional teams to design, deploy, and enhance infrastructure with a focus on system reliability, performance, and scalability, as well as building automation tools and mentoring engineers.

Highlights
  • Lead the design, deployment, and management of core infrastructure services with a focus on reliability and scalability.
  • Collaborate with product development teams using a DevOps model to create automation tools.
  • Optimize systems to balance technical and business needs, and apply Infrastructure-as-Code practices.
  • Monitor AWS costs and optimize spend to maximize ROI while meeting Service Level Objectives.
  • Own reliability, security, capacity, and performance through observability engineering and data-driven analytics.
  • Maintain and enhance telemetry systems to track application performance and business metrics.
  • Strong expertise required in AWS, Kubernetes (EKS), Terraform or CloudFormation, and programming in Python, Golang, or Bash.
  • Experience with CI/CD pipelines, GitOps tools (ArgoCD, FluxCD), and observability platforms like NewRelic, CloudWatch, and DataDog.
  • Must have strong troubleshooting skills across web servers, app platforms, operating systems, and networks.
  • Requires a degree in Computer Science or equivalent industry experience.
  • Position is remote, open to candidates authorized to work for any employer within Canada.
  • Pay range is $140,000 - $167,000 CAD, with potential annual bonuses and benefits based on role and location.

Lead Site Reliability Engineer Full Description

Overview

Make an Impact!
At McGraw Hill, we create best-in-class, next-generation learning platforms that are used by millions of students and educators worldwide, from kindergarten through graduate school. Our goal is to accelerate student success through intuitive and effective learning tools and content that maximize a teacher's time and a student's learning experience. We do all of this in a supportive, collaborative environment where you can grow your career in a way that fits into your life.

How can you make an impact?
We are hiring a Lead Site Reliability Engineer to build and support reliable, high-capacity, and high-performing core infrastructure services that enable us to reimagine learning for millions of students and educators worldwide. You will lead cross-functional teams to design, deploy, and manage foundational infrastructure services while driving initiatives to enhance system reliability, performance, and scalability. If you thrive in building developer tools, automating processes, solving cloud-related challenges, and mentoring engineers, this role is for you.

This is a remote position open to applicants authorized to work for any employer within Canada.

What you will be doing:

Cloud Engineering:

  • Collaborate with product development teams in a DevOps model to design, deploy, and manage automation tools that enhance predictability and accelerate time to market.
  • Optimize existing systems to ensure "right-sized" solutions that balance technical and business constraints.
  • Drive initiatives to improve system reliability and performance.
  • Ensure repeatability, traceability, and transparency of infrastructure automation using Infrastructure-as-Code (IaC).
  • Actively monitor AWS costs and use optimization tools to maximize ROI while meeting Service Level Objectives.

Observability Engineering:

  • Own reliability, uptime, system security, cost, operations, capacity, resiliency, and performance analysis.
  • Lead initiatives to improve application and platform reliability using data-driven analytics.
  • Ensure architecture and deployment models meet SLA commitments.
  • Maintain and enhance telemetry systems to improve visibility into application performance and business metrics.
  • Develop and monitor standard processes to promote the long-term health and sustainability of operational tasks.

We are looking for someone with…

  • Proven experience building and managing large-scale systems and tools in AWS using repeatable and maintainable methods.
  • Expertise in Kubernetes (EKS or managing clusters) and container orchestration technologies.
  • Proficiency in infrastructure automation tools like Terraform or CloudFormation.
  • Strong programming skills in Python, Golang, or Bash, with a focus on production software development.
  • Experience with CI/CD pipelines, GitOps tools (ArgoCD, FluxCD), and observability platforms (NewRelic, CloudWatch, DataDog).
  • Versatility in troubleshooting hosting technologies, including web servers, application platforms, operating systems, and network components.
  • Strong communication, problem-solving, and systems engineering skills.
  • A proactive mindset and ability to work across team boundaries daily.
  • A degree in Computer Science or equivalent industry experience.

Why work for us?
The work you do at McGraw Hill will be work that matters. We are collectively designing content that will build the future of education. Play your part and experience a sense of fulfillment that will inspire you to even greater heights.

The pay range for this position is $140,000 - $167,000 CAD. However, base pay offered may vary depending on job-related knowledge, skills, experience, and location. An annual bonus plan may be provided as part of the compensation package, in addition to a full range of medical and/or other benefits, depending on the position offered.

McGraw Hill recruiters always contact candidates from a "@mheducation.com" or "@careers.mheducation.com" email address and/or from our Applicant Tracking System, iCIMS. Any variation of these email domains should be considered suspicious. Additionally, McGraw Hill recruiters and authorized representatives will never request sensitive information by email.

CAN_TECH_25

McGraw Hill uses an automated employment decision tool (AEDT) to assist in the screening process by recommending candidates with "like skills" based on resume and job data. To request an alternative screening process, please select "Opt-Out" when asked to "Consent to use of Automated Employment Decision Tools" during the application.