Art of Problem Solving

Senior DevOps Engineer

🇺🇸 Hybrid - San Diego, CA

🕑 Full-Time

💰 $132K - $165K

💻 Software Engineering

🗓️ August 7th, 2025

NGINX Node.js PostgreSQL

Edtech.com's Summary

Art of Problem Solving is hiring a Senior DevOps Engineer. The role involves designing, implementing, and maintaining resilient cloud infrastructure, managing Infrastructure-as-Code with Terraform, and leading improvements in system reliability, automation, and security to support educational platforms.

Highlights
  • Design, implement, test, monitor, and maintain cloud infrastructure for all products.
  • Develop and manage Infrastructure-as-Code configurations using Terraform.
  • Lead automation and self-service tooling to empower engineering teams.
  • Collaborate with engineering leadership and software architects on release management and deployment processes.
  • Analyze and optimize system performance using reliability metrics and logs.
  • Lead incident response, troubleshoot production issues, and conduct post-mortems.
  • Implement security best practices and conduct security audits.
  • Manage cloud infrastructure primarily in the AWS ecosystem.
  • 3-5 years of Software or Site Reliability Engineering experience, with 2-3 years in Site Reliability Engineering.
  • Proficiency with Node.js and/or PHP, and familiarity with MariaDB, PostgreSQL, Redis, Apache, and nginx.
  • Full salary range $132k-$165k with a 6% year-end bonus and a $2000 referral bonus.
  • Benefits include medical, dental, and vision plans, a 401K with company match, PTO, a hybrid work schedule, and a relocation bonus.

Senior DevOps Engineer Full Description

As a Senior DevOps Engineer, you will play a critical role in enhancing and maintaining the resilience of our cloud-based infrastructure and services. You will leverage your deep technical expertise to ensure our systems are scalable, reliable, efficient, and secure, supporting our mission to discover, inspire, and train the great problem solvers of the next generation. This role is perfect for a proactive and analytical engineer passionate about solving complex problems and driving improvements in system reliability and performance.

The Senior DevOps Engineer will:
  • Design, implement, test, monitor, maintain, and document cloud infrastructure for all of our products
  • Develop and manage Infrastructure-as-Code (IaC) configurations using Terraform
  • Lead the strategic development and enhancement of automation and self-service tooling to empower engineering teams, driving efficiency improvements and fostering a culture of innovation and self-reliance among engineers
  • Collaborate with Engineering Leadership, Software Engineers, and Software Architects to drive best practices in release management and deployment processes
  • Drive continuous improvement initiatives in system reliability, performance, and efficiency
  • Analyze and optimize system performance based on reliability metrics and logs
  • Lead or assist with incident response, troubleshoot and resolve production issues, and lead incident post-mortems and analysis to prevent future incidents
  • Test, evaluate, and review others' code and infrastructure
  • Proactively identify and communicate reliability issues or risks to stakeholders, including engineers and product teams, while contributing to the development and implementation of risk management and reduction best practices
  • Lead the implementation of security best practices and the execution of comprehensive security audits to ensure system integrity and protection against vulnerabilities
  • Focus on cost management strategies and identify opportunities for significant cost reductions without compromising system reliability or performance
  • Proactively estimate and clearly communicate development timelines, roadblocks, and development status
  • Maintain an understanding of current web technologies and infrastructure best practices, and proactively work to expand knowledge and skill set
  • Complete other tasks and responsibilities, as assigned

The ideal candidate has:
  • 3-5 years of experience in Software Engineering or Site Reliability Engineering, with at least 2-3 years specifically in Site Reliability Engineering
  • Demonstrated expertise in designing, securing, and managing scalable infrastructure in the AWS ecosystem
  • Proficiency in Infrastructure-as-Code (IaC) tools, preferably Terraform
  • Strong analytical and problem-solving skills, with the ability to work independently and as part of a team
  • Familiarity with Node.js (preferred) and/or PHP in a professional environment
  • Familiarity with MariaDB, PostgreSQL, Redis, Apache, and nginx or similar technologies
  • Prior full-stack or backend software engineering experience is strongly preferred

Why Join AoPS:
This is a hybrid full-time position based at our headquarters in San Diego, CA. The full salary range for this position is $132k-$165k with a 6% year-end bonus. Here are some things you can look forward to:
  • Impact: Your work will directly support the learning experience of hundreds of thousands of students worldwide, ensuring our educational platforms remain reliable, fast, and secure for the next generation of problem solvers.
  • Culture: Work and collaborate with an organization filled with builders and lifelong learners who strive to discover, inspire, and train the great problem solvers of the next generation.
  • Flexibility: Casual work environment with a hybrid work week and flexible scheduling
  • Benefits: Multiple options for Medical, Dental and Vision plans
  • Future Planning: 401K with company match
  • Quality of Life: PTO Plan and supportive leadership that gives you the work-life balance you deserve
  • Ease of Transition: Relocation bonus (if currently located outside of San Diego)

Background Check:
Please note that employment is contingent on the successful completion of a background check.

Work Authorization:
Please note that to be considered for this position, you must be legally authorized to work in the US. We are unable to offer sponsorship, including STEM-OPT and H-1B.

About AoPS:
Art of Problem Solving (AoPS) is on a mission to discover, inspire, and train the great problem solvers of the next generation. Since 2003, we have trained hundreds of thousands of the country's top students, including nearly all the members of the US International Math Olympiad team, through our online school, in-person academies, textbooks, and online learning systems. While our primary focus has been math for most of our history, over the years we have expanded our unique problem-solving curriculum into more subjects, such as language arts, science, and computer science.