AI Operations Engineer
Location: United States
Description
Location: Remote; Hybrid to Wayne, PA; Hybrid to Naperville, IL
How You’ll Contribute to Our Mission
The AI Ops Engineer supports the automation and modernization of IT operations using artificial intelligence and machine learning techniques. The role focuses on developing and integrating agentic AI systems and cloud-based observability tools that help enterprise IT teams detect and resolve issues before they affect users. Unlike traditional monitoring, agentic AI Ops platforms go beyond pointing out problems; they evaluate context, choose the correct response and can initiate fixes autonomously. The AI Ops Engineer will collaborate with site reliability, DevOps and platform engineering teams to implement these capabilities in complex, large-scale environments.
How You’ll Drive Success
- Develop AI-driven monitoring and analytics: Build, train and maintain machine learning models that analyze operational data (logs, metrics, events and traces) to detect anomalies, pinpoint root causes and predict incidents. Configure AI agents that detect problems, understand context and execute remediation steps autonomously.
- Integrate AIOps tools with cloud and enterprise systems: Work with DevOps and SRE teams to integrate AIOps platforms into continuous integration/continuous delivery (CI/CD) pipelines and cloud infrastructure. Leverage APIs to connect AI monitoring services to enterprise systems such as IT service management platforms and configuration management databases.
- Evaluate and deploy AIOps platforms: Assist in evaluating, implementing and tuning commercial AIOps tools (e.g., Moogsoft, Dynatrace, Splunk, BigPanda, Datadog). Configure agentic features in these platforms to enable automatic root cause analysis and remediation.
- Collaborate with cross-functional teams: Partner with developers, operations staff and cybersecurity teams to ensure AI Ops integrations align with business goals. Explain AI-driven insights and recommend improvements to stakeholders.
- Stay current on emerging technologies: Follow developments in agentic AI frameworks, generative AI, machine learning operations and observability. Continuously learn and experiment with new tools to improve system efficiency and reliability.
What You Bring to Help Us Grow
- Education: Some college coursework in computer science, data science, information systems or a related field. Candidates without a degree should have an equivalent combination of training and experience.
- Experience: 1–2 years of hands-on experience in DevOps, site reliability or IT operations roles that used agentic AI or AIOps tools. Experience may include internships or co-ops.
- Technical skills:
  - Proficiency in Python or another scripting language for automating tasks and interacting with APIs.
  - Understanding of machine learning concepts and frameworks (PyTorch).
  - Familiarity with observability stacks such as Prometheus, Grafana and ELK/OpenTelemetry, and basic knowledge of data visualization.
  - Exposure to public cloud platforms (AWS) and infrastructure as code tools (Terraform, Ansible).
  - Ability to interpret telemetry data and spot trends; comfortable working with charts, logs and metrics.
- Soft skills: Strong problem-solving and analytical thinking, with the ability to troubleshoot issues and implement solutions. Excellent communication and collaboration skills to work with cross-functional teams.
Preferred Qualifications
- Hands-on experience with commercial AIOps platforms or building machine-learning-based incident response systems.
- Knowledge of Kubernetes, container orchestration and service mesh architecture.
- Background in log analysis, time-series forecasting or unsupervised anomaly detection.
- Familiarity with IT service management (ITSM) processes and how they integrate with AI-driven operations.
- Exposure to agentic AI frameworks (e.g., LangChain/LangGraph) or generative AI pipelines, particularly for RAG or LLM-based ChatOps.
What You’ll Need to Thrive
- Curiosity and continuous learning: AIOps is a rapidly evolving field. Candidates should be eager to explore new agentic AI techniques and stay up to date with emerging tools and frameworks.
- Critical thinking: AI agents provide insights but still require human judgment; the engineer should be able to interpret AI recommendations, make strategic decisions and adjust guardrails as needed.
- Team player: Success in AI Ops depends on collaboration. The candidate should be comfortable working with developers, SREs, data scientists and security teams to integrate AI capabilities into existing processes.
- Attention to detail: Observability and AI pipelines handle large volumes of data; meticulous attention ensures data quality and accurate analysis.
Impact and Growth Opportunities
AI Ops skills are in high demand as enterprises adopt automation and predictive analytics to handle complex IT environments. By mastering AI-driven monitoring, anomaly detection and agentic workflows, the AI Ops Engineer will help reduce downtime, improve service reliability and support scaling of enterprise systems. This role offers a path to more senior positions in AI Ops, site reliability engineering or machine learning operations and provides opportunities to work on advanced AI, cloud and observability technologies.
Our Mission, Our People, Our Purpose
At Frontline Education, we’re reimagining what’s possible by becoming an AI-first organization, transforming how we think, work, and serve the educators who shape our schools every day. By using AI in thoughtful, practical ways, we’re creating tools that help educators save time, gain insights, and focus more on what matters most — their students.
As part of our team, you’ll be expected and empowered to build and apply AI skillsets that grow with you, because at Frontline Education, technology amplifies what matters most: the human drive to learn, improve, and make a difference.
How We Support Growth, Balance, and Well-Being
- Personalized Time Off: Take time when it’s needed most — whether that’s a family vacation, a reset day, or simply time to rest and refocus.
- Paid Sick Time: Separate, dedicated sick leave to care for yourself or loved ones.
- Volunteer Time Off: Paid time to give back and support causes that matter to you.
- Ten Paid Holidays: Enjoy meaningful moments and traditions throughout the year.
- Our Philosophy: We believe time away from work helps you bring your best self to it.
Continuous Learning and Growth
- World-Class Learning Access: Explore thousands of on-demand courses through platforms like LinkedIn Learning.
- Leadership & Technical Skill Building: Develop new capabilities and chart your own professional path.
- AI Empowerment: Use OpenAI tools to build fluency with emerging technology and harness AI as a creative partner for innovation and problem-solving.
- Tuition Reimbursement: Invest in formal education to advance your skills and career.
- Ongoing Learning Culture: Participate in company-led webinars on AI, inclusion, and industry trends—designed to inspire curiosity and continuous improvement.
Health, Happiness, and Purpose
- Wellness Initiatives: Company-sponsored programs that support physical, mental, and emotional well-being.
- Employee Assistance Program (EAP): Confidential support for you and your family’s needs.
- Comprehensive Benefits: Health and financial benefits that support your happiness and future.
- A Culture That Cares: At Frontline Education, we want every team member to learn, grow, and thrive personally, professionally, and purposefully.
Compensation & Benefits
The salary range for this position is $115,000 - $145,000 and is commensurate with your experience, skills, and internal equity. In addition to base salary, you will be eligible for performance-based incentives aligned to individual, team, and company results.
You’ll also have access to a comprehensive benefits package designed to support your well-being and future, including healthcare coverage, retirement savings with company match, employee stock purchase opportunities where applicable, and the time-off, wellness, and learning programs outlined above. Specific details will be shared during the interview process.
Inclusion, Belonging & Equal Opportunity
Frontline Education is an equal opportunity/affirmative action employer. We aspire to have an inclusive workplace and strongly encourage suitably qualified applicants from a wide range of backgrounds to apply and join our team.
Interview Process & Data Privacy
As part of our interview process, Frontline uses video conferencing tools that include photo capture and may include automated transcription features. A screenshot or photo will be taken at the start of the interview for internal identification and record-keeping purposes only, and transcription may be used to support notetaking and evaluation consistency. These materials are used solely by our recruiting and hiring teams, stored securely, and not shared outside the hiring process. Candidates may opt out of the transcription at any time by notifying their recruiter in advance. Frontline processes this information in accordance with applicable data privacy laws and only for legitimate business purposes related to recruitment and hiring.
Our Privacy Policy: Your privacy is important to us. Please read our general Privacy Statement and our Applicant Privacy Statement.