Data Engineer - AWS, Python, Spark

Edtech.com's Summary

McGraw Hill LLC is hiring a Data Engineer - AWS, Python, Spark. This role involves designing, developing, and maintaining scalable data pipelines and solutions using AWS technologies, as well as optimizing big data processing workflows with Databricks and Spark to support data-driven decision-making across multiple business domains. The Data Engineer will collaborate with cross-functional teams to translate business requirements into technical designs and support reporting and analytics through well-designed data models.

Highlights

Design, develop, and maintain scalable data pipelines using AWS technologies such as Athena, Glue, EMR, Lambda, and Iceberg.
Build and optimize big data processing workflows using Databricks and Spark.
Develop efficient ETL/ELT solutions with parallel processing for fast data delivery.
Design and implement data models including facts, dimensions, star schemas, and aggregations to support analytics.
Monitor and manage cloud-based data platforms ensuring performance, reliability, and SLA adherence.
Translate complex business requirements into end-to-end technical data solutions.
Collaborate with analytics, product, and business stakeholders throughout the software development lifecycle.
Create and maintain high-quality technical and solution design documentation.
Follow Agile/Kanban methodologies using tools like Git and Jira.
Required qualifications include a Bachelor's degree or equivalent experience, 3+ years in Data Engineering, strong hands-on AWS data service experience, proficiency in Databricks, Spark, SQL, and Python, and knowledge of modern data warehousing and data lake architectures. Experience with infrastructure-as-code tools such as Terraform is also required.
Nice to have: experience in Education or Publishing, familiarity with BI tools like Tableau or Alteryx, experience with financial datasets, and exposure to IBM Planning Analytics (TM1).
Employment type: Full-time position based in India.

Data Engineer - AWS, Python, Spark Full Description

Overview

Build the future!

McGraw Hill is a global education innovation company offering solutions from textbooks to cutting-edge digital platforms that improve learning outcomes. To support our growing Data & Analytics capability, we are hiring a Data Engineer who will play a critical role in strengthening McGraw Hill's data platform and enabling high-impact, data-driven decision-making across the business. The role is remote and based in India.

How are you creating an impact?
As a Data Engineer, you will be at the core of McGraw Hill's data ecosystem, designing, building, and optimizing scalable data solutions that power insights across finance, product, customer, and operational domains. You will work closely with analytics, product, and business stakeholders to translate complex requirements into reliable, high-performance data pipelines and architectures, helping teams make timely and informed decisions.

What will you be doing?

Design, develop, and maintain scalable data pipelines and solutions using AWS technologies
Build and optimize big data processing workflows using Databricks and Spark
Develop robust ETL/ELT solutions with parallel processing for faster and more efficient data delivery
Design and implement data models (facts, dimensions, star schemas, aggregations) to support reporting and analytics
Monitor and manage cloud-based data platforms to ensure performance, reliability, and SLA adherence
Translate business requirements into technical designs and end-to-end data solutions
Collaborate with cross-functional teams throughout the software development lifecycle, from requirements to deployment
Create and maintain high-quality technical and solution design documentation
Follow Agile/Kanban practices using tools such as Git and Jira

We are looking for someone with:

A Bachelor's degree in a related field or equivalent practical experience
3+ years of experience in Data Engineering or a related discipline
Strong hands-on experience with AWS data services such as Athena, Glue, EMR, Lambda, and Iceberg
Proven experience working with Databricks and Spark-based data processing
Advanced proficiency in SQL and strong Python scripting skills
Solid understanding of modern data warehousing and data lake architectures
Experience with cloud infrastructure and infrastructure-as-code tools such as Terraform
Strong problem-solving skills, attention to detail, and ability to work independently
Excellent communication skills and the ability to collaborate with technical and non-technical stakeholders

Nice to have:

Experience in the Education or Publishing domain
Familiarity with analytics and BI tools such as Tableau or Alteryx
Experience working with financial datasets (sales, revenue, COGS, manufacturing, etc.)
Exposure to IBM Planning Analytics (TM1)

Why work with us?

There has never been a better time to join McGraw Hill. In our culture of curiosity, collaboration, and innovation, you'll have the opportunity to work on meaningful problems, own your growth, and help shape the future of learning through data and technology.

50444

McGraw Hill uses an automated employment decision tool (AEDT) to assist in the screening process by recommending candidates with "like skills" based on resume and job data. To request an alternative screening process, please select "Opt-Out" when asked to "Consent to use of Automated Employment Decision Tools" during the application.
