Edtech.com's Summary

Brainscape is hiring an AI Prompt Engineer to ship and maintain generative AI features that help millions of learners create better flashcards. The role involves migrating and testing bulk flashcard creation prompts with newer GPT models, analyzing real user data to optimize prompts, conducting QA and regression testing, and collaborating with the Content Team to uphold flashcard quality standards.

Highlights

Migrate and test bulk flashcard creation prompts using up-to-date GPT models.
Run test suites, manually review AI outputs, and fine-tune prompts for quality and correctness.
Analyze real user data to detect failure patterns and inform prompt improvements.
Streamline and automate testing and evaluation workflows.
Monitor production AI quality post-launch and detect regressions due to model changes.
Create and maintain evaluation datasets from real user inputs across AI features.
Develop test cases covering edge cases, multilingual content, and messy real-world inputs.
Document prompt modifications, test outcomes, and lessons learned.
Collaborate with the Content Team to apply flashcard authoring standards.
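Technical skills required: experience with LLM prompt engineering and the OpenAI API, familiarity with Cursor IDE or similar AI-assisted development tools, some Git version control experience (the team uses GitLab), and working Python (Cursor experience matters more than raw Python skill).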
Qualifications include 1+ years of hands-on prompt engineering experience, strong written communication, and ability to work independently.
Compensation ranges from $40 to $100 per hour, based on experience and location, for approximately 5-10 hours per week under a part-time contract through 2026.
Bonus qualifications: experience with prompt evals, AI QA, regression testing, or model drift detection; an EdTech or content creation background; a degree in Computer Science, Information Science, or a similar field.

AI Prompt Engineer Full Description

Brainscape, the world's leading web & mobile EdTech study platform, is seeking an AI Prompt Engineer to help us ship and maintain high-quality generative AI features that help millions of learners create better flashcards.

You will be working directly with Brainscape's Knowledge Manager to iterate on LLM prompts, analyze real user data, and ensure our AI output meets a high quality bar - both at launch and as models evolve. The immediate priority is migrating and testing our existing bulk flashcard creation prompts in an updated AI environment with newer GPT models. These prompts power three user-facing features: importing pasted or uploaded content into flashcards, summarizing documents into flashcards, and generating flashcards from a user-described topic. From there, the role expands into ongoing QA, regression testing, and prompt optimization across all of Brainscape's AI features.

This is a part-time contract role (~5-10 hours/week, remote) through the end of 2026, with potential to extend or convert to a permanent position. Hourly rate is $40-$100 (based on experience and location).

Responsibilities

Migrate and test existing bulk flashcard creation prompts in an updated AI environment with newer GPT models - and plan future migrations as OpenAI retires older models
Run test suites, manually review AI outputs for quality and correctness, and fine-tune prompts based on the results
Analyze real user data to identify failure patterns and inform prompt improvements
Streamline testing and evaluation workflows to make QA faster and more repeatable
Monitor production quality post-launch and detect regressions as underlying models shift
Build and maintain model evaluation datasets from real user inputs across all AI features
Write new test cases for edge cases, multilingual content, and messy real-world inputs
Document prompt changes, test results, and lessons learned
Work with the Content Team to apply flashcard authoring quality standards

Qualifications

1+ years hands-on prompt engineering experience with LLMs / OpenAI API (systematic testing and iteration, not just casual ChatGPT usage)
Familiarity with Cursor IDE or similar AI-assisted development tools (our work is primarily Python - Cursor experience is more important than raw Python skill)
Some experience with Git version control and collaborating via shared repositories (we use GitLab)
A habit of documenting what you tried, what worked, and why - you don't need a formal QA background, but you naturally keep track of your process
Clear written communication skills
Proactive attitude; ability to work independently and manage your own time
BONUS: Experience building prompt evals, AI quality assurance, or using GPT to grade GPT outputs
BONUS: Experience with regression testing for AI systems or detecting model drift
BONUS: Background in education technology (EdTech) or content creation - especially microlearning, flashcards, or other concise Q&A formats
BONUS: A degree in Computer Science, Information Science, or a similar field

To Apply

Please apply above with a resume and anything you want us to know about why you think you'd be a good fit. If you are a top candidate, we will follow up with a short application form to help direct you to the right person for an interview.
