Edtech.com's Summary

Handshake is hiring an AI Red Teamer (LLM Generalist). The role involves designing creative adversarial prompts to identify vulnerabilities in large language models, helping improve AI safety and robustness. The AI Red Teamer tests models across various risk categories and collaborates with engineers and researchers to strengthen model defenses.

Highlights

Craft adversarial prompts and multi-turn scenarios to stress-test AI guardrails across risk categories such as content safety, cybersecurity, and regulatory compliance.
Identify and bypass safety filters using jailbreak, evasion, and prompt injection techniques.
Evaluate and score model responses against harm taxonomies and severity rubrics.
Document experiments and refine adversarial prompts generated by the team.
Collaborate with engineers, data scientists, and researchers to share findings and improve model defenses.
Strong hands-on experience with multiple large language models (ChatGPT, Claude, Gemini, open-source models).
Creative problem-solving skills; familiarity with adversarial prompt crafting and jailbreak techniques is preferred.
Clear written communication and strong ethical judgment are essential.
Extra credit for Python or scripting experience, working with LLM APIs, content moderation, or expertise in high-risk domains.
This role requires dealing regularly with harmful and disturbing content in a professional and sustainable manner.

AI Red Teamer (LLM Generalist) Full Description

About Handshake

Handshake was founded on a simple belief that everyone deserves a path to a great career, regardless of where they went to school or who they know. Today, we power 25 million job seekers, 1 million+ employers, and 1,600 educational institutions.

In 2025, we started Handshake AI and built the fastest-growing AI data business in history. We work directly with frontier AI lab researchers to create evaluations, publish benchmarks, and push the boundary of data. We’ve grown from $0 to ~$1B run rate and pay ~$60M to over 30K individuals every month.

Why join Handshake now:

Shape how every career evolves in the AI economy, at global scale, with impact your friends, family and peers can see and feel
Partner hand-in-hand with world-class AI labs, Fortune 500 partners and the world’s top educational institutions
Work alongside engineers, scientists, and operators from Palantir, Meta, and Scale AI, as well as former YC founders
Build a massive, fast-growing business with billions in revenue

About Handshake AI

Human data is the core infrastructure for AI advancement. Frontier AI labs currently improve model capabilities with various data-intensive post-training techniques. We believe that data spend for AI training will increase by 3-5x in the next few years and continue to grow for much longer as models take on new domains. Handshake AI supports all of the frontier AI labs, working on their most complex data at the largest scale.

About the Role

As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them. Rather than checking whether an answer is correct, you will design creative, adversarial prompts that expose vulnerabilities: unsafe content, bias, broken guardrails, hallucinations, prompt injection weaknesses, and unexpected behaviors. Your work directly supports AI safety and model robustness for leading research labs.

This is a generalist red teaming role. You will probe models across the full spectrum of risk categories, including content safety, CBRN (chemical, biological, radiological, nuclear), cybersecurity, persuasion and influence operations, child safety, self-harm, over-companionship, and regulatory compliance. Red teaming may span text, image, voice, and agentic model capabilities depending on project needs.

This role requires creativity, curiosity, and an ability to think like an adversary while operating with strong ethical judgment.

Craft creative prompts and multi-turn scenarios to stress-test AI guardrails across diverse risk categories
Discover ways around safety filters, restrictions, and defenses using jailbreak, evasion, and prompt injection techniques
Explore edge cases to provoke disallowed, harmful, or incorrect outputs
Evaluate and score model responses against structured harm taxonomies and severity rubrics
Document experiments clearly, including what you tried, why you tried it, and what it revealed
Review and refine adversarial prompts generated by other team members
Contribute to harm taxonomy development, calibration exercises, and inter-rater reliability work
Collaborate with engineers, data scientists, and researchers to share findings and strengthen defenses
Work with potentially disturbing content on a regular basis (see Content Warning below)
Stay current on jailbreaks, attack methods, and evolving model behaviors

Desired Capabilities

Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
Intuition for crafting adversarial prompts; familiarity with jailbreak or evasion techniques is a strong plus
Creative, adversarial problem-solving skills
Clear and thoughtful written communication
Strong ethical judgment and the ability to separate adversarial thinking from personal values
Self-directed, collaborative, and comfortable in feedback-heavy environments
Curiosity, persistence, and comfort with frequent failure in experimentation

Extra Credit

Familiarity with Python or other scripting languages
Experience working with LLM APIs or evaluation tooling
Comfort with structured data annotation and rubric-based scoring
Prior work in trust and safety, content moderation, QA, or security research
Subject matter expertise in any high-risk domain (cybersecurity, chemistry, biology, medicine, law, finance, etc.)

You Will Thrive Here If

You treat every model response as a hypothesis to challenge
You can switch between creative free-association and rigorous documentation in the same session
You go deep into unusual interests (fandoms, niche internet cultures, gaming exploits, Wikipedia rabbit holes, etc.)
You come from a creative background: writing, visual art, improv, puzzle design, or similar
You are energized by finding the thing nobody else thought to try
You are genuinely passionate about AI and follow the space closely

Content Warning

This role involves regular and deliberate exposure to harmful content. You will encounter and intentionally generate content involving violence, self-harm, hate speech, sexually explicit material, child safety scenarios, and other categories of harmful output as part of structured adversarial testing. Candidates must be able to engage with this material professionally and sustainably. Support resources are available.
