Verified vocabulary infrastructure for Duolingo Max and 47 language pairs
The Scenario
Duolingo Max’s Roleplay feature lets learners practice conversation with AI characters. When a Spanish-speaking learner encounters “perseverance” in a Roleplay, the app needs a verified definition, a Spanish translation with native script, IPA pronunciation, and audio, not an LLM-generated guess that varies between sessions. One API call for the word. One for the lesson. One for the quiz. Deterministic content the AI-first strategy can trust.
Step 1 — Word Orb looks up the word
One API call returns a verified definition, translations, pronunciation audio, and etymology.
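A minimal sketch of what that single lookup could look like from the client side. The base URL, the `lang` query parameter, and the field names in the payload are assumptions made for illustration; this page does not specify Word Orb's actual endpoint or response schema.

```python
from dataclasses import dataclass
from urllib.parse import quote, urlencode

# Hypothetical endpoint: the real base URL and path are not given on this page.
BASE_URL = "https://api.example.com/word"

# Hypothetical field names for the data the lookup is described as returning.
REQUIRED_FIELDS = ("word", "definition", "translation", "ipa", "audio_url", "etymology")


@dataclass(frozen=True)
class WordEntry:
    word: str
    definition: str
    translation: str
    ipa: str
    audio_url: str
    etymology: str


def build_word_url(word: str, lang: str) -> str:
    """Build the lookup URL for one word and one target language."""
    return f"{BASE_URL}/{quote(word.lower())}?{urlencode({'lang': lang})}"


def parse_word_entry(payload: dict) -> WordEntry:
    """Turn a response payload into a WordEntry, failing loudly on gaps.

    Raising on a missing field keeps the app from silently filling it
    with generated text.
    """
    missing = [key for key in REQUIRED_FIELDS if key not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return WordEntry(**{key: payload[key] for key in REQUIRED_FIELDS})


# Illustrative payload with placeholder values, not real API output.
SAMPLE = {
    "word": "perseverance",
    "definition": "(verified definition text)",
    "translation": "perseverancia",
    "ipa": "(IPA transcription)",
    "audio_url": "https://api.example.com/audio/perseverance.mp3",
    "etymology": "(etymology text)",
}

entry = parse_word_entry(SAMPLE)
print(build_word_url("Perseverance", "es"))
print(entry.translation)
```

Rejecting incomplete payloads, rather than defaulting missing fields, matches the goal stated above: the app shows verified content or nothing.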
Step 2 — Lesson Orb delivers a structured lesson
A 5-phase lesson (hook → story → wonder → action → wisdom) with the explorer teaching archetype.
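The five-phase order can be checked on the client before a lesson is shown. This is a sketch under assumptions: the `phases`, `phase`, and `text` keys are a hypothetical response shape, not a documented Lesson Orb schema.

```python
# Phase order as described above: hook -> story -> wonder -> action -> wisdom.
PHASES = ("hook", "story", "wonder", "action", "wisdom")


def ordered_phase_texts(lesson: dict) -> list:
    """Return the lesson's phase texts in order.

    Raises ValueError if any phase is missing, duplicated, or out of order.
    The "phases", "phase", and "text" keys are assumed for illustration.
    """
    names = tuple(item["phase"] for item in lesson["phases"])
    if names != PHASES:
        raise ValueError(f"expected phases {PHASES}, got {names}")
    return [item["text"] for item in lesson["phases"]]


# Illustrative lesson with placeholder text for each phase.
sample_lesson = {
    "archetype": "explorer",
    "phases": [{"phase": name, "text": f"({name} text)"} for name in PHASES],
}

print(ordered_phase_texts(sample_lesson))
```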
Step 3 — Quiz Orb assesses comprehension
Assessment questions aligned to the lesson content through the knowledge graph. Your agent tests what it taught.
Step 4 — The Knowledge Graph connects everything
30,288 connections link words to lessons to assessments. Every quiz question tests what the lesson taught.
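One way to picture the word-to-lesson-to-assessment links is as a directed edge list that a client walks in two hops. The node IDs and edges below are invented for illustration; the real graph's identifiers and storage format are not described on this page.

```python
from collections import defaultdict

# Hypothetical edges: word -> lesson and lesson -> quiz question.
EDGES = [
    ("word:perseverance", "lesson:L1"),
    ("lesson:L1", "quiz:Q1"),
    ("lesson:L1", "quiz:Q2"),
    ("word:grit", "lesson:L2"),
    ("lesson:L2", "quiz:Q3"),
]


def build_graph(edges):
    """Index edges by source node for quick neighbor lookups."""
    graph = defaultdict(list)
    for source, target in edges:
        graph[source].append(target)
    return graph


def quizzes_for_word(graph, word_id):
    """Follow word -> lesson -> quiz edges and collect the quiz IDs."""
    quizzes = []
    for lesson_id in graph.get(word_id, []):
        for node in graph.get(lesson_id, []):
            if node.startswith("quiz:"):
                quizzes.append(node)
    return quizzes


graph = build_graph(EDGES)
print(quizzes_for_word(graph, "word:perseverance"))  # ['quiz:Q1', 'quiz:Q2']
```

Because every quiz node is reached only through a lesson node, a question returned here is by construction tied to a lesson that taught the word.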
Why this matters for Duolingo
47-language translations with native script — covers every Duolingo language pair, including the 148 new AI-generated courses launched in 2025
240,000 pronunciation audio files recorded by native speakers, not TTS — stream directly into Roleplay and Video Call with Lily
Deterministic API responses — the verified content layer that Duolingo Max’s GPT-4 can cite, not generate. Same word, same definition, every session
10 teaching archetypes adapt instructional tone per learner segment, supporting Duolingo’s strategy pillars, including Grow users, Teach better, and Grow subscribers
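The "same word, same definition, every session" claim in the list above is something an integrating team could verify for itself. A minimal sketch, assuming responses arrive as JSON objects: fingerprint each response in a canonical form and confirm that repeated lookups produce one fingerprint. The sample payloads are placeholders, not real API output.

```python
import hashlib
import json


def fingerprint(payload: dict) -> str:
    """Hash a payload in canonical form so key order does not matter."""
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_deterministic(responses) -> bool:
    """True when every response has the same fingerprint.

    An empty list returns False, since there is nothing to confirm.
    """
    return len({fingerprint(response) for response in responses}) == 1


# Two sessions returning the same content with different key order.
session_a = {"word": "perseverance", "translation": "perseverancia"}
session_b = {"translation": "perseverancia", "word": "perseverance"}

print(is_deterministic([session_a, session_b]))  # True
```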