From 162K Words to 226K Lessons: How We Built the Orb Platform
In March 2026, the Orb Platform serves 162,000 words across 47 languages, 226,000 structured lessons, 21,000 assessments, and 30,000 knowledge graph connections — all from Cloudflare's edge network at median latencies under 5ms. This is the technical story of how we got here.
Architecture Overview
The Orb Platform runs entirely on Cloudflare's developer platform:
- 32 Cloudflare Workers — Stateless compute at the edge, handling routing, API logic, content generation, and page rendering
- 4 D1 databases — SQLite-based storage, one each for words, lessons, assessments, and the knowledge graph
- 14 R2 buckets — Object storage for pronunciation audio (240,000 files), word images, Kelly avatar assets, and backups
- KV namespaces — Edge caching for hot paths (word lookups, pronunciation audio URLs)
- Cloudflare AI — On-demand inference for content classification and image generation
There is no origin server. No EC2 instance. No Kubernetes cluster. Every request is handled at the edge, typically within 10km of the user. The median response time for a word lookup is 3ms.
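All of those pieces attach to a Worker through its configuration. A hypothetical, heavily trimmed `wrangler.toml` for a single Worker is sketched below; the binding and resource names are illustrative, and the real platform splits this across 32 Workers:

```toml
name = "orb-word-api"
main = "src/index.ts"
compatibility_date = "2026-03-01"

# D1: one of the four SQLite databases
[[d1_databases]]
binding = "WORDS_DB"
database_name = "words"
database_id = "..." # elided

# R2: pronunciation audio bucket
[[r2_buckets]]
binding = "AUDIO"
bucket_name = "pronunciation-audio"

# KV: edge cache for hot word lookups
[[kv_namespaces]]
binding = "WORDS_KV"
id = "..." # elided

# Workers AI, used only in the authoring pipeline
[ai]
binding = "AI"

# content queue drain, every 5 minutes
[triggers]
crons = ["*/5 * * * *"]
```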
The Data Pipeline
Words: From Dictionary to API
The Word Orb database contains 162,000 words. Each word entry includes:
- Verified English definition
- Part of speech and IPA pronunciation
- Etymology
- Translations in up to 47 languages (native script + transliteration)
- Classification (tier, domain, complexity score)
- AI-generated visual aid (stored in R2)
- Pronunciation audio (stored in R2)
The pipeline for each word follows a multi-stage process. First, the word enters the content queue — either from a direct lookup (if the word is not in the database) or from batch processing. The scheduled Worker picks up queued words every 5 minutes, generates the full data package using Cloudflare AI for definitions and classifications, stores the result in D1, and uploads assets to R2.
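The scheduled stage of that pipeline looks roughly like the sketch below. This is a simplified illustration, not the production code: the binding names (`CONTENT_DB`, `ASSETS`, `AI`), table names, batch size, and model ID are all assumptions.

```typescript
// Hypothetical sketch of the 5-minute content pipeline (cron-triggered Worker).
// Binding names, table names, and the model ID are illustrative assumptions.
export interface Env {
  CONTENT_DB: {
    prepare(sql: string): {
      bind(...values: unknown[]): {
        all<T>(): Promise<{ results: T[] }>;
        run(): Promise<unknown>;
      };
      all<T>(): Promise<{ results: T[] }>;
    };
  };
  ASSETS: { put(key: string, value: string): Promise<unknown> }; // R2 bucket
  AI: { run(model: string, input: unknown): Promise<{ response: string }> };
}

export async function processQueue(env: Env): Promise<number> {
  // 1. Pull the next batch of queued words (direct-lookup misses + batch jobs).
  const { results } = await env.CONTENT_DB
    .prepare("SELECT word FROM content_queue ORDER BY queued_at LIMIT 50")
    .all<{ word: string }>();

  for (const { word } of results) {
    // 2. Author the full data package once, using an AI model.
    const { response } = await env.AI.run("@cf/meta/llama-3-8b-instruct", {
      prompt: `Define, classify, and annotate the word "${word}" as JSON.`,
    });

    // 3. Persist to D1 — the database, not the model, is the source of truth.
    await env.CONTENT_DB
      .prepare("INSERT OR REPLACE INTO words (word, data) VALUES (?1, ?2)")
      .bind(word, response)
      .run();

    // 4. Upload generated assets to R2 (audio and image handling elided).
    await env.ASSETS.put(`words/${word}.json`, response);

    // 5. Dequeue the processed word.
    await env.CONTENT_DB
      .prepare("DELETE FROM content_queue WHERE word = ?1")
      .bind(word)
      .run();
  }
  return results.length;
}
```

In a real deployment this would run behind a `[triggers] crons = ["*/5 * * * *"]` entry in `wrangler.toml`.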
The critical design decision was to make the database the source of truth, not the AI model. Once a word is generated and verified, it serves from D1 forever. The AI is used for authoring, not serving. This means responses are deterministic — the same word returns the same data every time — and we can guarantee content quality because we have reviewed what we serve.
Lessons: The 5-Phase Pedagogical Structure
Every lesson in the Orb Platform follows a 5-phase structure refined over 20 years of education research:
- Hook — Grab attention in 2-3 sentences. Create curiosity, not confusion.
- Story — Teach through narrative in 4-6 sentences. Humans remember stories, not facts.
- Wonder — Spark the "why" in 3-5 sentences. Transform passive reception into active inquiry.
- Action — Something to try right now in 3-5 sentences. Learning requires doing.
- Wisdom — Land the takeaway in 2-3 sentences. What will the learner carry forward?
This structure is not arbitrary. It maps to well-established learning science: attention capture (Hook), narrative memory encoding (Story), metacognitive activation (Wonder), experiential learning (Action), and long-term memory consolidation (Wisdom). Each phase targets a different cognitive process, and the sequence is designed to build on the previous phase.
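One way to picture a lesson record and its per-phase sentence budgets is the sketch below. The field names and the validator are hypothetical; only the five phases and their sentence ranges come from the structure above.

```typescript
// Hypothetical record shape for one lesson; field names are illustrative,
// but the five phases and their sentence budgets come from the text above.
interface Lesson {
  concept: string;
  track: "learn" | "grow" | "teach";
  ageGroup: "kid" | "teen" | "adult" | "elder";
  phases: {
    hook: string;   // 2-3 sentences: attention capture
    story: string;  // 4-6 sentences: narrative memory encoding
    wonder: string; // 3-5 sentences: metacognitive activation
    action: string; // 3-5 sentences: experiential learning
    wisdom: string; // 2-3 sentences: memory consolidation
  };
}

type Phase = keyof Lesson["phases"];

// Per-phase sentence budgets, as [min, max].
const BOUNDS: Record<Phase, [number, number]> = {
  hook: [2, 3],
  story: [4, 6],
  wonder: [3, 5],
  action: [3, 5],
  wisdom: [2, 3],
};

// Crude sentence counter — good enough for a budget check.
function sentenceCount(text: string): number {
  return (text.match(/[.!?]+(?=\s|$)/g) ?? []).length;
}

// Return the names of any phases that fall outside their budget.
function phasesOutOfBudget(lesson: Lesson): Phase[] {
  return (Object.keys(BOUNDS) as Phase[]).filter((phase) => {
    const n = sentenceCount(lesson.phases[phase]);
    const [min, max] = BOUNDS[phase];
    return n < min || n > max;
  });
}
```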
The 226,000 lessons are generated across 3 tracks (Learn, Grow, Teach), four age groups (kid, teen, adult, elder), and 10 teaching archetypes that rotate daily. The archetype system ensures instructional variety — a Scientist archetype teaches through evidence and experiments, while a Storyteller archetype uses narrative and analogy. Same concept, different pedagogical approach each day.
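The daily rotation can be sketched as a pure function. The selection scheme here (days since the Unix epoch, modulo the archetype count) is an assumption; the real logic is not described in this post.

```typescript
// Sketch of the daily archetype rotation. The modulo-over-days scheme is an
// assumption, not the platform's published algorithm.
function archetypeForDay(archetypes: readonly string[], date: Date): string {
  const MS_PER_DAY = 86_400_000;
  const daysSinceEpoch = Math.floor(date.getTime() / MS_PER_DAY); // UTC day index
  return archetypes[daysSinceEpoch % archetypes.length];
}
```

With the full list of 10 archetypes, every concept cycles through all ten pedagogical approaches every ten days.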
The Knowledge Graph: 30,000 Connections
The Knowledge Graph connects words to lessons to assessments. When a user looks up "photosynthesis," the graph returns not just the definition but which lessons teach it, which assessments test it, and which related words (chlorophyll, carbon dioxide, light energy) form a learning cluster.
The graph is stored in D1 as an adjacency list with weighted edges. Connections are typed: word-to-word (semantic similarity), word-to-lesson (appears in), lesson-to-assessment (tests), and cross-language (translation equivalence). The weighting allows the API to return the most relevant connections first, enabling AI agents to build coherent learning paths rather than random word lists.
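An in-memory sketch of that typed, weighted adjacency list is below. In D1 this would be a table along the lines of `edges(src, dst, edge_type, weight)`; every name in the sketch is illustrative.

```typescript
// In-memory sketch of the knowledge graph's typed, weighted adjacency list.
// All names are illustrative; D1 stores this as an edges table.
type EdgeType = "semantic" | "appears_in" | "tests" | "translation";

interface Edge {
  dst: string;
  type: EdgeType;
  weight: number; // relevance; higher = stronger connection
}

type Graph = Map<string, Edge[]>;

// Return the k most relevant neighbors, strongest edges first — the ordering
// that lets an agent build a coherent learning path instead of a random list.
function topConnections(graph: Graph, node: string, k: number): Edge[] {
  return [...(graph.get(node) ?? [])]
    .sort((a, b) => b.weight - a.weight)
    .slice(0, k);
}
```

For "photosynthesis", the strongest edges might be semantic links to "chlorophyll" and an appears-in link to a biology lesson, so those surface first.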
The Sovereign Infrastructure Thesis
The Orb Platform does not depend on any single AI provider. This is a deliberate architectural decision, not an accident of technology choice.
Consider the dependency chain of a typical EdTech API: content generated by GPT-4 (OpenAI), served from AWS (Amazon), translated by Google Translate API (Google), with audio from ElevenLabs. If OpenAI changes its content policy, your definitions change. If AWS has an outage, your service is down. If Google deprecates a translation endpoint, your multilingual support breaks.
The Orb Platform uses AI models during the authoring phase but stores all generated content in our own D1 databases. Audio pronunciation is generated locally using Kokoro TTS running on our own RTX 5090 hardware, then uploaded to R2. Translations are generated using Groq's free tier, then stored permanently. The models are tools in the authoring pipeline, not runtime dependencies.
This means we can survive any single provider shutting down, changing pricing, or altering their content policy. Our data is ours. Our infrastructure is ours. Our mission cannot be blocked by a vendor decision.
Edge Performance: Why 5ms Matters
Cloudflare Workers execute in every one of Cloudflare's 300+ data centers worldwide. When a robot in Tokyo makes an API call, it hits the Tokyo edge node. When a chatbot in São Paulo queries a word, it hits São Paulo. There is no round-trip to us-east-1.
The practical impact: a word lookup takes 3ms median, 8ms at P99. A full lesson retrieval (5 phases, metadata, audio URLs) takes 5ms median. These numbers matter when your product is a real-time conversation between a robot and a human. A 200ms API call is a noticeable pause. A 3ms API call is invisible.
We achieve this through aggressive caching (KV for hot words, R2 for audio, CDN for images) and by keeping the compute simple — database reads, not model inference. The Worker code for a word lookup is essentially: read from KV cache → if miss, read from D1 → return JSON. No chained API calls, no model inference, no complex orchestration.
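That hot path can be sketched as a small handler. The binding names (`WORDS_KV`, `WORDS_DB`), the table, and the cache TTL are assumptions for illustration.

```typescript
// Hypothetical sketch of the lookup hot path: KV cache -> D1 -> JSON.
// Binding names, the words table, and the 24h TTL are assumptions.
export interface Env {
  WORDS_KV: {
    get(key: string): Promise<string | null>;
    put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
  };
  WORDS_DB: {
    prepare(sql: string): {
      bind(...values: unknown[]): { first<T>(): Promise<T | null> };
    };
  };
}

const JSON_HEADERS = { "content-type": "application/json" };

export async function lookupWord(word: string, env: Env): Promise<Response> {
  const key = `word:${word.toLowerCase()}`;

  // 1. Hot path: serve straight from the edge KV cache.
  const cached = await env.WORDS_KV.get(key);
  if (cached) return new Response(cached, { headers: JSON_HEADERS });

  // 2. Cache miss: read the authored record from D1.
  const row = await env.WORDS_DB
    .prepare("SELECT data FROM words WHERE word = ?1")
    .bind(word.toLowerCase())
    .first<{ data: string }>();
  if (!row) {
    return new Response(JSON.stringify({ error: "not found" }), {
      status: 404,
      headers: JSON_HEADERS,
    });
  }

  // 3. Populate the cache for the next request, then return.
  await env.WORDS_KV.put(key, row.data, { expirationTtl: 86_400 });
  return new Response(row.data, { headers: JSON_HEADERS });
}
```

No chained API calls and no inference on this path: the slowest step is a single D1 read, which is what keeps the P99 in single-digit milliseconds.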
Scaling to 8 Billion
The Orb Platform is designed for a world where every robot, every educational app, and every AI agent needs multilingual language capability. The architecture scales horizontally by default — Cloudflare Workers auto-scale to any request volume, D1 replicates globally, and R2 serves assets from the nearest edge.
The content pipeline scales linearly: more languages require more generation cycles, but the per-word cost is fixed and the infrastructure cost is near-zero thanks to Cloudflare's generous free tier for Workers and D1. We currently generate content at approximately 28,800 lessons per day across 18 non-English languages (about 1,600 per language per day, or one new lesson every three seconds).
The next frontier is offline delivery. We are building iLearn — a purpose-built educational computer that ships with the full Orb database pre-loaded. A child in a village without internet connectivity gets the same 162,000 words, 226,000 lessons, and 21,000 assessments as a student in Manhattan. The edge becomes the device itself.
This is infrastructure built for the long game. Not a startup looking for an exit. A public benefit corporation building language infrastructure that will serve 8 billion learners — and every robot that teaches them — for as long as language matters.