Khan Academy’s journey to experimentation
How Khan Academy uses AI-driven experimentation to optimize their AI tutor for learning.
Experimentation rarely starts as a mature system. Most organizations begin with a few A/B tests and gradually learn what it takes to build a real culture of evidence.
In this session, Dr. Kelli Hill shares how Khan Academy evolved experimentation from early testing efforts into a cross-functional discipline spanning research, analytics, and engineering.
She walks through the lessons learned while scaling experimentation across product teams, including how they established governance, built trust in metrics, and operationalized testing with GrowthBook.
Kelli also explores Khan Academy’s newest frontier: experimenting with generative AI.
Her team now uses a combination of AI-driven evaluation and traditional A/B testing to improve Khanmigo, their AI tutor. These experiments go beyond typical product metrics, focusing on learning quality, student outcomes, and responsible AI behavior.
If you're building AI-powered products or trying to turn experimentation into an engineering discipline, this session offers a practical look at what that journey actually looks like.
Topics
- Designing experiments for AI systems, prompts, and model behavior
- Evaluating learning quality, not just engagement metrics
- Combining automated AI evaluation with production A/B tests
- Scaling experimentation across product, engineering, and data science teams
Speakers
- Kelli Hill, PhD – Senior Director, Insights, Khan Academy
- Luke Sonnet – Head of Experimentation, GrowthBook; previously at Twitter and Facebook



