Aug. 4, 2025

The Jagged Frontier

LLMs are weird. They can perform very well and very poorly, often unpredictably, and that creates unique challenges for education.

Show Notes

ChatGPT is the best known of the large language models (LLMs), but what is an LLM? We go deep into how this remarkable new technology is built and why its performance is inconsistent, or jagged, across similar tasks. We dive into the techniques AI engineers use to align these tools’ behavior with our values, and explain why those techniques don’t always work and why we sometimes get hallucinations or biased output.

This episode was produced by Steven Jackson and Jesse Dukes.

Editing: Alexandra Salomon and Ruxandra Guidi

Reporting and research from Holly McDede, Natasha Esteves, Andrew Parsons, Andrew Meriwether, Marnette Federis, and Chris Bagg.

Sound design and music supervision by Steven Jackson.

Production assistance from Yebu Ji and Nathan Ray.

Data analysis from Manee Ngozi Nnamani and Manasa Kudumu.

Special thanks to Josh Sheldon, Camila Lee, Liz Hutner, and Eric Klopfer.

Administrative support from Jessica Rondon.

The research and reporting you heard in this episode were supported by the Spencer Foundation, the Kapor Foundation, the Jameel World Education Lab, the Social and Ethical Responsibility of Computing initiative at MIT, and the RAISE initiative (Responsible AI for Social Empowerment and Education), also at MIT.

We had support from Google’s Academic Research Awards program.

The Homework Machine is a program of the MIT Teaching Systems Lab, Justin Reich, director.