Large Language Models explained briefly

And then much less briefly

Nov 20, 2024

I recently made a short piece for the Computer History Museum, to be included in an exhibit they’re opening today about the history of chatbots. They wanted a short animated explainer to break down what a large language model is.

They asked back in January, and as many of you will know, I was already making some visual explainers of the architecture underlying large language models: Transformers. So while I usually shy away from commissions, this one fit in quite nicely. Plus, I love the Computer History Museum.

The larger videos I put out about transformers were targeted at a mildly technical audience, but some viewers told me that when they shared those videos with friends who were curious about the topic, the friends sometimes found them a bit heavy or confusing.

So for you, the loyal email-list-subscribing audience, I’m very curious to know if this is something you would find helpful for sharing with others you know for whom the phrase “Large Language Model” is a vague black box.

Also, I posted a second video today, on the second channel, offering another outlet for all these LLM and transformer animations made earlier this year. In July, I went to an event held by the company TNG in Munich and gave a talk breaking down how transformers and attention work.

For those who prefer explanations as a casual talk instead of a produced video, it offers an alternative to the additions made to the deep learning series.

3Blue1Brown mailing list
