This is a handout I developed for a presentation about Transformers, in which I hoped to demonstrate the connection between the conceptual elements of the model architecture and the fundamental mathematical operations behind it. I developed the handout with reference to the minGPT model repo and The Illustrated Transformer.

This graphic was made using TikZ in LaTeX.