Retrieval Augmented Generation from Scratch Tutorial
Most Retrieval Augmented Generation (RAG) tutorials hand you a framework, tell you to call three functions, and say "congrats, you built a RAG system." You didn't build anything. You assembled someone else's IKEA furniture without understanding why the shelf holds weight.
I had been wanting to understand RAG concepts at a lower level, so I took a couple of days, worked through the process (with the help of Claude Code), and created a Jupyter notebook to capture what I learned and try the concepts out.
The tutorial repo is at https://github.com/joescharf/rag-tutorial. It includes a Jupyter notebook that walks you through every layer of a Retrieval-Augmented Generation pipeline, plus a CLI tool and library that implement the full RAG pipeline in a small Python script. I also presented the tutorial to a team of developers I have been helping adopt AI development tools.
Topics covered:
- Embeddings
- Vector search & cosine similarity
- Document chunking
- Indexing and retrieval with ChromaDB
- Wiring it all up with generation
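To give a flavor of the vector-search piece: cosine similarity just measures how closely two embedding vectors point in the same direction. This isn't code from the repo, just a minimal pure-Python sketch with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- values chosen by hand for illustration only.
doc = [0.2, 0.8, 0.1]
query = [0.25, 0.75, 0.05]      # semantically "close" to doc
unrelated = [0.9, 0.05, 0.9]    # points in a different direction

print(cosine_similarity(doc, query))      # near 1.0
print(cosine_similarity(doc, unrelated))  # noticeably lower
```

Retrieval is then just "embed the query, rank the document chunks by this score, and take the top few" -- the vector database exists to make that ranking fast at scale.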
What's in it:
- A Jupyter notebook that builds up a RAG system piece by piece, with explanations of why each piece exists, not just how to wire it up
- A reusable Python CLI and library so you can run it against your own documents when you're done learning
- A discussion of ChromaDB internals and the HNSW search algorithm, to get a better understanding of why a vector database is needed for RAG to work
- Debugging strategies for when retrieval has problems
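Chunking is one of the steps that most affects retrieval quality, so it's worth seeing how simple the basic version is. This is not the repo's implementation, just a sketch of fixed-size character chunking with overlap (the overlap keeps a sentence that straddles a boundary intact in at least one chunk); the chunk sizes are arbitrary defaults:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so boundary-straddling content isn't lost."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = chunk_text("some long document text " * 40)
print(len(chunks), "chunks; first 40 chars:", chunks[0][:40])
```

Real pipelines usually chunk on sentence or paragraph boundaries instead of raw character counts, which is exactly the kind of trade-off the debugging section is for: bad chunk boundaries are a common reason retrieval returns the wrong passage.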
If you want to actually understand how RAG works — not just use it — this
might be worth an afternoon.
Check it out: https://github.com/joescharf/rag-tutorial