Retrieval Augmented Generation from Scratch Tutorial
Most Retrieval Augmented Generation (RAG) tutorials hand you a framework, tell you to call three functions, and say "congrats, you built a RAG system." You didn't build anything. You assembled someone else's IKEA furniture without understanding why the shelf holds weight.
I had been wanting to understand RAG concepts at a lower level, so I took a couple of days, worked through the process (with the help of Claude Code), and created a Jupyter notebook to capture what I learned and try the concepts out.
The tutorial repo is at https://github.com/joescharf/rag-tutorial. It includes a Jupyter notebook that walks you through every layer of a Retrieval-Augmented Generation pipeline, plus a CLI tool and library that implement the full RAG pipeline in a small Python script. I also presented the tutorial to a team of developers I have been helping adopt AI development tools.
Topics covered:
- Embeddings
- Vector search & cosine similarity
- Document chunking
- Indexing and retrieval with ChromaDB
- Wiring it all up with generation
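To give a flavor of the vector-search piece: cosine similarity just measures how closely two embedding vectors point in the same direction. This isn't code from the repo, just a minimal pure-Python sketch with toy 3-dimensional vectors (real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- values chosen by hand for illustration only.
doc = [0.2, 0.8, 0.1]
query = [0.25, 0.75, 0.05]      # semantically "close" to doc
unrelated = [0.9, 0.05, 0.9]    # points in a different direction

print(cosine_similarity(doc, query))      # near 1.0
print(cosine_similarity(doc, unrelated))  # noticeably lower
```

Retrieval is then just "embed the query, rank the document chunks by this score, and take the top few" -- the vector database exists to make that ranking fast at scale.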
What's in it:
- A Jupyter notebook that builds up a RAG system piece by piece, with explanations of why each piece exists, not just how to wire it up
- A reusable Python CLI and library so you can run it against your own documents when you're done learning
- A discussion of ChromaDB internals and the HNSW search algorithm, to get a better understanding of why a vector database is needed for RAG to work
- Debugging strategies for when retrieval has problems
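Chunking is one of the steps that most affects retrieval quality, so it's worth seeing how simple the basic version is. This is not the repo's implementation, just a sketch of fixed-size character chunking with overlap (the overlap keeps a sentence that straddles a boundary intact in at least one chunk); the chunk sizes are arbitrary defaults:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size chunks; consecutive chunks share
    `overlap` characters so boundary-straddling content isn't lost."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

chunks = chunk_text("some long document text " * 40)
print(len(chunks), "chunks; first 40 chars:", chunks[0][:40])
```

Real pipelines usually chunk on sentence or paragraph boundaries instead of raw character counts, which is exactly the kind of trade-off the debugging section is for: bad chunk boundaries are a common reason retrieval returns the wrong passage.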
If you want to actually understand how RAG works — not just use it — this
might be worth an afternoon.
Check it out: https://github.com/joescharf/rag-tutorial