Retrieval Augmented Generation from Scratch Tutorial

Most Retrieval Augmented Generation (RAG) tutorials hand you a framework, tell you to call three functions, and say "congrats, you built a RAG system." You didn't build anything. You assembled someone else's IKEA furniture without understanding why the shelf holds weight.

I have been wanting to understand RAG at a lower level, so I spent a couple of days working through the process (with the help of Claude Code) and created a Jupyter notebook to capture what I learned and exercise the concepts.

The tutorial repo is at https://github.com/joescharf/rag-tutorial. It includes a Jupyter notebook that walks through every layer of a Retrieval-Augmented Generation pipeline, plus a CLI tool and library that implement the full pipeline in a small Python script. I also presented the tutorial to a team of developers I have been helping adopt AI development tools.

Topics covered:

  • Embeddings
  • Vector search and cosine similarity
  • Document chunking
  • Indexing and retrieval with ChromaDB
  • Wiring it all together with generation
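To give a taste of the first two topics: at the core of vector search is cosine similarity between embedding vectors, which fits in a few lines of plain Python. This is a simplified sketch with toy 3-dimensional vectors — the notebook uses a real embedding model, which produces vectors with hundreds of dimensions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction,
    0.0 means orthogonal (unrelated, for embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- identical vectors score 1.0, orthogonal ones 0.0.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # -> 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # -> 0.0
```

Retrieval is then just "embed the query, score it against every chunk, return the top matches" — the vector database exists to avoid doing that scoring by brute force.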

What's in it:

  • A Jupyter notebook that builds up a RAG system piece by piece, with
    explanations of why each piece exists, not just how to wire it up
  • A reusable Python CLI and library so you can run it against your
    own documents when you're done learning
  • A discussion of ChromaDB internals and the HNSW search algorithm,
    for a better understanding of why RAG needs a vector database
  • Debugging strategies for when retrieval goes wrong

If you want to actually understand how RAG works — not just use it — this
might be worth an afternoon.

Check it out: https://github.com/joescharf/rag-tutorial