A recipe for using embeddings and LLMs to build a high-quality eval dataset for a RAG system.
RAG course: www.jostai.com/p/local-rag-build-your-own-ai-assis…
Slides: docs.google.com/presentation/d/1mLQRpg1vP0Rtvr4BdG…
Chapters
00:00 - Intro
00:50 - RAG overview
02:53 - Challenges of optimizing a RAG system
03:57 - Why we need a good eval dataset
04:36 - Challenges of getting a good dataset
04:53 - The recipe overview
05:23 - Clustering
07:00 - Sampling
07:57 - LLM for Q/A pair generation
09:52 - Using generated data to evaluate retrieval quality
11:19 - LLM as a Virtual Judge
12:19 - Evaluating RAG answer quality with a Virtual Judge
13:16 - Summary
13:52 - Using metadata for fine-grained evaluation
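
A rough sketch of the clustering, sampling, and retrieval-evaluation steps named in the chapters, using only the Python standard library. This is not the code from the video. The tiny k-means, the function names (kmeans, sample_per_cluster, hit_rate_at_k), the toy 2-D "embeddings", and the hit-rate metric are all assumptions made for illustration. The LLM step that turns each sampled chunk into a Q/A pair is left out, since it depends on the model and prompt you use.

```python
import math
import random
from collections import defaultdict


def kmeans(vectors, k, iters=20):
    """Tiny k-means. Assumption: the first k vectors seed the centroids,
    which keeps the result deterministic for this sketch."""
    centroids = [tuple(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(tuple(v), centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def sample_per_cluster(labels, n_per_cluster, seed=0):
    """Pick up to n_per_cluster chunk indices from every cluster so the
    eval set covers each topic instead of only the largest one."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for idx, label in enumerate(labels):
        by_cluster[label].append(idx)
    picked = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        picked.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return picked


def hit_rate_at_k(eval_pairs, retrieve, k):
    """Fraction of questions whose source chunk id appears in the top-k
    results. `retrieve` is a hypothetical callable: question -> ranked ids."""
    hits = sum(1 for question, source_id in eval_pairs
               if source_id in retrieve(question)[:k])
    return hits / len(eval_pairs)


# Toy 2-D vectors standing in for real chunk embeddings.
embeddings = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.2),
              (10.1, 9.9), (0.2, 0.1), (9.8, 10.2)]
labels = kmeans(embeddings, k=2)
sampled = sample_per_cluster(labels, n_per_cluster=1)
# Each sampled chunk would next go to an LLM to produce a (question, answer)
# pair; that step is not shown here.
```

Once the LLM has produced questions for the sampled chunks, hit_rate_at_k can score any retriever by checking whether the chunk a question was generated from comes back in its top-k results.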