Exploring the advanced RAG pipeline
Advanced RAG introduces a number of targeted improvements to address the issues highlighted in naïve RAG. In other words, it modifies individual components of the pipeline to optimize the RAG paradigm. These modifications fall into two stages: pre-retrieval and post-retrieval.
In the pre-retrieval process, the goal is to optimize indexing and querying. For example, adding metadata enables more granular searching and gives the LLM more context from which to generate text. Metadata can also succinctly capture information that would otherwise be dispersed throughout the document.
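To make the role of metadata concrete, here is a minimal sketch of metadata-filtered retrieval. It is an illustration under stated assumptions, not a reference implementation: the `Chunk` class, the `search` function, and the `year` and `section` metadata keys are all hypothetical, and the bag-of-words cosine similarity is only a stand-in for a real embedding model.

```python
from collections import Counter
from dataclasses import dataclass, field
import math


@dataclass
class Chunk:
    """A piece of text plus the metadata attached at indexing time (hypothetical schema)."""
    text: str
    metadata: dict = field(default_factory=dict)


def bow_cosine(a: str, b: str) -> float:
    """Toy similarity: cosine over word counts, standing in for embedding similarity."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def search(chunks, query, filters=None, top_k=2):
    """Keep only chunks whose metadata matches every filter, then rank by similarity."""
    filters = filters or {}
    candidates = [
        c for c in chunks
        if all(c.metadata.get(key) == value for key, value in filters.items())
    ]
    ranked = sorted(candidates, key=lambda c: bow_cosine(query, c.text), reverse=True)
    return ranked[:top_k]


# Two chunks share identical text; only the metadata tells them apart.
chunks = [
    Chunk("Revenue grew in the third quarter", {"year": 2023, "section": "finance"}),
    Chunk("Revenue grew in the third quarter", {"year": 2021, "section": "finance"}),
    Chunk("The team hired five engineers", {"year": 2023, "section": "hr"}),
]

results = search(chunks, "revenue growth third quarter", filters={"year": 2023})
for chunk in results:
    print(chunk.metadata, chunk.text)
```

The first two chunks cannot be distinguished by similarity alone, because their text is the same. The `year` filter removes the 2021 chunk before ranking, which is the kind of granularity that metadata adds.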
In naïve RAG, we divide each document into chunks and then search those chunks for the ones relevant to the query. This approach has two limitations:
- When we have many documents, latency and performance suffer
- When the documents are large, it becomes harder to find the relevant chunks
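The two limitations above are easier to see in a small sketch of flat, single-level retrieval. The code below makes several simplifying assumptions: chunks are fixed-size groups of words, relevance is a plain word-overlap count, and the names `split_into_chunks`, `overlap_score`, and `naive_retrieve` are hypothetical helpers rather than part of any library.

```python
def split_into_chunks(text, chunk_size=5):
    """Split a document into fixed-size chunks of `chunk_size` words (an assumed strategy)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def overlap_score(query, chunk):
    """Toy relevance score: number of distinct words shared by the query and the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def naive_retrieve(documents, query, top_k=1):
    """Score every chunk of every document against the query (flat search)."""
    all_chunks = [
        (doc_id, chunk)
        for doc_id, text in documents.items()
        for chunk in split_into_chunks(text)
    ]
    scored = [(overlap_score(query, chunk), doc_id, chunk) for doc_id, chunk in all_chunks]
    scored.sort(key=lambda item: item[0], reverse=True)
    # Return the best chunks and how many chunks had to be scored to find them.
    return scored[:top_k], len(all_chunks)


documents = {
    "a": "cats sleep most of the day and hunt at night",
    "b": "the stock market closed higher after the earnings report today",
}

top, n_scored = naive_retrieve(documents, "earnings report")
print(top, n_scored)
```

Every query scores all chunks, so the work grows with both the number of documents and their length. Adding documents increases `n_scored` directly, and longer documents add more chunks that compete with the relevant one.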
In naï...