In this article, I'll show you how to create your own RAG dataset, consisting of contexts, questions, and answers, from documents in any language.
Retrieval-Augmented Generation (RAG) [1] is a technique that gives LLMs access to an external knowledge base.
By uploading PDF files and storing them in a vector database, we can retrieve this data via a vector similarity search and then insert the retrieved text into the LLM prompt as additional context.
This provides the LLM with up-to-date knowledge and reduces the chance that it makes up facts (hallucinations).
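
To make the retrieve-then-insert step concrete, here is a minimal sketch in plain Python. It is not the pipeline used in this article: the `embed` function is a toy word-count stand-in for a real embedding model, the in-memory list stands in for a vector database, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` as well as the sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Insert the retrieved chunks into the prompt as additional context."""
    context = "\n".join(retrieve(question, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical document chunks, as they might come out of a PDF.
chunks = [
    "The warranty covers battery defects for two years.",
    "Shipping takes five business days within Europe.",
]
prompt = build_prompt("How long is the battery warranty?", chunks)
print(prompt)
```

In a real setup, the word counts would be replaced by dense vectors from an embedding model and the sorted list by a vector database query, but the flow stays the same: embed the question, find the nearest chunks, and prepend them to the prompt.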