How to Create a RAG Evaluation Dataset From Documents


Automatically create domain-specific datasets in any language using LLMs

Our automatically generated RAG evaluation dataset on the Hugging Face Hub, shown as a Hugging Face dataset card (PDF input file from the European Union, licensed under CC BY 4.0). Image by the author

In this article, I'll show you how to create your own RAG dataset consisting of contexts, questions, and answers from documents in any language.
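To make the goal concrete before we dive in, here is a minimal sketch of the core idea: ask an LLM to produce a question-answer pair for a given document chunk, so that each dataset entry consists of a context, a question, and an answer. The OpenAI client, model name, and prompt wording are my own assumptions for illustration, not necessarily the exact setup used later in the article.

```python
# Minimal sketch: generate one (question, answer) pair grounded in a context chunk.
# Assumptions: the openai Python package (>=1.0) and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()


def generate_qa_pair(context: str, language: str = "English") -> str:
    """Ask the LLM for one factual question and its answer based only on the context."""
    prompt = (
        f"Read the following context and write one factual question about it "
        f"and its answer in {language}.\n"
        f"Format:\nQuestion: ...\nAnswer: ...\n\n"
        f"Context:\n{context}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# Each dataset entry is then (context, generated question, generated answer).
print(generate_qa_pair("The EU AI Act entered into force in August 2024."))
```

Because the question and answer are generated in whatever language you request, the same loop over your document chunks works for any language.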

Retrieval-Augmented Generation (RAG) [1] is a technique that enables LLMs to access an external knowledge base.

By uploading PDF files and storing them in a vector database, we can retrieve this data via a vector similarity search and then insert the retrieved text into the LLM prompt as additional context.

This provides the LLM with new knowledge and reduces the chance of the LLM making up facts (hallucinations).
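The retrieval step can be sketched in a few lines. The snippet below is a simplified illustration of the pipeline in the figure, assuming sentence-transformers as the encoder model and a plain in-memory numpy index in place of a real vector database; the model name and example texts are placeholders, not from the article.

```python
# Minimal sketch of RAG retrieval: embed chunks, find top-k by similarity, build the prompt.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical encoder choice

# Document storage: text chunks -> encoder model -> vector "database"
chunks = [
    "The EU AI Act entered into force in August 2024.",
    "RAG retrieves external context and adds it to the LLM prompt.",
]
chunk_vectors = encoder.encode(chunks, normalize_embeddings=True)

# LLM prompting: user query -> encoder model -> top-k relevant chunks
query = "When did the EU AI Act enter into force?"
query_vector = encoder.encode([query], normalize_embeddings=True)[0]

scores = chunk_vectors @ query_vector        # cosine similarity (vectors are normalized)
top_k = np.argsort(scores)[::-1][:1]         # indices of the most similar chunks
context = "\n".join(chunks[i] for i in top_k)

prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be passed to the generator LLM
```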

The basic RAG pipeline. For document storage: input documents → text chunks → encoder model → vector database. For LLM prompting: user query → encoder model → vector database → top-k relevant chunks → generator LLM, which answers the query using the retrieved context. Image by the author from the article "How to Build a Local Open-Source LLM Chatbot With RAG"

However, there are many parameters we need to set in a RAG pipeline, and researchers are constantly suggesting new improvements. How do we know which parameters to choose and which methods will actually improve performance for our particular use case?

This is why we need a validation/dev/test dataset to evaluate our RAG pipeline. The dataset should be from the domain we are interested…
