Building Retrieval-Augmented Generation systems, or RAGs, is easy. With tools like LlamaIndex or LangChain, you can get your RAG-based Large Language Model up and running very quickly. Sure, some engineering effort is needed to make the system efficient and scalable, but in principle, building the RAG is the easy part. What's much more difficult is designing it well.
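To illustrate just how quickly: with recent versions of LlamaIndex, a basic pipeline is a handful of lines. A minimal sketch, assuming your documents sit in a local `data/` folder and an OpenAI API key is set in the environment (which the default embedding and LLM models use); the folder name and question are placeholders:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load local files, embed and index them, then ask a question over them.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about pricing?"))
```

Five lines, and every one of them hides a design decision: how the documents are chunked, which embedding model is used, where the vectors live, how many chunks are retrieved, and how the answer is synthesized.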
Having recently gone through the process myself, I discovered how many big and small design choices need to be made for a Retrieval-Augmented Generation system. Each of them can potentially impact the performance, behavior, and cost of your RAG-based LLM, sometimes in non-obvious ways.
Without further ado, let me present this by no means exhaustive, yet hopefully useful, list of RAG design choices. Let it guide your design efforts.
Retrieval-Augmented Generation gives a chatbot access to external data so that it can answer users' questions based on this data rather than on general knowledge or its own dreamed-up hallucinations.
As such, RAG systems can become complex: we need to get the data, parse it into a chatbot-friendly format, make it available to and searchable by the LLM, and finally ensure the chatbot makes proper use of the data it has been given access to.
I like to think about RAG systems in terms of the components they are made of. There are five main pieces to the puzzle (a toy sketch wiring them together follows the list):
- Indexing: Embedding external data into a vector representation.
- Storing: Persisting the indexed embeddings in a database.
- Retrieval: Finding relevant pieces within the stored data.
- Synthesis: Generating answers to users' queries.
- Evaluation: Quantifying how good the RAG system is.
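To make the division concrete, here is a deliberately toy, framework-free sketch of how the first four components wire together. The bag-of-words `embed()` and the prompt string are stand-ins for a real embedding model and LLM, and the in-memory list stands in for a vector database; evaluation is only noted in a comment, since it needs reference answers to score against:

```python
import math
from collections import Counter

# 1. Indexing: a bag-of-words vector stands in for a neural embedding.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "RAG retrieves external data before the model answers.",
    "Evaluation quantifies how good the RAG system is.",
]

# 2. Storing: an in-memory list plays the role of a vector database.
store = [(doc, embed(doc)) for doc in documents]

# 3. Retrieval: rank stored documents by similarity to the query.
def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(store, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 4. Synthesis: a real system would send this prompt to an LLM.
def synthesize(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\nQuestion: {query}"

print(synthesize("What does evaluation do?"))
# 5. Evaluation: score the generated answers against reference answers,
#    e.g., for faithfulness to the retrieved context and relevance to the query.
```

Every function above is a seam where real systems diverge: swap the embedding, the store, the ranking, or the prompt, and you get a different RAG.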
In the rest of this article, we will go through the five RAG components one by one, discussing the design choices, their implications and trade-offs, and some useful resources to help you make the decision.