Retrieval Augmented Generation (RAG) has revolutionized open-domain question answering, enabling systems to give human-like responses to a wide range of queries. At the heart of RAG lies a retrieval module that scans a vast corpus to find relevant context passages, which are then processed by a neural generative module, often a pre-trained language model like GPT-3, to formulate a final answer.
While this approach has been highly effective, it’s not without its limitations.
One of its critical components, the vector search over embedded passages, has inherent constraints that can hamper the system’s ability to reason in a nuanced manner. This is especially evident when questions require complex multi-hop reasoning across multiple documents.
Vector search means finding information by comparing vector representations of data. It involves two key steps:
1. Encoding data into vectors
First, the information being searched is encoded into numeric vector representations. For text data like passages or documents, this is done using embedding models like BERT or RoBERTa. These models convert text into dense vectors of continuous numbers that represent its semantic meaning. Images, audio, and other formats can also be encoded into vectors using appropriate deep learning models.
2. Searching using vector similarity
Once data is encoded into vectors, searching involves finding the vectors most similar to the vector representation of the search query. This relies on metrics like cosine similarity to quantify how close two vectors are and to rank results. The vectors with the smallest distance (highest similarity) are returned as the most relevant search hits.
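The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production retriever: the three-dimensional vectors are hand-written stand-ins for what an embedding model would produce, and the names `cosine_similarity`, `search`, `index`, and `query_vec` are hypothetical. A real system would use vectors with hundreds of dimensions and usually an approximate nearest-neighbor index rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Score every stored vector against the query and return the top_k hits."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return scored[:top_k]


# Step 1 (assumed): passages already encoded by some embedding model.
# These toy vectors are made up for illustration only.
index = {
    "How to train a puppy": [0.9, 0.1, 0.0],
    "Dog obedience basics": [0.8, 0.2, 0.1],
    "Quarterly tax filing guide": [0.0, 0.1, 0.9],
}

# Step 2: embed the query the same way (toy vector again), then rank by similarity.
query_vec = [0.85, 0.15, 0.05]  # stands in for "teaching my dog to sit"

results = search(query_vec, index)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Note that the query shares no keywords with "How to train a puppy", yet the two dog-related passages rank above the tax guide because their toy vectors point in a similar direction.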
The key advantage of vector search is the ability to search for semantic similarity, not only literal keyword matches. The vector representations capture conceptual meaning, allowing relevant yet linguistically distinct results to be identified. This yields better search quality than traditional keyword matching.
However, transforming data into vectors and searching in a high-dimensional semantic space also comes with limitations. Balancing the tradeoffs of vector search is an active area of research.
In this article, we’ll dissect the limitations of vector search, exploring why it struggles to…