Using LLM’s for Retrieval and Reranking
Summary
Introduction and Background
LLM Retrieval and Reranking
Initial Experimental Results
Conclusion

LlamaIndex Blog

Summary

This blog post outlines a few of the core abstractions we've created in LlamaIndex around LLM-powered retrieval and reranking, which help improve document retrieval beyond naive top-k embedding-based lookup.

LLM-powered retrieval can return more relevant documents than embedding-based retrieval, with the tradeoff being much higher latency and cost. We show how using embedding-based retrieval as a first-stage pass and LLM-based retrieval as a second-stage reranking step can provide a happy medium. We present results over the Great Gatsby and the Lyft SEC 10-k.

Two-stage retrieval pipeline: 1) Top-k embedding retrieval, then 2) LLM-based reranking

Introduction and Background

There has been a wave of "build a chatbot over your data" applications in the past few months, made possible with frameworks like LlamaIndex and LangChain. Many of these applications use a standard stack for retrieval augmented generation (RAG):


  • Use a vector store to store unstructured documents (knowledge corpus)
  • Given a query, use a retrieval model to retrieve relevant documents from the corpus, and a synthesis model to generate a response.
  • The retrieval model fetches the top-k documents by embedding similarity to the query.

In this stack, the retrieval model is not a novel idea; the concept of top-k embedding-based semantic search has been around for at least a decade, and doesn't involve the LLM at all.

There are plenty of benefits to embedding-based retrieval:

  • It's very fast to compute dot products, and it doesn't require any model calls during query time.
  • Even if not perfect, embeddings can encode the semantics of the document and query reasonably well. There's a class of queries where embedding-based retrieval returns very relevant results.
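To make the first point concrete, here is a minimal sketch of brute-force embedding retrieval: score every document vector against the query vector with a dot product and keep the top-k. The toy 3-dimensional vectors and document names are made up for illustration; real embeddings are on the order of a thousand dimensions.

```python
# Brute-force embedding retrieval sketch: dot products + a sort, no model calls.

def dot(a, b):
    # Dot product of two equal-length vectors.
    return sum(x * y for x, y in zip(a, b))

def top_k(query_vec, doc_vecs, k=2):
    # Rank document ids by dot-product similarity to the query, highest first.
    scored = sorted(
        ((dot(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()),
        reverse=True,
    )
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical corpus of three "embedded" documents.
doc_vecs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.8, 0.1],
    "doc_c": [0.2, 0.2, 0.7],
}
print(top_k([1.0, 0.0, 0.0], doc_vecs, k=2))  # → ['doc_a', 'doc_c']
```

This whole step is pure arithmetic, which is why it scales to millions of documents with approximate nearest-neighbor indexes.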

Yet for a variety of reasons, embedding-based retrieval can be imprecise and return context irrelevant to the query, which in turn degrades the quality of the overall RAG system, regardless of the quality of the LLM.

This is also not a new problem: one approach to resolving this in existing IR and recommendation systems is to create a two-stage process. The first stage uses embedding-based retrieval with a high top-k value to maximize recall while accepting a lower precision. Then the second stage uses a slightly more computationally expensive process that is higher precision and lower recall (for instance with BM25) to "rerank" the existing retrieved candidates.
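The two-stage pattern above can be sketched generically. This is a hedged illustration, not LlamaIndex's actual implementation: a cheap scoring function over-retrieves candidates, and an expensive one rescores only those candidates. The `word_overlap` scorer is a toy stand-in for both embedding similarity and LLM/BM25 relevance.

```python
# Generic two-stage retrieval sketch: cheap recall pass, expensive precision pass.

def two_stage_retrieve(query, corpus, first_stage_score, rerank_score,
                       first_k=10, final_n=3):
    # Stage 1: rank the whole corpus with the cheap scorer, keep a generous top-k.
    candidates = sorted(corpus, key=lambda d: first_stage_score(query, d),
                        reverse=True)[:first_k]
    # Stage 2: rescore only the surviving candidates with the expensive scorer.
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)[:final_n]

# Toy scorer standing in for both stages: count shared words.
def word_overlap(query, doc):
    return len(set(query.split()) & set(doc.split()))

corpus = ["the car crash", "the green light", "daisy drove the car"]
print(two_stage_retrieve("who drove the car", corpus,
                         word_overlap, word_overlap,
                         first_k=2, final_n=1))  # → ['daisy drove the car']
```

The point of the structure is that the expensive scorer only ever sees `first_k` documents, no matter how large the corpus is.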

Covering the downsides of embedding-based retrieval is worth an entire series of blog posts. This blog post is an initial exploration of an alternative retrieval method and how it can (potentially) augment embedding-based retrieval methods.

LLM Retrieval and Reranking

Over the past week, we've developed a variety of initial abstractions around the concept of "LLM-based" retrieval and reranking. At a high level, this approach uses the LLM to decide which document(s) / text chunk(s) are relevant to the given query. The input prompt consists of a set of candidate documents, and the LLM is tasked with selecting the relevant set of documents as well as scoring their relevance with an internal metric.

Easy diagram of how LLM-based retrieval works

An example prompt would look like the following:

A list of documents is shown below. Each document has a number next to it along with a summary of the document. A question is also provided.
Respond with the numbers of the documents you should consult to answer the question, in order of relevance, as well
as the relevance score. The relevance score is a number from 1-10 based on how relevant you think the document is to the question.
Do not include any documents that are not relevant to the question.
Example format:
Document 1:
<summary of document 1>

Document 2:
<summary of document 2>

...

Document 10:
<summary of document 10>

Question: <question>
Answer:
Doc: 9, Relevance: 7
Doc: 3, Relevance: 4
Doc: 7, Relevance: 3
Let's try this now:
{context_str}
Question: {query_str}
Answer:
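Since the LLM's answer follows the `Doc: N, Relevance: M` format from the prompt, it has to be parsed back into structured scores. The following parser is just an illustration of that step (a hypothetical helper, not part of the library's API); malformed lines are simply skipped.

```python
import re

# Parse "Doc: 9, Relevance: 7"-style lines from an LLM choice-select answer
# into (doc_number, relevance_score) pairs, in the order the LLM listed them.
CHOICE_PATTERN = re.compile(r"Doc:\s*(\d+),\s*Relevance:\s*(\d+)")

def parse_choice_response(answer: str):
    return [(int(m.group(1)), int(m.group(2)))
            for m in CHOICE_PATTERN.finditer(answer)]

print(parse_choice_response("Doc: 9, Relevance: 7\nDoc: 3, Relevance: 4"))
# → [(9, 7), (3, 4)]
```

Keeping the output format this rigid is what makes the reranking step machine-readable at all; a looser prompt would force fuzzier parsing.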

The prompt format implies that the text for each document should be relatively concise. There are two ways of feeding the text corresponding to each document into the prompt:

  • You can directly feed in the raw text corresponding to the document. This works well if the document corresponds to a bite-sized text chunk.
  • You can feed in a condensed summary for each document. This is preferred if the document itself corresponds to a long piece of text. We do this under the hood with our new document summary index, but you can also choose to do it yourself.

Given a collection of documents, we can then create document "batches" and send each batch into the LLM input prompt. The output of each batch would be the set of relevant documents + relevance scores within that batch. The final retrieval response would aggregate relevant documents from all batches.
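The batching scheme just described can be sketched as follows. This is a simplified stand-in, not LlamaIndex's internals: `score_batch` represents one LLM call per batch, returning `(doc_id, relevance)` pairs for only the relevant documents in that batch, and the final step merges all batches into one ranking.

```python
# Sketch of batched LLM retrieval: one "LLM call" per fixed-size batch,
# then a merge of all per-batch relevance scores into one global ranking.

def batched(items, batch_size):
    # Yield successive fixed-size slices of `items`.
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def retrieve_with_batches(docs, score_batch, batch_size=5):
    results = []
    for batch in batched(docs, batch_size):
        # score_batch stands in for one LLM prompt over this batch.
        results.extend(score_batch(batch))
    # Aggregate: sort relevant docs from all batches by relevance score.
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Toy scorer: treat odd-numbered docs as relevant, with score = doc id.
ranking = retrieve_with_batches(list(range(1, 7)),
                                lambda batch: [(d, d) for d in batch if d % 2],
                                batch_size=3)
print(ranking)  # → [(5, 5), (3, 3), (1, 1)]
```

Note that the merge step just sorts per-batch scores against each other; this is exactly the batch-independence assumption discussed as a caveat later in the post.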

You can use our abstractions in two forms: as a standalone retriever module (ListIndexLLMRetriever) or a reranker module (LLMRerank). The remainder of this blog primarily focuses on the reranker module given the speed/cost tradeoff.

LLM Retriever (ListIndexLLMRetriever)

This module is defined over a list index, which simply stores a set of nodes as a flat list. You can build the list index over a set of documents and then use the LLM retriever to retrieve the relevant documents from the index.

from llama_index import GPTListIndex
from llama_index.indices.list.retrievers import ListIndexLLMRetriever

index = GPTListIndex.from_documents(documents, service_context=service_context)

# high-level API
query_str = "What did the author do during his time in college?"
retriever = index.as_retriever(retriever_mode="llm")
nodes = retriever.retrieve(query_str)

# lower-level API
retriever = ListIndexLLMRetriever()
response_synthesizer = ResponseSynthesizer.from_args()
query_engine = RetrieverQueryEngine(retriever=retriever, response_synthesizer=response_synthesizer)
response = query_engine.query(query_str)

This could potentially be used in place of our vector store index: instead of embedding-based lookup, you use the LLM to select the nodes.

LLM Reranker (LLMRerank)

This module is defined as part of our NodePostprocessor abstraction, which handles second-stage processing after an initial retrieval pass.

The postprocessor can be used on its own or as part of a RetrieverQueryEngine call. In the example below we show how to use the postprocessor as an independent module after an initial retriever call from a vector index.

from llama_index.indices.query.schema import QueryBundle

query_bundle = QueryBundle(query_str)
# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=vector_top_k,
)
retrieved_nodes = retriever.retrieve(query_bundle)
# configure reranker
reranker = LLMRerank(choice_batch_size=5, top_n=reranker_top_n, service_context=service_context)
retrieved_nodes = reranker.postprocess_nodes(retrieved_nodes, query_bundle)

There are certain limitations and caveats to LLM-based retrieval, especially with this initial version.

  • LLM-based retrieval is orders of magnitude slower than embedding-based retrieval. Embedding search over thousands or even millions of embeddings can take less than a second. A single LLM prompt of 4000 tokens to OpenAI can take minutes to complete.
  • Using third-party LLM APIs costs money.
  • The current approach to batching documents may not be optimal, since it relies on the assumption that document batches can be scored independently of each other. This lacks a global view of the ranking across all documents.
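A quick back-of-envelope calculation makes the cost asymmetry concrete. With a choice batch size of b documents per prompt, reranking k first-stage candidates takes ceil(k / b) LLM calls, while LLM-only retrieval over a corpus of N chunks takes ceil(N / b) calls. The corpus size of 10,000 below is a made-up example, not a number from our experiments.

```python
import math

# Number of LLM prompts needed to score `num_docs` documents when each
# prompt holds `choice_batch_size` candidate documents.
def num_llm_calls(num_docs, choice_batch_size=5):
    return math.ceil(num_docs / choice_batch_size)

print(num_llm_calls(40))      # rerank 40 first-stage candidates → 8 calls
print(num_llm_calls(10_000))  # LLM-only retrieval over 10k chunks → 2000 calls
```

Eight calls per query is already slow and non-trivially expensive; two thousand is clearly a non-starter, which motivates the hybrid design below.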

Using the LLM to retrieve and rank every node in the document corpus can be prohibitively expensive. This is why using the LLM as a second-stage reranking step, after a first-stage embedding pass, can be helpful.

Let's take a look at how well LLM reranking works!

Initial Experimental Results

We show some comparisons between naive top-k embedding-based retrieval and the two-stage retrieval pipeline with a first-stage embedding-retrieval filter and second-stage LLM reranking. We also showcase some results of pure LLM-based retrieval (though we don't showcase as many results, given that it tends to run much slower than either of the first two approaches).

We analyze results over two very different sources of data: the Great Gatsby and the 2021 Lyft SEC 10-k. We only analyze results over the "retrieval" portion and not synthesis, to better isolate the performance of different retrieval methods.

The results are presented in a qualitative fashion. A next step would definitely be more comprehensive evaluation over an entire dataset!

Great Gatsby

In our first example, we load in the Great Gatsby as a Document object, and build a vector index over it (with the chunk size set to 512).

# LLM Predictor (gpt-3.5-turbo) + service context
llm_predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo"))
service_context = ServiceContext.from_defaults(llm_predictor=llm_predictor, chunk_size_limit=512)
# load documents
documents = SimpleDirectoryReader('../../../examples/gatsby/data').load_data()
index = GPTVectorStoreIndex.from_documents(documents, service_context=service_context)

We then define a get_retrieved_nodes function — this function can either do just embedding-based retrieval over the index, or embedding-based retrieval + reranking.

def get_retrieved_nodes(
    query_str, vector_top_k=10, reranker_top_n=3, with_reranker=False
):
    query_bundle = QueryBundle(query_str)
    # configure retriever
    retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=vector_top_k,
    )
    retrieved_nodes = retriever.retrieve(query_bundle)
    if with_reranker:
        # configure reranker
        reranker = LLMRerank(choice_batch_size=5, top_n=reranker_top_n, service_context=service_context)
        retrieved_nodes = reranker.postprocess_nodes(retrieved_nodes, query_bundle)
    return retrieved_nodes

We then ask some questions. With embedding-based retrieval we set k=3. With two-stage retrieval we set k=10 for embedding retrieval and n=3 for LLM-based reranking.

(For those of you who are not familiar with the Great Gatsby, the narrator later finds out from Gatsby that Daisy was actually the one driving the car, but Gatsby takes the blame for her.)

The top retrieved contexts are shown in the images below. We see that with embedding-based retrieval, the top two texts contain semantics of the car crash but give no details as to who was actually responsible. Only the third text contains the correct answer.

Retrieved context using top-k embedding lookup (baseline)

In contrast, the two-stage approach returns just one relevant context, and it contains the correct answer.

Retrieved context using two-stage pipeline (embedding lookup then rerank)

2021 Lyft SEC 10-K

We want to ask some questions over the 2021 Lyft SEC 10-K, specifically regarding the COVID-19 impacts and responses. The Lyft SEC 10-K is 238 pages long, and a ctrl-f for "COVID-19" returns 127 matches.

We use a similar setup to the Gatsby example above. The main differences are that we set the chunk size to 128 instead of 512, we set k=5 for the embedding retrieval baseline, and we use an embedding k=40 and reranker n=5 for the two-stage approach.

We then ask the following questions and analyze the results.

Results for the baseline are shown in the image above. We see that the results corresponding to indices 0, 1, 3, and 4 are about measures taken directly in response to COVID-19, even though the query was specifically about company initiatives that were independent of the COVID-19 pandemic.

Retrieved context using top-k embedding lookup (baseline)

We get more relevant results with approach 2, by widening the top-k to 40 and then using the LLM to filter for the top-5 contexts. The independent company initiatives include "expansion of Light Vehicles" (1), "incremental investments in brand/marketing" (2), international expansion (3), and accounting for misc. risks such as natural disasters and operational risks in terms of financial performance (4).

Retrieved context using two-stage pipeline (embedding lookup then rerank)

Conclusion

That's it for now! We've added some initial functionality to help support LLM-augmented retrieval pipelines, but of course there are a ton of future steps that we couldn't quite get to. Some questions we'd love to explore:

  • How our LLM reranking implementation compares to other reranking methods (e.g. BM25, Cohere Rerank, etc.)
  • What the optimal values of embedding top-k and reranking top-n are for the two-stage pipeline, accounting for latency, cost, and performance.
  • Exploring different prompts and text summarization methods to help determine document relevance
  • Exploring whether there's a class of applications where LLM-based retrieval on its own would suffice, without embedding-based filtering (maybe over smaller document collections?)

Resources

You can play around with the notebooks yourself!

Great Gatsby Notebook

2021 Lyft 10-K Notebook
