Chatting with ChatGPT is fun and informative. I’ve been chit-chatting with it to pass the time and exploring some new ideas to learn. But these are more casual use cases, and the novelty can quickly wear off, especially once you realize that it can generate hallucinations.
How might we use it in a more productive way? With the recent release of the GPT 3.5 series API by OpenAI, we can do much more than simply chit-chat. One very productive use case for businesses and for personal use is QA (Question Answering) [1]. You can use it for customer support, synthesizing user research, personal knowledge management, and more!
In this article, I’ll explore how to build a Q&A chatbot based on your own data, including why some approaches won’t work, and a step-by-step guide to building a document Q&A chatbot efficiently with llama-index and the GPT API.
(If you only want to know how to build the Q&A chatbot, you can skip directly to the section “Building a document Q&A chatbot step-by-step”.)
My day job is as a product manager, and reading customer feedback and internal documents takes up a big chunk of my time. When ChatGPT came out, I immediately thought of using it as an assistant to help me synthesize customer feedback or find related old product documents about the feature I’m working on.
I first considered fine-tuning the GPT model with my own data to achieve this goal. But fine-tuning costs quite some money and requires a large dataset of examples. It’s also impractical to fine-tune every time the documents change. An even more crucial point is that fine-tuning simply CANNOT make the model “know” all the information in the documents; rather, it teaches the model a new skill. Therefore, for (multi-)document QA, fine-tuning isn’t the way to go.
The second approach that came to mind is prompt engineering: providing the context in the prompts. For example, instead of asking the question directly, I can prepend the original document content to the actual question. But the GPT model has a limited attention span; it can only take in a few thousand words in the prompt (about 4,000 tokens, or 3,000 words). It’s impossible to give it all the context in the prompt, given that we have thousands of customer feedback emails and hundreds of product documents. It’s also costly to pass a long context to the API, since the pricing is based on the number of tokens you use.
I will ask you questions based on the following context:
— Start of Context —
YOUR DOCUMENT CONTENT
— End of Context —
My question is: “What features do users want to see in the app?”
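Because the token limit is the binding constraint here, it helps to count a prompt’s tokens before sending it. Here is a minimal sketch using the tiktoken library; the helper name and the 4,000-token budget are my own illustration, not part of the original article:
import tiktoken

def fits_in_prompt(text, model="text-davinci-003", budget=4000):
    # Look up the tokenizer that matches the model (illustrative helper)
    encoding = tiktoken.encoding_for_model(model)
    # Count tokens and compare against the model's approximate context budget
    return len(encoding.encode(text)) <= budget

print(fits_in_prompt("My question is: what features do users want to see in the app?"))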
(If you want to learn more about fine-tuning and prompt engineering for GPT, you can read this article: https://medium.com/design-bootcamp/3-ways-to-tailor-foundation-language-models-like-gpt-for-your-business-e68530a763bd)
Because the prompt has a limit on the number of input tokens, I came up with the idea of first using an algorithm to search the documents and select the relevant excerpts, and then passing only these relevant contexts to the GPT model along with my questions. While I was researching this idea, I came across a library called gpt-index (since renamed to LlamaIndex), which does exactly what I wanted, and it is simple to use [2].
In the next section, I’ll give a step-by-step tutorial on using LlamaIndex and GPT to build a Q&A chatbot on your own data.
In this section, we will build a Q&A chatbot based on existing documents with LlamaIndex and GPT (text-davinci-003), so that you can ask questions about your documents and get answers from the chatbot, all in natural language.
Prerequisites
Before we start the tutorial, we need to prepare a few things:
- Your OpenAI API key, which can be found at https://platform.openai.com/account/api-keys.
- A database of your documents. LlamaIndex supports many different data sources like Notion, Google Docs, Asana, etc. [3]. For this tutorial, we’ll just use a simple text file for demonstration.
- A local Python environment or an online Google Colab notebook.
Workflow
The workflow is straightforward and takes only a few steps:
- Build an index of your document data with LlamaIndex
- Query the index with natural language
- LlamaIndex will retrieve the relevant parts and pass them to the GPT prompt
- Ask GPT with the relevant context and construct a response
What LlamaIndex does is convert your original document data into a vectorized index, which is very efficient to query. It will use this index to find the most relevant parts based on the similarity of the query and the data. Then it will plug what’s retrieved into the prompt it sends to GPT, so that GPT has the context for answering your question.
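Conceptually, the retrieval step works like the following sketch: embed the question and every document chunk, then keep the chunks whose vectors are most similar to the question. This is an illustration of the idea, not LlamaIndex’s actual code; the helper names and the use of OpenAI’s text-embedding-ada-002 endpoint (as it existed at the time of writing) are my own assumptions:
import numpy as np
import openai

def embed(text):
    # Turn a piece of text into an embedding vector (OpenAI embedding endpoint
    # as it existed at the time of writing)
    response = openai.Embedding.create(input=[text], model="text-embedding-ada-002")
    return np.array(response["data"][0]["embedding"])

def top_k_chunks(question, chunks, k=3):
    # Rank document chunks by cosine similarity to the question
    q = embed(question)
    scores = []
    for chunk in chunks:
        c = embed(chunk)
        scores.append(float(q @ c / (np.linalg.norm(q) * np.linalg.norm(c))))
    # Keep the k most similar chunks to feed into the GPT prompt
    best = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    return [chunks[i] for i in best]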
We’ll need to install the libraries first. Simply run the following commands in your terminal or in a Google Colab notebook. These commands will install both LlamaIndex and OpenAI.
!pip install llama-index
!pip install openai
Next, we’ll import the libraries in Python and set up your OpenAI API key in a new .py file.
# Import necessary packages
from llama_index import GPTSimpleVectorIndex, Document, SimpleDirectoryReader
import os

os.environ['OPENAI_API_KEY'] = 'sk-YOUR-API-KEY'
After we have installed the required libraries and imported them, we will need to construct an index of your documents.
To load your documents, you can use the SimpleDirectoryReader method provided by LlamaIndex, or you can load them from strings.
# Loading from a directory
documents = SimpleDirectoryReader('your_directory').load_data()

# Loading from strings, assuming you saved your data to strings text1, text2, ...
text_list = [text1, text2, ...]
documents = [Document(t) for t in text_list]
LlamaIndex also provides a variety of data connectors, including Notion, Asana, Google Drive, Obsidian, etc. You can find the available data connectors at https://llamahub.ai/.
After loading the documents, we can then construct the index simply with:
# Construct a simple vector index
index = GPTSimpleVectorIndex(documents)
If you want to save the index and load it for future use, you can use the following methods:
# Save your index to an index.json file
index.save_to_disk('index.json')
# Load the index from your saved index.json file
index = GPTSimpleVectorIndex.load_from_disk('index.json')
Querying the index is simple:
# Querying the index
response = index.query("What features do users want to see in the app?")
print(response)
And voilà! You will get your answer printed. Under the hood, LlamaIndex will take your prompt, search for relevant chunks in the index, and pass your prompt and the relevant chunks to GPT.
The steps above show only a very simple starter usage of question answering with LlamaIndex and GPT. But you can do much more than that. In fact, you can configure LlamaIndex to use a different large language model (LLM), use a different type of index for different tasks, update existing indices with new documents, etc. If you’re interested, you can read the docs at https://gpt-index.readthedocs.io/en/latest/index.html.
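For instance, plugging in a different LLM looked roughly like this with the LlamaIndex version used in this tutorial. The LLMPredictor wrapper and the LangChain OpenAI class come from that era’s API, and the API has since changed, so treat this as a hedged sketch rather than the definitive method:
# Use a different underlying LLM (sketch based on the llama-index API of this era)
from langchain.llms import OpenAI
from llama_index import GPTSimpleVectorIndex, LLMPredictor

# Wrap a cheaper model (text-curie-001) in an LLMPredictor
llm_predictor = LLMPredictor(llm=OpenAI(temperature=0, model_name="text-curie-001"))

# Build the index with the custom predictor instead of the default text-davinci-003
index = GPTSimpleVectorIndex(documents, llm_predictor=llm_predictor)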
In this post, we’ve seen how to use GPT together with LlamaIndex to build a document question-answering chatbot. While GPT (and other LLMs) are powerful on their own, their power is greatly amplified when combined with other tools, data, and processes.
What would you utilize a document question-answering chatbot for?
References:
[1] What Is Question Answering? — Hugging Face. 5 Dec. 2022, https://huggingface.co/tasks/question-answering.
[2] Liu, Jerry. LlamaIndex. Nov. 2022. GitHub, https://github.com/jerryjliu/gpt_index.