Learn how to develop a chatbot that answers questions based on data stored in a knowledge graph.
ChatGPT has changed how I, and probably most of you, look at AI and chatbots. We can use chatbots to help us find information, produce creative works, and more.
However, one problem with ChatGPT and similar chatbots is that they can hallucinate and return great-sounding yet wildly inaccurate results. The issue is that these large language models (LLMs) are inherently black boxes, so it is hard to fix and retrain them to reduce hallucinations. Consequently, it may not be a good idea to rely on answers from ChatGPT when mission-critical tasks or lives are at stake.
On the other hand, there is tremendous value in being able to interact with chatbots and use them as an interface to various applications.
So I wanted to learn more about chatbots, and fortunately, Sixing Huang gave me a crash course on different ways of implementing one. I was especially intrigued by the knowledge graph-based approach, where the chatbot returns answers based on information and facts stored in the knowledge graph.
Using a knowledge graph as the storage for answers gives you explicit and complete control over the answers the chatbot provides and allows you to avoid hallucinations. Moreover, Sixing has already written about and shared the code for a knowledge graph-based chatbot, which meant I could borrow some existing ideas and wouldn't have to start from scratch.
My idea was to develop a chatbot that could be used to explore, analyze, and understand news articles.
But first, I needed to construct a knowledge graph based on news articles. Luckily, I have used and written about the information extraction pipeline numerous times, so I didn't have to lose time there. Next, it was time to implement my first chatbot. It turned out that creating a knowledge graph-based chatbot is a walk in the park thanks to GPT-3. I built the following chatbot architecture.
The user talks to the chatbot through a simple Streamlit application. When the user inputs their question, it is sent to the OpenAI GPT-3 endpoint with a request to turn it into a Cypher statement. The endpoint returns a Cypher statement, which is then used to retrieve information from the knowledge graph stored in Neo4j. The retrieved data is then used to construct the answer to the user's question. Additionally, I have added an option to summarize articles using the GPT-3 endpoint, which will be demonstrated later.
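The flow above can be sketched as a single function. This is a minimal illustration of the wiring, not the project's actual code: `llm` and `run_cypher` are hypothetical callables standing in for the OpenAI endpoint and the Neo4j driver.

```python
# Minimal sketch of the chatbot flow. `llm` and `run_cypher` are
# hypothetical stand-ins for the GPT-3 endpoint and the Neo4j driver.
def answer_question(question, examples, llm, run_cypher):
    """Turn a natural-language question into a Cypher statement,
    run it against the knowledge graph, and return the result."""
    # Few-shot training examples are prepended so the model sees the format
    cypher = llm(examples + "\n#" + question)
    # The generated Cypher is executed against the knowledge graph
    return run_cypher(cypher), cypher

# Stub usage: a fake LLM returning a fixed query, and a fake database
fake_llm = lambda prompt: "MATCH (a:Article) RETURN count(a)"
fake_db = lambda cypher: [{"count": 1000}]
message, query = answer_question("How many articles are there?", "", fake_llm, fake_db)
```

The real application adds a second branch for summarization, shown later.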
All the code is available on GitHub.
Building a knowledge graph
In order to retrieve information from the knowledge graph, we first need to populate it. As mentioned, the idea is to construct a knowledge graph of news articles. Therefore, we need to find a source of quality and accurate news articles. For this demonstration, I have used the latest 1,000 articles available as a Kaggle repository. The articles are available under the CC BY-NC 4.0 license.
We won't delve into the details of the information extraction pipeline, as I have already written about this subject several times.
For the most part, the idea behind the information extraction pipeline is to extract structured information about mentioned entities and relationships from unstructured text.
In this example, the information extraction pipeline would identify the entities mentioned in the text. Moreover, most named entity recognition models can infer the entity type, meaning they deduce whether a mentioned entity is a person, an organization, or something else.
In the next step, a relationship extraction model is used to detect structured relationships between entities. The text in the above image clearly indicates the working relationship between the two mentioned entities, which can be represented as an explicit relationship in the graph.
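The output of such a pipeline can be thought of as triples with provenance. A minimal sketch follows; the field names and the example source sentence are my own illustration, not taken from any specific extraction library.

```python
from dataclasses import dataclass

# A minimal representation of one extracted fact; field names and the
# example sentence below are illustrative, not from a real pipeline.
@dataclass
class Triple:
    head: str      # subject entity, e.g. a person
    relation: str  # relationship type, e.g. "EMPLOYEE_OF"
    tail: str      # object entity, e.g. an organization
    source: str    # the sentence the fact was extracted from

fact = Triple(
    head="Emla Fitzsimons",
    relation="EMPLOYEE_OF",
    tail="Centre for Longitudinal Studies",
    source="A hypothetical sentence mentioning both entities.",
)
```

Keeping the source sentence around is what later lets us verify questionable extractions by hand.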
Interestingly, we could also use GPT-3 to extract structured information from text. The GraphGPT project provides a simple prompt that can be used to generate structured data from an input text.
GPT-3 does a decent job of extracting relevant information from the text. It also correctly resolves the reference in the second sentence, which is great. However, it doesn't recognize that two of the mentions refer to the same real-world entity. Entity disambiguation is an important part of any information extraction pipeline. One approach could be to use an entity-linking technique to map entities to a target knowledge base. Often, Wikipedia is used as the target knowledge base.
It's incredible what we can achieve with a simple prompt to the GPT-3 endpoint. Immediately, you can notice that both mentions map to the same Q145 id, which can be used for entity disambiguation. On the other hand, Boris Johnson is linked to Q1446 in both instances. All that would be great; however, the id Q1446 actually refers to a Roman emperor.
While GPT-3 is great at following prompts, it tends to hallucinate external information like WikiData ids. And while we might design a prompt for entity disambiguation within a single paragraph, it is hard to construct an approach that disambiguates entities across various texts without entity linking.
We could develop our own information extraction pipeline that handles relation extraction and entity linking. I implemented such a pipeline two years ago. However, since two years is a long time in the field of NLP, we might find a solution that offers better accuracy.
To avoid developing a custom information extraction pipeline, we will use the Diffbot NLP endpoint. It extracts relationships and provides entity linking out of the box. Moreover, it offers both paragraph- and entity-level sentiment, which significantly expands the set of questions we can ask our chatbot, as we can ask it about positive or negative news regarding particular people or organizations.
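As a rough sketch, a call to the Diffbot Natural Language endpoint might look like the following. The URL, field names, and request shape here are my assumptions; check Diffbot's API documentation for the exact parameters before using this.

```python
import json
import urllib.request

# Assumed endpoint URL; verify against the Diffbot NL API documentation.
DIFFBOT_NL_URL = "https://nl.diffbot.com/v1/"

def build_payload(text, lang="en"):
    """Build the request body for the NLP endpoint (assumed field names)."""
    return {"content": text, "lang": lang, "format": "plain text"}

def analyze(text, token):
    """Send one article to the endpoint, asking for entities,
    relationships (facts), and sentiment."""
    url = f"{DIFFBOT_NL_URL}?fields=entities,facts,sentiment&token={token}"
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The notebook in the repository does the equivalent for all 1,000 articles and stores the results in the data folder.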
The code to run the information extraction pipeline using the Diffbot endpoint is available as a Jupyter notebook. For this demonstration, you don't have to run it, as I have stored the output of the pipeline in the project's data folder. However, if you want to test it on other datasets and evaluate how it performs, do give it a try.
Now that the news articles have been processed, we can import the output of the information extraction pipeline into a graph database. In this example, we will be using Neo4j. The GitHub repository is set up to run as two Docker services, one for Neo4j and the other for the Streamlit application, so you don't have to install Neo4j on your own.
You can either run the seed_database.sh script or execute the Import notebook to populate the graph database with news articles. The graph schema of the populated knowledge graph is the following:
The knowledge graph contains article nodes holding the article's web title, body text, and sentiment. In addition, each article can mention one or more Entity nodes. The Entity nodes contain a URL property, which is the output of the entity-linking process, along with their id and type.
Interestingly, the relationships between entities are not represented as connections in the graph but rather as separate Relationship nodes. The idea behind this graph modeling decision is that we want to track the text from which the extracted relationships originate. As we know, no NLP pipeline is perfect. Therefore, it is important to be able to verify whether a relationship was accurately extracted by manually examining the originating text. In a labeled property graph database like Neo4j, a connection cannot point to another connection. Consequently, we model each extracted relationship between entities as an intermediate node.
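The intermediate-node pattern can be sketched in Cypher, here held in a Python string as the import scripts would. The labels, relationship types, and property names below are assumptions based on the schema description above, not the repository's exact names.

```python
# Cypher sketch of the intermediate-node pattern. Labels and property
# names (Entity, Relationship, MENTIONED_IN, ...) are assumptions based
# on the schema described in the text, not the repo's exact names.
CREATE_FACT = """
MERGE (h:Entity {name: $head})
MERGE (t:Entity {name: $tail})
CREATE (r:Relationship {type: $type})
MERGE (h)-[:RELATIONSHIP]->(r)
MERGE (r)-[:RELATIONSHIP]->(t)
// Link the fact back to the article it was extracted from,
// so the originating text can be examined later
MERGE (a:Article {id: $article_id})
MERGE (r)-[:MENTIONED_IN]->(a)
"""
```

Because the fact is a node, it can carry its own properties and point at the source article, which a plain relationship between two entities could not do.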
Using a GPT-3 model to generate Cypher statements
We have already learned that GPT-3 does an excellent job of following instructions given in a prompt. Moreover, Sixing Huang has already written about how easy it is to teach the GPT-3 model to generate Cypher statements. The idea is to give the model a few examples and then let it generate a Cypher statement for new user input. Specifically, I have prepared the following Cypher examples to train the GPT-3 model.
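The training examples pair a natural-language question, prefixed with `#`, with the Cypher statement it should map to. The pair below is a hypothetical illustration of that format; the actual examples shipped with the project differ.

```python
# Hypothetical few-shot examples in the "#question\ncypher" format;
# the actual examples in the repository differ.
examples = """#How many articles are in the database?
MATCH (a:Article) RETURN count(a)
#Which articles mention Boris Johnson?
MATCH (a:Article)-[:MENTIONS]->(e:Entity {name: "Boris Johnson"})
RETURN a.title
"""
```

This string is prepended verbatim to every user question before the request is sent to the completion endpoint.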
Unfortunately, the GPT-3 endpoint has no concept of context, so we need to send the training examples along with every user input. I wonder what the ChatGPT API endpoint will look like, as ChatGPT does have a concept of dialogue context, and how that will affect end-user applications.
As shown in this image, every request to the GPT-3 endpoint starts with the training examples. Interestingly, we don't have to tell the model that it should generate Cypher statements; we just provide the training examples along with the user prompt, and the model generates the Cypher statement.
Chatbot implementation
Now that we have prepared all the pieces of the puzzle, we can combine them into a chatbot application. I have used a Streamlit application, specifically streamlit-chat, to implement the user interface for the chatbot. I like Streamlit because it keeps things simple, and I can use Python to develop the user interface while avoiding any meddling with CSS.
The application uses the following Python code to generate the chatbot responses.
# Make a request to the GPT-3 endpoint
completions = openai.Completion.create(
    engine="text-davinci-003",
    # Construct the prompt using the training examples
    # combined with the user input
    prompt=examples + "\n#" + prompt,
    max_tokens=1000,
    n=1,
    stop=None,
    temperature=0.5,
)
# Extract the Cypher query from the GPT-3 response
cypher_query = completions.choices[0].text
# Use the Cypher query to read from the knowledge graph
message = read_query(cypher_query)
return message, cypher_query
As mentioned, all the code is available on GitHub if you are interested in more details. The repository also includes instructions for running the chatbot application.
Let's now try the chatbot and see how well it behaves. We can start by using an example question from the training set.
The generated Cypher query is shown on the right side of the chatbot user interface to allow easy inspection of the generated statements. Since the question is in the training set, the generated Cypher statement is identical to the example we provided.
Next, we can try a variation of a question that is outside the training set. However, similar examples are provided, and GPT-3 needs to combine information from two examples to generate the Cypher statement.
Given that we provided only 11 training examples, I'm impressed with how well GPT-3 can generalize and construct appropriate Cypher statements. I'm also quite pleased with how easy it is to drill down into the information provided in previous answers. It makes investigative work more fun and accessible, as you can use natural language to explore the data instead of having to write Cypher statements.
We can follow up and ask the chatbot about the information we have stored about Emla Fitzsimons in the knowledge graph.
The chatbot provides information about the extracted relationships that involve the person in question. In this example, we learn that Emla is an employee of the Centre for Longitudinal Studies, works with Marcos Vera-Hernandez, and is interested in economics. This information was extracted using the Diffbot NLP endpoint and stored in the knowledge graph.
We can ask the chatbot if there are any more news articles about Emla.
It seems Emla is mentioned in only a single article. I thought it would be cool to add an option to summarize news articles using the GPT-3 endpoint. As the GPT-3 model follows instructions quite well, you simply need to ask it to summarize the text, and it does the job.
# Make a request to the GPT-3 endpoint
completions = openai.Completion.create(
    engine="text-davinci-003",
    # Prefix the prompt with a request to produce a summary
    prompt="Summarize the following article: \n" + prompt,
    max_tokens=256,
    n=1,
    stop=None,
    temperature=0.5,
)
message = completions.choices[0].text
return message, None
I have added a simple exception in the code: if the user input contains a specific keyword, we assume the task is to produce a summary of the given article.
Since the knowledge graph contains both article- and entity-level sentiment, we can search for entities mentioned with positive or negative sentiment. For example, we can search for organizations that have been mentioned positively in the news.
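A statement the chatbot might generate for such a question could look like the following sketch. The sentiment property name, its location on the MENTIONS relationship, and the threshold are all my guesses about the schema, not the project's actual query.

```python
# Hypothetical generated Cypher for "Which organizations are mentioned
# positively?"; the property names and threshold are assumptions.
POSITIVE_ORGS = """
MATCH (a:Article)-[m:MENTIONS]->(e:Entity)
WHERE e.type = 'Organization' AND m.sentiment > 0.5
RETURN e.name, count(a) AS mentions
ORDER BY mentions DESC
LIMIT 5
"""
```

Because sentiment is stored per entity mention rather than only per article, the filter can distinguish an organization praised in an otherwise negative article.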
And similar to before, we can ask the chatbot follow-up questions and drill down into the information we are interested in.
Summary
I had wanted to create a project that uses natural language to explore and analyze knowledge graphs for a long time. However, the barrier to entry was too high, as I am not a machine learning expert, and developing and training a custom model that generates Cypher statements from user input was too big a task for me.
And frankly, until I joined the ChatGPT hype, I wasn't genuinely aware of how incredible the underlying technology is and how well it works. For example, we only provided 11 training examples, and the chatbot behaves like it has worked with the given graph schema for the past five years.
Hopefully, this article will inspire you to implement your own chatbots and use them to make knowledge graphs and other technologies more accessible!
The code is available as a GitHub repository.