Graphs are relevant
A Knowledge Graph can be defined as a structured representation of knowledge that connects concepts, entities, and their relationships in a way that mimics human understanding. It is commonly used to organise and integrate data from various sources, enabling machines to reason, infer, and retrieve relevant information more effectively.
In a previous post on Medium, I made the point that this kind of structured representation can be used to boost and refine the performance of LLMs in Retrieval Augmented Generation applications. We could speak of GraphRAG as an ensemble of techniques and methods that employ a graph-based representation of data to serve information to LLMs better than the more standard approaches typically taken for “chat with your documents” use cases.
The “vanilla” RAG approach relies on vector similarity (and, sometimes, hybrid search) with the goal of retrieving from a vector database pieces of knowledge (chunks of documents) that are similar to the user’s input, according to some similarity measure such as cosine or euclidean distance. These pieces of knowledge are then passed to a Large Language Model that is prompted to use them as context to generate a relevant answer to the user’s query.
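To make that flow concrete, here is a minimal, illustrative sketch of the retrieval step; the `embed` callable and the in-memory list of chunks are hypothetical stand-ins for the embedding model and vector database of your choice.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def vanilla_rag_retrieve(query: str, chunks: list[str], embed, k: int = 4) -> list[str]:
    """Embed the query, score every chunk, return the top-k most similar ones."""
    query_vec = np.array(embed(query))
    scored = [
        (cosine_similarity(query_vec, np.array(embed(chunk))), chunk)
        for chunk in chunks
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]

# The retrieved chunks are then stuffed into the LLM prompt as context.
```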
My argument is that the main point of failure in this kind of application is that similarity search relies on explicit mentions in the knowledge base, leaving the LLM blind to cross-references between documents, and even to implicit and contextual references. In short, the LLM is limited because it cannot reason at a higher level.
This can be addressed by moving away from pure vector representations and vector stores toward a more comprehensive way of organising the knowledge base: extracting concepts from each piece of text and storing them while keeping track of the relationships between pieces of information.
A graph structure is, in my opinion, the best way of organising a knowledge base whose documents contain cross-references and implicit mentions of one another, as always happens inside organisations and enterprises. A graph’s main features are in fact:
- Entities (Nodes): they represent real-world objects like people, places, organizations, or abstract concepts;
- Relationships (Edges): they define how entities are connected to each other (e.g. “Bill → WORKS_AT → Microsoft”);
- Attributes (Properties): provide additional details about entities (e.g., Microsoft’s founding year, revenue, or location) or relationships (e.g. “Bill → FRIENDS_WITH {since: 2021} → Mark”).
A Knowledge Graph can then be defined as the graph representation of a corpus of documents coming from a coherent domain. But how exactly do we move from vector representations and vector databases to a Knowledge Graph?
Further, how do we even extract the key information needed to build a Knowledge Graph?
In this article, I’ll present my point of view on the subject, with code examples from a repository I developed while learning and experimenting with Knowledge Graphs. The repository is publicly available on my GitHub and contains:
- the source code of the project
- example notebooks written while building the repo
- a Streamlit app to showcase the work done so far
- a Dockerfile to build the image for this project without having to go through the manual installation of all the software needed to run it.
The article will present the repo in order to cover the following topics:
✅ Tech Stack Breakdown of the tools available, with a brief presentation of each of the components used to build the project.
✅ How to get the Demo up and running in your own local environment.
✅ How to perform the Ingestion Process of documents, including extracting concepts from them and assembling them into a Knowledge Graph.
✅ How to query the Graph, with a focus on the variety of strategies that can be employed to perform semantic search, graph query language generation and hybrid search.
If you are a Data Scientist, an ML/AI Engineer, or simply someone curious about how to build smarter search systems, this guide will walk you through the full workflow with code, context and clarity.
Tech Stack Breakdown
As a Data Scientist who started learning programming in 2019/20, my main language is of course Python. Here, I’m using version 3.12.
This project is built with a focus on open-source tools and free-tier accessibility, both on the storage side and in the availability of Large Language Models. This makes it a good starting point for newcomers or for those who are not willing to pay for cloud infrastructure or for OpenAI API keys.
The source code is, however, written with production use cases in mind, focusing not only on quick demos but on how to transition a project to real-world deployment. The code is therefore designed to be easily customisable, modular, and extendable, so it can be adapted to your own data sources, LLMs, and workflows with minimal friction.
Below is a breakdown of the key components and how they work together. You can also read the repo’s README.md for further information on how to get up and running with the demo app.
🕸️ Neo4j — Graph Database + Vector Store
Neo4j powers the knowledge graph layer and also stores vector embeddings for semantic search. The core of Neo4j is Cypher, the query language used to interact with a Neo4j database. Some of the other key Neo4j features used in this project are:
- GraphDB: To store structured relationships between entities and concepts.
- VectorDB: Embedding support allows similarity search and hybrid queries.
- Python SDK: Neo4j offers a Python driver to interact with an instance and wrap around it. Thanks to the driver, knowing Cypher is not mandatory to interact with the code in this repo, and we are also able to use other Python graph data science libraries such as networkx or python-louvain. A short usage sketch of the driver follows this list.
- Local Development: Neo4j offers a Desktop version, and it can also be easily deployed via Docker images into containers or on any Virtual Machine (Linux/macOS/Windows).
- Production Cloud: You can also use Neo4j Aura for a fully-managed solution; it comes with a free tier and can be hosted in the cloud of your choice depending on your needs.
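As a minimal sketch of how the Python driver can be used (connection details are placeholders, and the nodes reuse the “Bill works at Microsoft” example from earlier):

```python
from neo4j import GraphDatabase

# Placeholder connection details for a local Neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Create two entities (nodes) and a relationship (edge) with properties.
    session.run(
        """
        MERGE (p:Person {name: $person})
        MERGE (o:Organization {name: $org, founding_year: $year})
        MERGE (p)-[:WORKS_AT]->(o)
        """,
        person="Bill", org="Microsoft", year=1975,
    )
    # Read the relationship back.
    result = session.run(
        "MATCH (p:Person)-[r:WORKS_AT]->(o:Organization) RETURN p.name, type(r), o.name"
    )
    for record in result:
        print(record["p.name"], record["type(r)"], record["o.name"])

driver.close()
```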
🦜 LangChain — Agent Framework for LLM Workflows
LangChain is used to coordinate how LLMs interact with tools like the vector index and the entities in the Knowledge Graph, and of course with the user input.
- Used to define custom agents and toolchains.
- Integrates with retrievers, memory, and prompt templates.
- Makes it easy to swap in different LLM backends.
🤖 LLMs + Embeddings
LLMs and Embeddings can be invoked either from a local deployment using Ollama or from an online endpoint of your choice. I’m currently using the Groq free-tier API to experiment, switching between `gemma2-9b-it` and various versions of Llama, such as `meta-llama/llama-4-scout-17b-16e-instruct`. For Embeddings, I’m using `mxbai-embed-large` running via Ollama on my M1 MacBook Air; on the same setup I was also able to run `llama3.2` (2B) in the past, keeping my hardware limitations in mind.
Both Ollama and Groq are plug and play and have LangChain wrappers, as shown in the sketch below.
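Here is a sketch of how those wrappers can be swapped in and out (assuming the `langchain-groq` and `langchain-ollama` integration packages are installed and a `GROQ_API_KEY` environment variable is set):

```python
import os

from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Online endpoint: Groq free tier.
groq_llm = ChatGroq(model="gemma2-9b-it", api_key=os.environ["GROQ_API_KEY"])

# Local deployment: Ollama running on the same machine.
local_llm = ChatOllama(model="llama3.2")
embeddings = OllamaEmbeddings(model="mxbai-embed-large")

# Both expose the same LangChain interface, so they are interchangeable.
print(groq_llm.invoke("Say hi in one word").content)
print(local_llm.invoke("Say hi in one word").content)
print(len(embeddings.embed_query("knowledge graphs")))
```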
👑 Streamlit — Frontend UI for Interactions & Demos
I have written a small demo app using Streamlit, a Python library that allows developers to build minimal frontend layers without writing any HTML or CSS, just pure Python.
In this demo app you will see how to:
- Ingest your documents into Neo4j under a Graph-based representation.
- Run live demos of the graph-based querying, showcasing key differences between various querying strategies.
Streamlit’s main benefit is that it’s super lightweight, fast to deploy, and doesn’t require a separate frontend framework or backend. These features make it the perfect fit for demos and prototypes such as this one.
However, it is not suitable for production apps because of its limited customisation features and UI control, as well as the absence of a native way to perform authorisation and authentication, and of a proper way to handle scaling. Going from demo to production usually requires a more suitable front-end framework and a clear separation between back-end and front-end responsibilities.
🐳 Docker — Containerisation for Local Dev & Deployment
Docker is a tool that lets you package your application and all its dependencies into a container: a lightweight, standalone, and portable environment that runs consistently on any system.
Since I imagined it could be difficult to manage all of the mentioned dependencies, I also added a Dockerfile for building an image of the app, so that Neo4j, Ollama and the app itself can run in isolated, reproducible containers via docker-compose.
To run the demo app yourself, you can follow the instructions in the README.md.
Now that the tech stack has been presented, we can dive deep into how the app actually works behind the curtains, starting from the ingestion pipeline.
From Text Corpus to Knowledge Graph
As previously mentioned, it is advisable that the documents being ingested into a Knowledge Graph come from the same domain. These could be manuals from the medical domain on diseases and their symptoms, code documentation from past projects, or newspaper articles on a specific subject.
Being a politics geek, to test and play with my code I chose PDF press materials from the European Commission’s Press Corner.
Once the documents have been collected, we have to ingest them into the Knowledge Graph.
The ingestion pipeline follows the steps reported below.
The reference source code for this part of the article is in src/ingestion.
1. Load files into a machine-friendly format
In the code example below, the `Ingestor` class is used to infer the MIME type of each file we are trying to read, and LangChain’s document loaders are employed to read its content accordingly; this allows for customisation regarding the format of the source files that can populate our Knowledge Graph.
```python
import os
from abc import abstractmethod
from typing import Any, Dict, List, Tuple

import magic
from langchain_core.documents import Document
from langchain_community.document_loaders import (
    BSHTMLLoader,
    Docx2txtLoader,
    PDFPlumberLoader,
    TextLoader,
)

# `ProcessedDocument`, `Source`, `MIME_TYPE_MAPPING` and `logger` are defined elsewhere in the repo.


class Ingestor:
    """
    Base `Ingestor` Class with common methods.
    Can be specialized by source.
    """

    def __init__(self, source: Source):
        self.source = source

    @abstractmethod
    def list_files(self) -> List[str]:
        pass

    @abstractmethod
    def file_preparation(self, file) -> Tuple[str, dict]:
        pass

    @staticmethod
    def load_file(filepath: str, metadata: dict) -> List[Document]:
        mime = magic.Magic(mime=True)
        mime_type = mime.from_file(filepath) or metadata.get('Content-Type')
        if mime_type == 'inode/x-empty':
            return []

        loader_class = MIME_TYPE_MAPPING.get(mime_type)
        if not loader_class:
            logger.warning(f'Unsupported MIME type: {mime_type} for file {filepath}, skipping.')
            return []

        if loader_class == PDFPlumberLoader:
            loader = loader_class(
                file_path=filepath,
                extract_images=False,
            )
        elif loader_class == Docx2txtLoader:
            loader = loader_class(
                file_path=filepath
            )
        elif loader_class == TextLoader:
            loader = loader_class(
                file_path=filepath
            )
        elif loader_class == BSHTMLLoader:
            loader = loader_class(
                file_path=filepath,
                open_encoding="utf-8",
            )

        try:
            return loader.load()
        except Exception as e:
            logger.warning(f"Error loading file: {filepath} with exception: {e}")
            return []

    @staticmethod
    def merge_pages(pages: List[Document]) -> str:
        return "\n\n".join(page.page_content for page in pages)

    @staticmethod
    def create_processed_document(file: str, document_content: str, metadata: dict):
        processed_doc = ProcessedDocument(filename=file, source=document_content, metadata=metadata)
        return processed_doc

    def ingest(self, filename: str, metadata: Dict[str, Any]) -> ProcessedDocument | None:
        """
        Loads a file from a path and turns it into a `ProcessedDocument`.
        """
        base_name = os.path.basename(filename)
        document_pages = self.load_file(filename, metadata)
        try:
            document_content = self.merge_pages(document_pages)
        except TypeError:
            logger.warning(f"Empty document {filename}, skipping..")
            return None

        if document_content is not None:
            processed_doc = self.create_processed_document(
                base_name,
                document_content,
                metadata
            )
            return processed_doc

    def batch_ingest(self) -> List[ProcessedDocument]:
        """
        Ingests all files in a folder.
        """
        processed_documents = []
        for file in self.list_files():
            file, metadata = self.file_preparation(file)
            processed_doc = self.ingest(file, metadata)
            if processed_doc:
                processed_documents.append(processed_doc)
        return processed_documents
```
2. Clean and split document content into text chunks
This is essential for the graph extraction phase ahead of us. To clean texts, depending on the domain and on the document’s format, it might make sense to write custom cleaning and chunking functions. This is where the document’s `chunks` list is populated.
Chunk size, overlap and other possible configurations here can be domain dependent and should be configured according to the expertise of the DS / AI Engineer; the class responsible for chunking is shown below.
```python
class Chunker:
    """
    Contains methods to chunk the text of a (list of) `ProcessedDocument`.
    """

    def __init__(self, conf: ChunkerConf):
        self.chunker_type = conf.type

        if self.chunker_type == "recursive":
            self.chunk_size = conf.chunk_size
            self.chunk_overlap = conf.chunk_overlap

            self.splitter = RecursiveCharacterTextSplitter(
                chunk_size=self.chunk_size,
                chunk_overlap=self.chunk_overlap,
                is_separator_regex=False
            )
        else:
            logger.warning(f"Chunker type '{self.chunker_type}' not supported.")

    def _chunk_document(self, text: str) -> list[str]:
        """Chunks the document and returns a list of chunks."""
        return self.splitter.split_text(text)

    def get_chunked_document_with_ids(
        self,
        text: str,
    ) -> list[dict]:
        """Chunks the document and returns a list of dictionaries with chunk ids and chunk text."""
        return [
            {
                "chunk_id": i + 1,
                "text": chunk,
                "chunk_size": self.chunk_size,
                "chunk_overlap": self.chunk_overlap
            }
            for i, chunk in enumerate(self._chunk_document(text))
        ]

    def chunk_document(self, doc: ProcessedDocument) -> ProcessedDocument:
        """
        Chunks the text of a `ProcessedDocument` instance.
        """
        chunks_dict = self.get_chunked_document_with_ids(doc.source)
        doc.chunks = [Chunk(**chunk) for chunk in chunks_dict]
        logger.info(f"Document {doc.filename} has been chunked into {len(doc.chunks)} chunks.")
        return doc

    def chunk_documents(self, docs: List[ProcessedDocument]) -> List[ProcessedDocument]:
        """
        Chunks the text of a list of `ProcessedDocument` instances.
        """
        updated_docs = []
        for doc in docs:
            updated_docs.append(self.chunk_document(doc))
        return updated_docs
```
3. Extract a Concept Graph
For each chunk in the document, we want to extract a graph of concepts. To do so, we program a custom agent, powered by an LLM, with this precise task. LangChain is useful here thanks to a method called `with_structured_output` that wraps LLM calls and lets you define the expected output schema using a Pydantic model. This ensures that the LLM of your choice returns structured, validated responses and never free-form text.
This is what the `GraphExtractor` looks like:
```python
class GraphExtractor:
    """
    Agent able to extract information in a graph representation format from a given text.
    """

    def __init__(self, conf: LLMConf, ontology: Optional[Ontology] = None):
        self.conf = conf
        self.llm = fetch_llm(conf)
        self.prompt = get_graph_extractor_prompt()

        self.prompt.partial_variables = {
            'allowed_labels': ontology.allowed_labels if ontology and ontology.allowed_labels else "",
            'labels_descriptions': ontology.labels_descriptions if ontology and ontology.labels_descriptions else "",
            'allowed_relationships': ontology.allowed_relations if ontology and ontology.allowed_relations else ""
        }

    def extract_graph(self, text: str) -> _Graph:
        """
        Extracts a graph from a text.
        """
        if self.llm is not None:
            try:
                graph: _Graph = self.llm.with_structured_output(
                    schema=_Graph
                ).invoke(
                    input=self.prompt.format(input_text=text)
                )
                return graph
            except Exception as e:
                logger.warning(f"Error while extracting graph: {e}")
```
Notice that the expected output `_Graph` is defined as:
```python
class _Node(Serializable):
    id: str
    type: str
    properties: Optional[Dict[str, str]] = None


class _Relationship(Serializable):
    source: str
    target: str
    type: str
    properties: Optional[Dict[str, str]] = None


class _Graph(Serializable):
    nodes: List[_Node]
    relationships: List[_Relationship]
```
Optionally, the LLM agent responsible for extracting a graph from chunks can be supplied with an Ontology describing the domain of the documents.
An ontology can be described as the formal specification of the types of entities and relationships that can exist in the graph; it is, essentially, its blueprint. An illustrative instantiation is shown after the class definition below.
```python
class Ontology(BaseModel):
    allowed_labels: Optional[List[str]] = None
    labels_descriptions: Optional[Dict[str, str]] = None
    allowed_relations: Optional[List[str]] = None
```
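As a purely illustrative example (the labels and relations below are my own guesses for the press-release domain, not necessarily the ones used in the repo), an ontology could be instantiated like this:

```python
press_ontology = Ontology(
    allowed_labels=["Person", "Organization", "Policy", "Location"],
    labels_descriptions={
        "Person": "A natural person, e.g. a commissioner or spokesperson.",
        "Organization": "An institution, company or body, e.g. the European Commission.",
        "Policy": "A strategy, regulation, programme or action plan.",
        "Location": "A country, region or city.",
    },
    allowed_relations=["WORKS_AT", "ANNOUNCED", "PART_OF", "LOCATED_IN"],
)
```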
4. Embed each chunk of the document
Next, we want to obtain a vector representation of the text contained in each chunk. This can be done using the embeddings model of your choice and passing the list of documents to the `ChunkEmbedder` class.
```python
class ChunkEmbedder:
    """Contains methods to embed Chunks from a (list of) `ProcessedDocument`."""

    def __init__(self, conf: EmbedderConf):
        self.conf = conf
        self.embeddings = get_embeddings(conf)

        if self.embeddings:
            logger.info(f"Embedder of type '{self.conf.type}' initialized.")

    def embed_document_chunks(self, doc: ProcessedDocument) -> ProcessedDocument:
        """
        Embeds the chunks of a `ProcessedDocument` instance.
        """
        if self.embeddings is not None:
            for chunk in doc.chunks:
                chunk.embedding = self.embeddings.embed_documents([chunk.text])
                chunk.embeddings_model = self.conf.model
            logger.info(f"Embedded {len(doc.chunks)} chunks.")
            return doc
        else:
            logger.warning(f"Embedder type '{self.conf.type}' is not yet implemented")

    def embed_documents_chunks(self, docs: List[ProcessedDocument]) -> List[ProcessedDocument]:
        """
        Embeds the chunks of a list of `ProcessedDocument` instances.
        """
        if self.embeddings is not None:
            for doc in docs:
                doc = self.embed_document_chunks(doc)
            return docs
        else:
            logger.warning(f"Embedder type '{self.conf.type}' is not yet implemented")
            return docs
```
5. Save the embedded chunks into the Knowledge Graph
Finally, we have to upload the documents and their chunks into our Neo4j instance. I’ve built upon the already available `Neo4jGraph` LangChain class to create a customised version for this repo.
The code of the `KnowledgeGraph` class is available at src/graph/knowledge_graph.py, and this is how its core method `add_documents` works:
a. for each file, create a Document node in the Graph with its properties (metadata) such as the source of the file, the name, the ingestion date, and so on;
b. for each chunk, create a Chunk node, connected to the original Document node by a PART_OF relationship, and save the embedding of the chunk as a property of the node; connect each Chunk node to the next one with a NEXT relationship;
c. for each chunk, save the extracted subgraph: nodes, relationships and their properties; we also connect them to their source Chunk node with a MENTIONS relationship;
d. perform hierarchical clustering on the Graph to detect communities of nodes within it; then use an LLM to summarise the resulting communities, obtaining Community Reports, and embed those summaries. Rough sketches of these steps follow below.
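Here is a rough sketch of what steps a–c can translate to in Cypher, issued through the Neo4j Python driver; the labels, property names and the `chunk.graph` attribute are illustrative, and the actual implementation lives in src/graph/knowledge_graph.py:

```python
def upload_document(session, doc):
    # a. one Document node per file, carrying its metadata.
    session.run(
        "MERGE (d:Document {filename: $filename}) SET d += $metadata",
        filename=doc.filename, metadata=doc.metadata,
    )
    previous_id = None
    for chunk in doc.chunks:
        # b. one Chunk node per chunk, linked to its Document and to the previous Chunk.
        session.run(
            """
            MATCH (d:Document {filename: $filename})
            MERGE (c:Chunk {doc: $filename, chunk_id: $chunk_id})
            SET c.text = $text, c.embedding = $embedding
            MERGE (c)-[:PART_OF]->(d)
            """,
            filename=doc.filename, chunk_id=chunk.chunk_id,
            text=chunk.text, embedding=chunk.embedding,
        )
        if previous_id is not None:
            session.run(
                """
                MATCH (prev:Chunk {doc: $filename, chunk_id: $prev_id}),
                      (curr:Chunk {doc: $filename, chunk_id: $curr_id})
                MERGE (prev)-[:NEXT]->(curr)
                """,
                filename=doc.filename, prev_id=previous_id, curr_id=chunk.chunk_id,
            )
        previous_id = chunk.chunk_id
        # c. the extracted entities, linked back to the Chunk that mentions them.
        for node in chunk.graph.nodes:
            session.run(
                """
                MATCH (c:Chunk {doc: $filename, chunk_id: $chunk_id})
                MERGE (e:Entity {id: $entity_id})
                MERGE (c)-[:MENTIONS]->(e)
                """,
                filename=doc.filename, chunk_id=chunk.chunk_id, entity_id=node.id,
            )
```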
Communities in a graph are clusters or groups of nodes that are more densely connected to each other than to the rest of the graph. In other words, nodes within the same community have many connections with each other and comparatively fewer connections with nodes outside the group.
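A minimal sketch of what step d can look like with the libraries mentioned earlier (`networkx` and `python-louvain`), assuming the entity graph has already been pulled out of Neo4j as a list of edges; the summarisation prompt is only a placeholder:

```python
import networkx as nx
import community as community_louvain  # provided by the python-louvain package

def detect_communities(edges: list[tuple[str, str]]) -> dict[int, list[str]]:
    """Group entity ids into communities using the Louvain algorithm."""
    graph = nx.Graph()
    graph.add_edges_from(edges)
    # Maps each node id to a community id.
    partition = community_louvain.best_partition(graph)
    communities: dict[int, list[str]] = {}
    for node, community_id in partition.items():
        communities.setdefault(community_id, []).append(node)
    return communities

def summarise_community(llm, members: list[str]) -> str:
    """Ask an LLM for a short Community Report; the prompt here is just a placeholder."""
    prompt = "Summarise what connects the following entities:\n" + "\n".join(members)
    return llm.invoke(prompt).content
```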
The result of this process in Neo4j looks something like this: data structured into entities and relationships with their properties, just as we wanted. Notably, Neo4j also offers the possibility of having multiple vector indexes in the same instance, and we exploit this feature to keep the embeddings of chunks separate from those of communities.

In the image above, you might have noticed that some nodes in the Graph are more connected to each other, while other nodes have fewer connections and lie on the borders of the Graph. Since the image is produced from the European Commission’s Press Corner PDFs, it is only normal that at the centre we find entities such as the President of the European Commission: in fact, those are among the most mentioned entities in our Knowledge Graph.
Below, you’ll find a more zoomed-in screenshot, where relationship and entity names are actually visible. The document node (light blue) at the centre carries the original filename as a property. Apparently, the extraction of entities and relationships via LLM worked fairly well on this one.

Once the Knowledge Graph has been created, we can employ LLMs and Agents to query it and ask questions about the available documents. Let’s go for it!
Graph-informed Retrieval Augmented Generation
Since the release of ChatGPT in late 2022, I have built my fair share of POCs and demos of Retrieval Augmented Generation, “chat with your documents” use cases.
All of them share the same methodology for giving the end user the desired answer: embed the user query, perform similarity search on the vector store of choice, retrieve chunks (pieces of knowledge) from the vector store, then pass the user’s query and the context obtained from those chunks to an LLM; finally, answer the question.
It can be useful to add some memory of the conversation (read: a chat history) and even callbacks to perform activities such as keeping track of the tokens spent in the process and of the latency of the answer. Many vector stores also allow for hybrid search, which is the same process mentioned above, only adding a filter on chunks based on their metadata before the similarity search even happens.
This is the level of complexity you get with this kind of RAG application: choose the number of texts you want to retrieve, predetermine the filters, choose the LLM responsible for answering. Eventually, these kinds of approaches reach an asymptote in terms of performance, and you might be left with only a handful of options for tweaking the LLM parameters to better handle user queries.
Instead, what does the RAG approach look like with a Knowledge Graph? The honest answer to that question is: it really boils down to what kind of questions you are going to ask.
While learning about Knowledge Graphs and their applications in real-world use cases, I spent a long time reading: blog posts, articles, Medium posts, even some books. The more I dug, the more questions came to my mind and the less definitive my answers: apparently, when dealing with knowledge that is structured BOTH in a graph representation and in vector indexes, a variety of options open up.
After my reading, I spent some time developing my own answers (and the code that goes with them) on strategies that can be applied when querying the Knowledge Graph using Large Language Models. What follows is a brief excursus on my take on the subject.
The reference source code is part of the `GraphAgentResponder` class, available at src/agents/graph_qa.py.
1. Enhanced RAG
First of all, you could always perform the usual RAG process: chunk embeddings are, after all, available in the Graph, and they can be retrieved as with any other vector database. Furthermore, you could also perform hybrid search, since those chunks are actually nodes, and nodes have properties (attributes) that can be used for filtering.
What if we went a step further and took into account that Chunk nodes are connected to each other by a NEXT relationship, and to the original Document node by a PART_OF relationship? Querying for similarity on embeddings and then enriching the context with neighbouring chunks might unlock new possibilities, both in the level of detail provided in the answer and in the ability to cite the source file and its metadata. A sketch of such a query follows.
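Here is a hedged sketch of that retrieval step, assuming a vector index over Chunk nodes named `chunk_index` (the actual index name in the repo may differ):

```python
ENHANCED_RAG_QUERY = """
CALL db.index.vector.queryNodes('chunk_index', $k, $query_embedding)
YIELD node AS chunk, score
OPTIONAL MATCH (chunk)-[:NEXT]-(neighbour:Chunk)
OPTIONAL MATCH (chunk)-[:PART_OF]->(doc:Document)
RETURN doc.filename AS source,
       chunk.text AS text,
       collect(DISTINCT neighbour.text) AS neighbouring_chunks,
       score
ORDER BY score DESC
"""

def enhanced_rag_context(session, query_embedding: list[float], k: int = 4) -> list[dict]:
    """Retrieve the top-k chunks plus their neighbours and source document."""
    result = session.run(ENHANCED_RAG_QUERY, query_embedding=query_embedding, k=k)
    return [record.data() for record in result]
```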

For instance, similarity search could end up pointing to chunk 2 of the previously mentioned document, but querying for neighbouring chunks could allow the LLM to provide a more detailed, contextualised answer.
To provide a working example, let’s consider a question about the nodes depicted in the screenshot shown earlier, concerning a press release focused on Artificial Intelligence: “What is the EU strategy on artificial intelligence?”. This is what our Agent answers:
> `The EU strategy on artificial intelligence is outlined in the AI Continent Action Plan, which aims to make Europe a global leader in AI. The plan focuses on five key pillars:`
> 1. `Building a large-scale AI data and computing infrastructure`
> 2. `Creating a true internal market for data (with a comprehensive Data Union Strategy to be launched in 2025)`
> 3. `Developing algorithms and fostering AI adoption in strategic EU sectors (through the Apply AI Strategy)`
> 4. `Strengthening AI skills and talents (through initiatives such as the Talent Pool, Marie Skłodowska-Curie Action, and AI fellowship schemes)`
>
> `The plan also includes initiatives such as setting up AI Factories and AI Gigafactories to support EU AI startups, industry, and researchers in developing AI models and applications.`
2. Community Reports
Another version of enhanced RAG that came to my mind while building this first version of the repository employs Community Reports.
In the last step of the ingestion phase, in fact, we used hierarchical clustering techniques such as the Leiden or Louvain algorithms to detect communities among nodes in the Knowledge Graph, and then employed an LLM to summarise those communities, obtaining Community Reports.
These reports are stored in the Graph under a different vector index from the one used for Chunks, and both of these indexes can be queried via similarity search. Further, the retrieved contexts can be combined when passing them to the LLM, obtaining answers that are grounded at the intra-document level but are also informed about what is happening in the graph at the inter-document level. A sketch of this combined retrieval follows.
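Here is a minimal sketch of the combined retrieval, assuming two vector indexes named `chunk_index` and `community_index` and a `summary` property on Community Report nodes (all names are illustrative):

```python
VECTOR_QUERY = """
CALL db.index.vector.queryNodes($index_name, $k, $query_embedding)
YIELD node, score
RETURN coalesce(node.text, node.summary) AS text, score
"""

def combined_context(session, query_embedding: list[float], k: int = 3) -> str:
    """Merge intra-document (chunk) and inter-document (community report) context."""
    chunks = list(session.run(VECTOR_QUERY, index_name="chunk_index",
                              k=k, query_embedding=query_embedding))
    reports = list(session.run(VECTOR_QUERY, index_name="community_index",
                               k=k, query_embedding=query_embedding))
    chunk_text = "\n\n".join(record["text"] for record in chunks)
    report_text = "\n\n".join(record["text"] for record in reports)
    return f"Document excerpts:\n{chunk_text}\n\nCommunity reports:\n{report_text}"
```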

This strategy is clearly better suited to those kinds of questions that need information from multiple sources, in the form of overviews or high-level summaries.
Let’s return to the “What is the EU strategy on artificial intelligence?” query and see how the bot answers now, being able to retrieve both chunks and community reports via similarity search:
> `The EU strategy on artificial intelligence aims to make the EU a leader in the development and deployment of AI that is trustworthy, safe, and respects human values and rights.`
>
>
> `To achieve this goal, in April 2019, the European Commission presented a Communication on AI, which outlined a European approach to AI. The approach focuses on:`
>
> 1. **`Promoting excellence in AI**: The EU wants to support research and innovation in AI, and encourage the development of AI startups and scale-ups.`
> 2. **`Ensuring trust in AI**: The EU aims to build trust in AI through a robust regulatory and legal framework. This includes the development of AI that respects EU values, such as the General Data Protection Regulation (GDPR).`
>
> `The strategy also highlights the need for:`
>
> - **`Investing in AI education and skills**: Developing the skills of the future workforce.`
> - **`Addressing the socio-economic changes driven by AI**: Preparing for changes in the job market.`
> - **`Establishing an EU AI ecosystem**: Encouraging collaboration and coordination among member states.`
>
> `The European Commission also established an **AI regulatory framework**, proposing to regulate AI based on a **risk-based approach**, with stricter rules for high-risk AI applications. Would you like more information on specific aspects of the EU AI strategy?`
Clearly, the answer is more high-level than before. This is expected, and is in fact what happens when getting access to inter-document contexts.
3. Cypher Queries
Moving away from the purely RAG-based strategy, a different option at our disposal, now that our knowledge base is structured as a graph, is to ask the LLM to traverse it using a graph query language. In Neo4j, this means that we want to give the LLM the schema of the graph and then ask it to write Cypher queries to inspect nodes, entities and relationships, based on the user’s question.
This is all possible thanks to the `GraphCypherQAChain`, a Chain class from LangChain for question answering against a graph by generating Cypher statements. A hedged usage sketch is shown below.
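Here is a sketch of how the chain can be wired up, assuming the `langchain-neo4j` integration package and placeholder connection details:

```python
from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
from langchain_ollama import ChatOllama

# Placeholder credentials for a local Neo4j instance.
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="password")
llm = ChatOllama(model="llama3.2")

chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,                      # the chain reads the graph schema from here
    return_intermediate_steps=True,   # keep the generated Cypher and its raw results
    verbose=True,
    allow_dangerous_requests=True,    # recent versions require acknowledging direct DB access
)

response = chain.invoke({"query": "Who is Thomas Regnier?"})
print(response["result"])
```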
In the example below you can see what happens if you ask the LLM the question “Who is Thomas Regnier?”.
The model writes a Cypher query similar to:
```cypher
MATCH (person:Person {name: "Thomas Regnier"})-[r]-(connected)
RETURN person.name AS name,
       type(r) AS relationship_type,
       labels(connected) AS connected_node_labels,
       connected
```
and, after looking at the intermediate results, answers with something like:
Thomas Regnier is the Contact person for Tech Sovereignty,
defence, space and Research of the European Commission

Another example question that you might want to ask, and that needs graph traversal capabilities to be answered, could be “Which documents mention Europe Direct?”. The question would lead the Agent to write a Cypher query that searches for the Europe Direct node → searches for Chunk nodes mentioning that node → follows the PART_OF relationship that goes from Chunk to Document node(s).
This is what the answer looks like:
> `The following documents mention Europe Direct:`
> 1. `STATEMENT/25/964`
> 2. `STATEMENT/25/1028`
> 3. `European Commission Press release (about Discover EU travel passes)`
> `These documents provide a phone number (00 800 67 89 10 11) and an email for Europe Direct for general public inquiries.`
Notice that this purely query-based approach might work best for questions that have a concise and direct answer contained within the Knowledge Graph, or when the Graph schema is well defined. In fact, the concept of a schema in the Graph is tightly linked with the Ontology concept mentioned in the ingestion part of this article: the more precise and descriptive the Ontology, the better defined the schema, and the easier it is for the LLM to write Cypher queries to inspect the Graph.
4. Community Subgraph
This strategy is a combination of the Community Report approach and the Cypher approach, and can be broken down into the following steps:
- obtain the most relevant Community Report(s) via similarity search;
- fetch the Chunks belonging to the most relevant communities;
- follow the MENTIONS relationship of those Chunks and use the community ids to obtain a community subgraph;
- pass the resulting context and a dictionary representing the subgraph to a Large Language Model to decide how to answer the user.

This is the most experimental of the strategies I have listed so far, with results that vary greatly between runs. They are not currently consistent, and the LLM often tends to get confused by all the information gathered. Nevertheless, I have the feeling this approach is worth exploring and investigating a bit more.
What is immediately clear is how this strategy could get really complicated very quickly. To handle this, I’m pondering whether to use filters of some sort when fetching the community subgraph, while a bigger context size for the LLM (or a bigger LLM) would also surely help.
5. Cypher + RAG
The last strategy I want to propose relies on the combination of the Enhanced RAG approach and the Cypher approach. In fact, it employs both the context from similarity search and the intermediate steps of the `GraphCypherQAChain` to come up with an exhaustive and coherent answer.
To give you an example of how the Agent might behave, let’s go back to one of the questions used for the query-based approach: “Which documents mention Europe Direct?”.
The answer is shown below. As can be noticed, it’s both shorter and more informative.
> `The following documents mention Europe Direct:`
>
> - `A press release about the European Commission offering 36,000 free EU travel passes to 18-year-olds, where general public inquiries can be made through Europe Direct by phone or email.`
> - `A statement about the European Commission's efforts to support businesses, workers, and Europeans, which includes contact information for general public inquiries through Europe Direct.`
> - `A press release about progress in research and innovation in Europe, which also provides contact information for general public inquiries through Europe Direct.`
>
> `You can contact Europe Direct by phone at 00 800 67 89 10 11 or by email.`
This answering method is currently one of the most complete approaches I came up with, and it also has a fallback strategy: if something goes wrong in the query generation part (say, a query is too complex to write, or the LLM dedicated to it reaches its token limit), the Agent can still rely on the Enhanced RAG approach, so that we still get an answer.
Summing up and approach comparison
In the previous few paragraphs, I presented my take on the different answering strategies available when our knowledge base is well organised into a Graph. My presentation, however, is far from complete: many other possibilities could be available, and I plan to continue studying the matter and come up with more options.
In my opinion, since Graphs unlock so many options, the goal should be to understand how these strategies behave under different scenarios, from lightweight semantic lookups to multi-hop reasoning over a richly linked knowledge graph, and how to make informed trade-offs depending on the use case.
When building real-world applications, it’s critical to weigh answering strategies not only by accuracy, but also by cost, speed, and scalability.
When deciding which strategy to employ, the key drivers we might want to look at are:
- Token Usage: how many tokens are consumed per query, especially when traversing multi-hop paths or injecting large subgraphs into the prompt;
- Latency: the time it takes to process a retrieval + generation cycle, including graph traversal, prompt construction, and model inference;
- Performance: the quality and relevance of the generated responses, with respect to semantic fidelity, factual grounding, and coherence.
Below, I present a comparison table breaking down the answering methods proposed in this section, in light of these drivers.

Closing Remarks
In this article, we walked through a complete pipeline for building and interacting with knowledge graphs using LLMs, from document ingestion all the way to querying the graph through a demo app.
We covered:
- How to ingest documents and transform unstructured content into a structured Knowledge Graph representation, using semantic concepts and relationships extracted via LLMs
- How to host the Knowledge Graph in Neo4j
- How to query the graph using a variety of strategies, from vector similarity and hybrid search to graph traversal and multi-hop reasoning, depending on the retrieval task
- How the pieces integrate into a fully functional demo created with Streamlit and containerised with Docker.
Now I would love to hear your opinions and comments… and contributions are also welcome!
If you find this project useful, have ideas for new features, or want to help improve the existing components, feel free to jump in, open issues, or send in Pull Requests.
Thanks for reading this far!
References
[1] The data showcased in this article comes from the European Commission’s Press Corner: https://ec.europa.eu/commission/presscorner/home/en. Press releases are available under the Creative Commons Attribution 4.0 International (CC BY 4.0) licence.