
Zero to Advanced Prompt Engineering with Langchain in Python


A very important aspect of Large Language Models (LLMs) is the number of parameters these models use for learning. The more parameters a model has, the better it can comprehend the relationships between words and phrases. As a result, models with billions of parameters have the capability to generate a variety of creative text formats and to answer open-ended and difficult questions in an informative way.

LLMs such as ChatGPT, which utilize the Transformer model, are proficient in understanding and generating human language, making them useful for applications that require natural language understanding. However, they are not without their limitations, which include outdated knowledge, inability to interact with external systems, lack of context understanding, and sometimes generating plausible-sounding but incorrect or nonsensical responses, among others.

Addressing these limitations requires integrating LLMs with external data sources and capabilities, which can be complex and demand extensive coding and data handling skills. This, coupled with the challenges of understanding AI concepts and complicated algorithms, contributes to the learning curve associated with developing applications using LLMs.

Nevertheless, the integration of LLMs with other tools to form LLM-powered applications could redefine our digital landscape. The potential of such applications is vast, including improving efficiency and productivity, simplifying tasks, enhancing decision-making, and providing personalized experiences.

In this article, we will delve deeper into these issues, exploring the advanced techniques of prompt engineering with Langchain, offering clear explanations, practical examples, and step-by-step instructions on how to implement them.

Langchain, a state-of-the-art library, brings convenience and flexibility to designing, implementing, and tuning prompts. As we unpack the principles and practices of prompt engineering, you will learn how to use Langchain’s powerful features to leverage the strengths of SOTA generative AI models like GPT-4.

Understanding Prompts

Before diving into the technicalities of prompt engineering, it is important to understand the concept of prompts and their significance.

A ‘prompt’ is a sequence of tokens used as input to a language model, instructing it to generate a specific type of response. Prompts play a vital role in steering the behavior of a model. They can impact the quality of the generated text and, when crafted appropriately, help the model provide insightful, accurate, and context-specific results.

Prompt engineering is the art and science of designing effective prompts. The goal is to elicit the desired output from a language model. By carefully choosing and structuring prompts, one can guide the model toward generating more accurate and relevant responses. In practice, this involves fine-tuning the input phrasing to cater to the model’s training and structural biases.

The sophistication of prompt engineering ranges from simple techniques, such as feeding the model relevant keywords, to more advanced methods involving the design of complex, structured prompts that use the internal mechanics of the model to its advantage.

Langchain: The Fastest Growing Prompt Tool

LangChain, launched in October 2022 by Harrison Chase, has become one of the most highly rated open-source frameworks on GitHub in 2023. It offers a simplified and standardized interface for incorporating Large Language Models (LLMs) into applications. It also provides a feature-rich interface for prompt engineering, allowing developers to experiment with different strategies and evaluate their results. By using Langchain, you can perform prompt engineering tasks more effectively and intuitively.

LangFlow serves as a user interface for orchestrating LangChain components into an executable flowchart, enabling quick prototyping and experimentation.

LangChain fills a vital gap in AI development for the masses. It enables an array of NLP applications, such as virtual assistants, content generators, question-answering systems, and more, to solve a range of real-world problems.

Rather than being a standalone model or provider, LangChain simplifies the interaction with diverse models, extending the capabilities of LLM applications beyond the constraints of a simple API call.

The Architecture of LangChain

 

LangChain’s main components include Model I/O, Prompt Templates, Memory, Agents, and Chains.

Model I/O

LangChain facilitates a seamless connection with various language models by wrapping them with a standardized interface known as Model I/O. This makes it easy to switch models for optimization or better performance. LangChain supports various language model providers, including OpenAI, HuggingFace, Azure, Fireworks, and more.
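As a minimal sketch of this interchangeability (assuming API keys for the chosen provider are already configured, and using illustrative model names), swapping providers is a one-line change:

from langchain.llms import OpenAI, HuggingFaceHub

llm = OpenAI(model_name="text-davinci-003")
# llm = HuggingFaceHub(repo_id="google/flan-t5-xxl")  # swap in a different provider with one line
print(llm("Summarize prompt engineering in one sentence."))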

Prompt Templates

These are used to manage and optimize interactions with LLMs by providing concise instructions or examples. Optimizing prompts enhances model performance, and the flexibility of templates makes it easier to construct and reuse inputs.

A simple example of a prompt template:

from langchain.prompts import PromptTemplate
prompt = PromptTemplate(
    input_variables=["subject"],
    template="What are the recent advancements in the field of {subject}?",
)
print(prompt.format(subject="Natural Language Processing"))

As we advance in complexity, we encounter more sophisticated patterns in LangChain, such as the Reason and Act (ReAct) pattern. ReAct is an important pattern for action execution, where the agent assigns a task to an appropriate tool, customizes the input for it, and parses its output to perform the task. The Python example below showcases a ReAct pattern. It demonstrates how a prompt is structured in LangChain, using a series of thoughts and actions to reason through a question and produce a final answer:

PREFIX = """Answer the next query using the given tools:"""
FORMAT_INSTRUCTIONS = """Follow this format:
Query: {input_question}
Thought: your initial thought on the query
Motion: your chosen motion from [{tool_names}]
Motion Input: your input for the motion
Remark: the motion's end result"""
SUFFIX = """Start!
Query: {input}
Thought:{agent_scratchpad}"""

Memory

Memory is a critical concept in LangChain, enabling LLMs and tools to retain information over time. This stateful behavior improves the performance of LangChain applications by storing previous responses, user interactions, the state of the environment, and the agent’s goals. The ConversationBufferMemory and ConversationBufferWindowMemory strategies keep track of the full or the most recent parts of a conversation, respectively. For a more sophisticated approach, the ConversationKGMemory strategy encodes the conversation as a knowledge graph, which can be fed back into prompts or used to predict responses without calling the LLM.
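A minimal sketch of attaching ConversationBufferMemory to a conversation chain (assuming an OpenAI API key is configured, as in the later examples):

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# The memory object stores every exchange and injects it into the next prompt.
conversation = ConversationChain(
    llm=OpenAI(temperature=0),
    memory=ConversationBufferMemory(),
    verbose=True,
)
conversation.predict(input="Hi, I am researching prompt engineering.")
print(conversation.predict(input="What did I say I was researching?"))

Because the buffered history is replayed with every call, the second question can refer back to the first without any extra plumbing.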

Agents

An agent interacts with the world by performing actions and tasks. In LangChain, agents combine tools and chains for task execution. An agent can establish a connection to the outside world to retrieve information and augment the LLM’s knowledge, thus overcoming its inherent limitations. Depending on the situation, it can decide to pass calculations to a calculator or a Python interpreter.

Agents are equipped with subcomponents:

  • Tools: These are functional components.
  • Toolkits: Collections of tools.
  • Agent Executors: The execution mechanism that allows choosing between tools.

Agents in LangChain also follow the Zero-shot ReAct pattern, where the decision relies only on the tool’s description. This mechanism can be extended with memory in order to take the full conversation history into account. With ReAct, instead of asking an LLM to autocomplete your text, you can prompt it to respond in a thought/act/observation loop.
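A minimal sketch of a zero-shot ReAct agent (assuming an OpenAI API key is configured and using the built-in llm-math tool):

from langchain.llms import OpenAI
from langchain.agents import AgentType, initialize_agent, load_tools

llm = OpenAI(temperature=0)
# llm-math wraps a calculator chain that the agent can delegate arithmetic to.
tools = load_tools(["llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
agent.run("What is 2 raised to the 10th power?")

With verbose=True, the intermediate Thought/Action/Observation steps are printed, making the ReAct loop visible.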

Chains

Chains, as the term suggests, are sequences of operations that allow the LangChain library to process language model inputs and outputs seamlessly. These integral components of LangChain are fundamentally made up of links, which can be other chains or primitives such as prompts, language models, or utilities.

Imagine a chain as a conveyor belt in a factory. Each step on this belt represents a certain operation, which could be invoking a language model, applying a Python function to a text, or even prompting the model in a particular way.

LangChain categorizes its chains into three types: Utility chains, Generic chains, and Combine Documents chains. We will focus on Utility and Generic chains for our discussion.

  • Utility Chains are specifically designed to extract precise answers from language models for narrowly defined tasks. For instance, let’s take a look at the LLMMathChain. This utility chain enables language models to perform mathematical calculations. It accepts a question in natural language, and the language model in turn generates a Python code snippet which is then executed to produce the answer.
  • Generic Chains, on the other hand, serve as building blocks for other chains but cannot be used standalone. These chains, such as the LLMChain, are foundational and are often combined with other chains to perform intricate tasks. For instance, the LLMChain is commonly used to query a language model object by formatting the input based on a provided prompt template and then passing it to the language model, as in the sketch after this list.
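A minimal sketch of an LLMChain (reusing the PromptTemplate from earlier and assuming an OpenAI API key is set):

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["subject"],
    template="What are the recent advancements in the field of {subject}?",
)
# The chain formats the template with the input variables and passes it to the model.
chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run(subject="Natural Language Processing"))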

Step-by-step Implementation of Prompt Engineering with Langchain

We will walk you through the process of implementing prompt engineering using Langchain. Before proceeding, make sure that you have installed the necessary software and packages.

You can make use of popular tools like Docker, Conda, Pip, and Poetry for setting up LangChain. The relevant installation files for each of these methods can be found in the LangChain repository at https://github.com/benman1/generative_ai_with_langchain. This includes a Dockerfile for Docker, a requirements.txt for Pip, a pyproject.toml for Poetry, and a langchain_ai.yml file for Conda.

In this article we will use Pip, the standard package manager for Python, to facilitate the installation and management of third-party libraries. If it is not included in your Python distribution, you can install Pip by following the instructions at https://pip.pypa.io/.

To install a library with Pip, use the command pip install library_name.

However, Pip does not manage environments on its own. To handle different environments, we use the tool virtualenv.

In the next section, we will discuss model integrations.

Step 1: Setting up Langchain

First, you need to install the Langchain package. We are using the Windows OS. Run the following command in your terminal to install it:

pip install langchain

Step 2: Importing Langchain and other necessary modules

Next, import Langchain together with other necessary modules. Here, we also import the transformers library, which is extensively used in NLP tasks.

import langchain
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

Step 3: Load Pretrained Model

OpenAI

OpenAI models can be conveniently interfaced with via the LangChain library or the OpenAI Python client library. Notably, OpenAI provides an Embedding class for text embedding models. Two key LLM models are GPT-3.5 and GPT-4, differing mainly in token length. Pricing for each model can be found on OpenAI’s website. While there are more sophisticated models like GPT-4-32K that accept more tokens, their availability via the API is not always guaranteed.

Accessing these models requires an OpenAI API key. This can be done by creating an account on OpenAI’s platform, setting up billing information, and generating a new secret key.

import os
os.environ["OPENAI_API_KEY"] = 'your-openai-token'

After successfully creating the key, you can set it as an environment variable (OPENAI_API_KEY) or pass it as a parameter during class instantiation for OpenAI calls.

Consider the following LangChain script showcasing the interaction with OpenAI models:

from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003")
# The LLM takes a prompt as an input and outputs a completion
prompt = "who's the president of america of America?"
completion = llm(prompt)
The present President of america of America is Joe Biden.

In this example, the LLM takes a simple factual question as input, processes it using the specified OpenAI model, and returns the completion.

Hugging Face

Hugging Face is best known for its free-to-use Transformers Python library, compatible with PyTorch, TensorFlow, and JAX, which includes implementations of models like BERT, T5, and many others.

Hugging Face also offers the Hugging Face Hub, a platform for hosting code repositories, machine learning models, datasets, and web applications.

To use Hugging Face as a provider for your models, you will need an account and API keys, which can be obtained from their website. The token can be made available in your environment as HUGGINGFACEHUB_API_TOKEN.
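Mirroring the OpenAI setup above, the token can be set from code before instantiating the model (the placeholder value below is illustrative):

import os
os.environ["HUGGINGFACEHUB_API_TOKEN"] = 'your-hf-token'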

Consider the following Python snippet that uses an open-source model developed by Google, the Flan-T5-XXL model:

from langchain.llms import HuggingFaceHub
llm = HuggingFaceHub(
    repo_id="google/flan-t5-xxl",
    model_kwargs={"temperature": 0.5, "max_length": 64},
)
prompt = "In which country is Tokyo?"
completion = llm(prompt)
print(completion)

This script takes a question as input and returns an answer, showcasing the knowledge and prediction capabilities of the model.

Step 4: Basic Prompt Engineering

To begin with, we will generate a simple prompt and see how the model responds.
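The snippets in this step use the tokenizer and model objects from the transformers library imported in Step 2. As a minimal sketch, assuming Google’s Flan-T5 base checkpoint (any seq2seq checkpoint from the Hugging Face Hub would work the same way), they can be loaded as follows:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: Flan-T5 base serves as the local model for the following examples.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")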

prompt="Translate the next English text to French: "{0}""
input_text="Hello, how are you?"
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors="pt")
generated_ids = model.generate(input_ids, max_length=100, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

In the above code snippet, we provide a prompt asking the model to translate English text into French. The language model then tries to translate the given text based on the prompt.

Step 5: Advanced Prompt Engineering

While the above approach works fine, it does not take full advantage of the power of prompt engineering. Let’s improve upon it by introducing some more complex prompt structures.

prompt="As a highly proficient French translator, translate the next English text to French: "{0}""
input_text="Hello, how are you?"
input_ids = tokenizer.encode(prompt.format(input_text), return_tensors="pt")
generated_ids = model.generate(input_ids, max_length=100, temperature=0.9)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))

In this code snippet, we modify the prompt to suggest that the translation is being done by a ‘highly proficient French translator’. The change in the prompt can lead to improved translations, because the model now assumes the persona of an expert.

Building an Academic Literature Q&A System with Langchain

We will build an Academic Literature Question and Answer system using LangChain that can answer questions about recently published academic papers.

First, to set up our environment, we install the necessary dependencies.

pip install langchain arxiv openai transformers faiss-cpu

Following the installation, we create a new Python notebook and import the necessary libraries:

from langchain.llms import OpenAI
from langchain.chains.qa_with_sources import load_qa_with_sources_chain
from langchain.docstore.document import Document
import arxiv

The core of our Q&A system is the ability to fetch relevant academic papers related to a certain field, here Natural Language Processing (NLP), using the arXiv academic database. To accomplish this, we define a function get_arxiv_data(max_results=10). This function collects the most recent NLP paper summaries from arXiv and encapsulates them into LangChain Document objects, using the summary as content and the unique entry id as the source.

We will use the arXiv API to fetch recent papers related to NLP:

def get_arxiv_data(max_results=10):
    search = arxiv.Search(
        query="NLP",
        max_results=max_results,
        sort_by=arxiv.SortCriterion.SubmittedDate,
    )
   
    documents = []
   
    for result in search.results():
        documents.append(Document(
            page_content=result.summary,
            metadata={"source": result.entry_id},
        ))
    return documents

This function retrieves the summaries of the most recent NLP papers from arXiv and converts them into LangChain Document objects. We use the paper’s summary as the content and its unique entry id (the URL of the paper) as the source.

Next, we define a helper function that passes the retrieved documents and a question to a question-answering chain and prints the resulting answer:

def print_answer(query):
    print(
        chain(
            {
                "input_documents": sources,
                "query": query,
            },
            return_only_outputs=True,
        )["output_text"]
    )                 

Let’s define our corpus and set up LangChain:

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(temperature=0))

With our academic Q&A system now ready, we can test it by asking a question:

print_answer("What are the recent advancements in NLP?")

The output will be the answer to your question, citing the sources from which the information was extracted. For instance:

Recent advancements in NLP include Retriever-augmented instruction-following models and a novel computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs).
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

You can easily switch models or alter the system as per your needs. For example, here we change to GPT-4, which ends up giving us a much better and more detailed response.

sources = get_arxiv_data(2)
chain = load_qa_with_sources_chain(OpenAI(model_name="gpt-4", temperature=0))
print_answer("What are the recent advancements in NLP?")

The output:

Recent advancements in Natural Language Processing (NLP) include the development of retriever-augmented instruction-following models for information-seeking tasks such as question answering (QA). These models can be adapted to various information domains and tasks without additional fine-tuning. However, they often struggle to follow the provided knowledge and may hallucinate in their responses. Another advancement is the introduction of a computational framework for solving alternating current optimal power flow (ACOPF) problems using graphics processing units (GPUs). This approach utilizes a single-instruction, multiple-data (SIMD) abstraction of nonlinear programs (NLP) and employs a condensed-space interior-point method (IPM) with an inequality relaxation strategy. This strategy allows for the factorization of the KKT matrix without numerical pivoting, which has previously hampered the parallelization of the IPM algorithm.
SOURCES: http://arxiv.org/abs/2307.16877v1, http://arxiv.org/abs/2307.16830v1

A token in GPT-4 can be as short as one character or as long as one word. For instance, GPT-4-32K can process up to 32,000 tokens in a single run, while GPT-4-8K and GPT-3.5-turbo support 8,000 and 4,000 tokens respectively. However, it is important to note that every interaction with these models comes with a cost that is directly proportional to the number of tokens processed, be it input or output.
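A rough way to estimate this cost before sending a request is to count tokens locally. The sketch below assumes the tiktoken library (a separate OpenAI package, not part of LangChain):

import tiktoken

# encoding_for_model returns the tokenizer used by the given OpenAI model.
encoding = tiktoken.encoding_for_model("gpt-4")
prompt_text = "What are the recent advancements in NLP?"
print(len(encoding.encode(prompt_text)))  # number of prompt tokens billed for the request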

In the context of our Q&A system, if a piece of academic literature exceeds the maximum token limit, the system will fail to process it in its entirety, thus affecting the quality and completeness of responses. To work around this issue, the text can be broken down into smaller parts that comply with the token limit.

FAISS (Facebook AI Similarity Search) helps quickly find the most relevant text chunks related to the user’s query. It creates a vector representation of each text chunk and uses these vectors to identify and retrieve the chunks most similar to the vector representation of a given question.

It is important to remember that even when using tools like FAISS, the need to divide the text into smaller chunks due to token limitations can sometimes result in a loss of context, affecting the quality of answers. Therefore, careful management and optimization of token usage are crucial when working with these large language models.

 
pip install faiss-cpu langchain

After ensuring the above libraries are installed, run

 
from langchain.embeddings.openai import OpenAIEmbeddings 
from langchain.vectorstores.faiss import FAISS 
from langchain.text_splitter import CharacterTextSplitter 
documents = get_arxiv_data(max_results=10)  # We can now feed in more data
document_chunks = []
splitter = CharacterTextSplitter(separator=" ", chunk_size=1024, chunk_overlap=0)
for document in documents:
    for chunk in splitter.split_text(document.page_content):
        document_chunks.append(Document(page_content=chunk, metadata=document.metadata))
search_index = FAISS.from_documents(document_chunks, OpenAIEmbeddings())
chain = load_qa_with_sources_chain(OpenAI(temperature=0))
def print_answer(query):
    print(
        chain(
            {
                "input_documents": search_index.similarity_search(query, k=4),
                "query": query,
            },
            return_only_outputs=True,
        )["output_text"]
    )

With the code complete, we now have a powerful tool for querying the latest academic literature in the field of NLP. Running the same question through the new chain returns, for instance:

 
Recent advancements in NLP include the use of deep neural networks (DNNs) for automatic text analysis and natural language processing (NLP) tasks such as spell checking, language detection, entity extraction, author detection, question answering, and other tasks.
SOURCES: http://arxiv.org/abs/2307.10652v1, http://arxiv.org/abs/2307.07002v1, http://arxiv.org/abs/2307.12114v1, http://arxiv.org/abs/2307.16217v1 

Conclusion

The integration of Large Language Models (LLMs) into applications has accelerated adoption across several domains, including language translation, sentiment analysis, and information retrieval. Prompt engineering is a powerful tool for maximizing the potential of these models, and Langchain is leading the way in simplifying this complex task. Its standardized interface, flexible prompt templates, robust model integrations, and the innovative use of agents and chains ensure optimal outcomes for LLMs’ performance.

However, despite these advancements, there are a few tips to keep in mind. As you use Langchain, it is essential to understand that the quality of the output depends heavily on the prompt’s phrasing. Experimenting with different prompt styles and structures can yield improved results. Also, remember that while Langchain supports a variety of language models, each has its strengths and weaknesses. Selecting the right one for your specific task is crucial. Lastly, it is important to remember that using these models comes with cost considerations, as token processing directly influences the cost of interactions.

As demonstrated in the step-by-step guide, Langchain can power robust applications, such as the Academic Literature Q&A system. With a growing user community and increasing prominence in the open-source landscape, Langchain promises to be a pivotal tool in harnessing the full potential of LLMs like GPT-4.
