More sophisticated approaches for solving much more complex tasks are being actively developed. While they significantly outperform simpler techniques in some scenarios, their practical usage remains somewhat limited. I'll mention two such techniques: self-consistency and Tree of Thoughts.
The authors of the self-consistency paper proposed the following approach. Instead of relying solely on the initial model output, they suggested sampling multiple times and aggregating the results through majority voting. Drawing both on intuition and on the success of ensembles in classical machine learning, this method improves the model's robustness.
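The aggregation step can be sketched in a few lines of Python. The majority vote itself is just a frequency count; the sampling call is shown as a comment and assumes the pre-1.0 openai package, with the prompt variable being a stand-in:

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among several sampled completions."""
    return Counter(answers).most_common(1)[0][0]

# Sampling n completions could look like this (pre-1.0 openai package):
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": prompt}],
#     n=5,               # number of samples to draw
#     temperature=0.7,   # some randomness so the samples differ
# )
# answers = [choice["message"]["content"] for choice in response["choices"]]

print(majority_vote(["42", "42", "41", "42", "40"]))  # -> 42
```

Note that temperature must be above zero here: with deterministic decoding all samples would be identical and voting would be pointless.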
You can also apply self-consistency without implementing the aggregation step. For tasks with short outputs, ask the model to suggest several options and select the best one.
Tree of Thoughts (ToT) takes this idea a step further. It proposes applying tree-search algorithms to the model's "reasoning thoughts", essentially backtracking when the model stumbles upon poor assumptions.
If you're interested, take a look at Yannic Kilcher's video reviewing the ToT paper.
For our particular scenario, using Chain-of-Thought reasoning is not necessary, but we can prompt the model to tackle the summarization task in two stages. First, it can condense the entire job description, and then summarize the derived summary with a focus on job responsibilities.
In this particular example, the results didn't show significant changes, but this approach works very well for many tasks.
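A minimal sketch of this two-stage idea: two prompt builders, where the model's answer to the first prompt is fed into the second. The wording here is illustrative, not the exact prompts from earlier, and ask_chatgpt is a hypothetical helper:

```python
def condense_prompt(job_description):
    # Stage 1: condense the whole job description
    return f"Summarize the job description delimited by <>.\n<{job_description}>"

def responsibilities_prompt(summary):
    # Stage 2: summarize the summary, focusing only on responsibilities
    return ("Summarize the text delimited by <>, "
            f"focusing only on job responsibilities.\n<{summary}>")

# Usage with any ask_chatgpt(prompt) helper that calls the API:
# summary = ask_chatgpt(condense_prompt(description))
# final = ask_chatgpt(responsibilities_prompt(summary))
```

Each stage is a separate API request, so this doubles the cost per input — worth keeping in mind for the cost discussion below.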
Few-shot Learning
The last technique we'll cover is called few-shot learning, also known as in-context learning. It's as simple as incorporating several examples into your prompt to give the model a clearer picture of your task.
These examples should not only be relevant to your task but also diverse enough to capture the variety in your data. "Labeling" data for few-shot learning can be a bit more challenging when you're using CoT, particularly if your pipeline has many steps or your inputs are long. Nonetheless, the results typically make it worth the effort. Also, keep in mind that labeling a few examples is far cheaper than labeling an entire training/testing set as in traditional ML model development.
If we add an example to our prompt, the model will understand the requirements even better. For instance, if we demonstrate that we'd prefer the final summary in bullet-point format, the model will mirror our template.
This prompt is quite overwhelming, but don't be afraid: it's just the previous prompt (v5) plus one labeled example with another job description, in the 'input description' -> 'output JSON' format.
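Assembling such a few-shot prompt can be sketched as follows. The example description and JSON here are made up for illustration:

```python
import json

def few_shot_prompt(base_prompt, examples, new_input):
    """Append labeled 'input description' -> 'output JSON' examples to a prompt."""
    parts = [base_prompt, "For example:"]
    for description, output in examples:
        parts.append(f"'{description}' -> '{json.dumps(output)}'")
    parts.append(f"Now process: '{new_input}'")
    return "\n".join(parts)

example = ("Senior Python developer, remote, backend team...",
           {"responsibilities": ["develop backend services"]})
prompt = few_shot_prompt("Extract a summary as JSON.", [example],
                         "Data analyst, London...")
print(prompt)
```

The same function scales to several examples — just pass a longer list, keeping the diversity requirement discussed above in mind.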
Summarizing Best Practices
To summarize the best practices for prompt engineering, consider the following:
- Don't be afraid to experiment. Try different approaches and iterate gradually, correcting the model and taking small steps at a time;
- Use separators in the input (e.g. <>) and ask for structured output (e.g. JSON);
- Provide a list of actions to complete the task. Whenever feasible, offer the model a set of actions and let it output its "internal thoughts";
- For short outputs, ask for multiple suggestions;
- Provide examples. If possible, show the model several diverse examples that represent your data, together with the desired output.
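Most of these practices can be combined into a single prompt template. A hedged sketch, with illustrative wording and JSON keys:

```python
def build_prompt(text):
    """Combine separators, a list of actions, and a structured-output request."""
    return (
        "You will be given a job description delimited by <>.\n"
        "Perform the following actions:\n"
        "1. Summarize the description in 3-5 sentences.\n"
        "2. Extract the list of job responsibilities.\n"
        "3. Output the result as JSON with keys 'summary' and "
        "'responsibilities'.\n"
        f"<{text}>"
    )

print(build_prompt("Machine Learning Engineer, Berlin..."))
```

From here, iterating means editing this one function and re-running, which keeps the experiment loop short.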
I'd say this framework offers a sufficient basis for automating a wide range of day-to-day tasks, like information extraction, summarization, text generation such as emails, etc. However, in a production environment, it is still possible to optimize models further by fine-tuning them on task-specific datasets. Additionally, plugins and agents are developing rapidly, but that's a whole different story altogether.
Prompt Engineering Course by DeepLearning.AI and OpenAI
Along with the earlier-mentioned talk by Andrej Karpathy, this blog post draws inspiration from the ChatGPT Prompt Engineering for Developers course by DeepLearning.AI and OpenAI. It's absolutely free, takes just a few hours to complete, and, my personal favorite, it allows you to experiment with the OpenAI API without even signing up!
It's a great playground for experimenting, so definitely check it out.
Wow, we covered quite a lot of information! Now, let's move forward and start building the application using the knowledge we've gained.
Generating OpenAI Key
To begin, you'll need to register an OpenAI account and create your API key. OpenAI currently offers $5 of free credit for three months to every individual. Follow the introduction to the OpenAI API page to register your account and generate your API key.
Once you have a key, create an OPENAI_API_KEY environment variable to access it in the code with os.getenv('OPENAI_API_KEY').
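Reading the key then looks like this. The sketch assumes you exported the variable in your shell beforehand, e.g. export OPENAI_API_KEY="sk-...":

```python
import os

# os.getenv returns None if the variable is not set -- warn early in that case
api_key = os.getenv("OPENAI_API_KEY")
if api_key is None:
    print("Warning: OPENAI_API_KEY is not set")
```

Keeping the key in an environment variable, rather than hard-coding it, also means it won't end up in your Git history by accident.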
Estimating the Costs with Tokenizer Playground
At this stage, you might be curious how much you can do with just a free trial and what options are available after the initial three months. It's a fair question to ask, especially when you consider that LLMs cost millions of dollars!
Of course, those millions are spent on training. Inference requests turn out to be quite affordable. While GPT-4 may be perceived as expensive (although the price is likely to decrease), gpt-3.5-turbo (the model behind the default ChatGPT) is still sufficient for the majority of tasks. In fact, OpenAI has done an incredible engineering job, given how inexpensive and fast these models are now, considering their original size in billions of parameters.
The gpt-3.5-turbo model comes at a cost of $0.002 per 1,000 tokens.
But how much is that? Let's see. First, we need to know what a token is. In simple terms, a token is a piece of a word. In English, you can expect around 14 tokens for every 10 words.
To get a more accurate estimate of the number of tokens for your specific task and prompt, the best approach is to try it out! Luckily, OpenAI provides a tokenizer playground that can help you with this.
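For a quick back-of-the-envelope estimate without leaving Python, the 14-tokens-per-10-words rule of thumb can be coded directly. Exact counting via OpenAI's tiktoken library is shown in the docstring as an option:

```python
def estimate_tokens(text):
    """Rough estimate for English text: ~14 tokens per 10 words.

    For exact counts, use OpenAI's tiktoken library:
        import tiktoken
        enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
        n = len(enc.encode(text))
    """
    return round(len(text.split()) * 1.4)

print(estimate_tokens("We are looking for a senior data scientist in London"))  # -> 14
```

This is only a heuristic for English; as the side note below explains, other languages can need substantially more tokens per word.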
Side note: Tokenization for Different Languages
Due to the widespread use of English on the Internet, this language benefits from the most optimal tokenization. As highlighted in the "All languages are not tokenized equal" blog post, tokenization is not a uniform process across languages, and some languages may require a larger number of tokens for representation. Keep this in mind if you want to build an application involving prompts in multiple languages, e.g. for translation.
To illustrate this point, let's look at the tokenization of pangrams in different languages. In this toy example, English required 9 tokens, French 12, Bulgarian 59, Japanese 72, and Russian 73.
Cost vs Performance
As you may have noticed, prompts can become quite lengthy, especially when incorporating examples. By increasing the length of the prompt, we potentially improve quality, but the cost grows at the same time as we use more tokens.
Our latest prompt (v6) consists of roughly 1.5k tokens.
Considering that the output length is typically in the same range as the input length, we can estimate an average of around 3k tokens per request (input tokens + output tokens). Multiplying this number by the price, we find that each request costs about $0.006, or 0.6 cents, which is quite affordable.
Even if we assume a slightly higher cost of 1 cent per request (corresponding to roughly 5k tokens), you'd still be able to make 100 requests for just $1. Additionally, OpenAI lets you set both soft and hard limits. With soft limits, you receive notifications when you approach your defined limit, while hard limits prevent you from exceeding the specified threshold.
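This arithmetic is easy to sanity-check in code (using the gpt-3.5-turbo price quoted above, which may change):

```python
PRICE_PER_1K_TOKENS = 0.002  # gpt-3.5-turbo, USD, at the time of writing

def request_cost(input_tokens, output_tokens):
    """Cost of one request in USD."""
    return (input_tokens + output_tokens) / 1000 * PRICE_PER_1K_TOKENS

print(request_cost(1500, 1500))      # the ~3k-token estimate, ~$0.006
print(1 / request_cost(2500, 2500))  # requests per dollar at ~5k tokens, ~100
```

Swapping in a different model's price per 1k tokens is a one-line change, which makes this handy when comparing models.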
For local use of your LLM application, you can comfortably configure a hard limit of $1 per month, ensuring that you stay within budget while enjoying the benefits of the model.
Streamlit App Template
Now, let's build a web interface to interact with the model programmatically, eliminating the need to manually copy prompts every time. We will do this with Streamlit.
Streamlit is a Python library that allows you to create simple web interfaces without HTML, CSS, or JavaScript. It's beginner-friendly and enables building browser-based applications with minimal Python knowledge. Let's now create a simple template for our LLM-based application.
First, we need the logic that will handle communication with the OpenAI API. In the example below, I assume the generate_prompt() function is defined and returns the prompt for a given input text (e.g. similar to what you saw before).
And that's it! Read more about the different parameters in OpenAI's documentation, but things work well right out of the box.
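That communication logic could be sketched like this, assuming the pre-1.0 openai Python package; the generate_prompt body here is a stand-in for the real prompt:

```python
import os

def generate_prompt(input_text):
    # Stand-in: replace with your actual prompt (e.g. v6 from earlier)
    return f"Summarize the job description delimited by <>.\n<{input_text}>"

def ask_chatgpt(input_text):
    """Send the generated prompt to the OpenAI chat API and return the answer."""
    import openai  # pre-1.0 interface; pip install openai
    openai.api_key = os.getenv("OPENAI_API_KEY")
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": generate_prompt(input_text)}],
        temperature=0,  # deterministic output is handy for debugging
    )
    return response["choices"][0]["message"]["content"]
```

Setting temperature to 0 keeps responses reproducible while you iterate on the prompt; raise it later if you want more varied output.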
With this code in place, we can design a simple web app. We need a field to enter some text, a button to process it, and a few output widgets. I prefer to have access to both the full model prompt and the output, for debugging and exploration purposes.
The code for the entire application will look something like this, and it can be found in this GitHub repository. I have added a placeholder function called toy_ask_chatgpt(), since sharing the OpenAI key is not a good idea. Currently, this application simply copies the prompt into the output.
Not counting function definitions and placeholders, it's only about 50 lines of code!
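A minimal version of such an app could look like the sketch below. The widget labels are illustrative, and streamlit is imported lazily inside main() so the toy_ask_chatgpt placeholder stays usable on its own:

```python
# app.py -- run with: streamlit run app.py

def toy_ask_chatgpt(prompt):
    # Placeholder: echoes the prompt instead of calling the paid API
    return prompt

def main():
    import streamlit as st  # pip install streamlit

    st.title("Job Description Summarizer")
    text = st.text_area("Paste a job description:")
    if st.button("Summarize"):
        prompt = f"Summarize the job description delimited by <>.\n<{text}>"
        st.text_area("Full prompt (for debugging):", prompt)
        st.text_area("Model output:", toy_ask_chatgpt(prompt))

if __name__ == "__main__":
    main()
```

To wire in the real model, replace toy_ask_chatgpt with a function that actually calls the OpenAI API, and keep your key out of the source code.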
And thanks to a recent Streamlit update, the app can now be embedded right in this article! So you should be able to see it right below.
Now you see how easy it is. If you wish, you can deploy your app with Streamlit Cloud. But be careful, since every request costs you money if you put your API key there!
In this blog post, I listed several best practices for prompt engineering. We discussed iterative prompt development, using separators, requesting structured output, Chain-of-Thought reasoning, and few-shot learning. I also provided you with a template to build a simple web app using Streamlit in under 100 lines of code. Now, it's your turn to come up with an exciting project idea and turn it into reality!
It's truly amazing how modern tools allow us to create complex applications in just a few hours. Even without extensive programming experience, deep Python proficiency, or a thorough understanding of machine learning, you can quickly build something useful and automate some tasks.
Don't hesitate to ask me questions if you're a beginner and want to create a similar project. I'll be more than happy to assist you and respond as soon as possible. Best of luck with your projects!