GPT-5 is a powerful model with many helpful features. It has a wide range of parameters and options to select from, which you have to choose appropriately to optimize GPT-5's performance in your application area.
In this article, I'll deep-dive into the options you have when using GPT-5 and help you select the optimal settings for your use case. I'll discuss the input modalities you can use and the features GPT-5 offers, such as tools and file upload, and I'll cover the parameters you can set for the model.
This article isn't sponsored by OpenAI; it is simply a summary of my experience using GPT-5 and a discussion of how to use the model effectively.
Why you should use GPT-5
GPT-5 is a very powerful model you can utilize for a wide selection of tasks. You can, for example, use it for a chatbot assistant or to extract important metadata from documents. However, GPT-5 also has a lot of different options and settings, many of which you can read more about in OpenAI's guide to GPT-5. I'll discuss how to navigate all of these options and utilize GPT-5 optimally for your use case.
Multimodal abilities
GPT-5 is a multimodal model, meaning you can input text, images, and audio, and the model will output text. You can also mix different modalities in the input, for example, providing an image together with a prompt asking about the image, and receive a response. Text input is, of course, expected from an LLM, but the ability to input images and audio is very powerful.
As I've discussed in previous articles, VLMs are extremely powerful because of their ability to understand images directly, which often works better than performing OCR on an image and then understanding the extracted text. The same concept applies to audio. You can, for example, send in an audio clip directly and analyze not only the words in the clip, but also the pitch, talking speed, and so on. Multimodal understanding simply gives you a deeper understanding of the data you're analyzing.
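To make the image-plus-prompt idea concrete, here is a minimal sketch of building a multimodal message for the Responses API. The `input_text`/`input_image` content types and the base64 data-URL approach reflect my reading of OpenAI's docs; treat the exact field names as assumptions to verify against the current API reference.

```python
import base64

def build_image_question(image_bytes: bytes, question: str, mime: str = "image/png"):
    """Build a Responses API message pairing an image with a text prompt.

    The image is embedded as a base64 data URL, so no external hosting is needed.
    """
    data_url = f"data:{mime};base64," + base64.b64encode(image_bytes).decode("ascii")
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_text", "text": question},
                {"type": "input_image", "image_url": data_url},
            ],
        }
    ]

messages = build_image_question(b"\x89PNG...", "What is shown in this image?")
# Then: client.responses.create(model="gpt-5", input=messages)
```

The same pattern extends to audio by swapping in the corresponding audio content type.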
Tools
Tools are another powerful feature you have available. You can define tools that the model can call during execution, which turns GPT-5 into an agent. A simple example of a tool is the get_weather() function:
def get_weather(city: str):
    return "Sunny"
You can then make your custom tools available to the model, along with a description and the parameters of your function:
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get today's weather.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city you want the weather for",
                },
            },
            "required": ["city"],
        },
    },
]
It's important to provide detailed and descriptive information in your function definitions, including a description of the function and of the parameters it takes.
You can define a lot of tools to make available to your model, but it's important to remember the core principles for AI tool definitions:
- Tools are well described
- Tools don’t overlap
- Make it obvious to the model when to use each tool; ambiguity makes tool usage ineffective
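When GPT-5 decides to call a tool, the Responses API returns a function-call item whose arguments arrive as a JSON string; your code executes the function and sends the result back. The sketch below shows the local dispatch half of that loop; the item field names (`function_call`, `call_id`, `function_call_output`) are from OpenAI's function-calling docs as I recall them, so verify them against the current reference.

```python
import json

def get_weather(city: str) -> str:
    return "Sunny"

# Map tool names (as declared in `tools`) to local implementations.
TOOL_REGISTRY = {"get_weather": get_weather}

def handle_function_call(name: str, arguments: str) -> str:
    """Execute a tool call requested by the model.

    `arguments` is a JSON string, matching how the API delivers it.
    """
    kwargs = json.loads(arguments)
    return TOOL_REGISTRY[name](**kwargs)

# When an item of type "function_call" appears in response.output, run it
# and append {"type": "function_call_output", "call_id": ..., "output": ...}
# to the next request's input:
output = handle_function_call("get_weather", '{"city": "Oslo"}')
# output == "Sunny"
```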
Parameters
There are three main parameters you should care about when using GPT-5:
- Reasoning effort
- Verbosity
- Structured output
I'll now describe the different parameters and how to approach choosing them.
Reasoning effort
Reasoning effort is a parameter where you select between four levels: minimal, low, medium, and high.
Minimal reasoning essentially makes GPT-5 a non-reasoning model and should be used for simpler tasks where you need quick responses. You can, for example, use minimal reasoning effort in a chat application where the questions are simple to answer and users expect rapid responses.
The harder your task is, the more reasoning you should use, though you should keep in mind the cost and latency of using more reasoning. Reasoning counts as output tokens, priced, at the time of writing, at 10 USD per million output tokens for GPT-5.
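The cost impact is easy to estimate: since reasoning tokens are billed as output tokens, a request that burns 20,000 reasoning tokens plus 1,000 visible output tokens costs the same as 21,000 plain output tokens. A quick back-of-the-envelope helper (the 10 USD rate is the price quoted above, at time of writing):

```python
def output_cost_usd(output_tokens: int, price_per_million: float = 10.0) -> float:
    """Estimate output-token cost; reasoning tokens are billed as output tokens."""
    return output_tokens * price_per_million / 1_000_000

# 20,000 reasoning tokens + 1,000 visible output tokens:
print(round(output_cost_usd(21_000), 3))  # 0.21
```

At high reasoning effort, the hidden reasoning tokens can easily dominate the bill.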
I usually experiment with the model starting from the lowest reasoning effort. If I notice the model struggles to give high-quality responses, I move up one reasoning level, first from minimal -> low. I then continue to test the model and see how well it performs. You should aim to use the lowest reasoning effort that gives acceptable quality.
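This escalation ladder is trivial to encode, which is handy if you want an evaluation script to walk up the levels automatically (the four level names match the `reasoning.effort` values accepted by the API):

```python
EFFORTS = ["minimal", "low", "medium", "high"]

def next_effort(current: str) -> str:
    """Return the next reasoning effort level, capped at 'high'."""
    i = EFFORTS.index(current)
    return EFFORTS[min(i + 1, len(EFFORTS) - 1)]

next_effort("minimal")  # "low"
next_effort("high")     # "high"
```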
You possibly can set the reasoning effort with:
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "reasoning": {"effort": "medium"},  # can be: minimal, low, medium, high
}
client.responses.create(**request_params)
Verbosity
Verbosity is another important configurable parameter, and you can choose between low, medium, and high.
Verbosity controls how many output tokens (excluding thinking tokens) the model should produce. The default is medium verbosity, which OpenAI has stated is essentially the setting used for their previous models.
If you want the model to generate longer and more detailed responses, you should set verbosity to high. However, I mostly find myself choosing between low and medium verbosity.
- For chat applications, medium verbosity is good, because a very concise model may make users feel it is less helpful (a lot of users prefer some more detail in responses).
- For extraction purposes, however, where you only want to output specific information, such as the date from a document, I set the verbosity to low. This helps ensure the model only responds with the output I want (the date), without providing additional reasoning and context.
You possibly can set the verbosity level with:
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"verbosity": "medium"},  # can be: low, medium, high
}
client.responses.create(**request_params)
Structured output
Structured output is a powerful setting you can use to ensure GPT-5 responds in JSON format. This is again useful if you want to extract specific datapoints and no other text, such as the date from a document. It guarantees that the model responds with a valid JSON object, which you can then parse. All metadata extraction I do uses structured output, because it is incredibly useful for ensuring consistency. You can enable structured output by adding the "text" key to the request params sent to GPT-5, as below.
from openai import OpenAI

client = OpenAI()
request_params = {
    "model": "gpt-5",
    "input": messages,
    "text": {"format": {"type": "json_object"}},
}
client.responses.create(**request_params)
Make sure to mention "JSON" in your prompt; if you don't, you'll get an error when using structured output.
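Since JSON mode guarantees syntactically valid JSON, the response text can be parsed directly with the standard library, without the defensive regex-stripping that free-text responses often need. A minimal sketch (the literal string stands in for `response.output_text`):

```python
import json

def parse_extraction(output_text: str) -> dict:
    """Parse the model's guaranteed-valid JSON response into a dict."""
    return json.loads(output_text)

# With JSON mode enabled, response.output_text is a JSON string:
fields = parse_extraction('{"date": "2024-05-17"}')
fields["date"]  # "2024-05-17"
```

Note that JSON mode guarantees valid JSON, not any particular schema; if you need guaranteed field names and types, look at OpenAI's stricter JSON-schema output format instead.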
File upload
File upload is another powerful feature available through GPT-5. I discussed the model's multimodal abilities earlier; however, in some scenarios it's useful to upload a document directly and have OpenAI parse it. For example, if you haven't performed OCR or extracted images from a document yet, you can instead upload the document directly to OpenAI and ask it questions. From experience, uploading files is also fast, and you'll often get rapid responses, depending mostly on the reasoning effort you ask for.
If you need quick responses from documents and don't have time to use OCR first, file upload is a powerful feature you should use.
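The flow is typically two steps: upload the file, then reference its id in a Responses API message. The `purpose="user_data"` value and the `input_file` content type are my recollection of OpenAI's file-input docs, so check them against the current API reference. The message-building half is shown as a plain function:

```python
def build_file_question(file_id: str, question: str):
    """Build a Responses API message asking a question about an uploaded file."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": file_id},
                {"type": "input_text", "text": question},
            ],
        }
    ]

# Sketch of the full flow:
# uploaded = client.files.create(file=open("report.pdf", "rb"), purpose="user_data")
# client.responses.create(model="gpt-5",
#                         input=build_file_question(uploaded.id, "What is the invoice date?"))
messages = build_file_question("file-abc123", "What is the invoice date?")
```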
Downsides of GPT-5
GPT-5 also has some downsides. The main downside I've noticed during use is that OpenAI doesn't share the thinking tokens when you use the model; you can only access a summary of the thinking.
This is very restrictive in live applications, because if you want to use higher reasoning efforts (medium or high), you cannot stream anything from GPT-5 to the user while the model is thinking, making for a poor user experience. The alternative is to use lower reasoning efforts, which results in lower-quality outputs. Other frontier model providers, such as Anthropic and Google (with Gemini), both make thinking tokens available.
There's also been a lot of discussion about GPT-5 being less creative than its predecessors, though this is usually not a big problem for the applications I work on, since creativity often isn't a requirement for API usage of GPT-5.
Conclusion
In this article, I've provided an overview of GPT-5's parameters and options and how to utilize the model most effectively. Used right, GPT-5 is a very powerful model, though it naturally also comes with some downsides, the main one from my perspective being that OpenAI doesn't share the reasoning tokens. When working on LLM applications, I always recommend having backup models available from other frontier model providers. This could, for example, mean having GPT-5 as the main model, but falling back to Gemini 2.5 Pro from Google if it fails.
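The fallback pattern is provider-agnostic and easy to wrap around whatever SDK calls you use. A minimal sketch with hypothetical stand-in callables (in practice, `primary` and `backup` would wrap the GPT-5 and Gemini SDK calls):

```python
def with_fallback(primary, backup):
    """Return a callable that tries `primary` and falls back to `backup` on error."""
    def run(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception:
            return backup(prompt)
    return run

# Hypothetical stand-ins for the two providers:
def flaky_gpt5(prompt: str) -> str:
    raise RuntimeError("GPT-5 unavailable")

def gemini(prompt: str) -> str:
    return f"gemini: {prompt}"

ask = with_fallback(flaky_gpt5, gemini)
ask("hello")  # "gemini: hello"
```

In production you would likely narrow the `except` clause to the provider's specific error types (rate limits, timeouts) rather than catching everything.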
👉 Find me on socials:
🧑💻 Get in touch
✍️ Medium
You can also read my other articles:
