At ‘DevDay’, its annual developer conference, OpenAI introduced tools that allow developers to more easily build applications on top of the company’s AI models. Unlike last year, when it made waves by launching ‘GPT-4 Turbo’, ‘GPT Builder’, and the ‘GPT Store’, this year’s event seeks to expand the ecosystem centered on its AI models by introducing a range of tools for developers.
At the DevDay held in San Francisco on the 1st (local time), OpenAI presented ▲the ‘Realtime API’ for building voice-based applications for companies and developers, ▲‘Vision Fine-Tuning’, which fine-tunes ‘GPT-4o’ with images and text, ▲‘Model Distillation’, which fine-tunes small models using the output of large frontier models, and ▲‘Prompt Caching’, which reuses user input.
First, the ‘Realtime API’ is a cloud service that enables developers to equip applications with multimodal processing capabilities using OpenAI’s AI models. Through it, third-party applications can also add the ability to understand voice commands and read responses aloud.
Typically, an AI model must go through several steps to process voice commands. Developers must convert audio to text, feed that text into a model, and then convert the model’s text output into synthesized speech. The Realtime API lets developers stream audio directly to GPT-4o without going through these steps.
Not only does this simplify development, but it also reduces model latency. As a result, AI applications using the Realtime API can respond faster to user instructions. The service also allows applications to automatically perform tasks on external systems.
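For developers, the flow looks roughly like the minimal Python sketch below. The endpoint, model name, and event types follow OpenAI’s beta documentation at launch and should be treated as assumptions that may change as the API matures.

```python
# Minimal sketch of talking to the Realtime API over a WebSocket.
# Endpoint, model name, and event schema are assumptions based on
# OpenAI's beta docs at launch and may change.
import asyncio
import json
import os

import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    # extra_headers is the keyword name in websockets < 14
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Ask the model for a spoken response plus a text transcript.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["audio", "text"],
                "instructions": "Greet the user briefly.",
            },
        }))
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "response.text.delta":
                print(event["delta"], end="", flush=True)
            elif event.get("type") == "response.done":
                break

asyncio.run(main())
```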
Through this, the intention is for many companies and developers to use OpenAI’s services, increasing its revenue.
The same goes for ‘Vision Fine-Tuning’, aimed at developers building applications that process images. This tool allows developers to provide custom image data to GPT-4o to improve the output of computer vision tasks.
By improving the model’s image understanding capabilities, functions such as visual search and object detection for autonomous vehicles can be enhanced. For example, a business that uses GPT-4o to create website layouts can provide a set of sample designs to the model. As few as 100 images are enough to improve GPT-4o’s performance.
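A hedged sketch of that workflow with the official `openai` Python client follows. The JSONL message format and the `gpt-4o-2024-08-06` snapshot name are based on OpenAI’s fine-tuning documentation; the example URL and answer text are hypothetical.

```python
# Sketch: preparing and submitting a vision fine-tuning job.
# JSONL format and model snapshot name are assumptions drawn from
# OpenAI's fine-tuning docs; the sample data is hypothetical.
import json
from openai import OpenAI  # pip install openai

client = OpenAI()

# One training example: a prompt with an image plus the desired answer.
example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "Which sample layout is this?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/layout-01.png"}},
        ]},
        {"role": "assistant", "content": "Two-column landing page."},
    ]
}

with open("train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")  # repeat for ~100+ images

upload = client.files.create(file=open("train.jsonl", "rb"),
                             purpose="fine-tune")
client.fine_tuning.jobs.create(training_file=upload.id,
                               model="gpt-4o-2024-08-06")
```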
In addition, OpenAI introduced two features designed to lower inference costs. ‘Model Distillation’ achieves cost savings through an AI technique called knowledge distillation, which allows developers to replace large, high-performance models with smaller models that require less hardware.
When given the same prompt, a larger neural network is likely to produce a better response than a smaller one. Knowledge distillation lets developers take high-quality responses from the larger model and feed them to the smaller model, enabling it to produce output of comparable quality to high-end models while using less hardware.
Model distillation functionality is provided via the API. Developers can submit prompts to one of OpenAI’s cutting-edge models and turn that model’s responses into a training dataset, which can then be used to improve the quality of small models.
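Under those stated assumptions, the loop might look like the sketch below: capture the teacher model’s answers, write them out as a training file, and fine-tune a smaller student model (here `gpt-4o-mini`, as an example choice). The prompts are hypothetical placeholders; OpenAI’s own flow also supports storing completions server-side via the `store` flag.

```python
# Sketch of a distillation loop: collect a large model's answers,
# then fine-tune a small model on them. Prompts are placeholders;
# the student model choice (gpt-4o-mini) is an example assumption.
import json
from openai import OpenAI

client = OpenAI()
prompts = ["Summarize: ...", "Classify: ..."]  # hypothetical tasks

rows = []
for p in prompts:
    # The frontier model acts as the teacher.
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": p}],
        store=True,  # also keeps the completion server-side
    )
    rows.append({"messages": [
        {"role": "user", "content": p},
        {"role": "assistant",
         "content": resp.choices[0].message.content},
    ]})

with open("distill.jsonl", "w") as f:
    f.writelines(json.dumps(r) + "\n" for r in rows)

train = client.files.create(file=open("distill.jsonl", "rb"),
                            purpose="fine-tune")
client.fine_tuning.jobs.create(training_file=train.id,
                               model="gpt-4o-mini")
```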
The other feature OpenAI launched to reduce customers’ inference costs is ‘Prompt Caching’. It reuses the user’s input in certain situations to avoid repeating previously completed computation. OpenAI expects this feature to cut inference costs by up to 50% and improve response times.
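Since caching applies automatically to long, repeated prompt prefixes, the practical pattern is to put static instructions first and the per-request content last. A minimal illustrative sketch, assuming the `cached_tokens` usage field the API reports:

```python
# Sketch illustrating prompt structuring for automatic prompt caching.
# Assumption: caching kicks in for long, repeated prefixes, and hits
# are reported in usage.prompt_tokens_details.cached_tokens.
from openai import OpenAI

client = OpenAI()
STATIC_PREFIX = "You are a support agent. Policy manual: ..." * 200

for question in ["How do I reset my password?", "Where is my invoice?"]:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": STATIC_PREFIX},  # reused prefix
            {"role": "user", "content": question},         # varies per call
        ],
    )
    details = resp.usage.prompt_tokens_details
    print(question, "-> cached tokens:", details.cached_tokens)
```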
Meanwhile, OpenAI said, “Currently, 3 million developers in dozens of countries are experimenting with our technology.”
DevDay will be held three times this year, with further events in London on October 30th and Singapore on November 21st. OpenAI announced that no major models would be revealed at these events.
Reporter Park Chan cpark@aitimes.com