OpenAI has named strengthening ‘reasoning’ and ‘tool use’ capabilities as its priorities for improving the performance of artificial intelligence (AI) agents. The company also explained that the recently released Realtime API and ChatGPT search were essential steps toward developing AI agents.
Olivier Godement, head of product for the OpenAI platform, shared details about the AI agents currently under development in an interview with MIT Technology Review on the 5th (local time).
Godement first emphasized that the recently released Realtime API and ChatGPT search ultimately serve AI agent functionality.
“In a few years, every person and every business on the planet will have a personalized agent,” he said. “That agent will know you thoroughly.” The agent will be able to access email, apps, calendars, and more on the user’s behalf, and can act like a “general manager,” interacting with each tool and solving long-running problems, such as writing a paper on a given topic.
OpenAI’s strategy is not only to build agents itself but also to help developers build their own. To this end, the Realtime API, which lets developers build chatbots using OpenAI technology, is essential.
Advanced Voice Mode (AVM), supported through the Realtime API, is expected to play an important role in improving not only the feel of the agent but also its usability. “Most apps today are chat-based, but that doesn’t suit every use case,” he said. “For use cases where you can’t type or see the screen, voice is essential.”
But Godement pointed out that two major obstacles must be overcome before agents can truly become a reality.
The first is reasoning. For an AI agent to complete complex tasks and handle them appropriately, it must deliver reliable performance.
Accordingly, OpenAI said it was able to strengthen its agent capabilities after developing the o1 model. Giving o1 more time to generate an answer allows it to recognize and correct mistakes, break problems into smaller ones, and try different approaches to answering a question.
Of course, there is some skepticism about o1’s reasoning ability. Last month, Apple researchers published a study arguing that “AI is fundamentally unable to perform reasoning tasks, and only its pattern-matching ability has improved.”
Godement also acknowledged that much work remains. In the short term, the goal is to make reasoning models like o1 more reliable, faster, and cheaper. In the long term, the goal is to expand reasoning capabilities, currently focused on math, science, and coding, to other fields such as law, accounting, and economics.
The second is the ability to connect to various tools. A representative example is search. If an agent has to rely only on its existing training data, its capabilities will inevitably be limited.
An agent must be able to search and take action in the real world. As with Anthropic’s ‘Computer Use’, this refers to the ability to work with interfaces, interact with them, and actually operate a computer.
“o1 can use tools to some extent, but there is still a lot of room for improvement,” he said.
Many AI agents are expected to appear next year thanks to the development of related technologies. However, he said it is difficult to predict how people will adopt and use this technology.
“To be honest, when I look back each year, there are a lot of unexpected use cases that surprise me,” he said. “I expect there will be quite a few surprises that nobody anticipated.”
Meanwhile, according to a report by The Information on the 22nd of last month, OpenAI introduced a general-purpose AI agent internally and even conducted a demo. Godement’s interview effectively confirms this.
Reporter Lim Da-jun ydj@aitimes.com