OpenAI has significantly improved the performance of its AI agent ‘Operator’ by applying its latest reasoning model, ‘o3’.
OpenAI announced on the 23rd (local time) that Operator's underlying model will be switched from ‘GPT-4o’ to o3. It will be offered as a research preview to subscribers of the $200-per-month ChatGPT Pro plan.
Operator is OpenAI's computer-using agent (CUA), released in January, which can carry out users' tasks in a web browser. It autonomously performs a range of web-based tasks, such as making reservations, shopping online, and gathering information, by clicking, typing, and scrolling.
However, its initial performance was not very impressive. With this update, performance is expected to improve, since o3 is the best-performing of OpenAI's models. o3 is a reasoning-based model that shows particularly strong gains on complex instructions and browser interactions.
Operator does not use the user's existing web browser. Instead, it works through a cloud-based virtual browser environment (operator.chatgpt.com) built by OpenAI. Users can watch their requests being carried out in real time, and there are built-in security and privacy features such as watch mode and restrictions on high-risk websites.
The new o3-based Operator is better than the current version in accuracy, persistence, and clarity. For example, when handling a restaurant reservation request, the o3 version presented location, Michelin rating, and seating information in a table, whereas the previous version's output was judged to be lacking in information and structure.

Benchmark results also showed clear performance improvements.
On ‘OSWorld’, a benchmark that evaluates browser-based task handling, the o3 version scored 42.9 points, surpassing the GPT-4o version's 38.1 points. “The score difference is not large,” OpenAI said, “but due to the limitations of the automated evaluation system, the actual performance difference is likely to reach 20 points.”
On another benchmark, WebArena, the o3 version scored 62.9 points and the GPT-4o version scored 48.1 points.
In particular, on ‘GAIA’, a benchmark that evaluates high-level agent ability, the o3 version scored 62.2 points, far ahead of the GPT-4o version, which scored only 12.3. This result suggests that the o3 version can respond much more effectively to tasks that require complex instructions or multi-step processing.
In a user preference survey, the o3 version showed a strong advantage in style, response structure, and the ability to follow instructions.
Operator is still a research preview and is not available to general users. In addition, the Responses API version of Operator will stay on the GPT-4o model for the time being.
Nevertheless, with this upgrade, OpenAI is likely to stand out in the agent field.
OpenAI plans to expand Operator to general consumers, companies, and institutions.
By Park Chan, reporter cpark@aitimes.com