On the second day of its 12-day announcement series, OpenAI previewed 'Reinforcement Fine-Tuning' for its reasoning model 'o1' and said the feature will be officially released next year.
In a livestream held on the 6th (local time), OpenAI demonstrated the reinforcement fine-tuning feature, which can create a fine-tuned expert model tailored to tasks in a selected domain.
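The feature works from a set of domain-specific training examples that the model is graded against during fine-tuning. As a rough illustration only: the sketch below builds a JSONL file of prompt/reference-answer pairs in the style of OpenAI's supervised fine-tuning data format. The exact schema for the reinforcement fine-tuning preview (including how graders are specified) was not public at announcement time, and the field names here (`reference_answer`) and the legal-domain content are hypothetical.

```python
import json

# Hypothetical domain-specific training examples for a reinforcement
# fine-tuning run: each prompt is paired with a reference answer that a
# grader could score the model's output against. The schema is assumed,
# not OpenAI's published format.
examples = [
    {
        "messages": [{"role": "user",
                      "content": "Which clause of the sample contract limits liability?"}],
        "reference_answer": "Clause 7(b)",
    },
    {
        "messages": [{"role": "user",
                      "content": "What filing deadline applies in the sample scenario?"}],
        "reference_answer": "30 days from service",
    },
]

def to_jsonl(records):
    """Serialize records to JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
print(len(jsonl.splitlines()))  # → 2
```

With only dozens of such examples (rather than the thousands typical of supervised fine-tuning), the reinforcement approach is reported to push the model toward expert-level answers in the target domain.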
Mark Chen, Senior Vice President of Research at OpenAI, explained, "With our reinforcement learning algorithms, we have raised the model from high-school level to expert-doctorate level," and added, "Today is a preview of a product that will be officially released next year."
Last week, OpenAI announced that it had applied this technology with Thomson Reuters, the first company to build a custom 'o1' model for business use. Thomson Reuters, which applied the custom o1 model to its legal AI assistant, said, "The increased inference capability has shown a remarkable ability to find subtle cases previously missed by even highly capable models such as 'GPT-4.'"
Thanks to o1's improved reasoning ability, the technology is said to be well suited to fields where hallucinations must be avoided, such as law, finance, engineering, and insurance.
OpenAI also opened a reinforcement fine-tuning research program on its website. Participants can access an alpha version of the API and test the technology on domain-specific tasks. Applications are being accepted starting today.
Meanwhile, OpenAI kicked off the 12-day announcement event by unveiling 'o1 Pro', the full version of 'o1', and the 'ChatGPT Pro' plan on the first day. The company plans to take a break over the weekend and begin day three next Monday.
Reporter Lim Da-jun ydj@aitimes.com
