
Introducing improvements to the fine-tuning API and expanding our custom models program


Assisted Fine-Tuning

At DevDay last November, we announced a Custom Model program designed to train and optimize models for a specific domain, in partnership with a dedicated group of OpenAI researchers. Since then, we have met with dozens of customers to assess their custom model needs and evolved our program to further maximize performance.

Today, we’re formally announcing our assisted fine-tuning offering as part of the Custom Model program. Assisted fine-tuning is a collaborative effort with our technical teams to leverage techniques beyond the fine-tuning API, such as additional hyperparameters and various parameter-efficient fine-tuning (PEFT) methods at a larger scale. It’s particularly helpful for organizations that need support setting up efficient training data pipelines, evaluation systems, and bespoke parameters and methods to maximize model performance for their use case or task.
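To make the baseline concrete, here is a minimal sketch of submitting a fine-tuning job through the standard API with explicit hyperparameters. The training examples, file name, base model, and hyperparameter values are all illustrative, not taken from the program described above; the API call itself requires the `openai` package and an API key, so it is shown commented out.

```python
import json

def build_training_file(examples, path="telecom_train.jsonl"):
    """Write chat-format fine-tuning examples to a JSONL file.

    Each example must be a dict with a "messages" list, the format
    the fine-tuning API expects for chat models.
    """
    lines = []
    for ex in examples:
        assert "messages" in ex, "each example needs a 'messages' list"
        lines.append(json.dumps(ex, ensure_ascii=False))
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")
    return path

# Toy example in the chat fine-tuning format (hypothetical content).
examples = [
    {"messages": [
        {"role": "system", "content": "You are a telecom support assistant."},
        {"role": "user", "content": "Why is my data speed reduced this month?"},
        {"role": "assistant", "content": "Your plan lowers speeds after the monthly data cap is reached."},
    ]},
]
path = build_training_file(examples)

# With the file in place, a job with custom hyperparameters looks like this
# (not run here; requires OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# f = client.files.create(file=open(path, "rb"), purpose="fine-tune")
# job = client.fine_tuning.jobs.create(
#     training_file=f.id,
#     model="gpt-3.5-turbo",  # illustrative base model
#     hyperparameters={"n_epochs": 3, "learning_rate_multiplier": 2.0},
# )
```

Assisted fine-tuning starts from this same workflow but goes beyond the hyperparameters the public API exposes.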

For instance, SK Telecom, a telecommunications operator serving over 30 million subscribers in South Korea, wanted to customize a model to be an expert in the telecommunications domain with an initial focus on customer service. They worked with OpenAI to fine-tune GPT-4 to improve its performance in telecom-related conversations in the Korean language. Over the course of multiple weeks, SKT and OpenAI drove meaningful performance improvements in telecom customer service tasks—a 35% increase in conversation summarization quality, a 33% increase in intent recognition accuracy, and an increase in satisfaction scores from 3.6 to 4.5 (out of 5) when comparing the fine-tuned model to GPT-4.
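Improvements like the intent-recognition figure above come from scoring both models on the same labeled test set. A minimal sketch of that comparison, using toy gold labels and stand-in model predictions (all hypothetical, not SKT's data):

```python
def accuracy(predictions, labels):
    """Fraction of predicted intents matching the gold labels."""
    assert len(predictions) == len(labels)
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)

# Toy gold labels and model outputs; in practice these would come from
# running each model over a held-out set of customer conversations.
gold        = ["billing", "roaming", "cancel", "billing", "upgrade"]
base_preds  = ["billing", "cancel",  "cancel", "upgrade", "upgrade"]  # baseline model
tuned_preds = ["billing", "roaming", "cancel", "billing", "cancel"]   # fine-tuned model

base_acc  = accuracy(base_preds, gold)              # 0.6
tuned_acc = accuracy(tuned_preds, gold)             # 0.8
relative_gain = (tuned_acc - base_acc) / base_acc   # ~33% relative improvement
```

The reported 33% gain is a relative improvement of this kind, computed against the baseline model's accuracy.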

Custom-Trained Model

In some cases, organizations need to train a purpose-built model from scratch that understands their business, industry, or domain. Fully custom-trained models embed new knowledge from a specific domain by modifying key steps of the model training process using novel mid-training and post-training techniques. Organizations that see success with a fully custom-trained model often have large quantities of proprietary data—millions of examples or billions of tokens—that they want to use to teach the model new knowledge or complex, unique behaviors for highly specific use cases.

For instance, Harvey, an AI-native legal tool for attorneys, partnered with OpenAI to create a custom-trained large language model for case law. While foundation models were strong at reasoning, they lacked the extensive knowledge of legal case history and other knowledge required for legal work. After testing out prompt engineering, RAG, and fine-tuning, Harvey worked with our team to add the depth of context needed to the model—the equivalent of 10 billion tokens' worth of data. Our team modified every step of the model training process, from domain-specific mid-training to customizing post-training processes and incorporating expert attorney feedback. The resulting model achieved an 83% increase in factual responses, and attorneys preferred the customized model’s outputs 97% of the time over GPT-4.
