AI companies generally keep a tight grip on their models to discourage misuse. For instance, if you ask ChatGPT to give you someone’s phone number or instructions for doing something illegal, it will likely just tell you it cannot help. However, as many examples over time have shown, clever prompt engineering or model fine-tuning can sometimes get these models to say things they otherwise wouldn’t. The unwanted information must still be hiding somewhere inside the model, where it can be accessed with the right techniques.
Today, companies tend to deal with this issue by applying guardrails; the idea is to check whether the prompts or the AI’s responses contain disallowed material. Machine unlearning instead asks whether an AI can be made to forget a piece of information that the company doesn’t want it to know. The technique takes a leaky model and the specific training data to be redacted and uses them to create a new model, essentially a version of the original that never learned that piece of data. While machine unlearning has ties to older techniques in AI research, it’s only in the past couple of years that it has been applied to large language models.
Jinju Kim, a master’s student at Sungkyunkwan University who worked on the paper with Ko and others, sees guardrails as fences around the bad data, put in place to keep people away from it. “You can’t get through the fence, but some people will still try to go under the fence or over the fence,” says Kim. Unlearning, she says, instead attempts to remove the bad data altogether, so there is nothing behind the fence at all.
The way current text-to-speech systems are designed complicates the task somewhat, though. These so-called “zero-shot” models use examples of people’s speech to learn to re-create any voice, including voices not in the training set; with enough data, the model can be a good mimic when supplied with even a small sample of someone’s voice. So “unlearning” means a model must not only “forget” voices it was trained on but also learn not to mimic specific voices it wasn’t trained on. All the while, it still needs to perform well for other voices.
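For readers who want a concrete picture, a minimal Python sketch of the zero-shot setup described above might look like the following. The function name and its internals are illustrative assumptions, not Meta’s actual VoiceBox code.

```python
# Illustrative sketch only: a stand-in for a zero-shot TTS call, not a real API.
import numpy as np

def zero_shot_synthesize(text: str, reference_clip: np.ndarray,
                         sample_rate: int = 16_000) -> np.ndarray:
    """Return audio speaking `text` in the voice heard in `reference_clip`.

    A real zero-shot model would (1) encode the short reference clip into a
    speaker representation, (2) condition its generator on that representation
    plus the text, and (3) decode a waveform, which is why it can mimic
    speakers it never saw during training. This placeholder returns silence
    of a plausible length so the sketch stays runnable.
    """
    duration_s = max(1.0, 0.4 * len(text.split()))  # rough speech-rate guess
    return np.zeros(int(duration_s * sample_rate), dtype=np.float32)
```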
To show how to get those results, Kim taught a recreation of VoiceBox, a speech generation model from Meta, that when it was prompted to produce a text sample in one of the voices to be redacted, it should instead respond with a random voice. To make these voices realistic, the model “teaches” itself using random voices of its own creation.
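A simplified sketch of that recipe, under assumed method names (`sample_random_voice` and `synthesize` are placeholders, not the authors’ code): build fine-tuning targets that push redacted speakers toward self-generated random voices while leaving permitted speakers’ targets untouched.

```python
import random

def build_unlearning_targets(model, redacted_prompts, retained_examples):
    """Assemble one round of fine-tuning targets for voice unlearning."""
    targets = []

    # Forget set: for each redacted speaker prompt, the desired output is
    # speech in a random voice that the model itself generates, so the
    # replacement still sounds like natural speech.
    for reference_clip, text in redacted_prompts:
        random_voice = model.sample_random_voice()  # self-generated voice
        targets.append((text, reference_clip,
                        model.synthesize(text, random_voice)))

    # Retain set: permitted speakers keep their original target audio, so the
    # model's mimicry of allowed voices degrades as little as possible.
    for reference_clip, text, original_audio in retained_examples:
        targets.append((text, reference_clip, original_audio))

    random.shuffle(targets)
    return targets
```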
According to the team’s results, which are to be presented this week at the International Conference on Machine Learning, prompting the model to mimic a voice it has “unlearned” gives back a result that, according to state-of-the-art tools that measure voice similarity, mimics the forgotten voice more than 75% less effectively than the model did before. In practice, this makes the new voice unmistakably different. But the forgetfulness comes at a price: the model is about 2.8% worse at mimicking permitted voices. While these percentages are a bit hard to interpret, the demo the researchers released online offers very convincing results, both for how well redacted speakers are forgotten and how well the rest are remembered.
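One rough way to read the 75% figure: embed real and generated clips with a speaker-similarity model and compare scores before and after unlearning. The `speaker_embedding` function below is a placeholder for whatever similarity tool is used, not one the researchers name.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mimicry_drop(real_clip, clip_before, clip_after, speaker_embedding) -> float:
    """Fractional drop in speaker similarity after unlearning.

    `speaker_embedding` maps audio to a fixed-size speaker vector; a drop
    above 0.75 corresponds to mimicking the forgotten voice more than 75%
    less effectively than before.
    """
    target = speaker_embedding(real_clip)
    sim_before = cosine_similarity(speaker_embedding(clip_before), target)
    sim_after = cosine_similarity(speaker_embedding(clip_after), target)
    return 1.0 - sim_after / sim_before
```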
Ko says the unlearning process can take “several days,” depending on how many speakers the researchers want the model to forget. Their method also requires an audio clip about five minutes long for each speaker whose voice is to be forgotten.
In machine unlearning, pieces of data are often replaced with randomness so that they can’t be reverse-engineered back to the original. In this paper, the randomness for the forgotten speakers is very high, a sign, the authors claim, that those voices are truly forgotten by the model.
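One informal way to picture that check, under the same placeholder assumptions as the sketches above: prompt the unlearned model repeatedly with a redacted speaker and see whether the outputs scatter across speaker-embedding space rather than collapsing onto any single recoverable identity.

```python
import numpy as np

def voice_dispersion(unlearned_model, redacted_clip, text,
                     speaker_embedding, n_samples: int = 10) -> float:
    """Average pairwise distance between speaker embeddings of repeated
    generations for one redacted prompt; higher values suggest the model
    answers with essentially random voices rather than one identity."""
    embeddings = [
        speaker_embedding(unlearned_model.synthesize(text, redacted_clip))
        for _ in range(n_samples)
    ]
    distances = [np.linalg.norm(embeddings[i] - embeddings[j])
                 for i in range(n_samples)
                 for j in range(i + 1, n_samples)]
    return float(np.mean(distances))
```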
“I have seen people optimizing for randomness in other contexts,” says Vaidehi Patil, a PhD student at the University of North Carolina at Chapel Hill who researches machine unlearning. “This is one of the first works I’ve seen for speech.” Patil is organizing a machine unlearning workshop affiliated with the conference, and the voice unlearning research will also be presented there.