How OpenAI is attempting to make ChatGPT safer and fewer biased


It’s not only freaking out journalists (a few of whom should really know higher than to anthropomorphize and hype up a dumb chatbot’s ability to have feelings.) The startup has also gotten a whole lot of heat from conservatives within the US who claim its chatbot ChatGPT has a “woke” bias

All this outrage is finally having an impact. Bing’s trippy content is generated by AI language technology called ChatGPT developed by startup OpenAI, and last Friday, OpenAI issued a blog post aimed toward clarifying how its chatbots should behave. It also released its guidelines on how ChatGPT should respond when prompted with things about US “culture wars.” The foundations include not affiliating with political parties or judging one group pretty much as good or bad, for instance. 

I spoke to Sandhini Agarwal and Lama Ahmad, two AI policy researchers at OpenAI, about how the corporate is making ChatGPT safer and fewer nuts. The corporate refused to comment on its relationship with Microsoft, but they still had some interesting insights. Here’s what that they had to say: 

Methods to get well answers: In AI language model research, one among the most important open questions is the right way to stop the models “hallucinating,” a polite term for making stuff up. ChatGPT has been utilized by thousands and thousands of individuals for months, but we haven’t seen the sort of falsehoods and hallucinations that Bing has been generating. 

That’s because OpenAI has used a way in ChatGPT called reinforcement learning from human feedback, which improves the model’s answers based on feedback from users. The technique works by asking people to select between a variety of various outputs before rating them by way of various different criteria, like factualness and truthfulness. Some experts consider Microsoft may need skipped or rushed this stage to launch Bing, although the corporate is yet to substantiate or deny that claim. 

But that method will not be perfect, in response to Agarwal. People may need been presented with options that were all false, then picked the choice that was the least false, she says. In an effort to make ChatGPT more reliable, the corporate has been specializing in cleansing up its dataset and removing examples where the model has had a preference for things which can be false. 

Jailbreaking ChatGPT: Since ChatGPT’s release, people have been attempting to “jailbreak” it, which implies finding workarounds to prompt the model to break its own rules and generate racist or conspiratory stuff. This work has not gone unnoticed at OpenAI HQ. Agarwal says OpenAI has undergone its entire database and chosen the prompts which have led to unwanted content with a purpose to improve the model and stop it from repeating these generations. 

OpenAI desires to listen: The corporate has said it would start gathering more feedback from the general public to shape its models. OpenAI is exploring using surveys or organising residents assemblies to debate what content needs to be completely banned, says Lama Ahmad. “Within the context of art, for instance, nudity will not be something that is considered vulgar, but how do you consider that within the context of ChatGPT within the classroom,” she says.


What are your thoughts on this topic?
Let us know in the comments below.


0 0 votes
Article Rating
Newest Most Voted
Inline Feedbacks
View all comments

Share this article

Recent posts

Would love your thoughts, please comment.x