China has reportedly been strengthening its censorship of artificial intelligence (AI) chatbots. It is even said to be forcing re-reviews of chatbots that previously received government approval, in an effort to build “socialist AI.”
The Financial Times reported on the 17th (local time) that the Cyberspace Administration of China (CAC) has compelled large technology companies and AI startups, including ByteDance, Alibaba, Moonshot, and 01.AI, to take part in mandatory government reviews of their AI models.
According to the report, the censorship takes the form of testing these companies’ large language models (LLMs) to see whether they “embody core socialist values.” It involves reviewing the LLMs’ answers to a large set of questions, many of them touching on politically sensitive subjects in China and on President Xi Jinping.
This work is being conducted by officials from the CAC’s regional branches across the country, and includes reviewing the models’ training data and safety processes.
In China, AI chatbots must pass government censorship before they can be offered to the public. ByteDance and Alibaba have been operating chatbots since receiving government approval last year, but they are now undergoing re-examination under the recently strengthened censorship.
“A special team from the CAC came to our office and conducted an audit in a conference room,” said an employee of an AI company in Hangzhou who requested anonymity.
“We didn’t pass on the first try,” he explained. “We struggled to figure out why. We passed on the second try, but the whole process took months.”
The government’s stringent approval process has forced Chinese AI companies to learn quickly how to control their LLMs, according to this testimony. Many insiders said the task is difficult and complicated, because LLMs are trained on vast amounts of English-language content.
“Safety filtering is very important because our base model produces very unconstrained answers,” said an employee at a prominent AI startup in Beijing.
The filtering work begins with removing problematic information from the training data and building a database of sensitive keywords.
According to guidelines for AI companies that China released in February, companies must collect thousands of sensitive keywords and questions that “violate core socialist values, such as inciting subversion of state power or undermining national unity,” and the list must be updated weekly.
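The article does not describe any company’s actual pipeline, but a minimal sketch of such keyword-based filtering, assuming a weekly-refreshed blocklist and simple literal matching, might look like this:

```python
# Minimal sketch of keyword-based training-data filtering.
# The blocklist contents, matching strategy, and function names are
# illustrative assumptions, not any company's actual implementation.
import re

def build_pattern(keywords: list[str]) -> re.Pattern:
    """Compile one alternation pattern; escape keywords so they match literally."""
    return re.compile("|".join(re.escape(k) for k in keywords))

def filter_corpus(docs: list[str], pattern: re.Pattern) -> list[str]:
    """Keep only documents that contain no blocked keyword."""
    return [doc for doc in docs if not pattern.search(doc)]

if __name__ == "__main__":
    keywords = ["blocked term"]  # in practice, loaded from the weekly-updated list
    pattern = build_pattern(keywords)
    corpus = ["an ordinary paragraph", "a paragraph with a blocked term"]
    print(filter_corpus(corpus, pattern))  # -> ['an ordinary paragraph']
```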
Consequently, Chinese AI chatbot users get no answers to questions about the Tiananmen Square protests or Winnie the Pooh memes about President Xi Jinping. Baidu’s Ernie chatbot responds, “Please try another question,” while Alibaba’s Tongyi Qianwen replies, “I haven’t learned how to answer this question yet. I will keep learning to serve you better.”
Meanwhile, the CAC launched an AI chatbot in May with “Xi Jinping Thought on Socialism with Chinese Characteristics for a New Era” as its core content. The model was trained on more than 12 books known to have been written by President Xi Jinping himself.
But Chinese officials are also said to want to avoid creating AI that dodges all political topics. According to one employee who took part in the testing, limits have been introduced on the number of questions an LLM may refuse: the standards released in February state that an LLM cannot reject more than 5% of the questions it is asked.
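The standards themselves are not quoted beyond the 5% figure; purely as an illustration, a compliance check on the refusal cap could be sketched as follows, where the refusal-detection heuristic and the sample answers are assumptions:

```python
# Sketch of checking an LLM's refusal rate against the 5% cap.
# How a "refusal" is detected is a hypothetical heuristic; real
# evaluations would be more robust than substring matching.
REFUSAL_CAP = 0.05

def is_refusal(answer: str) -> bool:
    markers = ("I can't answer", "Please try another question")
    return any(m in answer for m in markers)

def refusal_rate(answers: list[str]) -> float:
    return sum(is_refusal(a) for a in answers) / len(answers)

answers = ["Beijing is the capital of China.", "Please try another question."]
rate = refusal_rate(answers)
print(f"refusal rate {rate:.1%} -> {'PASS' if rate <= REFUSAL_CAP else 'FAIL'}")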
“To avoid potential problems, some LLMs have banned answers on topics related to President Xi Jinping altogether,” said a developer at an internet company in Shanghai. One example is Kimi, a chatbot from the startup Moonshot, which refuses to answer most questions about the president.
The Financial Times reported that when it asked a chatbot created by the popular startup 01.AI a question about the Chinese leader, the initial response was that “Xi Jinping’s policies have further restricted freedom of the press and human rights, and suppressed civil society.”
But the answer soon disappeared and was replaced with, “Sorry, I can’t provide the information you are looking for.”
“It’s very difficult to control the text an LLM generates, so we build an additional layer on top,” said developer Li Huan. That is, when the system receives a query on a sensitive topic, it switches to a separate, safe model.
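The article does not detail how this layer works; a minimal sketch of such a routing layer, with a placeholder sensitivity check and placeholder model functions, might look like this:

```python
# Sketch of the "extra layer" described above: route sensitive queries
# to a separate, tightly constrained model. The sensitivity check and
# both model functions are placeholders, not any company's actual code.
SENSITIVE_TERMS = {"tiananmen", "xi jinping"}  # illustrative only

def is_sensitive(query: str) -> bool:
    q = query.lower()
    return any(term in q for term in SENSITIVE_TERMS)

def base_model(query: str) -> str:
    return f"(unconstrained answer to: {query})"

def safe_model(query: str) -> str:
    return "Sorry, I can't provide the information you are looking for."

def answer(query: str) -> str:
    # The routing layer, not the base model, decides which model responds.
    return safe_model(query) if is_sensitive(query) else base_model(query)

print(answer("What is the capital of France?"))
print(answer("Tell me about Tiananmen."))
```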
ByteDance is said to be the most proficient in this censorship technology. In a study conducted by a research lab at Fudan University, ByteDance’s Doubao model ranked first among LLMs with a safety compliance rate of 66.4%. In the same test, OpenAI’s GPT-4o scored only 7.1%.
Fang Binxing, known as the “father of the Chinese firewall,” recently told a technology conference in Beijing that he is developing an LLM safety protocol system that he hopes will be adopted by Chinese AI companies.
“Large-scale predictive models targeting the public require more than safety reports,” he said. “China needs its own technological path.”
Reporter Im Dae-jun ydj@aitimes.com