China’s ChatGPT Rival Needs to Watch Its Words

China’s censorship regime requires Baidu and other internet companies to block access to certain websites and avoid politically sensitive subjects. The words or phrases that should be blocked can be updated rapidly in response to protests or during special events.

But Jeffrey Ding, an assistant professor at Georgetown University who studies China’s tech industry, says that concerns about censorship do not seem to have slowed the development of large language models in China. He notes that Baidu has made the Ernie language model that underpins its new bot available via an API for some time and that other companies have offered similar models.

Baidu has not given details of Ernie Bot’s training data, but it most likely was scraped from the Chinese internet. This will mean the bot’s feedstock has largely already been curated by China’s censorship rules, which, for example, aim to limit criticism of the government.

Censorship might also affect Chinese chatbots in more subtle ways. An academic research project from 2021 that trained algorithms on the Chinese-language version of Wikipedia, which is blocked in China, and Baidu’s Baike, a crowdsourced encyclopedia subject to government censorship, found that using censored training data significantly changed the meaning that AI software assigned to different words.

The algorithm trained on Chinese-language Wikipedia associated the word “democracy” more closely with positive words such as “stability.” The algorithm trained on the censored Baike material represented “democracy” as closer to “chaos,” more in line with the policy of China’s government. But because chatbots like ChatGPT can be extremely flexible and remix material in their training data, Baidu has likely had to introduce additional safeguards.
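For readers curious how such a word-association comparison can be measured, here is a minimal sketch using cosine similarity between word vectors. It is not the study's code or data: the function names are hypothetical, and the three-dimensional vectors are made-up numbers chosen only to illustrate how the same word can sit near different neighbors depending on the training corpus.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_word(embeddings, target, candidates):
    """Pick the candidate whose vector is most similar to the target's."""
    return max(
        candidates,
        key=lambda word: cosine_similarity(embeddings[target], embeddings[word]),
    )


# Illustrative, invented vectors standing in for embeddings learned
# from two different corpora. They are NOT results from the study.
corpus_a = {
    "democracy": [0.9, 0.1, 0.2],
    "stability": [0.8, 0.2, 0.1],
    "chaos": [0.1, 0.9, 0.3],
}
corpus_b = {
    "democracy": [0.2, 0.8, 0.3],
    "stability": [0.8, 0.2, 0.1],
    "chaos": [0.1, 0.9, 0.3],
}

for name, embeddings in (("corpus_a", corpus_a), ("corpus_b", corpus_b)):
    nearest = closest_word(embeddings, "democracy", ["stability", "chaos"])
    print(name, "places 'democracy' closest to", nearest)
```

In practice the vectors would have hundreds of dimensions and be learned from each encyclopedia's text, but the comparison step works the same way.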

Despite its mixed reception, Ernie Bot appears to be a capable competitor to ChatGPT. The bot is currently available only to a limited number of users, some of whom say they are impressed. ChatGPT is not available in China, although it is capable of conversing in Chinese.

Lei Li, a professor at UC Santa Barbara who specializes in AI and previously worked on the technology used to build some of the machine learning behind Ernie Bot, points out that Baidu has been working on the underlying technology for around a decade. Microsoft, by contrast, licensed the core technology for Bing’s new chatbot and some forthcoming text-generation features for Office from OpenAI, in which it has invested billions of dollars in return for exclusive rights to its creations.

Li says he is also impressed with some of what Ernie Bot can do, including its ability to generate stories and business reports. He adds that the hallucination problem is a challenge for all such language models. “This is where researchers still have work to do,” he says.

One WeChat poster compared the Chinese bot’s demoed capabilities with those of ChatGPT and found it better at handling Chinese idioms and more accurate in some instances. For example, ChatGPT incorrectly claimed that the ancestral home of science fiction author Liu Cixin, who wrote The Three-Body Problem, is Hubei, while Ernie Bot correctly answered Henan. ChatGPT is blocked in China, but many people have found ways of accessing it.
