
Why detecting AI-generated text is so difficult (and what to do about it)

The tool is OpenAI’s response to the heat it has gotten from educators, journalists, and others for launching ChatGPT without any way to detect text it has generated. However, it is still very much a work in progress, and it is woefully unreliable: OpenAI says its AI text detector correctly identifies just 26% of AI-written text as “likely AI-written.”

While OpenAI clearly has a lot more work to do to refine its tool, there’s a limit to just how good it can make it. We’re extremely unlikely to ever get a tool that can spot AI-generated text with 100% certainty. “It’s really hard to detect AI-generated text, because the whole point of AI language models is to generate fluent and human-seeming text, and the model is mimicking text created by humans,” says Muhammad Abdul-Mageed, a professor who oversees research in natural-language processing and machine learning at the University of British Columbia.

We’re in an arms race to build detection methods that can match the latest, most powerful models, Abdul-Mageed adds. New AI language models are more powerful and better at generating ever more fluent language, which quickly makes the existing detection tool kit outdated.

OpenAI built its detector by creating a whole new AI language model akin to ChatGPT that is specifically trained to detect outputs from models like itself. Although details are sparse, the company apparently trained the model with examples of AI-generated text and examples of human-generated text, and then asked it to spot the AI-generated text. We asked for more information, but OpenAI didn’t respond.
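Since OpenAI hasn’t published its training setup, the following is only a minimal sketch of the general recipe described above: fit a binary classifier on a labeled corpus of human-written and AI-generated passages. This toy version uses scikit-learn with TF-IDF features and logistic regression as stand-ins for OpenAI’s fine-tuned language model; the example texts, labels, and probability readout are all hypothetical.

```python
# Toy sketch of the general recipe: train a binary classifier on
# labeled examples of human-written (0) and AI-generated (1) text.
# OpenAI's real detector is a fine-tuned language model with an
# undisclosed training corpus; TF-IDF + logistic regression here
# are illustrative stand-ins only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled corpus; in practice this would be large.
texts = [
    "The sunset painted the sky in streaks of orange and violet.",  # human
    "As an AI language model, I can provide a summary below.",      # AI
]
labels = [0, 1]

detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(),
)
detector.fit(texts, labels)

# The classifier outputs a probability that a passage is AI-written,
# which a product could bucket into labels like "likely AI-written".
prob_ai = detector.predict_proba(["Some passage to check."])[0][1]
print(f"P(AI-written) = {prob_ai:.2f}")
```

The hard part is not the pipeline but the data and the moving target: as the quote above notes, each new generation of language models produces text that looks less like the examples any detector was trained on.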

Last month, I wrote about another method for detecting text generated by an AI: watermarks. These act as a sort of hidden signal in AI-produced text that allows computer programs to detect it as such.

Researchers at the University of Maryland have developed a neat way of applying watermarks to text generated by AI language models, and they have made it freely available. These watermarks would let us tell with almost complete certainty when AI-generated text has been used.
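In broad strokes, the Maryland approach works at generation time: at each step, the previous token seeds a pseudorandom split of the vocabulary into a “green” and a “red” list, and the model’s sampling is nudged toward green tokens. A detector that knows the seeding rule can then count green tokens: unwatermarked text hits the green list at roughly chance rate, while watermarked text hits it far more often. The sketch below is a simplified detector along those lines; the seeding function, green-list fraction, and token IDs are illustrative stand-ins, not the researchers’ exact design.

```python
# Simplified sketch of a green-list watermark detector.
# During generation, each previous token seeds a pseudorandom
# partition of the vocabulary; sampling favors "green" tokens.
# Detection re-derives the green lists and measures how far the
# green-token count exceeds what chance would predict.
import math
import random

GAMMA = 0.5  # illustrative fraction of the vocabulary on the green list


def green_list(prev_token: int, vocab_size: int) -> set[int]:
    # Stand-in for the scheme's keyed hash of the previous token.
    rng = random.Random(prev_token)
    return set(rng.sample(range(vocab_size), int(GAMMA * vocab_size)))


def watermark_z_score(token_ids: list[int], vocab_size: int) -> float:
    """z-score for 'more green tokens than chance'; large values
    suggest the text carries the watermark."""
    n = len(token_ids) - 1
    if n <= 0:
        raise ValueError("need at least two tokens")
    hits = sum(
        1
        for prev, tok in zip(token_ids, token_ids[1:])
        if tok in green_list(prev, vocab_size)
    )
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))


# Usage with made-up token IDs; real detection would run on the
# tokenized text and the generator's actual vocabulary size.
print(watermark_z_score([5, 17, 42, 8, 91, 23], vocab_size=50_000))
```

Because detection is a simple statistical test rather than a trained model, it can flag watermarked text with very high confidence given enough tokens, which is why the researchers can promise near certainty where classifiers like OpenAI’s cannot.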

The difficulty is that this method requires AI companies to embed watermarking in their chatbots right from the start. OpenAI is developing such systems but has yet to roll them out in any of its products. Why the delay? One reason may be that it’s not always desirable to have AI-generated text watermarked.

One of the most promising ways ChatGPT could be integrated into products is as a tool to help people write emails, or as an enhanced spell-checker in a word processor. That’s not exactly cheating. But watermarking all AI-generated text would automatically flag these outputs and could lead to wrongful accusations.
