Further complicating matters, watermarking is often used as a "catch-all" term for the overall act of providing content disclosures, even though there are many different methods. A closer read of the White House commitments describes another method for disclosure known as provenance, which relies on cryptographic signatures rather than invisible signals. However, this too is often described as watermarking in the popular press. If you find this mishmash of terms confusing, rest assured you're not the only one. But clarity matters: the AI sector cannot implement consistent and robust transparency measures if there is not even agreement on how we refer to the different techniques.
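To make the distinction concrete, here is a minimal sketch of the provenance approach, in which signed metadata travels alongside the content rather than being hidden inside it. It assumes the widely used Python `cryptography` package; the manifest fields, key handling, and names are purely illustrative, not any particular standard's or company's implementation.

```python
import json
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# The capture device or AI tool holds a signing key (illustrative setup).
signing_key = Ed25519PrivateKey.generate()
verify_key = signing_key.public_key()

content = b"...image or video bytes..."

# A provenance manifest: metadata about the content plus a hash of its bytes.
manifest = {
    "content_sha256": hashlib.sha256(content).hexdigest(),
    "generator": "example-ai-model",        # hypothetical field
    "created": "2023-08-01T12:00:00Z",      # hypothetical field
}
manifest_bytes = json.dumps(manifest, sort_keys=True).encode()
signature = signing_key.sign(manifest_bytes)

# Anyone holding the public key can check that the manifest is authentic and
# that the content still matches its recorded hash. Nothing is embedded in the
# pixels or audio themselves -- that is the key difference from watermarking.
verify_key.verify(signature, manifest_bytes)   # raises InvalidSignature if tampered
assert hashlib.sha256(content).hexdigest() == manifest["content_sha256"]
```

Real provenance standards such as C2PA define far richer manifests and key infrastructure, but the basic shape is the same: the disclosure is a verifiable attachment, not an invisible signal woven into the content itself.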
I've come up with six initial questions that might help us evaluate the usefulness of watermarks and other disclosure methods for AI. These should help ensure that different parties are discussing exactly the same thing, and that we can evaluate each method in a thorough, consistent manner.
Can the watermark itself be tampered with?
Ironically, the technical signals touted as helpful for gauging where content comes from and how it has been manipulated can sometimes be manipulated themselves. While it's difficult, both invisible and visible watermarks can be removed or altered, rendering them useless for telling us what is and isn't synthetic. And notably, the ease with which they can be manipulated varies according to the type of content you're dealing with.
Is the watermark’s durability consistent for various content types?
While invisible watermarking is often promoted as a broad solution for dealing with generative AI, such embedded signals are much more easily manipulated in text than in audiovisual content. That likely explains why the White House's summary document suggests that watermarking will be applied to all types of AI, while the full text makes clear that companies only committed to disclosures for audiovisual material. AI policymaking must therefore be specific about how disclosure techniques like invisible watermarking vary in their durability and broader technical robustness across different content types. One disclosure solution may be great for images but useless for text.
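To see why text is the hard case, consider a toy version of the statistical "green-list" watermarks studied in recent research (for example, Kirchenbauer et al.): a keyed hash nudges the generator toward a preferred half of the vocabulary, and a detector simply measures how often tokens land there. Everything below is a simplified, hypothetical sketch, not any vendor's scheme, but it shows how light paraphrasing washes the statistical signal out.

```python
import hashlib
import random

KEY = "demo-detection-key"                     # shared key (illustrative)
VOCAB = [f"tok{i}" for i in range(1000)]       # toy vocabulary

def is_green(prev_tok: str, tok: str) -> bool:
    """Keyed hash of (previous token, candidate): 'green' if the first byte is even."""
    digest = hashlib.sha256(f"{KEY}|{prev_tok}|{tok}".encode()).digest()
    return digest[0] % 2 == 0

def generate(n_tokens: int, bias: float = 0.9, seed: int = 0) -> list[str]:
    """Toy 'model': at each step, prefer a green continuation with probability `bias`."""
    rng = random.Random(seed)
    out = [rng.choice(VOCAB)]
    for _ in range(n_tokens - 1):
        candidates = rng.sample(VOCAB, 20)
        greens = [t for t in candidates if is_green(out[-1], t)]
        if greens and rng.random() < bias:
            out.append(rng.choice(greens))
        else:
            out.append(rng.choice(candidates))
    return out

def green_fraction(tokens: list[str]) -> float:
    """Detector statistic: share of adjacent pairs whose second token is green."""
    hits = sum(is_green(a, b) for a, b in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)

def light_edit(tokens: list[str], edit_rate: float = 0.3, seed: int = 1) -> list[str]:
    """Simulate paraphrasing by swapping a fraction of tokens for arbitrary ones."""
    rng = random.Random(seed)
    return [rng.choice(VOCAB) if rng.random() < edit_rate else t for t in tokens]

text = generate(400)
print(f"watermarked text:  green fraction ~ {green_fraction(text):.2f}")              # well above 0.5
print(f"after light edits: green fraction ~ {green_fraction(light_edit(text)):.2f}")  # drifts back toward 0.5
```

A signal spread across millions of pixels or audio samples generally has more redundancy to survive casual edits, which is consistent with the companies' decision to commit only to disclosures for audiovisual material.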
Who can detect these invisible signals?
Even if the AI sector agrees to implement invisible watermarks, deeper questions will inevitably emerge around who has the capacity to detect these signals and ultimately make authoritative claims based on them. Who gets to decide whether content is AI-generated, and perhaps, as an extension, whether it is misleading? If everyone can detect watermarks, that might render them susceptible to misuse by bad actors. On the other hand, controlled access to detection of invisible watermarks, especially if it is dictated by large AI companies, might degrade openness and entrench technical gatekeeping. Implementing these sorts of disclosure methods without working out how they are governed could leave them distrusted and ineffective. And if the techniques are not widely adopted, bad actors might turn to open-source technologies that lack the invisible watermarks to create harmful and misleading content.
Do watermarks preserve privacy?
As key work from Witness, a human rights and technology group, makes clear, any tracing system that travels with a piece of content over time might also introduce privacy risks for those creating the content. The AI sector must ensure that watermarks and other disclosure techniques are designed in a manner that does not include identifying information that might put creators at risk. For example, a human rights defender might capture abuses through photographs that are watermarked with identifying information, making the person an easy target for an authoritarian government. Even the knowledge that watermarks could reveal an activist's identity might have chilling effects on expression and speech. Policymakers must provide clearer guidance on how disclosures can be designed so as to preserve the privacy of those creating content, while also including enough detail to be useful and practical.
Do visible disclosures help audiences understand the role of generative AI?
Even if invisible watermarks are technically durable and privacy-preserving, they might not help audiences interpret content. Though direct disclosures like visible watermarks have an intuitive appeal for providing greater transparency, such disclosures do not necessarily achieve their intended effects, and they can often be perceived as paternalistic, biased, and punitive, even when they say nothing about the truthfulness of a piece of content. Furthermore, audiences might misinterpret direct disclosures. A participant in my 2021 research misinterpreted Twitter's "manipulated media" label as suggesting that the institution of "the media" was manipulating him, not that the content of the specific video had been edited to mislead. While research is emerging on how different user experience designs affect audience interpretation of content disclosures, much of it is concentrated within large technology companies and focused on distinct contexts, like elections. Studying the efficacy of direct disclosures and user experiences, and not merely relying on the visceral appeal of labeling AI-generated content, is vital to effective policymaking for improving transparency.
Could visibly watermarking AI-generated content diminish trust in “real” content?
Perhaps the thorniest societal question to evaluate is how coordinated, direct disclosures will affect broader attitudes toward information and potentially diminish trust in "real" content. If AI organizations and social media platforms are simply labeling the fact that content is AI-generated or modified, as an understandable, albeit limited, way to avoid making judgments about which claims are misleading or harmful, how does this affect the way we perceive what we see online?