Forecasting Potential Misuses of Language Models for Disinformation Campaigns—and How to Reduce Risk

OpenAI researchers collaborated with Georgetown University’s Center for Security and Emerging Technology and the Stanford Internet Observatory to analyze how large language models might be misused for disinformation purposes. The collaboration included an October 2021 workshop bringing together 30 disinformation researchers, machine learning experts, and policy analysts, and culminated in a co-authored report building on more than a year of research. This report outlines the threats that language models pose to the information environment if used to augment disinformation campaigns and introduces a framework for analyzing potential mitigations. Read the full report here.

As generative language models improve, they open up new possibilities in fields as diverse as healthcare, law, education, and science. But, as with any new technology, it is worth considering how they might be misused. Against the backdrop of recurring online influence operations—covert or deceptive efforts to influence the opinions of a target audience—the paper asks:

How might language models change influence operations, and what steps can be taken to mitigate this threat?

Our work brought together different backgrounds and expertise—researchers with grounding in the tactics, techniques, and procedures of online disinformation campaigns, as well as machine learning experts in the generative artificial intelligence field—to base our analysis on trends in both domains.

We believe that it is critical to analyze the threat of AI-enabled influence operations and outline steps that can be taken before language models are used for influence operations at scale. We hope our research will inform policymakers who are new to the AI or disinformation fields, and spur in-depth research into potential mitigation strategies for AI developers, policymakers, and disinformation researchers.

How Could AI Affect Influence Operations?

When researchers analyze influence operations, they consider the actors, behaviors, and content. The widespread availability of technology powered by language models has the potential to affect all three facets:

  1. Actors: Language models could drive down the cost of running influence operations, placing them within reach of new actors and actor types. Likewise, propagandists-for-hire that automate the production of text may gain new competitive advantages.

  2. Behavior: Influence operations with language models will become easier to scale, and tactics that are currently expensive (e.g., generating personalized content) may become cheaper. Language models may also enable new tactics to emerge—like real-time content generation in chatbots.

  3. Content: Text creation tools powered by language models may generate more impactful or persuasive messaging compared to propagandists, especially those who lack the requisite linguistic or cultural knowledge of their target. They may also make influence operations less discoverable, since they repeatedly create new content without needing to resort to copy-pasting and other noticeable time-saving behaviors.

Our bottom-line judgment is that language models will be useful for propagandists and will likely transform online influence operations. Even if the most advanced models are kept private or controlled through application programming interface (API) access, propagandists will likely gravitate toward open-source alternatives, and nation states may invest in the technology themselves.

Critical Unknowns

Many factors impact whether, and the extent to which, language models will be used in influence operations. Our report dives into many of these considerations. For example:

  • What new capabilities for influence will emerge as a side effect of well-intentioned research or commercial investment? Which actors will make significant investments in language models?
  • When will easy-to-use tools to generate text become publicly available? Will it be easier to engineer specific language models for influence operations, rather than apply generic ones?
  • Will norms develop that disincentivize actors who wage AI-enabled influence operations? How will actor intentions develop?

While we expect to see diffusion of the technology as well as improvements in the usability, reliability, and efficiency of language models, many questions about the future remain unanswered. Because these are critical possibilities that can change how language models may impact influence operations, additional research to reduce uncertainty is highly valuable.

A Framework for Mitigations

To chart a path forward, the report lays out key stages in the language model-to-influence operation pipeline. Each of these stages is a point for potential mitigations. To successfully wage an influence operation leveraging a language model, propagandists would require that: (1) a model exists, (2) they can reliably access it, (3) they can disseminate content from the model, and (4) an end user is affected. Many possible mitigation strategies fall along these four stages, as shown below.

Illustrative mitigations at each stage of the pipeline:

  1. Model Construction: AI developers build models that are more fact-sensitive. Developers spread radioactive data to make generative models detectable. Governments impose restrictions on data collection. Governments impose access controls on AI hardware.

  2. Model Access: AI providers impose stricter usage restrictions on language models. AI providers develop new norms around model release. AI providers close security vulnerabilities.

  3. Content Dissemination: Platforms and AI providers coordinate to identify AI content. Platforms require “proof of personhood” to post. Entities that rely on public input take steps to reduce their exposure to misleading AI content. Digital provenance standards are widely adopted.

  4. Belief Formation: Institutions engage in media literacy campaigns. Developers provide consumer-focused AI tools.

If a Mitigation Exists, Is It Desirable?

Just because a mitigation could reduce the threat of AI-enabled influence operations does not mean that it should be put into place. Some mitigations carry their own downside risks. Others may not be feasible. While we do not explicitly endorse or rate mitigations, the paper provides a set of guiding questions for policymakers and others to consider:

  • Technical Feasibility: Is the proposed mitigation technically feasible? Does it require significant changes to technical infrastructure?
  • Social Feasibility: Is the mitigation feasible from a political, legal, and institutional perspective? Does it require costly coordination, are key actors incentivized to implement it, and is it actionable under existing law, regulation, and industry standards?
  • Downside Risk: What are the potential negative impacts of the mitigation, and the way significant are they?
  • Impact: How effective would a proposed mitigation be at reducing the threat?
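Purely as an illustration, the four guiding questions can be treated as a lightweight scoring rubric for comparing candidate mitigations. This sketch is our own assumption, not part of the report: the class, weights, mitigation names, and scores below are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical rubric: score each candidate mitigation on the four
# guiding questions (1 = poor, 5 = strong). Names and scores here are
# illustrative assumptions, not assessments taken from the report.

@dataclass
class MitigationAssessment:
    name: str
    technical_feasibility: int  # Is the mitigation technically feasible?
    social_feasibility: int     # Political, legal, institutional feasibility
    downside_risk: int          # Higher = fewer negative side effects
    impact: int                 # How much would it reduce the threat?

    def overall(self) -> float:
        """Unweighted mean across the four guiding questions."""
        return (self.technical_feasibility + self.social_feasibility
                + self.downside_risk + self.impact) / 4


assessments = [
    MitigationAssessment("Media literacy campaigns", 5, 4, 5, 2),
    MitigationAssessment("Digital provenance standards", 3, 2, 4, 4),
]

# Rank candidates by overall score as a starting point for discussion.
for a in sorted(assessments, key=lambda a: a.overall(), reverse=True):
    print(f"{a.name}: {a.overall():.2f}")
```

A real assessment would likely weight the questions unequally (e.g., heavily penalizing downside risk), but even an unweighted rubric makes trade-offs between feasibility and impact explicit.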

We hope this framework will spur ideas for other mitigation strategies, and that the guiding questions will help relevant institutions begin to consider whether various mitigations are worth pursuing.

This report is far from the final word on AI and the future of influence operations. Our aim is to define the present environment and to help set an agenda for future research. We encourage anyone interested in collaborating or discussing relevant projects to connect with us. For more, read the full report here.


Josh A. Goldstein (Georgetown University’s Center for Security and Emerging Technology)
Girish Sastry (OpenAI)
Micah Musser (Georgetown University’s Center for Security and Emerging Technology)
Renée DiResta (Stanford Internet Observatory)
Matthew Gentzel (Longview Philanthropy) (work done at OpenAI)
Katerina Sedova (US Department of State) (work done at Center for Security and Emerging Technology prior to government service)


