There is a notable and robust strand in computer vision literature dedicated to protecting copyrighted images from being trained into AI models, or from being used in direct image-to-image AI processes. Systems of this type are generally aimed at Latent Diffusion Models (LDMs) such as Stable Diffusion and Flux, which use noise-based procedures to encode and decode imagery.
By inserting adversarial noise into otherwise normal-looking images, it is possible to cause image detectors to misidentify image content, and to hinder image-generating systems from exploiting copyrighted data:
Source: https://arxiv.org/pdf/2302.06588
Since the artists' backlash against Stable Diffusion's liberal use of web-scraped imagery (including copyrighted imagery) in 2023, the research scene has produced multiple variations on the same theme: the idea that images can be invisibly 'poisoned' against being trained into AI systems or sucked into generative AI pipelines, without adversely affecting the quality of the image for the average viewer.
In all cases, there is a direct correlation between the intensity of the imposed perturbation, the extent to which the image is subsequently protected, and the extent to which the image doesn't look quite as good as it should:

Source: https://arxiv.org/pdf/2002.08327
Of particular interest to artists seeking to protect their styles against unauthorized appropriation is the capacity of such systems not only to obfuscate identity and other information, but to 'persuade' an AI training process that it is seeing something other than what it is actually seeing, so that connections fail to form between semantic and visual domains for 'protected' training data (for instance, between an artist's name in a prompt and that artist's visual style).

Source: https://arxiv.org/pdf/2506.04394
Own Goal
Now, recent research from the US has found not only that perturbations can fail to protect an image, but that adding perturbation can actually increase the image's exploitability in all of the AI processes the perturbation is supposed to immunize against.
The paper reports that, rather than blocking edits, the perturbations can leave an image even more exposed, producing edited outputs whose alignment with the prompt is higher than that achieved with the unprotected original.
In tests, the protected images were exposed to two familiar AI editing scenarios: straightforward image-to-image generation and style transfer. These processes reflect the common ways in which AI models might exploit protected content, either by directly altering an image, or by borrowing its stylistic traits for use elsewhere.
The protected images, drawn from standard sources of photography and artwork, were run through these pipelines to see whether the added perturbations could block or degrade the edits.
Instead, the presence of protection often appeared to sharpen the model's alignment with the prompts, producing clean, accurate outputs where some failure had been expected.
The authors advise, in effect, that this very popular approach to protection may be providing a false sense of security, and that any perturbation-based immunization approach should be tested thoroughly against the authors' own methods.
Method
The authors ran experiments using three protection methods that apply carefully-designed adversarial perturbations: PhotoGuard; Mist; and Glaze.

Examples at perturbation budgets p = 0.05 and p = 0.1. Source: https://arxiv.org/pdf/2302.04222
PhotoGuard was applied to natural scene images, while Mist and Glaze were used on artworks (i.e., ‘artistically-styled’ domains).
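For orientation, the sketch below shows the general idea behind encoder-targeted perturbations of the kind PhotoGuard popularized: small pixel changes, bounded by a budget, that push an image's latent representation toward a decoy target. This is a minimal illustration assuming the Stable Diffusion v1.5 VAE via the diffusers library, not the authors' code; the epsilon, step size, and iteration count are illustrative choices.

```python
# Minimal sketch of a PhotoGuard-style encoder attack (illustrative only).
# Assumes the Stable Diffusion v1.5 VAE; eps/step/iters are arbitrary choices here.
import torch
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"
vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
).to(device)
vae.requires_grad_(False)

def protect(image, target_image, eps=0.06, step=0.01, iters=100):
    """PGD in pixel space: push the image's latent toward a decoy target latent."""
    image = image.to(device)                     # (1, 3, H, W), values in [-1, 1]
    with torch.no_grad():
        target_latent = vae.encode(target_image.to(device)).latent_dist.mean
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(iters):
        latent = vae.encode((image + delta).clamp(-1, 1)).latent_dist.mean
        loss = torch.nn.functional.mse_loss(latent, target_latent)
        loss.backward()
        with torch.no_grad():
            delta -= step * delta.grad.sign()    # descend on the latent gap
            delta.clamp_(-eps, eps)              # keep the perturbation imperceptible
            delta.grad.zero_()
    return (image + delta).clamp(-1, 1).detach()
```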
Tests covered both natural and artistic images to reflect possible real-world uses. The effectiveness of each method was assessed by checking whether an AI model could still produce realistic and prompt-relevant edits when working on protected images; if the resulting images appeared convincing and matched the prompts, the protection was judged to have failed.
Stable Diffusion v1.5 was used as the pre-trained image generator for the researchers' editing tasks. Five seeds were chosen to ensure reproducibility: 9222, 999, 123, 66, and 42. All other generation settings, such as guidance scale, strength, and total steps, followed the default values used in the PhotoGuard experiments.
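A sketch of that editing setup, using the standard diffusers image-to-image pipeline, is given below. The input filename and prompt are hypothetical, and the strength and guidance values are placeholders, since the article only states that PhotoGuard's defaults were followed.

```python
# Sketch of the image-to-image editing setup described above (values are illustrative;
# the paper follows PhotoGuard's defaults, which are not reproduced here).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

seeds = [9222, 999, 123, 66, 42]                          # the five seeds reported above
init_image = Image.open("protected.png").convert("RGB")   # hypothetical input file
prompt = "a dog running across a snowy field"             # hypothetical editing prompt

outputs = []
for seed in seeds:
    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipe(
        prompt=prompt,
        image=init_image,
        strength=0.5,          # placeholder; actual runs use PhotoGuard's defaults
        guidance_scale=7.5,    # placeholder
        generator=generator,
    ).images[0]
    outputs.append(result)
```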
PhotoGuard was tested on natural scene images using the Flickr8k dataset, which comprises over 8,000 images paired with up to five captions each.
Opposing Thoughts
Two sets of modified captions were created from the first caption of each image with the help of Claude Sonnet 3.5. One set contained prompts that were close to the original captions; the other set contained prompts that were far from them.
Close prompts were constructed by replacing nouns and adjectives in the original caption with semantically similar terms, while far prompts were generated by instructing the model to create captions that were contextually very different from the original.
All generated captions were manually checked for quality and semantic relevance. Google's Universal Sentence Encoder was used to calculate semantic similarity scores between the original and modified captions:

Source: https://sigport.org/sites/default/files/docs/IncompleteProtection_SM_0.pdf
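The similarity check can be reproduced in spirit with the publicly available Universal Sentence Encoder on TensorFlow Hub; the captions below are invented examples, not taken from the paper.

```python
# Sketch: semantic similarity between an original caption and a modified prompt
# using Google's Universal Sentence Encoder (TensorFlow Hub).
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def similarity(a: str, b: str) -> float:
    va, vb = embed([a, b]).numpy()
    return float(np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb)))

original = "A dog runs through the snow"         # hypothetical Flickr8k-style caption
close_prompt = "A puppy dashes across the snow"  # noun/adjective swaps
far_prompt = "A sailboat drifts at sunset"       # contextually very different

print(similarity(original, close_prompt))  # expected to be high
print(similarity(original, far_prompt))    # expected to be much lower
```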
Each image, together with its protected version, was edited using both the close and far prompts. The Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) was used to assess image quality:

The generated images scored 17.88 on BRISQUE, with 17.82 for close prompts and 17.94 for far prompts, while the original images scored 22.27. This shows that the edited images remained close in quality to the originals.
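The article does not say which BRISQUE implementation was used; one readily available option is the piq library, sketched here with hypothetical filenames (lower scores indicate better perceived quality).

```python
# Sketch: no-reference quality scoring with BRISQUE (lower is better).
# The paper does not specify an implementation; piq is one available option.
import torch
import piq
from PIL import Image
from torchvision.transforms.functional import to_tensor

def brisque_score(path: str) -> float:
    img = to_tensor(Image.open(path).convert("RGB")).unsqueeze(0)  # (1, 3, H, W) in [0, 1]
    return float(piq.brisque(img, data_range=1.0))

print(brisque_score("edited.png"))    # the paper reports ~17.9 for edited images
print(brisque_score("original.png"))  # and ~22.3 for the originals
```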
Metrics
To gauge how well the protections interfered with AI editing, the researchers measured how closely the final images matched the instructions they were given, using scoring systems that compare image content to the text prompt.
To this end, the CLIP-S metric uses a model that can understand both images and text to check how similar they are, while PAC-S++ adds extra AI-generated samples to align its comparison more closely with human judgment.
These Image-Text Alignment (ITA) scores denote how accurately the AI followed the instructions when modifying a protected image: if a protected image still led to a highly aligned output, the protection was deemed to have failed to block the edit.

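CLIP-S is essentially a scaled cosine similarity between CLIP image and text embeddings. The sketch below uses the Hugging Face CLIP model as a stand-in (the 2.5 rescaling factor follows the original CLIPScore formulation); it is not the authors' exact implementation and omits PAC-S++.

```python
# Sketch: a CLIP-based image-text alignment score in the spirit of CLIP-S.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_s(image_path: str, prompt: str) -> float:
    inputs = processor(text=[prompt], images=Image.open(image_path),
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        img_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
        txt_emb = model.get_text_features(input_ids=inputs["input_ids"],
                                          attention_mask=inputs["attention_mask"])
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
    cos = (img_emb * txt_emb).sum().item()
    return 2.5 * max(cos, 0.0)   # CLIPScore-style rescaling

score = clip_s("edited.png", "a dog running across a snowy field")  # hypothetical inputs
```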
The researchers compared how well the AI followed prompts when editing protected images versus unprotected ones, first taking the raw difference between the two scores, and then scaling it into a Percentage Change value, making it easier to compare results across many tests.
This process revealed whether the protections made it harder or easier for the AI to match the prompts. The tests were repeated five times using different random seeds, covering both the close and far variations of the original captions.
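As a rough illustration of how such a figure might be computed (the paper's exact formulation may differ), the usual percentage-change calculation looks like this, with hypothetical ITA scores:

```python
# Sketch: summarizing the protected-vs-unprotected comparison as a percentage change.
# The exact formulation in the paper may differ.
def percentage_change(ita_unprotected: float, ita_protected: float) -> float:
    return 100.0 * (ita_protected - ita_unprotected) / ita_unprotected

# A positive value means the edit of the *protected* image aligned better with
# the prompt, i.e. the protection backfired.
print(percentage_change(0.70, 0.76))  # -> ~8.6 (hypothetical scores)
```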
Art Attack
For the tests on natural photographs, the Flickr1024 dataset was used, containing over one thousand high-quality images. Each image was edited with prompts built from a fixed template specifying a target style, where the style was one of seven famous art styles: Cubism; Post-Impressionism; Impressionism; Surrealism; Baroque; Fauvism; and Renaissance.
The method involved applying PhotoGuard to the original images, generating protected versions, and then running both protected and unprotected images through the same set of style transfer edits:

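Building the prompt set for this stage is straightforward; the article does not give the exact template wording, so the phrasing in the sketch below is an assumption.

```python
# Sketch: constructing the style-transfer prompt set. The template wording is assumed,
# not quoted from the paper.
STYLES = ["Cubism", "Post-Impressionism", "Impressionism", "Surrealism",
          "Baroque", "Fauvism", "Renaissance"]

def style_prompts(template: str = "change the style to {style}") -> list[str]:
    return [template.format(style=s) for s in STYLES]

prompts = style_prompts()
# Each Flickr1024 image (protected and unprotected) is then run through the same
# image-to-image pipeline shown earlier, once per style prompt.
```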
To test the protection methods on artwork, style transfer was performed on images from the WikiArt dataset, which curates a wide range of artistic styles. The editing prompts followed the same format as before, instructing the AI to change the style to a randomly chosen, unrelated style drawn from the WikiArt labels.
Both the Glaze and Mist protection methods were applied to the images before the edits, allowing the researchers to observe how well each defense could block or distort the style transfer results:

The researchers also ran quantitative comparisons:

Commenting on these results, the authors trace the unexpected outcome to how diffusion models work: LDMs edit images by first converting them into a compressed representation called a latent; noise is then added to this latent over many steps, until the data becomes almost random.
The model reverses this process during generation, removing the noise step by step. At each stage of this reversal, the text prompt helps guide how the noise should be cleaned up, gradually shaping the image to match the prompt:

Protection methods add small amounts of additional noise to the original image before it enters this process. While these perturbations are minor at the start, they accumulate as the model applies its own layers of noise.
This buildup leaves more parts of the image 'uncertain' when the model begins removing noise. With greater uncertainty, the model leans more heavily on the text prompt to fill in the missing details, giving the prompt greater influence over the final image.
In effect, the protections make it easier for the AI to reshape the image to match the prompt, rather than harder.
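A toy numerical illustration of this intuition (not taken from the paper) follows from the standard DDPM forward process, in which the clean signal is progressively attenuated while noise takes over; any added perturbation simply enlarges the portion of the noised latent that does not come from the original content, so the denoiser has less signal to work from. The schedule and variance values below are assumptions.

```python
# Toy illustration (not from the paper): in the DDPM forward process
#   x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps,
# a protected image contributes x0 + delta instead of x0, so the share of x_t
# carrying the original signal shrinks at every starting timestep, leaving the
# denoiser to reconstruct more from the prompt.
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # standard linear schedule (assumption)
a_bar = np.cumprod(1.0 - betas)

var_x0, var_delta = 1.0, 0.01           # assumed variances; delta is small

for t in [100, 300, 500, 700]:
    clean_snr = a_bar[t] * var_x0 / (1.0 - a_bar[t])
    pert_snr = a_bar[t] * var_x0 / (a_bar[t] * var_delta + (1.0 - a_bar[t]))
    print(f"t={t}: signal-to-noise clean={clean_snr:.3f}  perturbed={pert_snr:.3f}")
```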
Finally, the authors conducted a test in which the carefully-crafted perturbations from the protection methods were replaced with pure Gaussian noise.
The results followed the same pattern observed earlier: across all tests, the Percentage Change values remained positive. Even this random, unstructured noise led to stronger alignment between the generated images and the prompts.

This supported the underlying explanation that any added noise, regardless of its design, creates greater uncertainty for the model during generation, allowing the text prompt to exert even more control over the final image.
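The control experiment itself is simple to approximate: swap the crafted perturbation for plain Gaussian noise of a small magnitude and re-run the same edits. The sigma value and filename below are assumptions, since the paper's exact noise budget is not given in the article.

```python
# Sketch of the control experiment: replace the crafted perturbation with plain
# Gaussian noise of a small, comparable magnitude (sigma here is an assumption).
import numpy as np
from PIL import Image

def add_gaussian_noise(path: str, sigma: float = 0.03) -> Image.Image:
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    noisy = np.clip(img + np.random.normal(0.0, sigma, img.shape), 0.0, 1.0)
    return Image.fromarray((noisy * 255).astype(np.uint8))

# The noised image is then edited exactly as before; the article reports that the
# Percentage Change stays positive even for this unstructured noise.
noised = add_gaussian_noise("original.png")
```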
Conclusion
The research scene has been pushing adversarial perturbation at the LDM copyright issue for nearly as long as LDMs have been around, but no resilient solution has emerged from the extraordinary number of papers published along these lines.
Either the imposed disturbances excessively lower the quality of the image, or the patterns prove not to be resilient to manipulation and transformative processes.
Nonetheless, it's a hard dream to abandon, since the alternative would appear to be third-party monitoring and provenance frameworks such as the Adobe-led C2PA scheme, which seeks to maintain a chain of custody for images from the camera sensor onward, but which has no innate connection to the content depicted.
In any case, if adversarial perturbation is actually making the problem worse, as the new paper indicates may well be true in many cases, one wonders whether the search for copyright protection via such means falls under the heading of 'alchemy'.