In February of this year, the JPEG AI international standard was published, after several years of research aimed at using machine learning techniques to produce an image codec whose output is smaller, and more easily transmitted and stored, without a loss in perceptual quality.
Source: https://jpeg.org/jpegai/documentation.html
One possible reason why this advent made few headlines is that the core PDFs for this announcement were (ironically) not available through free-access portals such as Arxiv. However, Arxiv had already hosted various studies examining the importance of JPEG AI across several aspects, including the method's unusual compression artifacts and its significance for forensics.

Source: https://arxiv.org/pdf/2411.06810
Because JPEG AI alters images in ways that mimic the artifacts of synthetic image generators, existing forensic tools have difficulty differentiating real from fake imagery:

Source: https://arxiv.org/pdf/2412.03261
One reason is that JPEG AI is trained using a model architecture similar to those used by the generative systems that forensic tools aim to detect:

Source: https://arxiv.org/pdf/2504.03191
Therefore both models may produce similar underlying visual characteristics, from a forensic standpoint.
Quantization
This cross-over occurs due to quantization, which is common to both architectures, and which is used in machine learning both as a method of converting continuous data into discrete data points, and as an optimization technique that can significantly slim down the file size of a trained model (casual image synthesis enthusiasts will be familiar with the wait between an unwieldy official model release and a community-led quantized version that can run on local hardware).
In this context, quantization refers to the process of converting the continuous values in the image's latent representation into fixed, discrete steps. JPEG AI uses this process to store or transmit an image more efficiently by simplifying its internal numerical representation.
Though quantization makes encoding more efficient, it also imposes structural regularities that may resemble the artifacts left by generative models – subtle enough to evade perception, but disruptive to forensic tools.
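As a rough illustration of the general idea (a toy sketch, not the JPEG AI specification), the following Python snippet applies uniform scalar quantization to a made-up latent tensor: continuous values are divided by a step size, rounded to integers, and rescaled, leaving the kind of grid-aligned structure a forensic detector can look for.

```python
import numpy as np

def quantize_latent(latent: np.ndarray, step: float = 0.5) -> np.ndarray:
    """Uniform scalar quantization: snap every latent value to the nearest
    multiple of `step`, turning a continuous representation into discrete steps."""
    return np.round(latent / step) * step

# Toy example: a random 'latent' before and after quantization.
rng = np.random.default_rng(0)
latent = rng.normal(size=(4, 8, 8))           # channels x height x width
latent_q = quantize_latent(latent, step=0.5)

# The quantized tensor now contains only multiples of 0.5 -- a structural
# regularity that is absent from the original continuous values.
print(np.unique(latent_q)[:5])
```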
In response, the authors of a new work propose interpretable, non-neural techniques that detect JPEG AI compression; determine whether an image has been recompressed; and distinguish compressed real images from those generated entirely by AI.
Method
Color Correlations
The paper proposes three ‘forensic cues’ tailored to JPEG AI images: color correlations introduced during JPEG AI’s preprocessing steps; rate-distortion behavior across repeated compressions that reveals recompression events; and quantization traces that help distinguish between images compressed by JPEG AI and those generated by AI models.
Regarding the color correlation-based approach, JPEG AI’s preprocessing pipeline introduces statistical dependencies between the image’s color channels, creating a signature that can serve as a forensic cue.
JPEG AI converts RGB images to the YUV color space and performs 4:2:0 chroma subsampling, which involves downsampling the chrominance channels before compression. This process results in subtle correlations between the high-frequency residuals of the red, green, and blue channels – correlations that are not present in uncompressed images, and which differ in strength from those produced by traditional JPEG compression or synthetic image generators.
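As a loose illustration of this kind of measurement (a sketch under simplified assumptions; the paper's exact residual filter and statistics are not reproduced here), the snippet below extracts a high-frequency residual from each RGB channel with a simple high-pass step and computes the pairwise correlations between those residuals.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def channel_residual(channel: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    """High-frequency residual: the channel minus a Gaussian-blurred copy."""
    return channel - gaussian_filter(channel, sigma=sigma)

def rgb_residual_correlations(img: np.ndarray) -> dict:
    """Pearson correlations between the residuals of the R, G and B channels.
    `img` is an HxWx3 float array in [0, 1]."""
    r, g, b = (channel_residual(img[..., i]) for i in range(3))
    corr = np.corrcoef([r.ravel(), g.ravel(), b.ravel()])   # 3x3 matrix
    return {"RG": corr[0, 1], "RB": corr[0, 2], "GB": corr[1, 2]}

# Usage idea: compare the correlation profile of an uncompressed image with
# that of its JPEG AI-compressed counterpart; compression should push the
# values upward.
```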

Above, we can see a comparison from the paper illustrating how JPEG AI compression alters color correlations in images, using the red channel as an example.
Panel A compares uncompressed images to JPEG AI-compressed ones, showing that compression significantly increases inter-channel correlation; panel B isolates the effect of JPEG AI’s preprocessing – just the color conversion and subsampling – demonstrating that even this step alone raises correlations noticeably; panel C shows that traditional JPEG compression also increases correlations slightly, but not to the same degree; and panel D examines synthetic images, with Midjourney-V5 and Adobe Firefly displaying moderate correlation increases, while others remain closer to uncompressed levels.
Rate-Distortion
The rate-distortion cue identifies JPEG AI recompression by tracking how image quality, measured by Peak Signal-to-Noise Ratio (PSNR), declines in a predictable pattern across multiple compression passes.
The research contends that repeatedly compressing an image with JPEG AI leads to progressively smaller, but still measurable, losses in image quality, as quantified by PSNR, and that this gradual degradation forms the basis of a forensic cue for detecting whether an image has been recompressed.
Unlike traditional JPEG, where earlier methods tracked changes in specific image blocks, JPEG AI requires a different approach, due to its neural compression architecture; therefore the authors propose monitoring how both bitrate and PSNR evolve over successive compressions. Each round of compression alters the image less than the one before it, and this diminishing change (when plotted against bitrate) can reveal whether an image has passed through multiple compression stages:

In the image above, we see rate-distortion curves charted for JPEG AI, a second AI-based codec, and traditional JPEG. JPEG AI and the neural codec show a consistent PSNR decline across all bitrates, while traditional JPEG only shows noticeable degradation at much higher bitrates. This behavior provides a quantifiable signal that can be used to flag recompressed JPEG AI images.
By extracting how bitrate and image quality evolve over multiple compression rounds, the authors constructed a signature that helps flag whether an image has been recompressed, affording a potentially practical forensic cue in the context of JPEG AI.
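A minimal sketch of the underlying measurement, assuming a stand-in codec object with hypothetical compress() and decompress() methods (not the real JPEG AI API): the image is compressed repeatedly, and the bitrate and the PSNR relative to the previous pass are recorded at each round.

```python
import numpy as np

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    """Peak Signal-to-Noise Ratio between two images of identical shape."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def recompression_trace(image: np.ndarray, codec, passes: int = 3) -> list:
    """Repeatedly compress/decompress `image`, recording (bits-per-pixel,
    PSNR vs. the previous pass) per round. `codec` is a hypothetical object
    exposing compress()/decompress(); diminishing PSNR deltas across rounds
    hint that the input had already been compressed."""
    trace, current = [], image
    num_pixels = image.shape[0] * image.shape[1]
    for _ in range(passes):
        bitstream = codec.compress(current)
        decoded = codec.decompress(bitstream)
        trace.append((len(bitstream) * 8 / num_pixels, psnr(current, decoded)))
        current = decoded
    return trace
```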
Quantization
As we saw earlier, one of the tougher forensic problems raised by JPEG AI is its visual similarity to synthetic images generated by diffusion models. Both systems use encoder–decoder architectures that process images in a compressed latent space and sometimes leave behind subtle upsampling artifacts.
These shared traits can confuse detectors – even those retrained on JPEG AI images. However, a key structural difference remains: JPEG AI applies quantization, a step that rounds latent values to discrete levels for efficient compression, while generative models typically do not.
The new paper uses this distinction to design a forensic cue that indirectly tests for the presence of quantization. The method analyzes how the latent representation of an image responds to rounding, on the assumption that if an image has already been quantized, its latent structure will exhibit a measurable pattern of alignment with rounded values.
These patterns, while invisible to the eye, produce statistical differences that can help separate compressed real images from fully synthetic ones.
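One way to picture that rounding test, as a simplified sketch rather than the paper's actual statistic: measure how far latent values sit from the nearest quantization level. A latent that was previously quantized stays unusually close to the grid, while a continuous latent does not.

```python
import numpy as np

def rounding_alignment(latent: np.ndarray, step: float = 1.0) -> float:
    """Mean distance of latent values from the nearest multiple of `step`.
    Values near 0 indicate strong grid alignment (quantization); roughly 0.25
    is what an unquantized, continuous latent typically produces."""
    scaled = latent / step
    return float(np.abs(scaled - np.round(scaled)).mean())

rng = np.random.default_rng(1)
continuous = rng.normal(scale=3.0, size=10_000)   # stand-in for a generator's latent
quantized = np.round(continuous)                  # stand-in for a quantized latent

print(rounding_alignment(continuous))  # ~0.25: no alignment with the grid
print(rounding_alignment(quantized))   # ~0.0 : strong alignment
```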

Importantly, the authors show that this cue works across different generative models and remains effective even when compression is strong enough to zero out entire sections of the latent space. In contrast, synthetic images show much weaker responses to this rounding test, offering a practical way to distinguish between the two.
The result is intended as a lightweight and interpretable tool targeting the core difference between compression and generation, rather than relying on brittle surface artifacts.
Data and Tests
Compression
To evaluate whether their color correlation cue could reliably detect JPEG AI compression (i.e., a first pass from an uncompressed source), the authors tested it on high-quality uncompressed images from the RAISE dataset, compressing these at a variety of bitrates using the JPEG AI reference implementation.
They trained a simple random forest on the statistical patterns of color channel correlations (particularly how residual noise in each channel aligned with the others) and compared this to a ResNet50 neural network trained directly on the image pixels.
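A hedged sketch of how such a comparison might be set up with scikit-learn, assuming the correlation statistics have already been computed into a feature matrix (the file names, feature layout, and training protocol here are illustrative stand-ins, not the paper's exact setup):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# X: one row of correlation-based features per image (e.g. the RG/RB/GB
# residual correlations from the earlier sketch); y: 1 = JPEG AI-compressed,
# 0 = uncompressed. Both files are hypothetical placeholders.
X = np.load("correlation_features.npy")
y = np.load("labels.npy")

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```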

While the ResNet50 achieved higher accuracy when the test data closely matched its training conditions, it struggled to generalize across different compression levels. The correlation-based approach, although far simpler, proved more consistent across bitrates, especially at lower compression rates where JPEG AI’s preprocessing has a stronger effect.
These results suggest that even without deep learning, it is possible to detect JPEG AI compression using statistical cues that remain interpretable and resilient.
Recompression
To evaluate whether JPEG AI recompression can be reliably detected, the researchers tested the rate-distortion cue on a set of images compressed at diverse bitrates – some just once, and others a second time using JPEG AI.
This approach involved extracting a 17-dimensional feature vector to track how the image’s bitrate and PSNR evolved across three compression passes. This feature set captured how much quality was lost at each step, and how the latent and hyperprior rates behave – metrics that traditional pixel-based methods cannot easily access.
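The exact composition of the 17 features is not reproduced here, but conceptually the idea is to bundle per-pass statistics and their pass-to-pass changes into a single vector; a purely hypothetical sketch, extending the kind of recompression trace shown earlier with latent and hyperprior rates:

```python
import numpy as np

def recompression_features(trace: list) -> np.ndarray:
    """Flatten a per-pass record of (bitrate, PSNR, latent_rate, hyperprior_rate)
    tuples -- a hypothetical stand-in for the paper's 17-dimensional vector --
    into a single feature vector that also includes the pass-to-pass deltas."""
    arr = np.asarray(trace, dtype=np.float64)   # shape: (passes, 4)
    deltas = np.diff(arr, axis=0)               # how each metric changes per pass
    return np.concatenate([arr.ravel(), deltas.ravel()])
```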
The researchers trained a random forest on these features and compared its performance to a ResNet50 trained on image patches:

The random forest proved notably effective when the initial compression was strong (i.e., at lower bitrates), revealing clear differences between single- and double-compressed images. As with the prior cue, the ResNet50 struggled to generalize, particularly when tested on compression levels it had not seen during training.
The rate-distortion features, by contrast, remained stable across a wide range of scenarios. Notably, the cue worked even when applied to a different AI-based codec, suggesting that the approach generalizes beyond JPEG AI.
JPEG AI and Synthetic Images
For the final round of tests, the authors evaluated whether their quantization-based features could distinguish between JPEG AI-compressed images and fully synthetic images generated by models such as Midjourney, Stable Diffusion, DALL-E 2, Glide, and Adobe Firefly.
For this, the researchers used a subset of the Synthbuster dataset, mixing real photos from the RAISE database with generated images from a variety of diffusion and GAN-based models.

Source: https://ieeexplore.ieee.org/document/10334046
The real images were compressed using JPEG AI at several bitrate levels, and classification was posed as a two-way task: either JPEG AI versus a specific generator, or a specific bitrate versus Stable Diffusion XL.
The quantization features (correlations extracted from latent representations) were calculated from a fixed 256×256 region and fed to a random forest classifier. As a baseline, a ResNet50 was trained on pixel patches from the same data.

Across most conditions, the quantization-based approach outperformed the ResNet50 baseline, particularly at low bitrates where compression artifacts were stronger.
The authors note that a projection of the feature space using UMAP showed clear separation between JPEG AI and synthetic images, with lower bitrates increasing the distance between classes. One consistent outlier was Glide, whose images clustered differently and had the lowest detection accuracy of any generator tested.
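For readers who want to reproduce that kind of visualization on their own features, here is a minimal sketch using the umap-learn package (the feature and label files are placeholders, not artifacts released with the paper):

```python
import numpy as np
import umap                      # pip install umap-learn
import matplotlib.pyplot as plt

X = np.load("quantization_features.npy")   # hypothetical per-image features
y = np.load("class_labels.npy")            # hypothetical class labels

embedding = umap.UMAP(n_components=2, random_state=0).fit_transform(X)
plt.scatter(embedding[:, 0], embedding[:, 1], c=y, s=4, cmap="tab10")
plt.title("UMAP projection of forensic features")
plt.show()
```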

Finally, the authors evaluated how well the features held up under typical post-processing, such as JPEG recompression or downsampling. While performance declined with heavier processing, the drop was gradual, suggesting that the approach retains some robustness even under degraded conditions.

Conclusion
It is not guaranteed that JPEG AI will enjoy wide adoption. For one thing, there is enough infrastructural debt at hand to impose friction on any new codec; and even a ‘conventional’ codec with a fine pedigree and broad consensus as to its value, such as AV1, has a hard time dislodging long-established incumbent methods.
With regard to the system’s potential clash with AI generators, the characteristic quantization artifacts that currently aid AI image detectors may be diminished, or ultimately replaced by traces of a different kind, in later systems (assuming that AI generators will always leave forensic residue, which is not certain).
This would mean that JPEG AI’s own quantization characteristics, perhaps together with other cues identified by the new paper, may not end up colliding with the forensic trail of the most effective new generative AI systems.
If, however, JPEG AI continues to operate as an ‘AI wash’, significantly blurring the distinction between real and generated images, it may be hard to make a convincing case for its uptake.