Why CVEs Belong in Frameworks and Apps, Not AI Models



The Common Vulnerabilities and Exposures (CVE) system is the worldwide standard for cataloging security flaws in software. Maintained by MITRE and backed by CISA, this system gives each vulnerability a unique ID and description so developers, vendors, and defenders can communicate clearly and act quickly on known risks.

As AI models become core components of enterprise systems, the security community is asking: should the CVE system also apply to models? AI models introduce new failure modes (adversarial prompts, poisoned training data, and data leakage) that look like vulnerabilities but don’t fit the CVE definition. According to CVE policy, a vulnerability must be a weakness in a product that violates a confidentiality, integrity, or availability guarantee.

In most cases, assigning a CVE to an individual model is the wrong scope. The real vulnerabilities sit in the frameworks and applications that load and use those weights. Models are parameterized mathematical systems: binary artifacts loaded into frameworks that provide APIs, tooling, and business logic. Vulnerabilities live in the surrounding code, such as session management, data handling, or framework serialization, not in the static weight files.

In this post, we outline why CVEs should generally be scoped to applications and frameworks, not to AI models.

When a CVE is proposed for a model, what’s it describing?

Most proposed AI model CVEs fall into three categories:

  1. Application or framework vulnerabilities: Issues in the software that wraps or serves the model, not in the model itself. Example: insecure session handling or framework-level flaws (e.g., TensorFlow CVEs).
  2. Supply chain issues: Risks such as tampered weights, poisoned datasets, or compromised repositories. These are best tracked with supply chain security mechanisms, not CVEs.
  3. Statistical behaviors of models: Inherent properties like data memorization, bias, or adversarial susceptibility. These aren’t vulnerabilities by CVE’s definition and should be mitigated in application design.

Consider the case of blind SQL injection: the flaw is not in the SQL database but in the application that fails to sanitize queries. With blind SQL injection, an attacker can craft queries to exfiltrate sensitive data one bit at a time, even though the database is functioning exactly as designed. The vulnerability lies in the application exposing raw query access, not in the database itself.
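
As a minimal, self-contained sketch of the pattern described above, the Python snippet below uses the standard sqlite3 module; the table, columns, and data are hypothetical and exist only to show where the flaw sits.

```python
import sqlite3

# In-memory database with a hypothetical users table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice', 's3cret')")

# Vulnerable: user input is concatenated directly into the SQL string.
# An attacker can supply a condition such as
#   alice' AND substr(password, 1, 1) = 's' --
# and infer the secret one character at a time from the response.
def lookup_user_unsafe(username: str):
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchone()

# Fixed in the application: a parameterized query treats the input purely as
# data. The database itself needed no patch; it was working as designed.
def lookup_user_safe(username: str):
    return conn.execute("SELECT id FROM users WHERE name = ?", (username,)).fetchone()
```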

Adversarial and inference-time attacks against AI models follow the same pattern. The model is performing inference as designed, but the surrounding application fails to control access or detect malicious queries. Any CVE, if appropriate, must be issued against that application layer, not the model weights.
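
To make that scoping concrete, here is a rough sketch of where inference-time controls might live in the serving application; the rate limit, the looks_malicious filter, and the model.predict interface are all illustrative assumptions rather than a prescribed design.

```python
import time
from collections import defaultdict

RATE_LIMIT = 100                 # illustrative: max queries per client per hour
_query_log = defaultdict(list)   # client_id -> timestamps of recent queries

def guarded_predict(model, client_id: str, prompt: str):
    """Application-layer wrapper: the controls live here, not in the weights."""
    now = time.time()
    recent = [t for t in _query_log[client_id] if now - t < 3600]
    if len(recent) >= RATE_LIMIT:
        raise PermissionError("query budget exceeded")  # throttles extraction attempts
    _query_log[client_id] = recent + [now]

    if looks_malicious(prompt):                         # hypothetical input filter
        raise ValueError("query rejected by policy")

    return model.predict(prompt)                        # the model just runs inference

def looks_malicious(prompt: str) -> bool:
    # Placeholder policy check; real deployments would use proper detection.
    return False
```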

How do different classes of attacks on AI models align, or fail to align, with CVE criteria?

AI models are software artifacts, but their probabilistic nature introduces behaviors that look like vulnerabilities. Most of these are normal inference outcomes exploited in unsafe application contexts.

When evaluating whether a CVE should apply, two questions matter:

  1. Has the model failed its intended inference function in a way that violates a security property?
  2. Is the problem unique to this model instance, such that a CVE ID would help users identify and remediate it, rather than simply restating a class of attacks that applies to all models?

In nearly all cases, the answer to both questions is no. Any model with unrestricted access can be subjected to extraction, inference, adversarial queries, or poisoning. These are properties of machine learning systems as a class, not flaws in a specific model instance. Issuing a CVE here adds no actionable value; it only restates that AI models are susceptible to known attack families.

CVEs exist to track discrete, fixable weaknesses. Most AI attack techniques exploit either:

  • Normal inference behavior, where models map inputs to outputs in ways that may expose statistical artifacts.
  • System design flaws, where applications expose the model without access control, query monitoring, or output filtering.

Labeling these as “model vulnerabilities” dilutes the purpose of CVEs. The correct scope is the frameworks, APIs, and applications where exploitable weaknesses live.

The following attack classes show why. They are often proposed as reasons to issue CVEs for models, but in practice, they align with application and supply chain security. The one gray area is deliberate training data poisoning, which can create a reproducible, model-specific compromise.

How do attackers extract model weights or replicate model behavior, and is that a vulnerability?

These attacks aim to extract model weights or replicate model behavior without authorization. Examples include model stealing, partial weight extraction, and next-token distribution replay. The root cause is almost always unrestricted queries or overly detailed outputs, such as exposed logits. The model itself performs normal inference. The weakness lies in access control and output handling by the application or service.
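
To illustrate why output handling matters, the sketch below (assuming a PyTorch-style classifier and a single batched input) contrasts an endpoint that returns full logits, which hand an attacker rich supervision for training a surrogate copy, with one that returns only the top-1 label; the function names are hypothetical.

```python
import torch
import torch.nn.functional as F

def classify_verbose(model, x):
    """Returns the full logit vector for x (shape (1, ...)). Convenient, but
    detailed outputs like these make model stealing far more efficient."""
    with torch.no_grad():
        return model(x).tolist()

def classify_minimal(model, x):
    """Returns only the top-1 class index. The model is unchanged; the
    service simply exposes less of each inference result."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
        return int(probs.argmax(dim=-1))
```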

When models leak details about training data, does that constitute a CVE?

Here, the attacker seeks to reveal sensitive information from the training set. Techniques include membership inference, which predicts whether a record was used in training, and data reconstruction, which induces the model to regenerate memorized samples. These attacks exploit overfitting and confidence leakage through carefully crafted inputs. Again, the model is behaving as expected. Mitigations such as differential privacy or query monitoring must be applied at the system level.
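
For intuition, here is a minimal sketch of the confidence-thresholding idea behind membership inference, again assuming a PyTorch-style classifier and a batch of one; the threshold is illustrative, and practical attacks calibrate it with shadow models.

```python
import torch
import torch.nn.functional as F

def membership_score(model, x, true_label: int) -> float:
    """Confidence the model assigns to the true label for input x of shape
    (1, ...). Overfit models tend to be more confident on training members
    than on unseen records."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
    return float(probs[0, true_label])

def likely_member(model, x, true_label: int, threshold: float = 0.95) -> bool:
    # Illustrative threshold; note the model is only doing ordinary inference.
    return membership_score(model, x, true_label) > threshold
```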

Do adversarial inputs that force misclassification create a model vulnerability or a system flaw?

These attacks force misclassification or unwanted outputs by manipulating inputs. In vision, imperceptible perturbations can flip labels (for example, a stop sign to a speed limit sign). In language models, jailbreak prompts can bypass controls. Generative models can be steered into producing disallowed content through adversarial prompting. The model applies its parameters as trained. The failure is that the application does not detect or constrain adversarial queries.
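
As a concrete example, the sketch below implements the fast gradient sign method (FGSM) for a PyTorch classifier; the epsilon value and input range are illustrative. The key point is that the model simply applies its learned parameters, and the attack uses nothing beyond ordinary forward and backward passes.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, label, epsilon=0.03):
    """Fast Gradient Sign Method: perturb the input (shape (1, C, H, W),
    values in [0, 1]) in the direction that increases the loss.
    `label` is a tensor of shape (1,) holding the true class."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()   # small, often imperceptible, change
    return x_adv.clamp(0, 1).detach()
```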

If malicious code executes during model loading, is the model itself at fault?

Many so-called “model attacks” are really attacks on the frameworks or formats used to load and serve models. Examples include pickle deserialization exploits that enable remote code execution, or lambda-layer payloads that embed malicious code in the forward pass. These issues stem from insecure serialization formats and framework flaws. The model itself is not implicated. Converting to a safe format or switching frameworks removes the risk, and any CVE belongs to the framework, not the model weights.
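
The sketch below shows why pickle-based model formats are risky: any pickled object can define __reduce__, and whatever it returns is executed at load time, before any model code runs. The payload here is benign, and the safetensors alternative is noted in comments as one possible safe format (assuming the safetensors package).

```python
import pickle

class NotAModel:
    """Pickle calls __reduce__ when loading, so the 'model file' decides what
    code runs on the machine that loads it."""
    def __reduce__(self):
        import os
        return (os.system, ("echo code executed during model load",))

payload = pickle.dumps(NotAModel())
pickle.loads(payload)   # prints the message: arbitrary code ran at deserialization

# Safer: weight-only formats such as safetensors store tensors, not code, e.g.:
#   from safetensors.torch import save_file, load_file
#   save_file({"weight": torch.zeros(2, 2)}, "model.safetensors")
#   tensors = load_file("model.safetensors")
```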

When does poisoned training data create a backdoored model that may warrant a CVE?

This category represents the one edge case where a CVE could have value. By injecting malicious samples during training, an attacker can implant backdoors or targeted behaviors into the model itself. For example, an image classifier may mislabel inputs that contain a hidden trigger, or a language model may be biased through poisoned prompts. In these cases, the model is compromised during training, and the poisoned weights are a discrete, trackable artifact. While many incidents are better addressed as supply chain issues through data provenance and authenticity checks, deliberately backdoored models may warrant CVE-level tracking.
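
For intuition, here is a minimal sketch of BadNets-style trigger injection applied while preparing training data; the tensor shapes, trigger size, poisoning rate, and target class are all illustrative.

```python
import torch

def poison_batch(images, labels, target_class=0, rate=0.05):
    """Stamps a small trigger patch into a fraction of training images and
    relabels them to an attacker-chosen class. A model trained on this data
    learns to misclassify any input carrying the trigger. Assumes image
    tensors of shape (N, C, H, W) with values in [0, 1]."""
    images, labels = images.clone(), labels.clone()
    n_poison = max(1, int(rate * len(images)))
    idx = torch.randperm(len(images))[:n_poison]
    images[idx, :, -4:, -4:] = 1.0   # 4x4 white square in the corner as the trigger
    labels[idx] = target_class
    return images, labels
```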

Where should CVEs be applied, and how should AI-specific risks be tracked?

The creation of CVE IDs serves a specific purpose: tracking and communicating exploitable vulnerabilities in software components so developers and security teams can triage, prioritize, and remediate risk.

AI models don’t fit this scope. Most attacks exploit expected inference behavior or weaknesses in the software that serves the model. Issuing CVEs at the model level adds noise without delivering actionable guidance.

The correct scope for CVEs is the frameworks and applications that load and expose models. That’s where exploitable conditions exist, and where patches and mitigations can be applied. The one narrow exception is deliberate training data poisoning that implants reproducible backdoors in specific weight files, but even this is often better addressed through supply chain integrity mechanisms.

The path forward is clear: apply CVEs where they drive real remediation, and strengthen the surrounding ecosystem with supply chain assurance, access controls, and monitoring. AI security depends on defending the systems that wrap and serve models, not on cataloging every statistical property as a vulnerability.

To learn more about NVIDIA’s AI Red Team: NVIDIA AI Red Team: An Introduction | NVIDIA Technical Blog. To report a possible security vulnerability in any NVIDIA product, please fill out the security vulnerability submission form or email psirt@nvidia.com.


