🐶Safetensors audited as really safe and becoming the default

Nicolas Patry, Stella Biderman

Hugging Face, in close collaboration with EleutherAI and Stability AI, commissioned
an external security audit of the safetensors library, the results of which allow
all three organizations to move toward making the library the default format
for saved models.

The full results of the security audit, performed by Trail of Bits,
can be found here: Report.

This blog post explains the origins of the library, why these audit results are important,
and the next steps.



What’s safetensors?

🐶Safetensors is a library
for saving and loading tensors in the most common frameworks (including PyTorch, TensorFlow, JAX, PaddlePaddle, and NumPy).

For a more concrete explanation, we’ll use PyTorch.

import torch
from safetensors.torch import load_file, save_file

weights = {"embeddings": torch.zeros((10, 100))}
save_file(weights, "model.safetensors")
weights2 = load_file("model.safetensors")

It also has a number of cool features compared to other formats, most notably that loading files is safe, as we’ll see later.

When you’re using transformers, if safetensors is installed, then those files will already
be used preferentially in order to prevent issues, which means that

pip install safetensors

is likely to be the only thing needed to run safetensors files safely.

Going forward and thanks to the validation of the library, safetensors will now be installed in transformers by
default. The next step is saving models in safetensors by default.

We’re thrilled to see that the safetensors library is already seeing use in the ML ecosystem, including:



Why create something new?

The creation of this library was driven by the fact that PyTorch uses pickle under
the hood, which is inherently unsafe. (Sources: 1, 2, video, 3)

With pickle, it is possible to write a malicious file posing as a model
that gives full control of a user’s computer to an attacker without the user’s knowledge,
allowing the attacker to steal all their bitcoins 😓.

While this vulnerability in pickle is widely known in the computer security world (and is acknowledged in the PyTorch docs), it’s not common knowledge in the broader ML community.
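To make the danger concrete, here is a harmless sketch of the mechanism an attacker would abuse: pickle lets any object define `__reduce__`, and whatever callable it returns is executed during unpickling. Here we return `eval` with a benign expression, but it could just as easily be a shell command:

```python
import pickle


class Malicious:
    # pickle calls __reduce__ to decide how to rebuild the object, so an
    # attacker can return any callable plus arguments to run on load.
    def __reduce__(self):
        return (eval, ("6 * 7",))


payload = pickle.dumps(Malicious())
result = pickle.loads(payload)  # eval runs during unpickling
print(result)  # → 42
```

Nothing about the file looks suspicious before it is loaded; the code runs as a side effect of `pickle.loads` itself.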

Since the Hugging Face Hub is a platform where anyone can upload and share models, it is important to make efforts
to prevent users from getting infected by malware.

We’re also taking steps to make sure the existing PyTorch files aren’t malicious, but the best we can do is flag suspicious-looking files.

Of course, there are other file formats out there, but
none seemed to meet the full set of ideal requirements our team identified.

In addition to being safe, safetensors allows lazy loading and generally faster loads (around 100x faster on CPU).

Lazy loading means loading only part of a tensor in an efficient manner.
This particular feature enables arbitrary sharding with efficient inference libraries, such as text-generation-inference, to load LLMs (such as LLaMA, StarCoder, etc.) on various types of hardware
with maximum efficiency.

Because it loads so fast and is framework agnostic, we can even use the format
to load models from the same file in PyTorch or TensorFlow.



The security audit

Since safetensors’ main asset is providing safety guarantees, we wanted to make sure that
it actually delivered. That’s why Hugging Face, EleutherAI, and Stability AI teamed up to commission an external
security audit to confirm it.

Important findings:

  • No critical security flaw leading to arbitrary code execution was found.
  • Some imprecisions in the spec format were detected and fixed.
  • Some missing validation allowed polyglot files, which was fixed.
  • Lots of improvements to the test suite were proposed and implemented.

In the name of openness and transparency, all companies agreed to make the report
fully public.

Full report

One important thing to note is that the library is written in Rust. This adds
an extra layer of security
coming directly from the language itself.

While it is impossible to
prove the absence of flaws, this is a major step in giving reassurance that safetensors
is indeed safe to use.



Going forward

For Hugging Face, EleutherAI, and Stability AI, the master plan is to shift to using this format by default.

EleutherAI has added support for evaluating models stored as safetensors in their LM Evaluation Harness and is working on supporting the format in their GPT-NeoX distributed training library.

Within the transformers library we’re doing the following:

  • Create safetensors.
  • Verify it works and can deliver on all guarantees (lazy load for LLMs, single file for all frameworks, faster loads).
  • Verify it’s safe. (This is today’s announcement.)
  • Make safetensors a core dependency. (This is already done or soon to come.)
  • Make safetensors the default saving format. This will happen in a few months when we have enough feedback
    to make sure it will cause as little disruption as possible and enough users already have the library
    to be able to load new models even on relatively old transformers versions.

As for safetensors itself, we’re looking into adding more advanced features for LLM training,
which has its own set of issues with current formats.

Finally, we plan to release a 1.0 in the near future, with the large user base of transformers providing the final testing step.
The format and the lib have had very few modifications since their inception,
which is a good sign of stability.

We’re glad we can bring ML one step closer to being safe and efficient for all!


