We’re excited to announce our partnership and integration with Truffle Security, bringing TruffleHog’s powerful secret scanning features to our platform as a part of our ongoing commitment to security.

TruffleHog is an open-source tool that detects and verifies secret leaks in code. With a wide range of detectors for popular SaaS and cloud providers, it scans files and repositories for sensitive information such as credentials, tokens, and encryption keys.
Accidentally committing secrets to code repositories can have serious consequences. By scanning repositories for secrets, TruffleHog helps developers catch and remove this sensitive information before it becomes a problem, protecting data and preventing costly security incidents.
To combat secret leakage in public and private repositories, we worked with the TruffleHog team on two initiatives:
- Enhancing our automated scanning pipeline with TruffleHog
- Creating a native Hugging Face scanner in TruffleHog
Enhancing our automated scanning pipeline with TruffleHog
At Hugging Face, we’re committed to protecting our users’ sensitive information. This is why we have implemented an automated security scanning pipeline that scans all repos and commits. We have extended this pipeline to include TruffleHog, which means there are now three types of scans:
- malware scanning: scans for known malware signatures with ClamAV
- pickle scanning: scans pickle files for malicious executable code with picklescan
- secret scanning: scans for passwords, tokens and API keys with TruffleHog
We run the trufflehog filesystem command on every new or modified file on each push to a repository, scanning for potential secrets. When a verified secret is detected, we notify the user via email, empowering them to take corrective action.
Verified secrets are those that have been confirmed to work for authentication against their respective providers. Note, however, that unverified secrets are not necessarily harmless or invalid: verification can fail for technical reasons, such as downtime on the provider’s side.
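As a rough illustration of the verified/unverified distinction, the sketch below runs trufflehog filesystem on a list of paths and splits the findings by verification status. It is not our pipeline's implementation. It assumes TruffleHog's --json flag emits one JSON object per line and that each finding carries "DetectorName" and "Verified" fields; split_findings and scan_paths are hypothetical helper names.

```python
import json
import subprocess


def split_findings(json_lines):
    """Split TruffleHog JSON-lines output into (verified, unverified) findings."""
    verified, unverified = [], []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip the banner or any other non-JSON line
        # Assumption: findings carry "DetectorName"; log records do not.
        if not isinstance(record, dict) or "DetectorName" not in record:
            continue
        (verified if record.get("Verified") else unverified).append(record)
    return verified, unverified


def scan_paths(paths):
    """Run `trufflehog filesystem` on the given paths (trufflehog must be on PATH)."""
    result = subprocess.run(
        ["trufflehog", "filesystem", *paths, "--json"],
        capture_output=True,
        text=True,
        check=False,
    )
    return split_findings(result.stdout.splitlines())
```

Keeping the unverified list around, rather than discarding it, reflects the point above: a failed verification does not prove that a secret is dead.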
It will always be valuable to run trufflehog on your repositories yourself, even when we do it for you. For instance, you may have rotated the secrets that were leaked and want to make sure they come up as “unverified”, or you may want to manually check whether unverified secrets still pose a threat.
We will eventually migrate to the trufflehog huggingface command, the native Hugging Face scanner, once support for LFS lands.

TruffleHog Native Hugging Face Scanner
The goal of creating a native Hugging Face scanner in TruffleHog is to empower our users (and the security teams protecting them) to proactively scan their own account data for leaked secrets.
TruffleHog’s new open-source Hugging Face integration can scan models, datasets and Spaces, as well as any relevant PRs or Discussions. The only limitation is that TruffleHog does not currently scan files stored in LFS. Their team is looking to address this for all of their git sources soon.
To scan all of your, or your organization’s, Hugging Face models, datasets, and Spaces for secrets using TruffleHog, run the following command(s):
trufflehog huggingface --user <username>
trufflehog huggingface --org <orgname>
trufflehog huggingface --user <username> --org <orgname>
You can optionally include the --include-discussions and --include-prs flags to scan Hugging Face discussion and PR comments.
If you’d like to scan just one model, dataset or Space, TruffleHog has specific flags for each of those.
trufflehog huggingface --model <model_id>
trufflehog huggingface --dataset <dataset_id>
trufflehog huggingface --space <space_id>
If you need to pass in an authentication token, you can do so using the --token flag or by setting a HUGGINGFACE_TOKEN environment variable.
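If you want to script these scans, a small wrapper can assemble the command from the flags described above. The sketch below is one possible shape, not an official API: build_hf_scan_command and run_hf_scan are hypothetical helpers, they only use flags mentioned in this post, and they do not check which flag combinations TruffleHog accepts. The token is passed through the HUGGINGFACE_TOKEN environment variable rather than --token so that it does not show up in the process arguments.

```python
import os
import subprocess


def build_hf_scan_command(user=None, org=None, model=None, dataset=None,
                          space=None, include_discussions=False,
                          include_prs=False):
    """Assemble a `trufflehog huggingface` argument list from the given targets."""
    cmd = ["trufflehog", "huggingface"]
    targets = (
        ("--user", user),
        ("--org", org),
        ("--model", model),
        ("--dataset", dataset),
        ("--space", space),
    )
    for flag, value in targets:
        if value:
            cmd += [flag, value]
    if len(cmd) == 2:
        raise ValueError("specify at least one user, org, model, dataset or space")
    if include_discussions:
        cmd.append("--include-discussions")
    if include_prs:
        cmd.append("--include-prs")
    return cmd


def run_hf_scan(cmd, token=None):
    """Run the scan, optionally exposing the token via HUGGINGFACE_TOKEN."""
    env = dict(os.environ)
    if token:
        env["HUGGINGFACE_TOKEN"] = token
    return subprocess.run(cmd, env=env, check=False)
```

For example, build_hf_scan_command(model="mcpotato/42-eicar-street") produces the argument list for the single-model scan shown in the next example.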
Here is an example of TruffleHog’s output when run on mcpotato/42-eicar-street:
trufflehog huggingface --model mcpotato/42-eicar-street
🐷🔑🐷 TruffleHog. Unearth your secrets. 🐷🔑🐷
2024-09-02T16:39:30+02:00 info-0 trufflehog running source {"source_manager_worker_id": "3KRwu", "with_units": false, "target_count": 0, "source_manager_units_configurable": true}
2024-09-02T16:39:30+02:00 info-0 trufflehog Completed enumeration {"num_models": 1, "num_spaces": 0, "num_datasets": 0}
2024-09-02T16:39:32+02:00 info-0 trufflehog scanning repo {"source_manager_worker_id": "3KRwu", "model": "https://huggingface.co/mcpotato/42-eicar-street.git", "repo": "https://huggingface.co/mcpotato/42-eicar-street.git"}
Found unverified result 🐷🔑❓
Detector Type: HuggingFace
Decoder Type: PLAIN
Raw result: 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
Commit: 9cb322a7c2b4ec7c9f18045f0fa05015b831f256
Email: Luc Georges
File: token_leak.yml
Line: 1
Link: https://huggingface.co/mcpotato/42-eicar-street/blob/9cb322a7c2b4ec7c9f18045f0fa05015b831f256/token_leak.yml#L1
Repository: https://huggingface.co/mcpotato/42-eicar-street.git
Resource_type: model
Timestamp: 2024-06-17 13:11:50 +0000
2024-09-02T16:39:32+02:00 info-0 trufflehog finished scanning {"chunks": 19, "bytes": 2933, "verified_secrets": 0, "unverified_secrets": 1, "scan_duration": "2.176551292s", "trufflehog_version": "3.81.10"}
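The final "finished scanning" line carries a JSON summary that can be handy in a script, for example to fail a job when verified secrets are reported. The sketch below pulls that summary out of a log line shaped like the one above. parse_scan_summary is a hypothetical helper, and it assumes the log layout seen in this example run; that layout is not a documented interface and may change between TruffleHog versions.

```python
import json


def parse_scan_summary(log_line):
    """Return the JSON summary from a 'finished scanning' log line, or None."""
    marker = "finished scanning"
    idx = log_line.find(marker)
    if idx == -1:
        return None  # not the summary line
    start = log_line.find("{", idx)
    if start == -1:
        return None  # marker present but no JSON payload
    return json.loads(log_line[start:])


line = (
    '2024-09-02T16:39:32+02:00 info-0 trufflehog finished scanning '
    '{"chunks": 19, "bytes": 2933, "verified_secrets": 0, '
    '"unverified_secrets": 1, "scan_duration": "2.176551292s", '
    '"trufflehog_version": "3.81.10"}'
)
summary = parse_scan_summary(line)
print(summary["verified_secrets"], summary["unverified_secrets"])
```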
Kudos to the TruffleHog team for offering such a great tool to keep our community safe! Stay tuned for more features as we continue to collaborate to make the Hub more secure for everyone.
