The Complete Swift Client for Hugging Face

Today, we’re announcing swift-huggingface,
a brand-new Swift package that provides a complete client for the Hugging Face Hub.

You can start using it today as a standalone package,
and it will soon be integrated into swift-transformers as a replacement for its current HubApi implementation.



The Problem

After we released swift-transformers 1.0 earlier this year,
we heard loud and clear from the community:

  • Downloads were slow and unreliable.
    Large model files (often several gigabytes)
    would fail partway through with no way to resume.
    Developers resorted to manually downloading models and bundling them with their apps,
    defeating the purpose of dynamic model loading.
  • No shared cache with the Python ecosystem.
    The Python transformers library stores models in ~/.cache/huggingface/hub.
    Swift apps downloaded to a different location with a different structure.
    If you’d already downloaded a model using the Python CLI,
    you’d have to download it again in your Swift app.
  • Authentication is confusing.
    Where should tokens come from?
    Environment variables? Files? Keychain?
    The answer was “it depends”,
    and the existing implementation didn’t make the choices clear.



Introducing swift-huggingface

swift-huggingface is a ground-up rewrite focused on reliability and developer experience.
It provides:

  • Complete Hub API coverage — models, datasets, spaces, collections, discussions, and more
  • Robust file operations — progress tracking, resume support, and proper error handling
  • Python-compatible cache — share downloaded models between Swift and Python clients
  • Flexible authentication — a TokenProvider pattern that makes credential sources explicit
  • OAuth support — first-class support for user-facing apps that need to authenticate users
  • Xet storage backend support (Coming soon!) — chunk-based deduplication for significantly faster downloads

Let’s look at some examples.




Flexible Authentication with TokenProvider

One of the biggest improvements is how authentication works. The TokenProvider pattern makes it explicit where credentials come from:

import HuggingFace

// Auto-detect a token using the standard conventions (see below)
let client = HubClient.default

// Or provide a static token directly
let client = HubClient(tokenProvider: .static("hf_xxx"))

// Or read the token from the Keychain
let client = HubClient(tokenProvider: .keychain(service: "com.myapp", account: "hf_token"))

The auto-detection follows the same conventions as the Python huggingface_hub library:

  1. HF_TOKEN environment variable
  2. HUGGING_FACE_HUB_TOKEN environment variable
  3. HF_TOKEN_PATH environment variable (path to token file)
  4. $HF_HOME/token file
  5. ~/.cache/huggingface/token (standard HF CLI location)
  6. ~/.huggingface/token (fallback location)

This means that if you’ve already logged in with hf auth login,
swift-huggingface will automatically find and use that token.
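
As a rough sketch, that detection order amounts to something like the following (a hypothetical helper shown for illustration; the real logic lives inside the package behind TokenProvider):

import Foundation

// Illustrative only: walk the documented detection order by hand.
func detectToken() -> String? {
    let env = ProcessInfo.processInfo.environment
    if let token = env["HF_TOKEN"] { return token }
    if let token = env["HUGGING_FACE_HUB_TOKEN"] { return token }

    var paths: [String] = []
    if let tokenPath = env["HF_TOKEN_PATH"] { paths.append(tokenPath) }
    if let hfHome = env["HF_HOME"] { paths.append(hfHome + "/token") }
    let home = FileManager.default.homeDirectoryForCurrentUser.path
    paths.append(home + "/.cache/huggingface/token")
    paths.append(home + "/.huggingface/token")

    for path in paths {
        if let token = try? String(contentsOfFile: path, encoding: .utf8) {
            return token.trimmingCharacters(in: .whitespacesAndNewlines)
        }
    }
    return nil
}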



OAuth for User-Facing Apps

Building an app where users sign in with their Hugging Face account?
swift-huggingface includes a complete OAuth 2.0 implementation:

import HuggingFace

// Configure the OAuth flow for your app
let authManager = try HuggingFaceAuthenticationManager(
    clientID: "your_client_id",
    redirectURL: URL(string: "yourapp://oauth/callback")!,
    scope: [.openid, .profile, .email],
    keychainService: "com.yourapp.huggingface",
    keychainAccount: "user_token"
)

// Present the sign-in flow to the user
try await authManager.signIn()

// Use the resulting credentials with the Hub client
let client = HubClient(tokenProvider: .oauth(manager: authManager))

// Requests are now authenticated as the signed-in user
let userInfo = try await client.whoami()
print("Signed in as: \(userInfo.name)")

The OAuth manager handles token storage in the Keychain,
automatic refresh, and secure sign-out.
No more manual token management.
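
As a usage sketch (the method name here is an assumption for illustration, not confirmed API; check the package documentation), signing a user out might be as simple as:

// Hypothetical: end the session and clear the stored credentials.
try await authManager.signOut()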



Reliable Downloads

Downloading large models is now straightforward with proper progress tracking and resume support:


import Combine
import Foundation

// Create a Progress object to observe; the client updates it as bytes arrive
let progress = Progress(totalUnitCount: 0)

Task {
    // Observe completion updates via the KVO publisher
    for await _ in progress.publisher(for: \.fractionCompleted).values {
        print("Download: \(Int(progress.fractionCompleted * 100))%")
    }
}

let fileURL = try await client.downloadFile(
    at: "model.safetensors",
    from: "microsoft/phi-2",
    to: destinationURL,
    progress: progress
)

If a download is interrupted,
you can resume it:


// Resume from previously saved resume data
let fileURL = try await client.resumeDownloadFile(
    resumeData: savedResumeData,
    to: destinationURL,
    progress: progress
)
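
Where does savedResumeData come from? Since the client is built on URLSession download tasks, one plausible pattern is to pull resume data out of the failure. This sketch assumes the underlying URLError, with its standard resume-data userInfo key, is surfaced to the caller:

import Foundation

// Hypothetical sketch: capture resume data when a download is interrupted.
var savedResumeData: Data?
do {
    _ = try await client.downloadFile(
        at: "model.safetensors",
        from: "microsoft/phi-2",
        to: destinationURL,
        progress: progress
    )
} catch let error as URLError {
    // URLSession attaches resume data to resumable failures
    savedResumeData = error.userInfo[NSURLSessionDownloadTaskResumeData] as? Data
}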

For downloading entire model repositories,
downloadSnapshot handles everything:

let modelDir = try await client.downloadSnapshot(
    of: "mlx-community/Llama-3.2-1B-Instruct-4bit",
    to: cacheDirectory,
    matching: ["*.safetensors", "*.json"],  // glob patterns for files to fetch
    progressHandler: { progress in
        print("Downloaded \(progress.completedUnitCount) of \(progress.totalUnitCount) files")
    }
)

The snapshot function tracks metadata for each file,
so subsequent calls only download files that have changed.
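
Because of that metadata tracking, it should be cheap to call downloadSnapshot on every launch; when nothing has changed upstream, the call verifies the local copy and returns:

// Later launches: only files whose remote metadata changed are fetched.
let modelDir = try await client.downloadSnapshot(
    of: "mlx-community/Llama-3.2-1B-Instruct-4bit",
    to: cacheDirectory,
    matching: ["*.safetensors", "*.json"],
    progressHandler: { _ in }  // files already up to date complete immediately
)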



Shared Cache with Python

Remember the second problem we mentioned?
“No shared cache with the Python ecosystem.”
That’s now solved.

swift-huggingface implements a Python-compatible cache structure
that enables seamless sharing between Swift and Python clients:

~/.cache/huggingface/hub/
├── models--deepseek-ai--DeepSeek-V3.2/
│   ├── blobs/
│   │   └── <etag>               # actual file content
│   ├── refs/
│   │   └── main                 # contains commit hash
│   └── snapshots/
│       └── <commit hash>/
│           └── config.json      # symlink → ../../blobs/<etag>

This means:

  • Download once, use everywhere.
    If you’ve already downloaded a model with the hf CLI or the Python library,
    swift-huggingface will find it automatically.
  • Content-addressed storage.
    Files are stored by their ETag in the blobs/ directory.
    If two revisions share the same file, it’s only stored once.
  • Symlinks for efficiency.
    Snapshot directories contain symlinks to blobs,
    minimizing disk usage while maintaining a clean file structure.

The cache location follows the same environment variable conventions as Python:

  1. HF_HUB_CACHE environment variable
  2. HF_HOME environment variable + /hub
  3. ~/.cache/huggingface/hub (default)
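
As a sketch, resolving that order yourself would look something like this (a hypothetical helper, not the package’s API):

import Foundation

// Illustrative only: mirror the documented cache-location order.
func resolveHubCacheDirectory() -> URL {
    let env = ProcessInfo.processInfo.environment
    if let hubCache = env["HF_HUB_CACHE"] {
        return URL(fileURLWithPath: hubCache)
    }
    if let hfHome = env["HF_HOME"] {
        return URL(fileURLWithPath: hfHome).appendingPathComponent("hub")
    }
    return FileManager.default.homeDirectoryForCurrentUser
        .appendingPathComponent(".cache/huggingface/hub")
}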

You can also use the cache directly:

// The default cache uses the standard location
let cache = HubCache.default

// Look up a file that may already be cached
if let cachedPath = cache.cachedFilePath(
    repo: "deepseek-ai/DeepSeek-V3.2",
    kind: .model,
    revision: "main",
    filename: "config.json"
) {
    let data = try Data(contentsOf: cachedPath)
    // Use the cached file without hitting the network
}

To prevent race conditions when multiple processes access the same cache,
swift-huggingface uses file locking
(flock(2)).
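
For reference, flock(2)-style advisory locking in Swift looks roughly like this on Apple platforms (an illustrative sketch; the package’s internal locking is not public API):

import Darwin
import Foundation

// Illustrative only: hold an exclusive advisory lock while body runs.
func withFileLock<T>(at url: URL, _ body: () throws -> T) throws -> T {
    let fd = open(url.path, O_CREAT | O_RDWR, 0o644)
    guard fd >= 0 else { throw CocoaError(.fileWriteUnknown) }
    defer { close(fd) }          // defers run LIFO, so this runs last
    flock(fd, LOCK_EX)           // blocks until the exclusive lock is held
    defer { flock(fd, LOCK_UN) } // release before closing
    return try body()
}

Note that flock locks are advisory: they only coordinate processes that also take the lock, which is exactly the shared-cache scenario here.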



Before and After

Here’s what downloading a model snapshot looked like with the old HubApi:


// Old HubApi usage
let hub = HubApi()
let repo = Hub.Repo(id: "mlx-community/Llama-3.2-1B-Instruct-4bit")

// No resume support; a failure meant starting over
let modelDir = try await hub.snapshot(
    from: repo,
    matching: ["*.safetensors", "*.json"]
) { progress in
    // Coarse progress reporting only
    print(progress.fractionCompleted)
}

And here’s the same operation with swift-huggingface:


// New HubClient usage
let client = HubClient.default

let modelDir = try await client.downloadSnapshot(
    of: "mlx-community/Llama-3.2-1B-Instruct-4bit",
    to: cacheDirectory,
    matching: ["*.safetensors", "*.json"],
    progressHandler: { progress in
        // File-level progress, with resumable downloads underneath
        print("\(progress.completedUnitCount)/\(progress.totalUnitCount) files")
    }
)

The API is similar, but the implementation is completely different:
built on URLSession download tasks with proper
delegate handling, resume data support, and metadata tracking.



Beyond Downloads

But wait, there’s more!
swift-huggingface contains a complete Hub client:


// List trending MLX models
let models = try await client.listModels(
    filter: "library:mlx",
    sort: "trending",
    limit: 10
)

// Fetch metadata for a specific model
let model = try await client.getModel("mlx-community/Llama-3.2-1B-Instruct-4bit")
print("Downloads: \(model.downloads ?? 0)")
print("Likes: \(model.likes ?? 0)")

// Browse collections
let collections = try await client.listCollections(owner: "huggingface", sort: "trending")

// Read discussions on a repository
let discussions = try await client.listDiscussions(kind: .model, "username/my-model")

And that’s not all!
swift-huggingface has everything you need to interact with
Hugging Face Inference Providers,
giving your app easy access to hundreds of machine learning models,
powered by world-class inference providers:

import HuggingFace

// The default inference client
let client = InferenceClient.default

// Generate an image from a text prompt
let response = try await client.textToImage(
    model: "black-forest-labs/FLUX.1-schnell",
    prompt: "A serene Japanese garden with cherry blossoms",
    provider: .hfInference,
    width: 1024,
    height: 1024,
    numImages: 1,
    guidanceScale: 7.5,
    numInferenceSteps: 50,
    seed: 42
)

// Save the generated image to disk
try response.image.write(to: URL(fileURLWithPath: "generated.png"))

Check the README for a full list of everything that’s supported.



What’s Next

We’re actively working on two fronts:

Integration with swift-transformers.
We have a pull request in progress to replace HubApi with swift-huggingface.
This will bring reliable downloads to everyone using swift-transformers,
mlx-swift-lm,
and the broader ecosystem.
If you maintain a Swift-based library or app and want help adopting swift-huggingface, reach out; we’re happy to help.

Faster downloads with Xet.
We’re adding support for the Xet storage backend,
which enables chunk-based deduplication and significantly faster downloads for large models.
More on this soon.



Try It Out

Add swift-huggingface to your project:

dependencies: [
    .package(url: "https://github.com/huggingface/swift-huggingface.git", from: "0.4.0")
]
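
Then add the product to your target. The product name below matches the import HuggingFace statement used throughout this post; double-check it against the package manifest:

.target(
    name: "MyApp",
    dependencies: [
        .product(name: "HuggingFace", package: "swift-huggingface")
    ]
)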

We’d love your feedback.
If you’ve been frustrated with model downloads in Swift, give this a try and
let us know how it goes.
Your experience reports help us prioritize what to improve next.



Thanks to the swift-transformers community for the feedback that shaped this project, and to everyone who filed issues and shared their experiences. This is for you. ❤️


