We released swift-transformers two years ago (!) with the goal of supporting Apple developers and helping them integrate local LLMs into their apps. A lot has changed since then (MLX and chat templates didn’t exist!), and we’ve learned how the community is actually using the library.
We want to double down on the use cases that provide the most value to the community, and lay the foundations for the future. Spoiler alert: after this release, we’ll focus a lot on MLX and agentic use cases 🚀
What’s swift-transformers
swift-transformers is a Swift library that aims to reduce the friction for developers who want to work with local models on Apple Silicon platforms, including iPhones. It includes the missing pieces that are not provided by Core ML or MLX alone, but that are required to work with local inference. Namely, it provides the following components:
- Tokenizers. Preparing inputs for a language model is surprisingly complex. We have built a lot of experience with our tokenizers Python and Rust libraries, which are foundational to the AI ecosystem. We wanted to bring the same performant, ergonomic experience to Swift. The Swift version of Tokenizers should handle everything for you, including chat templates and agentic use!
- Hub. This is an interface to the Hugging Face Hub, where all open models can be found. It allows you to download models from the Hub and cache them locally, and supports background resumable downloads, model updates, and offline mode. It contains a subset of the functionality provided by the Python and JavaScript libraries, focused on the tasks Apple developers need the most (for instance, uploads are not supported).
- Models and Generation. These are wrappers for LLMs converted to the Core ML format. Converting them is out of the scope of the library (but we have some guides). Once they’re converted, these modules make it easy to run inference with them.
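To make this concrete, here’s a minimal sketch of the Hub and Tokenizers modules working together. The repo ID and file patterns are placeholders, and you should treat the exact signatures as a sketch rather than a reference:

import Hub
import Tokenizers

// Download (and locally cache) a subset of a repo's files from the Hub.
let repo = Hub.Repo(id: "mlx-community/Qwen2.5-7B-Instruct-4bit")
let modelDirectory = try await Hub.snapshot(from: repo, matching: ["*.json", "*.safetensors"])

// Load the matching tokenizer and do an encode/decode round trip.
let tokenizer = try await AutoTokenizer.from(pretrained: "mlx-community/Qwen2.5-7B-Instruct-4bit")
let inputIds = tokenizer.encode(text: "Local inference on Apple Silicon")
let roundTrip = tokenizer.decode(tokens: inputIds)
print(modelDirectory.path, inputIds, roundTrip)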

How is the community using it
Most of the time, people use the Tokenizers or Hub modules, and frequently both. Some notable projects that rely on swift-transformers include:
- mlx-swift-examples, by Apple. It is, in fact, not just a set of examples, but a collection of libraries you can use to run various types of models with MLX, including LLMs and VLMs (vision-language models). It’s sort of like our Models and Generation libraries, but for MLX instead of Core ML, and it supports many more model types, like embedders or Stable Diffusion.
- WhisperKit, by argmax. An open-source ASR (speech recognition) framework, heavily optimized for Apple Silicon. It relies on our Hub and Tokenizers modules.
- FastVLM, by Apple, and many other app demos, such as our own SmolVLM2 native app.
What changes with v1.0
Version 1.0 signals stability in the package. Developers are building apps on swift-transformers, and this first major release recognizes those use cases and brings the version number in line with that reality. It also provides the foundation on which to iterate with the community and build the next set of features. These are some of our favorite updates:
- Tokenizers and Hub are now first-class, top-level modules. Before 1.0, you had to depend on and import the whole package, whereas now you can just pick Tokenizers, for instance (see the Package.swift sketch after this list). Tokenizers uses Jinja under the hood for chat templates.
- Speaking of Jinja, we’re super proud to announce that we have collaborated with John Mai (X) to create the next version of his excellent Swift Jinja library. John’s work has been crucial for the community: he single-handedly took on the task of providing a solid chat template library that could grow as templates became more and more complex. The new version is a couple of orders of magnitude faster (no kidding), and lives here as swift-jinja.
- To further reduce the load imposed on downstream users, we have removed our example CLI targets and the swift-argument-parser dependency, which in turn prevents version conflicts for projects that already use it.
- Thanks to contributions by Apple, we have adopted modern Core ML APIs with support for stateful models (for easier KV caching) and the expressive MLTensor API, which removes hundreds of lines of custom tensor operations and math code.
- Lots of additional cruft removed, and the API surface reduced, to lower cognitive load and iterate faster.
- Tests are better, faster, stronger.
- Documentation comments have been added to public APIs.
- Swift 6 is fully supported.
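As promised above, here’s a hedged Package.swift sketch that depends only on the Tokenizers product. The product name matches the module described in this post, but the tools version and platform requirements are placeholder assumptions:

// swift-tools-version:5.9
import PackageDescription

// Minimal sketch: pull in only the Tokenizers product instead of the whole package.
let package = Package(
    name: "MyApp",
    platforms: [.macOS(.v14), .iOS(.v17)],
    dependencies: [
        .package(url: "https://github.com/huggingface/swift-transformers", from: "1.0.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                .product(name: "Tokenizers", package: "swift-transformers")
            ]
        )
    ]
)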
Version 1.0 comes with breaking API changes. However, we don’t expect major problems if you are a user of Tokenizers or Hub. If you use the Core ML components of the library, please get in touch so we can support you during the transition. We’ll prepare a migration guide and add it to the documentation.
Usage Examples
Here’s how to use Tokenizers to format tool-calling input for an LLM:
import Tokenizers

let tokenizer = try await AutoTokenizer.from(pretrained: "mlx-community/Qwen2.5-7B-Instruct-4bit")

// Tool definitions follow the JSON-schema convention used by chat templates.
// The explicit [String: Any] annotation is required for a heterogeneous literal.
let weatherTool: [String: Any] = [
    "type": "function",
    "function": [
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": [
            "type": "object",
            "properties": ["location": ["type": "string", "description": "City and state"]],
            "required": ["location"]
        ]
    ]
]

// Renders the chat template (including the tool spec) and returns token ids.
let tokens = try tokenizer.applyChatTemplate(
    messages: [["role": "user", "content": "What's the weather in Paris?"]],
    tools: [weatherTool]
)
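The result is an array of token ids, ready to feed to a model. If you want to inspect the rendered prompt as text, you can decode it back (a small sketch using the decode API shown earlier):

let prompt = tokenizer.decode(tokens: tokens)
print(prompt)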
For additional examples, please check this section in the README and the Examples folder.
What comes next
Honestly, we don’t know. We do know that we’re super excited about exploring MLX, because it’s currently the go-to approach for developers getting started with ML in native apps, and we want to help make the experience as seamless as possible. We’re thinking along the lines of tighter integration with mlx-swift-examples for LLMs and VLMs, potentially through the pre-processing and post-processing operations that developers frequently need.
We’re also extremely excited about agentic use in general and MCP in particular. We expect that exposing system resources to local workflows will be 🚀
If you want to follow along on this journey or share your ideas, please reach out through our social networks or the repo.
We couldn’t have done this without you 🫵
We’re immensely grateful to all the contributors and users of the library for your help and feedback. We love you all, and can’t wait to continue working with you to shape the future of on-device generation! ❤️
