Using Stable Diffusion with Core ML on Apple Silicon

Pedro Cuenca

Thanks to Apple engineers, you can now run Stable Diffusion on Apple Silicon using Core ML!

This Apple repo provides conversion scripts and inference code based on 🧨 Diffusers, and we like it! To make it as easy as possible for you, we converted the weights ourselves and put the Core ML versions of the models in the Hugging Face Hub.

Update: a few weeks after this post was written, we created a native Swift app you can use to run Stable Diffusion effortlessly on your own hardware. We released the app in the Mac App Store, as well as the source code, to allow other projects to use it.

The rest of this post guides you through using the converted weights in your own code, or converting additional weights yourself.



Available Checkpoints

The official Stable Diffusion checkpoints are already converted and ready to use; for example, apple/coreml-stable-diffusion-v1-4, which the examples below rely on.

Core ML supports all the compute units available in your device: CPU, GPU and Apple’s Neural Engine (NE). It is also possible for Core ML to run different portions of the model on different devices to maximize performance.
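
If you want to experiment with an individual converted model outside the full pipeline, coremltools exposes the same compute-unit choices. Below is a minimal sketch (not from the original post; the model path is a placeholder for one of the .mlpackage files you download later in this guide):

import coremltools as ct

# Placeholder path: point this at one of the downloaded .mlpackage files.
model_path = "path/to/model.mlpackage"

# ALL lets Core ML distribute work across CPU, GPU and the Neural Engine;
# CPU_AND_NE, CPU_AND_GPU and CPU_ONLY restrict it to a subset of devices.
model = ct.models.MLModel(model_path, compute_units=ct.ComputeUnit.ALL)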

There are several variants of each model that may yield different performance depending on the hardware you use. We recommend you try them out and stick with the one that works best on your system. Read on for details.



Notes on Performance

There are several variants per model:

  • “Original” attention vs. “split_einsum”. These are two alternative implementations of the critical attention blocks. split_einsum was previously introduced by Apple and is compatible with all compute units (CPU, GPU and Apple’s Neural Engine). original, on the other hand, is only compatible with CPU and GPU. Nevertheless, original can be faster than split_einsum on some devices, so do check it out!
  • “ML Packages” vs. “Compiled” models. The former is suitable for Python inference, while the compiled version is required for Swift code. The compiled models in the Hub split the large UNet model weights into several files for compatibility with iOS and iPadOS devices. This corresponds to the --chunk-unet conversion option.

At the time of this writing, we got the best results on a MacBook Pro (M1 Max, 32 GPU cores, 64 GB) using the following combination:

  • original attention.
  • all compute units (see next section for details).
  • macOS Ventura 13.1 Beta 4 (22C5059b).

With these, it took 18s to generate one image with the Core ML version of Stable Diffusion v1.4 🤯.

⚠️ Note

Several improvements to Core ML were introduced in macOS Ventura 13.1, and they are required by Apple’s implementation. You may get black images (and much slower generation times) if you use previous versions of macOS.
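
If you want a quick programmatic check before running the pipeline (a convenience sketch, not part of Apple’s tooling), the standard library can read the macOS version for you:

import platform

ver = platform.mac_ver()[0]  # e.g. "13.1"
if not ver:
    print("Not running on macOS.")
elif [int(x) for x in ver.split(".")[:2]] < [13, 1]:
    print(f"macOS {ver} detected; 13.1 or later is recommended for Core ML Stable Diffusion.")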

Each model repo is organized in a tree structure that provides these different variants:

coreml-stable-diffusion-v1-4
├── README.md
├── original
│   ├── compiled
│   └── packages
└── split_einsum
    ├── compiled
    └── packages

You can download and use the variant you need as shown below.



Core ML Inference in Python



Prerequisites

pip install huggingface_hub
pip install git+https://github.com/apple/ml-stable-diffusion



Download the Model Checkpoints

To run inference in Python, you have to use one of the versions stored in the packages folders, because the compiled ones are only compatible with Swift. You may choose whether you want to use the original or split_einsum attention styles.

This is how you’d download the original attention variant from the Hub:

from huggingface_hub import snapshot_download
from pathlib import Path

repo_id = "apple/coreml-stable-diffusion-v1-4"
variant = "original/packages"

model_path = Path("./models") / (repo_id.split("/")[-1] + "_" + variant.replace("/", "_"))
snapshot_download(repo_id, allow_patterns=f"{variant}/*", local_dir=model_path, local_dir_use_symlinks=False)
print(f"Model downloaded at {model_path}")

The code above will place the downloaded model snapshot inside a directory called models.



Inference

Once you have downloaded a snapshot of the model, the easiest way to run inference is to use Apple’s Python script:

python -m python_coreml_stable_diffusion.pipeline --prompt "a photograph of an astronaut riding a horse on mars" -i models/coreml-stable-diffusion-v1-4_original_packages -o output --compute-unit ALL --seed 93

The -i argument should point to the checkpoint you downloaded in the step above, and --compute-unit indicates the hardware you want to allow for inference. It must be one of the following options: ALL, CPU_AND_GPU, CPU_ONLY, CPU_AND_NE. You may also provide an optional output path and a seed for reproducibility.

The inference script assumes the original version of the Stable Diffusion model, stored in the Hub as CompVis/stable-diffusion-v1-4. If you use another model, you have to specify its Hub id in the inference command line, using the --model-version option. This works both for models already supported and for custom models you trained or fine-tuned yourself.

For Stable Diffusion 1.5 (Hub id: runwayml/stable-diffusion-v1-5):

python -m python_coreml_stable_diffusion.pipeline --prompt "a photograph of an astronaut riding a horse on mars" --compute-unit ALL -o output --seed 93 -i models/coreml-stable-diffusion-v1-5_original_packages --model-version runwayml/stable-diffusion-v1-5

For Stable Diffusion 2 base (Hub id: stabilityai/stable-diffusion-2-base):

python -m python_coreml_stable_diffusion.pipeline --prompt "a photograph of an astronaut riding a horse on mars" --compute-unit ALL -o output --seed 93 -i models/coreml-stable-diffusion-2-base_original_packages --model-version stabilityai/stable-diffusion-2-base



Core ML Inference in Swift

Running inference in Swift is slightly faster than in Python because the models are already compiled in the mlmodelc format. This will be noticeable on app startup when the model is loaded, but shouldn’t be noticeable if you run several generations afterwards.



Download

To run inference in Swift on your Mac, you need one of the compiled checkpoint versions. We recommend you download them locally using Python code similar to the one we showed above, but using one of the compiled variants:

from huggingface_hub import snapshot_download
from pathlib import Path

repo_id = "apple/coreml-stable-diffusion-v1-4"
variant = "original/compiled"

model_path = Path("./models") / (repo_id.split("/")[-1] + "_" + variant.replace("/", "_"))
snapshot_download(repo_id, allow_patterns=f"{variant}/*", local_dir=model_path, local_dir_use_symlinks=False)
print(f"Model downloaded at {model_path}")



Inference

To run inference, please clone Apple’s repo:

git clone https://github.com/apple/ml-stable-diffusion
cd ml-stable-diffusion

And then use Apple’s command-line tool via Swift Package Manager:

swift run StableDiffusionSample --resource-path models/coreml-stable-diffusion-v1-4_original_compiled --compute-units all "a photograph of an astronaut riding a horse on mars"

You have to specify in --resource-path one of the checkpoints downloaded in the previous step, so please make sure it contains compiled Core ML bundles with the extension .mlmodelc. --compute-units has to be one of these values: all, cpuOnly, cpuAndGPU, cpuAndNeuralEngine.
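
If you are not sure the folder is laid out the way the CLI expects, a quick check like the following (a sketch; the path matches the resource path used in the command above, adjust it to your setup) lists the compiled bundles:

from pathlib import Path

# The resource path passed to the Swift CLI above (adjust if your layout differs).
resource_path = Path("models/coreml-stable-diffusion-v1-4_original_compiled")

# Each compiled Core ML model is a directory bundle ending in .mlmodelc;
# search recursively in case the download kept the original/compiled subfolders.
bundles = sorted(str(p.relative_to(resource_path)) for p in resource_path.rglob("*.mlmodelc"))
print(bundles or "No .mlmodelc bundles found - check the resource path.")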

For more details, please refer to the instructions in Apple’s repo.



Bring Your Own Model

If you have created your own models compatible with Stable Diffusion (for example, if you used Dreambooth, Textual Inversion or fine-tuning), then you have to convert the models yourself. Fortunately, Apple provides a conversion script that allows you to do so.

For this task, we recommend you follow the instructions in Apple’s repo.
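
As a rough illustration only (a sketch; the exact flags and their defaults are documented in Apple’s ml-stable-diffusion repo, and <your-hub-id> is a placeholder for your model’s Hub id), converting a Hub-hosted fine-tune might look like this:

python -m python_coreml_stable_diffusion.torch2coreml \
    --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker \
    --model-version <your-hub-id> \
    --attention-implementation SPLIT_EINSUM \
    --bundle-resources-for-swift-cli \
    -o models/my-converted-model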



Next Steps

We’re really excited about the opportunities this brings and can’t wait to see what the community creates from here. Some potential ideas are:

  • Native, high-quality apps for Mac, iPhone and iPad.
  • Bring additional schedulers to Swift, for even faster inference.
  • Additional pipelines and tasks.
  • Explore quantization techniques and further optimizations.

Looking forward to seeing what you create!


