Segmind Mixture of Diffusion Experts

SegMoE is an exciting framework for creating Mixture-of-Experts Diffusion models from scratch! SegMoE is fully integrated within the Hugging Face ecosystem and is supported in diffusers 🔥!

Among the features and integrations in this release are the SegMoE models on the Hub and custom code to create your own MoE models (see the TL;DR below).






What’s SegMoE?

SegMoE models follow the same architecture as Stable Diffusion. Like Mixtral 8x7b, a SegMoE model combines multiple models into one. This works by replacing some feed-forward layers with a sparse MoE layer. A MoE layer contains a router network that selects which experts process which tokens most efficiently.
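To make the routing idea concrete, below is a minimal, illustrative sketch of a sparse MoE feed-forward layer in PyTorch. This is a simplification for exposition only, not SegMoE's actual implementation; the names, sizes, and expert architecture are arbitrary assumptions.

import torch
import torch.nn as nn

class SparseMoEFeedForward(nn.Module):
    """Toy sparse MoE layer: a router picks the top-k experts for each token."""

    def __init__(self, dim: int, num_experts: int = 4, num_experts_per_tok: int = 2):
        super().__init__()
        # Each expert is a small feed-forward block (hypothetical architecture).
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        self.router = nn.Linear(dim, num_experts)  # the gate network
        self.k = num_experts_per_tok

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, dim)
        logits = self.router(x)            # (num_tokens, num_experts)
        weights, chosen = torch.topk(logits, self.k, dim=-1)
        weights = weights.softmax(dim=-1)  # normalize over the k chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoEFeedForward(dim=64)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])

Only the chosen experts run for each token, which is what makes the layer "sparse".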
You can use the segmoe package to create your own MoE models! The process takes just a few minutes. For further information, please visit the GitHub repository. We took inspiration from the popular library mergekit when designing segmoe, and we thank the contributors of mergekit for such a useful library.

For more details on MoEs, see the Hugging Face 🤗 post: hf.co/blog/moe.

SegMoE release TL;DR

  • Release of SegMoE-4×2, SegMoE-2×1 and SegMoE-SD4x2 versions
  • Release of custom MoE-making code



About the name

The SegMoE MoEs are called SegMoE-AxB, where A refers to the number of expert models MoE-d together, and B refers to the number of experts involved in the generation of each image. Only some layers of the model (the feed-forward blocks, attentions, or all) are replicated depending on the configuration settings; the rest of the parameters are the same as in a Stable Diffusion model. For more details about how MoEs work, please refer to the “Mixture of Experts Explained” post.



Inference

We release 3 merges on the Hub:

  1. SegMoE 2×1 has two expert models.
  2. SegMoE 4×2 has four expert models.
  3. SegMoE SD 4×2 has four Stable Diffusion 1.5 expert models.



Samples

Images generated using SegMoE 4×2:

[image]

Images generated using SegMoE 2×1:

[image]

Images generated using SegMoE SD 4×2:

[image]



Using 🤗 Diffusers

Run the following command to install the segmoe package. Make sure you have the latest versions of diffusers and transformers installed.

pip install -U segmoe diffusers transformers

The following snippet loads the second model (“SegMoE 4×2”) from the list above and runs generation with it.

from segmoe import SegMoEPipeline

# Load the SegMoE 4x2 checkpoint from the Hugging Face Hub onto the GPU
pipeline = SegMoEPipeline("segmind/SegMoE-4x2-v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")

[image]



Using a Local Model

Alternatively, a local model can also be loaded; here, segmoe_v0 is the path to the directory containing the local SegMoE model. Check out Creating your Own SegMoE to learn how to build your own!

from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("segmoe_v0", device="cuda")

prompt = "cosmic canvas, orange city background, painting of a chubby cat"
negative_prompt = "nsfw, bad quality, worse quality"
img = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=1024,
    width=1024,
    num_inference_steps=25,
    guidance_scale=7.5,
).images[0]
img.save("image.png")



Comparison

Prompt understanding seems to improve, as shown in the images below. Each image shows the following models, left to right: SegMoE-2×1-v0, SegMoE-4×2-v0, and the base model (RealVisXL_V3.0).

[image]

three green glass bottles

[image]

panda bear with aviator glasses on its head

[image]

the statue of Liberty next to the Washington Monument

[image]

Taj Mahal with its reflection. detailed charcoal sketch.



Creating your Own SegMoE

Simply prepare a config.yaml file with the following structure:

base_model: Base Model Path, Model Card or CivitAI Download Link
num_experts: Number of experts to use
moe_layers: Type of Layers to Mix (can be "ff", "attn" or "all"). Defaults to "attn"
num_experts_per_tok: Number of Experts to use 
experts:
  - source_model: Expert 1 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 2 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 3 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
  - source_model: Expert 4 Path, Model Card or CivitAI Download Link
    positive_prompt: Positive Prompt for computing gate weights
    negative_prompt: Negative Prompt for computing gate weights
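For example, a hypothetical two-expert config could look like the following. The model IDs and gate prompts here are purely illustrative choices, not an official recipe:

base_model: SG161222/RealVisXL_V3.0
num_experts: 2
moe_layers: all
num_experts_per_tok: 1
experts:
  - source_model: SG161222/RealVisXL_V3.0
    positive_prompt: "photorealistic portrait, cinematic lighting, sharp focus"
    negative_prompt: "cartoon, anime, blurry, low quality"
  - source_model: stabilityai/stable-diffusion-xl-base-1.0
    positive_prompt: "digital painting, concept art, vibrant colors"
    negative_prompt: "photograph, low quality, blurry"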

Any number of models can be combined. For detailed information on how to create a config file, please refer to the GitHub repository.

Note
Both Hugging Face and CivitAI models are supported. For CivitAI models, paste the download link of the model, for example: “https://civitai.com/api/download/models/239306”

Then run the following command:

segmoe config.yaml segmoe_v0

This will create a folder called segmoe_v0 with the following structure:

├── model_index.json
├── scheduler
│   └── scheduler_config.json
├── text_encoder
│   ├── config.json
│   └── model.safetensors
├── text_encoder_2
│   ├── config.json
│   └── model.safetensors
├── tokenizer
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── tokenizer_2
│   ├── merges.txt
│   ├── special_tokens_map.json
│   ├── tokenizer_config.json
│   └── vocab.json
├── unet
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
└── vae
    ├── config.json
    └── diffusion_pytorch_model.safetensors

Alternatively, you can also use the Python API to create a mixture-of-experts model:

from segmoe import SegMoEPipeline

pipeline = SegMoEPipeline("config.yaml", device="cuda")

pipeline.save_pretrained("segmoe_v0")
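
The saved folder can then be loaded like any local SegMoE model, as shown in the Using a Local Model section above.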



Push to Hub

The model can be pushed to the Hub via the huggingface-cli:

huggingface-cli upload segmind/segmoe_v0 ./segmoe_v0

The model can also be pushed to the Hub directly from Python:

from huggingface_hub import create_repo, upload_folder
 
model_id = "segmind/SegMoE-v0"

repo_id = create_repo(repo_id=model_id, exist_ok=True).repo_id

upload_folder(
    repo_id=repo_id,
    folder_path="segmoe_v0",
    commit_message="Initial Commit",
    ignore_patterns=["step_*", "epoch_*"],
)

Detailed usage can be found here.



Disclaimers and ongoing work

  • Slower Speed: If the number of experts per token is larger than 1, the MoE performs computation across several expert models. This makes it slower than a single SD 1.5 or SDXL model.

  • High VRAM usage: MoEs run inference very quickly but still need a large amount of VRAM (and hence an expensive GPU). This makes it difficult to use them in local setups, but they are great for deployments with multiple GPUs. As a reference point, SegMoE-4×2 requires 24GB of VRAM in half-precision. A rough way to sanity-check such figures is sketched below.
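
The memory needed just to hold the weights can be estimated from the parameter count. A back-of-envelope sketch, assuming 2 bytes per parameter in half-precision and ignoring activations and other overhead (the 10B parameter count below is hypothetical):

def weight_vram_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate VRAM needed just to hold the weights (fp16/bf16 by default)."""
    return num_params * bytes_per_param / 1024**3

# e.g. a hypothetical 10B-parameter MoE would need ~18.6 GB for weights alone,
# before activations, the VAE, and text encoders are accounted for.
print(f"{weight_vram_gb(10e9):.1f} GB")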



Conclusion

We built SegMoE to offer the community a new tool that can potentially create SOTA diffusion models with ease, simply by combining pretrained models while keeping inference times low. We're excited to see what you can build with it!



