There are currently three ways to convert your Hugging Face Transformers models to ONNX. In this section, you’ll learn how to export distilbert-base-uncased-finetuned-sst-2-english for text-classification using all three methods, going from the low-level torch API to the most user-friendly high-level API of Optimum. Each method does exactly the same thing.
Export with torch.onnx (low-level)
torch.onnx lets you convert model checkpoints to an ONNX graph via the export method. However, you have to provide a lot of values yourself, such as input_names, dynamic_axes, etc.
You’ll first need to install some dependencies:
pip install transformers torch
Exporting our checkpoint with export:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
dummy_model_input = tokenizer("This is a sample", return_tensors="pt")
torch.onnx.export(
    model,
    tuple(dummy_model_input.values()),
    f="torch-model.onnx",
    input_names=['input_ids', 'attention_mask'],
    output_names=['logits'],
    dynamic_axes={'input_ids': {0: 'batch_size', 1: 'sequence'},
                  'attention_mask': {0: 'batch_size', 1: 'sequence'},
                  'logits': {0: 'batch_size', 1: 'sequence'}},
    do_constant_folding=True,
    opset_version=13,
)
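To sanity-check the exported graph, you could load it with onnxruntime and feed it the same dummy input. This is just a minimal sketch, assuming onnxruntime is installed (pip install onnxruntime) and using the "logits" output name defined in the export above:

import onnxruntime as ort

# Load the exported graph and run it on the dummy input from above
session = ort.InferenceSession("torch-model.onnx", providers=["CPUExecutionProvider"])
onnx_inputs = {name: tensor.numpy() for name, tensor in dummy_model_input.items()}
logits = session.run(output_names=["logits"], input_feed=onnx_inputs)[0]
print(logits.shape)  # (batch_size, num_labels)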
Export with transformers.onnx (mid-level)
transformers.onnx lets you convert model checkpoints to an ONNX graph by leveraging configuration objects. That way you don’t have to supply the complex configuration for dynamic_axes etc.
You’ll first need to install some dependencies:
pip install transformers[onnx] torch
Exporting our checkpoint with transformers.onnx:
from pathlib import Path
import transformers
from transformers.onnx import FeaturesManager
from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
feature = "sequence-classification"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
onnx_config = model_onnx_config(model.config)
onnx_inputs, onnx_outputs = transformers.onnx.export(
    preprocessor=tokenizer,
    model=model,
    config=onnx_config,
    opset=13,
    output=Path("trfs-model.onnx")
)
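The call returns the lists of input and output names that were matched and written into the graph, which can be handy when wiring up an inference session later. A quick check (the exact names shown are what you would expect for this checkpoint):

print("ONNX inputs:", onnx_inputs)    # e.g. ['input_ids', 'attention_mask']
print("ONNX outputs:", onnx_outputs)  # e.g. ['logits']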
Export with Optimum (high-level)
Optimum Inference includes methods to convert vanilla Transformers models to ONNX using the ORTModelForXxx classes. To convert your Transformers model to ONNX, you just need to pass from_transformers=True to the from_pretrained() method, and your model will be loaded and converted to ONNX leveraging the transformers.onnx package under the hood.
You’ll first need to install some dependencies:
pip install optimum[onnxruntime]
Exporting our checkpoint with ORTModelForSequenceClassification:
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english", from_transformers=True)
The best part about the conversion with Optimum is that you can immediately use the model to run predictions or load it inside a pipeline, as shown below.
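For example, here is a minimal sketch of running the converted model through a regular Transformers pipeline; the input sentence is just an illustration:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The ONNX model plugs into the pipeline just like a vanilla Transformers model
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("I love the new ONNX export!"))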
