There are currently three ways to convert your Hugging Face Transformers models to ONNX. In this section, you’ll learn how to export distilbert-base-uncased-finetuned-sst-2-english for text-classification using all three methods, going from the low-level torch API to the most user-friendly high-level API of Optimum. Each method does exactly the same thing.
Export with torch.onnx (low-level)
torch.onnx lets you convert model checkpoints to an ONNX graph via the export method. However, you have to provide a lot of values yourself, such as input_names, dynamic_axes, etc.
You’ll first need to install some dependencies:
pip install transformers torch
Exporting our checkpoint with export:
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
dummy_model_input = tokenizer("This is a sample", return_tensors="pt")
torch.onnx.export(
    model,
    tuple(dummy_model_input.values()),
    f="torch-model.onnx",
    input_names=['input_ids', 'attention_mask'],
    output_names=['logits'],
    dynamic_axes={'input_ids': {0: 'batch_size', 1: 'sequence'},
                  'attention_mask': {0: 'batch_size', 1: 'sequence'},
                  'logits': {0: 'batch_size', 1: 'sequence'}},
    do_constant_folding=True,
    opset_version=13,
)
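To sanity-check the exported graph, you could load it with onnxruntime and feed it the same dummy input. This is just a minimal sketch, assuming onnxruntime is installed (pip install onnxruntime) and using the "logits" output name defined in the export above:

import onnxruntime as ort

# Load the exported graph and run it on the dummy input from above
session = ort.InferenceSession("torch-model.onnx", providers=["CPUExecutionProvider"])
onnx_inputs = {name: tensor.numpy() for name, tensor in dummy_model_input.items()}
logits = session.run(output_names=["logits"], input_feed=onnx_inputs)[0]
print(logits.shape)  # (batch_size, num_labels)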
Export with transformers.onnx (mid-level)
transformers.onnx lets you convert model checkpoints to an ONNX graph by leveraging configuration objects. That way you don’t have to supply the complex configuration for dynamic_axes etc.
You’ll first need to install some dependencies:
pip install transformers[onnx] torch
Exporting our checkpoint with transformers.onnx:
from pathlib import Path
import transformers
from transformers.onnx import FeaturesManager
from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
feature = "sequence-classification"
model = AutoModelForSequenceClassification.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model_kind, model_onnx_config = FeaturesManager.check_supported_model_or_raise(model, feature=feature)
onnx_config = model_onnx_config(model.config)
onnx_inputs, onnx_outputs = transformers.onnx.export(
    preprocessor=tokenizer,
    model=model,
    config=onnx_config,
    opset=13,
    output=Path("trfs-model.onnx")
)
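The call returns the lists of input and output names that were matched and written into the graph, which can be handy when wiring up an inference session later. A quick check (the exact names shown are what you would expect for this checkpoint):

print("ONNX inputs:", onnx_inputs)    # e.g. ['input_ids', 'attention_mask']
print("ONNX outputs:", onnx_outputs)  # e.g. ['logits']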
Export with Optimum (high-level)
Optimum Inference includes methods to convert vanilla Transformers models to ONNX using the ORTModelForXxx classes. To convert your Transformers model to ONNX, you just need to pass from_transformers=True to the from_pretrained() method, and your model will be loaded and converted to ONNX leveraging the transformers.onnx package under the hood.
You’ll first need to install some dependencies:
pip install optimum[onnxruntime]
Exporting our checkpoint with ORTModelForSequenceClassification:
from optimum.onnxruntime import ORTModelForSequenceClassification
model = ORTModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english", from_transformers=True)
The best part about the conversion with Optimum is that you can immediately use the model to run predictions or load it inside a pipeline, as shown below.
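For example, here is a minimal sketch of running the converted model through a regular Transformers pipeline; the input sentence is just an illustration:

from transformers import AutoTokenizer, pipeline
from optimum.onnxruntime import ORTModelForSequenceClassification

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, from_transformers=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The ONNX model plugs into the pipeline just like a vanilla Transformers model
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("I love the new ONNX export!"))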
