Change transformers.onnx to use optimum.exporters.onnx (#20529)

* Change transformers.onnx to use optimum.exporters.onnx

* Update doc

* Remove print

* Fix transformers.onnx cli

* Update documentation

* Update documentation

* Small fixes

* Fix log message

* Apply suggestions

* Update src/transformers/onnx/__main__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions

* Add missing line break

* Ran make fix-copies

* Update src/transformers/onnx/__main__.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update src/transformers/onnx/__main__.py

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

Co-authored-by: Michael Benayoun <michael@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
This commit is contained in:
Michael Benayoun
2022-12-09 10:42:02 +01:00
committed by GitHub
parent 9a6c6ef97f
commit 6a062a3ed9
3 changed files with 147 additions and 50 deletions

View File

@@ -17,15 +17,6 @@ exporting them to a serialized format that can be loaded and executed on special
runtimes and hardware. In this guide, we'll show you how to export 🤗 Transformers
models to [ONNX (Open Neural Network eXchange)](http://onnx.ai).
<Tip>
Once exported, a model can be optimized for inference via techniques such as
quantization and pruning. If you are interested in optimizing your models to run with
maximum efficiency, check out the [🤗 Optimum
library](https://github.com/huggingface/optimum).
</Tip>
ONNX is an open standard that defines a common set of operators and a common file format
to represent deep learning models in a wide variety of frameworks, including PyTorch and
TensorFlow. When a model is exported to the ONNX format, these operators are used to
@@ -41,6 +32,23 @@ you to convert model checkpoints to an ONNX graph by leveraging configuration ob
These configuration objects come ready made for a number of model architectures, and are
designed to be easily extendable to other architectures.
<Tip>
You can also export 🤗 Transformers models with the [`optimum.exporters.onnx` package](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model)
from 🤗 Optimum.
Once exported, a model can be:
- Optimized for inference via techniques such as quantization and graph optimization.
- Run with ONNX Runtime via [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort),
which follow the same `AutoModel` API as the one you are used to in 🤗 Transformers.
- Run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines),
which has the same API as the [`pipeline`] function in 🤗 Transformers.
To explore all these features, check out the [🤗 Optimum library](https://github.com/huggingface/optimum).
</Tip>
Ready-made configurations include the following architectures:
<!--This table is automatically generated by `make fix-copies`, do not fill manually!-->
@@ -117,6 +125,14 @@ In the next two sections, we'll show you how to:
## Exporting a model to ONNX
<Tip>
The recommended way of exporting a model is now to use
[`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli),
do not worry it is very similar to `transformers.onnx`!
</Tip>
To export a 🤗 Transformers model to ONNX, you'll first need to install some extra
dependencies:
@@ -245,6 +261,14 @@ python -m transformers.onnx --model=local-tf-checkpoint onnx/
## Selecting features for different model tasks
<Tip>
The recommended way of exporting a model is now to use `optimum.exporters.onnx`.
You can check the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#selecting-a-task)
to learn how to select a task.
</Tip>
Each ready-made configuration comes with a set of _features_ that enable you to export
models for different types of tasks. As shown in the table below, each feature is
associated with a different `AutoClass`:
@@ -312,6 +336,15 @@ exported separately as two ONNX files named `encoder_model.onnx` and `decoder_mo
## Exporting a model for an unsupported architecture
<Tip>
If you wish to contribute by adding support for a model that cannot be currently exported, you should first check if it is
supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/package_reference/configuration#supported-architectures),
and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute)
directly.
</Tip>
If you wish to export a model whose architecture is not natively supported by the
library, there are three main steps to follow:
@@ -499,4 +532,4 @@ file
Check out how the configuration for [IBERT was
contributed](https://github.com/huggingface/transformers/pull/14868/files) to get an
idea of what's involved.
idea of what's involved.