Change transformers.onnx to use optimum.exporters.onnx (#20529)
* Change transformers.onnx to use optimum.exporters.onnx * Update doc * Remove print * Fix transformers.onnx cli * Update documentation * Update documentation * Small fixes * Fix log message * Apply suggestions * Update src/transformers/onnx/__main__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions * Add missing line break * Ran make fix-copies * Update src/transformers/onnx/__main__.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update src/transformers/onnx/__main__.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> Co-authored-by: Michael Benayoun <michael@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
This commit is contained in:
@@ -17,15 +17,6 @@ exporting them to a serialized format that can be loaded and executed on special
|
|||||||
runtimes and hardware. In this guide, we'll show you how to export 🤗 Transformers
|
runtimes and hardware. In this guide, we'll show you how to export 🤗 Transformers
|
||||||
models to [ONNX (Open Neural Network eXchange)](http://onnx.ai).
|
models to [ONNX (Open Neural Network eXchange)](http://onnx.ai).
|
||||||
|
|
||||||
<Tip>
|
|
||||||
|
|
||||||
Once exported, a model can be optimized for inference via techniques such as
|
|
||||||
quantization and pruning. If you are interested in optimizing your models to run with
|
|
||||||
maximum efficiency, check out the [🤗 Optimum
|
|
||||||
library](https://github.com/huggingface/optimum).
|
|
||||||
|
|
||||||
</Tip>
|
|
||||||
|
|
||||||
ONNX is an open standard that defines a common set of operators and a common file format
|
ONNX is an open standard that defines a common set of operators and a common file format
|
||||||
to represent deep learning models in a wide variety of frameworks, including PyTorch and
|
to represent deep learning models in a wide variety of frameworks, including PyTorch and
|
||||||
TensorFlow. When a model is exported to the ONNX format, these operators are used to
|
TensorFlow. When a model is exported to the ONNX format, these operators are used to
|
||||||
@@ -41,6 +32,23 @@ you to convert model checkpoints to an ONNX graph by leveraging configuration ob
|
|||||||
These configuration objects come ready made for a number of model architectures, and are
|
These configuration objects come ready made for a number of model architectures, and are
|
||||||
designed to be easily extendable to other architectures.
|
designed to be easily extendable to other architectures.
|
||||||
|
|
||||||
|
<Tip>
|
||||||
|
|
||||||
|
You can also export 🤗 Transformers models with the [`optimum.exporters.onnx` package](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model)
|
||||||
|
from 🤗 Optimum.
|
||||||
|
|
||||||
|
Once exported, a model can be:
|
||||||
|
|
||||||
|
- Optimized for inference via techniques such as quantization and graph optimization.
|
||||||
|
- Run with ONNX Runtime via [`ORTModelForXXX` classes](https://huggingface.co/docs/optimum/onnxruntime/package_reference/modeling_ort),
|
||||||
|
which follow the same `AutoModel` API as the one you are used to in 🤗 Transformers.
|
||||||
|
- Run with [optimized inference pipelines](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/pipelines),
|
||||||
|
which has the same API as the [`pipeline`] function in 🤗 Transformers.
|
||||||
|
|
||||||
|
To explore all these features, check out the [🤗 Optimum library](https://github.com/huggingface/optimum).
|
||||||
|
|
||||||
|
</Tip>
|
||||||
|
|
||||||
Ready-made configurations include the following architectures:
|
Ready-made configurations include the following architectures:
|
||||||
|
|
||||||
<!--This table is automatically generated by `make fix-copies`, do not fill manually!-->
|
<!--This table is automatically generated by `make fix-copies`, do not fill manually!-->
|
||||||
@@ -117,6 +125,14 @@ In the next two sections, we'll show you how to:
|
|||||||
|
|
||||||
## Exporting a model to ONNX
|
## Exporting a model to ONNX
|
||||||
|
|
||||||
|
<Tip>
|
||||||
|
|
||||||
|
The recommended way of exporting a model is now to use
|
||||||
|
[`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli),
|
||||||
|
do not worry it is very similar to `transformers.onnx`!
|
||||||
|
|
||||||
|
</Tip>
|
||||||
|
|
||||||
To export a 🤗 Transformers model to ONNX, you'll first need to install some extra
|
To export a 🤗 Transformers model to ONNX, you'll first need to install some extra
|
||||||
dependencies:
|
dependencies:
|
||||||
|
|
||||||
@@ -245,6 +261,14 @@ python -m transformers.onnx --model=local-tf-checkpoint onnx/
|
|||||||
|
|
||||||
## Selecting features for different model tasks
|
## Selecting features for different model tasks
|
||||||
|
|
||||||
|
<Tip>
|
||||||
|
|
||||||
|
The recommended way of exporting a model is now to use `optimum.exporters.onnx`.
|
||||||
|
You can check the [🤗 Optimum documentation](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#selecting-a-task)
|
||||||
|
to learn how to select a task.
|
||||||
|
|
||||||
|
</Tip>
|
||||||
|
|
||||||
Each ready-made configuration comes with a set of _features_ that enable you to export
|
Each ready-made configuration comes with a set of _features_ that enable you to export
|
||||||
models for different types of tasks. As shown in the table below, each feature is
|
models for different types of tasks. As shown in the table below, each feature is
|
||||||
associated with a different `AutoClass`:
|
associated with a different `AutoClass`:
|
||||||
@@ -312,6 +336,15 @@ exported separately as two ONNX files named `encoder_model.onnx` and `decoder_mo
|
|||||||
|
|
||||||
## Exporting a model for an unsupported architecture
|
## Exporting a model for an unsupported architecture
|
||||||
|
|
||||||
|
<Tip>
|
||||||
|
|
||||||
|
If you wish to contribute by adding support for a model that cannot be currently exported, you should first check if it is
|
||||||
|
supported in [`optimum.exporters.onnx`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/package_reference/configuration#supported-architectures),
|
||||||
|
and if it is not, [contribute to 🤗 Optimum](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/contribute)
|
||||||
|
directly.
|
||||||
|
|
||||||
|
</Tip>
|
||||||
|
|
||||||
If you wish to export a model whose architecture is not natively supported by the
|
If you wish to export a model whose architecture is not natively supported by the
|
||||||
library, there are three main steps to follow:
|
library, there are three main steps to follow:
|
||||||
|
|
||||||
|
|||||||
@@ -11,60 +11,63 @@
|
|||||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||||
# See the License for the specific language governing permissions and
|
# See the License for the specific language governing permissions and
|
||||||
# limitations under the License.
|
# limitations under the License.
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
import warnings
|
||||||
from argparse import ArgumentParser
|
from argparse import ArgumentParser
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
|
||||||
from ..models.auto import AutoFeatureExtractor, AutoProcessor, AutoTokenizer
|
from packaging import version
|
||||||
from ..onnx.utils import get_preprocessor
|
|
||||||
|
from .. import AutoFeatureExtractor, AutoProcessor, AutoTokenizer
|
||||||
from ..utils import logging
|
from ..utils import logging
|
||||||
|
from ..utils.import_utils import is_optimum_available
|
||||||
from .convert import export, validate_model_outputs
|
from .convert import export, validate_model_outputs
|
||||||
from .features import FeaturesManager
|
from .features import FeaturesManager
|
||||||
|
from .utils import get_preprocessor
|
||||||
|
|
||||||
|
|
||||||
|
MIN_OPTIMUM_VERSION = "1.5.0"
|
||||||
|
|
||||||
ENCODER_DECODER_MODELS = ["vision-encoder-decoder"]
|
ENCODER_DECODER_MODELS = ["vision-encoder-decoder"]
|
||||||
|
|
||||||
|
|
||||||
def main():
|
def export_with_optimum(args):
|
||||||
parser = ArgumentParser("Hugging Face Transformers ONNX exporter")
|
if is_optimum_available():
|
||||||
parser.add_argument(
|
from optimum.version import __version__ as optimum_version
|
||||||
"-m", "--model", type=str, required=True, help="Model ID on huggingface.co or path on disk to load model from."
|
|
||||||
)
|
parsed_optimum_version = version.parse(optimum_version)
|
||||||
parser.add_argument(
|
if parsed_optimum_version < version.parse(MIN_OPTIMUM_VERSION):
|
||||||
"--feature",
|
raise RuntimeError(
|
||||||
choices=list(FeaturesManager.AVAILABLE_FEATURES),
|
f"transformers.onnx requires optimum >= {MIN_OPTIMUM_VERSION} but {optimum_version} is installed. You "
|
||||||
default="default",
|
"can upgrade optimum by running: pip install -U optimum[exporters]"
|
||||||
help="The type of features to export the model with.",
|
)
|
||||||
)
|
else:
|
||||||
parser.add_argument("--opset", type=int, default=None, help="ONNX opset version to export the model with.")
|
raise RuntimeError(
|
||||||
parser.add_argument(
|
"transformers.onnx requires optimum to run, you can install the library by running: pip install "
|
||||||
"--atol", type=float, default=None, help="Absolute difference tolerance when validating the model."
|
"optimum[exporters]"
|
||||||
)
|
)
|
||||||
parser.add_argument(
|
cmd_line = [
|
||||||
"--framework",
|
sys.executable,
|
||||||
type=str,
|
"-m",
|
||||||
choices=["pt", "tf"],
|
"optimum.exporters.onnx",
|
||||||
default=None,
|
f"--model {args.model}",
|
||||||
help=(
|
f"--task {args.feature}",
|
||||||
"The framework to use for the ONNX export."
|
f"--framework {args.framework}" if args.framework is not None else "",
|
||||||
" If not provided, will attempt to use the local checkpoint's original framework"
|
f"{args.output}",
|
||||||
" or what is available in the environment."
|
]
|
||||||
),
|
proc = subprocess.Popen(" ".join(cmd_line), stdout=subprocess.PIPE, shell=True)
|
||||||
)
|
proc.wait()
|
||||||
parser.add_argument("output", type=Path, help="Path indicating where to store generated ONNX model.")
|
|
||||||
parser.add_argument("--cache_dir", type=str, default=None, help="Path indicating where to store cache.")
|
logger.info(
|
||||||
parser.add_argument(
|
"The export was done by optimum.exporters.onnx. We recommend using to use this package directly in future, as "
|
||||||
"--preprocessor",
|
"transformers.onnx is deprecated, and will be removed in v5. You can find more information here: "
|
||||||
type=str,
|
"https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model."
|
||||||
choices=["auto", "tokenizer", "feature_extractor", "processor"],
|
|
||||||
default="auto",
|
|
||||||
help="Which type of preprocessor to use. 'auto' tries to automatically detect it.",
|
|
||||||
)
|
)
|
||||||
|
|
||||||
# Retrieve CLI arguments
|
|
||||||
args = parser.parse_args()
|
def export_with_transformers(args):
|
||||||
args.output = args.output if args.output.is_file() else args.output.joinpath("model.onnx")
|
args.output = args.output if args.output.is_file() else args.output.joinpath("model.onnx")
|
||||||
|
|
||||||
if not args.output.parent.exists():
|
if not args.output.parent.exists():
|
||||||
args.output.parent.mkdir(parents=True)
|
args.output.parent.mkdir(parents=True)
|
||||||
|
|
||||||
@@ -172,6 +175,63 @@ def main():
|
|||||||
|
|
||||||
validate_model_outputs(onnx_config, preprocessor, model, args.output, onnx_outputs, args.atol)
|
validate_model_outputs(onnx_config, preprocessor, model, args.output, onnx_outputs, args.atol)
|
||||||
logger.info(f"All good, model saved at: {args.output.as_posix()}")
|
logger.info(f"All good, model saved at: {args.output.as_posix()}")
|
||||||
|
warnings.warn(
|
||||||
|
"The export was done by transformers.onnx which is deprecated and will be removed in v5. We recommend"
|
||||||
|
" using optimum.exporters.onnx in future. You can find more information here:"
|
||||||
|
" https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model.",
|
||||||
|
FutureWarning,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def main():
|
||||||
|
parser = ArgumentParser("Hugging Face Transformers ONNX exporter")
|
||||||
|
parser.add_argument(
|
||||||
|
"-m", "--model", type=str, required=True, help="Model ID on huggingface.co or path on disk to load model from."
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--feature",
|
||||||
|
default="default",
|
||||||
|
help="The type of features to export the model with.",
|
||||||
|
)
|
||||||
|
parser.add_argument("--opset", type=int, default=None, help="ONNX opset version to export the model with.")
|
||||||
|
parser.add_argument(
|
||||||
|
"--atol", type=float, default=None, help="Absolute difference tolerance when validating the model."
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--framework",
|
||||||
|
type=str,
|
||||||
|
choices=["pt", "tf"],
|
||||||
|
default=None,
|
||||||
|
help=(
|
||||||
|
"The framework to use for the ONNX export."
|
||||||
|
" If not provided, will attempt to use the local checkpoint's original framework"
|
||||||
|
" or what is available in the environment."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
parser.add_argument("output", type=Path, help="Path indicating where to store generated ONNX model.")
|
||||||
|
parser.add_argument("--cache_dir", type=str, default=None, help="Path indicating where to store cache.")
|
||||||
|
parser.add_argument(
|
||||||
|
"--preprocessor",
|
||||||
|
type=str,
|
||||||
|
choices=["auto", "tokenizer", "feature_extractor", "processor"],
|
||||||
|
default="auto",
|
||||||
|
help="Which type of preprocessor to use. 'auto' tries to automatically detect it.",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--export_with_transformers",
|
||||||
|
action="store_true",
|
||||||
|
help=(
|
||||||
|
"Whether to use transformers.onnx instead of optimum.exporters.onnx to perform the ONNX export. It can be "
|
||||||
|
"useful when exporting a model supported in transformers but not in optimum, otherwise it is not "
|
||||||
|
"recommended."
|
||||||
|
),
|
||||||
|
)
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
if args.export_with_transformers or not is_optimum_available():
|
||||||
|
export_with_transformers(args)
|
||||||
|
else:
|
||||||
|
export_with_optimum(args)
|
||||||
|
|
||||||
|
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
|
|||||||
@@ -571,6 +571,10 @@ def is_accelerate_available():
|
|||||||
return importlib.util.find_spec("accelerate") is not None
|
return importlib.util.find_spec("accelerate") is not None
|
||||||
|
|
||||||
|
|
||||||
|
def is_optimum_available():
|
||||||
|
return importlib.util.find_spec("optimum") is not None
|
||||||
|
|
||||||
|
|
||||||
def is_safetensors_available():
|
def is_safetensors_available():
|
||||||
return importlib.util.find_spec("safetensors") is not None
|
return importlib.util.find_spec("safetensors") is not None
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user