Update ONNX doc to match the removal of --optimize argument.

Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Morgan Funtowicz
2020-08-21 10:37:09 +02:00
parent e5f452275b
commit b105f2c6b3

@@ -52,15 +52,17 @@ Below are some of the operators which can be enabled to speed up inference throu
* Skip connection LayerNormalization fusing
* FastGeLU approximation
Some of the optimizations performed by ONNX runtime can be hardware specific and thus lead to different performance
if the model is run on a machine whose hardware configuration differs from the one used to export it.
For this reason, ``convert_graph_to_onnx.py`` does not enable optimizations,
so the exported model remains portable across hardware.
Optimizations can then be enabled when loading the model through ONNX runtime for inference.
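As an illustration of enabling optimizations at load time, here is a minimal sketch using the ``onnxruntime`` Python package. The helper names (``optimized_model_path``, ``create_optimized_session``), the ``-optimized`` file naming and the CPU execution provider are assumptions made for this example, not part of ``convert_graph_to_onnx.py``.

```python
from pathlib import Path


def optimized_model_path(model_path: str) -> Path:
    """Return a sibling path for the optimized graph (hypothetical naming scheme)."""
    path = Path(model_path)
    return path.with_name(f"{path.stem}-optimized{path.suffix}")


def create_optimized_session(model_path: str, provider: str = "CPUExecutionProvider"):
    """Load an exported ONNX model with graph optimizations enabled at load time."""
    # Imported lazily so the path helper above works without onnxruntime installed.
    from onnxruntime import GraphOptimizationLevel, InferenceSession, SessionOptions

    options = SessionOptions()
    # Ask ONNX runtime to apply all graph optimizations it supports on this machine.
    options.graph_optimization_level = GraphOptimizationLevel.ORT_ENABLE_ALL
    # Optionally save the optimized graph for inspection; since some optimizations
    # are hardware specific, this file may not be portable to other machines.
    options.optimized_model_filepath = str(optimized_model_path(model_path))
    return InferenceSession(model_path, options, providers=[provider])
```

Calling ``create_optimized_session("bert-base-cased.onnx")`` on the target machine would then optimize the graph for that machine's hardware, while the exported file itself stays unoptimized and portable.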
Fortunately, you can let ONNXRuntime find all the possible optimized operators for you. Simply add ``--optimize``
when exporting your model through ``convert_graph_to_onnx.py``.
Example:
.. code-block:: bash
python convert_graph_to_onnx.py --framework <pt, tf> --model bert-base-cased --optimize bert-base-cased.onnx
.. note::
When quantization is enabled (see below), the ``convert_graph_to_onnx.py`` script enables optimizations on the model,
because quantization modifies the underlying graph, making it impossible for ONNX runtime to apply the optimizations
afterwards.
.. note::
For more information about the optimizations enabled by ONNXRuntime, please have a look at the `ONNXRuntime Github <https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/python/tools/transformers>`_.
@@ -112,8 +114,6 @@ Example of quantized BERT model export:
above command will contain the original ONNX model storing `float32` weights.
The second one, with ``-quantized`` suffix, will hold the quantized parameters.
.. note::
The quantization export gives the best performance when used in combination with ``--optimize``.
TorchScript
=======================================