M2M100 support for ONNX export (#15193)

* Add M2M100 support for ONNX export
* Delete useless imports
* Add M2M100 to tests
* Fix protobuf issue
2022-03-02 04:03:14 -05:00
parent d1a29078c0
commit 4bfe75bd08
6 changed files with 177 additions and 29 deletions
--- a/docs/source/serialization.mdx
+++ b/docs/source/serialization.mdx
@@ -55,6 +55,7 @@ Ready-made configurations include the following architectures:
 - GPT Neo
 - I-BERT
 - LayoutLM
+- M2M100
 - Marian
 - mBART
 - OpenAI GPT-2
@@ -584,12 +585,12 @@ traced_model(tokens_tensor, segments_tensors)

 ### Deploying HuggingFace TorchScript models on AWS using the Neuron SDK

-AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/) 
-instance family for low cost, high performance machine learning inference in the cloud. 
-The Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator, 
-specializing in deep learning inferencing workloads. 
-[AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#) 
-is the SDK for Inferentia that supports tracing and optimizing transformers models for 
+AWS introduced the [Amazon EC2 Inf1](https://aws.amazon.com/ec2/instance-types/inf1/)
+instance family for low cost, high performance machine learning inference in the cloud.
+The Inf1 instances are powered by the AWS Inferentia chip, a custom-built hardware accelerator,
+specializing in deep learning inferencing workloads.
+[AWS Neuron](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/#)
+is the SDK for Inferentia that supports tracing and optimizing transformers models for
 deployment on Inf1. The Neuron SDK provides:


@@ -600,13 +601,13 @@ deployment on Inf1. The Neuron SDK provides:

 #### Implications

-Transformers Models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/master/model_doc/bert) 
+Transformers Models based on the [BERT (Bidirectional Encoder Representations from Transformers)](https://huggingface.co/docs/transformers/master/model_doc/bert)
 architecture, or its variants such as [distilBERT](https://huggingface.co/docs/transformers/master/model_doc/distilbert)
- and [roBERTa](https://huggingface.co/docs/transformers/master/model_doc/roberta) 
- will run best on Inf1 for non-generative tasks such as Extractive Question Answering, 
+ and [roBERTa](https://huggingface.co/docs/transformers/master/model_doc/roberta)
+ will run best on Inf1 for non-generative tasks such as Extractive Question Answering,
 Sequence Classification, Token Classification. Alternatively, text generation
-tasks can be adapted to run on Inf1, according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html). 
-More information about models that can be converted out of the box on Inferentia can be 
+tasks can be adapted to run on Inf1, according to this [AWS Neuron MarianMT tutorial](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/src/examples/pytorch/transformers-marianmt.html).
+More information about models that can be converted out of the box on Inferentia can be
 found in the [Model Architecture Fit section of the Neuron documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/neuron-guide/models/models-inferentia.html#models-inferentia).

 #### Dependencies
@@ -618,8 +619,8 @@ Using AWS Neuron to convert models requires the following dependencies and envir

 #### Converting a Model for AWS Neuron

-Using the same script as in [Using TorchScript in Python](https://huggingface.co/docs/transformers/master/en/serialization#using-torchscript-in-python) 
-to trace a "BertModel", you import `torch.neuron` framework extension to access 
+Using the same script as in [Using TorchScript in Python](https://huggingface.co/docs/transformers/master/en/serialization#using-torchscript-in-python)
+to trace a "BertModel", you import `torch.neuron` framework extension to access
 the components of the Neuron SDK through a Python API.

 ```python
@@ -643,5 +644,5 @@ torch.neuron.trace(model, [token_tensor, segments_tensors])

 This change enables Neuron SDK to trace the model and optimize it to run in Inf1 instances.

-To learn more about AWS Neuron SDK features, tools, example tutorials and latest updates, 
+To learn more about AWS Neuron SDK features, tools, example tutorials and latest updates,
 please see the [AWS NeuronSDK documentation](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/index.html).