[Docs] Improve docs for MMS loading of other languages (#24292)

* Improve docs
* Apply suggestions from code review
* upload readme
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-06-15 14:29:32 +02:00
parent e6122c3f40
commit 604a21b1e6
1 changed files with 43 additions and 3 deletions
--- a/docs/source/en/model_doc/mms.mdx
+++ b/docs/source/en/model_doc/mms.mdx
@@ -44,11 +44,51 @@ MMS's architecture is based on the Wav2Vec2 model, so one can refer to [Wav2Vec2
 The original code can be found [here](https://github.com/facebookresearch/fairseq/tree/main/examples/mms).
 ## Loading
 By default, MMS loads adapter weights for English. If you want to load adapter weights for another language,
 make sure to specify `target_lang=<your-chosen-target-lang>` as well as `ignore_mismatched_sizes=True`.
 The `ignore_mismatched_sizes=True` keyword has to be passed to allow the language model head to be resized according
 to the vocabulary of the specified language.
 Similarly, the processor should be loaded with the same target language:
 ```py
 from transformers import Wav2Vec2ForCTC, AutoProcessor
 model_id = "facebook/mms-1b-all"
 target_lang = "fra"
 processor = AutoProcessor.from_pretrained(model_id, target_lang=target_lang)
 model = Wav2Vec2ForCTC.from_pretrained(model_id, target_lang=target_lang, ignore_mismatched_sizes=True)
 ```
 <Tip>
 You can safely ignore a warning such as:
 ```text
 Some weights of Wav2Vec2ForCTC were not initialized from the model checkpoint at facebook/mms-1b-all and are newly initialized because the shapes did not match:
 - lm_head.bias: found shape torch.Size([154]) in the checkpoint and torch.Size([314]) in the model instantiated
 - lm_head.weight: found shape torch.Size([154, 1280]) in the checkpoint and torch.Size([314, 1280]) in the model instantiated
 You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
 ```
 </Tip>
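To make the warning above easier to read, here is a minimal sketch in plain Python of the kind of shape comparison it reports. This is an illustration only, not the Transformers implementation: `find_mismatched_shapes` is a hypothetical helper, and the shapes are copied from the example warning rather than read from a real checkpoint.

```python
# Illustrative only: a plain-Python stand-in for a shape comparison.
# `find_mismatched_shapes` is a hypothetical helper, not a Transformers API.


def find_mismatched_shapes(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatched = {}
    for name, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and ckpt_shape != model_shape:
            mismatched[name] = (ckpt_shape, model_shape)
    return mismatched


# Shapes copied from the example warning: the head stored in the checkpoint
# has 154 rows, while the head of the instantiated model has 314.
checkpoint_shapes = {
    "lm_head.weight": (154, 1280),
    "lm_head.bias": (154,),
}
model_shapes = {
    "lm_head.weight": (314, 1280),
    "lm_head.bias": (314,),
}

mismatched = find_mismatched_shapes(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, model_shape) in mismatched.items():
    print(f"{name}: checkpoint {ckpt_shape} vs. model {model_shape}")
```

Only the two `lm_head` parameters differ, which matches the entries listed in the warning. All other parameters have matching shapes and are loaded from the checkpoint as usual.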
 If you want to use the ASR pipeline, you can load your chosen target language as follows:
 ```py
 from transformers import pipeline
 model_id = "facebook/mms-1b-all"
 target_lang = "fra"
 pipe = pipeline(model=model_id, model_kwargs={"target_lang": target_lang, "ignore_mismatched_sizes": True})
 ```
 ## Inference
-By default MMS loads adapter weights for English, but those can be easily switched out for another language.
+Next, let's look at how we can run inference with MMS and change adapter layers after having called [`~PreTrainedModel.from_pretrained`].
 Let's look at an example.
 First, we load audio data in different languages using the [Datasets](https://github.com/huggingface/datasets) library.
 ```py