Fix typo in example code (#25583)

`lang_code_to_id("en_XX")` => `lang_code_to_id["en_XX"]`

`lang_code_to_id` is a dict, so it is indexed with square brackets, not called.
Amélie T. Reymond
2023-08-17 22:58:59 -07:00
committed by GitHub
parent 4a27c13f1e
commit 659ab0423e


@@ -171,9 +171,9 @@ Tokenize the text:
 MBart forces the target language id as the first generated token to translate to the target language. Set the `forced_bos_token_id` to `en` in the `generate` method to translate to English:
 ```py
->>> generated_tokens = model.generate(**encoded_en, forced_bos_token_id=tokenizer.lang_code_to_id("en_XX"))
+>>> generated_tokens = model.generate(**encoded_en, forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
 >>> tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
 "Don't interfere with the wizard's affairs, because they are subtle, will soon get angry."
 ```
 If you are using the `facebook/mbart-large-50-many-to-one-mmt` checkpoint, you don't need to force the target language id as the first generated token otherwise the usage is the same.
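
Why the change matters can be shown without loading a model. The sketch below uses a plain dict as a stand-in for `tokenizer.lang_code_to_id`; the ids in it are placeholders chosen for illustration, not values from a real checkpoint. Calling the dict fails with a `TypeError`, while subscripting it returns the id that would be passed as `forced_bos_token_id`.

```python
# Stand-in for tokenizer.lang_code_to_id. The integer ids are placeholders
# (assumptions for this sketch), not the ids of any real MBart checkpoint.
lang_code_to_id = {"en_XX": 1, "fi_FI": 2}

# Old form from the docs: a dict is not callable, so this raises TypeError.
try:
    lang_code_to_id("en_XX")
    call_error = None
except TypeError as exc:
    call_error = exc

# Corrected form: subscripting looks the language code up in the dict.
forced_bos_token_id = lang_code_to_id["en_XX"]

print(type(call_error).__name__)
print(forced_bos_token_id)
```

With a real tokenizer, the same subscript expression is what the corrected example passes to `generate`.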