Doc styler examples (#14953)

* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
2021-12-27 19:07:46 -05:00
parent e13f72fbff
commit b5e2b183af
211 changed files with 2738 additions and 1711 deletions
--- a/docs/source/model_doc/byt5.mdx
+++ b/docs/source/model_doc/byt5.mdx
@@ -51,12 +51,14 @@ ByT5 works on raw UTF-8 bytes, so it can be used without a tokenizer:
 from transformers import T5ForConditionalGeneration
 import torch

-model = T5ForConditionalGeneration.from_pretrained('google/byt5-small')
+model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

 input_ids = torch.tensor([list("Life is like a box of chocolates.".encode("utf-8"))]) + 3  # add 3 for special tokens
-labels = torch.tensor([list("La vie est comme une boîte de chocolat.".encode("utf-8"))]) + 3  # add 3 for special tokens
+labels = (
+    torch.tensor([list("La vie est comme une boîte de chocolat.".encode("utf-8"))]) + 3
+)  # add 3 for special tokens

-loss = model(input_ids, labels=labels).loss # forward pass
+loss = model(input_ids, labels=labels).loss  # forward pass
 ```

 For batched inference and training it is however recommended to make use of the tokenizer:
@@ -64,13 +66,17 @@ For batched inference and training it is however recommended to make use of the
 ```python
 from transformers import T5ForConditionalGeneration, AutoTokenizer

-model = T5ForConditionalGeneration.from_pretrained('google/byt5-small')
-tokenizer = AutoTokenizer.from_pretrained('google/byt5-small')
+model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")
+tokenizer = AutoTokenizer.from_pretrained("google/byt5-small")

-model_inputs = tokenizer(["Life is like a box of chocolates.", "Today is Monday."], padding="longest", return_tensors="pt")
-labels = tokenizer(["La vie est comme une boîte de chocolat.", "Aujourd'hui c'est lundi."], padding="longest", return_tensors="pt").input_ids
+model_inputs = tokenizer(
+    ["Life is like a box of chocolates.", "Today is Monday."], padding="longest", return_tensors="pt"
+)
+labels = tokenizer(
+    ["La vie est comme une boîte de chocolat.", "Aujourd'hui c'est lundi."], padding="longest", return_tensors="pt"
+).input_ids

-loss = model(**model_inputs, labels=labels).loss # forward pass
+loss = model(**model_inputs, labels=labels).loss  # forward pass
 ```

 ## ByT5Tokenizer
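
The first `byt5.mdx` example in the diff above builds `input_ids` by hand: it takes the UTF-8 bytes of the text and adds 3 to each value. The following is a minimal standard-library sketch of that mapping, assuming (as the `# add 3 for special tokens` comment suggests) that the three lowest ids are reserved for special tokens. The names `SPECIAL_TOKEN_OFFSET`, `text_to_ids`, and `ids_to_text` are hypothetical helpers for illustration and are not part of `transformers`.

```python
from typing import List

# Assumption: ids 0, 1 and 2 are reserved for special tokens, which is why
# the documentation example shifts every byte value by 3.
SPECIAL_TOKEN_OFFSET = 3


def text_to_ids(text: str) -> List[int]:
    """Map each UTF-8 byte of `text` to an id by adding the offset."""
    return [byte + SPECIAL_TOKEN_OFFSET for byte in text.encode("utf-8")]


def ids_to_text(ids: List[int]) -> str:
    """Invert `text_to_ids`, skipping ids in the reserved range."""
    raw = bytes(i - SPECIAL_TOKEN_OFFSET for i in ids if i >= SPECIAL_TOKEN_OFFSET)
    return raw.decode("utf-8", errors="ignore")


ids = text_to_ids("boîte")
print(ids)               # one id per byte; "î" occupies two bytes in UTF-8
print(ids_to_text(ids))  # round-trips back to the original string
```

Because non-ASCII characters such as "î" expand to several bytes, the id sequence is usually longer than the character count, which is one reason the documentation recommends the tokenizer (with `padding="longest"`) for batched inputs.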