Example of pad_to_multiple_of for padding and truncation guide & docstring update (#22278)

* added an example of pad_to_multiple_of

* make style

* addressed feedback
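
For readers unfamiliar with the option, here is a minimal sketch of the length arithmetic that `pad_to_multiple_of` implies when combined with `padding=True`. It is an illustration only, not the library's implementation: `pad_batch` and `pad_id` are hypothetical names, and the token ids are made up.

```python
def pad_batch(batch, pad_id=0, pad_to_multiple_of=None):
    """Pad each sequence to the longest length in the batch, rounded up
    to a multiple of `pad_to_multiple_of` when one is given.

    Hypothetical helper for illustration; not part of any library.
    """
    target = max(len(seq) for seq in batch)
    if pad_to_multiple_of is not None and target % pad_to_multiple_of != 0:
        # Round the target length up to the next multiple.
        target = (target // pad_to_multiple_of + 1) * pad_to_multiple_of
    return [seq + [pad_id] * (target - len(seq)) for seq in batch]


# Made-up token ids: the longest sequence has 5 tokens.
batch = [[101, 7592, 102], [101, 7592, 2088, 999, 102]]
padded = pad_batch(batch, pad_to_multiple_of=8)
print([len(seq) for seq in padded])  # [8, 8]
```

A length that is already a multiple of the value is left unchanged, so a batch whose longest sequence has 8 tokens stays at 8.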
Maria Khalusova
2023-03-20 14:18:55 -04:00
committed by GitHub
parent fb0a38b4f2
commit 7bd8650512
2 changed files with 4 additions and 2 deletions

@@ -50,6 +50,7 @@ The following table summarizes the recommended way to setup padding and truncati
| | | `tokenizer(batch_sentences, padding='longest')` |
| | padding to max model input length | `tokenizer(batch_sentences, padding='max_length')` |
| | padding to specific length | `tokenizer(batch_sentences, padding='max_length', max_length=42)` |
| | padding to a multiple of a value | `tokenizer(batch_sentences, padding=True, pad_to_multiple_of=8)` |
| truncation to max model input length | no padding | `tokenizer(batch_sentences, truncation=True)` or |
| | | `tokenizer(batch_sentences, truncation=STRATEGY)` |
| | padding to max sequence in batch | `tokenizer(batch_sentences, padding=True, truncation=True)` or |