[Tokenizer Utils Base] Make pad function more flexible (#9928)

* change tokenizer requirement

* split line

* Correct typo from list to str

* improve style

* make other function pretty as well

* add comment

* correct typo

* add new test

* pass tests for tok without padding token

* Apply suggestions from code review
Patrick von Platen
2021-02-02 10:35:27 +03:00
committed by GitHub
parent d1b14c9b54
commit 538b3b4607
40 changed files with 187 additions and 107 deletions
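
The tokenizer diffs in this commit put "input_ids" first in model_input_names. Given the commit title, one plausible reading is that padding now looks up the main input under the first name in that list rather than under a fixed key. The sketch below illustrates that idea only. `pad_batch` and its parameters are hypothetical names, not the library's implementation.

```python
from typing import Dict, List


def pad_batch(
    features: List[Dict[str, List[int]]],
    model_input_names: List[str],
    pad_token_id: int = 0,
) -> Dict[str, List[List[int]]]:
    """Right-pad every feature to the longest sequence in the batch (hypothetical helper)."""
    # Assumption: the first entry of model_input_names names the main input.
    main_input = model_input_names[0]
    if any(main_input not in f for f in features):
        raise ValueError(f"Each feature must include the key {main_input!r}")

    max_len = max(len(f[main_input]) for f in features)
    batch: Dict[str, List[List[int]]] = {main_input: [], "attention_mask": []}
    for f in features:
        ids = f[main_input]
        n_pad = max_len - len(ids)
        # Padded positions get pad_token_id and a 0 in the attention mask.
        batch[main_input].append(ids + [pad_token_id] * n_pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
    return batch


batch = pad_batch(
    [{"input_ids": [5, 6, 7]}, {"input_ids": [8]}],
    model_input_names=["input_ids", "attention_mask"],
)
```

In this sketch, passing the old list `["attention_mask"]` with the same features raises `ValueError`, because the helper would then look for "attention_mask" as the main input.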

@@ -92,7 +92,7 @@ class BlenderbotSmallTokenizer(PreTrainedTokenizer):
         },
     }
     max_model_input_sizes = {"facebook/blenderbot_small-90M": 512}
-    model_input_names = ["attention_mask"]
+    model_input_names = ["input_ids", "attention_mask"]
     def __init__(
         self,