Add token type ids to CodeGenTokenizer (#29265)
* Add create token type ids to CodeGenTokenizer * Fix inconsistent length of token type ids * Format source codes * Fix inconsistent order of methods * Update docstring * add test_tokenizer_integration test * Format source codes * Add `copied from` comment to CodeGenTokenizerFast * Add doc of create_token_type_ids_from_sequences * Make return_token_type_ids False by default * Make test_tokenizer_integration as slow test * Add return_token_type_ids to tokenizer init arg * Add test for tokenizer's init return_token_type_ids * Format source codes
This commit is contained in:
@@ -72,6 +72,7 @@ hello_world()
|
||||
## CodeGenTokenizer
|
||||
|
||||
[[autodoc]] CodeGenTokenizer
|
||||
- create_token_type_ids_from_sequences
|
||||
- save_vocabulary
|
||||
|
||||
## CodeGenTokenizerFast
|
||||
|
||||
Reference in New Issue
Block a user