Files
HuggingFace_transformer/tests
Tom Aarsen 32e0db8a69 [tokenizers] Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer (#35593)
* Ensure that add_prefix_space is propagated to backend_tokenizer.pre_tokenizer

in PreTrainedTokenizerFast, rather than relying on subclasses to take care of this.

* Simplify setting self.add_prefix_space, ensure pre_tok exists

* Wrap in try-except to catch 'Custom PreTokenizer cannot be serialized'

862d1a346a/bindings/python/src/pre_tokenizers.rs (L672) produces the Exception. They're triggered by the roformer tests, as the RoFormerTokenizerFast uses a custom PreTokenizer.

* Propagate add_prefix_space in T5TokenizerFast to superclass
2025-01-09 17:46:50 +01:00
..
2024-11-15 22:28:06 +01:00
2025-01-09 11:09:09 +01:00
2024-12-18 16:53:39 +01:00