HuggingFace_transformer/tests/models
Joshua Lochner 6e2d04e429 Fix slow GemmaTokenizer and improve SPM slow -> fast conversion process (#32191)
* Remove user-defined tokens which can be obtained through merges

* Remove debug line

* formatting

* Refactor spm slow -> fast converter

* revert unnecessary refactor

* set comprehension

* remove test files

* Use `vocab_scores`

* Always replace spiece underline with space in decode

* we no longer need token filtering

* Add save fast load slow unit test

* Remove tokenizers version check

* Remove duplicate code

* Make `<start_of_turn>` and `<end_of_turn>` special tokens

* Bias merge priority with length if score is the same

* Add unit test for merge priority

* CI
2024-07-30 23:36:38 +02:00
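
The "Bias merge priority with length if score is the same" item above can be illustrated with a minimal sketch. This is not the converter code from the commit: `rank_merges` is a hypothetical helper, the input is assumed to be a plain mapping from token to score, and the tie-break direction (longer pieces first) is an assumption, since the commit message does not state it.

```python
from typing import Dict, List, Tuple


def rank_merges(vocab_scores: Dict[str, float]) -> List[Tuple[str, str]]:
    """Hypothetical helper: order candidate merges by score, then by piece length."""
    candidates = []
    for token, score in vocab_scores.items():
        # Every split of a token into two in-vocabulary pieces is a candidate merge.
        for i in range(1, len(token)):
            left, right = token[:i], token[i:]
            if left in vocab_scores and right in vocab_scores:
                candidates.append((left, right, score))
    # Higher score first; on equal scores, assume the longer left piece
    # (then the longer right piece) takes priority.
    candidates.sort(key=lambda c: (c[2], len(c[0]), len(c[1])), reverse=True)
    return [(left, right) for left, right, _ in candidates]


if __name__ == "__main__":
    # Toy scores, made up for illustration only.
    toy = {"a": 0.0, "b": 0.0, "c": 0.0, "ab": -1.0, "bc": -1.0, "abc": -2.0}
    print(rank_merges(toy))
```

Without the length terms in the sort key, the two ways of building `abc` would share a score and their relative order would depend only on iteration order, which is presumably the kind of ambiguity the tie-break is meant to remove.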