SaulLu ade7371a41 improve saving strategy of sentencepiece tokenizer (#15328)
* add new test

* add a feature to save the sentencepiece tokenizer model when the init file was deleted

* update marian

* update m2m_100

* fix marian

* update speech to text

* override test for layoutxlm

* fix saving bartpho

* remove hardcoded values bartpho

* special token string version

* finish bartpho

* override layoutxlm test

* add mbart

* move special tokens list

* format

* Revert "format"

This reverts commit 37a40df37903a932c2f951cbd33acb684246bae7.

* simplify list of special token strings

* Re-write `self.fairseq_tokens_to_ids` initialization logic with special tokens

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

2022-01-27 16:24:51 +01:00
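
The saving strategy named in the commit title can be illustrated with a minimal sketch: copy the original model file if it still exists, otherwise write the in-memory model back to disk. This is an assumption about the general approach, not the code from this commit. `FakeSentencePieceModel` and the standalone `save_vocabulary` function are hypothetical stand-ins; the only real API mirrored here is the `serialized_model_proto()` method that SentencePiece processors expose.

```python
import os
import shutil


class FakeSentencePieceModel:
    """Hypothetical stand-in for a loaded SentencePiece processor."""

    def __init__(self, proto: bytes):
        self._proto = proto

    def serialized_model_proto(self) -> bytes:
        # Mirrors the method of the same name on a real SentencePiece processor.
        return self._proto


def save_vocabulary(sp_model, vocab_file: str, out_vocab_file: str) -> str:
    """Save the tokenizer model even if the file it was loaded from is gone."""
    if os.path.isfile(vocab_file):
        # The original file still exists: copy it unless source and target coincide.
        if os.path.abspath(vocab_file) != os.path.abspath(out_vocab_file):
            shutil.copyfile(vocab_file, out_vocab_file)
    else:
        # The original file was deleted: re-serialize the model held in memory.
        with open(out_vocab_file, "wb") as f:
            f.write(sp_model.serialized_model_proto())
    return out_vocab_file
```

With this shape, a tokenizer loaded from a temporary file can still be saved after that file has been removed, which is the situation the "init file was deleted" item describes.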