Add mBART-50 (#10154)

* add tokenizer for mBART-50

* update tokenizers

* make src_lang and tgt_lang optional

* update tokenizer test

* add setter

* update docs

* update conversion script

* update docs

* update conversion script

* update tokenizer

* update test

* update docs

* doc

* address Sylvain's suggestions

* fix test

* fix formatting

* nits
Suraj Patil
2021-02-15 20:58:54 +05:30
committed by GitHub
parent 570218878a
commit 6fc940ed09
13 changed files with 1008 additions and 34 deletions

@@ -164,6 +164,15 @@ class LxmertTokenizerFast:
        requires_tokenizers(self)


class MBart50TokenizerFast:
    def __init__(self, *args, **kwargs):
        requires_tokenizers(self)

    @classmethod
    def from_pretrained(self, *args, **kwargs):
        requires_tokenizers(self)


class MBartTokenizerFast:
    def __init__(self, *args, **kwargs):
        requires_tokenizers(self)
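
The placeholder classes in this hunk exist so that importing the fast tokenizer names still succeeds when the optional `tokenizers` backend is not installed, and an error is raised only on use. A minimal, self-contained sketch of that pattern follows. `requires_backend` and `DummyFastTokenizer` are hypothetical names invented for illustration; the real `requires_tokenizers` helper is not shown in this diff and its implementation may differ.

```python
import importlib.util


def requires_backend(obj, backend: str) -> None:
    """Raise ImportError if the optional package `backend` cannot be found.

    Hypothetical stand-in for the `requires_tokenizers` helper called in the diff.
    """
    if importlib.util.find_spec(backend) is None:
        # Classes have __name__; instances fall back to their class name.
        name = getattr(obj, "__name__", type(obj).__name__)
        raise ImportError(f"{name} requires the `{backend}` library, which was not found.")


class DummyFastTokenizer:
    """Illustrative placeholder that is importable even without the backend."""

    backend = "tokenizers"  # assumed name of the optional dependency

    def __init__(self, *args, **kwargs):
        # Fails at construction time, not at import time.
        requires_backend(self, self.backend)

    @classmethod
    def from_pretrained(cls, *args, **kwargs):
        # The usual entry point is guarded in the same way.
        requires_backend(cls, cls.backend)
```

With a layout like this, `from package import DummyFastTokenizer` never fails by itself; the user sees a descriptive `ImportError` only when constructing the object or calling `from_pretrained` without the backend installed.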