[Umt5] Add google's umt5 to transformers (#24477)
* add tokenization template
* update conversion script
* update modeling code
* update
* update convert checkpoint
* update modeling
* revert changes on convert script
* new conversion script for new format
* correct position bias
* cleaning a bit
* Credit co authors
Co-authored-by: agemagician <ahmed.elnaggar@tum.de>
Co-authored-by: stefan-it <>
* styling
* Add docq
* fix copies
* add co author
* Other Author
* Merge branch 'main' of https://github.com/huggingface/transformers into add-umt5
* add testing
* nit
* Update docs/source/en/model_doc/umt5.mdx
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* fix t5
* actual fix?
* revert wrong changes
* remove
* update test
* more fixes
* revert some changes
* add SPIECE_UNDERLINE
* add a common example
* update
* fix copies
* revert changes on t5 conversion script
* revert bytefallback changes since there was no addition yet
* fixup
* fixup
* ignore umt5 custom testing folder
* fix readmes
* revert T5 changes
* same outputs
* fixup
* update example
* Apply suggestions from code review
* style
* draft addition of all new files
* current update
* fix attention and stuff
* finish refactoring
* auto config
* fixup
* more nits
* add umt5 to init
* use md format
* Update README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* revert changes on mt5
* revert mt4 changes
* update test
* more fixes
* add to mapping
* fix-copies
* fix copies
* fix retain grad
* fix some tests
* nits
* done
* Update src/transformers/models/umt5/modeling_umt5.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/en/model_doc/umt5.md
* Update src/transformers/models/umt5/__init__.py
* Update docs/source/en/model_doc/umt5.md
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* Update src/transformers/models/umt5/modeling_umt5.py
* update conversion script + use google checkpoints
* nits
* update test and modelling
* stash slow convert
* update fixupd
* don't change slow

---------

Co-authored-by: stefan-it <>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
@@ -46,6 +46,7 @@ PRIVATE_MODELS = [
     "RealmBertModel",
     "T5Stack",
     "MT5Stack",
+    "UMT5Stack",
     "SwitchTransformersStack",
     "TFDPRSpanPredictor",
     "MaskFormerSwinModel",
@@ -61,6 +62,7 @@ IGNORE_NON_TESTED = PRIVATE_MODELS.copy() + [
     "InstructBlipQFormerModel",  # Building part of bigger (tested) model.
     "NllbMoeDecoder",
     "NllbMoeEncoder",
+    "UMT5EncoderModel",  # Building part of bigger (tested) model.
     "LlamaDecoder",  # Building part of bigger (tested) model.
     "Blip2QFormerModel",  # Building part of bigger (tested) model.
     "DetaEncoder",  # Building part of bigger (tested) model.

@@ -110,6 +110,7 @@ UNCONVERTIBLE_MODEL_ARCHITECTURES = {
     "MaskFormerSwinBackbone",
     "MT5Model",
     "MT5ForConditionalGeneration",
+    "UMT5ForConditionalGeneration",
     "TFMT5ForConditionalGeneration",
     "TFMT5Model",
     "QDQBertForSequenceClassification",