[PyTorch Bart] Split Bart into different models (#9343)
* first try * remove old template * finish bart * finish mbart * delete unnecessary line * init pegasus * save intermediate * correct pegasus * finish pegasus * remove cookie cutter leftover * add marian * finish blenderbot * replace in file * correctly split blenderbot * delete "old" folder * correct "add statement" * adapt config for tf comp * correct configs for tf * remove ipdb * fix more stuff * fix mbart * push pegasus fix * fix mbart * more fixes * fix research projects code * finish docs for bart, mbart, and marian * delete unnecessary file * correct attn typo * correct configs * remove pegasus for seq class * correct peg docs * correct peg docs * finish configs * further improve docs * add copied from statements to mbart * fix copied from in mbart * add copy statements to marian * add copied from to marian * add pegasus copied from * finish pegasus * finish copied from * Apply suggestions from code review * make style * backward comp blenderbot * apply lysandres and sylvains suggestions * apply suggestions * push last fixes * fix docs * fix tok tests * fix imports code style * fix doc
This commit is contained in:
committed by
GitHub
parent
4eec5d0cf6
commit
eef66035a2
@@ -33,7 +33,6 @@ Implementation Notes
|
||||
- The modeling code is the same as :class:`~transformers.BartForConditionalGeneration` with a few minor modifications:
|
||||
|
||||
- static (sinusoid) positional embeddings (:obj:`MarianConfig.static_position_embeddings=True`)
|
||||
- a new final_logits_bias (:obj:`MarianConfig.add_bias_logits=True`)
|
||||
- no layernorm_embedding (:obj:`MarianConfig.normalize_embedding=False`)
|
||||
- the model starts generating with :obj:`pad_token_id` (which has 0 as a token_embedding) as the prefix (Bart uses
|
||||
:obj:`<s/>`),
|
||||
@@ -56,9 +55,10 @@ Examples
|
||||
|
||||
- Since Marian models are smaller than many other translation models available in the library, they can be useful for
|
||||
fine-tuning experiments and integration tests.
|
||||
- :prefix_link:`Fine-tune on TPU <examples/seq2seq/builtin_trainer/train_distil_marian_enro_tpu.sh>`
|
||||
- :prefix_link:`Fine-tune on GPU <examples/seq2seq/builtin_trainer/train_distil_marian_enro.sh>`
|
||||
- :prefix_link:`Fine-tune on GPU with pytorch-lightning <examples/seq2seq/distil_marian_no_teacher.sh>`
|
||||
- `Fine-tune on GPU
|
||||
<https://github.com/huggingface/transformers/blob/master/examples/research_projects/seq2seq-distillation/train_distil_marian_enro_teacher.sh>`__
|
||||
- `Fine-tune on GPU with pytorch-lightning
|
||||
<https://github.com/huggingface/transformers/blob/master/examples/research_projects/seq2seq-distillation/train_distil_marian_no_teacher.sh>`__
|
||||
|
||||
Multilingual Models
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
@@ -179,10 +179,18 @@ MarianTokenizer
|
||||
:members: prepare_seq2seq_batch
|
||||
|
||||
|
||||
MarianModel
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.MarianModel
|
||||
:members: forward
|
||||
|
||||
|
||||
MarianMTModel
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. autoclass:: transformers.MarianMTModel
|
||||
:members: forward
|
||||
|
||||
|
||||
TFMarianMTModel
|
||||
|
||||
Reference in New Issue
Block a user