Make Flax GPT2 working with cross attention (#13008)

* make flax gpt2 working with cross attention

* Remove encoder->decoder projection layer

* A draft (incomplete) for FlaxEncoderDecoderModel

* Add the method from_encoder_decoder_pretrained + the docstrings

* Fix the mistakes of using EncoderDecoderModel

* Fix style

* Add FlaxEncoderDecoderModel to the library

* Fix cyclic imports

* Add FlaxEncoderDecoderModel to modeling_flax_auto.py

* Remove question comments

* add tests for FlaxEncoderDecoderModel

* add flax_encoder_decoder to the lists of ignored entries in check_repo.py

* fix missing required positional arguments

* Remove **kwargs when creating FlaxEncoderDecoderModel in from_encoder_decoder_pretrained()

Also fix generation eos/pad tokens issue

* Fix: Use sequences from the generated_output

* Change a check from assert to raise ValueError

* Fix examples and token ids issues

* Fix missing all_cross_attentions when outputting tuple in modeling_gpt2

* Remove the changes in configuration docstrings.

* allow for bert 2 gpt2

* make fix-copies

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Change remaining examples to bert2gpt2

* Change the test to Bert2GPT2

* Fix examples

* Fix import

* Fix unpack bug

* Rename to FlaxEncoderDecoderModelTest and change the test to bert2gpt2

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix: NotImplentedError -> NotImplementedError

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* up

* finalize

Co-authored-by: ydshieh <ydshieh@user.noreply>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
This commit is contained in:
Yih-Dar
2021-08-23 17:57:29 +02:00
committed by GitHub
parent 7223844df9
commit 2e20c0f34a
15 changed files with 1434 additions and 45 deletions

View File

@@ -40,3 +40,10 @@ EncoderDecoderModel
.. autoclass:: transformers.EncoderDecoderModel
:members: forward, from_encoder_decoder_pretrained
FlaxEncoderDecoderModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. autoclass:: transformers.FlaxEncoderDecoderModel
:members: __call__, from_encoder_decoder_pretrained