Fix TF s2s models (#9478)

* Fix Seq2Seq models for serving

* Apply style

* Fix lonfgormer

* Fix mBart/Pegasus/Blenderbot

* Apply style

* Add a main intermediate layer

* Apply style

* Remove import

* Apply tf.function to Longformer

* Fix utils check_copy

* Update S2S template

* Fix BART + Blenderbot

* Fix BlenderbotSmall

* Fix BlenderbotSmall

* Fix BlenderbotSmall

* Fix MBart

* Fix Marian

* Fix Pegasus + template

* Apply style

* Fix common attributes test

* Forgot to fix the LED test

* Apply Patrick's comment on LED Decoder
This commit is contained in:
Julien Plu
2021-01-21 17:03:29 +01:00
committed by GitHub
parent 23e5a36ee6
commit a7dabfb3d1
20 changed files with 1025 additions and 666 deletions

View File

@@ -352,30 +352,20 @@ class TFLEDModelTest(TFModelTesterMixin, unittest.TestCase):
self.assertEqual(model.config.output_hidden_states, True)
check_encoder_attentions_output(outputs)
@slow
def test_saved_model_with_attentions_output(self):
# longformer has special attentions which are not
# compatible in graph mode
pass
@slow
def test_saved_model_with_hidden_states_output(self):
# TODO(JPLU, PVP) this test should pass!!! PVP:
# IMO there is a problem with the signature check.
# Test passes for TFLEDModel, but not for TFLEDForConditionalGeneration
# IMO the reason is that the tensor variable name cannot be changed
# from decoder_input_ids -> input_ids, which poses a BIG restrictions
pass
@slow
def test_saved_model_creation_extended(self):
# All the tests about building a saved model
# fails because the Seq2Seq models uses model in a model
# as a layer.
# TODO(JPLU) WARNING: NEED TO BE FIXED ASAP
pass
def test_saved_model_creation(self):
# This test is too long (>30sec) and makes fail the CI
pass
def test_saved_model_with_attentions_output(self):
# This test don't pass because of the error:
# condition [13,8,4,5], then [13,8,4,5], and else [13,8,4,6] must be broadcastable
# This occurs line 323 in modeling_tf_led.py because the condition line 255
# returns a tensor of shape
# [batch_size, seq_len, self.num_heads, self.one_sided_attn_window_size * 2 + 2]
# if is_global_attn is True and a tensor of shape
# [batch_size, seq_len, self.num_heads, self.one_sided_attn_window_size * 2 + 1]
# This is due to the tf.concat call line 703 that adds one dimension
# Need to check with PVP how to properly fix this
pass