Move TF building to an actual build() method (#23760)

* A fun new PR where I break the entire codebase again

* Handle cross-attention

* Move calls to model(model.dummy_inputs) to the new build() method

* Seeing what fails with the build context thing

* make fix-copies

* Let's see what fails with new build methods

* Fix the pytorch crossload build calls

* Fix the overridden build methods in vision_text_dual_encoder

* Make sure all our build methods set self.built or call super().build(), which also sets it

* make fix-copies

* Remove finished TODO

* Tentatively remove unneeded (?) line

* Transpose b in deberta correctly and remove unused threading local

* Get rid of build_with_dummies and all it stands for

* Rollback some changes to TF-PT crossloading

* Correctly call super().build()
Matt
2023-06-06 18:30:51 +01:00
committed by GitHub
parent cbf6bc2350
commit 4a55e47877
27 changed files with 159 additions and 138 deletions
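The commit message above describes two conventions: dummy-input forward passes move into `build()`, and every overridden `build()` either sets `self.built` itself or calls `super().build()`, which sets it. A minimal sketch of that pattern follows. It uses plain-Python stand-ins rather than TensorFlow, and the names `SketchLayer`, `SketchModel`, and `SketchDualEncoder` are hypothetical; this is an illustration under those assumptions, not the code from this commit.

```python
# Illustrative stand-ins only: these classes are hypothetical and do not
# reproduce the real Keras or transformers implementations.


class SketchLayer:
    """Minimal stand-in for a Keras-style layer with lazy weight creation."""

    def __init__(self):
        self.built = False
        self.weights = {}

    def build(self, input_shape=None):
        # The base build() only marks the layer as built.
        self.built = True

    def __call__(self, inputs):
        # Calling an unbuilt layer triggers build() first.
        if not self.built:
            self.build()
        return self.call(inputs)

    def call(self, inputs):
        return inputs


class SketchModel(SketchLayer):
    """Hypothetical model whose build() runs dummy inputs through itself."""

    dummy_inputs = [1.0, 2.0, 3.0]

    def build(self, input_shape=None):
        if self.built:
            return  # already built, nothing to do
        # Assumption of this sketch: set the flag before the forward pass
        # so the pass does not re-enter build() through __call__.
        self.built = True
        self(self.dummy_inputs)

    def call(self, inputs):
        # Create the weight lazily on the first forward pass.
        if "scale" not in self.weights:
            self.weights["scale"] = 2.0
        return [x * self.weights["scale"] for x in inputs]


class SketchDualEncoder(SketchModel):
    """Hypothetical subclass that overrides build() and defers to super()."""

    def build(self, input_shape=None):
        self.extra_ready = True
        super().build(input_shape)  # also sets self.built


model = SketchModel()
model.build()  # replaces the older pattern model(model.dummy_inputs)
```

In this sketch, an override that neither set `self.built` nor called `super().build()` would leave the flag unset, so every later call would run `build()` again. That is the failure mode the "Make sure all our build methods set self.built" item guards against.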

@@ -1070,9 +1070,9 @@ class TFEncoderDecoderModelSaveLoadTests(unittest.TestCase):
 # create two random BERT models for bert2bert & initialize weights (+cross_attention weights)
 encoder = TFBertModel(config.encoder)
-encoder(encoder.dummy_inputs)
+encoder.build()
 decoder = TFBertLMHeadModel(config.decoder)
-decoder(decoder.dummy_inputs)
+decoder.build()
 encoder_decoder_orig = TFEncoderDecoderModel(encoder=encoder, decoder=decoder)