Pytorch - Lazy initialization of models (#11471)

* lazy_init_weights * remove ipdb * save int * add necessary code * remove unnecessary utils * Update src/transformers/models/t5/modeling_t5.py * clean * add tests * correct * finish tests * finish tests * fix some more tests * fix xlnet & transfo-xl * fix more tests * make sure tests are independent * fix tests more * finist tests * final touches * Update src/transformers/modeling_utils.py * Apply suggestions from code review * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * clean tests * give arg positive name * add more mock weights to xlnet Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
2021-05-05 17:22:20 +02:00
parent 8fa8e19429
commit 3e3e41ae20
7 changed files with 369 additions and 117 deletions
--- a/examples/pytorch/test_examples.py
+++ b/examples/pytorch/test_examples.py
@@ -195,6 +195,7 @@ class ExamplesTests(TestCasePlus):
            --per_device_train_batch_size=2
            --per_device_eval_batch_size=2
            --num_train_epochs={epochs}
+            --seed 7
        """.split()

        if torch_device != "cuda":