Feed forward chunking (#6024)

* Chunked feed forward for Bert

This is an initial implementation to test feed forward chunking for BERT.
It will need further modifications depending on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
Pradhy729
2020-08-11 00:12:45 -07:00
committed by GitHub
parent 8a3db6b303
commit b25cec13c5
6 changed files with 50 additions and 32 deletions


@@ -370,6 +370,7 @@ class BertModelTest(ModelTesterMixin, unittest.TestCase):
        if is_torch_available()
        else ()
    )
    test_chunking = True

    def setUp(self):
        self.model_tester = BertModelTester(self)
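
The `test_chunking = True` flag above opts the BERT tests into a chunking check. The property such a check presumably relies on can be sketched without any framework: a position-wise feed forward layer treats every sequence position independently, so running it over slices of the sequence and concatenating the results should match a single full pass. The snippet below is only an illustration under that assumption. The names `feed_forward`, `apply_ff` and `chunked_forward` are hypothetical and are not the functions added by this commit, and treating a chunk size of 0 as "no chunking" is a convention assumed here.

```python
def feed_forward(row):
    """Toy position-wise feed forward: uses one position's features only."""
    hidden = [max(0.0, 2.0 * x - 1.0) for x in row]  # linear step followed by ReLU
    return [0.5 * h + 0.25 for h in hidden]  # second linear step


def apply_ff(rows):
    """Apply the toy feed forward to every position in a sequence."""
    return [feed_forward(row) for row in rows]


def chunked_forward(forward_fn, chunk_size, rows):
    """Run forward_fn over slices of the sequence and concatenate the outputs.

    Assumption for this sketch: chunk_size <= 0 means "do not chunk".
    The last chunk may be shorter than chunk_size.
    """
    if chunk_size <= 0:
        return forward_fn(rows)
    output = []
    for start in range(0, len(rows), chunk_size):
        output.extend(forward_fn(rows[start:start + chunk_size]))
    return output


# A sequence of 10 positions with 4 features each.
sequence = [[float(i + j) for j in range(4)] for i in range(10)]

full = apply_ff(sequence)
chunked = chunked_forward(apply_ff, 3, sequence)

# Chunking changes how much is processed at once, not the result.
assert chunked == full
```

The trade-off is the usual one: smaller chunks lower the peak size of the intermediate activations at the cost of more, smaller forward calls.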