Add TF implementation of GPT-J (#15623)

* Initial commit

* Add TFGPTJModel

* Fix a forward pass

* Add TFGPTJCausalLM

* Add TFGPTJForSequenceClassification

* Add TFGPTJForQuestionAnswering

* Fix docs

* Deal with TF dynamic shapes

* Add Loss parents to models

* Adjust split and merge heads to handle 4 and 5-dim tensors

* Update outputs for @tooslow tests
This commit is contained in:
Daniel Stancl
2022-03-25 20:27:19 +01:00
committed by GitHub
parent aa4c0a86dc
commit ed2ee373d0
8 changed files with 1742 additions and 2 deletions

View File

@@ -130,6 +130,26 @@ model.
[[autodoc]] GPTJForQuestionAnswering
- forward
## TFGPTJModel
[[autodoc]] TFGPTJModel
- call
## TFGPTJForCausalLM
[[autodoc]] TFGPTJForCausalLM
- call
## TFGPTJForSequenceClassification
[[autodoc]] TFGPTJForSequenceClassification
- call
## TFGPTJForQuestionAnswering
[[autodoc]] TFGPTJForQuestionAnswering
- call
## FlaxGPTJModel
[[autodoc]] FlaxGPTJModel