Add TF implementation of GPT-J (#15623)

* Initial commit

* Add TFGPTJModel

* Fix a forward pass

* Add TFGPTJForCausalLM

* Add TFGPTJForSequenceClassification

* Add TFGPTJForQuestionAnswering

* Fix docs

* Deal with TF dynamic shapes

* Add Loss parents to models

* Adjust split and merge heads to handle 4- and 5-dim tensors

* Update outputs for @tooslow tests
Daniel Stancl
2022-03-25 20:27:19 +01:00
committed by GitHub
parent aa4c0a86dc
commit ed2ee373d0
8 changed files with 1742 additions and 2 deletions

@@ -205,7 +205,7 @@ Flax), PyTorch, and/or TensorFlow.
| Funnel Transformer | ✅ | ✅ | ✅ | ✅ | ❌ |
| GLPN | ❌ | ❌ | ✅ | ❌ | ❌ |
| GPT Neo | ❌ | ❌ | ✅ | ❌ | ✅ |
-| GPT-J | ❌ | ❌ | ✅ | ❌ | ✅ |
+| GPT-J | ❌ | ❌ | ✅ | ✅ | ✅ |
| Hubert | ❌ | ❌ | ✅ | ✅ | ❌ |
| I-BERT | ❌ | ❌ | ✅ | ❌ | ❌ |
| ImageGPT | ❌ | ❌ | ✅ | ❌ | ❌ |

@@ -130,6 +130,26 @@ model.
[[autodoc]] GPTJForQuestionAnswering
- forward
+## TFGPTJModel
+[[autodoc]] TFGPTJModel
+- call
+## TFGPTJForCausalLM
+[[autodoc]] TFGPTJForCausalLM
+- call
+## TFGPTJForSequenceClassification
+[[autodoc]] TFGPTJForSequenceClassification
+- call
+## TFGPTJForQuestionAnswering
+[[autodoc]] TFGPTJForQuestionAnswering
+- call
## FlaxGPTJModel
[[autodoc]] FlaxGPTJModel