Daniel Stancl
ed2ee373d0
Add TF implementation of GPT-J (#15623)
* Initial commit
* Add TFGPTJModel
* Fix a forward pass
* Add TFGPTJCausalLM
* Add TFGPTJForSequenceClassification
* Add TFGPTJForQuestionAnswering
* Fix docs
* Deal with TF dynamic shapes
* Add Loss parents to models
* Adjust split and merge heads to handle 4 and 5-dim tensors
* Update outputs for @tooslow tests
2022-03-25 19:27:19 +00:00