Suraj Patil
860264379f
GPT Neo (#10848)
* let's begin
* boom boom
* fix out proj in attn
* fix attention
* fix local attention
* add tokenizer
* fix imports
* autotokenizer
* fix checkpoint name
* cleanup
* more clean-up
* more cleanup
* output attentions
* fix attn mask creation
* fix imports
* config doc
* add tests
* add slow tests
* quality
* add conversion script
* copyright
* typo
* another bites the dust
* fix attention tests
* doc
* add embed init in convert function
* fix copies
* remove tokenizer
* enable caching
* address review comments
* improve config and create attn layer list internally
* more consistent naming
* init hf config from mesh-tf config json file
* remove neo tokenizer from doc
* handle attention_mask in local attn layer
* attn_layers => attention_layers
* add tokenizer_class in config
* fix docstring
* raise if len of attention_layers is not same as num_layers
* remove tokenizer_class from config
* more consistent naming
* fix doc
* fix checkpoint names
* fp16 compat
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-03-30 09:42:30 -04:00