Sylvain Gugger
b4d4d6fe87
Add RWKV-4 (#22797)
* First draft of RWKV-4
* Add support for generate
* Style post-rebase
* Properly use state
* Write doc
* Fix doc
* More math
* Add model to README, dummies and clean config
* Fix init
* multiple fixes:
- fix common tests
- fix configuration default values
- add CI test for checking state computation
- fix some CI tests
* correct tokenizer
* some tweaks
- fix config docstring
- fix failing tests
* fix CI tests
- add output_attention / output_hidden_states
- override test_initialization
- fix failing CIs
* fix conversion script
- fix sharded case
- add new arguments
* add slow tests + more fixes on conversion script
* add another test
* final fixes
* change single name variable
* add mock attention mask for pipeline to work
* correct eos token id
* fix nits
* add checkpoints
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `tie_word_embeddings` in docstring
* change tensor name
* fix final nits
* Trigger CI
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-09 13:04:10 -04:00