Joel Lamy-Poirier
e0921c6b53
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool
* Remove unwanted stuff
* Add new code
* Remove inference runner
* Style
* Fix checks
* Test updates
* make fixup
* fix docs
* fix doc
* fix test
* hopefully fix pipeline tests
* refactor
* fix CIs
* add comment
* rename to `GPTBigCodeForCausalLM`
* correct readme
* make fixup + docs
* make fixup
* fixes
* fixes
* Remove pruning
* Remove import
* Doc updates
* More pruning removal
* Combine copies
* Single MQA implementation, remove kv cache pre-allocation and padding
* Update doc
* Revert refactor to match gpt2 style
* Merge back key and value caches, fix some type hints
* Update doc
* Fix position ids with padding (PR 21080)
* Add conversion script temporarily
* Update conversion script
* Remove checkpoint conversion
* New model
* Fix MQA test
* Fix copies
* try fix tests
* FIX TEST!!
* remove `DoubleHeadsModel`
* add MQA tests
* add slow tests
* clean up
* add CPU checker
* final fixes
* fixes
- fix GPU issue
- fixed slow tests
- skip disk offload
* fix final issue
* Simplify and comment baddbmm fix
* Remove unnecessary code
* Transpose tweaks
* Use beta=1 on cpu, improve tests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-04-10 10:57:21 +02:00