HuggingFace_transformer/tests/models
Joel Lamy-Poirier e0921c6b53 Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-04-10 10:57:21 +02:00