Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)

* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
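
For readers unfamiliar with the MQA (multi-query attention) mentioned in the title and in several items above, the following is a minimal pure-Python sketch of the general idea: every query head attends over one shared set of keys and values, instead of each head having its own. It is not the code added by this commit. The function name `multi_query_attention`, the nested-list tensor layout, and the omission of causal masking and the key/value cache are all assumptions made for illustration.

```python
import math


def multi_query_attention(queries, keys, values):
    """Illustrative multi-query attention (hypothetical helper, not the commit's code).

    queries: [num_heads][q_len][head_dim]  one set of queries per head
    keys:    [kv_len][head_dim]            a single key head shared by all query heads
    values:  [kv_len][head_dim]            a single value head shared by all query heads
    Returns: [num_heads][q_len][head_dim]
    No causal mask is applied; this only shows the sharing of keys and values.
    """
    head_dim = len(keys[0])
    scale = 1.0 / math.sqrt(head_dim)
    outputs = []
    for head_queries in queries:
        head_output = []
        for q in head_queries:
            # Scaled dot product of this query with every shared key.
            scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
            # Numerically stable softmax over the key positions.
            max_score = max(scores)
            exps = [math.exp(s - max_score) for s in scores]
            total = sum(exps)
            weights = [e / total for e in exps]
            # Weighted sum of the shared values.
            head_output.append(
                [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]
            )
        outputs.append(head_output)
    return outputs
```

Because the keys and values carry no head dimension, a key/value cache for this layout is smaller than the per-head cache of standard multi-head attention by a factor of the number of heads.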

Joel Lamy-Poirier, 2023-04-10 04:57:21 -04:00, committed by GitHub

parent 656e869a45
commit e0921c6b53

24 changed files with 2043 additions and 3 deletions

docs/source/en/_toctree.yml (2 changes)
												
@@ -309,6 +309,8 @@
         title: GPT-J
       - local: model_doc/gpt2
         title: GPT2
+      - local: model_doc/gpt_bigcode
+        title: GPTBigCode
       - local: model_doc/gptsan-japanese
         title: GPTSAN Japanese
       - local: model_doc/gpt-sw3
