Files
HuggingFace_transformer/tests/models
Arthur dcb183f4bd [MPT] Add MosaicML's MPT model to transformers (#24629)
* draft add new model like

* some cleaning of the config

* nits

* add nested configs

* nits

* update

* update

* added layer norms + triton kernels

* consider only LPLayerNorm for now.

* update

* all keys match.

* Update

* fixing nits here and there

* working forward pass.

* removed einops dependency

* nits

* format

* add alibi

* byebye head mask

* refactor attention

* nits.

* format

* fix nits.

* nuke ande updates

* nuke tokenizer test

* don't reshape query with kv heads

* added a bit of documentation.

* remove unneeded things

* nuke more stuff

* nit

* logits match - same generations

* rm unneeded methods

* 1 remaining failing CI test

* nit

* fix nits

* fix docs

* fix docs

* rm tokenizer

* fixup

* fixup

* fixup and fix tests

* fixed configuration object.

* use correct activation

* few minor fixes

* clarify docs a bit

* logits match à 1e-12

* skip and unskip a test

* added some slow tests.

* fix readme

* add more details

* Update docs/source/en/model_doc/mpt.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix configuration issues

* more fixes in config

* added more models

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove unneeded position ids

* fix some  comments

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* revert suggestion

* mpt alibi + added batched generation

* Update src/transformers/models/mpt/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove init config

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix nit

* add another slow test

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fits in one line

* some refactor because make fixup doesn't pass

* add ft notebook

* update md

* correct doc path

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00
..
2023-06-06 14:31:14 -04:00
2023-07-17 17:53:24 +01:00
2023-06-26 11:23:57 +02:00
2023-06-16 16:38:23 +02:00
2023-07-18 15:34:06 +01:00
2023-07-11 13:36:31 +01:00
2023-04-06 17:56:06 +02:00
2023-06-16 15:40:49 +01:00
2023-04-06 17:56:06 +02:00
2023-06-16 15:40:49 +01:00
2023-04-06 17:56:06 +02:00
2023-04-07 17:13:04 +02:00
2022-05-12 16:25:55 -04:00
2023-05-24 13:52:52 +01:00
2023-04-04 14:53:06 +02:00
2023-06-16 15:40:49 +01:00
2023-07-24 15:34:19 +01:00
2023-06-16 15:40:49 +01:00
2023-05-31 14:59:30 +01:00
2022-05-03 14:42:02 +02:00