Mayank Mishra
c35d2ccf5a
Granite language models (#31502)
* first commit
* drop tokenizer
* drop tokenizer
* drop tokenizer
* drop convert
* granite
* drop tokenization test
* mup
* fix
* reformat
* reformat
* reformat
* fix docs
* stop checking for checkpoint
* update support
* attention multiplier
* update model
* tiny drop
* saibo drop
* skip test
* fix test
* fix test
* drop
* drop useless imports
* update docs
* drop flash function
* copied from
* drop pretraining tp
* drop pretraining tp
* drop pretraining tp
* drop unused import
* drop code path
* change name
* softmax scale
* head dim
* drop legacy cache
* rename params
* cleanup
* fix copies
* comments
* add back legacy cache
* multipliers
* multipliers
* multipliers
* text fix
* fix copies
* merge
* multipliers
* attention multiplier
* drop unused imports
* fix
* fix
* fix
* move rope?
* Update src/transformers/models/granite/configuration_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* Update src/transformers/models/granite/modeling_granite.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix
* fix
* fix
* fix
* fix-copies
* torch rmsnorm
* add authors
* change model path
* fix
* test
* drop static cache test
* update readme
* drop non-causal
* readme
* drop useless imports
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-08-27 21:27:21 +02:00