Armaghan Shakir
55736eea99
Add support for MiniMax's MiniMax-Text-01 (#35831)
* end-to-end architecture
* lightning-attn: refactor, clean, optimize
* put minimax_text_01 in other files
* use latest __init__ standards and auto-generate modular
* support attention_mask for lightning-attn
* Revert "use latest __init__ standards and auto-generate modular"
This reverts commit d8d3c409d89e335c98a8cd36f47304a76eac7493.
* fix modular conversion
* pass both attention masks instead of tuple
* formatting
* Updated Dynamic Cache
* created MiniMaxText01Cache
* fix hardcoded slope_rate
* update attn_type_list in config
* fix lightning when use_cache=False
* copy tests from mixtral
* (checkpoint) all tests pass for normal attention
* fix all unittests
* fix import sorting
* fix consistency and formatting tests
* fix config
* update tests, since changes in main
* fix seq_len error
* create dummy docs
* fix checkpoint
* add checkpoint in config docstring
* run modular_conversion
* update docs
* fix checkpoint path and update tests
* fix ruff
* remove repeated expected_slice
* update docs
* rename "minimax-text-01" to "minimax"
* inherit config from mixtral
* remove from docs in other languages
* undo files that should be untouched
* move minimax to end in conversation docs
* use MiniMaxForCausalLM as it is
* ruff fixes
* run modular
* fix docstring example in causallm
* refactor attention loop and decay factors
* refactor config in modular
* run modular
* refactor cache
* rename static_cache to linear_cache
* make positional embeddings necessary
* remove unnecessary layernorms declarations
* fix import in tests
* refactor attention in next tokens
* remove outdated code
* formatting and modular
* update tests
* rename layernorm alpha/beta factors
* register decay factors as buffers
* remove unused declarations of decay factors
* update config for alpha/beta factors
* run modular
* remove head_dim in tests
* remove minimax from fx.py
* remove stuff that is not really needed
* update __init__
* update qkv torch.split
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* fix qkv torch.split
* quality fixes
* remove mistakenly added dummy
* purge unused ModelTester code
* fix-copies
* run fix-copies
* fix head_dim
* write cache formatting tests
* remove postnorm
* avoid contiguous in attention current states
* update expected_slice
* add generation test for integration
* fix dtype in generation test
* update authors
* update with changes in main
* update gradient checkpointing and minor fixes
* fix mutable attn_type_list
* rename: attn_type -> layer_type
* update for layer_types
* update integration tests
* update checkpoint
* clean overview in docs
---------
Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
2025-06-04 09:38:40 +02:00