Pablo Montalvo
80b90e7b2f
Add codestral mamba2 (#32080)
* add new model like
* draft cuda forward - mismatched keys (sharding on conv1)
* match keys successfully
* fix split
* get generation/forward running (wrong gens, norm?)
* update
* some refactoring
* fixes
* works up until copy to cache
* fix
* update
* NON WORKING VERSION
* version that works?
* nit
* fix config
* fix conversion script
* working cuda forward
* nit
* update
* simplification
* make mamba slow simple work
* no einops
* todo
* fix style
* no einops
* update fix no einsum
* nit
* remove einops
* bug: scan_output differs strongly
* add rms norm option
* fix fast + slow generation with and w/o cache ✔️
* draft integration tests
* remove a big chunk of the einsum
* fix slow, fast generations, without any einsum
* fix copies
* fix structure
* fix up modeling and tests
* fix tests
* clamping is indeed worse
* recover mamba2 cache test
* fix copies
* no cache position (yet)
* fix tf tests
* fix matmul for generate
* fixup
* skip cache tests for now
* [run-slow]mamba2
* tune out hidden states for padding
* test batched generation
* propagate attention mask changes
* fix past length
* fix integration test
* style
* address comments
* update readme
* add mamba2 version check
* fix tests
* [run-slow]mamba2
* skip edge tests
* [run-slow]mamba2
* last fixup
* [run-slow]mamba2
* update README
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-08-06 16:39:52 +02:00