Manuel de Prada Corral
1aa7256f01
Refactor MambaCache to modeling_mamba.py (#38086)
* Refactor MambaCache to modeling_mamba.py (parity with Zamba)
* ruff
* fix dummies
* update
* update
* remove mamba ref in cache tests
* remove cache_implementation from tests
* update
* ruff
* ruff
* sneaky regression
* model consistency
* fix test_multi_gpu_data_parallel_forward
* fix falcon slow tests
* ruff
* ruff
* add sample false
* try to fix slow tests
* Revert "fix test_multi_gpu_data_parallel_forward"
This reverts commit 66b7162c7c5c5ce8a73ccf48cffc8a96343ebb33.
* fix tests on nvidia t4, remove dataparallel tests from mamba
* ruff
* remove DDP tests from mamba and falcon_mamba
* add explicit error for MambaCache
* mamba2 also needs to init cache in prepare_inputs_for_generation
* ruff
* ruff
* move MambaCache to its own file
* ruff
* unprotected import fix
* another attempt to fix unprotected imports
* Revert "another attempt to fix unprotected imports"
This reverts commit 2338354fcab630de5899321f5daced5fb312c2a2.
* fixing unprotected import, attempt 3
* Update src/transformers/cache_utils.py
* ruff's fault
* fix arthur review
* modular falcon mamba
* found a hack
* fix config docs
* fix docs
* add export info
* merge modular falcon branch
* oopsie
* fix fast path failing
* new approach
* oopsie
* fix types
* Revert new pragma in modular
This reverts commit 80b1cf160ee251536f07c40b8a0857d499e70db6.
* trying another modular workaround
* review & fix ci
* oopsie
* clear prepare_inputs on mamba/mamba2/falcon_mamba
2025-07-21 14:59:36 +02:00