Arthur
fb1c62e973
[Add Mamba] Adds support for the Mamba models (#28094)
* initial-commit
* start cleaning
* small nits
* small nits
* current updates
* add kernels
* small refactoring little step
* add comments
* styling
* nit
* nits
* Style
* Small changes
* Push dummy mamba simple slow
* nit
* Use original names
* Use original names and remove norm
* Updates for inference params
* Style and updates
* nits
* Match logits
* Add a test
* Add expected generated text
* nits doc, imports and styling
* style
* oups
* don't install kernels, invite users to install the required kernels
* let use use the original packages
* styling
* nits
* fix some copieds
* update doc
* fix-copies
* styling done
* nits
* fix import check
* run but wrong cuda results
* mamba CUDA works :)
* fix the fast path
* config naming nits
* conversion script is not required at this stage
* finish fixing the fast path: generation make sense now!
* nit
* Let's start working on the CIs
* style
* better style
* more nits
* test nit
* quick fix for now
* nits
* nit
* nit
* nit
* nits
* update test rest
* fixup
* update test
* nit
* some fixes
* nits
* update test values
* fix styling
* nit
* support peft
* integration tests require torch
* also add slow markers
* styling
* chose forward wisely
* nits
* update tests
* fix gradient checkpointing
* fixup
* nit
* fix doc
* check copies
* fix the docstring
* fix some more tests
* style
* fix beam search
* add init scheme
* update
* nit
* fix
* fixup the doc
* fix the doc
* fixup
* tentative update but slow is no longer good
* nit
* should we always use float32?
* nits
* revert wrong changes
* res in float32
* cleanup
* skip fmt for now
* update generation values
* update test values running original model
* fixup
* update tests + rename inference_params to cache_params + make sure training does not use cache_params
* small nits
* more nits
* fix final CIs
* style
* nit doc
* I hope final doc nits
* nit
* 🫠
* final touch!
* fix torch import
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* Apply suggestions from code review
* fix fix and fix
* fix base model prefix!
* nit
* Update src/transformers/models/mamba/__init__.py
* Update docs/source/en/model_doc/mamba.md
Co-authored-by: Lysandre Debut <hi@lysand.re>
* nit
---------
Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-03-05 20:01:06 +09:00