pglorio
33cb1f7b61
Add Zamba2 (#34517)
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position, position_ids, use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys"
This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
---------
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00