Files
HuggingFace_transformer/docs/source/en/model_doc
pglorio 33cb1f7b61 Add Zamba2 (#34517)
* First commit

* Finish model implementation

* First commit

* Finish model implementation

* Register zamba2

* generated modeling and configuration

* generated modeling and configuration

* added hybrid cache

* fix attention_mask in mamba

* dropped unused loras

* fix flash2

* config docstrings

* fix config and fwd pass

* make fixup fixes

* text_modeling_zamba2

* small fixes

* make fixup fixes

* Fix modular model converter

* added inheritances in modular, renamed zamba cache

* modular rebase

* new modular conversion

* fix generated modeling file

* fixed import for Zamba2RMSNormGated

* modular file cleanup

* make fixup and model tests

* dropped inheritance for Zamba2PreTrainedModel

* make fixup and unit tests

* Add inheritance of rope from GemmaRotaryEmbedding

* moved rope to model init

* drop del self.self_attn and del self.feed_forward

* fix tests

* renamed lora -> adapter

* rewrote adapter implementation

* fixed tests

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Fix torch_forward in mamba2 layer

* Dropped adapter in-place sum

* removed rope from attention init

* updated rope

* created get_layers method

* make fixup fix

* make fixup fixes

* make fixup fixes

* update to new attention standard

* update to new attention standard

* make fixup fixes

* minor fixes

* cache_position

* removed cache_position postion_ids use_cache

* remove config from modular

* removed config from modular (2)

* import apply_rotary_pos_emb from llama

* fixed rope_kwargs

* Instantiate cache in Zamba2Model

* fix cache

* fix @slow decorator

* small fix in modular file

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* several minor fixes

* inherit mamba2decoder fwd and drop position_ids in mamba

* removed docstrings from modular

* reinstate zamba2 attention decoder fwd

* use regex for tied keys

* Revert "use regex for tied keys"

This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.

* use regex for tied keys

* add cpu to slow forward tests

* dropped config.use_shared_mlp_adapter

* Update docs/source/en/model_doc/zamba2.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* re-convert from modular

---------

Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00
..
2024-09-03 14:01:00 +01:00
2024-12-06 12:17:34 +01:00
2024-12-18 20:18:17 +01:00
2024-12-17 14:44:47 +01:00
2024-04-26 16:23:44 +01:00
2024-09-20 14:27:32 +01:00
2024-06-04 18:29:45 +02:00
2023-11-06 19:45:03 +00:00
2024-07-18 10:30:37 +05:30
2023-11-10 13:49:10 +00:00
2024-12-17 09:36:31 -08:00
2024-03-15 14:29:11 +01:00
2024-08-19 10:21:51 +01:00
2024-12-17 14:44:47 +01:00
2024-05-28 18:07:07 +01:00
2024-10-21 09:05:05 -04:00
2025-01-07 11:34:56 +01:00
2024-08-19 09:28:13 +01:00
2024-10-02 13:55:19 +01:00
2023-11-23 17:44:08 +00:00
2025-01-20 11:15:39 +01:00
2024-05-28 18:07:07 +01:00
2024-12-17 14:23:13 +01:00
2024-08-07 10:03:05 +05:00
2025-01-26 15:26:38 -08:00
2024-06-19 09:40:57 +02:00
2024-08-27 21:27:21 +02:00
2024-09-21 01:43:50 +02:00
2025-01-26 15:26:38 -08:00
2025-01-13 18:41:15 +01:00
2024-12-06 12:17:34 +01:00
2024-05-13 15:59:46 +01:00
2024-12-09 10:01:31 +01:00
2024-05-14 16:32:01 +02:00
2024-05-28 18:07:07 +01:00
2023-10-30 21:42:19 +01:00
2023-12-20 14:25:07 +05:30
2025-01-26 15:26:38 -08:00
2025-01-21 12:47:04 +01:00
2024-08-26 17:49:44 +02:00
2024-05-28 18:07:07 +01:00
2024-10-16 11:21:49 +02:00
2024-10-30 10:11:50 +01:00
2025-01-09 20:15:38 +01:00
2025-01-10 11:00:54 +01:00
2025-01-26 15:26:38 -08:00
2024-05-28 18:07:07 +01:00
2024-08-06 15:42:05 +02:00
2024-05-28 18:07:07 +01:00
2024-09-25 18:04:42 +01:00
2024-04-17 17:59:07 +02:00
2024-09-05 15:49:28 +02:00
2024-10-10 11:49:34 +02:00
2024-02-19 15:22:29 +01:00
2024-02-19 15:22:29 +01:00
2025-01-26 15:26:38 -08:00
2024-10-04 21:39:45 +02:00
2024-03-13 19:05:20 +00:00
2024-05-28 18:07:07 +01:00
2025-01-23 11:23:00 +01:00
2024-05-28 18:07:07 +01:00
2023-11-06 19:45:03 +00:00
2023-07-13 11:46:54 -04:00
2025-01-26 15:26:38 -08:00
2025-01-26 15:26:38 -08:00
2025-01-26 15:26:38 -08:00
2025-01-20 10:32:39 +00:00
2024-04-19 18:31:43 +01:00
2023-11-06 19:45:03 +00:00
2025-01-08 09:52:51 +01:00
2024-05-28 18:07:07 +01:00
2023-11-23 17:02:16 +00:00
2024-06-11 15:47:38 +01:00
2024-10-15 11:27:54 +02:00
2024-05-28 18:07:07 +01:00
2025-01-27 10:51:23 +01:00
2024-10-04 22:28:05 +02:00
2025-01-26 15:26:38 -08:00