pglorio
33cb1f7b61
Add Zamba2 (#34517)
* First commit
* Finish model implementation
* First commit
* Finish model implementation
* Register zamba2
* generated modeling and configuration
* generated modeling and configuration
* added hybrid cache
* fix attention_mask in mamba
* dropped unused loras
* fix flash2
* config docstrings
* fix config and fwd pass
* make fixup fixes
* text_modeling_zamba2
* small fixes
* make fixup fixes
* Fix modular model converter
* added inheritances in modular, renamed zamba cache
* modular rebase
* new modular conversion
* fix generated modeling file
* fixed import for Zamba2RMSNormGated
* modular file cleanup
* make fixup and model tests
* dropped inheritance for Zamba2PreTrainedModel
* make fixup and unit tests
* Add inheritance of rope from GemmaRotaryEmbedding
* moved rope to model init
* drop del self.self_attn and del self.feed_forward
* fix tests
* renamed lora -> adapter
* rewrote adapter implementation
* fixed tests
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Fix torch_forward in mamba2 layer
* Dropped adapter in-place sum
* removed rope from attention init
* updated rope
* created get_layers method
* make fixup fix
* make fixup fixes
* make fixup fixes
* update to new attention standard
* update to new attention standard
* make fixup fixes
* minor fixes
* cache_position
* removed cache_position, position_ids, use_cache
* remove config from modular
* removed config from modular (2)
* import apply_rotary_pos_emb from llama
* fixed rope_kwargs
* Instantiate cache in Zamba2Model
* fix cache
* fix @slow decorator
* small fix in modular file
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* several minor fixes
* inherit mamba2decoder fwd and drop position_ids in mamba
* removed docstrings from modular
* reinstate zamba2 attention decoder fwd
* use regex for tied keys
* Revert "use regex for tied keys"
This reverts commit 9007a522b1f831df6d516a281c0d3fdd20a118f5.
* use regex for tied keys
* add cpu to slow forward tests
* dropped config.use_shared_mlp_adapter
* Update docs/source/en/model_doc/zamba2.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* re-convert from modular
---------
Co-authored-by: root <root@node-2.us-southcentral1-a.compute.internal>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2025-01-27 10:51:23 +01:00