HuggingFace_transformer/tests/models
Garrett Goon 390f153469 Add padding-free to bamba (#35861)
* add seq_idx and fa kwargs

* update tests

* docs and grad ckpt support

* fmt

* better names

* test_raise_missing_padding_free_kwarg_errs

* + seq_idx in doc strings

* padding free training docs

* add link to pr plots

* raise err on attn_mask with padding free

* rm raising missing padding free err test

* BambaFlashAttentionKwargs

* run modular util for modular_granitemoehybrid.py
2025-05-20 17:13:59 +02:00