HuggingFace_transformer/docs/source
Garrett Goon 390f153469 Add padding-free to bamba (#35861)
* add seq_idx and fa kwargs

* update tests

* docs and grad ckpt support

* fmt

* better names

* test_raise_missing_padding_free_kwarg_errs

* + seq_idx in doc strings

* padding free training docs

* add link to pr plots

* raise err on attn_mask with padding free

* rm raising missing padding free err test

* BambaFlashAttentionKwargs

* run modular util for modular_granitemoehybrid.py
2025-05-20 17:13:59 +02:00