HuggingFace_transformer/docs/source/en
Garrett Goon 390f153469 Add padding-free to bamba (#35861)
* add seq_idx and fa kwargs

* update tests

* docs and grad ckpt support

* fmt

* better names

* test_raise_missing_padding_free_kwarg_errs

* + seq_idx in doc strings

* padding free training docs

* add link to pr plots

* raise err on attn_mask with padding free

* rm raising missing padding free err test

* BambaFlashAttentionKwargs

* run modular util for modular_granitemoehybrid.py
2025-05-20 17:13:59 +02:00