Garrett Goon
390f153469
Add padding-free to bamba (#35861)
* add seq_idx and fa kwargs
* update tests
* docs and grad ckpt support
* fmt
* better names
* test_raise_missing_padding_free_kwarg_errs
* + seq_idx in doc strings
* padding free training docs
* add link to pr plots
* raise err on attn_mask with padding free
* rm raising missing padding free err test
* BambaFlashAttentionKwargs
* run modular util for modular_granitemoehybrid.py
2025-05-20 17:13:59 +02:00
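
For readers unfamiliar with the terms in the commit message, here is a minimal sketch of the bookkeeping that padding-free training usually relies on. Several sequences are concatenated into one row with no pad tokens. Cumulative sequence lengths mark the boundaries for variable-length flash attention, and a per-token sequence index (the `seq_idx` mentioned above) records which sequence each token belongs to. The function names below are hypothetical and use plain Python lists. They are not the PR's code, and the exact keyword names and tensor types in `BambaFlashAttentionKwargs` are not shown on this page.

```python
from itertools import accumulate


def cu_seq_lens_from_lengths(lengths):
    """Cumulative boundaries of the packed sequences, starting at 0 (hypothetical helper)."""
    return [0] + list(accumulate(lengths))


def seq_idx_from_lengths(lengths):
    """Index of the owning sequence for every token in the packed row (hypothetical helper)."""
    return [i for i, n in enumerate(lengths) for _ in range(n)]


def position_ids_from_lengths(lengths):
    """Position ids that restart at 0 at the beginning of each packed sequence."""
    return [p for n in lengths for p in range(n)]


if __name__ == "__main__":
    lengths = [3, 2, 4]  # three sequences packed into one row of 9 tokens
    print(cu_seq_lens_from_lengths(lengths))   # [0, 3, 5, 9]
    print(seq_idx_from_lengths(lengths))       # [0, 0, 0, 1, 1, 2, 2, 2, 2]
    print(position_ids_from_lengths(lengths))  # [0, 1, 2, 0, 1, 0, 1, 2, 3]
```

Because the packed row carries its own boundaries, a separate attention mask is redundant in this setup, which is consistent with the commit item that raises an error when an attention mask is passed together with padding-free inputs.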