From 390f153469dfdc793e7a9c7eb4822ea76f4f796a Mon Sep 17 00:00:00 2001
From: Garrett Goon <44747910+garrett361@users.noreply.github.com>
Date: Tue, 20 May 2025 11:13:59 -0400
Subject: [PATCH] Add padding-free to bamba (#35861)

* add seq_idx and fa kwargs
* update tests
* docs and grad ckpt support
* fmt
* better names
* test_raise_missing_padding_free_kwarg_errs
* + seq_idx in doc strings
* padding free training docs
* add link to pr plots
* raise err on attn_mask with padding free
* rm raising missing padding free err test
* BambaFlashAttentionKwargs
* run modular util for modular_granitemoehybrid.py
---
 docs/source/en/model_doc/bamba.md             | 32 +++++-
 .../models/bamba/modeling_bamba.py            | 54 +++++++++--
 .../models/bamba/modular_bamba.py             | 61 ++++++++++--
 .../modeling_granitemoehybrid.py              | 14 ++-
 tests/models/bamba/test_modeling_bamba.py     | 97 ++++++++++++++++++-
 5 files changed, 233 insertions(+), 25 deletions(-)

diff --git a/docs/source/en/model_doc/bamba.md b/docs/source/en/model_doc/bamba.md
index c6e1bcec56..b776a5732a 100644
--- a/docs/source/en/model_doc/bamba.md
+++ b/docs/source/en/model_doc/bamba.md
@@ -39,7 +39,7 @@ Checkout all Bamba-9B model checkpoints [here](https://github.com/foundation-mod