Add ModernBERT Decoder Models - ModernBERT, but trained with CLM! (#38967)

* working locally; need to style and test

* added docs and initial tests; need to debug and flesh out

* fixed tests

* working long context; batches

* working fa2 and eager

* update tests

* add missing configs

* remove default autoset

* fix spacing

* fix most tests

* fixed tests

* fix to init

* refactor to match new transformers updates

* remove static cache option

* fa2 fix

* fix docs

* in progress

* working on tests

* fixed issue with attn outputs

* remove debug

* fix local config attr

* update doc string

* fix docstring

* add docs to toc

* correct typo in toc

* add new updates from main w.r.t. ModernBERT RoPE

* fix local param

---------

Co-authored-by: oweller2 <oweller2@dsailogin.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l07.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@n02.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l08.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l01.mgmt.ai.cluster>
Co-authored-by: oweller2 <oweller2@l02.mgmt.ai.cluster>
Orion Weller
2025-07-15 04:40:41 -04:00
committed by GitHub
parent 0b724114cf
commit 0e4b7938d0
13 changed files with 2020 additions and 0 deletions

@@ -3323,6 +3323,8 @@ class ModelTesterMixin:
 "ModernBertForTokenClassification",
 "TimmWrapperForImageClassification",
 "ModernBertForQuestionAnswering",
+"ModernBertDecoderForSequenceClassification",
+"ModernBertDecoderForCausalLM",
 ]
 special_param_names = [
 r"^bit\.",