Adding Qwen3 and Qwen3MoE (#36878)

* Initial commit for Qwen3

* fix and add tests for qwen3 & qwen3_moe

* rename models for tests.

* fix

* fix

* fix and add docs.

* fix model name in docs.

* simplify modular and fix configuration issues

* Fix the red CI: ruff was updated

* revert ruff, version was wrong

* fix qwen3moe.

* fix

* make sure MOE can load

* fix copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
Bo Zheng
2025-03-31 15:50:49 +08:00
committed by GitHub
parent 0d6a60fe55
commit 6acd5aecb3
26 changed files with 5650 additions and 3 deletions

@@ -47,6 +47,7 @@ CONFIG_CLASSES_TO_IGNORE_FOR_DOCSTRING_CHECKPOINT_CHECK = {
     "LlamaConfig",
     "GraniteConfig",
     "GraniteMoeConfig",
+    "Qwen3MoeConfig",
 }