Adding Qwen3 and Qwen3MoE (#36878)

* Initial commit for Qwen3

* fix and add tests for qwen3 & qwen3_moe

* rename models for tests.

* fix

* fix

* fix and add docs.

* fix model name in docs.

* simplify modular and fix configuration issues

* Fix the red CI: ruff was updated

* revert ruff, version was wrong

* fix qwen3moe.

* fix

* make sure MOE can load

* fix copies

---------

Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
This commit is contained in:
Bo Zheng
2025-03-31 15:50:49 +08:00
committed by GitHub
parent 0d6a60fe55
commit 6acd5aecb3
26 changed files with 5650 additions and 3 deletions

View File

@@ -22,6 +22,8 @@ FILES_TO_PARSE = [
os.path.join(MODEL_ROOT, "olmo", "modular_olmo.py"),
os.path.join(MODEL_ROOT, "rt_detr", "modular_rt_detr.py"),
os.path.join(MODEL_ROOT, "qwen2", "modular_qwen2.py"),
os.path.join(MODEL_ROOT, "qwen3", "modular_qwen3.py"),
os.path.join(MODEL_ROOT, "qwen3", "modular_qwen3_moe.py"),
os.path.join(MODEL_ROOT, "llava_next_video", "modular_llava_next_video.py"),
os.path.join(MODEL_ROOT, "cohere2", "modular_cohere2.py"),
os.path.join(MODEL_ROOT, "modernbert", "modular_modernbert.py"),