Add Qwen2MoE (#29377)
* add support for qwen2 MoE models
* update docs
* add support for qwen2 MoE models
* update docs
* update model name & test
* update readme
* update class names & readme & model_doc of Qwen2MoE.
* update architecture name
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fix style
* fix test when there are sparse and non sparse layers
* fixup
* Update README.md

  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* fixup
* add archive back
* add support for qwen2 MoE models
* update docs
* update model name & test
* update readme
* update class names & readme & model_doc of Qwen2MoE.
* update architecture name
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* update modeling_qwen2_moe.py
* fix model architecture
* fixup
* fix qwen2_moe tests
* use Qwen2Tokenizer instead of Qwen2MoeTokenizer
* fix style
* fix test when there are sparse and non sparse layers
* fixup
* add archive back
* fix integration test
* fixup

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
@@ -240,6 +240,7 @@ Flax), PyTorch, and/or TensorFlow.
| [PVTv2](model_doc/pvt_v2) | ✅ | ❌ | ❌ |
| [QDQBert](model_doc/qdqbert) | ✅ | ❌ | ❌ |
| [Qwen2](model_doc/qwen2) | ✅ | ❌ | ❌ |
| [Qwen2MoE](model_doc/qwen2_moe) | ✅ | ❌ | ❌ |
| [RAG](model_doc/rag) | ✅ | ✅ | ❌ |
| [REALM](model_doc/realm) | ✅ | ❌ | ❌ |
| [Reformer](model_doc/reformer) | ✅ | ❌ | ❌ |