Add Qwen2Moe GGUF loading support (#33264)

* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
parent 132e87500e
commit 5d11de4a2f
4 changed files with 76 additions and 5 deletions
--- a/docs/source/en/gguf.md
+++ b/docs/source/en/gguf.md
@@ -78,6 +78,7 @@ For now the supported model architectures are the architectures that have been v
 - LLaMa
 - Mistral
 - Qwen2
+- Qwen2Moe

 ## Example usage