Add Qwen2Moe GGUF loading support (#33264)

* update gguf doc, config and tensor mapping

* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests

* apply code style fixes

* reformat files

* assign GGUFQwen2Converter to qwen2_moe
Vladislav Bronzov
2024-09-05 17:42:03 +02:00
committed by GitHub
parent 132e87500e
commit 5d11de4a2f
4 changed files with 76 additions and 5 deletions

@@ -78,6 +78,7 @@ For now the supported model architectures are the architectures that have been v
- LLaMa
- Mistral
- Qwen2
- Qwen2Moe
## Example usage