Add falcon gguf (#33437)

* feat(gguf): add falcon q2 k
* fix(gguf): remove useless renaming
* feat(gguf): separate falcon 7b and 40b
* feat(gguf): apply fixup
* fix(test): error rebase
* feat(gguf): add fp16 weight comparison for falcon
* feat(gguf): test weight of all layers
* test(gguf): add falcon 40b under skip decorator
* feat(gguf): quick example for extracting model size
2024-10-02 15:10:39 +03:00
parent 181c962aab
commit fe484726aa
5 changed files with 111 additions and 12 deletions
--- a/docs/source/en/gguf.md
+++ b/docs/source/en/gguf.md
@@ -81,6 +81,7 @@ For now the supported model architectures are the architectures that have been v
 - Qwen2Moe
 - Phi3
 - Bloom
+- Falcon

 ## Example usage