Add falcon gguf (#33437)

* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
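
A minimal sketch of what extracting a model size from a GGUF file name could look like, assuming the size appears as a `-7b-` or `-40b-` style token in the name. The helper name `extract_model_size` and the sample file names are hypothetical and are not taken from this commit:

```python
import re
from typing import Optional


def extract_model_size(gguf_file_name: str) -> Optional[str]:
    """Return a size tag such as '7b' from a file name, or None if absent.

    Hypothetical helper: assumes the size is written as digits followed by
    'b' and is delimited by '-' on both sides (e.g. 'falcon-7b-q2_k.gguf').
    """
    match = re.search(r"-(\d+)b-", gguf_file_name.lower())
    return f"{match.group(1)}b" if match else None


# Illustrative file names only; real checkpoints may be named differently.
for name in ("falcon-7b-q2_k.gguf", "falcon-40b-Q2_K.gguf", "model.gguf"):
    print(name, "->", extract_model_size(name))
```

Returning `None` when no size token is found leaves the choice of fallback to the caller, which seems safer than guessing a default size.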
g-prz
2024-10-02 15:10:39 +03:00
committed by GitHub
parent 181c962aab
commit fe484726aa
5 changed files with 111 additions and 12 deletions


@@ -81,6 +81,7 @@ For now the supported model architectures are the architectures that have been v
- Qwen2Moe
- Phi3
- Bloom
- Falcon
## Example usage