44670
2b79f14375
support loading qwen3 gguf ( #38645 )
...
* support loading qwen3 gguf
* Add qwen3 into GGUF_TO_FAST_CONVERTERS for tokenizer conversion
* Add testcase
* Fix formatting
2025-07-15 09:53:41 +00:00
Yao Matrix
89542fb81c
enable more test cases on xpu ( #38572 )
...
* enable glm4 integration cases on XPU, set xpu expectation for blip2
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
* more
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* refine wording
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* refine test case names
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* run
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* add gemma2 and chameleon
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix review comments
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: Matrix YAO <matrix.yao@intel.com >
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
2025-06-06 09:29:51 +02:00
Mohamed Mekkouri
38c406844e
Fixing quantization tests ( #37650 )
...
* fix
* style
* add capability check
2025-04-22 13:59:57 +02:00
Isotr0py
c69e23455d
Support loading Gemma3 QAT GGUF models ( #37649 )
...
* fix gemma3 qat gguf support
Signed-off-by: isotr0py <2037008807@qq.com >
* update test
Signed-off-by: isotr0py <2037008807@qq.com >
* make ruff happy
Signed-off-by: isotr0py <2037008807@qq.com >
---------
Signed-off-by: isotr0py <2037008807@qq.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-04-22 11:23:17 +02:00
Isotr0py
6daec12d0b
Add GGUF support to Gemma3 Text backbone ( #37424 )
...
* add gemma3 gguf support
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix typo and add gguf limit
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix a typo
Signed-off-by: Isotr0py <2037008807@qq.com >
* add vision conversion test
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix typos
Signed-off-by: Isotr0py <2037008807@qq.com >
---------
Signed-off-by: Isotr0py <2037008807@qq.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-04-10 17:15:43 +02:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests ( #37343 )
...
Signed-off-by: cyy <cyyever@outlook.com >
2025-04-08 14:12:08 +02:00
David LaPalomento
b45cf0e90a
Guard against unset resolved_archive_file ( #35628 )
...
* archive_file may not be specified
When loading a pre-trained model from a gguf file, resolved_archive_file may not be set. Guard against that case in the safetensors availability check.
* Remap partial disk offload to cpu for GGUF files
GGUF files don't support disk offload so attempt to remap them to the CPU when device_map is auto. If device_map is anything else but None, raise a NotImplementedError.
* Don't remap auto device_map and raise RuntimeError
If device_map=auto and modules are selected for disk offload, don't attempt to map them to any other device. Raise a runtime error when a GGUF model is configured to map any modules to disk.
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-02-14 14:44:31 +01:00
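The disk-offload rule described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the pull request: the helper name `check_gguf_device_map` and its `is_gguf` flag are hypothetical, and it assumes the device map has already been resolved to a dict of module names to devices.

```python
def check_gguf_device_map(device_map, is_gguf):
    """Raise if a GGUF checkpoint would have any module offloaded to disk.

    Hypothetical helper for illustration only: `device_map` is assumed to be
    an already-resolved mapping of module name -> device (int, "cpu", "disk").
    """
    if not is_gguf or not isinstance(device_map, dict):
        # Nothing to check for non-GGUF checkpoints or unresolved maps.
        return
    offloaded = [name for name, device in device_map.items() if device == "disk"]
    if offloaded:
        # Per the commit message: do not remap to another device, just fail.
        raise RuntimeError(
            "GGUF checkpoints do not support disk offload; "
            f"modules mapped to disk: {offloaded}"
        )
```

The sketch deliberately does not remap offloaded modules to the CPU, matching the final behaviour stated in the last bullet of the commit.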
湛露先生
1590c66430
Fix typos in ggml test. ( #36060 )
...
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com >
2025-02-06 15:32:40 +00:00
Isotr0py
e57b459997
Split and clean up GGUF quantization tests ( #35502 )
...
* clean up ggml test
Signed-off-by: Isotr0py <2037008807@qq.com >
* port remaining tests
Signed-off-by: Isotr0py <2037008807@qq.com >
* further cleanup
Signed-off-by: Isotr0py <2037008807@qq.com >
* format
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix broken tests
Signed-off-by: Isotr0py <2037008807@qq.com >
* update comment
Signed-off-by: Isotr0py <2037008807@qq.com >
* fix
Signed-off-by: Isotr0py <2037008807@qq.com >
* reorganize tests
Signed-off-by: Isotr0py <2037008807@qq.com >
* k-quants use qwen2.5-0.5B
Signed-off-by: Isotr0py <2037008807@qq.com >
* move ggml tokenization test
Signed-off-by: Isotr0py <2037008807@qq.com >
* remove dead code
Signed-off-by: Isotr0py <2037008807@qq.com >
* add assert for serialization test
Signed-off-by: Isotr0py <2037008807@qq.com >
* use str for parameterize
Signed-off-by: Isotr0py <2037008807@qq.com >
---------
Signed-off-by: Isotr0py <2037008807@qq.com >
2025-01-27 15:46:57 +01:00
Arthur
b912f5ee43
use torch.testing.assert_close instead to get more details about errors in CIs ( #35659 )
...
* use torch.testing.assert_close instead to get more details about errors in CIs
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
2025-01-24 16:55:28 +01:00
Mohamed Mekkouri
a7738f5a89
Fix : Nemotron tokenizer for GGUF format ( #35836 )
...
fix nemotron gguf
2025-01-22 12:28:40 +01:00
Mohamed Mekkouri
dbd8474125
Fix : BLOOM tie_word_embeddings in GGUF ( #35812 )
...
* fix bloom ggml
* fix falcon output
* make style
2025-01-21 15:35:54 +01:00
Mohamed Mekkouri
b80e334e71
Skip Falcon 7B GGML Test ( #35783 )
...
skip test
2025-01-20 15:00:34 +01:00
Mohamed Mekkouri
a11041ffad
Fix : add require_read_token for gemma2 gated model ( #35687 )
...
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
Mohamed Mekkouri
df2a812e95
Fix expected output for ggml test ( #35686 )
...
fix expected output
2025-01-14 11:46:55 +01:00
Yijun Lee
e5fd865eba
Add Gemma2 GGUF support ( #34002 )
...
* initial setup for ggml.py
* initial setup of GGUFGemma2Converter class
* Add gemma2 model to gguf.md doc
* Partial work on GGUF_TENSOR_MAPPING
* initial setup of GGUF_TENSOR_MAPPING for Gemma2
* refactor: rename GemmaConvert class to GemmaConverter for naming consistency
* feat: complete gemma2 tensor mapping implementation
* feat: add initial implementation of GGUFGemmaConverter
* feat: complete GGUFGemmaConverter implementation
* feat: add test code for gemma2
* refactor: minor code cleanup
* refactor: minor code cleanup
* fix: resolve suggestions
* Update tests/quantization/ggml/test_ggml.py
Co-authored-by: Isotr0py <2037008807@qq.com >
---------
Co-authored-by: Isotr0py <2037008807@qq.com >
2025-01-03 14:50:07 +01:00
Mohamed Mekkouri
85eb339231
Fix : model used to test ggml conversion of Falcon-7b is incorrect ( #35083 )
...
fixing test model
2024-12-16 13:21:44 +01:00
Mohamed Mekkouri
890ea7de93
Fix failing GGML test ( #34871 )
...
fix_test
2024-11-25 18:04:52 +01:00
farrosalferro
c57eafdaa1
Add Nemotron GGUF Loading Support ( #34725 )
...
* Add Nemotron GGUF Loading Support
* fix the Nemotron architecture assignment
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2024-11-21 11:37:34 +01:00
Isotr0py
e83aaaa86b
Fix use_parallel_residual and qkv_bias for StableLM GGUF config extraction ( #34450 )
...
* fix stablelm qkv_bias
* fix stablelm qkv_bias and use_parallel_residual
* remove original_model.config for stablelm gguf test
2024-11-05 18:26:20 +01:00
Vladislav Bronzov
5251fe6271
Add GGUF for Mamba ( #34200 )
...
* add mamba architecture for gguf
* add logic for weights conversion, some fixes and refactoring
* add lm_head layers, unit test refactoring
* more fixes for tests
* remove lm_head creation
* remove unused comments
2024-10-30 16:52:17 +01:00
김준재
dd267fca72
Add T5 GGUF loading support ( #33389 )
...
* add: GGUFT5Converter
* add: tensormapping for t5
* add: test code for t5
* fix: Remove whitespace from blank line
* add: t5 fp16 tests
* fix: whitespace formatting
* fix: minor formatting
* fix: test every weight
2024-10-24 15:10:59 +02:00
Vladislav Bronzov
cb5ca3265f
Add GGUF for starcoder2 ( #34094 )
...
* add starcoder2 arch support for gguf
* fix q6 test
2024-10-14 10:22:49 +02:00
Vladislav Bronzov
c9afee5392
Add gguf support for gpt2 ( #34044 )
...
* add gpt2 gguf support
* add doc change
* small refactoring
2024-10-10 13:42:18 +02:00
Vladislav Bronzov
faa0f63b93
Add gguf support for StableLM ( #33793 )
...
* add stablelm gguf architecture support
* add additional quantization tests
* resolve merge conflict, add weight conversion tests for fp16
2024-10-09 12:16:13 +02:00
Vladislav Bronzov
22e102ad98
Bug fix gguf qwen2moe ( #33940 )
...
* fix qwen2moe tensors mapping, add unit tests
* add expert tensor split logic, test refactoring
* small params refactoring
* add comment to tensor reshaping
2024-10-05 16:19:01 +02:00
g-prz
fe484726aa
Add falcon gguf ( #33437 )
...
* feat(gguf): add falcon q2 k
* fix(gguf): remove useless renaming
* feat(gguf): separate falcon 7b and 40b
* feat(gguf): apply fixup
* fix(test): error rebase
* feat(gguf): add fp16 weight comparison for falcon
* feat(gguf): test weight of all layers
* test(gguf): add falcon 40b under skip decorator
* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
Vladislav Bronzov
9d200cfbee
Add gguf support for bloom ( #33473 )
...
* add bloom arch support for gguf
* apply format
* small refactoring, bug fix in GGUF_TENSOR_MAPPING naming
* optimize bloom GGUF_TENSOR_MAPPING
* implement reverse reshaping for bloom gguf
* add qkv weights test
* add q_8 test for bloom
2024-09-27 12:13:40 +02:00
Alazar
96429e74a8
Add support for GGUF Phi-3 ( #31844 )
...
* Update docs for GGUF supported models
* Add tensor mappings and define class GGUFPhi3Converter
* Fix tokenizer
* Working version
* Attempt to fix some CI failures
* Run ruff format
* Add vocab, merges, decoder methods like LlamaConverter
* Resolve conflicts since Qwen2Moe was added to gguf
- I missed one place when resolving the conflict
- I also made a mistake with tests_ggml.py, which has now been fixed to reflect its master version.
2024-09-10 13:32:38 +02:00
Vladislav Bronzov
5d11de4a2f
Add Qwen2Moe GGUF loading support ( #33264 )
...
* update gguf doc, config and tensor mapping
* add qwen2moe architecture support, GGUFQwen2MoeConverter and q4 unit tests
* apply code style fixes
* reformat files
* assign GGUFQwen2Converter to qwen2_moe
2024-09-05 17:42:03 +02:00
Isotr0py
edeca4387c
🚨 Support dequantization for most GGML types ( #32625 )
...
* use gguf internal dequantize
* add Q5_0 test
* add iq1 test
* add remaining tests
* remove duplicated test
* update docs
* add gguf version limit
* make style
* update gguf import catch
* revert vocab_size patch
* make style
* use GGUF_MIN_VERSION everywhere
2024-09-03 12:58:14 +02:00
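The "gguf version limit" mentioned in the commit above amounts to comparing the installed `gguf` package version against a minimum. A minimal sketch of such a check is shown below, using only the standard library. The helper names are hypothetical and `ASSUMED_MIN_VERSION` is a placeholder value chosen for illustration; the commit message does not state the actual value of `GGUF_MIN_VERSION`.

```python
import re

# Placeholder only: the real GGUF_MIN_VERSION value is not given in the log.
ASSUMED_MIN_VERSION = "0.10.0"


def version_tuple(version):
    """Turn '0.10.0' (or '0.11.0rc1') into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum=ASSUMED_MIN_VERSION):
    """Return True if `installed` is at least `minimum` (numeric comparison)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so '0.10' and '0.10.0' compare as equal.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b
```

Comparing integer tuples rather than raw strings matters here: as strings, "0.9.1" would sort after "0.10.0".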
Penut Chen
1c122a46dc
Support dequantizing GGUF FP16 format ( #31783 )
...
* support gguf fp16
* support gguf bf16 with pytorch
* add gguf f16 test
* remove bf16
2024-07-24 17:59:59 +02:00
Ita Zaporozhets
a1844a3209
gguf conversion add_prefix_space=None for llama3 ( #31937 )
...
* gguf conversion forced add_prefix_space=False for llama3; this is not required and forces from_slow, which fails. Changed to None and added a test
* typo
* clean test
2024-07-23 11:45:54 +02:00
Penut Chen
ac946aac25
Fix the incorrect permutation of gguf ( #31788 )
...
* Fix the incorrect permutation of gguf
* rename num_kv_heads
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* add typing to num_kv_heads
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* rename variables
* refactor permute function name
* update the expected text of the llama3 q4 test
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2024-07-16 08:20:34 +02:00
Younes Belkada
6d4306160a
GGUF: Fix llama 3 GGUF ( #31358 )
...
* Create push-important-models.yml
* llama3 support for GGUF
* fixup
* Update src/transformers/integrations/ggml.py
* fix pre-tokenizer
* fix
* fix
* fix
* fix
* fix
* fix
* address final comment
* handle special tokens + add tests
2024-06-20 14:29:58 +02:00
Albert Villanova del Moral
a14b055b65
Pass datasets trust_remote_code ( #31406 )
...
* Pass datasets trust_remote_code
* Pass trust_remote_code in more tests
* Add trust_remote_dataset_code arg to some tests
* Revert "Temporarily pin datasets upper version to fix CI"
This reverts commit b7672826ca .
* Pass trust_remote_code in librispeech_asr_dummy docstrings
* Revert "Pin datasets<2.20.0 for examples"
This reverts commit 833fc17a3e .
* Pass trust_remote_code to all examples
* Revert "Add trust_remote_dataset_code arg to some tests" to research_projects
* Pass trust_remote_code to tests
* Pass trust_remote_code to docstrings
* Fix flax examples tests requirements
* Pass trust_remote_dataset_code arg to tests
* Replace trust_remote_dataset_code with trust_remote_code in one example
* Fix duplicate trust_remote_code
* Replace args.trust_remote_dataset_code with args.trust_remote_code
* Replace trust_remote_dataset_code with trust_remote_code in parser
* Replace trust_remote_dataset_code with trust_remote_code in dataclasses
* Replace trust_remote_dataset_code with trust_remote_code arg
2024-06-17 17:29:13 +01:00
Isotr0py
e4628434d8
Add Qwen2 GGUF loading support ( #31175 )
...
* add qwen2 gguf support
* Update docs
* fix qwen2 tokenizer
* add qwen2 gguf test
* fix typo in qwen2 gguf test
* format code
* Remove mistral, clarify the error message
* format code
* add typing and update docstring
2024-06-03 14:55:10 +01:00
Lysandre Debut
a42844955f
Loading GGUF files support ( #30391 )
...
* Adds support for loading GGUF files
Co-authored-by: Younes Belkada <younesbelkada@gmail.com >
Co-authored-by: 99991 <99991@users.noreply.github.com >
* add q2_k q3_k q5_k support from @99991
* fix tests
* Update doc
* Style
* Docs
* fix CI
* Update docs/source/en/gguf.md
* Update docs/source/en/gguf.md
* Compute merges
* change logic
* add comment for clarity
* add comment for clarity
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* change logic
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* change
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/modeling_gguf_pytorch_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* put back comment
* add comment about mistral
* comments and added tests
* fix inconsistent type
* more
* fix tokenizer
* Update src/transformers/modeling_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* address comments about tests and tokenizer + add added_tokens
* from_gguf -> gguf_file
* replace on docs too
---------
Co-authored-by: Younes Belkada <younesbelkada@gmail.com >
Co-authored-by: 99991 <99991@users.noreply.github.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-05-15 14:28:20 +02:00
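The `gguf_file` argument named in the commit above ("from_gguf -> gguf_file") is passed to `from_pretrained`. A short sketch of how it is typically used follows; the wrapper name `load_from_gguf` is hypothetical, and running it assumes `transformers`, `torch`, and the `gguf` package are installed and that the repository contains the named `.gguf` file.

```python
def load_from_gguf(repo_id, gguf_filename):
    """Load a tokenizer and a causal LM from a GGUF file (illustrative wrapper).

    `repo_id` is a Hub repository id or local directory, and `gguf_filename`
    is the name of a .gguf file inside it. The weights are dequantized on
    load, so the returned model is a regular PyTorch model.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_filename)
    model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_filename)
    return tokenizer, model
```

Both the tokenizer and the model take the same `gguf_file` value, since the tokenizer is also reconstructed from metadata stored in the GGUF file.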