Yao Matrix
a5a0c7b888
switch to device agnostic device calling for test cases ( #38247 )
...
* use device agnostic APIs in test cases
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* add one more
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* xpu now supports integer device id, aligning to CUDA behaviors
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* update to use device_properties
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* update comment
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* fix comments
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
* fix style
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
---------
Signed-off-by: Matrix Yao <matrix.yao@intel.com >
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-05-26 10:18:53 +02:00
Titus
f022bf9322
Remove trust_remote_code=True tests from bnb quantization tests (MPT now integrated) ( #38206 )
...
bnb quant tests: remove obsolete trust_remote_code test
The MPT model is now natively integrated in Transformers and no longer requires trust_remote_code=True. This removes the failing test_get_keys_to_not_convert_trust_remote_code and related usage, which depended on remote code and caused CI issues due to missing dependencies (e.g., triton_pre_mlir).
2025-05-20 11:43:11 +02:00
jiqing-feng
d231f5a7d4
update bnb tests ( #38011 )
...
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
2025-05-08 20:35:24 +00:00
co63oc
d5fa7d2d19
Fix typos in strings and comments ( #37799 )
2025-04-28 11:39:11 +01:00
cyyever
1e6b546ea6
Use Python 3.9 syntax in tests ( #37343 )
...
Signed-off-by: cyy <cyyever@outlook.com >
2025-04-08 14:12:08 +02:00
jiqing-feng
3a6ab46a0b
add gpt2 test on XPU ( #37028 )
...
* add gpt2 test on XPU
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* auto dtype has been fixed
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* convert model to train mode
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
2025-04-01 11:09:29 +02:00
Mohamed Mekkouri
a861db01e5
Fix Device map for bitsandbytes tests ( #36800 )
...
fix
2025-03-19 11:57:13 +01:00
Afanti
19b9d8ae13
chore: fix typos in tests directory ( #36785 )
...
* chore: fix typos in tests directory
2025-03-18 10:31:13 +01:00
Marc Sun
9e94801146
enable/disable compile for quants methods ( #36519 )
...
* disable compile for most quants methods
* fix
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
* Update tests/quantization/bnb/test_mixed_int8.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
* Update src/transformers/generation/configuration_utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
* changes from joao suggestions
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
2025-03-17 11:38:21 +01:00
Dmitry Rogozhkin
b4b9da6d9b
tests: revert change of require_torch_multi_gpu to be device agnostic ( #35721 )
...
* tests: revert change of require_torch_multi_gpu to be device agnostic
Commit 11c27dd33 modified `require_torch_multi_gpu()` to be device agnostic
instead of CUDA specific. This broke some tests that are rightfully
CUDA specific, such as:
* `tests/trainer/test_trainer_distributed.py::TestTrainerDistributed`
In the current Transformers test architecture, `require_torch_multi_accelerator()`
should be used to mark multi-GPU tests that are device agnostic.
This change addresses the issue introduced by 11c27dd33 and reverts the
modification of `require_torch_multi_gpu()`.
Fixes: 11c27dd33 ("Enable BNB multi-backend support (#31098)")
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com >
* fix bug: modification of frozen set
---------
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com >
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com >
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-02-25 13:36:10 +01:00
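The distinction drawn in the b4b9da6d9b entry above, a CUDA-specific multi-GPU marker versus a device-agnostic multi-accelerator marker, can be sketched with plain `unittest`. This is a minimal illustration, not the Transformers implementation: `DEVICE_COUNTS`, `require_multi_gpu_sketch`, and `require_multi_accelerator_sketch` are hypothetical names, and the device inventory is hard-coded instead of being queried from a real backend.

```python
import unittest

# Hypothetical device inventory standing in for real backend queries.
# Here: no CUDA devices, two XPU devices.
DEVICE_COUNTS = {"cuda": 0, "xpu": 2}


def require_multi_gpu_sketch(test_case):
    """CUDA specific: skip unless at least two CUDA devices are present."""
    return unittest.skipUnless(
        DEVICE_COUNTS.get("cuda", 0) > 1, "test requires multiple CUDA GPUs"
    )(test_case)


def require_multi_accelerator_sketch(test_case):
    """Device agnostic: skip unless some backend exposes at least two devices."""
    return unittest.skipUnless(
        max(DEVICE_COUNTS.values(), default=0) > 1,
        "test requires multiple accelerators",
    )(test_case)


class ExampleTests(unittest.TestCase):
    @require_multi_gpu_sketch
    def test_cuda_only(self):
        self.assertTrue(True)

    @require_multi_accelerator_sketch
    def test_any_backend(self):
        self.assertTrue(True)


def run_examples():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With the inventory above, the CUDA-specific test is skipped while the device-agnostic one runs, which is why a test that genuinely depends on CUDA should keep the narrower marker.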
Fanli Lin
4dbf17c17f
[tests] enable bnb tests on xpu ( #36233 )
...
* fix failed test
* fix device
* fix more device cases
* add more cases
* fix empty cache
* Update test_4bit.py
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-02-24 11:30:15 +01:00
Arthur
b912f5ee43
use torch.testing.assert_close instead to get more details about errors in CI ( #35659 )
...
* use torch.testing.assert_close instead to get more details about errors in CI
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
2025-01-24 16:55:28 +01:00
Matthew Douglas
6b1e86fd4d
Fix new BNB test failures ( #35345 )
2025-01-02 11:24:52 +01:00
jiqing-feng
69e31eb1bf
change bnb tests ( #34713 )
...
* fix training tests
* fix xpu check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* rm pdb
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix 4bit logits check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix 4bit logits check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* add xpu check on int8 training
* fix training tests
* add llama test on bnb
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* only cpu and xpu disable autocast training
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com >
2024-12-18 09:49:59 -05:00
Matthew Douglas
34f4080ff5
[CI] Fix bnb quantization tests with accelerate>=1.2.0 ( #35172 )
2024-12-09 13:55:16 -05:00
Matthew Douglas
e447185b1f
Fix bnb training test failure ( #34414 )
...
* Fix bnb training test: compatibility with OPTSdpaAttention
2024-10-25 10:23:20 -04:00
jiqing-feng
11c27dd331
Enable BNB multi-backend support ( #31098 )
...
* enable cpu bnb path
* fix style
* fix code style
* fix 4 bit path
* Update src/transformers/utils/import_utils.py
Co-authored-by: Aarni Koskela <akx@iki.fi >
* add multi backend refactor tests
* fix style
* tweak 4bit quantizer + fix corresponding tests
* tweak 8bit quantizer + *try* fixing corresponding tests
* fix dequant bnb 8bit
* account for Intel CPU in variability of expected outputs
* enable cpu and xpu device map
* further tweaks to account for Intel CPU
* fix autocast to work with both cpu + cuda
* fix comments
* fix comments
* switch to testing_utils.torch_device
* allow for xpu in multi-gpu tests
* fix tests 4bit for CPU NF4
* fix bug with is_torch_xpu_available needing to be called as func
* avoid issue where test reports attr err due to other failure
* fix formatting
* fix typo from resolving of merge conflict
* polish based on last PR review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* fix CI
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix error log
* fix error msg
* add \n in error log
* make quality
* rm bnb cuda restriction in doc
* cpu model don't need dispatch
* fix doc
* fix style
* check cuda available in testing
* fix tests
* Update docs/source/en/model_doc/chameleon.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update docs/source/en/model_doc/llava_next.md
Co-authored-by: Aarni Koskela <akx@iki.fi >
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Aarni Koskela <akx@iki.fi >
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Aarni Koskela <akx@iki.fi >
* fix doc
* fix check multibackends
* fix import sort
* remove check torch in bnb
* docs: update bitsandbytes references with multi-backend info
* docs: fix small mistakes in bnb paragraph
* run formatting
* revert bnb check
* move bnb multi-backend check to import_utils
* Update src/transformers/utils/import_utils.py
Co-authored-by: Aarni Koskela <akx@iki.fi >
* fix bnb check
* minor fix for bnb
* check lib first
* fix code style
* Revert "run formatting"
This reverts commit ac108c6d6b34f45a5745a736ba57282405cfaa61.
* fix format
* give warning when bnb version is low and no cuda found
* fix device assignment check to be multi-device capable
* address akx feedback on get_avlbl_dev fn
* revert partially, as we don't want the function that public, as docs would be too much (enforced)
---------
Co-authored-by: Aarni Koskela <akx@iki.fi >
Co-authored-by: Titus von Koeller <9048635+Titus-von-Koeller@users.noreply.github.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2024-09-24 03:40:56 -06:00
Marc Sun
9ea1eacd11
remove `.to` restriction for 4-bit model ( #33122 )
...
* remove `.to` restriction for 4-bit model
* Update src/transformers/modeling_utils.py
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
* bitsandbytes: prevent dtype casting while allowing device movement with .to or .cuda
* quality fix
* Improve warning message for .to() and .cuda() on bnb quantized models
---------
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com >
2024-09-02 16:28:50 +02:00
amyeroberts
1de7dc7403
Skip tests properly ( #31308 )
...
* Skip tests properly
* [test_all]
* Add 'reason' as kwarg for skipTest
* [test_all] Fix up
* [test_all]
2024-06-26 21:59:08 +01:00
Younes Belkada
5e5c4d629d
FIX / Quantization: Add extra validation for bnb config ( #31135 )
...
add validation for bnb config
2024-05-30 11:45:03 +02:00
Younes Belkada
3f435823e0
FEAT / Bitsandbytes: Add dequantize API for bitsandbytes quantized models ( #30806 )
...
* add method
* change method name
* more comments
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* fixup
* add docstrings and fix comment
* warn users on the de-quantized dtype
* Update src/transformers/quantizers/base.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/integrations/bitsandbytes.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* final suggestion - use private method
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-05-15 17:17:09 +02:00
Marc Sun
4207a4076d
[bnb] Fix offload test ( #30039 )
...
fix bnb test
2024-04-05 13:11:28 +02:00
Younes Belkada
ff76e7c212
FIX [bnb / tests] Propagate the changes from #29092 to 4-bit tests ( #29122 )
...
* forgot to push the changes for 4bit ..
* trigger CI
2024-02-20 11:11:15 +01:00
Titus
5ce90f3212
Bnb test fix for different hardwares ( #29066 )
...
* generated text on A10G
* generated text in CI
* Apply suggestions from code review
add explanatory comments
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2024-02-19 18:04:44 +00:00
Younes Belkada
a75a6c9315
FIX [bnb / tests]: Fix currently failing bnb tests ( #29092 )
...
Update test_mixed_int8.py
2024-02-19 10:39:12 +01:00
Lysandre Debut
f497f564bb
Update all references to canonical models ( #29001 )
...
* Script & Manual edition
* Update
2024-02-16 08:16:58 +01:00
Klaus Hipp
fe3df9d5b3
[Docs] Add language identifiers to fenced code blocks ( #28955 )
...
Add language identifiers to code blocks
2024-02-12 10:48:31 -08:00
Poedator
d78e78a0e4
HfQuantizer class for quantization-related stuff in modeling_utils.py (#26610 )
...
* squashed earlier commits for easier rebase
* rm rebase leftovers
* 4bit save enabled @quantizers
* TMP gptq test use exllama
* fix AwqConfigTest::test_wrong_backend for A100
* quantizers AWQ fixes
* _load_pretrained_model low_cpu_mem_usage branch
* quantizers style
* remove require_low_cpu_mem_usage attr
* rm dtype arg from process_model_before_weight_loading
* rm config_origin from Q-config
* rm inspect from q_config
* fixed docstrings in QuantizationConfigParser
* logger.warning fix
* mv is_loaded_in_4(8)bit to BnbHFQuantizer
* is_accelerate_available error msg fix in quantizer
* split is_model_trainable in bnb quantizer class
* rm llm_int8_skip_modules as separate var in Q
* Q rm todo
* fwd ref to HFQuantizer in type hint
* rm note re optimum.gptq.GPTQQuantizer
* quantization_config in __init__ simplified
* replaced NonImplemented with create_quantized_param
* rm load_in_4/8_bit deprecation warning
* QuantizationConfigParser refactoring
* awq-related minor changes
* awq-related changes
* awq config.modules_to_not_convert
* raise error if no q-method in q-config in args
* minor cleanup
* awq quantizer docstring
* combine common parts in bnb process_model_before_weight_loading
* revert test_gptq
* .process_model_ cleanup
* restore dict config warning
* removed typevars in quantizers.py
* cleanup post-rebase 16 jan
* QuantizationConfigParser classmethod refactor
* rework of handling of unexpected aux elements of bnb weights
* moved q-related stuff from save_pretrained to quantizers
* refactor v1
* more changes
* fix some tests
* remove it from main init
* ooops
* Apply suggestions from code review
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* fix awq issues
* fix
* fix
* fix
* fix
* fix
* fix
* add docs
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update docs/source/en/hf_quantizer.md
* address comments
* fix
* fixup
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* address final comment
* update
* Update src/transformers/quantizers/base.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/quantizers/auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix
* add kwargs update
* fixup
* add `optimum_quantizer` attribute
* oops
* rm unneeded file
* fix doctests
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2024-01-30 02:48:25 +01:00
Omar Sanseviero
a989c6c6eb
Don't allow passing load_in_8bit and load_in_4bit at the same time ( #28266 )
...
* Update quantization_config.py
* Style
* Protect from setting directly
* add tests
* Update tests/quantization/bnb/test_4bit.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2024-01-30 01:43:40 +01:00
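The rule introduced in the a989c6c6eb entry above (the two flags are mutually exclusive, both at construction time and when set directly) can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the code from `quantization_config.py`: the class name `QuantFlagsSketch`, the helper `_check_exclusive`, and the error message are hypothetical.

```python
class QuantFlagsSketch:
    """Hypothetical config holding two mutually exclusive quantization flags."""

    def __init__(self, load_in_8bit=False, load_in_4bit=False):
        self._check_exclusive(load_in_8bit, load_in_4bit)
        self._load_in_8bit = load_in_8bit
        self._load_in_4bit = load_in_4bit

    @staticmethod
    def _check_exclusive(in_8bit, in_4bit):
        # Reject the combination that the commit title forbids.
        if in_8bit and in_4bit:
            raise ValueError("load_in_4bit and load_in_8bit cannot both be True")

    @property
    def load_in_8bit(self):
        return self._load_in_8bit

    @load_in_8bit.setter
    def load_in_8bit(self, value):
        # Validate on direct assignment as well as in __init__.
        self._check_exclusive(value, self._load_in_4bit)
        self._load_in_8bit = value

    @property
    def load_in_4bit(self):
        return self._load_in_4bit

    @load_in_4bit.setter
    def load_in_4bit(self, value):
        self._check_exclusive(self._load_in_8bit, value)
        self._load_in_4bit = value
```

Routing both the constructor and the setters through one check keeps the two entry points from drifting apart.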
Poedator
4f7806ef7e
[bnb] Let's make serialization of 4bit models possible ( #26037 )
...
* updated bitsandbytes.py
* rm test_raise_* from test_4bit.py
* add test_4bit_serialization.py
* modeling_utils bulk edits
* bnb_ver 0.41.3 in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* @slow reinstated
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* bnb ver 0.41.3 in src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* rm bnb version todo in integrations/bitsandbytes.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* moved 4b serialization tests to test_4bit
* tests upd for opt
* to torch_device
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* ruff fixes to tests
* rm redundant bnb version check in mod_utils
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* restore _hf_peft_config_loaded modeling_utils.py::2188
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* restore _hf_peft_config_loaded test in modeling_utils.py::2199
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* fixed NOT getattr(self, "is_8bit_serializable")
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* setting model.is_4bit_serializable
* rm separate fp16_statistics arg from set_module...
* rm else branch in integrations::bnb::set_module
* bnb 4bit dtype check
* upd comment on 4bit weights
* upd tests for FP4 safe
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2023-12-21 11:54:44 +01:00
Younes Belkada
9b25c164bd
[core / Quantization] Fix for 8bit serialization tests ( #27234 )
...
* fix for 8bit serialization
* added regression tests.
* fixup
2023-11-02 12:03:51 +01:00
Younes Belkada
4bb50aa212
[Quantization / tests ] Fix bnb MPT test ( #27178 )
...
fix bnb mpt test
2023-10-31 16:25:53 +01:00
Younes Belkada
6b466771b0
[tests / Quantization] Fix bnb test ( #27145 )
...
* fix bnb test
* link to GH issue
2023-10-30 15:43:08 +01:00
Younes Belkada
fd6a0ade9b
🚨 🚨 🚨 [Quantization] Store the original dtype in the config as a private attribute 🚨 🚨 🚨 ( #26761 )
...
* First step
* fix
* add adjustments for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix serialization
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-10-16 19:56:53 +02:00
Younes Belkada
2aef9a9601
[PEFT] Final fixes ( #26559 )
...
* fix issues with PEFT
* logger warning futurewarning issues
* fixup
* adapt from suggestions
* oops
* rm test
2023-10-03 14:53:09 +02:00
Younes Belkada
6824461f2a
[core/ auto ] Fix bnb test with code revision + bug with code revision ( #26431 )
...
* fix bnb test with code revision
* fix test
* Apply suggestions from code review
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
2023-10-02 11:35:07 +02:00
Younes Belkada
7ccac73f74
[RWKV] Final fix RWKV 4bit ( #26134 )
...
* Final fix RWKV 4bit
* fixup
* add a test
* add more clarifications
2023-09-13 16:30:20 +02:00
Younes Belkada
c8b26096d4
[core] fix 4bit num_parameters ( #26132 )
...
* fix 4bit `num_parameters`
* stronger check
2023-09-13 14:12:35 +02:00
Younes Belkada
4b79697865
🚨 🚨 🚨 [Refactor] Move third-party related utility files into integrations/ folder 🚨 🚨 🚨 ( #25599 )
...
* move deepspeed to `lib_integrations.deepspeed`
* more refactor
* oops
* fix slow tests
* Fix docs
* fix docs
* address feedback
* address feedback
* final modifs for PEFT
* fixup
* ok now
* trigger CI
* trigger CI again
* Update docs/source/en/main_classes/deepspeed.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* import from `integrations`
* address feedback
* revert removal of `deepspeed` module
* revert removal of `deepspeed` module
* fix conflicts
* ooops
* oops
* add deprecation warning
* place it on the top
* put `FutureWarning`
* fix conflicts with not_doctested.txt
* add back `bitsandbytes` module with a depr warning
* fix
* fix
* fixup
* oops
* fix doctests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-08-25 17:13:34 +02:00
Younes Belkada
e7e9261a20
[Docs] Fix un-rendered images ( #25561 )
...
fix un-rendered images
2023-08-17 12:08:11 +02:00
Marc Sun
55db70c63d
GPTQ integration ( #25062 )
...
* GPTQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* modify dataclass
* add GPTQConfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* fix docstring formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2023-08-10 16:06:29 -04:00