cyyever
|
1e6b546ea6
|
Use Python 3.9 syntax in tests (#37343)
Signed-off-by: cyy <cyyever@outlook.com>
|
2025-04-08 14:12:08 +02:00 |
|
omahs
|
cbf924b76c
|
Fix typos (#36910)
* fix typos
* fix typos
* fix typos
* fix typos
|
2025-03-24 14:08:29 +00:00 |
|
Afanti
|
19b9d8ae13
|
chore: fix typos in tests directory (#36785)
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
* chore: fix typos in tests directory
|
2025-03-18 10:31:13 +01:00 |
|
Mohamed Mekkouri
|
0013ba61e5
|
Fix Failing GPTQ tests (#36666)
fix tests
|
2025-03-12 20:03:02 +01:00 |
|
jiqing-feng
|
387663e571
|
Enable gptqmodel (#35012)
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* gptqmodel need use checkpoint_format (#1)
* gptqmodel need use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* Revert quantizer_gptq.py (#2)
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Fix Transformer compat (#3)
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update gptqmodel version (#6)
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5)
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attibutes (#7)
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* review: update docs (#10)
* review: update docs (#12)
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* update document (#9)
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com>
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai>
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai>
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com>
Co-authored-by: LRL <lrl@lbx.dev>
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
|
2025-01-15 14:22:49 +01:00 |
|
Ella Charlaix
|
02300273e2
|
🚨 Remove dataset with restrictive license (#31452)
remove dataset with restrictive license
|
2024-06-17 17:56:51 +01:00 |
|
Marc Sun
|
7c8dd88d13
|
[GPTQ] Fix test (#28018)
* fix test
* reduce length
* smaller model
|
2024-01-15 11:22:54 -05:00 |
|
Marc Sun
|
c9e72f55b2
|
Add exllamav2 better (#27111)
* add_ xllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
* deprecate exllama
* remove disable_exllama from the linter
* remove
* fix warning
* Revert the commits deprecating exllama
* deprecate disable_exllama for use_exllama
* fix
* fix loading attribute
* better handling of args
* remove disable_exllama from init and linter
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* better arg
* fix warning
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* switch to dict
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* style
* nits
* style
* better tests
* style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
2023-11-01 13:09:21 -04:00 |
|
Arthur
|
90ee9cea19
|
Revert "add exllamav2 arg" (#27102)
Revert "add exllamav2 arg (#26437)"
This reverts commit 8214d6e7b1.
|
2023-10-27 11:23:06 +02:00 |
|
Marc Sun
|
8214d6e7b1
|
add exllamav2 arg (#26437)
* add_ xllamav2 arg
* add test
* style
* add check
* add doc
* replace by use_exllama_v2
* fix tests
* fix doc
* style
* better condition
* fix logic
* add deprecate msg
|
2023-10-26 10:15:05 -04:00 |
|
Younes Belkada
|
fd6a0ade9b
|
🚨🚨🚨 [Quantization] Store the original dtype in the config as a private attribute 🚨🚨🚨 (#26761)
* First step
* fix
* add adjustements for gptq
* change to `_pre_quantization_dtype`
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix serialization
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2023-10-16 19:56:53 +02:00 |
|
Heinz-Alexander Fuetterer
|
883ed4b344
|
chore: fix typos (#26756)
|
2023-10-12 18:00:27 +02:00 |
|
Marc Sun
|
fa6107c97e
|
modify context length for GPTQ + version bump (#25899)
Release - Conda / build_and_package (push) Has been cancelled
* add new arg for gptq
* add tests
* add min version autogptq
* fix order
* skip test
* fix
* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix style
* change model path
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
|
2023-09-06 11:45:47 -04:00 |
|
Younes Belkada
|
584eeb5387
|
[AutoGPTQ] Add correct installation of GPTQ library + fix slow tests (#25713)
* add correct installation of GPTQ library
* update tests values
|
2023-08-24 14:57:16 +02:00 |
|
Marc Sun
|
55db70c63d
|
GPTQ integration (#25062)
* GTPQ integration
* Add tests for gptq
* support for more quantization model
* fix style
* typo
* fix method
* Update src/transformers/modeling_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add dataclass and fix quantization_method
* fix doc
* Update tests/quantization/gptq/test_gptq.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* modify dataclass
* add gtpqconfig import
* fix typo
* fix tests
* remove dataset as req arg
* remove tokenizer import
* add offload cpu quantization test
* fix check dataset
* modify dockerfile
* protect trainer
* style
* test for config
* add more log
* overwrite torch_dtype
* draft doc
* modify quantization_config docstring
* fix class name in docstring
* Apply suggestions from code review
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* more warning
* fix 8bit kwargs tests
* peft compatibility
* remove var
* fix is_gptq_quantized
* remove is_gptq_quantized
* fix wrap
* Update src/transformers/modeling_utils.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add exllama
* skip test
* overwrite float16
* style
* fix skip test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix docsting formatting
* add doc
* better test
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
|
2023-08-10 16:06:29 -04:00 |
|