enable large_gpu and torchao cases on XPU (#38355)

* cohere2 done

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* enable torchao cases on XPU

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* rename

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: Matrix YAO <matrix.yao@intel.com>
This commit is contained in:
Yao Matrix
2025-05-28 16:30:16 +08:00
committed by GitHub
parent cea254c909
commit fb82a98717
3 changed files with 151 additions and 62 deletions

View File

@@ -143,9 +143,9 @@ class AutoRoundTest(unittest.TestCase):
self.assertIn(output_tokens, self.EXPECTED_OUTPUTS)
@require_torch_multi_accelerator
def test_quantized_model_multi_gpu(self):
def test_quantized_model_multi_accelerator(self):
"""
Simple test that checks if the quantized model is working properly with multiple GPUs
Simple test that checks if the quantized model is working properly with multiple accelerators
"""
input_ids = self.tokenizer(self.input_text, return_tensors="pt").to(torch_device)
quantization_config = AutoRoundConfig(backend="triton")