switch to device agnostic device calling for test cases (#38247)

* use device agnostic APIs in test cases

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* add one more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* xpu now supports integer device id, aligning to CUDA behaviors

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update to use device_properties

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* update comment

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix comments

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
This commit is contained in:
Yao Matrix
2025-05-26 16:18:53 +08:00
committed by GitHub
parent cba279f46c
commit a5a0c7b888
39 changed files with 259 additions and 389 deletions

View File

@@ -17,6 +17,7 @@ import unittest
from transformers import AutoModelForCausalLM, AutoTokenizer, HqqConfig
from transformers.testing_utils import (
backend_empty_cache,
require_accelerate,
require_hqq,
require_torch_gpu,
@@ -50,7 +51,7 @@ class HQQLLMRunner:
def cleanup():
torch.cuda.empty_cache()
backend_empty_cache(torch_device)
gc.collect()
@@ -187,7 +188,7 @@ class HQQTestBias(unittest.TestCase):
hqq_runner.model.save_pretrained(tmpdirname)
del hqq_runner.model
torch.cuda.empty_cache()
backend_empty_cache(torch_device)
model_loaded = AutoModelForCausalLM.from_pretrained(
tmpdirname, torch_dtype=torch.float16, device_map=torch_device
@@ -228,7 +229,7 @@ class HQQSerializationTest(unittest.TestCase):
# Remove old model
del hqq_runner.model
torch.cuda.empty_cache()
backend_empty_cache(torch_device)
# Load and check if the logits match
model_loaded = AutoModelForCausalLM.from_pretrained(