Fix convert_and_export_with_cache failures for GPU models (#38976)
* Add the `device` option for `generate()`
* Add device for default tensors to avoid tensor mismatch
* [test] Enable test_static_cache_exportability for torch_device
* infer device from the prompt_token_ids
* Add device for generated tensor
* [Test] Make `test_export_static_cache` tests to run on devices rather than only CPU
* fix format
* infer device from the model
@@ -384,7 +384,7 @@ class Phi3IntegrationTest(unittest.TestCase):
 config.rope_scaling["type"] = "default"

 # Load model
-device = "cpu"
+device = torch_device
 dtype = torch.bfloat16
 cache_implementation = "static"
 attn_implementation = "sdpa"
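The commit message mentions inferring the device from the model and creating default and generated tensors on that device to avoid a tensor mismatch. The following is a minimal sketch of that idea, not the code from this commit: `TinyLM`, `infer_device`, and `greedy_generate` are hypothetical names introduced for illustration, and the toy model stands in for an exported module.

```python
import torch
from torch import nn


class TinyLM(nn.Module):
    """Hypothetical toy language model used only for this illustration."""

    def __init__(self, vocab_size: int = 16, hidden: int = 8, max_positions: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.pos = nn.Embedding(max_positions, hidden)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, input_ids: torch.Tensor, cache_position: torch.Tensor) -> torch.Tensor:
        # (batch, seq, hidden) + (seq, hidden) broadcasts over the batch dimension.
        hidden = self.embed(input_ids) + self.pos(cache_position)
        return self.head(hidden)


def infer_device(model: nn.Module) -> torch.device:
    """Return the device of the model's first parameter, or CPU if it has none."""
    first_param = next(model.parameters(), None)
    return first_param.device if first_param is not None else torch.device("cpu")


@torch.no_grad()
def greedy_generate(model: nn.Module, prompt_token_ids: torch.Tensor, max_new_tokens: int) -> torch.Tensor:
    """Greedy decoding where every helper tensor lives on the model's device."""
    device = infer_device(model)
    tokens = prompt_token_ids.to(device)  # move the prompt next to the weights
    for _ in range(max_new_tokens):
        # Created directly on `device`; a CPU default here would mismatch a GPU model.
        cache_position = torch.arange(tokens.shape[-1], device=device)
        logits = model(tokens, cache_position)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_token], dim=-1)
    return tokens


model = TinyLM()
output = greedy_generate(model, torch.tensor([[1, 2, 3]]), max_new_tokens=4)
```

With a model moved to an accelerator (for example via `model.to(...)`), the same function would keep the prompt, the position indices, and the generated tokens on that accelerator without the caller passing a device explicitly.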