Test: generate with torch.compile(model.forward) as a fast test (#34544)

Joao Gante
2025-01-28 14:10:38 +00:00
committed by GitHub
parent f48ecd7608
commit ece8c42488
25 changed files with 105 additions and 53 deletions

@@ -349,7 +349,7 @@ In case you are using Sink Cache, you have to crop your inputs to that maximum l
>>> user_prompts = ["Hello, what's your name?", "Btw, yesterday I was on a rock concert."]
>>> past_key_values = DynamicCache()
->>> max_cache_length = past_key_values.get_max_length()
+>>> max_cache_length = past_key_values.get_max_cache_shape()
>>> messages = []
>>> for prompt in user_prompts: