Add a static cache that offloads to the CPU or other device (#32161)
* Add a static cache that offloads to the CPU or other device
* Fix PR comments, add unit-tests
@@ -390,6 +390,11 @@ A [`Constraint`] can be used to force the generation to include specific tokens
     - get_seq_length
     - reset
 
+[[autodoc]] OffloadedStaticCache
+    - update
+    - get_seq_length
+    - reset
+
 [[autodoc]] HybridCache
     - update
     - get_seq_length
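The documented `OffloadedStaticCache` methods (`update`, `get_seq_length`, `reset`) can be illustrated with a toy, pure-Python sketch. This is not the implementation from the commit: the class name `ToyOffloadedStaticCache`, its constructor arguments, the `offloaded`/`resident` dictionaries, and the `update` signature are all hypothetical, and plain lists stand in for tensors and devices. It only shows the general idea of fixed-size per-layer buffers kept in offload storage, with one layer brought in when it is updated.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Buffer = List[Optional[float]]


class ToyOffloadedStaticCache:
    """Hypothetical toy model; lists stand in for tensors and devices."""

    def __init__(self, num_layers: int, max_cache_len: int) -> None:
        self.max_cache_len = max_cache_len
        # Preallocated (key, value) buffers, all starting in "offload" storage.
        self.offloaded: Dict[int, Tuple[Buffer, Buffer]] = {
            layer: ([None] * max_cache_len, [None] * max_cache_len)
            for layer in range(num_layers)
        }
        # At most one layer is "resident" on the compute device at a time.
        self.resident: Dict[int, Tuple[Buffer, Buffer]] = {}

    def _fetch(self, layer_idx: int) -> Tuple[Buffer, Buffer]:
        if layer_idx not in self.resident:
            # Evict the resident layer (if any), then bring in the requested one.
            self.offloaded.update(self.resident)
            self.resident = {layer_idx: self.offloaded.pop(layer_idx)}
        return self.resident[layer_idx]

    def update(
        self,
        keys: Sequence[float],
        values: Sequence[float],
        layer_idx: int,
        cache_position: Sequence[int],
    ) -> Tuple[Buffer, Buffer]:
        """Write new entries at the given positions and return the full buffers."""
        k_buf, v_buf = self._fetch(layer_idx)
        for pos, key, value in zip(cache_position, keys, values):
            k_buf[pos] = key  # IndexError if pos >= max_cache_len (static size)
            v_buf[pos] = value
        return k_buf, v_buf

    def get_seq_length(self, layer_idx: int = 0) -> int:
        """Count the filled slots of one layer, wherever it currently lives."""
        if layer_idx in self.resident:
            k_buf, _ = self.resident[layer_idx]
        else:
            k_buf, _ = self.offloaded[layer_idx]
        return sum(slot is not None for slot in k_buf)

    def reset(self) -> None:
        """Clear every buffer in place without reallocating it."""
        all_buffers = list(self.offloaded.values()) + list(self.resident.values())
        for k_buf, v_buf in all_buffers:
            k_buf[:] = [None] * self.max_cache_len
            v_buf[:] = [None] * self.max_cache_len
```

In this sketch, calling `update` for a different layer evicts the previously resident one, so only one layer's buffers occupy the simulated device at any time; the real class may use a different policy.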