Iterative generation using Input embeds and past_key_values (#35890)

* Iterative generation using input embeds

* ruff fix

* Added Testcase

* Updated comment

* ♻️ Refactored testcase

* Skip test for these models

* Continue generation using input embeds and cache

* Skip generate_continue_from_embeds test

* Refactor `prepare_input_for_generation` func

* Continue generation using input embeds and cache

* Modular changes fix

* Overwrite 'prepare_inputs_for_generation' function
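
The "Skip test for these models" steps above rely on the standard `unittest` pattern of overriding an inherited test with a skipped stub. A minimal sketch follows, using only the standard library. The class names `BaseModelTest` and `HybridCacheModelTest`, the helper `run_suite`, and the skip reason are hypothetical stand-ins and are not taken from the repository.

```python
import unittest


class BaseModelTest(unittest.TestCase):
    """Hypothetical stand-in for a shared test class that model tests inherit from."""

    def test_generate_continue_from_inputs_embeds(self):
        # Placeholder body; the real test would exercise generation.
        self.assertTrue(True)


class HybridCacheModelTest(BaseModelTest):
    """Hypothetical subclass for a model that cannot support the inherited test."""

    @unittest.skip("Hypothetical reason: this cache type does not support the test.")
    def test_generate_continue_from_inputs_embeds(self):
        # The override replaces the inherited test, so only the skip is recorded.
        pass


def run_suite(case):
    """Run every test in a TestCase class and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Running `run_suite(HybridCacheModelTest)` records one skipped test and no failures, while `run_suite(BaseModelTest)` still executes the inherited test normally.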
Yaswanth Gali
2025-02-06 15:36:05 +05:30
committed by GitHub
parent b5f327f350
commit 7aee036e54
18 changed files with 276 additions and 34 deletions

@@ -131,6 +131,10 @@ class Cohere2ModelTest(CohereModelTest, unittest.TestCase):
    def test_generate_from_inputs_embeds_with_static_cache(self):
        pass

    @unittest.skip("Cohere2 has HybridCache and doesn't support progressive generation using input embeds.")
    def test_generate_continue_from_inputs_embeds(self):
        pass

    # overwrite because HybridCache has fixed length for key/values
    def _check_attentions_for_generate(
        self, batch_size, attentions, min_length, max_length, config, use_cache=False, num_beam_groups=1