[generate] shape checks in tests compatible with fixed-length caches (+ some minor fixes) (#35993)

* shape checks compatible with static cache * add test * tmp * manually turn on eager attn when we want to output attn * typo * generalize to encoder-decoder models * force compilation on cpu * tmp commit * fix static cache shape checks * models with odd caches * fix copies * shorter cache search loop * use decoder_past_key_values everywhere * better test variable names and comments * signature * rename _check_outputs into _check_generate_outputs * add comments * HybridCache future test note
2025-02-10 17:50:54 +00:00
parent 9510ae39d9
commit be2ac0916a
25 changed files with 379 additions and 917 deletions
--- a/tests/models/led/test_modeling_led.py
+++ b/tests/models/led/test_modeling_led.py
@@ -468,12 +468,12 @@ class LEDModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin,
                ],
            )

-    def _check_encoder_attention_for_generate(self, attentions, batch_size, config, seq_length):
+    def _check_encoder_attention_for_generate(self, attentions, batch_size, config, prompt_length):
        # overwrite because LED does not have (bs, num_heads, seq_len, seq_len) shape
        encoder_expected_shape = (
            batch_size,
            config.num_attention_heads,
-            seq_length,
+            prompt_length,
            self.model_tester.attention_window // 2 * 2 + 1,
        )
        self.assertIsInstance(attentions, tuple)