Paligemma causal attention mask (#30967)

* PaliGemma working causal attention * Formatting * Style * Docstrings + remove commented code * Update docstring for PaliGemma Config * PaliGemma - add separator ind to model/labels * Refactor + docstring paligemma processor method * Style * return token type ids when tokenizing labels * use token type ids when building causal mask * add token type ids to tester * remove separator from config * fix style * don't ignore separator * add processor documentation * simplify tokenization * fix causal mask * style * fix label propagation, revert suffix naming * fix style * fix labels tokenization * [run-slow]paligemma * add eos if suffixes are present * [run-slow]paligemma * [run-slow]paligemma * add misssing tokens to fast version * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style * [run-slow]paligemma --------- Co-authored-by: Peter Robicheaux <peter@roboflow.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-05-22 19:37:15 +02:00
parent d44e1ae036
commit a25f7d3c12
3 changed files with 111 additions and 45 deletions
--- a/tests/models/paligemma/test_modeling_paligemma.py
+++ b/tests/models/paligemma/test_modeling_paligemma.py
@@ -163,6 +163,8 @@ class PaliGemmaVisionText2TextModelTester:
            "pixel_values": pixel_values,
            "input_ids": input_ids,
            "attention_mask": attention_mask,
+            "labels": input_ids,
+            "token_type_ids": torch.zeros_like(input_ids),
        }
        return config, inputs_dict