[VLMs] support attention backends (#37576)
* update models
* why rename
* return attn weights when sdpa
* fixes
* fix attn implementation composite
* fix moshi
* add message
* add typings
* use explicitly all flags for each attn type
* fix some tests
* import what is needed
* kosmos on main has new attention already, yay
* new models in main, run fixup
* won't fix kosmos yet
* fix-copies
* clean up after rebasing
* fix tests
* style
* dont cast attns to fp32
* did we update ruff? oke, let's just do what it asks
* fix pixtral after rebase
committed by GitHub
parent e296c63cd4
commit d23aae2b8c
@@ -219,9 +219,10 @@ class OPTModelTest(ModelTesterMixin, GenerationTesterMixin, PipelineTesterMixin,
         else {}
     )
     is_encoder_decoder = False
-    fx_compatible = True
+    fx_compatible = False  # Broken by attention refactor cc @Cyrilvallez
     test_pruning = False
     test_missing_keys = False
+    test_head_masking = False  # new attn API doesn't support head mask
 
     # TODO: Fix the failed tests
     def is_pipeline_test_to_skip(