Bart: new cache format (#35314)
* bart compile * add mbart * some more models touched by fix-copies * more * more models * even more models * fix copies * fix tests * fix copies * fix * biogpt accepts position ids now (breaking?) * fix failing non-slow tests * fix some tests * should not be removed * small update * Update src/transformers/models/bart/modeling_bart.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * update for last `main` * fix copies * clone `update_causal_mask` from llama * tmp * fixup * why? how? * fix bart tests * dont skip test * address comments * fix tests * fix * fixup and delete the file --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
This commit is contained in:
committed by
GitHub
parent
3ab47b6ce3
commit
01ad9f4b49
@@ -82,7 +82,7 @@ class M2M100ModelTester:
|
||||
attention_probs_dropout_prob=0.1,
|
||||
encoder_layerdrop=0.0,
|
||||
decoder_layerdrop=0.0,
|
||||
max_position_embeddings=20,
|
||||
max_position_embeddings=50,
|
||||
eos_token_id=2,
|
||||
pad_token_id=1,
|
||||
bos_token_id=0,
|
||||
@@ -426,7 +426,7 @@ class M2M100ModelIntegrationTests(unittest.TestCase):
|
||||
Overwriting the common test as the test is flaky on tiny models
|
||||
"""
|
||||
model = M2M100ForConditionalGeneration.from_pretrained(
|
||||
"facebook/m2m100_418M", attn_implementation="flash_attention_2"
|
||||
"facebook/m2m100_418M", torch_dtype=torch.float16, attn_implementation="flash_attention_2"
|
||||
).to(torch_device)
|
||||
|
||||
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M", src_lang="fr", tgt_lang="en")
|
||||
|
||||
Reference in New Issue
Block a user