Joao Gante
0863eef248
[tests] remove pt_tf equivalence tests ( #36253 )
2025-02-19 11:55:11 +00:00
Yoni Gozlan
e6a7981711
Fix make_batched_videos and add tests ( #36143 )
...
* add support for initial shift in video processing and other fixes
* revert modifications video loading functions
2025-02-13 17:14:30 -05:00
Arthur
b079dd1fa2
Fix red CI ( #36174 )
...
test was weird
2025-02-13 14:27:55 +01:00
Lucain
e60ae0d078
Replace deprecated update_repo_visibility ( #35970 )
2025-02-13 11:27:55 +01:00
Sambhav Dixit
d6897b46bd
Add utility for Reload Transformers imports cache for development workflow #35508 ( #35858 )
...
* Reload transformers fix from cache
* add imports
* add test fn for clearing import cache
* ruff fix to core import logic
* ruff fix to test file
* fixup for imports
* fixup for test
* lru restore
* test check
* fix style changes
* added documentation for usecase
* fixing
---------
Co-authored-by: sambhavnoobcoder <indosambahv@gmail.com >
2025-02-12 12:45:11 +01:00
Zach Mueller
1ce0e2992e
Nail in edge case of torch dtype being overridden permanently in the case of an error ( #35845 )
...
* Nail in edge case of torch dtype
* Rm unused func
* Apply suggestions from code review
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
* Refactor tests to only mock what we need, don't introduce injection functions
* SetUp/TearDown
* Do super
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
2025-02-06 09:05:23 -05:00
Marc Sun
9f486badd5
Display warning for unknown quants config instead of an error ( #35963 )
...
* add supports_quant_method check
* fix
* add test and fix suggestions
* change logic slightly
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-02-04 15:17:01 +01:00
Yoni Gozlan
d7188ba600
Add support for nested images to LLava and VipLLava ( #35558 )
...
* move make_flat_list_of_images and make_batched_videos to image_utils
* remove unnecessary is_vision_available
* move make_nested_list_of_images to image_utils
* fix fast pixtral image processor
* fix import mllama
* fix make_nested_list_of_images
* add tests
* convert 4d arrays/tensors to list
* add test_make_batched_videos
* add support nested batch of videos
* fix image processing qwen2vl
2025-01-30 16:49:20 -05:00
Joao Gante
ece8c42488
Test: generate with torch.compile(model.forward) as a fast test ( #34544 )
2025-01-28 14:10:38 +00:00
Raushan Turganbay
b764c20b09
Fix: loading DBRX back from saved path ( #35728 )
...
* fix dtype as dict for some models + add test
* add comment in tests
2025-01-28 11:38:45 +01:00
Arthur
b912f5ee43
use torch.testing.assert_close instead to get more details about errors in CIs ( #35659 )
...
* use torch.testing.assert_close instead to get more details about errors in CIs
* fix
* style
* test_all
* revert for I bert
* fixes and updates
* more image processing fixes
* more image processors
* fix mamba and co
* style
* less strict
* ok I won't be strict
* skip and be done
* up
2025-01-24 16:55:28 +01:00
Cyril Vallez
d3af76df58
[Backend support] Allow num_logits_to_keep as Tensor + add flag ( #35757 )
...
* support
* Update modeling_utils.py
* style
* most models
* Other models
* fix-copies
* tests + generation utils
2025-01-23 09:47:54 +01:00
Raushan Turganbay
373e50e970
Init cache on meta device ( #35164 )
...
* init cache on meta device
* offloaded static + enable tests
* tests weren't running before :(
* update
* fix mamba
* fix copies
* update
* address comments and fix tests
* fix copies
* Update src/transformers/cache_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* update
* mamba fix
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-01-22 09:49:17 +01:00
Aymeric Roucher
44393df089
Tool calling: support more types ( #35776 )
...
* Tool calling: support NoneType for function return type
2025-01-20 19:15:34 +01:00
Ross Wightman
8c1b5d3782
🚨 🚨 🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. ( #35615 )
...
* An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.
* Fix fix on load issue
* Fix gamma/beta warning test
* A style complaint
* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.
* Habitual elif redundant with the return
2025-01-16 17:25:44 -08:00
Joao Gante
aeeceb9916
[cache] add a test to confirm we can use cache at train time ( #35709 )
...
* add test
* augment test as suggested
* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* rerun tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-01-16 17:02:34 +00:00
jiqing-feng
387663e571
Enable gptqmodel ( #35012 )
...
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* gptqmodel need use checkpoint_format (#1 )
* gptqmodel need use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* Revert quantizer_gptq.py (#2 )
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* Fix Transformer compat (#3 )
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update gptqmodel version (#6 )
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5 )
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attributes (#7 )
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* review: update docs (#10 )
* review: update docs (#12 )
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update document (#9 )
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com >
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com >
Co-authored-by: LRL <lrl@lbx.dev >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-01-15 14:22:49 +01:00
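One bullet in the gptqmodel commit above notes that "not self.use_exllama" is not equivalent to "self.use_exllama==False". A minimal sketch of why the two checks can diverge in plain Python follows. The helper name `compare_checks` is hypothetical, and the idea that the flag may be left as None is an assumption for illustration, not something the log states.

```python
def compare_checks(use_exllama):
    """Return both truthiness checks for a flag that may be True, False, or None."""
    return {
        # `not x` is True for every falsy value, including None.
        "not_flag": not use_exllama,
        # `x == False` is True only when the value compares equal to False.
        "equals_false": use_exllama == False,  # noqa: E712 (intentional comparison)
    }


for value in (True, False, None):
    print(value, compare_checks(value))
```

The two checks agree for True and False and differ only for None, so code that treats "unset" differently from "explicitly disabled" has to use the explicit comparison (or `is False`).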
Raushan Turganbay
84a6789145
Enable different torch dtype in sub models ( #34873 )
...
* fix
* fix test
* add tests
* add more tests
* fix tests
* supposed to be a torch.dtype test
* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
Cyril Vallez
965a2fb320
More model refactoring! ( #35359 )
...
* cohere
* style
* phi3
* style
* small fix
* small fix
* phi3 longrope
* oups
* Update rope (only for phi3 still)
* Update test_modeling_rope_utils.py
* Update modeling_phi3.py
* fix
* fix copies
* style
* Fix copied from bad renaming
2025-01-09 11:09:09 +01:00
Arthur
2c47618c1a
🚨 All attention refactor 🚨 ( #35235 )
...
* refactor LlamaAttention
* minimal changes
* fix llama
* update
* modular gemmas
* modular nits
* modular updates
* nits
* simplify
* gpt2
* more modular and fixes
* granite
* modular modular modular
* nits
* update
* qwen2 + starcoder2
* mostly gemma2
* Update image_processing_auto.py
* fix
* Update modular_starcoder2.py
* fix
* remove all copied from attentions
* remove gcv
* make fix-copies
* oups
* oups2.0
* fix some modulars + all copied from
* should be good now
* revert unwanted changes
* Update modeling_decision_transformer.py
* finish cleanup
* Update modeling_olmo.py
* consistency
* re-add gradient checkpointing attribute
* fix
* style
* make config necessary
* bis
* bis
* Update modeling_my_new_model2.py
* is_causal attr
* fix
* remove past kv return from decoder layer
* fix
* default rope config
* correctly fix rope config
* fix bias
* fix gpt2 attention output
* fix test
* fix inits
* fix default sdpa
* fix default sdpa implementation
* harmonize classes
* fix mistral
* fix sliding window models
* mixtral
* be more explicit
* style
* fix
* several fixes
* Update modeling_dbrx.py
* fix test
* olmo + phi
* rotary
* style
* phi
* phi again
* again
* kwargs
* Update test_modeling_common.py
* skip fx tracing tests
* Update modeling_utils.py
* gemma 2
* again
* Update modeling_recurrent_gemma.py
* gemma2
* granite
* style
* starcoder
* Update sdpa_attention.py
* switch args
* Update modeling_mllama.py
* fix
* cache type tests
* gpt2
* Update test_modeling_common.py
* fix
* consistency
* fix shape with encoder
* should be the last one
* tests non model
* most comments
* small oupsi
* be more explicit in modulars
* more explicit modulars
* CIs! it works locally
* add kwargs to _flash_attention_forward
---------
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
2024-12-18 16:53:39 +01:00
Marc Sun
1eee1cedfd
Fix loading with only state dict and low_cpu_mem_usage = True ( #35217 )
...
* fix loading with only state dict and config
* style
* add tests
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-18 09:54:32 +01:00
Yih-Dar
b0a51e5cff
Fix flaky Hub CI (test_trainer.py) ( #35062 )
...
* fix
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com >
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* check
* check
* check
* check
* check
* check
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com >
* Update src/transformers/testing_utils.py
Co-authored-by: Lucain <lucainp@gmail.com >
* check
* check
* check
* Final space
* Final adjustment
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Lucain <lucainp@gmail.com >
2024-12-05 17:02:27 +01:00
Tibor Reiss
f297af55df
Fix: take into account meta device ( #34134 )
...
* Do not load for meta device
* Make some minor improvements
* Add test
* Update tests/utils/test_modeling_utils.py
Update test parameters
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Make the test simpler
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2024-11-20 11:32:07 +01:00
Joao Gante
13493215ab
🧼 remove v4.44 deprecations ( #34245 )
...
* remove v4.44 deprecations
* PR comments
* deprecations scheduled for v4.50
* hub version update
* make fixup
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2024-11-15 23:07:24 +01:00
Joao Gante
34927b0f73
MPS: isin_mps_friendly can support 0D tensors ( #34538 )
...
* apply fix
* tested
* make fixup
2024-11-04 16:18:50 +00:00
Yoni Gozlan
203e27059b
Add image text to text pipeline ( #34170 )
...
* Standardize image-text-to-text-models-output
add post_process_image_text_to_text to chameleon and cleanup
Fix legacy kwarg behavior and deprecation warning
add post_process_image_text_to_text to qwen2_vl and llava_onevision
Add post_process_image_text_to_text to idefics3, mllama, pixtral processor
* nit var name post_process_image_text_to_text udop
* nit fix deprecation warnings
* Add image-text-to-text pipeline
* add support for image url in chat template for pipeline
* Reformat to be fully compatible with chat templates
* Add tests chat template
* Fix imports and tests
* Add pipeline tag
* change logic handling of single prompt and multiple images
* add pipeline mapping to models
* fix batched inference
* fix tests
* Add manual batching for preprocessing
* Fix outputs with nested images
* Add support for all common processing kwargs
* Add default padding when multiple text inputs (batch size>1)
* nit change version deprecation warning
* Add support for text only inference
* add chat_template warnings
* Add pipeline tests and add copied from post process function
* Fix batched pipeline tests
* nit
* Fix pipeline tests blip2
* remove unnecessary max_new_tokens
* revert processing kosmos2 and remove unnecessary max_new_tokens
* fix pipeline tests idefics
* Force try loading processor if pipeline supports it
* revert load_processor change
* hardcode loading only processor
* remove unnecessary try except
* skip imagetexttotext tests for kosmos2 as tiny model causes problems
* Make code clearer
* Address review comments
* remove preprocessing logic from pipeline
* fix fuyu
* add BC resize fuyu
* Move post_process_image_text_to_text to ProcessorMixin
* add guard in post_process
* fix zero shot object detection pipeline
* add support for generator input in pipeline
* nit
* change default image-text-to-text model to llava onevision
* fix owlv2 size dict
* Change legacy deprecation warning to only show when True
2024-10-31 15:48:11 -04:00
kang sheng
655bec2da7
Use a tiny model to test generation config to avoid timeouts ( #34482 )
...
* use a tiny model to test generation config to avoid timeouts
* remove trailing whitespace
2024-10-29 09:39:06 +01:00
Raushan Turganbay
21d5025826
Attn implementation for composite models ( #32238 )
...
* first try
* codestyle
* idefics2 is happy
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo, paligemma
* fix-copies
* [run-slow] llava, llava_next, video_llava, vipllava, llava_next_video, idefics, idefics2, kosmos2, fuyu, blip, blip_2, instructblip, instructblipvideo
* blip-2 needs to init vision from config
* when was this removed O_o
* minor fix
* tests
* this way?
* tests
* model-agnostic code
* codestyle
* add tests for idefics
* modify general test for VLMs
* no generation test for vlm yet!
* no generation test here also
* warn in VIT-SDPA if output attn
* add more tests
* user can pass dict as attn impl
* repo consistency
* update
* musicgen
* no prints
* forgot speech enc-dec and clip
* how many composite models we have?
* musicgen melody is same as musicgen
* +siglip
* fix tests + add some more
* remove idefics custom overridden code
* make idefics2 automappable
* nits
* skip tests
* doctests
* Update src/transformers/models/idefics2/configuration_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update tests/models/clip/test_modeling_clip.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update tests/models/idefics2/test_modeling_idefics2.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/configuration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* major update, no need for automap
* clean up
* add FA2 test
* more tests
* style
* skip tests
* why did these start failing now?
* no attributes for FA2 needed
* one tiny test
* address comment about FA2 false warning
* style
* add new models and resolve conflicts
* fix copies
* let it be this way for now, come back tomorrow to review
* some more fixes
* update
* more updates
* update
* fix copies
* style and tests
* another big update
* fix tests
* fix tests
* update
* another update
* fix tests
* fix copies
* fix tests
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2024-10-22 06:54:44 +02:00
alpertunga-bile
98bad9c6d6
[fix] fix token healing tests and usage errors ( #33931 )
...
* auto-gptq requirement is removed & model is changed & tokenizer pad token is assigned
* values func is changed with extensions & sequence key value bug is fixed
* map key value check is added in ExtensionsTree
* empty trimmed_ids bug is fixed
* tail_id IndexError is fixed
* empty trimmed_ids bug fix is updated for failed test
* too much specific case for specific tokenizer is removed
* input_ids check is updated
* require auto-gptq import is removed
* key error check is changed with empty list check
* empty input_ids check is added
* empty trimmed_ids fix is checked with numel function
* usage change comments are added
* test changes are commented
* comment style and quality bugs are fixed
* test comment style and quality bug is fixed
2024-10-16 14:22:55 +02:00
Lysandre Debut
409dd2d19c
Fix failing conversion ( #34010 )
...
* Fix
* Tests
* Typo
* Typo
2024-10-11 14:59:23 +02:00
Dani Martí
a84c413773
HfArgumentParser: allow for hyphenated field names in long-options ( #33990 )
...
Allow for hyphenated field names in long-options
argparse converts hyphens into underscores before assignment (e.g., an
option passed as `--long-option` will be stored under `long_option`), so
there is no need to pass options as literal attributes, as in
`--long_option` (with an underscore instead of a hyphen). This commit
ensures that this behavior is respected by `parse_args_into_dataclasses`
as well.
Issue: #33933
Co-authored-by: Daniel Marti <mrtidm@amazon.com >
2024-10-10 11:58:26 +02:00
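The argparse behavior described in the commit message above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the HfArgumentParser implementation: the option name and the `build_parser` helper are hypothetical, and registering both spellings as aliases is just one way a parser could accept either form.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose single option accepts both spellings (hypothetical example)."""
    parser = argparse.ArgumentParser()
    # Both option strings store into the same attribute, `long_option`.
    parser.add_argument(
        "--long-option", "--long_option", dest="long_option", type=int, default=0
    )
    return parser


hyphenated = build_parser().parse_args(["--long-option", "3"])
underscored = build_parser().parse_args(["--long_option", "3"])

# Either spelling ends up under the underscore attribute name.
print(hyphenated.long_option, underscored.long_option)
```

Without the explicit `dest`, argparse would still derive `long_option` from the first long option string by replacing hyphens with underscores, which is the conversion the commit message refers to.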
Joao Gante
38f9f10dd9
Cache: revert DynamicCache init for BC ( #33861 )
...
* tmp commit
* tmp commit
* make fixup
* missing removal
* fix condition
* fix end-to-end compilation
* if -> elif
* BC
* BC
* use @deprecate_kwarg("num_hidden_layers", version="4.47.0")
* wups the import
* 🥴
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com >
2024-10-04 22:47:08 +02:00
Raushan Turganbay
061c2c4c38
Ignore keys on validate_rope ( #33753 )
...
* ignore keys on check rope
* add tests
* fix tests, so maybe better leave at logger lvl
2024-10-04 12:39:37 +02:00
Joao Gante
b0c5660e88
Config: lower save_pretrained exception to warning ( #33906 )
...
* lower to warning
* msg
* make fixup
* rm extra comma
2024-10-03 16:45:14 +01:00
Guang Yang
808997a634
Fix passing str dtype to static cache ( #33741 )
...
Co-authored-by: Guang Yang <guangyang@fb.com >
2024-10-01 09:50:17 +02:00
Joao Gante
3557f9a14a
Generate: can_generate() recursive check ( #33718 )
...
* add recursive check and test warnings
* missing space
* models without can_generate
2024-09-26 18:11:14 +01:00
Arthur
19d58d31f1
Add MLLama ( #33703 )
...
* current changes
* nit
* Add cross_attention_mask to processor
* multi-image fixed
* Add cross_attention_mask to processor
* cross attn works in all cases
* WIP refactoring function for image processor
* WIP refactoring image processor functions
* Refactor preprocess to use global loops instead of list nested list comps
* Docstrings
* Add channels unification
* fix dtype issues
* Update docstrings and format
* Consistent max_image_tiles
* current script
* updates
* Add convert to rgb
* Add image processor tests
* updates!
* update
* god damn it I am dumb sometimes
* Precompute aspect ratios
* now this works, full match
* fix 😉
* nits
* style
* fix model and conversion
* nit
* nit
* kinda works
* hack for sdpa non-contiguous bias
* nits here and there
* latest changes
* merge?
* run forward
* Add aspect_ratio_mask
* vision attention mask
* update script and config variable names
* nit
* nits
* be able to load
* style
* nits
* there
* nits
* make forward run
* small update
* enable generation multi-turn
* nit
* nit
* Clean up a bit for errors and typos
* A bit more constant fixes
* 90B keys and shapes match
* Fix for 11B model
* Fixup, remove debug part
* Docs
* Make max_aspect_ratio_id to be minimal
* Update image processing code to match new implementation
* Adjust conversion for final checkpoint state
* Change dim in repeat_interleave (according to meta code)
* tmp fix for num_tiles
* Fix for conversion (gate<->up, q/k_proj rope permute)
* nits
* codestyle
* Vision encoder fixes
* pass cross attn mask further
* Refactor aspect ratio mask
* Disable text-only generation
* Fix cross attention layers order, remove q/k norm rotation for cross attention layers
* Refactor gated position embeddings
* fix bugs but needs test with new weights
* rope scaling should be llama3
* Fix rope scaling name
* Remove debug for linear layer
* fix copies
* Make mask prepare private func
* Remove linear patch embed
* Make precomputed embeddings as nn.Embedding module
* MllamaPrecomputedAspectRatioEmbedding with config init
* Remove unused self.output_dim
* nit, intermediate layers
* Rename ln and pos_embed
* vision_chunk_size -> image_size
* return_intermediate -> intermediate_layers_indices
* vision_input_dim -> hidden_size
* Fix copied from statements
* fix most tests
* Fix more copied from
* layer_id->layer_idx
* Comment
* Fix tests for processor
* Copied from for _prepare_4d_causal_attention_mask_with_cache_position
* Style fix
* Add MllamaForCausalLM
* WIP fixing tests
* Remove duplicated layers
* Remove dummy file
* Fix style
* Fix consistency
* Fix some TODOs
* fix language_model instantiation, add docstring
* Move docstring, remove todos for precomputed embeds (we cannot init them properly)
* Add initial docstrings
* Fix
* fix some tests
* lets skip these
* nits, remove print, style
* Add one more copied from
* Improve test message
* Make validate func private
* Fix dummy objects
* Refactor `data_format` a bit + add comment
* typos/nits
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* fix dummy objects and imports
* Add chat template config json
* remove num_kv_heads from vision attention
* fix
* move some commits and add more tests
* fix test
* Remove `update_key_name` from modeling utils
* remove num-kv-heads again
* some preliminary docs
* Update chat template + tests
* nit, conversion script max_num_tiles from params
* Fix warning for text-only generation
* Update conversion script for instruct models
* Update chat template in conversion + test
* add tests for CausalLM model
* model_max_length, avoid null chat_template
* Refactor conversion script
* Fix forward
* Fix integration tests
* Refactor vision config + docs
* Fix default
* Refactor text config
* Doc fixes
* Remove unused args, fix docs example
* Squashed commit of the following:
commit b51ce5a2efffbecdefbf6fc92ee87372ec9d8830
Author: qubvel <qubvel@gmail.com >
Date: Wed Sep 18 13:39:15 2024 +0000
Move model + add output hidden states and output attentions
* Fix num_channels
* Add mllama text and mllama vision models
* Fixing repo consistency
* Style fix
* Fixing repo consistency
* Fixing unused config params
* Fix failed tests after refactoring
* hidden_activation -> hidden_act for text mlp
* Remove from_pretrained from sub-configs
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/models/mllama/convert_mllama_weights_to_hf.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Reuse lambda in conversion script
* Remove run.py
* Update docs/source/en/model_doc/mllama.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/models/mllama/processing_mllama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Remove unused LlamaTokenizerFast
* Fix logging
* Refactor gating
* Remove cycle for collecting intermediate states
* Refactor text-only check, add integration test for text-only
* Revert from pretrained to configs
* Fix example
* Add auto `bos_token` adding in processor
* Fix tips
* Update src/transformers/models/auto/tokenization_auto.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Enable supports_gradient_checkpointing model flag
* add eager/sdpa options
* don't skip attn tests and bring back GC skips (did i really remove those?)
* Fix signature, but get error with None gradient
* Fix output attention tests
* Disable GC back
* Change no split modules
* Fix dropout
* Style
* Add Mllama to sdpa list
* Add post init for vision model
* Refine config for MllamaForCausalLMModelTest and skipped tests for CausalLM model
* if skipped, say it, don't pass
* Clean vision tester config
* Doc for args
* Update tests/models/mllama/test_modeling_mllama.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Add cross_attention_mask to test
* typehint
* Remove todo
* Enable gradient checkpointing
* Docstring
* Style
* Fixing and skipping some tests for new cache
* Mark flaky test
* Skip `test_sdpa_can_compile_dynamic` test
* Fixing some offload tests
* Add direct GenerationMixin inheritance
* Remove unused code
* Add initializer_range to vision config
* update the test to make sure we show if split
* fix gc?
* Fix repo consistency
* Undo modeling utils debug changes
* Fix link
* mllama -> Mllama
* [mllama] -> [Mllama]
* Enable compile test for CausalLM model (text-only)
* Fix TextModel prefix
* Update doc
* Docs for forward, type hints, and vision model prefix
* make sure to reset
* fix init
* small script refactor and styling
* nit
* updates!
* some nits
* Interpolate embeddings for 560 size and update integration tests
* nit
* does not support static cache!
* update
* fix
* nit2
* this?
* Fix conversion
* Style
* 4x memory improvement with image cache AFAIK
* Token decorator for tests
* Skip failing tests
* update processor errors
* fix split issues
* style
* weird
* style
* fix failing tests
* update
* nit fixing the whisper tests
* fix path
* update
---------
Co-authored-by: raushan <raushan@huggingface.co >
Co-authored-by: pavel <ubuntu@ip-10-90-0-11.ec2.internal >
Co-authored-by: qubvel <qubvel@gmail.com >
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
2024-09-25 19:56:25 +02:00
Joao Gante
e15687fffe
Generation: deprecate PreTrainedModel inheriting from GenerationMixin ( #33203 )
2024-09-23 18:28:36 +01:00
Yih-Dar
077b552f07
Fix some missing tests in circleci ( #33559 )
...
* fix
* fix
* fix
* fix
* skip
* skip more
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2024-09-20 20:58:51 +02:00
Fanli Lin
b87755aa6d
[tests] skip tests for xpu ( #33553 )
...
* enable
* fix
* add xpu skip
* add marker
* skip for xpu
* add more
* add one more
2024-09-19 19:28:04 +01:00
Joao Gante
7542fac2c7
Pipeline: no side-effects on model.config and model.generation_config 🔫 ( #33480 )
2024-09-18 15:43:06 +01:00
Yoni Gozlan
d8500cd229
Uniformize kwargs for Pixtral processor ( #33521 )
...
* add uniformized pixtral and kwargs
* update doc
* fix _validate_images_text_input_order
* nit
2024-09-17 14:44:27 -04:00
Guang Yang
f38590dade
Make StaticCache configurable at model construct time ( #32830 )
...
* Make StaticCache configurable at model construct time
* integrations import structure
* add new doc file to toc
---------
Co-authored-by: Guang Yang <guangyang@fb.com >
Co-authored-by: Joao Gante <joao@huggingface.co >
2024-09-10 16:35:57 +01:00
Lysandre Debut
f24f084329
Import structure & first three model refactors ( #31329 )
...
* Import structure & first three model refactors
* Register -> Export. Export all in __all__. Sensible defaults according to filename.
* Apply most comments from Amy and some comments from Lucain
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
Co-authored-by: Lucain Pouget <lucainp@gmail.com >
* Style
* Add comment
* Clearer .py management
* Raise if not in backend mapping
* More specific type
* More efficient listdir
* Misc fixes
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
Co-authored-by: Lucain Pouget <lucainp@gmail.com >
2024-09-10 11:10:53 +02:00
Ita Zaporozhets
363301f221
support loading model without config.json file ( #32356 )
...
* support loading model without config.json file
* fix condition
* update tests
* add test
* ruff
* ruff
* ruff
2024-09-06 13:49:47 +02:00
Yoni Gozlan
9230d78e76
Add validate images and text inputs order util for processors and test_processing_utils ( #33285 )
...
* Add validate images and test processing utils
* Remove encoded text from possible inputs in tests
* Removed encoded inputs as valid in processing_utils
* change text input check to be recursive
* change text check to all element of lists and not just the first one in recursive checks
2024-09-04 13:50:31 -04:00
Alex Sherstinsky
122ded0a11
Bugfix/alexsherstinsky/fix none check for attention factor in rope scaling 2024 08 28 0 ( #33188 )
...
* Fixing a bug in the way "attention_factor" is validated in ROPE utilities.
* Fixing a bug in the way "attention_factor" is validated in ROPE utilities.
* Fixing a bug in the way "attention_factor" is validated in ROPE utilities.
2024-09-04 17:01:12 +02:00
Raushan Turganbay
ebbe8d8014
Cache docs: update ( #32929 )
...
* some changes
* more updates
* fix cache copy
* nits
* nits
* add tests
2024-09-04 15:05:31 +05:00
Gerben van V
5129671290
Add a static cache that offloads to the CPU or other device ( #32161 )
...
* Add a static cache that offloads to the CPU or other device
* Fix PR comments, add unit-tests
2024-08-29 11:51:09 +02:00
rasmi
f9ed05dd03
Fix import paths for test_module ( #32888 )
...
* Fix import path for test_feature_extraction_utils.py
See https://github.com/huggingface/transformers/pull/32601
* Fix import path for test_image_processing_utils.py
2024-08-28 12:08:29 +01:00