Parag Ekbote
e2b0224d94
Update Model Card for Jamba ( #37152 )
...
* Update model card for jamba
* Apply the suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply suggestions from code review-2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update model page.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update as per code review.
* Update docs/source/en/model_doc/jamba.md as per code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/jamba.md as per code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update as per code review.
* fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 11:02:59 -07:00
Devesh Rahatekar
6cc109c354
Improvements in Gemma2 model card ( #37076 )
...
* Improved Model card for Gemma2
* Made changes in gemma2 as suggested
* Made more changes in the doc (adding image, notes, closing hfoptions)
* minor fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:51:26 -07:00
Mohamed Mekkouri
8bbcdf5409
Clean up the compressed-tensors integration ( #37349 )
...
clean up
2025-04-07 19:26:45 +02:00
Ashvanth.S
3a826a45ca
Update Model card for GPT2 ( #37101 )
...
* Update Model card for gpt2
* Update link for gpt2 space
* fixes docs based on suggestions
* Add transformers-cli and quantization example for GPT-2
* Remove resources and flash attention docs and fix typos
2025-04-07 10:15:28 -07:00
Ricardo Alanis
5e855095a2
Update falcon mamba card ( #37253 )
...
* feat: edit falcon mamba card
* fix: edit statement on falconmamba arch
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: add right indent for tags
* fix: remove notes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:12:44 -07:00
Shubham Panchal
416b5a875d
Update model-card for DINOv2 ( #37104 )
...
[docs] Update model-card for DINOv2
2025-04-07 10:11:08 -07:00
Nahieli
f8a16805c5
updated model card for Mistral ( #37156 )
...
* model card for Mistral
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* apply suggestions
* fix typo
* updated with comments
* updated with comments
* updated with comments
* remove hfoption block
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:05:36 -07:00
Cyril Vallez
48e179857c
Remove HQQ from caching allocator warmup ( #37347 )
...
Update modeling_utils.py
2025-04-07 18:33:48 +02:00
Steven Liu
832cb684a0
Update translation template ( #37294 )
2025-04-07 09:29:37 -07:00
Cyril Vallez
22065bd645
fix derived berts _init_weights ( #37341 )
...
* fix derived berts
* more
* roformer
2025-04-07 18:25:07 +02:00
Matt
f789f960c8
Avoid build crashes when torch.version.xpu doesn't exist and fix Llama4 processor tests ( #37346 )
...
* Avoid build crashes when torch.version.xpu doesn't exist
* Trigger tests
* Fix image token and skip inappropriate test
* Remove ignore_errors=True
* Add another skip
2025-04-07 17:05:54 +01:00
Yao Matrix
12bf24d6ae
enable 2 llama UT cases on xpu ( #37126 )
...
* enable tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits and tests/models/llama/test_modeling_llama.py::LlamaIntegrationTest::test_model_7b_logits_bf16 on xpu
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* switch to use Expectations
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* extract gen bits from architecture and use it
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* add cross reference
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* fix style
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-04-07 16:02:14 +02:00
Yih-Dar
e7ad077012
byebye torch 2.0 ( #37277 )
...
* bump Torch 2.1 with broken compatibility `torch.compile`
* dep table
* remove usage of is_torch_greater_or_equal_than_2_1
* remove usage of is_torch_greater_or_equal_than_2_1
* remove if is_torch_greater_or_equal("2.1.0")
* remove torch >= "2.1.0"
* deal with 2.0.0
* PyTorch 2.0+ --> PyTorch 2.1+
* ruff 1
* difficult ruff
* address comment
* address comment
---------
Co-authored-by: Jirka B <j.borovec+github@gmail.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-07 15:19:47 +02:00
jiqing-feng
99f9f1042f
Fix torchao usage ( #37034 )
...
* fix load path
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix path
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* Fix torchao usage
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* revert useless change
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* revert fp8 test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix fp8 test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix fp8 test
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix torch dtype
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-04-07 14:50:48 +02:00
cyyever
0fb8d49e88
Use Python 3.9 syntax in examples ( #37279 )
...
Signed-off-by: cyy <cyyever@outlook.com >
2025-04-07 12:52:21 +01:00
Cyril Vallez
08f36771b3
Fix init empty weights without accelerate ( #37337 )
...
* add the integration
* Update accelerate.py
* Update accelerate.py
* add find_tied_params as well
* Update accelerate.py
* add where copied from
* simplify
* add error
2025-04-07 11:37:29 +02:00
Cyril Vallez
9db31ea585
Fix deepspeed with quantization ( #37324 )
...
* Update modeling_utils.py
* Update modeling_utils.py
2025-04-07 11:36:44 +02:00
hoshi-hiyouga
debfe904c9
fix llama4 training ( #37319 )
2025-04-07 09:24:44 +02:00
Wing Lian
54538ebee3
fix flex attn when optional args aren't passed ( #37327 )
2025-04-07 09:12:21 +02:00
Lysandre
d1b92369ca
v4.52.0.dev0
2025-04-05 22:04:21 +02:00
Arthur
25b7f27234
Add llama4 ( #37307 )
...
* remove one of the last deps
* update fast image processor after refactor
* styling
* more quality of life improvements
* nit
* update
* cleanups
* some cleanups
* vllm updates
* update fake image token
* [convert] Fix typo
* [convert] Strip extraneous bytes from shards
* [convert] Minor fixes
* [convert] Use num_experts
* multi-image fixes in modeling + processor
* fixup size
* 128 experts
* Use default rope
* Unfuse mlp
* simplify a lot inputs embeds merging
* remove .item() 👀
* fix from review
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* set seed
* return aspect ratios and bug fixes
* Moe 128 rebased (#8 )
* 128 experts
* Use default rope
* Unfuse mlp
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* Meta/llama quant compat (#7 )
* add quant compatible model & conversion code for llama4
* fix a few issues
* fix a few issues
* minor type mapping fix
---------
Co-authored-by: Lu Fang <fanglu@fb.com >
* use a new config parameter to determine which model definition to use for MoE
---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Lu Fang <fanglu@fb.com >
* un-comment write_tokenizer from converting script
* remove un-used imports
* [llama4] Pop aspect_ratios from image processor output in Llama4Processor
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* Fix parameter_count name
* Update src/transformers/models/llama4/configuration_llama4.py
* nit
* Add changes for no_rope, moe_layers, chunked attention. Just need to test all
* Update src/transformers/models/llama4/image_processing_llama4_fast.py
* nit
* fix post merge with main
* support flex attention
* fixes
* fix
* add layer
* small updates
* rebase and delete llm_compressor
* nit
* [llama4/mm] Add back <|image|> token that delimits global tile
* [llama4/mm] Fix Llama 4 image processing unit tests
* add explicit dtype
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* sdpa works
* comment todo small
* fix model loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* revert
* nits
* small fix for TP on 1 node
* Read new params from config
* Add <|eom|>
* lol don't know how this got here
* adding fp8
* Save processor, fix chat template
* style
* Add boi/eoi tokens
We don't use them.
* fixes for now flex seems to work :)
* updates
* nits
* updates
* missing keys
* add context parallel
* update
* update
* fix
* nits
* add worldsize and make eager attn work for vision
* Ignore new key present in base models
* add tp_plan
* fix nope
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* minor fix
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* Clean up Llama4 vision model
* current updates
* add support for `attn_temperature_tuning`
* add floor scale
* add missing attn scales
* push what works, dirty trick for the device synch
* oups
* Fix pad_token_id
See
https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files
Confirmed in the original codebase.
* fix causallml loading
* rm
* fix tied-weights
* fix sdpa
* push current version
* should work with both short and long
* add compressed_tensors & fix fbgemm tp
* Fix flex impl
* style
* chunking
* try to revert the potentially breaking change
* fix auto factory
* fix shapes in general
* rm processing
* commit cache utils cleanup
* Fix context length
* fix
* allocate
* update tp_plan
* fix SDPA!
* Add support for sparse `Llama4TextMoe` layer from the kernel hub
* cleanup
* better merge
* update
* still broken fixing now
* nits
* revert print
* Write max_position_embeddings and max_model_length
* Update modeling_llama4.py
* Save attention_chunk_size
* Sync eos terminators
* Read initializer_range
* style
* remove `dict`
* fix
* eager should use `chunked_attention_mask`
* revert
* fixup
* fix config
* Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"
This reverts commit ccda19f050867dd42ea143c5de60f3dec81375f0, reversing
changes made to a515579aed8c0fe9bf529b6c40446a289406d5d6.
* Fix typo and remove warning with compiled flex and chunked prefill
* Fix MoE vs FF (#41 )
* fix
* Use correct no_rope_layers if provided one is empty list
* update tests
* fix
* skipping some tests
* fix fp8 loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* fix text generation pipeline
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* eager needs 4D mask
* fix
* Some cleanup
* fix
* update
* fix
* replace correctly module
* patch
* modulelist
* update
* update
* clean up
* Don't move to `cuda:0` in distributed mode
* restrict to compressed tensors for now
* rm print
* Docs!
* Fixes
* Update docs/source/en/model_doc/llama4.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
* Fixes
* cuda graph fix
* revert some stuff
* fixup
* styling
* Update src/transformers/models/llama4/modeling_llama4.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
* commit licence, cleanup here and there and style
* more styling changes
* fix dummies
* fix and clean docstrings
* remove comment
* remove warning
* Only fast image processor is supported
* nit
* trigger CI
* fix issue with flex encoder
* fix dynamic cache
* Code quality
* Code quality
* fix more tests for now
* Code quality
* Code quality
* Nuke bunch of failing stuff
* Code quality
* Code quality
* cleanup removal of slow image processor
* ruff fix fast image processor
* fix
* fix styling
* Docs
* Repo consistency
* Repo consistency
* fix sliding window issue
* separate llama cache
* styling
* Repo consistency
* Repo consistency
* push what works
* L4 Repo consistency
* Docs
* fix last last alst alst alst alstsaltlsltlaslt
---------
Signed-off-by: Jon Swenson <jmswen@gmail.com >
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com >
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com >
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
Co-authored-by: Keyun Tong <tongkeyun@gmail.com >
Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com >
Co-authored-by: Lu Fang <fanglu@fb.com >
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: Jon Swenson <jmswen@gmail.com >
Co-authored-by: jmswen <jmswen@users.noreply.github.com >
Co-authored-by: MekkCyber <mekk.cyber@gmail.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com >
Co-authored-by: Yong Hoon Shin <yhshin@meta.com >
Co-authored-by: Marc Sun <marc@huggingface.co >
Co-authored-by: drisspg <drisspguessous@gmail.com >
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
Co-authored-by: Daniël de Kok <me@danieldk.eu >
Co-authored-by: Lysandre <hi@lysand.re >
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-05 22:02:22 +02:00
Lysandre Debut
aa40fda346
Hf Xet extra ( #37305 )
...
* Hf Xet extra
* Hf Xet extra
2025-04-05 21:06:05 +02:00
Cyril Vallez
e94571580b
Fix deepspeed loading (part 2) ( #37306 )
...
* fix
* Update modeling_utils.py
* Update modeling_utils.py
* oups remove print
2025-04-05 20:41:42 +02:00
Cyril Vallez
84aa13dd85
Fix deepspeed loading ( #37281 )
...
* Update modeling_utils.py
* Update modeling_utils.py
* fix and remove all imports
* Update modeling_utils.py
* Update modeling_utils.py
* style
* Update modeling_utils.py
2025-04-05 17:05:45 +02:00
Linnet Cosmos Tuscano
0ef339ff1b
Update OpenAI GPT model card ( #37255 )
...
* Update OpenAI GPT model card
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update OpenAI GPT model card: add usage examples and notes section
* Add API autodoc tags after Notes section for OpenAI GPT model
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Added missing badges
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:25:16 -07:00
Sharareh Younesian
46d73910d5
Updated T5 model card with standardized format ( #37261 )
...
* Updated T5 model card with standardized format
* Updated T5 model card with standardized format, fixed typo
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply reviewer suggestions
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:23:09 -07:00
Chathumina Vimukthi
579135a2f6
Updated model card for distilbert ( #37157 )
...
* Updated model card for distilbert
* Updated the distilbert model card
* Updated model card for distilbert
* Updated the distilbert model card
* Addressed code review comments
* Addressed review comments
* fix pipeline
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:22:46 -07:00
Reshan Gomis
8cd57eb731
mobilebert model card update ( #37256 )
...
* mobilebert model card update
* Updates to model card mobilebert
---------
Co-authored-by: Reshan Gomis <reshang@verdentra.com >
2025-04-04 14:28:35 -07:00
Rahul Tuli
ebe47ce3e9
Fix: Unexpected Keys, Improve run_compressed, Rename Test Folder ( #37077 )
2025-04-04 21:30:11 +02:00
Shubham Panchal
531e4fcf0e
Update model card for Depth Anything ( #37065 )
...
[docs] Update model card for Depth Anything
2025-04-04 11:36:05 -07:00
byi8220
a4e55fcff8
Disable delay_optimizer_creation in Trainer to support fsdp2 ( #37147 )
...
* github why you do this
* fix
* make fixup
* disable cpu offload test
* fixup
* tmp reworks
* git branch movement
* make fixup
* add require_fsdp_v2_version
* dep issues
* update ruff and fixup
2025-04-04 20:11:37 +02:00
Yao Matrix
878562b68d
fix test device spec relative path importing issue ( #37190 )
...
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
2025-04-04 18:22:55 +02:00
Matt
8ebc435267
Fix llava_onevision tests ( #37280 )
...
* Fix llava_onevision tests
* Trigger tests
2025-04-04 15:03:38 +01:00
Joao Gante
ad3d157188
[RoPE] abstract dynamic RoPE update under a decorator ✨ ( #37249 )
...
* dynamic rope decorator
* longrope; shorter fwd pass
* proper docstring
* make fixup
2025-04-04 14:27:28 +01:00
Lysandre Debut
3d40bda30e
Hugging Face Hub pin to v0.30.0 for Xet ( #37166 )
2025-04-04 14:58:22 +02:00
Joao Gante
acbcb5d07d
[Tests] flaky test_constrained_beam_search_generate_dict_output ( #37276 )
2025-04-04 13:38:42 +01:00
Ryan McConville
4ba0989eab
Clarify error message to ensure min 28x28 image supplied for Qwen 2.5 VL ( #37264 )
...
fix: clarify error message for min 28x28 images
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
2025-04-04 12:53:38 +01:00
Yih-Dar
352ec8ef22
pin specific natten version in docker file ( #37274 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-04 13:47:16 +02:00
cyyever
edd345b52e
Fix deprecated PT functions ( #37237 )
...
* Fix deprecated PT functions
Signed-off-by: cyy <cyyever@outlook.com >
* Revert some changes
Signed-off-by: cyy <cyyever@outlook.com >
---------
Signed-off-by: cyy <cyyever@outlook.com >
2025-04-04 12:31:11 +01:00
Yih-Dar
b016de1ae4
Fix utils/check_bad_commit.py ( #37272 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-04 12:18:20 +02:00
Nikos Antoniou
f74d7da836
Introduce modular files for speech models ( #35902 )
...
* WAV_2_VEC_2 to WAV2VEC2
* added modular files for hubert, wavlm, wav2vec2_bert, data2vec_audio
* remove unnecessary definitions in modulars
* added modular files for UniSpeech, UniSpeechSat, Wav2Vec2Conformer
* docstring fix for UniSpeechForCTC
* removed unnecessary re-definition of modular classes
* reverted lazy imports change on modular_model_converter, type-alias for Wav2Vec2BaseModelOutput
* top-level import of deepspeed in seamless_m4t, speecht5
* avoid tracking imports inside classes, relocate lazy deepspeed, peft imports in their original locations
* convert modular
* tiny modular typing fixes
* some more modular fixes
* make style
---------
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com >
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com >
2025-04-04 11:46:27 +02:00
Ita Zaporozhets
d130cd0e16
update error msg ( #37207 )
2025-04-04 10:21:30 +02:00
Raushan Turganbay
41b9b92b52
[qwen-vl] fix image processor ( #37258 )
...
* fix
* add test
2025-04-03 19:48:56 +02:00
Surya Garikipati
8dd0a2b89c
Update model card for electra ( #37063 )
...
* Update ELECTRA model card with new format
* Update ELECTRA model card with new format
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* close hfoption block
---------
Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:45:35 -07:00
Parag Ekbote
15ac2b6ac5
Update Model Card for ModernBERT ( #37052 )
...
* Modify Model Card for ModernBERT.
* Update as per code review.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update model card.
* Update model card.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:14:02 -07:00
Abhishek Ranjan
b552708694
chore: Update model doc for code_llama ( #37115 )
...
* Update code_llama.md
aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598
sub part of https://github.com/huggingface/transformers/issues/36979
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* make changes as per code review
* chore: make the function smaller for attention mask visualizer
* chore[docs]: update code_llama.md with some more suggested changes
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* chore[docs] : Update code_llama.md with indentation changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:09:41 -07:00
Bimal Gajera
2b84831a93
Update model card for Cohere ( #37056 )
...
* Update Cohere model card to follow standard template
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update cohere.md
Update code snippet for AutoModel, quantization, and transformers-cli
* Update cohere.md
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:51:40 -07:00
Matt
2d46a08b63
Purge unused ModelTester code ( #37085 )
...
* Purge correctly this time
* Remove more methods from recent PRs
* make fixup
2025-04-03 17:48:35 +01:00
Avigyan Sinha
1b29409d89
feat: updated model card for qwen_2.5_vl ( #37099 )
...
* feat: updated model card for qwen_2.5_vl
* applied suggested change 1
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 3
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: made requested changes for quantization and notes
* suggested model card change 4
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 5
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 6
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 7
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* feat: applied requested changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:13:26 -07:00
cyyever
8a828a747e
Add Optional to types ( #37163 )
...
Signed-off-by: cyy <cyyever@outlook.com >
2025-04-03 16:38:01 +01:00