Arthur
25b7f27234
Add llama4 ( #37307 )
...
* remove one of the last deps
* update fast image processor after refactor
* styling
* more quality of life improvements
* nit
* update
* cleanups
* some cleanups
* vllm updates
* update fake image token
* [convert] Fix typo
* [convert] Strip extraneous bytes from shards
* [convert] Minor fixes
* [convert] Use num_experts
* multi-image fixes in modeling + processor
* fixup size
* 128 experts
* Use default rope
* Unfuse mlp
* simplify a lot inputs embeds merging
* remove .item() 👀
* fix from review
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* set seed
* return aspect ratios and bug fixes
* Moe 128 rebased (#8 )
* 128 experts
* Use default rope
* Unfuse mlp
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* Meta/llama quant compat (#7 )
* add quant compatible model & conversion code for llama4
* fix a few issues
* fix a few issues
* minor type mapping fix
---------
Co-authored-by: Lu Fang <fanglu@fb.com >
* use a new config parameter to determine which model definition to use for MoE
---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Lu Fang <fanglu@fb.com >
* un-comment write_tokenizer from converting script
* remove un-used imports
* [llama4] Pop aspect_ratios from image processor output in Llama4Processor
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* Fix parameter_count name
* Update src/transformers/models/llama4/configuration_llama4.py
* nit
* Add changes for no_rope, moe_layers, chunked attention. Just need to test all
* Update src/transformers/models/llama4/image_processing_llama4_fast.py
* nit
* fix post merge with main
* support flex attention
* fixes
* fix
* add layer
* small updates
* rebase and delete llm_compressor
* nit
* [llama4/mm] Add back <|image|> token that delimits global tile
* [llama4/mm] Fix Llama 4 image processing unit tests
* add explicit dtype
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* sdpa works
* comment todo small
* fix model loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* revert
* nits
* small fix for TP on 1 node
* Read new params from config
* Add <|eom|>
* lol don't know how this got here
* adding fp8
* Save processor, fix chat template
* style
* Add boi/eoi tokens
We don't use them.
* fixes for now flex seems to work :)
* updates
* nits
* updates
* missking keys
* add context parallel
* update
* update
* fix
* nits
* add worldsize and make eager attn work for vision
* Ignore new key present in base models
* add tp_plan
* fix nope
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* minor fix
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* Clean up Llama4 vision model
* current updates
* add support for `attn_temperature_tuning`
* add floor scale
* add missing attn scales
* push what works, dirty trick for the device synch
* oups
* Fix pad_token_id
See
https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files
Confirmed in the original codebase.
* fix causallml loading
* rm
* fix tied-weights
* fix sdpa
* push current version
* should work with both short and long
* add compressed_tensos & fix fbgemm tp
* Fix flex impl
* style
* chunking
* try to revert the potentially breaking change
* fix auto factory
* fix shapes in general
* rm processing
* commit cache utils cleanup
* Fix context length
* fix
* allocate
* update tp_plan
* fix SDPA!
* Add support for sparse `Llama4TextMoe` layer from the kernel hub
* cleanup
* better merge
* update
* still broken fixing now
* nits
* revert print
* Write max_position_embeddings and max_model_length
* Update modeling_llama4.py
* Save attention_chunk_size
* Sync eos terminators
* Read initializer_range
* style
* remove `dict`
* fix
* eager should use `chunked_attention_mask`
* revert
* fixup
* fix config
* Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"
This reverts commit ccda19f050867dd42ea143c5de60f3dec81375f0, reversing
changes made to a515579aed8c0fe9bf529b6c40446a289406d5d6.
* Fix typo and remove warning with compiled flex and chunked prefill
* Fix MoE vs FF (#41 )
* fix
* Use correct no_rope_layers if provided one is empty list
* update tests
* fix
* skipping some tests
* fix fp8 loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* fix text geneartion pipeline
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* eager needs 4D mask
* fix
* Some cleanup
* fix
* update
* fix
* replace correctly module
* patch
* modulelist
* update
* update
* clean up
* Don't move to `cuda:0` in distributed mode
* restrict to compressed tensors for now
* rm print
* Docs!
* Fixes
* Update docs/source/en/model_doc/llama4.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
* Fixes
* cuda graph fix
* revert some stuff
* fixup
* styling
* Update src/transformers/models/llama4/modeling_llama4.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
* commit licence, cleanup here and there and style
* more styling changes
* fix dummies
* fix and clean docstrings
* remove comment
* remove warning
* Only fast image processor is supported
* nit
* trigger CI
* fix issue with flex encoder
* fix dynamic cache
* Code quality
* Code quality
* fix more tests for now
* Code quality
* Code quality
* Nuke bunch of failing stuff
* Code quality
* Code quality
* cleanup removal of slow image processor
* ruff fix fast image processor
* fix
* fix styling
* Docs
* Repo consistency
* Repo consistency
* fix sliding window issue
* separate llama cache
* styling
* Repo consistency
* Repo consistency
* push waht works
* L4 Repo consistency
* Docs
* fix last last alst alst alst alstsaltlsltlaslt
---------
Signed-off-by: Jon Swenson <jmswen@gmail.com >
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com >
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com >
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
Co-authored-by: Keyun Tong <tongkeyun@gmail.com >
Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com >
Co-authored-by: Lu Fang <fanglu@fb.com >
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: Jon Swenson <jmswen@gmail.com >
Co-authored-by: jmswen <jmswen@users.noreply.github.com >
Co-authored-by: MekkCyber <mekk.cyber@gmail.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com >
Co-authored-by: Yong Hoon Shin <yhshin@meta.com >
Co-authored-by: Marc Sun <marc@huggingface.co >
Co-authored-by: drisspg <drisspguessous@gmail.com >
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
Co-authored-by: Daniël de Kok <me@danieldk.eu >
Co-authored-by: Lysandre <hi@lysand.re >
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-05 22:02:22 +02:00
Linnet Cosmos Tuscano
0ef339ff1b
Update OpenAI GPT model card ( #37255 )
...
* Update OpenAI GPT model card
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update OpenAI GPT model card: add usage examples and notes section
* Add API autodoc tags after Notes section for OpenAI GPT model
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Added missing badges
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:25:16 -07:00
Sharareh Younesian
46d73910d5
Updated T5 model card with standardized format ( #37261 )
...
* Updated T5 model card with standardized format
* Updated T5 model card with standardized format, fixed typo
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply reviewer suggestions
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:23:09 -07:00
Chathumina Vimukthi
579135a2f6
Updated model card for distilbert ( #37157 )
...
* Updated model card for distilbert
* Updated the distilbert model card
* Updated model card for distilbert
* Updated the distilbert model card
* Addressed code review comments
* Addressed review comments
* fix pipeline
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:22:46 -07:00
Reshan Gomis
8cd57eb731
mobilebert model card update ( #37256 )
...
* mobilebert model card update
* Updates to model card mobilebert
---------
Co-authored-by: Reshan Gomis <reshang@verdentra.com >
2025-04-04 14:28:35 -07:00
Shubham Panchal
531e4fcf0e
Update model card for Depth Anything ( #37065 )
...
[docs] Update model card for Depth Anything
2025-04-04 11:36:05 -07:00
Joao Gante
ad3d157188
[RoPE] abstract dynamic RoPE update under a decorator ✨ ( #37249 )
...
* dynamic rope decorator
* longrope; shorter fwd pass
* propper docstring
* make fixup
2025-04-04 14:27:28 +01:00
Surya Garikipati
8dd0a2b89c
Update model card for electra ( #37063 )
...
* Update ELECTRA model card with new format
* Update ELECTRA model card with new format
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* close hfoption block
---------
Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:45:35 -07:00
Parag Ekbote
15ac2b6ac5
Update Model Card for ModernBERT ( #37052 )
...
* Modify Model Card for ModernBERT.
* Update as per code review.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update model card.
* Update model card.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:14:02 -07:00
Abhishek Ranjan
b552708694
chore: Update model doc for code_llama ( #37115 )
...
* Update code_llama.md
aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598
sub part of https://github.com/huggingface/transformers/issues/36979
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* make changes as per code review
* chore: make the function smaller for attention mask visualizer
* chore[docs]: update code_llama.md with some more suggested changes
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* chore[docs] : Update code_llama.md with indentation changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:09:41 -07:00
Bimal Gajera
2b84831a93
Update model card for Cohere ( #37056 )
...
* Update Cohere model card to follow standard template
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update cohere.md
Update code snippet for AutoModel, quantization, and transformers-cli
* Update cohere.md
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:51:40 -07:00
Avigyan Sinha
1b29409d89
feat: updated model card for qwen_2.5_vl ( #37099 )
...
* feat: updated model card for qwen_2.5_vl
* applied suggested change 1
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 3
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: made requested changes for quantization and notes
* suggeested model card change 4
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card wiht suggested change 5
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card wiht suggested change 6
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card wiht suggested change 7
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* feat: applied requested changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:13:26 -07:00
Ryan Mullins
3f6af96732
Adding links to ShieldGemma 2 technical report ( #37247 )
2025-04-03 16:26:29 +01:00
Joao Gante
9a1c1fe7ed
[CI] green llama tests ( #37244 )
...
* green llama tests
* use cleanup instead
* better test comment; cleanup upgrade
* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
Raushan Turganbay
98601cc818
[Phi4] add multimodal chat template ( #36996 )
...
* phi4 chat template
* remove from valid kwargs
2025-04-03 09:52:09 +02:00
ARAVINDHAN T
2056287940
Updated model card for Qwen2 ( #37192 )
...
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-02 18:10:41 -07:00
Ricardo Alanis
3e96a0c32b
Update falcon model card ( #37184 )
...
* feat: updated model card for falcon
* fix:rewrite model description
* fix: add link to conversion script
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: Add suggested changes
* fix: typo in link for quantization
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: fix indent and close ticks
* fix: add indent
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-02 17:30:37 -07:00
Purusharth Malik
199d7adf10
Updated the model card for CLIP ( #37040 )
...
* Update clip.md
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Incorporated suggested changes
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/clip.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-02 14:57:38 -07:00
Bowen Bao
800510c67b
[doc] Fix link for Quark quantization page ( #37179 )
...
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-04-01 20:57:38 +02:00
Armaghan Shakir
0710e9b1e8
Create and Expose SamVisionModel as public for better accessibility ( #36493 )
...
* move encoder below
* auto modeling
* write SamVisionTester
* fix vision attention shape
* fix SamVisionTest
* minor changes to SamVisionTest
* Revert "fix vision attention shape"
This reverts commit d2a4083ae5704716e33351aed03af8f3cc45f3ae.
* fix attention output shape in new tests
* remove encoder examples
* run modular on got_ocr2
* code formatting
* fix got_ocr2
* ruff fixes
* code quality
* add sam_vision in auto modeling and auto configuration
* remove composite test
* updated index.md
* add TFSamVisionEncoder to __init__
* fix public TFSamVisionEncoder
* remove outdated todo comment
* set test_torch_exportable
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* rename: VisionEncoder -> VisionModel
* bring back original SamVisionEncoder
* rename back: VisionEncoderOutput -> VisionModelOutput
* undo changes in SamModelTester
* reuse SamVisionEncoder in SamVisionModel
---------
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
2025-03-31 11:45:07 +02:00
jiqing-feng
286393fbb1
enable tp on CPU ( #36299 )
...
* enable tp on CPU
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* get rank from cpu
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* enable TP tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix comment
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* em print
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix model id
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix conflict
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix index and add doc
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
2025-03-31 10:55:47 +02:00
Bo Zheng
6acd5aecb3
Adding Qwen3 and Qwen3MoE ( #36878 )
...
* Initial commit for Qwen3
* fix and add tests for qwen3 & qwen3_moe
* rename models for tests.
* fix
* fix
* fix and add docs.
* fix model name in docs.
* simplify modular and fix configuration issues
* Fix the red CI: ruff was updated
* revert ruff, version was wrong
* fix qwen3moe.
* fix
* make sure MOE can load
* fix copies
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com >
2025-03-31 09:50:49 +02:00
MinJu-Ha
0d6a60fe55
🌐 [i18n-KO] Translated qwen2_vl.md to Korean ( #36750 )
...
* fix: manual edits
* fix: resolve suggestions
* Update toctree.yml
2025-03-30 15:00:27 -07:00
Cyril Vallez
2bea6bf24e
Fix AttentionInterface following feedback ( #37010 )
...
* up
* typo
* update doc
* Update attention_interface.md
2025-03-28 18:00:35 +01:00
Minho Ryu
eca74d1367
[WIP] add deepseek-v3 ( #35926 )
...
* init commit
* style
* take comments into account
* add deepseekv3 modeling
* remove redundant code
* apply make style
* apply fix-copies
* make format
* add init files
* rename deepseekv3 into deepseek_v3 based on its model_type
* rename deepseekv3 into deepseek_v3 based on its model_type
* deepseek-v3 not deepseek_v3
* set model_type as deepseek_v3
* use default docs
* apply make
* fill type and docstring
* add rope_config_validation
* use custom DeepseekV3MLP
* hold code only for checkpoints congifuration; remove redundant
* revise rope yarn for DeepSeek variation
* rename DeepSeek-V3
* some refactoring
* revise load_hook to work properly; make moe func trainable; use llama instead of mixtral
* fix attention forward
* use -1 for not-changing dim when to use exapnd
* refactor DeepseekV3TopkRouter
* use reshape_for_rope instead of load_hook; revise attention forward for TP; rename q_head_dim with qk_head_dim
* register pre_hook and hook both
* make style
* use n_shared_experts
* Update src/transformers/models/deepseek_v3/configuration_deepseek_v3.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* add test file
* update modeling_file according to modular file
* make style
* add mapping for DeepseekV3ForSequenceClassification
* remove aux_loss_alpha
* add deepseek_v3 for perf
* add deepseek_v3
* rename test as deepseekv3
* use tiny-deepseek-v3
* remove DeepseekV3ForSequenceClassification
* cache before padding
* remote output_router_logits
* Revert "remote output_router_logits"
This reverts commit f264f800d04950390db8413b9efb24cef8186330.
* remove output_router_logits
* make e_score_correction_bias as buffer
* skip tests not compatible
* make style
* make e_score_correction_bias as buffer
* use rope_interleave instead of load_hook
* skip tests not compatible with MLA
* add doc for rope_interleave
* fix typo
* remove torch.no_grad for selecting topk
* fix post merge issue
* mrege with main and simplify
* nits
* final
* small fixes
* fix
* support TP better
* stash
* changes currently requires
* remove synch
* more fixes for TP
* temp fix for TP : some attention layers's FP8 scales are too small + shared is local colwise and anything is local if FP8 because weights are used
* updates to have generation work!
* push most of the changes
* reorder functions + call for contributions!
* update readme
* nits
* update
* ruff was updated on main
* merge with main and fix copies
* revert unrelated changes
* route all tokens to all experts when testing to avoid no gradient iddues
* finish fixing all tests
* fixup
* nit
* clean config
* last readme changes
* nit
* do cnit
* typo
* last nit
* one more one more
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: arthur@huggingface.co <arthur@ip-26-0-165-131.ec2.internal >
2025-03-28 15:56:59 +01:00
Abu Bakr Soliman
49b5ab6a27
Support QuestionAnswering Module for ModernBert based models. ( #35566 )
...
* push ModernBertForQuestionAnswering
* update ModernBertForQuestionAnswering
* update __init__ loading
* set imports for ModernBertForQuestionAnswering
* update ModernBertForQuestionAnswering
* remove debugging logs
* update init_weights method
* remove custom initialization for ModernBertForQuestionAnswering
* apply make fix-copies
* apply make style
* apply make fix-copies
* append ModernBertForQuestionAnswering to the pipeline supported models
* remove unused file
* remove invalid autoload value
* update en/model_doc/modernbert.md
* apply make fixup command
* make fixup
* Update dummies
* update usage tips for ModernBertForQuestionAnswering
* update usage tips for ModernBertForQuestionAnswering
* add init
* add lint
* add consistency
* update init test
* change text to trigger stuck text
* use self.loss_function instead of custom loss
By @Cyrilvallez
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
* Update modeling_modernbert.py
make comparable commit to even it out
* Match whitespace
* whitespace
---------
Co-authored-by: Matt <rocketknight1@gmail.com >
Co-authored-by: Orion Weller <wellerorion@gmail.com >
Co-authored-by: Orion Weller <31665361+orionw@users.noreply.github.com >
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
2025-03-26 21:24:18 +01:00
Steven Liu
3a8ec8c467
[docs] Attention mask image ( #36970 )
...
add image
2025-03-26 10:11:34 -07:00
Cyril Vallez
788e1092e9
Allow easy registration of custom attention functions ( #36889 )
...
* Update modeling_utils.py
* style
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* add to init
* Update modeling_utils.py
* style
* update
* Update modeling_utils.py
* Update modeling_utils.py
* style
* Add some doc
* Update _toctree.yml
* readd it for tgi/vllm compat
* CIs
* CIs
2025-03-26 16:15:06 +01:00
Steven Liu
a844297088
[docs] Fix image link ( #36869 )
...
* fix image link
* fix
* update
* fix
2025-03-25 11:34:21 -07:00
Cyril Vallez
4303d88c09
Add Phi4 multimodal ( #36939 )
...
* raw start
* update
* update
* add to imports
* update
* up
* simplify configs
* clean configs
* style
* typos
* Update convert_phi4_multimodal_weights_to_hf.py
* Update convert_phi4_multimodal_weights_to_hf.py
* fix
* up
* up
* up
* Update convert_phi4_multimodal_weights_to_hf.py
* Update convert_phi4_multimodal_weights_to_hf.py
* up
* up
* up
* Update feature_extraction_phi4_multimodal.py
* up
* up
* up
* up
* up
* simplify configs
* typo
* cut code
* typo
* typo
* typo
* re
* typo
* up
* up
* up
* add tests
* fix
* fix
* Update test_modeling_phi4_multimodal.py
* up
* Update test_modeling_phi4_multimodal.py
* doc
* fix
* up
* up
* up
* up
* up
* up
* simplify
* up
* simplify
* config docstrings
* cleanup
* clean
* typo
* typo
* fix
* Update phi4_multimodal.md
* fix
* fix
* Update test_modeling_phi4_multimodal.py
* update
* simplify reshapes and permutes
* up
* simplify special tokens
* simplify processor a lot
* Update processing_phi4_multimodal.py
* Update processing_phi4_multimodal.py
* switch to fast processor
* image processor
* Update image_processing_phi4_multimodal_fast.py
* add lora extraction to converter
* Update convert_phi4_multimodal_weights_to_hf.py
* Update __init__.py
* add AudioInput type in audio_utils
* rewrite feature_extraction: support torch batched FFT
* input_audio_embeds -> audio_input_features, input_image_embeds -> image_pixel_values
* test update
* not mono channel warning update
* remove auto maps from processor
* kargs dispatch in processor
* simplify kwargs dispatch
* simplify merging
* remove default sampling rate
* style
* Update test_modeling_phi4_multimodal.py
* update doc
* doc
* torch only feature extractor
* make fake tokens adjustable
* Update feature_extraction_phi4_multimodal.py
* fix
* Update processing_phi4_multimodal.py
* simplify mask
* last touch
* fix copies
* style
* Update audio_utils.py
* style
* Update feature_extraction_phi4_multimodal.py
* Update __init__.py
* docstrings
* copies
* fix all checks
* back to fix-copies
* trigger CIs
* Update feature_extraction_phi4_multimodal.py
* improve tests with multimodal inputs
* trigger CIs
---------
Co-authored-by: Eustache Le Bihan <eulebihan@gmail.com >
2025-03-25 09:55:21 +01:00
Raushan Turganbay
47e5432805
Deprecate #36741 and map Causal to Conditional ( #36917 )
...
* deprecate the prev fix
* reword warning and update docs
* reword warning
* tests
* dont bloat `get_text_config()`
2025-03-25 09:13:56 +01:00
omahs
cbf924b76c
Fix typos ( #36910 )
...
* fix typos
* fix typos
* fix typos
* fix typos
2025-03-24 14:08:29 +00:00
Aritra Roy Gosthipaty
c9d1e5238a
Update installation.md ( #36826 )
...
* Update installation.md
* Update README.md
2025-03-21 16:32:02 -07:00
Steven Liu
d253de6d58
[docs] Model docs ( #36469 )
...
* initial
* fix
* fix
* update
* fix
* fixes
* quantization
* attention mask visualizer
* multimodal
* small changes
* fix code samples
2025-03-21 15:35:22 -07:00
Joao Gante
949cca4061
[CI] doc builder without custom image ( #36862 )
...
* no image
* test
* revert jax version updates
* make fixup
* update autodoc path for model_addition_debugger
* shieldgemma2
* add missing pages to toctree
2025-03-21 09:10:27 +00:00
Pablo Montalvo
1d3f35f30a
Add model visual debugger ( #36798 )
...
* draft of model tracer visualiser
* add context manager in addition to decorator
* add debug utils to init
* move model debugging utils to dedicated file
* add documentation
* protect some imports
* format
* move and protect imports
* format
* doc: improve errors in case of broken dummy imports.
* format
* use automatic torch backend
* update doc
* fix backend
* (TEMP) move to dummies while backend wait
* update documentation
* doc
2025-03-20 17:37:29 +01:00
Haotong LIN
6515c25953
Add Prompt Depth Anything Model ( #35401 )
...
* add prompt depth anything model by modular transformer
* add prompt depth anything docs and imports
* update code style according transformers doc
* update code style: import order issue is fixed by custom_init_isort
* fix depth shape from B,1,H,W to B,H,W which is as the same as Depth Anything
* move prompt depth anything to vision models in _toctree.yml
* update backbone test; there is no need for resnet18 backbone test
* update init file & pass RUN_SLOW tests
* update len(prompt_depth) to prompt_depth.shape[0]
Co-authored-by: Joshua Lochner <admin@xenova.com >
* fix torch_int/model_doc
* fix typo
* update PromptDepthAnythingImageProcessor
* fix typo
* fix typo for prompt depth anything doc
* update promptda overview image link of huggingface repo
* fix some typos in promptda doc
* Update image processing to include pad_image, prompt depth position, and related explanations for better clarity and functionality.
* add copy disclaimer for prompt depth anything image processing
* fix some format typos in image processing and conversion scripts
* fix nn.ReLU(False) to nn.ReLU()
* rename residual layer as it's a sequential layer
* move size compute to a separate line/variable for easier debug in modular prompt depth anything
* fix modular format for prompt depth anything
* update modular prompt depth anything
* fix scale to meter and some internal funcs warp
* fix code style in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in image_processing_prompt_depth_anything.py
* fix issues in prompt depth anything
* update converting script similar to mllamma
* update testing for modeling prompt depth anything
* update testing for image_processing_prompt_depth_anything
* fix assertion in image_processing_prompt_depth_anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update src/transformers/models/prompt_depth_anything/image_processing_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update docs/source/en/model_doc/prompt_depth_anything.md
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* update some testing
* fix testing
* fix
* add return doc for forward of prompt depth anything
* Update src/transformers/models/prompt_depth_anything/modular_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* Update tests/models/prompt_depth_anything/test_modeling_prompt_depth_anything.py
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
* fix prompt depth order
* fix format for testing prompt depth anything
* fix minor issues in prompt depth anything doc
* fix format for modular prompt depth anything
* revert format for modular prompt depth anything
* revert format for modular prompt depth anything
* update format for modular prompt depth anything
* fix parallel testing errors
* fix doc for prompt depth anything
* Add header
* Fix imports
* Licence header
---------
Co-authored-by: Joshua Lochner <admin@xenova.com >
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com >
2025-03-20 16:12:44 +00:00
Pavel Iakubovskii
66291778dd
Refactor Attention implementation for ViT-based models ( #36545 )
...
* Refactor vit attention
* Refactor ViT-based models
* 🚨 🚨 🚨 Fix prefix for DPT
* Update params order
* trigger tests
* Fix Dinov2 attention
* Fix DPT attention impl propagation for backbone config
* Common test fix: config is modif. inplace - avoid it
* view->reshape
* Fixup
* Fixup
* Enable IJepa FA2
* Add FA2 in corresponding model docs
2025-03-20 15:15:01 +00:00
fxmarty-amd
1a374799ce
Support loading Quark quantized models in Transformers ( #36372 )
...
* add quark quantizer
* add quark doc
* clean up doc
* fix tests
* make style
* more style fixes
* cleanup imports
* cleaning
* precise install
* Update docs/source/en/quantization/quark.md
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update tests/quantization/quark_integration/test_quark.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* remove import guard as suggested
* update copyright headers
* add quark to transformers-quantization-latest-gpu Dockerfile
* make tests pass on transformers main + quark==0.7
* add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Bowen Bao <bowenbao@amd.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-03-20 15:40:51 +01:00
Ryan Mullins
487dab1b2b
Shieldgemma2 ( #36678 )
...
* single commit
* correct config
* fixup
* dummy pt
* Use ShieldGemma2Config in conversion script
* Update src/transformers/models/shieldgemma2/configuration_shieldgemma2.py
* Adding shieldgemma2 to models.__init__.py
* Adding ShieldGemma2 to main __init__.py
* Update shieldgemma2.md
* Update shieldgemma2.md
* Adding tests. Addressing review feedback.
* Minor docs update
* Fixing code quality feedback from CI
* Fixing empty messages bug reported by ghunkins
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com >
Co-authored-by: Ren Pang <ain-soph@live.com >
2025-03-20 15:14:38 +01:00
Joao Gante
957b05b413
[qwen2 audio] remove redundant code and update docs ( #36282 )
2025-03-20 10:54:51 +00:00
HDCharles
94555437e2
Disable inductor config setter by default ( #36608 )
...
* Disable inductor config setter by default
This is hard to debug and should be off by default
* remove default settings in autoquant too
* Add info to torchao.md about recommended settings
* satisfying Ruff format
Summary:
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-03-20 11:23:14 +01:00
Matt
9be4728af8
Just import torch AdamW instead ( #36177 )
...
* Just import torch AdamW instead
* Update docs too
* Make AdamW undocumented
* make fixup
* Add a basic wrapper class
* Add it back to the docs
* Just remove AdamW entirely
* Remove some AdamW references
* Drop AdamW from the public init
* make fix-copies
* Cleanup some references
* make fixup
* Delete lots of transformers.AdamW references
* Remove extra references to adamw_hf
2025-03-19 18:29:40 +00:00
Mohamed Mekkouri
258dd9cc69
Add Space to Bitsandbytes doc ( #36834 )
...
* add space
* address review
2025-03-19 18:56:07 +01:00
Driss Guessous
e8d960329e
Add option for ao base configs ( #36526 )
2025-03-19 14:59:47 +01:00
Yoni Gozlan
12f2ebef63
Support custom dosctrings in modular ( #36726 )
...
* Override docstrings in modular if not none
* Update doc
2025-03-18 14:00:54 -04:00
Yoni Gozlan
30580f035b
Fix Mistral3 tests ( #36797 )
...
* fix processor tests
* fix modeling tests
* fix test processor chat template
* revert modeling test changes
2025-03-18 13:08:12 -04:00
Cyril Vallez
e959530b8f
Add Mistral3 ( #36790 )
...
Release - Conda / build_and_package (push) Has been cancelled
Secret Leaks / trufflehog (push) Has been cancelled
* initial start
* style and dummies
* Create convert_mistral3_weights_to_hf.py
* update
* typo
* typo
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* update
* update
* Update image_processing_mistral3.py
* Update convert_mistral3_weights_to_hf.py
* fix patch merger
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* up
* update modular to fit
* style
* Update convert_mistral3_weights_to_hf.py
* typo
* Update modular_mistral3.py
* simplify a lot all shape shenanigans
* simplify
* add working test processor
* Add partially working common modeling tests
* All tests working and remove mistral3 image processors
* add docs and fixup
* fix inference with image size >1540
* 🚨 fix test image proc pixtral
* Remove vision_feature_select_strategy
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* Update convert_mistral3_weights_to_hf.py
* clean
* fix test checkpoints
* Update test_modeling_mistral3.py
* Update test_modeling_mistral3.py
* style
* Use Pixtral processor
* up
* finish cleaning processor to use pixtral directly
* Update __init__.py
* Update processing_pixtral.py
* doc
* Update __init__.py
* Update mistral3.md
* Update _toctree.yml
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co >
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com >
2025-03-18 12:04:42 +01:00
Steven Liu
ac1a1b66b9
[docs] Update README ( #36265 )
...
* update
* feedback
* feedback
* update versions
2025-03-17 09:37:19 -07:00
Christopher Akiki
e3af4fec91
[MINOR:TYPO] Update hubert.md ( #36733 )
...
* [MINOR:TYPO] Update hubert.md
- typo fix (wave2vec instead of hubert)
- make code snippet copiable and runnable
* Run tests
2025-03-17 09:07:51 -07:00