汪志鹏
33c6fdb2cf
Update VITS model card ( #37335 )
...
* Update VITS model card
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vits.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update vits.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-15 13:16:05 -07:00
Parteek
51f544a4d4
Add Fast Conditional-DETR Processor ( #37071 )
...
* Add Fast Conditional-DETR Processor
* Update image_processing_conditional_detr_fast.py
* Add modular_conditional_detr.py
* Update image_processing_conditional_detr_fast.py
* Update tests
* make fix
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-15 18:33:34 +02:00
Parteek
4f1dbe8152
Add Fast Chinese-CLIP Processor ( #37012 )
...
* Add Fast Chinese-CLIP Processor
* Update dummy_torchvision_objects.py
* Fix tests
2025-04-15 18:31:20 +02:00
Merve Noyan
c08997c52e
VDR task guide ( #37485 )
...
* VDR task guide
* Add to toctree
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/tasks/visual_document_retrieval.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-15 08:55:13 -07:00
Yao Matrix
57da364d8e
fix and enhance pipeline_webserver.md ( #36992 )
...
* fix and enhance pipeline_webserver.md
Signed-off-by: Yao, Matrix <matrix.yao@intel.com >
* Update docs/source/en/pipeline_webserver.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/pipeline_webserver.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* use pipe
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
---------
Signed-off-by: Yao, Matrix <matrix.yao@intel.com >
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-15 08:35:05 -07:00
Parteek
f6c79f767c
Add Fast Yolos Processor ( #37292 )
...
* Add Fast Yolos Processor
* Update modular file
* Fix copies
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-15 14:23:08 +02:00
Huajie Tan
6f7ea1cf00
Add MLCD model ( #36182 )
...
* Add MLCD model
* Update codes for auto-mapping
* Add test scripts for MLCD
* Update doc for MLCD model
* Fix import error
* Fix import error
* Fix CI error for attention_outputs
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix code style for CI
* Fix CI error for initialization
* Fix code style for CI
* Fix code style for CI
* Reformat codes and docs for CI test
* Reformat codes and docs for CI test
* Remove unused attributes for CI test
* Fix style for CI test
* List MLCD in flash_attn doc
* Fix: typos, modulars, refactors from suggestions
* Refactoring convert_mlcd_weights_to_hf.py from suggestions
* Fix: docs conflicts
* Fix error for CI test
* Fix style for CI test
* Add integration test for MLCD
* Refactoring by class inheritance
* Fix: refactor attention interface, adjust codes
* Fix: merging conflicts
* Fix: merging conflicts
* Fix: style for CI test
* Fix: style for CI test
* Fix: set test_resize_embeddings to be False
* Fix: initializer for CI test
* Fix: conflicts, CI test, warning and refactoring
* Fix: merging conflicts
* Refactor
* Update docs
* Fix mistakes
* Remove unused args and fix multi-gpu error
* Revert position_embeddings
* Solve conflicts
* Solve conflicts
* Remove dummy
* Update _init_weights
* Update _init_weights
* Update _init_weights for CI test
2025-04-15 11:33:09 +01:00
Parteek
20ceaca228
Add Fast owlvit Processor ( #37164 )
...
* Add Fast Owlvit Processor
* Update image_processing_owlvit_fast.py
* Update image_processing_owlvit_fast.py
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 17:58:09 +02:00
Parteek
a53a63c9c2
Add Fast Mobilenet-V2 Processor ( #37113 )
...
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 17:08:47 +02:00
Yann Chéné
4774a39d05
Add ImageProcessorFast to BiT processor ( #37180 )
...
* Add ImageProcessorFast to BiT processor
* propose a fast processor and add tests
* all tests pass except one
* run make
* remove useless print
* use same test as clip
* apply make
* Update src/transformers/models/bit/image_processing_bit_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* Update setup.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* Update src/transformers/models/bit/image_processing_bit_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* apply review comment
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 17:07:48 +02:00
Parteek
e43f168eb3
Add Fast LeViT Processor ( #37154 )
...
* Add Fast LeViT Processor
* Update levit.md
* Update src/transformers/models/levit/image_processing_levit_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* ruff check
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 17:07:36 +02:00
Vinh H. Pham
7cc9e61a3a
Add Fast Image Processor for Donut ( #37081 )
...
* add donut fast image processor support
* run make style
* Update src/transformers/models/donut/image_processing_donut_fast.py
Co-authored-by: Parteek <parteekkamboj112@gmail.com >
* update test, remove none default values
* add do_align_axis = True test, fix bug in slow image processor
* run make style
* remove np usage
* make style
* Apply suggestions from code review
* Update src/transformers/models/donut/image_processing_donut_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* add size revert in preprocess
* make style
* fix copies
* add test for preprocess with kwargs
* make style
* handle None input_data_format in align_long_axis
---------
Co-authored-by: Parteek <parteekkamboj112@gmail.com >
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 16:24:01 +02:00
Vinh H. Pham
1897a02d83
Add Fast Image Processor for LayoutLMv3 ( #37201 )
...
* support fast image processor layoutlmv3
* make style
* add warning and update test
* make style
* Update src/transformers/models/layoutlmv3/image_processing_layoutlmv3_fast.py
* Update image_processing_auto.py
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 15:42:11 +02:00
Cypher Pepe
7bff4bdcf6
Fixed broken links ( #37466 )
...
* Update broken link
* Update broken link
2025-04-14 14:16:07 +01:00
Vinh H. Pham
e16775d103
Add Fast Image Processor for LayoutLMv2 ( #37203 )
...
* add support layoutlmv2
* make style
* Apply suggestions from code review
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* add warning and clean up
* make style
* Update src/transformers/models/layoutlmv2/image_processing_layoutlmv2_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 15:06:41 +02:00
Vinh H. Pham
49b9a69a36
Add Fast Image Processor for Flava ( #37135 )
...
* support flava fast image processor
* run style and quality
* update test
* update according to reviews
* make style
* update comment on BICUBIC
* make style
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 15:05:31 +02:00
Vinh H. Pham
e7f5724efd
Add Fast Image Processor for Perceiver ( #37176 )
...
* add test and fast image processor
* make style
* Update src/transformers/models/perceiver/image_processing_perceiver_fast.py
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* make style
---------
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-04-14 13:49:13 +02:00
BakerBunker
4b8c6d4cf8
Add Qwen2.5-Omni ( #36752 )
...
* Add qwen2.5-omni
* Remove einops dependency
* Add torchdiffeq dependency
* Sort init
* Add torchdiffeq to extras['diffeq']
* Fix repo consistency
* use cached_file
* del odeint
* renew pytest
* format
* Remove torchdiffeq
* format
* fixed batch infer bug
* Change positional_embedding to parameter
* Change default speaker
* Config revision
* Use modular & code clean
* code clean
* decouple padding with model & code cleaning
* sort init
* fix
* fix
* Second code review
* fix
* fix
* rename vars to full name + some comments
* update pytest
* Code clean & fix
* fix
* style
* more clean up
* fixup
* smaller vision model in tests
* fix processor test
* deflake a bit the tests (still flaky though)
* de-flake tests finally + add generation mixin
* final nits i hope
* make sure processor tests are complete
* replace with Qwen2_5OmniForConditionalGeneration
* fix tests after updating ckpt
* fix typos when cleaning, also we can't change ckpt
* fixup
* images and videos kwargs for processor
* thinker and talker loadable from hub ckpt
* address comments and update tests after rebase
* fixup
* skip for now
* fixup
* fixup
* remove torch dependency in processors
---------
Co-authored-by: lvyuanjun.lyj <lvyuanjun.lyj@alibaba-inc.con >
Co-authored-by: feizi.wx <feizi.wx@alibaba-inc.com >
Co-authored-by: raushan <raushan@huggingface.co >
2025-04-14 12:36:41 +02:00
Joao Gante
aaf129cdae
[agents] remove agents 🧹 ( #37368 )
2025-04-11 18:42:37 +01:00
Alex Brooks
623d395aff
Add Granite Speech Support ( #36801 )
...
* First pass at speech granite
Add encoder / projector, rename things
* Combine into one model file with causal lm outputs for forward
* Add loss calc
* Fix config loading
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com >
* Split new / old loading logic
* Use transformers integration for loading peft adapters
* Add generation wrapper for selective lora enablement
* Add note for qformer encoder automodel
* Guard torch/audio imports in feature extractor
* Handle granite speech autoclasses
* Handle optional deps in package structure for granite speech
* Add granite pretrained model def for init
* Add dummy objects for torch/torchaudio
* Add tests for granite speech processor
* Minor formatting fixes and refactoring
* Add options for falling back to config in forward
* Tentative model docstrings for granite speech
* Fix config type
* Remove legacy load
* Allow non-lora variants for granite speech
* Override weight tying for llm
* Use text config instead of llm config
* Add output embeddings getter to fix weight tying
* Fix relative imports
* computing the number of audio features, based on the raw audio sequence.
* collating audio inputs, and keeping the original lengths.
* asserted we have text. otherwise we can't specify the audio special token.
* asserting the number of audio-symbols/audios match correctly.
running get validated_audios only when audio is present
* indentation bugfix + supporting different feature lengths when expanding audio.
* redundant, done in _get_validated_text
* adapting the tests:
- we must have text (not either audio or text)
- _get_num_audio_features takes a list of raw lengths, provided it instead.
* Minor cleanup, remove unused import
* Add more tests for batch feature processing
* Allow setting offset in rel position embeddings
* Add config option for warning if peft is not installed w/ lora
* Port blip2 qformer code into granite speech
* Add sad test for numpy arr processing
* Allow numpy arrays / tuples in granite speech processor
* Fix config type for projector
* - pad instead of creating a zeros tensor, to keep the original dtype/device (support bfloat16)
- cast input_features to the model dtype (support bfloat16)
* merge Blip2QFormerConfig to GraniteSpeechProjectorConfig
* prevent a crash when re-saving/loading the model (line 109)
* consider additional edge cases during preprocessing.
* consider additional edge cases during preprocessing.
* add features mask for batched inference (bugfix)
* Minor refactor, remove multiaudio processor tests
* Add set input/output embeddings for granite speech
* Fix feature dim check in processor test
* Pop input features in embed test for granite speech
* Small fixes for test edge cases
Add granite speech to seq2seq causal lm mapping names
* Add small tests for granite speech model
* Fix data parallelism test
* Standardize model class names
* Fix check for copies
* Fix misaligned init check
* Skip granite speech in checkpoint check
* Use default for tie_word_embeddings in granite speech
* Fix non documentation granite speech repo issues
* Fix comments and docstring checks
* Add placeholder docs for granite speech
* Fix test naming collision
* Code formatting
* Rerun torch dummy obj regen
* Fix save pretrained for granite speech
* Import sorting
* Fix tests typo
* Remove offset hack
* Pass args through encoder config
* Remove unused prune heads from blip2
* removing einsum. replaced with explicit multiplication (relative positional encodings) and sdpa attention.
* remove Sequential from ConformerFeedForward and ConformerConvModule. + fix for sdpa attention
* remove GraniteSpeechConformerScale
* rename to hidden_states
* rename conformer layers to self.layers, remove the first linear from the list to keep the list homogeneous.
* move pre-norm to the attention/feedforward blocks (avoid complex module wrapping)
* adding pre_norm into forward
* feature extractor refactoring to resemble how it's done in phi4multimodal.
* rename feature_extractor to audio_processor
* bugfix: input_feature_mask fix to get the exact number tokens.
* Fix pytest decorator in processor test
* Add (disabled) integration tests for granite speech
* Fix handling of optional feature masking
* Loosen validation in processing for vLLM compatibility
* Formatting fixes
* Update init structure to mirror llama
* Make granite speech projector generic
* Update test config to reflect generic projector
* Formatting fixes
* Fix typos, add license
* Fix undefined var in input processing
* Cleanup and expose ctc encoder
* Add missing config docstrings
* Better var names, type hints, etc
* Set attn context size in init
* Add max pos emb to encoder config
* Cleanup feature extractor
* Add granite speech architecture details
* Remove granite speech qformer ref
* Add paper link, explicit calc for qkv
* Calculate padding directly in depthwise conv1d init
* Raise value error instead of asserting
* Reorder class defs (classes used at top)
* Precompute relpos distances
* Run formatting
* Pass attention distances through forward
* Apply suggestions from code review
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com >
* Add todo for using common batch feature extraction
* Rename audios/features
* Ensure chat template may be provided to processor
* Move granite speech docs to audio models
* Add todos for input proc refactoring
* Fix import order
* Guard torch import
* Use relative imports
* Require torch backend for processor in granite speech
* Add backend guards in feature extractor
---------
Signed-off-by: Alex-Brooks <Alex.brooks@ibm.com >
Co-authored-by: Avihu Dekel <avihu.dekel@ibm.com >
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com >
2025-04-11 18:52:00 +02:00
Lysandre Debut
54a123f068
Simplify soft dependencies and update the dummy-creation process ( #36827 )
...
* Reverse dependency map shouldn't be created when test_all is set
* [test_all] Remove dummies
* Modular fixes
* Update utils/check_repo.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
* [test_all] Better docs
* [test_all] Update src/transformers/commands/chat.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
* [test_all] Remove deprecated AdaptiveEmbeddings from the tests
* [test_all] Doc builder
* [test_all] is_dummy
* [test_all] Import utils
* [test_all] Doc building should not require all deps
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
2025-04-11 11:08:36 +02:00
Mehant Kammakomati
7d76876498
(Part 2) feat: allow for tp_size attr for tplizing the model ( #37054 )
...
* feat: custom tp_size, new transformers tp interface
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: review cmt - error when tp_plan not set for tp_size
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
* fix: nit in docs
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
---------
Signed-off-by: Mehant Kammakomati <mehant.kammakomati2@ibm.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Matej Sirovatka <54212263+S1ro1@users.noreply.github.com >
2025-04-10 17:44:09 +02:00
AbdelKarim ELJANDOUBI
7ecc5b88c0
Add image classifier donut & update loss calculation for all swins ( #37224 )
...
* add classifier head to donut
* add to transformers __init__
* add to auto model
* fix typo
* add loss for image classification
* add checkpoint
* remove unneeded import
* reorder import
* format
* consistency
* add test of classifier
* add doc
* try ignore
* update loss for all swin models
2025-04-10 15:00:42 +02:00
Raushan Turganbay
1ae8d54b04
[chat-template] Unify tests and clean up 🧼 ( #37275 )
...
* fix tests and some clean up
* make one general test for each modality
* remove redundant merging of kwargs
* edge cases
* dont enforce slow when reloading
* fix gemma3 tests
* has to adapt llama 4 after rebase
* remove also from overridden tests
* should be green now
2025-04-10 14:42:32 +02:00
DerekLiu35
2527f71a47
Add "selecting a quantization method" doc ( #37159 )
...
* initial draft
* make documentation simpler
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/selecting.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* turn pros and cons into tables
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* add links to each quant method page
* separate calibration vs no calibration methods
* add calibration time estimates
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-09 15:51:37 +02:00
Arthur
e3eda6d188
Add glm4 ( #37388 )
...
* add changed
* Revert "add changed"
This reverts commit 0a0166a1fe80556115a49fbf0c2132de0f4f85c9.
* update with NEW MODEL class called GLM4
* update
* Update glm4.md
* Name
* style
* fix copies
* fixup test
---------
Co-authored-by: Yuxuan Zhang <2448370773@qq.com >
2025-04-09 14:02:04 +02:00
logesh R
31a62c2eb8
Updated Model-card for donut ( #37290 )
...
* Updated documentation for Donut model
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Updated code suggestions
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Updated code suggestion to Align with the AutoModel example
* Update docs/source/en/model_doc/donut.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Updated notes section included code examples
* close hfoption block and indent
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 11:54:47 -07:00
Parag Ekbote
e2b0224d94
Update Model Card for Jamba ( #37152 )
...
* Update model card for jamba
* Apply the suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply suggestions from code review-2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update model page.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update as per code review.
* Update docs/source/en/model_doc/jamba.md as per code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/jamba.md as per code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update as per code review.
* fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 11:02:59 -07:00
Devesh Rahatekar
6cc109c354
Improvements in Gemma2 model card ( #37076 )
...
* Improved Model card for Gemma2
* Made changes in gemma2 as suggested
* Made more changes in the doc (adding image, notes, closing hfoptions)
* minor fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:51:26 -07:00
Ashvanth.S
3a826a45ca
Update Model card for GPT2 ( #37101 )
...
* Update Model card for gpt2
* Update link for gpt2 space
* fixes docs based on suggestions
* Add transformers-cli and quantization example for GPT-2
* Remove resources and flash attention docs and fix typos
2025-04-07 10:15:28 -07:00
Ricardo Alanis
5e855095a2
Update falcon mamba card ( #37253 )
...
* feat: edit falcon mamba card
* fix: edit statement on falconmamba arch
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/falcon_mamba.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: add right indent for tags
* fix: remove notes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:12:44 -07:00
Shubham Panchal
416b5a875d
Update model-card for DINOv2 ( #37104 )
...
[docs] Update model-card for DINOv2
2025-04-07 10:11:08 -07:00
Nahieli
f8a16805c5
updated model card for Mistral ( #37156 )
...
* model card for Mistral
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/mistral.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* apply suggestions
* fix typo
* updated with comments
* updated with comments
* updated with comments
* remove hfoption block
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-07 10:05:36 -07:00
Yih-Dar
e7ad077012
byebye torch 2.0 ( #37277 )
...
* bump Torch 2.1 with broken compatibility `torch.compile`
* dep table
* remove usage of is_torch_greater_or_equal_than_2_1
* remove usage of is_torch_greater_or_equal_than_2_1
* remove if is_torch_greater_or_equal("2.1.0")
* remove torch >= "2.1.0"
* deal with 2.0.0
* PyTorch 2.0+ --> PyTorch 2.1+
* ruff 1
* difficult ruff
* address comment
* address comment
---------
Co-authored-by: Jirka B <j.borovec+github@gmail.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-07 15:19:47 +02:00
Arthur
25b7f27234
Add llama4 ( #37307 )
...
* remove one of the last deps
* update fast image processor after refactor
* styling
* more quality of life improvements
* nit
* update
* cleanups
* some cleanups
* vllm updates
* update fake image token
* [convert] Fix typo
* [convert] Strip extraneous bytes from shards
* [convert] Minor fixes
* [convert] Use num_experts
* multi-image fixes in modeling + processor
* fixup size
* 128 experts
* Use default rope
* Unfuse mlp
* simplify a lot inputs embeds merging
* remove .item() 👀
* fix from review
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* set seed
* return aspect ratios and bug fixes
* Moe 128 rebased (#8 )
* 128 experts
* Use default rope
* Unfuse mlp
* Address feedback
* Use None "default" for rope_scaling. Add eot.
* Meta/llama quant compat (#7 )
* add quant compatible model & conversion code for llama4
* fix a few issues
* fix a few issues
* minor type mapping fix
---------
Co-authored-by: Lu Fang <fanglu@fb.com >
* use a new config parameter to determine which model definition to use for MoE
---------
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Lu Fang <fanglu@fb.com >
* un-comment write_tokenizer from converting script
* remove un-used imports
* [llama4] Pop aspect_ratios from image processor output in Llama4Processor
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* Fix parameter_count name
* Update src/transformers/models/llama4/configuration_llama4.py
* nit
* Add changes for no_rope, moe_layers, chunked attention. Just need to test all
* Update src/transformers/models/llama4/image_processing_llama4_fast.py
* nit
* fix post merge with main
* support flex attention
* fixes
* fix
* add layer
* small updates
* rebase and delete llm_compressor
* nit
* [llama4/mm] Add back <|image|> token that delimits global tile
* [llama4/mm] Fix Llama 4 image processing unit tests
* add explicit dtype
Signed-off-by: Jon Swenson <jmswen@gmail.com >
* sdpa works
* comment todo small
* fix model loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* revert
* nits
* small fix for TP on 1 node
* Read new params from config
* Add <|eom|>
* lol don't know how this got here
* adding fp8
* Save processor, fix chat template
* style
* Add boi/eoi tokens
We don't use them.
* fixes for now flex seems to work :)
* updates
* nits
* updates
* missing keys
* add context parallel
* update
* update
* fix
* nits
* add worldsize and make eager attn work for vision
* Ignore new key present in base models
* add tp_plan
* fix nope
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* minor fix
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* Clean up Llama4 vision model
* current updates
* add support for `attn_temperature_tuning`
* add floor scale
* add missing attn scales
* push what works, dirty trick for the device synch
* oups
* Fix pad_token_id
See
https://huggingface.co/ll-re/Llama-4-Scout-17B-16E/discussions/2/files
Confirmed in the original codebase.
* fix causal LM loading
* rm
* fix tied-weights
* fix sdpa
* push current version
* should work with both short and long
* add compressed_tensors & fix fbgemm tp
* Fix flex impl
* style
* chunking
* try to revert the potentially breaking change
* fix auto factory
* fix shapes in general
* rm processing
* commit cache utils cleanup
* Fix context length
* fix
* allocate
* update tp_plan
* fix SDPA!
* Add support for sparse `Llama4TextMoe` layer from the kernel hub
* cleanup
* better merge
* update
* still broken fixing now
* nits
* revert print
* Write max_position_embeddings and max_model_length
* Update modeling_llama4.py
* Save attention_chunk_size
* Sync eos terminators
* Read initializer_range
* style
* remove `dict`
* fix
* eager should use `chunked_attention_mask`
* revert
* fixup
* fix config
* Revert "Merge pull request #36 from huggingface/sparse-llama4-moe"
This reverts commit ccda19f050867dd42ea143c5de60f3dec81375f0, reversing
changes made to a515579aed8c0fe9bf529b6c40446a289406d5d6.
* Fix typo and remove warning with compiled flex and chunked prefill
* Fix MoE vs FF (#41 )
* fix
* Use correct no_rope_layers if provided one is empty list
* update tests
* fix
* skipping some tests
* fix fp8 loading
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* fix text generation pipeline
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
* eager needs 4D mask
* fix
* Some cleanup
* fix
* update
* fix
* replace correctly module
* patch
* modulelist
* update
* update
* clean up
* Don't move to `cuda:0` in distributed mode
* restrict to compressed tensors for now
* rm print
* Docs!
* Fixes
* Update docs/source/en/model_doc/llama4.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
* Fixes
* cuda graph fix
* revert some stuff
* fixup
* styling
* Update src/transformers/models/llama4/modeling_llama4.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
* commit licence, cleanup here and there and style
* more styling changes
* fix dummies
* fix and clean docstrings
* remove comment
* remove warning
* Only fast image processor is supported
* nit
* trigger CI
* fix issue with flex encoder
* fix dynamic cache
* Code quality
* Code quality
* fix more tests for now
* Code quality
* Code quality
* Nuke bunch of failing stuff
* Code quality
* Code quality
* cleanup removal of slow image processor
* ruff fix fast image processor
* fix
* fix styling
* Docs
* Repo consistency
* Repo consistency
* fix sliding window issue
* separate llama cache
* styling
* Repo consistency
* Repo consistency
* push what works
* L4 Repo consistency
* Docs
* fix last last alst alst alst alstsaltlsltlaslt
---------
Signed-off-by: Jon Swenson <jmswen@gmail.com >
Signed-off-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: yonigozlan <yoni.gozlan10@gmail.com >
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com >
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
Co-authored-by: Keyun Tong <tongkeyun@gmail.com >
Co-authored-by: Zijing Liu <liuzijing2014@users.noreply.github.com >
Co-authored-by: Lu Fang <fanglu@fb.com >
Co-authored-by: Zijing Liu <liuzijing2014@gmail.com >
Co-authored-by: Jon Swenson <jmswen@gmail.com >
Co-authored-by: jmswen <jmswen@users.noreply.github.com >
Co-authored-by: MekkCyber <mekk.cyber@gmail.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
Co-authored-by: Mohit Sharma <mohit21sharma.ms@gmail.com >
Co-authored-by: Yong Hoon Shin <yhshin@meta.com >
Co-authored-by: Marc Sun <marc@huggingface.co >
Co-authored-by: drisspg <drisspguessous@gmail.com >
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
Co-authored-by: Daniël de Kok <me@danieldk.eu >
Co-authored-by: Lysandre <hi@lysand.re >
Co-authored-by: Ye (Charlotte) Qi <ye.charlotte.qi@gmail.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-04-05 22:02:22 +02:00
Linnet Cosmos Tuscano
0ef339ff1b
Update OpenAI GPT model card ( #37255 )
...
* Update OpenAI GPT model card
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update OpenAI GPT model card: add usage examples and notes section
* Add API autodoc tags after Notes section for OpenAI GPT model
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/openai-gpt.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Added missing badges
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:25:16 -07:00
Sharareh Younesian
46d73910d5
Updated T5 model card with standardized format ( #37261 )
...
* Updated T5 model card with standardized format
* Updated T5 model card with standardized format, fixed typo
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Apply reviewer suggestions
* Update docs/source/en/model_doc/t5.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:23:09 -07:00
Chathumina Vimukthi
579135a2f6
Updated model card for distilbert ( #37157 )
...
* Updated model card for distilbert
* Updated the distilbert model card
* Updated model card for distilbert
* Updated the distilbert model card
* Addressed code review comments
* Addressed review comments
* fix pipeline
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-04 15:22:46 -07:00
Reshan Gomis
8cd57eb731
mobilebert model card update ( #37256 )
...
* mobilebert model card update
* Updates to model card mobilebert
---------
Co-authored-by: Reshan Gomis <reshang@verdentra.com >
2025-04-04 14:28:35 -07:00
Shubham Panchal
531e4fcf0e
Update model card for Depth Anything ( #37065 )
...
[docs] Update model card for Depth Anything
2025-04-04 11:36:05 -07:00
Joao Gante
ad3d157188
[RoPE] abstract dynamic RoPE update under a decorator ✨ ( #37249 )
...
* dynamic rope decorator
* longrope; shorter fwd pass
* proper docstring
* make fixup
2025-04-04 14:27:28 +01:00
Surya Garikipati
8dd0a2b89c
Update model card for electra ( #37063 )
...
* Update ELECTRA model card with new format
* Update ELECTRA model card with new format
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/electra.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* close hfoption block
---------
Co-authored-by: Wun0 <f20191221@hyderabad.bits-pilani.ac.in >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:45:35 -07:00
Parag Ekbote
15ac2b6ac5
Update Model Card for ModernBERT ( #37052 )
...
* Modify Model Card for ModernBERT.
* Update as per code review.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update model card.
* Update model card.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:14:02 -07:00
Abhishek Ranjan
b552708694
chore: Update model doc for code_llama ( #37115 )
...
* Update code_llama.md
aims to handle https://github.com/huggingface/transformers/issues/36979#issuecomment-2758560598
sub part of https://github.com/huggingface/transformers/issues/36979
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* make changes as per code review
* chore: make the function smaller for attention mask visualizer
* chore[docs]: update code_llama.md with some more suggested changes
* Update docs/source/en/model_doc/code_llama.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* chore[docs] : Update code_llama.md with indentation changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 10:09:41 -07:00
Bimal Gajera
2b84831a93
Update model card for Cohere ( #37056 )
...
* Update Cohere model card to follow standard template
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update cohere.md
Update code snippet for AutoModel, quantization, and transformers-cli
* Update cohere.md
* Update docs/source/en/model_doc/cohere.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:51:40 -07:00
Avigyan Sinha
1b29409d89
feat: updated model card for qwen_2.5_vl ( #37099 )
...
* feat: updated model card for qwen_2.5_vl
* applied suggested change 1
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 2
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* applied suggested change 3
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* fix: made requested changes for quantization and notes
* suggested model card change 4
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 5
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 6
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated model card with suggested change 7
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* feat: applied requested changes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-03 09:13:26 -07:00
Ryan Mullins
3f6af96732
Adding links to ShieldGemma 2 technical report ( #37247 )
2025-04-03 16:26:29 +01:00
Joao Gante
9a1c1fe7ed
[CI] green llama tests ( #37244 )
...
* green llama tests
* use cleanup instead
* better test comment; cleanup upgrade
* better test comment; cleanup upgrade
2025-04-03 14:15:53 +01:00
Raushan Turganbay
98601cc818
[Phi4] add multimodal chat template ( #36996 )
...
* phi4 chat template
* remove from valid kwargs
2025-04-03 09:52:09 +02:00
ARAVINDHAN T
2056287940
Updated model card for Qwen2 ( #37192 )
...
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update qwen2.md
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/qwen2.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-04-02 18:10:41 -07:00