Quentin Gallouédec
de24fb63ed
Use HF papers ( #38184 )
...
* Use hf papers
* Hugging Face papers
* doi to hf papers
* style
2025-06-13 11:07:09 +00:00
Raushan Turganbay
e26ae89281
[docs] update cache docs with new info ( #38775 )
...
* update docs with new info
* Update docs/source/en/kv_cache.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-13 07:10:56 +00:00
SohamPrabhu
85f060e9b0
Updated moonshine modelcard ( #38711 )
...
* Moved the sources to the right
* small Changes
* Some Changes to moonshine
* Added the install to pipeline
* updated the moonshine model card
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/moonshine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Updated Documentation According to changes
* Fixed the model with the commits
* Update moonshine.md
* Update moshi.md
---------
Co-authored-by: Your Name <sohamprabhu@Mac.fios-router.home >
Co-authored-by: Your Name <sohamprabhu@Sohams-MacBook-Air.local >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-12 10:27:17 -07:00
Drew Ross
645cf297cc
Add missing div in Pegasus model card ( #38773 )
...
Add missing div
2025-06-12 10:27:07 -07:00
Yusuf Shihata
346f341630
[Docs] New DiT model card ( #38721 )
...
* documentation finished
* Update dit.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-12 10:26:50 -07:00
Cyril Vallez
4b8ec667e9
Remove all traces of low_cpu_mem_usage ( #38792 )
...
* remove it from all py files
* remove it from the doc
* remove it from examples
* style
* remove traces of _fast_init
* Update test_peft_integration.py
* CIs
2025-06-12 16:39:33 +02:00
Lysandre Debut
6a5fd0c6d2
Reword README in light of model definitions ( #38762 )
...
* Slight readme reword
* reword
* reword
* reword
* Slight readme reword
2025-06-12 14:43:31 +01:00
Jesse Cai
e1812864ab
[docs] Add int4wo + 2:4 sparsity example to TorchAO README ( #38592 )
...
* update quantization readme
* update
---------
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-06-12 12:17:07 +00:00
rileyafox
9487765f07
Add Qwen2 MoE model card ( #38649 )
...
* Add Qwen2 MoE model card
* Revisions to qwen2 moe model card
* Add Qwen2 MoE model card
2025-06-11 15:14:01 -07:00
Emile Aydar
32dbf4bddb
Update altCLIP model card ( #38306 )
...
* Update altclip.md
* Update altclip.md
* Update altclip.md
* Update altclip.md
* Update altclip.md
* Update altclip.md
* Rename altclip.md to altclip.mdx
* Rename altclip.mdx to altclip.md
* Update altclip.md
* Update altclip.md
* Update altclip.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-11 14:48:34 -07:00
Drew Ross
bb44d2a0f6
Update pegasus model card ( #38675 )
...
* Update Pegasus model card
* Fix transformers-cli command
* Update code examples to use bfloat16
* Reverted code examples to use float16
* Fix typo, update checkpoints link
* Update str formatting in code examples
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Fix typo
* Remove inaccurate badges
* Revert badge removal
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Include cache_implementation argument in quantization example
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-11 10:56:25 -07:00
Marc Sun
11ad9be153
Better typing for num_items_in_batch ( #38728 )
...
* fix
* style
* type checking ?
* maybe this ?
* fix
* can't be an int anymore
* fix
2025-06-11 16:26:41 +02:00
Pavel Iakubovskii
84710a4291
Add V-JEPA 2 ( #38746 )
...
* adding model and conversion scripts
* add imports to test vjepa conversion
* fix imports and make conversion work
* fix computation for short side
* replace attention with library attention function
* cleanup more attention classes
* remove config overrides
* add test cases, fix some of the failing ones
* fix the model outputs
* fix outputs of the model per review
* fix too big model test case
* fix styling __init__.py
* fix initialization test
* remove all asserts per review
* update sorting unsorting logic as per feedback
* remove is_video per review
* remove another is_video segment
* remove unwanted stuff
* small fixes
* add docstrings for the model
* revert adding vjepa2 config here
* update styling
* add config docstrings (wip)
* fix dpr issue
* removed test failing issues
* update styles
* merge predictor configs into main config
* remove processing code, add video processor
* remove permute which is not necessary now
* fix styles
* updated vjepa2 to be in video_processing_auto
* update comment for preprocessing
* test integration test and fix the outputs
* update test values, change test to look at repeated frames for a given image
* add a simple video processing test
* refactoring pixel_values_videos and upload ckpts to original
* fix torch_fx test cases
* remove unused config
* add all config docstrings
* add more integration tests
* add basic doc
* revert unwanted styling changes
* working make fixup
* Fix model_type in config
* update attention implementation to fit new hf standards
* fix the preprocessing logic, ensure it matches the original model
* remove use_rope logic, cleanup
* fix docstrings
* Further cleanup, update doc
* Fix model prefix
* fix get_vision_features
* VJEPA2Embeddings style refactor
* nit, style comment
* change modules default values
* Only `str` activation in config
* GradientCheckpointingLayer
* fixup
* fix conversion script
* Remove return_dict
* remove None return typehint
* Refactor VJEPA2Layer, remove use_SiLU
* Fix fx tests
* dpr -> drop_path_rates
* move *ModelOutput on top
* format docs bit
* update docs
* update docs
* update doc example
* remove prune_heads from model
* remove unused config params
* refactor embed signature
* Add vjepa to docs
* Fix config docstring
* update defaults
* Update docs/source/en/model_doc/vjepa2.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
* Update docs/source/en/model_doc/vjepa2.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
* Fix import
* Min refactoring
* Update HUB_SOURCE and HUB_REPO in conversion script
* Add missing headers
* VJEPA -> V-JEPA in docs
* Add image to doc
* fix style
* fix init weights
* change checkpoint name in modeling tests
---------
Co-authored-by: Koustuv Sinha <koustuv.sinha@mail.mcgill.ca >
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co >
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
Co-authored-by: Koustuv Sinha <koustuvsinha@gmail.com >
Co-authored-by: Pedro Cuenca <pedro@huggingface.co >
2025-06-11 15:00:08 +01:00
RogerSinghChugh
aa798b7ac9
New canine model card ( #38631 )
...
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Commit for new_gpt_model_card.
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* commit for new canine model card.
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/canine.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* implemented suggestion by @stevhliu.
* Update canine.md
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-10 09:30:05 -07:00
Yana Mishula
81799d8b55
Standardize ByT5 model card format ( #38699 )
...
* Standardize ByT5 model card format
* Apply review feedback from @stevhliu
* Fix Notes formatting and wording
* Fix `aya_vision` test (#38674 )
* fix 1: load_in_4bit=True,
* fix 2: decorator
* fixfix 2: breakpoint
* fixfix 3: update
* fixfix 4: fast
* fixfix 5: cond
* fixfix 5: cond
* fixfix 6: cuda 8
* ruff
* breakpoint
* dtype
* a10
* a10
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
* Fix autodoc formatting for ByT5Tokenizer
---------
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com >
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-06-09 15:02:50 -07:00
Aashish Anand
b61c47f5a5
Created model card for xlm-roberta-xl ( #38597 )
...
* Created model card for xlm-roberta-xl
* Update XLM-RoBERTa-XL model card with improved descriptions and usage examples
* Minor option labeling fix
* Added MaskedLM version of XLM RoBERTa XL to model card
* Added quantization example for XLM RoBERTa XL model card
* minor fixes to xlm roberta xl model card
* Minor fixes to mask format in xlm roberta xl model card
2025-06-09 13:00:38 -07:00
Aashish Anand
e594e75f1b
Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout ( #38596 )
...
* Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout
* Added CLI command example and quantization example for XLM RoBERTa model card.
* Minor change to transformers CLI and quantization example for XLM roberta model card
2025-06-09 12:26:31 -07:00
Aashish Anand
29ca043856
Created model card for XLM model ( #38595 )
...
* Created model card for XLM model
* Revised model card structure and content of XLM model
* Update XLM model documentation with improved examples and code snippets for predicting <mask> tokens using Pipeline and AutoModel.
2025-06-09 12:26:23 -07:00
Matthew Douglas
837ddac1ec
Docs: update bitsandbytes torch.compile compatibility ( #38651 )
2025-06-09 14:51:57 -04:00
Anthony
19224c3642
fix: "check out" as verb ( #38678 )
...
"check out" as verb
2025-06-09 14:07:31 +00:00
Yao Matrix
dc76eff12b
remove ipex_optimize_model usage ( #38632 )
...
* remove ipex_optimize_model usage
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
* update Dockerfile
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com >
---------
Signed-off-by: YAO Matrix <matrix.yao@intel.com >
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com >
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com >
2025-06-06 20:04:44 +02:00
Armaghan Shakir
31023b6909
Fix MiniMax (docs and integration tests checkpoint) ( #38575 )
...
* update checkpoints for integration tests
* minor fixes in docs
2025-06-06 08:43:11 +02:00
Vanshu
593e29c5e2
Updated Aria model card ( #38472 )
...
* Update aria.md
* Update aria.md
* Suggested Updates - aria.md
2025-06-05 14:36:54 -07:00
Parag Ekbote
77cf4936fe
[Nit] Add Note on SigOpt being in Public Archive Mode ( #38610 )
...
* add note on sigopt
* update
* Update docs/source/en/hpo_train.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-05 14:07:23 -07:00
Monish Singhal
c75bf2c36e
Fix typo in LLaVa documentation ( #38618 )
...
* Fix typo in LLaVa documentation
In exactly one section, LlavaImageProcessor was spelt wrongly as LLavaImageProcessor, which throws off copy-pasting the section.
* Fix LlavaImageProcessor url to make it valid (and copypaste-able)
Earlier, the URL contained the entire HF prefix. This commit removes that to ensure that the code block can be copied and run as is.
2025-06-05 13:25:07 -07:00
Henrik Matthiesen
1fed6166c0
added fast image processor for ZoeDepth and expanded tests accordingly ( #38515 )
...
* added fast image processor for ZoeDepth and expanded tests accordingly
* added fast image processor for ZoeDepth and expanded tests accordingly, hopefully fixed repo consistency issue too now
* final edits for zoedepth fast image processor
* final minor edit for zoedepth fast image processor
2025-06-04 22:59:17 +00:00
RogerSinghChugh
8e1266de2b
New gpt neo model card ( #38505 )
...
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Commit for new_gpt_model_card.
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/gpt_neo.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-06-04 09:56:47 -07:00
Manal ML
1285aec4cc
Docs: fix code formatting in torchao docs ( #38504 )
2025-06-04 12:35:21 +00:00
Armaghan Shakir
55736eea99
Add support for MiniMax's MiniMax-Text-01 ( #35831 )
...
* end-to-end architecture
* lightning-attn: refactor, clean, optimize
* put minimax_text_01 in other files
* use latest __init__ standards and auto-generate modular
* support attention_mask for lightning-attn
* Revert "use latest __init__ standards and auto-generate modular"
This reverts commit d8d3c409d89e335c98a8cd36f47304a76eac7493.
* fix modular conversion
* pass both attention masks instead of tuple
* formatting
* Updated Dynamic Cache
* created MiniMaxText01Cache
* fix hardcoded slope_rate
* update attn_type_list in config
* fix lightning when use_cache=False
* copy tests from mixtral
* (checkpoint) all tests pass for normal attention
* fix all unittests
* fix import sorting
* fix consistency and formatting tests
* fix config
* update tests, since changes in main
* fix seq_len error
* create dummy docs
* fix checkpoint
* add checkpoint in config docstring
* run modular_conversion
* update docs
* fix checkpoint path and update tests
* fix ruff
* remove repeated expected_slice
* update docs
* rename "minimax-text-01" to "minimax"
* inherit config from mixtral
* remove from docs in other languages
* undo files that should be untouched
* move minimax to end in conversation docs
* use MiniMaxForCausalLM as it is
* ruff fixes
* run modular
* fix docstring example in causallm
* refactor attention loop and decay factors
* refactor config in modular
* run modular
* refactor cache
* rename static_cache to linear_cache
* make positional embeddings necessary
* remove unnecessary layernorms declarations
* fix import in tests
* refactor attention in next tokens
* remove outdated code
* formatting and modular
* update tests
* rename layernorm alpha/beta factors
* register decay factors as buffers
* remove unused declarations of decay factors
* update config for alpha/beta factors
* run modular
* remove head_dim in tests
* remove minimax from fx.py
* remove stuff that is not really needed
* update __init__
* update qkv torch.split
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
* fix qkv torch.split
* quality fixes
* remove mistakenly added dummy
* purge unused ModelTester code
* fix-copies
* run fix-copies
* fix head_dim
* write cache formatting tests
* remove postnorm
* avoid contiguous in attention current states
* update expected_slice
* add generation test for integration
* fix dtype in generation test
* update authors
* update with changes in main
* update gradient checkpointing and minor fixes
* fix mutable attn_type_list
* rename: attn_type -> layer_type
* update for layer_types
* update integration tests
* update checkpoint
* clean overview in docs
---------
Co-authored-by: Shakib-IO <shakib.khan17@northsouth.edu >
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com >
2025-06-04 09:38:40 +02:00
Steven Liu
78d771c3c2
[docs] Format fix ( #38414 )
...
fix table
2025-06-03 09:53:23 -07:00
Driss Guessous
279000bb70
Name change AOPermod -> ModuleFqn ( #38456 )
...
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
2025-06-03 15:43:31 +00:00
Tony Wu
c72ba69441
Add ColQwen2 to 🤗 transformers ( #35778 )
...
* feat: add colqwen2 (wip)
* tests: fix test_attention_outputs
* tests: reduce hidden size to accelerate tests
* tests: fix `test_attention_outputs` 🥳
* fix: fix wrong parent class for `ColQwen2ForRetrievalOutput`
* fix: minor typing and style changes
* chore: run `make style`
* feat: remove redundant `max_num_visual_tokens` attribute in `ColQwen2Processor`
* tests: tweak comments
* style: apply ruff formatter
* feat: move default values for `visual_prompt_prefix` and `query_prefix`
* docs: update ColQwen2 model card
* docs: tweak model cards
* docs: add required example config checkpoint
* tests: update expected scores in integration test
* docs: tweak quickstart snippets
* fix: address PR comments
* tests: fix colqwen2 tests + tweak comment in colpali test
* tests: unskip useful tests
* fix: fix bug when `visual_prompt_prefix` or `query_prefix` is an empty string
* fix: fix ColPali outputs when `return_dict == False`
* fix: fix issue with PaliGemma output not being a dict
* docs: set default dtype to bfloat16 in quickstart snippets
* fix: fix error when `return_dict=False` in ColPali and ColQwen2
* tests: fix special tokens not being replaced in input_ids
* style: fix lint
* fix: `ColQwen2Processor`'s `padding_side` is now set from `processor_config.json`
* fix: remove unused `padding_side` in ColQwen2 model
* docs: update ColQwen2's model doc
* fix: fix hardcoded vlm backbone class in ColQwen2Config
* fix: remove `padding_side` from ColQwen2Processor as it should be fed from kwargs
* docs: fix typo in model docstring
* docs: add illuin mention in model docs
* fix: let `padding_size` be handled by `tokenizer_config.json`
* docs: add colpali reference url in colqwen2's model doc
* docs: add Hf mention in model docs
* docs: add late interaction mention in model docs
* docs: tweak colqwen2 model doc
* docs: update reference checkpoint for ColPali to v1.3
* docs: simplify quickstart snippets
* docs: remove redundant `.eval()`
* refactor: use `can_return_tuple` decorator for ColPali and ColQwen2
* docs: fix copyright date
* docs: add missing copyright in tests
* fix: raise error when `initializer_range` is not in config
* docs: remove redundant `.eval()` in colpali doc
* fix: fix `get_text_config` now that Qwen2VL has a proper `text_config` attribute
See https://github.com/huggingface/transformers/pull/37268 for details about changes in Qwen2VL's config.
* fix: add missing `initializer_range` attribute in `ColQwen2Config`
* fix: use `get_text_config` in `resize_token_embeddings`
* update colqwen2 with auto_docstring
* docs: fix wrong copyright year
* chore: remove `raise` as `initializer_range` has a default value in `ColQwen2Config`
* refactor: merge `inner_forward` into `forward`
* Refactor colqwen2 after refactoring of qwen2VL, use modular for modeling code
* protect torch import in modular to protect in processing
* protect torch import in modular to protect in processing
* tests: fix hf model path in ColQwen2 integration test
* docs: clarify `attn_implementation` and add comments
* docs: add fallback snippet for using offline PIL dummy images
* docs: temporarily revert attn_implementation to `None` while sdpa is not fixed
* docs: tweaks in colpali/colqwen2 quick start snippets
* fix: add missing flags to enable SDPA/Flex Attention in ColQwen2 model
* fix: add missing changes in modular file
* fix modeling tests
---------
Co-authored-by: yonigozlan <yoni.gozlan@huggingface.co >
2025-06-02 12:58:01 +00:00
Joao Gante
beaed8ce01
[generate] move SinkCache to a custom_generate repo ( #38399 )
...
remove sink cache
2025-06-02 12:13:30 +02:00
Fanli Lin
51d732709e
[docs] add xpu environment variable for gpu selection ( #38194 )
...
* squash commits
* rename gpu
* rename accelerator
* change _toctree.yml
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: sdp <sdp@a4bf01943ff7.jf.intel.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-05-30 16:05:07 +00:00
Yuanzhou Cai
942c60956f
Model card for mobilenet v1 and v2 ( #37948 )
...
* doc: #36979
* doc: update hfoptions
* add model checkpoints links
* add model checkpoints links
* update example output
* update style #36979
* add pipeline tags
* improve comments
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* apply suggested changes
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-28 09:20:19 -07:00
Jiwook Han
9a8510572b
Updated the model card for ViTMAE ( #38302 )
...
* Update vit_mae.md
* badge float:right
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/vit_mae.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update model_doc/vit_mae.md
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-28 09:19:43 -07:00
Vanshu
c9fcbd5bf9
Updated the Model docs - for the ALIGN model ( #38072 )
...
* Updated the Model docs - for the ALIGN model
* Update docs/source/en/model_doc/align.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/align.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Updated align.md
* Update docs/source/en/model_doc/align.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/align.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update align.md
* fix
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-28 09:19:09 -07:00
Andy Vu
3b3ebcec40
Updated model card for OLMo2 ( #38394 )
...
* Updated OLMo2 model card
* added command line
* Add suggestions
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Added suggestions
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Indented code block as per suggestions
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 16:24:36 -07:00
Tanuj Rai
a092f6babf
Update granite.md ( #37791 )
...
* Update granite.md
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* update granite.md
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/granite.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* minor fixes
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 12:55:15 -07:00
RogerSinghChugh
be7aa3210b
New bart model card ( #37858 )
...
* Modified BART documentation wrt issue #36979.
* Modified BART documentation wrt issue #36979.
* fixed a typo.
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* blank commit.
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 11:51:41 -07:00
RogerSinghChugh
587c1b0ed1
Updated BERTweet model card. ( #37981 )
...
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
* Updated BERTweet model card.
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/bertweet.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* updated toctree (EN).
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 11:51:22 -07:00
RogerSinghChugh
b73faef52f
Updated BigBird Model card as per #36979 . ( #37959 )
...
* Updated BigBird Model card as per #36979 .
* Update docs/source/en/model_doc/big_bird.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/big_bird.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/big_bird.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/model_doc/big_bird.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 11:24:28 -07:00
Madhav Kumar
538e847c06
Updated Zoedepth model card ( #37898 )
...
* Edited zoedepth model card according to specifications.
* Edited Zoedepth model file
* made suggested changes.
2025-05-27 10:06:53 -07:00
Parag Ekbote
4f7b0ff8d1
Update Model Card for Mamba-2 ( #37951 )
...
* update model page.
* update model page.
* Update docs/source/en/model_doc/mamba2.md
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
* update the model page.
* update.
* Apply suggestions from code review
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
* Apply the suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* add a quantization example and update the toctree.
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* remove the additional comma
---------
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-27 10:06:39 -07:00
eustlb
b9f8f863d9
[CSM] update model id ( #38211 )
...
* update model id
* codec_model eval
* add processor img
* use ungated repo for processor tests
2025-05-27 17:03:55 +02:00
eustlb
3142bd8592
[CSM] infer codec model with no_grad + audio eos label ( #38215 )
...
* infer codec model with no_grad
* codec_model eval
* training labels: add audio eos token
2025-05-27 14:10:17 +00:00
Ragnar
63964b7c67
fix typos ( #38336 )
...
* Update video_processor.md
* Update deepseek_v3.md
2025-05-26 14:42:37 +00:00
Manuel de Prada Corral
78079abeff
Improved cache docs ( #38060 )
...
* improved cache docs
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-26 13:53:41 +00:00
Kseniya Parkhamchuk
31f8a0fe8a
[docs]: update roformer.md model card ( #37946 )
...
* Update roformer model card
* fix example purpose description
* fix model description according to the comments
* revert changes for autodoc
* remove unneeded tags
* fix review issues
* fix hfoption
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-23 16:27:56 -07:00
Bryan C.
36f97ae15b
docs(swinv2): Update SwinV2 model card to new standard format ( #37942 )
...
* docs(swinv2): Update SwinV2 model card to new standard format
* docs(swinv2): Apply review suggestions
Incorporates feedback from @stevhliu to:
- Enhance the introductory paragraph with more details about scaling and SimMIM.
- Generalize the tip from "image classification tasks" to "vision tasks".
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-05-23 13:04:13 -07:00