Commit Graph

17006 Commits

Author SHA1 Message Date
Benjamin Bossan
6500f78c86 [PEFT] Support low_cpu_mem_usage option for PEFT loading adapters (#33725)
* [PEFT] Support low_cpu_mem_usage for PEFT loading

PEFT added support for low_cpu_mem_usage=True when loading adapters in
https://github.com/huggingface/peft/pull/1961. This feature is now
available when installing PEFT v0.13.0. With this PR, this option is
also supported when loading PEFT adapters directly into transformers
models.

Additionally, with this PR,
https://github.com/huggingface/diffusers/pull/9510 will be unblocked,
which implements this option in diffusers.

* Fix typo
2024-10-03 16:15:36 +02:00
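A minimal sketch of how the option described above might be used from transformers. It assumes PEFT v0.13.0 or later is installed and that the option is exposed as a `low_cpu_mem_usage` keyword on the model's `load_adapter` method; the helper name `load_adapter_low_mem` and the placeholder ids in the comments are illustrative, not taken from the PR.

```python
def load_adapter_low_mem(model, adapter_id):
    """Load a PEFT adapter into `model`, requesting the low-memory path.

    Assumption: `model` is a transformers model whose `load_adapter`
    accepts `low_cpu_mem_usage` (needs PEFT >= 0.13.0 per the commit message).
    """
    model.load_adapter(adapter_id, low_cpu_mem_usage=True)
    return model


# Intended usage (not executed here; ids are placeholders):
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("<base-model-id>")
# load_adapter_low_mem(model, "<adapter-id>")
```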
Yoach Lacombe
bf0ffe3d29 [Tests] Diverse Whisper fixes (#33665)
* fix beam indices in token_timestamps

* fix attention_mask in FA2

* correct translation example with the right example

* correct how some tests are using outputs + correct num_frames

* fix shortform batch prev cond tests

* make fix-copies

* make fix-copies

* take care of shifting beam indices

* [run-slow] whisper

* [run-slow] whisper
2024-10-03 15:59:01 +02:00
KanTakahiro
ab97a78130 Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese (#33372)
* Fix: use unidic-lite instead of ipadic as the tokenizer dictionary of Japanese

Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local>

* fix the default name

---------

Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local>
Co-authored-by: Kan Takahiro <kan@Kans-Mac-mini.local>
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
2024-10-03 15:30:03 +02:00
Joao Gante
d29738f5b4 Generate tests: modality-agnostic input preparation (#33685) 2024-10-03 14:01:24 +01:00
Arie Pratama Sutiono
f2bf4fcf3d Add SplinterTokenizer unit test (#32652)
* add unit tests for splinter_tokenizer

* add unit test for splinter tokenizer, pass in the question_token so it is saved when save_pretrained is called

* remove unused import

* remove vocab_splinter.txt, add Copied from, use fmt:on and fmt:off to prevent autoformatting on long lines

* remove all the spaces

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-10-03 14:49:56 +02:00
Ben Schneider
95a2f5f6c3 Fix module initialization for root module under Zero3 (#33632)
* Use all state dict keys when checking if root module is initialized.

* Apply style corrections

* Add comment explaining change.

* Change comment phrasing.
2024-10-03 14:41:50 +02:00
Guillaume LEGENDRE
4df3ccddb7 Migrate the CI runners to the new clusters (#33849)
* try fixing push-ci

* move to new runners

* move benchmark.yml to new runners

* move doctest_job.yml to new runners

* move doctests.yml to new runners

* move push-important-models.yml to new runners

* move self-pr-slow-ci.yml to new runners

* fix typo

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix working directory

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix working directory

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* improve code

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

---------

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2024-10-03 14:39:49 +02:00
Joao Gante
6f0ce52760 VLM Generate: tag test_static_cache_matches_dynamic as flaky (#33630)
flaky
2024-10-03 12:27:02 +01:00
Nonthachai Plodthong
f1a5f81296 Update an keyerror on _save_check_point prevent confusion of missing … (#33832)
* Update the KeyError in _save_checkpoint to prevent confusion about missing metric keys

* Fix grammar and capitalization.

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* Update the KeyError in the _evaluate function to align with the _save_checkpoint function

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-03 10:27:49 +02:00
HofitBata
dc8156fdd8 Fix dt proj bias reassigned (#33314)
* Setting self.dt_proj.bias = None clears the bias parameter. A later attempt to assign a plain tensor to self.dt_proj.bias raised a TypeError, because PyTorch accepts only a Parameter (or None) for a registered parameter name.
2024-10-03 09:51:03 +02:00
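The failure described above can be reproduced on a small layer. This is a sketch, not the change made in the PR: `dt_proj` here is a stand-in `nn.Linear` with made-up sizes, and it only shows that a plain tensor is rejected while an `nn.Parameter` is accepted.

```python
import torch
from torch import nn

# Stand-in for the dt_proj layer named in the commit message (sizes are arbitrary).
dt_proj = nn.Linear(4, 8)
saved_bias = dt_proj.bias.detach().clone()  # plain tensor copy of the bias

# Clear the bias; the name "bias" stays registered as a parameter slot.
dt_proj.bias = None

# Assigning a plain tensor to a registered parameter name is rejected.
try:
    dt_proj.bias = saved_bias
    error = None
except TypeError as exc:
    error = exc

# Wrapping the tensor in nn.Parameter is accepted.
dt_proj.bias = nn.Parameter(saved_bias)
```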
Yoni Gozlan
d7950bff82 uniformize processor Mllama (#33876)
* uniformize processor Mllama

* nit syntax

* nit
2024-10-02 16:50:15 +02:00
Yoni Gozlan
62e8c759c3 rename all test_processing_*.py to test_processor_*.py (#33878)
* rename all test_processing_*.py to test_processor_*.py and fix duplicate test processor paligemma

* fix copies

* fix broken tests

* fix-copies

* fix test processor bridgetower
2024-10-02 16:43:43 +02:00
Pavel Iakubovskii
2f25ab95db Handle Trainer tokenizer kwarg deprecation with decorator (#33887)
* Handle deprecation with decorator

* Fix for seq2seq Trainer
2024-10-02 15:28:20 +01:00
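A decorator-based keyword deprecation of this kind could look roughly like the sketch below. It is not the helper used in transformers: `rename_deprecated_kwarg`, `ToyTrainer`, and the `"X.Y"` removal version are hypothetical names chosen for illustration. Only the `tokenizer` to `processing_class` rename and the use of `FutureWarning` come from the commit messages in this log.

```python
import functools
import warnings


def rename_deprecated_kwarg(old_name, new_name, version):
    """Hypothetical helper: forward `old_name` to `new_name` with a FutureWarning."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old_name in kwargs:
                if new_name in kwargs:
                    raise ValueError(
                        f"Pass either `{old_name}` or `{new_name}`, not both."
                    )
                warnings.warn(
                    f"`{old_name}` is deprecated and will be removed in "
                    f"version {version}; use `{new_name}` instead.",
                    FutureWarning,
                    stacklevel=2,
                )
                # Move the value to the new keyword before calling through.
                kwargs[new_name] = kwargs.pop(old_name)
            return func(*args, **kwargs)

        return wrapper

    return decorator


class ToyTrainer:
    """Toy stand-in for illustration, not the real Trainer."""

    @rename_deprecated_kwarg("tokenizer", "processing_class", version="X.Y")
    def __init__(self, model=None, processing_class=None):
        self.model = model
        self.processing_class = processing_class
```

Keeping the rename in a decorator leaves the function body free of deprecation branches, so the shim can be deleted in one place once the old keyword is removed.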
Yoni Gozlan
ee71c9853a Optim deformable detr (#33600)
* optimize deformable detr

* fix copies

* remove deformable_detr_basline

* fix hardcoded float16 and .float()

* [run slow] deformable-detr,grounding-dino,mask2former,oneformer,rt-detr

* [run slow] deformable_detr,grounding_dino,mask2former,oneformer,rt_detr
2024-10-02 15:46:27 +02:00
Marc Sun
cac4a4876b [Quantization] Switch to optimum-quanto (#31732)
* switch to optimum-quanto rebase squash

* fix import check

* again

* test try-except

* style
2024-10-02 15:14:34 +02:00
amyeroberts
b7474f211d Trainer - deprecate tokenizer for processing_class (#32385)
* Trainer - deprecate tokenizer for processing_class

* Extend change across Seq2Seq trainer and docs

* Add tests

* Update to FutureWarning and add deprecation version
2024-10-02 14:08:46 +01:00
Omar Salman
e7c8af7f33 Add sdpa for DistilBert (#33724)
* Add sdpa for DistilBert

* [run_slow] distilbert

* [run_slow] distilbert

* [run_slow] distilbert

* Try without slow tests

* [run_slow] distilbert

* [run_slow] distilbert
2024-10-02 13:55:19 +01:00
Kyle Sayers
614c79a9b0 Fix kwargs passed by AutoQuantizationConfig.from_pretrained (#33798)
fix kwargs

Co-authored-by: kylesayrs <kyle@neuralmagic.com>
2024-10-02 14:12:03 +02:00
Kyle Sayers
b09234cfc1 Allow for nightly packages of compressed_tensors (#33828)
* only check spec

* correct typo in nightly package name
2024-10-02 14:11:44 +02:00
g-prz
fe484726aa Add falcon gguf (#33437)
* feat(gguf): add falcon q2 k

* fix(gguf): remove useless renaming

* feat(gguf): separate falcon 7b and 40b

* feat(gguf): apply fixup

* fix(test): error rebase

* feat(gguf): add fp16 weight comparison for falcon

* feat(gguf): test weight of all layers

* test(gguf): add falcon 40b under skip decorator

* feat(gguf): quick example for extracting model size
2024-10-02 14:10:39 +02:00
George
181c962aab populate quantization_config for kv-cache-scheme only configs (#33874) 2024-10-02 14:06:40 +02:00
Yih-Dar
e5d14f39ad Don't run reminder bot for now (#33883)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-02 11:51:01 +02:00
Pablo Montalvo
50290cf7a0 Uniformize model processors (#31368)
* add initial design for uniform processors + align model

* add uniform processors for altclip + chinese_clip

* add uniform processors for blip + blip2

* fix mutable default 👀

* add configuration test

* handle structured kwargs w defaults + add test

* protect torch-specific test

* fix style

* fix

* rebase

* update processor to generic kwargs + test

* fix style

* add sensible kwargs merge

* update test

* fix assertEqual

* move kwargs merging to processing common

* rework kwargs for type hinting

* just get Unpack from extensions

* run-slow[align]

* handle kwargs passed as nested dict

* add from_pretrained test for nested kwargs handling

* [run-slow]align

* update documentation + imports

* update audio inputs

* protect audio types, silly

* try removing imports

* make things simpler

* simplerer

* move out kwargs test to common mixin

* [run-slow]align

* skip tests for old processors

* [run-slow]align, clip

* !$#@!! protect imports, darn it

* [run-slow]align, clip

* [run-slow]align, clip

* update common processor testing

* add altclip

* add chinese_clip

* add pad_size

* [run-slow]align, clip, chinese_clip, altclip

* remove duplicated tests

* fix

* add blip, blip2, bridgetower

Added tests for bridgetower which override common. Also modified common
tests to force center cropping if existing

* fix

* update doc

* improve documentation for default values

* add model_max_length testing

This parameter depends on tokenizers received.

* Raise if kwargs are specified in two places

* fix

* removed copied from

* match defaults

* force padding

* fix tokenizer test

* clean defaults

* move tests to common

* add missing import

* fix

* adapt bridgetower tests to shortest edge

* uniformize donut processor + tests

* add wav2vec2

* extend common testing to audio processors

* add testing + bert version

* propagate common kwargs to different modalities

* BC order of arguments

* check py version

* revert kwargs merging

* add draft overlap test

* update

* fix blip2 and wav2vec due to updates

* fix copies

* ensure overlapping kwargs do not disappear

* replace .pop by .get to handle duplicated kwargs

* fix copies

* fix missing import

* add clearly wav2vec2_bert to uniformized models

* fix copies

* increase number of features

* fix style

* [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert

* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

* fix concatenation

* [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

* Update tests/test_processing_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* 🧹

* address comments

* clean up + tests

* [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-10-02 10:41:08 +02:00
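Several bullets in the entry above concern merging processor kwargs: structured defaults, kwargs passed as a nested dict, common kwargs propagated to each modality, and an error when a kwarg is specified in two places. The sketch below illustrates that idea only. The function `merge_processor_kwargs`, the group names, and the key sets are all hypothetical and are not the implementation in the PR.

```python
# Hypothetical modality groups and the flat keys each one accepts.
GROUP_KEYS = {
    "text_kwargs": {"padding", "max_length"},
    "images_kwargs": {"size", "do_resize"},
}
# Hypothetical keys that apply to every modality.
COMMON_KEYS = {"return_tensors"}


def merge_processor_kwargs(defaults, **kwargs):
    """Merge defaults, flat kwargs, and nested per-modality dicts."""
    # Copy the defaults so the caller's dicts are never mutated.
    merged = {group: dict(defaults.get(group, {})) for group in GROUP_KEYS}
    # Read nested dicts with .get so the caller's kwargs stay intact.
    nested = {group: kwargs.get(group, {}) for group in GROUP_KEYS}
    flat = {k: v for k, v in kwargs.items() if k not in GROUP_KEYS}

    for group, keys in GROUP_KEYS.items():
        for key, value in nested[group].items():
            if key in flat:
                raise ValueError(
                    f"`{key}` was passed both directly and inside `{group}`."
                )
            merged[group][key] = value
        # Route flat kwargs to the group that owns them.
        for key in keys & flat.keys():
            merged[group][key] = flat[key]

    # Propagate common kwargs to every modality.
    for key in COMMON_KEYS & flat.keys():
        for group in merged:
            merged[group][key] = flat[key]
    return merged
```

For example, calling `merge_processor_kwargs(defaults, padding="max_length", images_kwargs={"size": 224}, return_tensors="pt")` routes `padding` to the text group, `size` to the image group, and `return_tensors` to both.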
TrickEye
2292be6c1b Fix: typo (#33880)
Update llm_tutorial.md: typo
2024-10-02 09:12:21 +01:00
Yoni Gozlan
61ac161a9d Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711)
* add support for custom inputs and batched inputs in ProcessorTesterMixin

* Fix batch_size behavior ProcessorTesterMixin

* Change format prepare inputs batched

* Remove override test pixtral processor

* Remove unnecessary tests and cleanup after new prepare_inputs functions

* Fix instructBlipVideo image processor
2024-10-01 23:52:03 +02:00
amyeroberts
1baa08897d Repo consistency fix after #33339 (#33873)
* Repo consistency fix after #33339

* [run-slow] omdet_turbo
2024-10-01 21:03:15 +01:00
Prakarsh Kaushik
68a2b50069 [Fix] ViViT interpolate_pos_encoding (#33815)
* fix:test_inference_interpolate_pos_encoding

* style:make style;make fixup

* test: add suggestion to test_modeling_vivit

* chore:add suggestions

* style:make style

* [run_slow] vivit

* ci:slow test fix

* [run_slow] vivit
2024-10-01 20:14:35 +01:00
g-prz
8635802af9 Move weight initilization deformabledetr (#33339)
* fix(copy): fixup copy

* fix(deformable_detr): move weight initialization to the right place

* fix(grounding_dino): move weight initialization to the right place

* fix(rt_detr): move weight initialization to the right place

* [run-slow] deformable_detr, grounding_dino, rt_detr
2024-10-01 20:08:57 +01:00
Matt
a43e84cb3b Make ASR pipeline compliant with Hub spec + add tests (#33769)
* Remove max_new_tokens arg

* Add ASR pipeline to testing

* make fixup

* Factor the output test out into a util

* Full error reporting

* Full error reporting

* Update src/transformers/pipelines/automatic_speech_recognition.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Small comment

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-10-01 18:15:04 +01:00
Nicola De Angeli
0256520794 fix: repair depth estimation multiprocessing (#33759)
* fix: repair depth estimation multiprocessing

* test: add test for multiprocess depth estimation
2024-10-01 17:59:59 +01:00
Yih-Dar
f205da9660 Avoid using context that is not accessable from external contributors (#33866)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-01 17:42:45 +02:00
Manal ML
0c4c2d7e07 Add include_loss_for_metrics (#33088)
* Add include_loss_for_metrics

* Fix styling

* Initialize inputs and losses to avoid AttributeError

* Ruff styling

* Refactor compute_metrics and update EvalPrediction

* Change Naming

* Added include_for_metrics to group both args

* Fix style

* Change warnings to logger

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-01 16:51:41 +02:00
jackyjinjing
5f9f58fc59 Validate the eval dataset in advance. (#33743)
* Validate the eval dataset in advance.

* format

* format

* format

* Update src/transformers/trainer.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* format

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-10-01 16:45:06 +02:00
Kyle Sayers
f8110a6ddf Raise accelerate dependency error in case of defaulting low_cpu_mem_usage=True (#33830)
Clarify warning, add import check
2024-10-01 16:44:38 +02:00
aroun-coumar
326b2bad1c This PR contains additional changes for #33143 (#33581)
* fix: Fix optimizer bug in ModelCard

* fix: fix W293

* Fixes in modelcard.py for issue #33143

---------

Co-authored-by: moontidef <53668275+relic-yuexi@users.noreply.github.com>
2024-10-01 16:42:30 +02:00
Raushan Turganbay
b1c914e463 Fix device mismatch errors (#33851)
fix device mismatch errors
2024-10-01 15:55:57 +02:00
Matt
ac28a23b3d Workaround for bark issue in pipelines (#33824)
* Quick workaround for bark + generation_config issue

* make fixup

* [run slow] bark
2024-10-01 14:40:12 +01:00
Francesco Ortu
acdfdd9387 add attention weight up-cast to float32 in chameleon (#33822)
add attention weight float32 cast in chameleon
2024-10-01 15:19:16 +02:00
Fabian David Schmidt
351873a145 fix: skip dropout in eval for flash_attn in various models (#33844)
* fix(m2m_100): skip dropout in eval for flash_attn

* fix(misc): skip dropout in eval for flash attn various models

* chore(m2m_100): copy flash attn from bart

* chore: run make fix-copies

* [run-slow] bart, m2m_100
2024-10-01 14:39:21 +02:00
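Flash-attention kernels take the dropout probability as an explicit argument, so the caller has to zero it outside training. The toy class below sketches that pattern under stated assumptions: `ToyAttention` and `dropout_for_kernel` are hypothetical names, and no real attention kernel is called.

```python
class ToyAttention:
    """Toy stand-in for an attention module with a train/eval flag."""

    def __init__(self, dropout):
        self.dropout = dropout
        self.training = True  # mirrors the usual module default

    def eval(self):
        self.training = False
        return self

    def dropout_for_kernel(self):
        # Pass the configured rate only while training; use 0.0 in eval.
        return self.dropout if self.training else 0.0
```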
Kenza Bouzid
88d960937c Refactor image features selection in LlaVa (#33696)
* refactor image features selection

* break line

* remove whitespace

* add pr comments: include projection and rename function

* make fix-copies

* fix get_image_feature in vip llava
2024-10-01 14:37:31 +02:00
Joao Gante
22266be970 Generate: move llama prepare_inputs_for_generation to GenerationMixin (#33677) 2024-10-01 12:32:54 +01:00
Yih-Dar
d19ab15421 post reminder comment only once (#33848)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-10-01 12:52:53 +02:00
Wing Lian
fbde09c8c9 fix check for hidden size in text model for deepspeed zero3 auto entries (#33829)
* fix check for hidden size in text model for deepspeed zero3 auto entries

* fix typo
2024-10-01 12:28:26 +02:00
Guang Yang
808997a634 Fix passing str dtype to static cache (#33741)
Co-authored-by: Guang Yang <guangyang@fb.com>
2024-10-01 09:50:17 +02:00
Adibvafa Fallahpour
c269c5c74d Fix Mamba slow path bug with dtype mismatch. (#32691)
* Fix Mamba slow path bug with dtype mismatch.

* Update test_modeling_mamba.py

* Improve style.

* Fix issue with cache position of dtype mismatch test.

* Change test for slow path.

* Revert changes.

* Switch to buggy code and add test to catch it.

* Fix the dtype mismatch bug and add test code to verify it.

* Fix minor bug with test.

* Fix incorrect dtype of model output.

* Fix incorrect dtype of cache.

* Fix incorrect dtype of ssm cache.

* Fix incorrect dtype of conv state.

* Remove assertion for ssm state.

* Add assertion for conv state dtype.

* Fix all issues with dtype mismatch test.
2024-10-01 09:28:40 +02:00
dependabot[bot]
570c89625b Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/lxmert (#33821)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-30 21:57:57 +02:00
Aryan
90dca5a71b minor typo fix (#33784)
fix typo
2024-09-30 21:42:22 +02:00
pogpog
b77846a6e6 Fix link in gguf.md (#33768)
Change hyphen to underscore for URL in link to convert_hf_to_gguf.py
2024-09-30 20:17:33 +02:00
aroun-coumar
baa765f813 Fixes for issue #33763 in idefics2 model (#33766) 2024-09-30 18:08:48 +01:00
Joshua Lochner
18c5b216f1 Fix ViT-MAE decoder interpolate (#33330)
* Fix ViT-MAE decoder interpolate

* Add unit test for `interpolate_pos_encoding` w/ custom sizes

* [run_slow] vit_mae
2024-09-30 18:47:13 +02:00