HuggingFace_transformer

Author	SHA1	Message	Date
Benjamin Bossan	6500f78c86	[PEFT] Support low_cpu_mem_usage option for PEFT loading adapters (#33725 ) * [PEFT] Support low_cpu_mem_usage for PEFT loading PEFT added support for low_cpu_mem_usage=True when loading adapters in https://github.com/huggingface/peft/pull/1961. This feature is now available when installing PEFT v0.13.0. With this PR, this option is also supported when loading PEFT adapters directly into transformers models. Additionally, with this PR, https://github.com/huggingface/diffusers/pull/9510 will be unblocked, which implements this option in diffusers. * Fix typo	2024-10-03 16:15:36 +02:00
Yoach Lacombe	bf0ffe3d29	[Tests] Diverse Whisper fixes (#33665 ) * fix beam indices in token_timestamps * fix attention_mask in FA2 * correct translation example with the right example * correct how somes tests are using outputs + correct num_frames * fix shortform batch prev cond tests * make fix-copies * make fix-copies * take care of shifting beam indices * [run-slow] whisper * [run-slow] whisper	2024-10-03 15:59:01 +02:00
KanTakahiro	ab97a78130	Fix: use unidic-lite instead of ipadic as the tokenizer dictionary for Japanese (#33372 ) * Fix: use unidic-lite instead of ipadic as the tokenizer dictionary of Japanese Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local> * fix the default name --------- Signed-off-by: Kan Takahiro <kan@Kans-Mac-mini.local> Co-authored-by: Kan Takahiro <kan@Kans-Mac-mini.local> Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>	2024-10-03 15:30:03 +02:00
Joao Gante	d29738f5b4	Generate tests: modality-agnostic input preparation (#33685 )	2024-10-03 14:01:24 +01:00
Arie Pratama Sutiono	f2bf4fcf3d	Add `SplinterTokenizer` unit test (#32652 ) * add unit tests for splinter_tokenizer * add unit test for splinter tokenizer, pass in the question_token to be saved on save_pretrained called * remove unused import * remove vocab_splinter.txt, add Copied from, use fmt:on and fmt:off to prevent autoformatting on long lines * remove all the spaces Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-10-03 14:49:56 +02:00
Ben Schneider	95a2f5f6c3	Fix module initialization for root module under Zero3 (#33632 ) * Use all state dict keys when checking if root module is initialized. * Apply style corrections * Add comment explaining change. * Change comment phrasing.	2024-10-03 14:41:50 +02:00
Guillaume LEGENDRE	4df3ccddb7	Migrate the CI runners to the new clusters (#33849 ) * try fixing push-ci * move to new runners * move benchmark.yml to new runners * move doctest_job.yml to new runners * move doctests.yml to new runners * move push-important-models.yml to new runners * move self-pr-slow-ci.yml to new runners * fix typo Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix working directory Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix working directory Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> * improve code Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>	2024-10-03 14:39:49 +02:00
Joao Gante	6f0ce52760	VLM Generate: tag `test_static_cache_matches_dynamic` as flaky (#33630 ) flaky	2024-10-03 12:27:02 +01:00
Nonthachai Plodthong	f1a5f81296	Update an keyerror on _save_check_point prevent confusion of missing … (#33832 ) * Update an keyerror on _save_check_point prevent confusion of missing metric keys * Update grammar error and case sensitive. Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * adding update KeyError on _evaluate function to align with _save_checkpoint function --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-03 10:27:49 +02:00
HofitBata	dc8156fdd8	Fix dt proj bias reassigned (#33314 ) * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object. * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object. * When we set self.dt_proj.bias = None, it removes the bias parameter from the model. When we later tried to assign a tensor to self.dt_proj.bias, it caused a TypeError because PyTorch expects a Parameter object.	2024-10-03 09:51:03 +02:00
Yoni Gozlan	d7950bff82	uniformize processor Mllama (#33876 ) * uniformize processor Mllama * nit syntax * nit	2024-10-02 16:50:15 +02:00
Yoni Gozlan	62e8c759c3	rename all test_processing_.py to test_processor_.py (#33878 ) * rename all test_processing_.py to test_processor_.py ans fix duplicate test processor paligemma * fix copies * fix broken tests * fix-copies * fix test processor bridgetower	2024-10-02 16:43:43 +02:00
Pavel Iakubovskii	2f25ab95db	Handle Trainer `tokenizer` kwarg deprecation with decorator (#33887 ) * Handle deprecation with decorator * Fix for seq2seq Trainer	2024-10-02 15:28:20 +01:00
Yoni Gozlan	ee71c9853a	Optim deformable detr (#33600 ) * optimize deformable detr * fix copies * remove deformable_detr_basline * fix hardcoded float16 and .float() * [run slow] deformable-detr,grounding-dino,mask2former,oneformer,rt-detr * [run slow] deformable_detr,grounding_dino,mask2former,oneformer,rt_detr	2024-10-02 15:46:27 +02:00
Marc Sun	cac4a4876b	[Quantization] Switch to optimum-quanto (#31732 ) * switch to optimum-quanto rebase squach * fix import check * again * test try-except * style	2024-10-02 15:14:34 +02:00
amyeroberts	b7474f211d	Trainer - deprecate tokenizer for processing_class (#32385 ) * Trainer - deprecate tokenizer for processing_class * Extend chage across Seq2Seq trainer and docs * Add tests * Update to FutureWarning and add deprecation version	2024-10-02 14:08:46 +01:00
Omar Salman	e7c8af7f33	Add sdpa for DistilBert (#33724 ) * Add sdpa for DistilBert * [run_slow] distilbert * [run_slow] distilbert * [run_slow] distilbert * Try without slow tests * [run_slow] distilbert * [run_slow] distilbert	2024-10-02 13:55:19 +01:00
Kyle Sayers	614c79a9b0	Fix kwargs passed by AutoQuantizationConfig.from_pretrained (#33798 ) fix kwargs Co-authored-by: kylesayrs <kyle@neuralmagic.com>	2024-10-02 14:12:03 +02:00
Kyle Sayers	b09234cfc1	Allow for nightly packages of `compressed_tensors` (#33828 ) * only check spec * correct typo in nightly package name	2024-10-02 14:11:44 +02:00
g-prz	fe484726aa	Add falcon gguf (#33437 ) * feat(gguf): add falcon q2 k * fix(gguf): remove useless renaming * feat(gguf): seperate falcon 7b and 40b * feat(gguf): apply fixup * fix(test): error rebase * feat(gguf): add fp16 weight comparison for falcon * feat(gguf): test weight of all layers * test(gguf): add falcon 40b under skip decorator * feat(gguf): quick example for extracting model size	2024-10-02 14:10:39 +02:00
George	181c962aab	populate quantization_config for kv-cache-scheme only configs (#33874 )	2024-10-02 14:06:40 +02:00
Yih-Dar	e5d14f39ad	Don't run reminder bot for now (#33883 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-02 11:51:01 +02:00
Pablo Montalvo	50290cf7a0	Uniformize model processors (#31368 ) * add initial design for uniform processors + align model * add uniform processors for altclip + chinese_clip * add uniform processors for blip + blip2 * fix mutable default 👀 * add configuration test * handle structured kwargs w defaults + add test * protect torch-specific test * fix style * fix * rebase * update processor to generic kwargs + test * fix style * add sensible kwargs merge * update test * fix assertEqual * move kwargs merging to processing common * rework kwargs for type hinting * just get Unpack from extensions * run-slow[align] * handle kwargs passed as nested dict * add from_pretrained test for nested kwargs handling * [run-slow]align * update documentation + imports * update audio inputs * protect audio types, silly * try removing imports * make things simpler * simplerer * move out kwargs test to common mixin * [run-slow]align * skip tests for old processors * [run-slow]align, clip * !$#@!! protect imports, darn it * [run-slow]align, clip * [run-slow]align, clip * update common processor testing * add altclip * add chinese_clip * add pad_size * [run-slow]align, clip, chinese_clip, altclip * remove duplicated tests * fix * add blip, blip2, bridgetower Added tests for bridgetower which override common. Also modified common tests to force center cropping if existing * fix * update doc * improve documentation for default values * add model_max_length testing This parameter depends on tokenizers received. * Raise if kwargs are specified in two places * fix * removed copied from * match defaults * force padding * fix tokenizer test * clean defaults * move tests to common * add missing import * fix * adapt bridgetower tests to shortest edge * uniformize donut processor + tests * add wav2vec2 * extend common testing to audio processors * add testing + bert version * propagate common kwargs to different modalities * BC order of arguments * check py version * revert kwargs merging * add draft overlap test * update * fix blip2 and wav2vec due to updates * fix copies * ensure overlapping kwargs do not disappear * replace .pop by .get to handle duplicated kwargs * fix copies * fix missing import * add clearly wav2vec2_bert to uniformized models * fix copies * increase number of features * fix style * [run-slow] blip, blip2, bridgetower, donut, wav2vec2, wav2vec2_bert * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * fix concatenation * [run-slow] blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert * Update tests/test_processing_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * 🧹 * address comments * clean up + tests * [run-slow] instructblip, blip, blip_2, bridgetower, donut, wav2vec2, wav2vec2_bert --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-10-02 10:41:08 +02:00
TrickEye	2292be6c1b	Fix: typo (#33880 ) Update llm_tutorial.md: typo	2024-10-02 09:12:21 +01:00
Yoni Gozlan	61ac161a9d	Add support for custom inputs and batched inputs in ProcessorTesterMixin (#33711 ) * add support for custom inputs and batched inputs in ProcessorTesterMixin * Fix batch_size behavior ProcessorTesterMixin * Change format prepare inputs batched * Remove override test pixtral processor * Remove unnecessary tests and cleanup after new prepare_inputs functions * Fix instructBlipVideo image processor	2024-10-01 23:52:03 +02:00
amyeroberts	1baa08897d	Repo consistency fix after #33339 (#33873 ) * Repo consistency fix after #33339 * [run-slow] omdet_turbo	2024-10-01 21:03:15 +01:00
Prakarsh Kaushik	68a2b50069	[Fix] ViViT interpolate_pos_encoding (#33815 ) * fix:test_inference_interpolate_pos_encoding * style:make style;make fixup * test: add suggestion to test_modeling_vivit * chore:add suggestions * style:make style * [run_slow] vivit * ci:slow test fix * [run_slow] vivit	2024-10-01 20:14:35 +01:00
g-prz	8635802af9	Move weight initilization deformabledetr (#33339 ) * fix(copy): fixup copy * fix(deformable_detr): move weight initialization to the right place * fix(grounding_dino): move weight initialization to the right place * fix(rt_detr): move weight initialization to the right place * [run-slow] deformable_detr, grounding_dino, rt_detr	2024-10-01 20:08:57 +01:00
Matt	a43e84cb3b	Make ASR pipeline compliant with Hub spec + add tests (#33769 ) * Remove max_new_tokens arg * Add ASR pipeline to testing * make fixup * Factor the output test out into a util * Full error reporting * Full error reporting * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Small comment --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-10-01 18:15:04 +01:00
Nicola De Angeli	0256520794	fix: repair depth estimation multiprocessing (#33759 ) * fix: repair depth estimation multiprocessing * test: add test for multiprocess depth estimation	2024-10-01 17:59:59 +01:00
Yih-Dar	f205da9660	Avoid using context that is not accessable from external contributors (#33866 ) * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 17:42:45 +02:00
Manal ML	0c4c2d7e07	Add include_loss_for_metrics (#33088 ) * Add include_loss_for_metrics * Fix styling * Initialize inputs and losses to avoid AttributeError * Ruff styling * Refactor compute_metrics and update EvalPrediction * Change Naming * Added include_for_metrics to group both args * Fix style * Change warnings to logger Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:51:41 +02:00
jackyjinjing	5f9f58fc59	Validate the eval dataset in advance. (#33743 ) * Validate the eval dataset in advance. * format * format * format * Update src/transformers/trainer.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * format --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2024-10-01 16:45:06 +02:00
Kyle Sayers	f8110a6ddf	Raise `accelerate` dependency error in case of defaulting `low_cpu_mem_usage=True` (#33830 ) Clarify warning, add import check	2024-10-01 16:44:38 +02:00
aroun-coumar	326b2bad1c	This PR contains additional changes for #33143 (#33581 ) * fix: Fix optimizer bug in ModelCard * fix: fix W293 * Fixes in modelcard.py for issue #33143 --------- Co-authored-by: moontidef <53668275+relic-yuexi@users.noreply.github.com>	2024-10-01 16:42:30 +02:00
Raushan Turganbay	b1c914e463	Fix device mismatch errors (#33851 ) fix device mismatch errors	2024-10-01 15:55:57 +02:00
Matt	ac28a23b3d	Workaround for bark issue in pipelines (#33824 ) * Quick workaround for bark + generation_config issue * make fixup * [run slow] bark	2024-10-01 14:40:12 +01:00
Francesco Ortu	acdfdd9387	add attention weight up-cast to float32 in chameleon (#33822 ) add attention weight float32 cast in chameleon	2024-10-01 15:19:16 +02:00
Fabian David Schmidt	351873a145	fix: skip dropout in eval for flash_attn in various models (#33844 ) * fix(m2m_100): skip dropout in eval for flash_attn * fix(misc): skip dropout in eval for flash attn various models * chore(m2m_100): copy flash attn from bart * chore: run make fix-copies * [run-slow] bart, m2m_100	2024-10-01 14:39:21 +02:00
Kenza Bouzid	88d960937c	Refactor image features selection in LlaVa (#33696 ) * refactor image features selection * break line * remove whitespace * add pr comments: include projection and rename function * make fix-copies * fix get_image_feature in vip llava	2024-10-01 14:37:31 +02:00
Joao Gante	22266be970	Generate: move llama `prepare_inputs_for_generation` to `GenerationMixin` (#33677 )	2024-10-01 12:32:54 +01:00
Yih-Dar	d19ab15421	post reminder comment only once (#33848 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-10-01 12:52:53 +02:00
Wing Lian	fbde09c8c9	fix check for hidden size in text model for deepspeed zero3 auto entries (#33829 ) * fix check for hidden size in text model for deepspeed zero3 auto entries * fix typo	2024-10-01 12:28:26 +02:00
Guang Yang	808997a634	Fix passing str dtype to static cache (#33741 ) Co-authored-by: Guang Yang <guangyang@fb.com>	2024-10-01 09:50:17 +02:00
Adibvafa Fallahpour	c269c5c74d	Fix Mamba slow path bug with dtype mismatch. (#32691 ) * Fix Mamba slow path bug with dtype mismatch. * Update test_modeling_mamba.py * Improve style. * Fix issue with cache position of dtype mismatch test. * Change test for slow path. * Revert changes. * Switch to buggy code and add test to catch it. * Fix the dtype mismatch bug and add test code to verify it. * Fix minor bug with test. * Fix incorrect dtype of model output. * Fix incorrect dtype of cache. * Fix incorrect dtype of ssm cache. * Fix incorrect dtype of conv state. * Remove assertion for ssm state. * Add assertion for conv state dtype. * Fix all issues with dtype mismatch test.	2024-10-01 09:28:40 +02:00
dependabot[bot]	570c89625b	Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/lxmert (#33821 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.13.1...v2.2.0) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-09-30 21:57:57 +02:00
Aryan	90dca5a71b	minor typo fix (#33784 ) fix typo	2024-09-30 21:42:22 +02:00
pogpog	b77846a6e6	Fix link in gguf.md (#33768 ) Change hyphen to underscore for URL in link to convert_hf_to_gguf.py	2024-09-30 20:17:33 +02:00
aroun-coumar	baa765f813	Fixes for issue #33763 in idefics2 model (#33766 )	2024-09-30 18:08:48 +01:00
Joshua Lochner	18c5b216f1	Fix ViT-MAE decoder interpolate (#33330 ) * Fix ViT-MAE decoder interpolate * Add unit test for `interpolate_pos_encoding` w/ custom sizes * [run_slow] vit_mae	2024-09-30 18:47:13 +02:00

17006 Commits