HuggingFace_transformer

Author	SHA1	Message	Date
Samin Yasar	af45ec0a16	add type hint in pipeline model argument (#23740 ) * add type hint in pipeline model argument * add pretrainedmodel and tfpretainedmodel type hint * make type hints string	2023-05-30 11:05:58 +01:00
zspo	003a0cf8cc	Fix some docs what layerdrop does (#23691 ) * Fix some docs what layerdrop does * Update src/transformers/models/data2vec/configuration_data2vec_audio.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix more docs --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-23 14:50:40 -04:00
Matt	876d9a32c6	TF version compatibility fixes (#23663 ) * New TF version compatibility fixes * Remove dummy print statement, move expand_1d * Make a proper framework inference function * Make a proper framework inference function * ValueError -> TypeError	2023-05-23 16:42:11 +01:00
NielsRogge	2f424d7979	[image-to-text pipeline] Add conditional text support + GIT (#23362 ) * First draft * Remove print statements * Add conditional generation * Add more tests * Remove scripts * Remove BLIP specific linkes * Add support for pix2struct * Add fast test * Address comment * Fix style	2023-05-22 21:45:50 +02:00
Joao Gante	b369e507aa	Generate: text generation pipeline no longer emits `max_length` warning when it is not set (#23139 )	2023-05-04 18:36:23 +01:00
Joao Gante	3a08dc63fd	Generate: better warnings with pipelines (#23128 )	2023-05-03 14:43:17 +01:00
Stephen Kaplan	9062d1bab2	Fix grammar error in summarization pipeline (#23080 ) Fix minor grammar issue	2023-05-01 08:54:57 -04:00
Younes Belkada	a0ae2310ec	[`DocTest`] Fix correct checkpoint (#22988 ) fix pipeline issue	2023-04-25 15:20:36 +02:00
Connor Boyle	b5f06d6c59	Raise error if `stride` is too high in `TokenClassificationPipeline` (#22942 ) * Raise error if `stride` is too high * Clarify use of `stride`	2023-04-24 09:27:49 -04:00
Arthur	f143037789	Add `automatic-mask-generation` pipeline for Segment Anything Model (SAM) (#22840 ) * cleanup * updates * more refactoring * make style * update inits * support other inputs in base * update based on review Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> * Update tests/pipelines/test_pipelines_automatic_mask_generation.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * update * fixup * TODO x and y to refactor, _h _w refactored here * update docstring * more nits * style on these * more doc fix * rename variables * update * updates * style * update * fix `_mask_to_rle_pytorch` * styling * fix ask to rle, wrong outputs * add device arg * update * more updates, fix tets * udpate * update docstrings * styling * fixup * add notebook on the docs * update orginal sizes * fix docstring * updat condition on point_per-batch * updates tests * fix CI test * extend is required, append does not work! * fixup * fix CI tests * whit pixels left * address doc comments * fix doc * slow pipeline tests * update auto init * add revision * make fixup * update p!ipoeline tag when calling tests * alphabeitcal order in inits * fix copies * last style nits * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * reformat docstring * more reformat * address most of the comments * Update src/transformers/pipelines/mask_generation.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * final refactor * Update src/transformers/models/sam/image_processing_sam.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixup and fix slow tests * revert --------- Co-authored-by: Nicolas Patry <patry.nicolas@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-04-20 19:27:24 +02:00
Oscar	a438a0941c	fix: Correct small typo in docstring (#22857 ) * fix: Correct small typo in docstring * fix: Run make fixup	2023-04-20 11:58:52 +01:00
Sylvain Gugger	5f9b825c89	Use code on the Hub from another repo (#22814 ) * initial work * Add other classes * Refactor code * Move warning and fix dynamic pipeline * Issue warning when necessary * Add test * Do not skip auto tests * Fix failing tests * Refactor and address review comments * Address review comments	2023-04-18 13:46:11 -04:00
Sylvain Gugger	50caa20628	Revert "Use code on the Hub from another repo" (#22813 ) Revert "Use code on the Hub from another repo (#22698)" This reverts commit `ea7b0a539a`.	2023-04-17 14:22:13 -04:00
Sylvain Gugger	ea7b0a539a	Use code on the Hub from another repo (#22698 ) * initial work * Add other classes * Refactor code * Move warning and fix dynamic pipeline * Issue warning when necessary * Add test	2023-04-17 11:36:29 -04:00
Yih-Dar	32b08742a5	`DocumentQuestionAnsweringPipeline` only for fast ⚡ tokenizers (#22745 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-04-13 17:22:59 +02:00
Luc CAILLIAU	06b05d4575	Clarify stride option (#22684 ) * Clarify stride option * formatting	2023-04-11 14:06:54 +01:00
Arthur	117a0f6afa	Small nit, (#22653 ) * Small nit, Fixes #21986 * Update src/transformers/pipelines/__init__.py	2023-04-07 17:29:23 +02:00
Roy Hvaara	11426641dc	Guard imports of PreTrainedTokenizerFast on is_tokenizers_available (#22285 ) Guard imports that use the tokenizers library	2023-03-30 09:16:03 -04:00
Samuel Larkin	59b9351b78	Minor typo in pipeline FillMaskPipeline's documentation. (#22339 )	2023-03-23 11:14:11 -04:00
Sylvain Gugger	506e7c6361	Fix various imports (#22281 ) * Fix various imports * Fix copies * Fix import	2023-03-23 10:34:17 -04:00
Sylvain	ef28df0572	Fix quality due to ruff release	2023-03-22 20:45:08 -04:00
Luc CAILLIAU	d62e7d8842	Chunkable token classification pipeline (#21771 ) * Chunkable classification pipeline The TokenClassificationPipeline is now able to process sequences longer than 512. No matter the framework, the model, the tokenizer. We just have to pass process_all=True and a stride number (optional). The behavior remains the same if you don't pass these optional parameters. For overlapping parts when using stride above 0, we consider only the max scores for each overlapped token in all chunks where the token is. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * update with latest black format * update black format * Update token_classification.py * Update token_classification.py * format correction * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update comments * Update src/transformers/pipelines/token_classification.py Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * Update token_classification.py Correct spaces, remove process_all and keep only stride. If stride is provided, the pipeline is applied to the whole text. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update chunk aggregation Update the chunk aggregation strategy based on entities aggregation. * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py Remove unnecessary pop from outputs dict * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add chunking tests * correct formating * correct formatting * correct model id for test chunking * update scores with nested simplify * Update test_pipelines_token_classification.py * Update test_pipelines_token_classification.py * update model to a tiny one * Update test_pipelines_token_classification.py * Adding smaller test for chunking. * Fixup * Update token_classification.py * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/pipelines/token_classification.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-03-22 14:13:20 -04:00
Sylvain Gugger	42ad693b7b	Regression pipeline device (#22190 ) * Fix regression in pipeline when device=-1 is passed * Add regression test	2023-03-15 14:13:38 -04:00
Alara Dirik	32e3466d38	Add AutoModelForZeroShotImageClassification (#22087 ) Adds AutoModelForZeroShotImageClassification to transformers	2023-03-13 12:46:14 +03:00
Ceyda Cinarel	3ec8171bed	Bug fix: token classification pipeline while passing offset_mapping (#22034 ) fix slow tokenizers with passing offset_mapping	2023-03-08 16:21:46 -05:00
anruijian	b427b263e2	Add tokenize_kwargs parameter definition in the FeatureExtractionPipeline (#22031 ) add tokenize_kwargs doc in the FeatureExtractionPipeline	2023-03-08 11:43:31 -05:00
Arthur	bc33fbf956	[CI] Fix ci (#21940 ) * fix `get_proposal_pos_embed` * fix order * style * zero shot simplify test * add approximate values for zero shot audio classification	2023-03-06 15:22:27 +01:00
Arthur	718e9d777f	[CLAP] Support batched inputs for CLAP. Fixes pipeline issues (#21931 ) * fix pipeline * fix feature_extraction clap * you can now batch the `is_longer` attribute * add tests * fixup * add expected scores * comment on is_longert	2023-03-03 18:42:18 +01:00
Arthur	dcec3277cd	faster forward following what is done for images (#21906 ) * faster forward following what is done for images * add missing licence	2023-03-03 06:18:18 +01:00
Nicolas Patry	b2a41d2be4	Faster zero shot image (#21897 ) * Make ZeroShotImageClassificationPipeline faster The pipeline makes separate calls to model for each candidate label. This commit combines all labels into one call. Original code takes more that 60 seconds to process one image and 1000 candidate labels. Updated code takes less than 2 seconds. * implement batching * code formatting * Creating an even faster zero-shot-image-classifiction. Unfortunately super tailored towards CLIP. Co-Authored-By: Yessen Kanapin <yessen@deepinfra.com> * Quality. * Cleanup. * Order different on the CI it seems. * Cleanup. * Quality. --------- Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-03-02 19:46:22 +01:00
Nicolas Patry	1325459105	Refactor whisper asr pipeline to include language too. (#21427 ) * [WIP] whisper refacto to support language output. * Handling merges. * A bit more cleanup and comments. * Many improvements. Lots of details everywhere. * Cleanup old code and tests. * Handle lone timestamp tokens (just recover when something bad happens). * Adding return_language example. * No ffmpeg. * Hmm. * Some corrections. * Both fast and slow. * New black. * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/whisper/tokenization_whisper.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove print. * Undoing tests modifications. * Smaller test modifications. * Rename. * Remove maxDiff. --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-03-02 18:12:19 +01:00
Arthur	c256bc6d10	[ZAC] fix ci daily (#21893 ) add correct revision after model was overwritten	2023-03-02 10:46:03 +01:00
Arthur	633e5e89f7	[Refactor] Relative imports wherever we can (#21880 ) * initial commit * update * second batch * style * fix imports * fix relative import on pipeline	2023-03-02 09:45:42 +01:00
Arthur	43299c63ca	fix checkpoint (#21874 )	2023-03-02 08:47:20 +01:00
Yih-Dar	89359e4c63	Fix `test_load_default_pipelines_pt` for `ClapModel` (#21886 ) * fix tests --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-03-01 21:52:26 +01:00
Joao Gante	92dfceb124	Inheritance-based framework detection (#21784 )	2023-02-27 15:31:55 +00:00
Arthur	cc44e72d14	[Pipeline] Add zero shot audio classificatoin pipeline (#21600 ) * add pipeline * update init * add zero shot to init * update inits and correct checkpoints * update base to support input features * add tests * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/pipelines/zero_shot_audio_classification.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * update pieline code * use tiny checkpoint * nits and expected value with tiny model * style * last nit on tests values * fix styling * fix collate fn that was casting t float * update --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-02-27 11:43:44 +01:00
Connor Henderson	279008adc3	fix: Change is_last chunk calc and add conditional break in chunk_iter (#21612 ) * fix: Change is_last chunk calc and add conditional break * format fix * account for 0 and full stride_rights, add comment * add new test * make style * update slow whisper asr test timestamps * use nested_simplify on output and round timestamp to hundreths place	2023-02-24 08:30:32 +01:00
Aaron Gokaslan	5e8c8eb5ba	Apply ruff flake8-comprehensions (#21694 )	2023-02-22 09:14:54 +01:00
Younes Belkada	f83942684d	[`pipeline`] A simple fix for half-precision & 8bit models (#21479 ) * v1 fix * adapt from suggestions * make style * fix tests * add gpu tests * update docs * fix other tests * Apply suggestions from code review Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> * better fix * make fixup * better example * revert changes * proposal * more elegant solution * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-02-10 10:26:17 +01:00
Nicolas Patry	06d940efc3	Fixing backward compatiblity `image_processor` in pipeline. (#21513 )	2023-02-08 19:36:20 +01:00
Sylvain Gugger	67d074874d	Cleanup quality (#21493 ) * Remove mentions of flake8/isort * Clean up inits * Deall with all other inits * Last special rule for dummy files	2023-02-07 12:27:31 -05:00
Sylvain Gugger	6f79d26442	Update quality tooling for formatting (#21480 ) * Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies	2023-02-06 18:10:56 -05:00
Yih-Dar	a6d8a149a8	Fix some pipeline tests (#21401 ) * fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-02-02 19:03:31 +01:00
Wang, Yi	f3a7befffa	fix the issue that the output dict of jit model could not get [0] (#21354 )	2023-01-30 09:23:55 -05:00
Arthur	0dff407d71	[Whisper] another patch (#21324 ) * another patch * fix timestamp test modeling * let it be negative when the token is None	2023-01-27 16:35:16 +01:00
Nicolas Patry	fd0ef8b66d	Small QoL for qa. (#21316 )	2023-01-26 14:50:09 +01:00
Nicolas Patry	8788fd0ceb	Moving to cleaner tokenizer version or `oneformer`. (#21292 ) Moving to cleaner tokenizer version.	2023-01-25 15:46:10 +01:00
Arthur	255257f3ea	[Whisper] Refactor whisper (#21252 ) * update whisper logit processor * add generate for whisper * remove part of the whisper specific code from pipeline * update logit processes * major update * enforce first timestamp * update generate * add more tests * update new decoding strategy * Apply suggestions from code review * update docstring * fixup * default config will not have multilingual ar * update expected tokenizer size, see pull on the hub for whisper-tiny	2023-01-25 13:09:43 +01:00
Nicolas Patry	99e7905422	Supporting `ImageProcessor` in place of `FeatureExtractor` for pipelines (#20851 ) * Fixing the pipeline with image processor. * Update the slow test. * Using only the first image processor. * Include exclusion mecanism for Image processor. * Do not handle Gitconfig, deemed as a bug. * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove `conversational` changes. They are not supposed to be here. * Address first row of comments. * Remove OneFormer modifications. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-01-25 10:16:31 +01:00

324 Commits