HuggingFace_transformer

Author	SHA1	Message	Date
Yoach Lacombe	704bf595eb	Update Bark generation configs and tests (#25409 ) * update bark generation configs for more coherent parameter * make style * update bark hub repo	2023-08-09 18:28:02 +02:00
Yih-Dar	5b517e1764	Use small config for `OneFormerModelTest.test_model_with_labels` (#25383 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-08 17:15:34 +02:00
Sanchit Gandhi	dedd11160d	[ASR Pipeline] Clarify return timestamps (#25344 ) * [ASR Pipeline] Clarify return timestamps * fix indentation * fix ctc check * fix ctc error message! * fix test * fix other test * add new tests * final comment	2023-08-08 10:16:00 +01:00
Yih-Dar	6ea3ee3cd2	Fix `test_model_parallelism` (#25359 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-08 10:48:45 +02:00
Matthew Hoffman	d4bd33cc9f	Register ModelOutput subclasses as supported torch.utils._pytree nodes (#25358 ) * Register ModelOutput subclasses as supported torch.utils._pytree nodes Fixes #25357 where DDP with static_graph=True does not sync gradients when calling backward() over tensors contained in ModelOutput subclasses * Add test for torch pytree ModelOutput serialization and deserialization	2023-08-08 08:12:11 +02:00
Pedro Lira	080a97119c	Add mask2former fp16 support (#25093 ) * Add mask2former fp16 support * Clear consistency/quality issues * Fix consistency/quality (2) * Add integration test for mask2former (fp16 case) * Fix code quality * Add integration test for maskformer (fp16 case) * Add integration test for oneformer (fp16 case) * Remove slow decorator from fp16 tests * Fix lint * Remove usage of full inference and value checks for fp16 * Temporarily comment slow for {mask, mask2, one}former * Add fp16 support to oneformer * Revert "Temporarily comment slow for {mask, mask2, one}former" This reverts commit e5371edabd301cf56079def0421a0a87df307cb0. * Remove dtype conversion noop	2023-08-07 20:07:29 +01:00
Sylvain Gugger	baf1daa58e	Migrate Trainer from `Repository` to `upload_folder` (#25095 ) * First draft * Deal with progress bars * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Address review comments * Forgot one * Pin hf_hub * Add argument for push all and fix tests * Fix tests * Address review comments --------- Co-authored-by: Lucain <lucainp@gmail.com>	2023-08-07 17:47:22 +02:00
Yih-Dar	c177606fb4	Fix more offload edge cases (#25342 ) * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-07 17:45:41 +02:00
Guillaume "Vermeille" Sanchez	d533465150	add CFG for .generate() (#24654 )	2023-08-06 20:15:24 +01:00
Yih-Dar	ce6d153a53	Make `bark` could have tiny model (#25290 ) * temp * update * update * update * small dim * small dim * small dim * fix * update * fix * fix * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-04 15:13:14 +02:00
Sylvain Gugger	f0fd73a2de	Document check copies (#25291 ) * Document check copies better and add tests * Include header in check for copies * Manual fixes * Try autofix * Fixes * Clean tests * Finalize doc * Remove debug print * More fixes	2023-08-04 14:56:29 +02:00
Sylvain Gugger	29f04002e6	Deal with nested configs better in base class (#25237 ) * Deal better with nested configs * Fixes * More fixes * Fix last test * Clean up existing configs * Remove hack in MPT Config * Update src/transformers/configuration_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Fix setting a nested config via dict in the kwargs * Adapt common test * Add test for nested config load with dict --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-08-04 14:56:09 +02:00
Sylvain Gugger	fab1a0aa82	Give more memory in test_disk_offload (#25315 )	2023-08-04 14:10:31 +02:00
Roland Szabo	d114a6b71f	Add timeout parameter to load_image function (#25184 ) * Add timeout parameter to load_image function. * Remove line. * Reformat code Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add parameter to docs. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-08-03 15:51:54 +01:00
Yoach Lacombe	6d3f9c1e2e	add generate method to SpeechT5ForTextToSpeech (#25233 ) * add generate method to SpeechT5ForTextToSpeech * update speecht5forTTS docstrings * Remove defaults to None in generate docstrings Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-08-03 14:12:07 +01:00
amyeroberts	30409af6e1	Update InstructBLIP & Align values after rescale update (#25209 ) * Update InstructBLIP values Note: the tests are not independent. Running the test independentely produces different logits compared to running all the integration tests * Update test values after rescale update * Remove left over commented out code * Revert to previous rescaling logic * Update rescale tests	2023-08-03 11:01:10 +01:00
Yih-Dar	bd90cda9a6	CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266 ) * CI with layers=2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-02 20:22:36 +02:00
Patrick von Platen	b28ebb2655	[MMS] Fix mms (#25267 ) * [MMS] Fix mms * [MMS] Fix mms * fix mms loading * Apply suggestions from code review * make style * Update tests/models/wav2vec2/test_modeling_wav2vec2.py	2023-08-02 18:11:15 +02:00
Yupeng Jia	8021c684ec	Fix some bugs for two stage training of deformable detr (#25045 ) * Update modeling_deformable_detr.py Fix bugs for two stage training * Update modeling_deformable_detr.py * Add test_two_stage_training to DeformableDetrModelTest --------- Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai>	2023-08-02 11:30:36 +01:00
amyeroberts	1b35409768	Update rescale tests - cast to float after rescaling to reflect #25229 (#25259 ) Rescale tests - cast to float after rescaling to reflect #25229	2023-08-02 11:29:55 +01:00
YQ	2230d149f0	fix get_keys_to_not_convert() to return correct modules for full precision inference (#25105 ) * add test for `get_keys_to_not_convert` * add minimum patch to keep mpt lm_head from 8bit quantization * add reivsion to	2023-08-02 04:21:52 -04:00
Younes Belkada	05ebb0264e	[`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201 ) * add `require_bitsandbytes` on MPT integration tests * add it on mpt as well	2023-08-01 12:20:34 +02:00
Yih-Dar	1b4f6199c6	Update tiny model info. and pipeline testing (#25213 ) * update tiny_model_summary.json * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 19:35:33 +02:00
Yih-Dar	9ca3aa0156	Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-31 17:32:05 +02:00
amyeroberts	05cda5df34	🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174 ) * Fix rescaling bug * Add tests * Update integration tests * Fix up * Update src/transformers/image_transforms.py * Update test - new possible order in list	2023-07-28 19:52:51 +01:00
Sanchit Gandhi	03f98f9683	[MusicGen] Fix integration tests (#25169 ) * move to device * update with cuda values * fix fp16 * more rigorous	2023-07-28 18:50:15 +01:00
Younes Belkada	dd9d45b6ec	[`InstructBlip`] Fix instructblip slow test (#25171 ) * fix instruct blip slow test * Update tests/models/instructblip/test_modeling_instructblip.py	2023-07-28 17:00:10 +02:00
Younes Belkada	add0895dd9	[`Mpt`] Fix mpt slow test (#25170 ) fix mpt slow test	2023-07-28 16:45:09 +02:00
Lucain	c1dba1111b	Add test when downloading from gated repo (#25039 )	2023-07-28 08:14:27 -04:00
Sanchit Gandhi	e93103632b	Add bloom flax (#25094 ) * First commit * step 1 working * add alibi * placeholder for `scan` * add matrix mult alibi * beta scaling factor for bmm * working v1 - simple forward pass * move layer_number from attribute to arg in call * partial functioning scan * hacky working scan * add more modifs * add test * update scan for new kwarg order * fix position_ids problem * fix bug in attention layer * small fix - do the alibi broadcasting only once * prelim refactor * finish refactor * alibi shifting * incorporate dropout_add to attention module * make style * make padding work again * update * remove bogus file * up * get generation to work * clean code a bit * added small tests * adding albii test * make CI tests pass: - change init weight - add correct tuple for output attention - add scan test - make CI tests work * fix few nits * fix nit onnx * fix onnx nit * add missing dtype args to nn.Modules * remove debugging statements * fix scan generate * Update modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * Update test_modeling_flax_bloom.py * fix small test issue + make style * clean up * Update tests/models/bloom/test_modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fix function name * small fix test * forward contrib credits from PR17761 * Fix failing test * fix small typo documentation * fix non passing test - remove device from build alibi * refactor call - refactor `FlaxBloomBlockCollection` module * make style * upcast to fp32 * cleaner way to upcast * remove unused args * remove layer number * fix scan test * make style * fix i4 casting * fix slow test * Update src/transformers/models/bloom/modeling_flax_bloom.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove `layer_past` * refactor a bit * fix `scan` slow test * remove useless import * major changes - remove unused code - refactor a bit - revert import `torch` * major refactoring - change build alibi * remove scan * fix tests * make style * clean-up alibi * add integration tests * up * fix batch norm conversion * style * style * update pt-fx cross tests * update copyright * Update src/transformers/modeling_flax_pytorch_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * per-weight check * style * line formats --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-27 18:24:56 +01:00
Yoach Lacombe	0b92ae3489	Add offload support to Bark (#25037 ) * initial Bark offload proposal * use hooks instead of manually offloading * add test of bark offload to cpu feature * Apply nit suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docstrings of offload Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * remove unecessary set_seed in Bark tests --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-07-27 15:35:17 +01:00
Arthur	9cea3e7b80	[`MptConfig`] support from pretrained args (#25116 ) * support from pretrained args * draft addition of tests * update test * use parrent assert true * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-07-27 16:24:52 +02:00
amyeroberts	659829b6ae	MaskFormer - enable return_dict in order to compile (#25052 ) * Enable return_dict in order to compile * Update tests	2023-07-26 16:23:30 +01:00
Yih-Dar	224da5df69	update `use_auth_token` -> `token` (#25083 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 15:09:59 +02:00
Yih-Dar	31acba5697	Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 14:57:44 +02:00
Sebastian Husch Lee	8f36ab3e22	[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification (#24726 ) * Initial addition of t5forsequenceclassification * Adding imports and adding tests * Formatting * Running make fix-copies * Adding mt5forseq * Formatting * run make fix-copies * Adding to docs * Add model_parallel * Fix bug * Fix * Remove TODO * Fixing tests for T5ForSequenceClassification * Undo changes to dependency_versions_table.py * Change classification head to work with T5Config directly * Change seq length to let tests pass * PR comments for formatting * Formatting * Initial addition of UMT5ForSequenceClassification * Adding to inits and formatting * run make fix-copies * Add doc for UMT5ForSeqClass * Update UMT5 config * Fix docs * Skip torch fx test for SequenceClassification * Formatting * Add skip to UMT5 tests as well * Fix umt5 tests * Running make fix-copies * PR comments * Fix for change to sentence_representation * Rename seq_len to hidden_size since that's what it is * Use base_model to follow format of the rest of the library * Update docs * Extract the decoder_input_ids changes and make one liner * Make one-liner	2023-07-25 21:02:49 +02:00
Arthur	f9cc333805	[ `PreTrainedTokenizerFast`] Keep properties from fast tokenizer (#25053 ) * draft solution * use `setdefault` * nits * add tests and fix truncation issue * fix test * test passes locally * quality * updates * update tsets	2023-07-25 18:45:01 +02:00
Connor Henderson	0779fc8eb8	Edit err message and comment in `test_model_is_small` (#25087 ) * Edit err message and comment in * put back 80M comment	2023-07-25 12:24:36 -04:00
Arthur	dcb183f4bd	[`MPT`] Add MosaicML's `MPT` model to transformers (#24629 ) * draft add new model like * some cleaning of the config * nits * add nested configs * nits * update * update * added layer norms + triton kernels * consider only LPLayerNorm for now. * update * all keys match. * Update * fixing nits here and there * working forward pass. * removed einops dependency * nits * format * add alibi * byebye head mask * refactor attention * nits. * format * fix nits. * nuke ande updates * nuke tokenizer test * don't reshape query with kv heads * added a bit of documentation. * remove unneeded things * nuke more stuff * nit * logits match - same generations * rm unneeded methods * 1 remaining failing CI test * nit * fix nits * fix docs * fix docs * rm tokenizer * fixup * fixup * fixup and fix tests * fixed configuration object. * use correct activation * few minor fixes * clarify docs a bit * logits match à 1e-12 * skip and unskip a test * added some slow tests. * fix readme * add more details * Update docs/source/en/model_doc/mpt.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix configuration issues * more fixes in config * added more models * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove unneeded position ids * fix some comments * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * revert suggestion * mpt alibi + added batched generation * Update src/transformers/models/mpt/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove init config * Update src/transformers/models/mpt/configuration_mpt.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix nit * add another slow test * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fits in one line * some refactor because make fixup doesn't pass * add ft notebook * update md * correct doc path --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-25 14:32:40 +02:00
Xuehai Pan	6bc61aa7af	Set `TF32` flag for PyTorch cuDNN backend (#25075 )	2023-07-25 08:04:48 -04:00
Sylvain Gugger	f295fc8a16	Fix last models for common tests that are too big. (#25058 ) * Fix last models for common tests that are too big. * Remove print statement	2023-07-25 07:56:04 -04:00
Rinat	a03d13c83d	Pvt model (#24720 ) * pull and push updates * add docs * fix modeling * Add and run test * make copies * add task * fix tests and fix small issues * Checks on a Pull Request * fix docs * add desc pvt.md	2023-07-24 15:34:19 +01:00
Sylvain Gugger	afe8bfc075	Comment again print statement	2023-07-24 10:12:20 -04:00
Sylvain Gugger	42571f6eb8	Make more test models smaller (#25005 ) * Make more test models tiny * Make more test models tiny * More models * More models	2023-07-24 10:08:47 -04:00
Zach Mueller	3b734f5042	Add dispatch_batches to training arguments (#25038 ) * Dispatch batches * Copy items	2023-07-24 09:27:19 -04:00
Arthur	0511369a8b	[`LlamaConfig`] Nit: pad token should be None by default (#24958 ) * pad token should be None by default * fix tests * nits	2023-07-21 14:32:34 +02:00
Benjamin Badger	caf5e369fc	Contrastive Search peak memory reduction (#24120 ) Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>	2023-07-20 18:46:53 +01:00
Joao Gante	89136ff7f8	Generate: sequence bias can handle same terminations (#24822 )	2023-07-20 12:23:17 +01:00
Tom Aarsen	79444f370f	Deprecate unused OpenLlama architecture (#24922 ) * Resolve typo in check_repo.py * Specify encoding when opening modeling files * Deprecate the OpenLlama architecture * Add disclaimer pointing to Llama I'm open to different wordings here * Match the capitalisation of LLaMA	2023-07-20 07:03:24 -04:00
Arthur	07360b6c9c	[`Llama2`] Add support for Llama 2 (#24891 ) * add llama * add other readmes * update padding id in readme * add link to paper * fix paths and tokenizer * more nits * styling * fit operation in 2 lines when possible * nits * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add form * update reademe * update readme, we don't have a default pad token * update test and tokenization * LLaMA instead of Llama * nits * add expected text * add greeedy output * styling * Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sequential device map * skip relevant changes --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-07-18 15:18:31 -04:00

2940 Commits