HuggingFace_transformer

Author	SHA1	Message	Date
Nima Yaqmuri	07ae53e6e7	Fix/speecht5 bug (#28481 ) * Fix bug in SpeechT5 speech decoder prenet's forward method - Removed redundant `repeat` operation on speaker_embeddings in the forward method. This line was erroneously duplicating the embeddings, leading to incorrect input size for concatenation and performance issues. - Maintained original functionality of the method, ensuring the integrity of the speech decoder prenet's forward pass remains intact. - This change resolves a critical bug affecting the model's performance in handling speaker embeddings. * Refactor SpeechT5 text to speech integration tests - Updated SpeechT5ForTextToSpeechIntegrationTests to accommodate the variability in sequence lengths due to dropout in the speech decoder pre-net. This change ensures that our tests are robust against random variations in generated speech, enhancing the reliability of our test suite. - Removed hardcoded dimensions in test assertions. Replaced with dynamic checks based on model configuration and seed settings, ensuring tests remain valid across different runs and configurations. - Added new test cases to thoroughly validate the shapes of generated spectrograms and waveforms. These tests leverage seed settings to ensure consistent and predictable behavior in testing, addressing potential issues in speech generation and vocoder processing. - Fixed existing test cases where incorrect assumptions about output shapes led to potential errors. * Enhance handling of speaker embeddings in SpeechT5 - Refined the generate and generate_speech functions in the SpeechT5 class to robustly handle two scenarios for speaker embeddings: matching the batch size (one embedding per sample) and one-to-many (a single embedding for all samples in the batch). - The update includes logic to repeat the speaker embedding when a single embedding is provided for multiple samples, and a ValueError is raised for any mismatched dimensions. - Also added corresponding test cases to validate both scenarios, ensuring complete coverage and functionality for diverse speaker embedding situations. * Improve Test Robustness with Randomized Speaker Embeddings	2024-01-16 14:14:28 +00:00
fxmarty	66db33ddc8	Fix mismatching loading in from_pretrained with/without accelerate (#28414 ) * fix mismatching behavior in from_pretrained with/without accelerate * meaningful refactor * remove added space * add test * fix model on the hub * comment * use tiny model * style	2024-01-16 14:29:51 +01:00
Timothy Cronin	ff86bc364d	improve dev setup comments and hints (#28495 ) * improve dev setup comments and hints * fix tests for new dev setup hints	2024-01-15 18:36:40 +00:00
Joao Gante	7e0ddf89f4	Generate: consolidate output classes (#28494 )	2024-01-15 17:04:08 +00:00
Marc Sun	7c8dd88d13	[GPTQ] Fix test (#28018 ) * fix test * reduce length * smaller model	2024-01-15 11:22:54 -05:00
thedamnedrhino	366c03271e	Tokenizer kwargs in textgeneration pipe (#28362 ) * added args to the pipeline * added test * more sensical tests * fixup * docs * typo ; * docs * made changes to support named args * fixed test * docs update * styles * docs * docs	2024-01-15 16:52:18 +01:00
Younes Belkada	1b9a2e4c80	[`core`/ FEAT] Add the possibility to push custom tags using `PreTrainedModel` itself (#28405 ) * v1 tags * remove unneeded conversion * v2 * rm unneeded warning * add more utility methods * Update src/transformers/utils/hub.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * more enhancements * oops * merge tags * clean up * revert unneeded change * add extensive docs * more docs * more kwargs * add test * oops * fix test * Update src/transformers/modeling_utils.py Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update src/transformers/utils/hub.py Co-authored-by: Lucain <lucainp@gmail.com> * Update src/transformers/modeling_utils.py * Update src/transformers/trainer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more conditions * more logic --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Lucain <lucainp@gmail.com> Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>	2024-01-15 14:48:07 +01:00
Apoorv Saxena	e304f9769c	Adding Prompt lookup decoding (#27775 ) * MVP * fix ci * more ci * remove redundant kwarg * added and wired up PromptLookupCandidateGenerator * rebased with main, working * removed print * style fixes * fix test * fixed tests * added test for prompt lookup decoding * fixed circleci * fixed test issue * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/generation/candidate_generator.py * Update src/transformers/generation/candidate_generator.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Joao Gante <joao@huggingface.co> Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-13 17:15:58 +00:00
Joao Gante	afc45b13ca	Generate: refuse to save bad generation config files (#28477 )	2024-01-12 16:01:17 +00:00
Younes Belkada	266c67b06a	[`Mixtral` / `Awq`] Add mixtral fused modules for Awq (#28240 ) * add mixtral fused modules * add changes from modeling utils * add test * fix test + rope theta issue * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add tests --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-12 14:29:35 +01:00
amyeroberts	666a6f078c	Update metadata loading for oneformer (#28398 ) * Update metadata loading for oneformer * Enable loading from a model repo * Update docstrings * Fix tests * Update tests * Clarify repo_path behaviour	2024-01-12 12:35:31 +00:00
amyeroberts	4e36a6cd00	Mark two logger tests as flaky (#28458 ) * Mark two logger tests as flaky * Add description to is_flaky	2024-01-12 11:58:59 +00:00
Younes Belkada	07bdbebb48	[`Awq`] Add llava fused modules support (#28239 ) * add llava + fused modules * Update src/transformers/models/llava/modeling_llava.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-12 06:55:54 +01:00
Yih-Dar	59cd9de39d	Byebye torch 1.10 (#28207 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-11 16:18:27 +01:00
liangxuZhang	e768616afa	Fix load balancing loss func for mixtral (#28256 ) * Correct the implementation of auxiliary loss of mixtral * correct the implementation of auxiliary loss of mixtral * Implement a simpler calculation method --------- Co-authored-by: zhangliangxu3 <zhangliangxu3@jd.com>	2024-01-11 16:16:12 +01:00
Gustavo de Rosa	5509058561	[Phi] Extend implementation to use GQA/MQA. (#28163 ) * chore(phi): Updates configuration_phi with missing keys. * chore(phi): Adds first draft of combined modeling_phi. * fix(phi): Fixes according to latest review. * fix(phi): Removes pad_vocab_size_multiple to prevent inconsistencies. * fix(phi): Fixes unit and integration tests. * fix(phi): Ensures that everything works with microsoft/phi-1 for first integration. * fix(phi): Fixes output of docstring generation. * fix(phi): Fixes according to latest review. * fix(phi): Fixes according to latest review. * fix(tests): Re-enables Phi-1.5 test. * fix(phi): Fixes attention overflow on PhiAttention (for Phi-2). * fix(phi): Improves how queries and keys are upcast. * fix(phi): Small updates on latest changes.	2024-01-11 15:58:02 +01:00
Harisankar Babu	d560637885	Optionally preprocess segmentation maps for MobileViT (#28420 ) * optionally preprocess segmentation maps for mobilevit * changed pretrained model name to that of segmentation model * removed voc-deeplabv3 from model archive list * added preprocess_image and preprocess_mask methods for processing images and segmentation masks respectively * added tests for segmentation masks based on segformer feature extractor * use crop_size instead of size * reverting to initial model	2024-01-11 14:52:14 +00:00
amyeroberts	66964c00f6	Enable multi-label image classification in pipeline (#28433 ) Enable multi-label image classification	2024-01-11 10:29:38 +00:00
Patrick von Platen	cbbe30749b	[Whisper] Fix slow test (#28407 ) * [Whisper] Fix slow test * update * update * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-10 22:35:36 +01:00
Zach Mueller	6015d0ad6c	Support `DeepSpeed` when using auto find batch size (#28088 ) Fixup test	2024-01-10 06:03:13 -05:00
Zach Mueller	a777f52599	Skip now failing test in the Trainer tests (#28421 ) * Fix test * Skip	2024-01-10 06:02:31 -05:00
HanHui	4df1d69634	[BUG] BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch (#28201 ) fix(generation/logits_process.py): BarkEosPrioritizerLogitsProcessor eos_token_id use list, tensor size mismatch Co-authored-by: chenhanhui <chenhanhui@kanzhun.com>	2024-01-10 11:46:49 +01:00
Weiming Zhao	701298d2d3	Use mmap option to load_state_dict (#28331 ) Use mmap option to load_state_dict (#28331)	2024-01-10 09:57:30 +01:00
Victor SANH	0f2f0c634f	Fix `_merge_input_ids_with_image_features` for llava model (#28333 ) * fix `_merge_input_ids_with_image_features` for llava model * Update src/transformers/models/llava/modeling_llava.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * adress comments * style and tests * ooops * test the backward too * Apply suggestions from code review Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update tests/models/vipllava/test_modeling_vipllava.py * style and quality --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-10 08:33:33 +01:00
Xuehai Pan	976189a6df	Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 (#28245 ) * Fix initialization for missing parameters in `from_pretrained` under ZeRO-3 * Test initialization for missing parameters under ZeRO-3 * Add more tests * Only enable deepspeed context for per-module level parameters * Enable deepspeed context only once * Move class definition inside test case body	2024-01-09 14:58:21 +00:00
Sangbum Daniel Choi	357971ec36	fix auxiliary loss training in DetrSegmentation (#28354 ) * fix auxiliary loss training in detrSegmentation * add auxiliary_loss testing	2024-01-09 10:17:07 +00:00
NielsRogge	3b742ea84c	Add SigLIP (#26522 ) * Add first draft * Use appropriate gelu function * More improvements * More improvements * More improvements * Convert checkpoint * More improvements * Improve docs, remove print statements * More improvements * Add link * remove unused masking function * begin tokenizer * do_lower_case * debug * set split_special_tokens=True * Remove script * Fix style * Fix rebase * Use same design as CLIP * Add fast tokenizer * Add SiglipTokenizer to init, remove extra_ids * Improve conversion script * Use smaller inputs in conversion script * Update conversion script * More improvements * Add processor to conversion script * Add tests * Remove print statements * Add tokenizer tests * Fix more tests * More improvements related to weight initialization * More improvements * Make more tests pass * More improvements * More improvements * Add copied from * Add canonicalize_text * Enable fast tokenizer tests * More improvements * Fix most slow tokenizer tests * Address comments * Fix style * Remove script * Address some comments * Add copied from to tests * Add more copied from * Add more copied from * Add more copied from * Remove is_flax_available * More updates * Address comment * Remove SiglipTokenizerFast for now * Add caching * Remove umt5 test * Add canonicalize_text inside _tokenize, thanks Arthur * Fix image processor tests * Skip tests which are not applicable * Skip test_initialization * More improvements * Compare pixel values * Fix doc tests, add integration test * Add do_normalize * Remove causal mask and leverage ignore copy * Fix attention_mask * Fix remaining tests * Fix dummies * Rename temperature and bias * Address comments * Add copied from to tokenizer tests * Add SiglipVisionModel to auto mapping * Add copied from to image processor tests * Improve doc * Remove SiglipVisionModel from index * Address comments * Improve docs * Simplify config * Add first draft * Make it like mistral * More improvements * Fix attention_mask * Fix output_attentions * Add note in docs * Convert multilingual model * Convert large checkpoint * Convert more checkpoints * Add pipeline support, correct image_mean and image_std * Use padding=max_length by default * Make processor like llava * Add code snippet * Convert more checkpoints * Set keep_punctuation_string=None as in OpenCLIP * Set normalized=False for special tokens * Fix doc test * Update integration test * Add figure * Update organization * Happy new year * Use AutoModel everywhere --------- Co-authored-by: patil-suraj <surajp815@gmail.com>	2024-01-08 18:17:16 +01:00
Rosie Wood	73c88012b7	Add segmentation map processing to SAM Image Processor (#27463 ) * add segmentation map processing to sam image processor * fixup * add tests * reshaped_input_size is shape before padding * update tests for size/shape outputs * fixup * add code snippet to docs * Update docs/source/en/model_doc/sam.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Add missing backticks * add `segmentation_maps` as arg for SamProcessor.__call__() --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-08 16:40:36 +00:00
Mohamed Abu El-Nasr	0c2121f99b	Fix building alibi tensor when num_heads is not a power of 2 (#28380 ) * Fix building alibi tensor when num_heads is not a power of 2 * Remove print function	2024-01-08 10:39:40 +01:00
Susnato Dhar	3eddda1111	[Phi2] Add support for phi2 models (#28211 ) * modified script and added test for phi2 * changes	2024-01-07 08:19:14 +01:00
Sangbum Daniel Choi	899d8351f9	[DETA] Improvement and Sync from DETA especially for training (#27990 ) * [DETA] fix freeze/unfreeze function * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * add freeze/unfreeze test case in DETA * fix type * fix typo 2 * fix : enable aux and enc loss in training pipeline * Add unsynced variables from original DETA for training * modification for passing CI test * make style * make fix * manual make fix * change deta_modeling_test of configuration 'two_stage' default to TRUE and minor change of dist checking * remove print * divide configuration in DetaModel and DetaForObjectDetection * image smaller size than 224 will give topk error * pred_boxes and logits should be equivalent to two_stage_num_proposals * add missing part in DetaConfig * Update src/transformers/models/deta/modeling_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add docstring in configure and prettify TO DO part * change distribute related code to accelerate * Update src/transformers/models/deta/configuration_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/deta/test_modeling_deta.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * protect importing accelerate * change variable name to specific value * wrong import --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-05 14:20:21 +00:00
Fernando Rodriguez Sanchez	57e9c83213	Fix pos_mask application and update tests accordingly (#27892 ) * Fix pos_mask application and update tests accordingly * Fix style * Adding comments --------- Co-authored-by: Fernando Rodriguez <fernando.rodriguez@nielseniq.com>	2024-01-05 12:36:10 +01:00
yuanwu2017	03b980990a	Don't check the device when device_map=auto (#28351 ) When running the case on a multi-card server with device_map=auto, it will not always be allocated to device 0, because other processes may be using these cards. It will select the devices that can accommodate this model. Signed-off-by: yuanwu <yuan.wu@intel.com>	2024-01-05 12:21:29 +01:00
Yoach Lacombe	35e9d2b223	Fix error in M4T feature extractor (#28340 ) * fix M4T FE error when no attention mask * modify logic * add test * go back to initial test situation + add other tests	2024-01-04 16:40:53 +00:00
Sangbum Daniel Choi	4a66c0d952	enable training mask2former and maskformer for transformers trainer (#28277 ) * fix get_num_masks output as [int] to int * fix loss size from torch.Size([1]) to torch.Size([])	2024-01-04 09:53:25 +01:00
Apsod	45b1dfa342	Remove token_type_ids from model_input_names (like #24788 ) (#28325 ) * remove token_type_ids from model_input_names (like #24788) * removed test that assumed token_type_ids should be present and updated a model reference so that it points to an available model)	2024-01-03 19:26:07 +01:00
Connor Henderson	d83ff5eeff	Add FastSpeech2Conformer (#23439 ) * start - docs, SpeechT5 copy and rename * add relevant code from FastSpeech2 draft, have tests pass * make it an actual conformer, demo ex. * matching inference with original repo, includes debug code * refactor nn.Sequentials, start more desc. var names * more renaming * more renaming * vocoder scratchwork * matching vocoder outputs * hifigan vocoder conversion script * convert model script, rename some config vars * replace postnet with speecht5's implementation * passing common tests, file cleanup * expand testing, add output hidden states and attention * tokenizer + passing tokenizer tests * variety of updates and tests * g2p_en pckg setup * import structure edits * docstrings and cleanup * repo consistency * deps * small cleanup * forward signature param order * address comments except for masks and labels * address comments on attention_mask and labels * address second round of comments * remove old unneeded line * address comments part 1 * address comments pt 2 * rename auto mapping * fixes for failing tests * address comments part 3 (bart-like, train loss) * make style * pass config where possible * add forward method + tests to WithHifiGan model * make style * address arg passing and generate_speech comments * address Arthur comments * address Arthur comments pt2 * lint changes * Sanchit comment * add g2p-en to doctest deps * move up self.encoder * onnx compatible tensor method * fix is symbolic * fix paper url * move models to espnet org * make style * make fix-copies * update docstring * Arthur comments * update docstring w/ new updates * add model architecture images * header size * md wording update * make style	2024-01-03 18:01:06 +00:00
Younes Belkada	fa21ead73d	[`Awq`] Enable the possibility to skip quantization for some target modules (#27950 ) * v1 * add docstring * add tests * add awq 0.1.8 * oops * fix test	2023-12-25 11:06:56 +01:00
Younes Belkada	29e7a1e183	[`Llava`] Fix llava index errors (#28032 ) * fix llava index errors * forward contrib credits from original implementation and fix * better fix * final fixes and fix all tests * fix * fix nit * fix tests * add regression tests --------- Co-authored-by: gullalc <gullalc@users.noreply.github.com>	2023-12-22 17:47:38 +01:00
Yoach Lacombe	5da3db3fd5	[Whisper] Fix word-level timestamps with bs>1 or num_beams>1 (#28114 ) * fix frames * use smaller chunk length * correct beam search + tentative stride * fix whisper word timestamp in batch * add test batch generation with return token timestamps * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * clean a test * make style + correct typo * write clearer comments * explain test in comment --------- Co-authored-by: sanchit-gandhi <sanchit@huggingface.co> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-12-22 12:43:11 +00:00
Yih-Dar	bb3bd44739	Fix the check of models supporting FA/SDPA not run (#28202 ) * add check_support_list.py * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-12-22 12:56:11 +01:00
NielsRogge	c9fb250a25	Add Swinv2 backbone (#27742 ) * First draft * More improvements * More improvements * Make all tests pass * Remove script * Update image processor * Address comments * Use new gradient checkpointing method * Convert checkpoints, add integration test * Do not keep aspect ratio for now * Set keep_aspect_ratio=False for beit, add integration test * Remove print statement	2023-12-22 11:12:56 +00:00
Nicholas Neo	1ef86c4f56	Fix: [SeamlessM4T - S2TT] Bug in batch loading of audio in torch.Tensor format in the SeamlessM4TFeatureExtractor class (#27914 ) * fixes: code fixes on is_batched condition to also check for batched audio data in torch.Tensor format instead of only just checking for batched audio data in np.ndarray format * Update src/transformers/models/seamless_m4t/feature_extraction_seamless_m4t.py Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com> * refactor: code refactoring to remove torch framework dependency * docs: updated docstring to add torch tensor compatibility * test: add test cases to incorporate torch tensor inputs * test: ran make fix-copies for code conformity * test: refactor test to separate the test_call into test_call_numpy and test_call_torch --------- Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>	2023-12-22 10:47:30 +00:00
amyeroberts	3657748b4d	Update YOLOS slow test values (#28187 ) Update test values	2023-12-21 18:17:07 +00:00
amyeroberts	cd1350ce9b	Fix slow backbone tests - out_indices must match stage name ordering (#28186 ) Indices must match stage name ordering	2023-12-21 18:16:50 +00:00
Matt	260b9d2179	Even more TF test fixes (#28146 ) * Fix vision text dual encoder * Small cleanup for wav2vec2 (not fixed yet) * Small fix for vision_encoder_decoder * Fix SAM builds * Update TFBertTokenizer test with modern exporting + tokenizer * Fix DeBERTa * Fix DeBERTav2 * Try RAG fix but it's impossible to test locally * Actually fix RAG now that I got FAISS working somehow * Fix Wav2Vec2, add sermon * Fix Hubert	2023-12-21 15:14:46 +00:00
Arthur	f9a98c476c	[`Mixtral` & `Mistral`] Add support for sdpa (#28133 ) * some nits * update test * add support d\sd[a * remove some dummy inputs * all good * style * nits * fixes * fix more copies * nits * styling * fix * Update src/transformers/models/mistral/modeling_mistral.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add a slow test just to be sure * fixup --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-12-21 12:38:22 +01:00
Sanchit Gandhi	814619f54f	[Whisper] Use torch for stft if available (#26119 ) * [Whisper] Use torch for stft if available * update docstring * mock patch decorator * fit on one line	2023-12-21 11:04:05 +00:00
Poedator	4f7806ef7e	[bnb] Let's make serialization of 4bit models possible (#26037 ) * updated bitsandbytes.py * rm test_raise_* from test_4bit.py * add test_4bit_serialization.py * modeling_utils bulk edits * bnb_ver 0.41.3 in integrations/bitsandbytes.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * @slow reinstated Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * bnb ver 0.41.3 in src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * rm bnb version todo in integrations/bitsandbytes.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * moved 4b serialization tests to test_4bit * tests upd for opt * to torch_device Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * ruff fixes to tests * rm redundant bnb version check in mod_utils Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * restore _hf_peft_config_loaded modeling_utils.py::2188 Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * restore _hf_peft_config_loaded test in modeling_utils.py::2199 Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fixed NOT getattr(self, "is_8bit_serializable") Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * setting model.is_4bit_serializable * rm separate fp16_statistics arg from set_module... * rm else branch in integrations::bnb::set_module * bnb 4bit dtype check * upd comment on 4bit weights * upd tests for FP4 safe --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2023-12-21 11:54:44 +01:00
Dean Wyatte	e268d7e5dc	disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169 ) disable retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest	2023-12-21 08:39:44 +01:00


3342 Commits