HuggingFace_transformer

Author	SHA1	Message	Date
Pavel Iakubovskii	94ae9a8da1	OwlViT/Owlv2 post processing standardization (#34929 ) * Refactor owlvit post_process_object_detection + add text_labels * Fix copies in grounding dino * Sync with Owlv2 postprocessing * Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection * Add test cases * Move text_labels to processors only * [run-slow] owlvit owlv2 * [run-slow] owlvit, owlv2 * Update snippets * Update docs structure * Update deprecated objects for check_repo * Update docstring for post processing of image guided object detection	2025-01-17 13:58:28 +00:00
Ambrose Robinson	add5f0566c	Added liger_kernel compatibility with `PeftModel` (#35680 ) * Added liger_kernel compatibility with `PeftModel` * Amending based on review comments * Amending based on review comments	2025-01-17 14:43:20 +01:00
alpertunga-bile	df6d42a914	check is added for the report_to variable in TrainingArguments (#35403 ) check for report_to variable is added	2025-01-17 14:39:32 +01:00
Francesco Cariaggi	54fd7e9260	Unable to use `MimiModel` with DeepSpeed ZeRO-3 (#34735 ) use torch.tensor(), not torch.Tensor() Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com>	2025-01-17 14:06:20 +01:00
Cyril Vallez	ab1afd56f5	Fix some tests (#35682 ) * cohere tests * glm tests * cohere2 model name * create decorator * update * fix cohere2 completions * style * style * style * add cuda in comments	2025-01-17 12:10:43 +00:00
Ross Wightman	8c1b5d3782	🚨🚨🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. (#35615 ) * An attempt to fix #29554. Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably. * Fix fix on load issue * Fix gamma/beta warning test * A style complaint * Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming. * Habitual elif redundant with the return	2025-01-16 17:25:44 -08:00
Sai-Suraj-27	02a492a838	Added resource class configuration option for `check_circleci_user` job (#32866 ) Added resource class configuration option for check_circleci_user job.	2025-01-16 21:31:18 +01:00
Joao Gante	94af1c0aa2	[generate] return Cache object even if passed in a legacy format (#35673 ) * generate returns a Cache object by default * fix tests * fix test for encoder-decoder models	2025-01-16 17:06:24 +00:00
Joao Gante	2818307e93	[generate] can instantiate `GenerationConfig(cache_implementation="static")` (#35679 ) fix failing instantiation	2025-01-16 17:04:54 +00:00
Joao Gante	aaa969e97d	Remove `pt_to_tf` (#35672 ) * rm command * remove exception	2025-01-16 17:03:37 +00:00
Joao Gante	80dbbd103c	🧹 remove `generate`-related objects and methods scheduled for removal in v4.48 (#35677 ) * remove things scheduled for removal * make fixup	2025-01-16 17:03:20 +00:00
Joao Gante	aeeceb9916	[cache] add a test to confirm we can use cache at train time (#35709 ) * add test * augment test as suggested * Update tests/utils/test_modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * rerun tests --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2025-01-16 17:02:34 +00:00
Quinten Roets	57bf1a12a0	Remove batch size argument warning when unjustified (#35519 ) * use max batch size * revert unnecessary change --------- Co-authored-by: Raushan Turganbay <raushan@huggingface.co>	2025-01-16 17:48:11 +01:00
Cyril Vallez	91be6a5eb2	Modular: support for importing functions from any file (#35692 ) * fix function imports * improve comment * Update modeling_switch_function.py * make checks more robust * improvement * rename * final test update	2025-01-16 16:37:53 +00:00
efsotr	8ebe9d7166	Optimize ForCausalLMLoss by removing unnecessary contiguous() call to reduce memory overhead (#35646 ) Optimize ForCausalLMLoss by removing unnecessary contiguous() calls to reduce memory overhead	2025-01-16 15:47:43 +00:00
Matt	1302c32a84	Add proper jinja2 error (#35533 ) * Cleanup jinja2 imports * Raise a proper error if Jinja is missing * make fixup	2025-01-16 15:31:11 +00:00
Joao Gante	3292e96a4f	[generation] fix type hint (#35725 ) fix type hint	2025-01-16 15:09:59 +00:00
人民艺术家	8b78d9d6e7	Fix the bug that `Trainer` cannot correctly call `torch_jit_model_eval` (#35722 ) Fix the bug that the accelerator.autocast does not pass parameters correctly when calling torch_jit_model_eval (#35706)	2025-01-16 15:53:37 +01:00
kang sheng	2cbcc5877d	Fix condition when GA loss bug fix is not performed (#35651 ) * fix condition when GA loss bug fix is not performed * max loss diff is 2.29 * fix typo * add an extra validation that loss should not vary too much	2025-01-16 13:59:53 +01:00
Mohamed Mekkouri	fd4f14c968	Fix: Falcon tie_word_embeddings in GGUF (#35715 ) * fix falcon tie_word_embeddings * fix style	2025-01-16 13:18:22 +01:00
Mikko Reinikainen	bef7dded22	Replace deprecated batch_size with max_batch_size when using HybridCache (#35498 ) * Replace deprecated batch_size with max_batch_size - Functionality remains the same, because property getter batch_size(self) returned max_batch_size anyways. - This change just avoids an unnecessary warning about deprecation. * Use max_batch_size instead of deprecated batch_size with HybridCache * Use max_batch_size instead of deprecated batch_size with HybridCache - Change generated code to match original source	2025-01-16 11:48:41 +00:00
hiroaki222	99e0ab6ed8	Fix typo in /docs/source/ja/model_doc/decision_transformer.md URL (#35705 ) doc: Update original code repository URL	2025-01-15 07:36:50 -08:00
Mohamed Mekkouri	12dfd99007	Fix : Nemotron Processor in GGUF conversion (#35708 ) * fixing nemotron processor * make style	2025-01-15 14:25:44 +01:00
jiqing-feng	387663e571	Enable gptqmodel (#35012 ) * gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update readme Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * gptqmodel need use checkpoint_format (#1) * gptqmodel need use checkpoint_format * fix quantize * Update quantization_config.py * Update quantization_config.py * Update quantization_config.py --------- Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * Revert quantizer_gptq.py (#2) * revert quantizer_gptq.py change * pass *kwargs limit gptqmodel and optimum version Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix warning Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix version check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * revert unrelated changes Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * enable gptqmodel tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix requires gptq Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Fix Transformer compat (#3) * revert quantizer_gptq.py change * pass *kwargs add meta info * cleanup * cleanup * Update quantization_config.py * hf_select_quant_linear pass checkpoint_format and meta * fix GPTQTestCUDA * Update test_gptq.py * gptqmodel.hf_select_quant_linear() now does not select ExllamaV2 * cleanup * add backend * cleanup * cleanup * no need check exllama version * Update quantization_config.py * lower checkpoint_format and backend * check none * cleanup * Update quantization_config.py * fix self.use_exllama == False * spell * fix unittest * fix unittest --------- Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * fix format Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix format again Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update gptqmodel version 
(#6) * update gptqmodel version * update gptqmodel version * fix unit test (#5) * update gptqmodel version * update gptqmodel version * "not self.use_exllama" is not equivalent to "self.use_exllama==False" * fix unittest * update gptqmodel version * backend is loading_attibutes (#7) * fix format and tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix memory check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix device mismatch Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix result check Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_gptq.py Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * update tests Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * review: update docs (#10) * review: update docs (#12) * review: update docs * fix typo * update tests for gptqmodel Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * update document (#9) * update overview.md * cleanup * Update overview.md * Update overview.md * Update overview.md * update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md * Update gptq.md --------- Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> * typo * doc note for asymmetric quant * typo with apple silicon(e) * typo for marlin * column name revert: review * doc rocm support * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update 
docs/source/en/quantization/gptq.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/quantization/overview.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com> Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai> Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai> Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com> Co-authored-by: LRL <lrl@lbx.dev> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2025-01-15 14:22:49 +01:00
Matt	615bf9c5e4	Add future import for Py < 3.10 (#35666 ) * Add future import for Py < 3.10 * make fixup * Same issue in convert_olmo2_weights_to_hf.py	2025-01-15 12:45:43 +00:00
Raushan Turganbay	09d5f76274	Clean-up composite configs (#34603 ) * remove manual assignment tie-word-embeddings * remove another unused attribute * fix tests * fix tests * remove unnecessary overwrites * fix * decoder=True * clean pix2struct * run-all * forgot `_tied_weights_keys` when adding Emu3 * also Aria + fix-copies * and clean aria	2025-01-15 10:04:07 +01:00
Mahdi Baghbanzadeh	c61fcde910	Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities (#35251 ) * DataCollatorForLanguageModeling class was updated with new parameters that provide more control over the token masking and replacing * DataCollatorForLanguageModeling class was updated with new parameters that provide more control over the token masking and replacing * Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling	2025-01-14 17:01:10 +00:00
Ego Joseph Oborakpororo	b0cdbd9119	Enhanced Installation Section in README.md (#35094 ) * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md * Update README.md Enhanced installation section with troubleshooting, GPU setup, and OS-specific details. * Update README.md Enhanced installation section with troubleshooting, GPU setup, and OS-specific details. * Update installation.md Updated installation.md to include virtual environment and GPU setup instructions. * Update installation.md Updated installation.md to include virtual environment and GPU setup instructions. * Update installation.md Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions. * Update installation.md * Update installation.md * Update installation.md * Update installation.md Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions. * Update installation.md Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions. * Update installation.md Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions. * Update README.md Removed numbering from README.md. * Update README.md Removed unnecessary "a)" formatting as per maintainer feedback. * Update README.md Added blank lines around code snippets for better readability. * Update README.md Removed the line "b) Install a backend framework:" from README.md as per feedback. * Update README.md Simplified "For Windows:" to "Windows" in README.md as per feedback as well as "For macOS/Linux:" to "macOS/Linux" * Update README.md Removed unnecessary heading and retained valid code snippet. 
* Update README.md Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback. * Update README.md Removed "GPU Setup (Optional)" section to align with minimal design feedback. * Update installation.md Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback. * Update installation.md Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback. * Update installation.md Updated troubleshooting section with simplified headings and formatted code blocks as per feedback. * Update installation.md Integrated GPU setup instructions into the "Install with pip" section for better content flow. * Update README.md Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.	2025-01-14 08:05:08 -08:00
Mohamed Mekkouri	a11041ffad	Fix : add require_read_token for gemma2 gated model (#35687 ) fix gemma2 gated model test	2025-01-14 11:47:05 +01:00
Mohamed Mekkouri	df2a812e95	Fix expected output for ggml test (#35686 ) fix expected output	2025-01-14 11:46:55 +01:00
Mohamed Mekkouri	050636518a	Fix : HQQ config when hqq not available (#35655 ) * fix * make style * adding require_hqq * make style	2025-01-14 11:37:37 +01:00
Martin	715fdd6459	Update torchao.md: use auto-compilation (#35490 ) * Update torchao.md: use auto-compilation * Update torchao.md: indicate updating transformers to the latest --------- Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>	2025-01-14 11:33:48 +01:00
Mohamed Mekkouri	4b8d1f7fca	Fix : adding einops lib in the CI docker for some bitsandbytes tests (#35652 ) * fix docker * fix	2025-01-14 07:36:10 +01:00
RTrace	34f76bb62b	Fix `zero_shot_image_classification` documentation guide link in SigLIP (#35671 )	2025-01-13 11:08:17 -08:00
Arthur	c23a1c1932	Add-helium (#35669 ) * Add the helium model. * Add a missing helium. * And add another missing helium. * Use float for the rmsnorm mul. * Add the Helium tokenizer converter. * Add the pad token as suggested by Arthur. * Update the RMSNorm + some other tweaks. * Fix more rebase issues. * fix copies and style * fixes and add helium.md * add missing tests * update the backlink * oups * style * update init, and expected results * small fixes * match test outputs * style fixup, fix doc builder * add dummies and we should be good to go! * update sdpa and fa2 documentation --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2025-01-13 18:41:15 +01:00
Ahmed Almaghz	a3f82328ed	[i18n-ar] Translated file : docs/source/ar/tasks/token_classification.md into Arabic (#35193 ) * Create token_classification.md * Update token_classification.md * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed 
<554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update docs/source/ar/tasks/token_classification.md Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com> * Update _toctree.yml --------- Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com>	2025-01-13 09:32:15 -08:00
Fanli Lin	2fa876d2d8	[tests] make cuda-only tests device-agnostic (#35607 ) * initial commit * remove unrelated files * further remove * Update test_trainer.py * fix style	2025-01-13 14:48:39 +01:00
Arthur	e6f9b03464	[`Compile`] Only test compiling model forward pass (#35658 ) * rename test to only compile forward! * style emu	2025-01-13 13:43:29 +01:00
Raushan Turganbay	84a6789145	Enable different torch dtype in sub models (#34873 ) * fix * fix test * add tests * add more tests * fix tests * supposed to be a torch.dtype test * handle BC and make fp32 default	2025-01-13 13:42:08 +01:00
Arthur	87089176d9	[`Phi`] bias should be True (#35650 ) bias should be True	2025-01-13 13:15:07 +01:00
Sai-Suraj-27	91f14f1fc4	Removed some duplicated code (#35637 ) * Removed duplicate class field definition. * Removed duplicate code in try-except block. --------- Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>	2025-01-13 12:34:21 +01:00
jiqing-feng	b8c34d97fc	Fix whisper compile (#35413 ) Fix compile error Signed-off-by: jiqing-feng <jiqing.feng@intel.com>	2025-01-13 11:31:51 +01:00
Cyril Vallez	cd44bdb4b8	Fix device in rope module when using dynamic updates (#35608 ) fix rope device	2025-01-13 10:11:17 +01:00
Matt	15bd3e61f8	Update codeowners with individual model owners (#35595 ) * Update codeowners with individual model owners * rip yoach * add comment * Replace - with _ * Add @qubvel for zero-shot object-detection * Update CODEOWNERS Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update CODEOWNERS Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update CODEOWNERS Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update CODEOWNERS Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add yoni for omdet-turbo * Update CODEOWNERS Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com> * Refactor / comment the CODEOWNERS file * Capture modular files as well * Add dummies without owner * More cleanup * Set Niels on a few more models that he added --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com>	2025-01-10 17:59:36 +00:00
Yih-Dar	1e3c6c1f7d	Skip `MobileNetV1ModelTest::test_batching_equivalence` for now (#35614 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-10 18:32:36 +01:00
Yih-Dar	04eae987f3	Fix flaky `test_beam_search_low_memory` (#35611 ) * fix * fix * fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-10 17:31:03 +01:00
Zach Mueller	b02828e4af	Let `EarlyStoppingCallback` not require `load_best_model_at_end` (#35101 ) * Bookmark * Add warning	2025-01-10 10:25:32 -05:00
Taha Akbari	0aaf124fb9	Added error when sequence length is bigger than max_position_embeddings (#32156 ) * Added error when sequence length is bigger than max_position_embeddings * Fixed formatting * Fixed bug * Changed copies to match * Fixed bug * Applied suggestions * Removed redundant code * Fixed bugs * Bug fix * Bug fix * Added requested Changes * Fixed bug * Fixed unwanted change * Fixed unwanted changes * Fixed formatting	2025-01-10 15:23:54 +00:00
Zach Mueller	1211e616a4	Use inherit tempdir makers for tests + fix failing DS tests (#35600 ) * Use existing APIs to make tempdir folders * Fixup deepspeed too * output_dir -> tmp_dir	2025-01-10 10:01:58 -05:00
Yih-Dar	bbc00046b9	Fix flaky `test_custom_4d_attention_mask` (#35606 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2025-01-10 15:40:04 +01:00

17804 Commits