Pavel Iakubovskii
94ae9a8da1
OwlViT/Owlv2 post processing standardization ( #34929 )
...
* Refactor owlvit post_process_object_detection + add text_labels
* Fix copies in grounding dino
* Sync with Owlv2 postprocessing
* Add post_process_grounded_object_detection method to processor, deprecate post_process_object_detection
* Add test cases
* Move text_labels to processors only
* [run-slow] owlvit owlv2
* [run-slow] owlvit, owlv2
* Update snippets
* Update docs structure
* Update deprecated objects for check_repo
* Update docstring for post processing of image guided object detection
2025-01-17 13:58:28 +00:00
Ambrose Robinson
add5f0566c
Added liger_kernel compatibility with PeftModel ( #35680 )
...
* Added liger_kernel compatibility with `PeftModel`
* Amending based on review comments
* Amending based on review comments
2025-01-17 14:43:20 +01:00
alpertunga-bile
df6d42a914
check is added for the report_to variable in TrainingArguments ( #35403 )
...
check for report_to variable is added
2025-01-17 14:39:32 +01:00
Francesco Cariaggi
54fd7e9260
Unable to use MimiModel with DeepSpeed ZeRO-3 ( #34735 )
...
use torch.tensor(), not torch.Tensor()
Co-authored-by: eustlb <94853470+eustlb@users.noreply.github.com >
2025-01-17 14:06:20 +01:00
Cyril Vallez
ab1afd56f5
Fix some tests ( #35682 )
...
* cohere tests
* glm tests
* cohere2 model name
* create decorator
* update
* fix cohere2 completions
* style
* style
* style
* add cuda in comments
2025-01-17 12:10:43 +00:00
Ross Wightman
8c1b5d3782
🚨 🚨 🚨 An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, optimize string search. ( #35615 )
...
* An attempt to fix #29554 . Include 'LayerNorm.' in gamma/beta rename scope, reduce number of characters searched on every load considerably.
* Fix fix on load issue
* Fix gamma/beta warning test
* A style complaint
* Improve efficiency of weight norm key rename. Add better comments about weight norm and layer norm renaming.
* Habitual elif redundant with the return
2025-01-16 17:25:44 -08:00
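The rename scope described in the commit above can be pictured roughly as follows. This is a minimal sketch, not the library's actual implementation: the helper name `rename_legacy_layernorm_key` is hypothetical, and it assumes that legacy checkpoints store LayerNorm parameters as `gamma`/`beta` and that these should map to `weight`/`bias`. Checking only the tail of each key keeps the string search short, and a parameter that merely happens to be called `gamma` outside a `LayerNorm.` scope is left alone.

```python
def rename_legacy_layernorm_key(key: str) -> str:
    """Map legacy `LayerNorm.gamma` / `LayerNorm.beta` keys to weight / bias.

    Hypothetical helper for illustration only. Keys that do not end with
    `LayerNorm.gamma` or `LayerNorm.beta` are returned unchanged.
    """
    if key.endswith("LayerNorm.gamma"):
        return key[: -len("gamma")] + "weight"
    if key.endswith("LayerNorm.beta"):
        return key[: -len("beta")] + "bias"
    return key


print(rename_legacy_layernorm_key("encoder.layer.0.LayerNorm.gamma"))
print(rename_legacy_layernorm_key("decoder.scale.gamma"))
```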
Sai-Suraj-27
02a492a838
Added resource class configuration option for check_circleci_user job ( #32866 )
...
Added resource class configuration option for check_circleci_user job.
2025-01-16 21:31:18 +01:00
Joao Gante
94af1c0aa2
[generate] return Cache object even if passed in a legacy format ( #35673 )
...
* generate returns a Cache object by default
* fix tests
* fix test for encoder-decoder models
2025-01-16 17:06:24 +00:00
Joao Gante
2818307e93
[generate] can instantiate GenerationConfig(cache_implementation="static") ( #35679 )
...
fix failing instantiation
2025-01-16 17:04:54 +00:00
Joao Gante
aaa969e97d
Remove pt_to_tf ( #35672 )
...
* rm command
* remove exception
2025-01-16 17:03:37 +00:00
Joao Gante
80dbbd103c
🧹 remove generate-related objects and methods scheduled for removal in v4.48 ( #35677 )
...
* remove things scheduled for removal
* make fixup
2025-01-16 17:03:20 +00:00
Joao Gante
aeeceb9916
[cache] add a test to confirm we can use cache at train time ( #35709 )
...
* add test
* augment test as suggested
* Update tests/utils/test_modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* rerun tests
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2025-01-16 17:02:34 +00:00
Quinten Roets
57bf1a12a0
Remove batch size argument warning when unjustified ( #35519 )
...
* use max batch size
* revert unnecessary change
---------
Co-authored-by: Raushan Turganbay <raushan@huggingface.co >
2025-01-16 17:48:11 +01:00
Cyril Vallez
91be6a5eb2
Modular: support for importing functions from any file ( #35692 )
...
* fix function imports
* improve comment
* Update modeling_switch_function.py
* make checks more robust
* improvement
* rename
* final test update
2025-01-16 16:37:53 +00:00
efsotr
8ebe9d7166
Optimize ForCausalLMLoss by removing unnecessary contiguous() call to reduce memory overhead ( #35646 )
...
Optimize ForCausalLMLoss by removing unnecessary contiguous() calls to reduce memory overhead
2025-01-16 15:47:43 +00:00
Matt
1302c32a84
Add proper jinja2 error ( #35533 )
...
* Cleanup jinja2 imports
* Raise a proper error if Jinja is missing
* make fixup
2025-01-16 15:31:11 +00:00
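Raising a descriptive error for a missing optional dependency, as the commit above does for Jinja, usually looks something like the sketch below. The helper name `require_package` and the message wording are assumptions made for illustration; they are not taken from the actual change.

```python
import importlib.util


def require_package(name: str, purpose: str) -> None:
    """Raise a descriptive ImportError if an optional dependency is missing.

    Hypothetical helper: `name` is the top-level package to look for and
    `purpose` describes the feature that needs it.
    """
    if importlib.util.find_spec(name) is None:
        raise ImportError(
            f"{purpose} requires the `{name}` package, which is not installed. "
            f"Install it with `pip install {name}`."
        )


# A package from the standard library is always found, so this call is silent.
require_package("json", "JSON parsing")
```

Checking with `importlib.util.find_spec` avoids importing the package just to learn whether it exists.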
Joao Gante
3292e96a4f
[generation] fix type hint ( #35725 )
...
fix type hint
2025-01-16 15:09:59 +00:00
人民艺术家
8b78d9d6e7
Fix the bug that Trainer cannot correctly call torch_jit_model_eval ( #35722 )
...
Fix the bug that the accelerator.autocast does not pass parameters correctly when calling torch_jit_model_eval (#35706 )
2025-01-16 15:53:37 +01:00
kang sheng
2cbcc5877d
Fix condition when GA loss bug fix is not performed ( #35651 )
...
* fix condition when GA loss bug fix is not performed
* max loss diff is 2.29
* fix typo
* add an extra validation that loss should not vary too much
2025-01-16 13:59:53 +01:00
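For background on the commit above: the gradient accumulation (GA) loss issue presumably concerns how the loss is normalised when micro-batches contain different numbers of tokens. The sketch below uses made-up per-token losses and only illustrates the arithmetic; it is not the Trainer's code.

```python
# Per-token losses for two micro-batches of different lengths (made-up numbers).
micro_batches = [[2.0, 2.0, 2.0, 2.0], [4.0]]

# Naive accumulation: average the per-micro-batch mean losses.
naive = sum(sum(mb) / len(mb) for mb in micro_batches) / len(micro_batches)

# Token-level normalisation: divide the summed loss by the total token count.
total_tokens = sum(len(mb) for mb in micro_batches)
token_level = sum(sum(mb) for mb in micro_batches) / total_tokens

print(naive, token_level)
```

The naive average weights the single-token micro-batch as heavily as the four-token one, so the two results differ whenever token counts are uneven.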
Mohamed Mekkouri
fd4f14c968
Fix: Falcon tie_word_embeddings in GGUF ( #35715 )
...
* fix falcon tie_word_embeddings
* fix style
2025-01-16 13:18:22 +01:00
Mikko Reinikainen
bef7dded22
Replace deprecated batch_size with max_batch_size when using HybridCache ( #35498 )
...
* Replace deprecated batch_size with max_batch_size
- Functionality remains the same, because the property getter batch_size(self) returned max_batch_size anyway.
- This change just avoids an unnecessary warning about deprecation.
* Use max_batch_size instead of deprecated batch_size with HybridCache
* Use max_batch_size instead of deprecated batch_size with HybridCache
- Change generated code to match original source
2025-01-16 11:48:41 +00:00
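The deprecated-alias pattern mentioned in the commit above can be sketched as below. `CacheSketch` is a hypothetical stand-in, not the real `HybridCache`, and the warning category and message are assumptions. It shows why switching callers to `max_batch_size` changes no behaviour and only silences the warning.

```python
import warnings


class CacheSketch:
    """Hypothetical stand-in for a cache class; not the real HybridCache."""

    def __init__(self, max_batch_size: int) -> None:
        self.max_batch_size = max_batch_size

    @property
    def batch_size(self) -> int:
        # Deprecated alias: returns the same value but emits a warning.
        warnings.warn(
            "`batch_size` is deprecated, use `max_batch_size` instead.",
            FutureWarning,
            stacklevel=2,
        )
        return self.max_batch_size


cache = CacheSketch(max_batch_size=4)
print(cache.max_batch_size)  # preferred attribute, no warning emitted
```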
hiroaki222
99e0ab6ed8
Fix typo in /docs/source/ja/model_doc/decision_transformer.md URL ( #35705 )
...
doc: Update original code repository URL
2025-01-15 07:36:50 -08:00
Mohamed Mekkouri
12dfd99007
Fix : Nemotron Processor in GGUF conversion ( #35708 )
...
* fixing nemotron processor
* make style
2025-01-15 14:25:44 +01:00
jiqing-feng
387663e571
Enable gptqmodel ( #35012 )
...
* gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update readme
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* gptqmodel needs to use checkpoint_format (#1 )
* gptqmodel needs to use checkpoint_format
* fix quantize
* Update quantization_config.py
* Update quantization_config.py
* Update quantization_config.py
---------
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* Revert quantizer_gptq.py (#2 )
* revert quantizer_gptq.py change
* pass **kwargs
* limit gptqmodel and optimum version
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix warning
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix version check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* revert unrelated changes
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* enable gptqmodel tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix requires gptq
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* Fix Transformer compat (#3 )
* revert quantizer_gptq.py change
* pass **kwargs
* add meta info
* cleanup
* cleanup
* Update quantization_config.py
* hf_select_quant_linear pass checkpoint_format and meta
* fix GPTQTestCUDA
* Update test_gptq.py
* gptqmodel.hf_select_quant_linear() now does not select ExllamaV2
* cleanup
* add backend
* cleanup
* cleanup
* no need to check exllama version
* Update quantization_config.py
* lower checkpoint_format and backend
* check none
* cleanup
* Update quantization_config.py
* fix self.use_exllama == False
* spell
* fix unittest
* fix unittest
---------
Co-authored-by: LRL <lrl@lbx.dev >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* fix format
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix format again
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update gptqmodel version (#6 )
* update gptqmodel version
* update gptqmodel version
* fix unit test (#5 )
* update gptqmodel version
* update gptqmodel version
* "not self.use_exllama" is not equivalent to "self.use_exllama==False"
* fix unittest
* update gptqmodel version
* backend is loading_attributes (#7 )
* fix format and tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix memory check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix device mismatch
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* fix result check
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* Update src/transformers/quantizers/quantizer_gptq.py
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
* update tests
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* review: update docs (#10 )
* review: update docs (#12 )
* review: update docs
* fix typo
* update tests for gptqmodel
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
* update document (#9 )
* update overview.md
* cleanup
* Update overview.md
* Update overview.md
* Update overview.md
* update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
* Update gptq.md
---------
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
* typo
* doc note for asymmetric quant
* typo with apple silicon(e)
* typo for marlin
* column name revert: review
* doc rocm support
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/gptq.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update docs/source/en/quantization/overview.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
Co-authored-by: LRL-ModelCloud <165116337+LRL-ModelCloud@users.noreply.github.com >
Co-authored-by: ZX-ModelCloud <zx@modelcloud.ai >
Co-authored-by: Qubitium-ModelCloud <qubitium@modelcloud.ai >
Co-authored-by: ZX-ModelCloud <165115237+ZX-ModelCloud@users.noreply.github.com >
Co-authored-by: LRL <lrl@lbx.dev >
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2025-01-15 14:22:49 +01:00
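One bullet in the commit above notes that "not self.use_exllama" is not equivalent to "self.use_exllama==False". A small sketch of why, assuming the flag can also be `None` (unset), which the commit message hints at but does not spell out. The function names are hypothetical.

```python
def is_falsy(flag):
    # True for False, None, 0, empty containers, and so on.
    return not flag


def is_explicitly_false(flag):
    # True only when the value compares equal to False; None does not.
    return flag == False  # noqa: E712 (deliberate comparison for the demo)


for flag in (True, False, None):
    print(f"{flag!r}: not flag -> {is_falsy(flag)}, flag == False -> {is_explicitly_false(flag)}")
```

With `None`, `not flag` is `True` while `flag == False` is `False`, so the two conditions take different branches for an unset option.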
Matt
615bf9c5e4
Add future import for Py < 3.10 ( #35666 )
...
* Add future import for Py < 3.10
* make fixup
* Same issue in convert_olmo2_weights_to_hf.py
2025-01-15 12:45:43 +00:00
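The commit above does not say which future import it adds. Assuming it is `from __future__ import annotations`, the sketch below shows the effect: annotations are stored as strings instead of being evaluated, so `int | None` in a signature no longer fails on Python versions older than 3.10. The source is compiled from a string here only because a `__future__` import must be the first statement of its module.

```python
import textwrap

# Module source that uses `X | None` annotations behind the future import.
source = textwrap.dedent(
    """
    from __future__ import annotations

    def find(name: str, default: int | None = None) -> int | None:
        return default
    """
)

namespace = {}
exec(compile(source, "<demo>", "exec"), namespace)

# With the future import, annotations are kept as unevaluated strings.
annotations = namespace["find"].__annotations__
print(annotations)
```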
Raushan Turganbay
09d5f76274
Clean-up composite configs ( #34603 )
...
* remove manual assignment tie-word-embeddings
* remove another unused attribute
* fix tests
* fix tests
* remove unnecessary overwrites
* fix
* decoder=True
* clean pix2struct
* run-all
* forgot `_tied_weights_keys` when adding Emu3
* also Aria + fix-copies
* and clean aria
2025-01-15 10:04:07 +01:00
Mahdi Baghbanzadeh
c61fcde910
Enhance DataCollatorForLanguageModeling with Configurable Token Replacement Probabilities ( #35251 )
...
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing
* DataCollatorForLanguageModeling class was updated with new parameters that provide more control over token masking and replacing
* Addressed review comments, modified the docstring and made a test for the DataCollatorForLanguageModeling
2025-01-14 17:01:10 +00:00
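Configurable replacement probabilities for masked language modeling, as described in the commit above, can be sketched in plain Python as follows. This is not the collator's implementation: `mask_tokens` is a hypothetical function, and the parameter names `mlm_probability`, `mask_replace_prob`, and `random_replace_prob` with their defaults are assumptions chosen to mirror the familiar 80/10/10 scheme.

```python
import random


def mask_tokens(tokens, vocab, mask_token="[MASK]", mlm_probability=0.15,
                mask_replace_prob=0.8, random_replace_prob=0.1, rng=None):
    """Return (masked_tokens, labels); labels are None where no loss applies."""
    if mask_replace_prob + random_replace_prob > 1:
        raise ValueError("replacement probabilities must sum to at most 1")
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in tokens:
        if rng.random() >= mlm_probability:
            masked.append(tok)
            labels.append(None)  # position not selected for prediction
            continue
        labels.append(tok)  # the model must predict the original token
        r = rng.random()
        if r < mask_replace_prob:
            masked.append(mask_token)  # replace with the mask token
        elif r < mask_replace_prob + random_replace_prob:
            masked.append(rng.choice(vocab))  # replace with a random token
        else:
            masked.append(tok)  # keep the original token
    return masked, labels


print(mask_tokens(["the", "cat", "sat"], vocab=["the", "cat", "sat", "dog"],
                  rng=random.Random(0)))
```

Setting `mask_replace_prob=1.0` and `random_replace_prob=0.0` in this sketch always substitutes the mask token at selected positions.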
Ego Joseph Oborakpororo
b0cdbd9119
Enhanced Installation Section in README.md ( #35094 )
...
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
* Update README.md
Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.
* Update README.md
Enhanced installation section with troubleshooting, GPU setup, and OS-specific details.
* Update installation.md
Updated installation.md to include virtual environment and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting and GPU setup instructions.
* Update installation.md
* Update installation.md
* Update installation.md
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update installation.md
Updated installation.md to include virtual environment, troubleshooting functions and GPU setup instructions.
* Update README.md
Removed numbering from README.md.
* Update README.md
Removed unnecessary "a)" formatting as per maintainer feedback.
* Update README.md
Added blank lines around code snippets for better readability.
* Update README.md
Removed the line "b) Install a backend framework:" from README.md as per feedback.
* Update README.md
Simplified "For Windows:" to "Windows" and "For macOS/Linux:" to "macOS/Linux" in README.md as per feedback.
* Update README.md
Removed unnecessary heading and retained valid code snippet.
* Update README.md
Removed unnecessary heading "d) Optional: Install from source for the latest updates" as per feedback.
* Update README.md
Removed "GPU Setup (Optional)" section to align with minimal design feedback.
* Update installation.md
Removed "Create and Activate a Virtual Environment" section from installation.md as per feedback.
* Update installation.md
Adjusted "Troubleshooting" to a second-level heading and added an introductory line as per feedback.
* Update installation.md
Updated troubleshooting section with simplified headings and formatted code blocks as per feedback.
* Update installation.md
Integrated GPU setup instructions into the "Install with pip" section for better content flow.
* Update README.md
Removed Troubleshooting section from README.md for minimalism as per maintainer feedback.
2025-01-14 08:05:08 -08:00
Mohamed Mekkouri
a11041ffad
Fix : add require_read_token for gemma2 gated model ( #35687 )
...
fix gemma2 gated model test
2025-01-14 11:47:05 +01:00
Mohamed Mekkouri
df2a812e95
Fix expected output for ggml test ( #35686 )
...
fix expected output
2025-01-14 11:46:55 +01:00
Mohamed Mekkouri
050636518a
Fix : HQQ config when hqq not available ( #35655 )
...
* fix
* make style
* adding require_hqq
* make style
2025-01-14 11:37:37 +01:00
Martin
715fdd6459
Update torchao.md: use auto-compilation ( #35490 )
...
* Update torchao.md: use auto-compilation
* Update torchao.md: indicate updating transformers to the latest
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com >
2025-01-14 11:33:48 +01:00
Mohamed Mekkouri
4b8d1f7fca
Fix : adding einops lib in the CI docker for some bitsandbytes tests ( #35652 )
...
* fix docker
* fix
2025-01-14 07:36:10 +01:00
RTrace
34f76bb62b
Fix zero_shot_image_classification documentation guide link in SigLIP ( #35671 )
2025-01-13 11:08:17 -08:00
Arthur
c23a1c1932
Add-helium ( #35669 )
...
* Add the helium model.
* Add a missing helium.
* And add another missing helium.
* Use float for the rmsnorm mul.
* Add the Helium tokenizer converter.
* Add the pad token as suggested by Arthur.
* Update the RMSNorm + some other tweaks.
* Fix more rebase issues.
* fix copies and style
* fixes and add helium.md
* add missing tests
* update the backlink
* oups
* style
* update init, and expected results
* small fixes
* match test outputs
* style fixup, fix doc builder
* add dummies and we should be good to go!
* update sdpa and fa2 documentation
---------
Co-authored-by: laurent <laurent.mazare@gmail.com >
2025-01-13 18:41:15 +01:00
Ahmed Almaghz
a3f82328ed
[i18n-ar] Translated file : docs/source/ar/tasks/token_classification.md into Arabic ( #35193 )
...
* Create token_classification.md
* Update token_classification.md
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update docs/source/ar/tasks/token_classification.md
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
* Update _toctree.yml
---------
Co-authored-by: Abdullah Mohammed <554032+abodacs@users.noreply.github.com >
2025-01-13 09:32:15 -08:00
Fanli Lin
2fa876d2d8
[tests] make cuda-only tests device-agnostic ( #35607 )
...
* initial commit
* remove unrelated files
* further remove
* Update test_trainer.py
* fix style
2025-01-13 14:48:39 +01:00
Arthur
e6f9b03464
[Compile] Only test compiling model forward pass ( #35658 )
...
* rename test to only compile forward!
* style emu
2025-01-13 13:43:29 +01:00
Raushan Turganbay
84a6789145
Enable different torch dtype in sub models ( #34873 )
...
* fix
* fix test
* add tests
* add more tests
* fix tests
* supposed to be a torch.dtype test
* handle BC and make fp32 default
2025-01-13 13:42:08 +01:00
Arthur
87089176d9
[Phi] bias should be True ( #35650 )
...
bias should be True
2025-01-13 13:15:07 +01:00
Sai-Suraj-27
91f14f1fc4
Removed some duplicated code ( #35637 )
...
* Removed duplicate class field definition.
* Removed duplicate code in try-except block.
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com >
2025-01-13 12:34:21 +01:00
jiqing-feng
b8c34d97fc
Fix whisper compile ( #35413 )
...
Fix compile error
Signed-off-by: jiqing-feng <jiqing.feng@intel.com >
2025-01-13 11:31:51 +01:00
Cyril Vallez
cd44bdb4b8
Fix device in rope module when using dynamic updates ( #35608 )
...
fix rope device
2025-01-13 10:11:17 +01:00
Matt
15bd3e61f8
Update codeowners with individual model owners ( #35595 )
...
* Update codeowners with individual model owners
* rip yoach
* add comment
* Replace - with _
* Add @qubvel for zero-shot object-detection
* Update CODEOWNERS
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update CODEOWNERS
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update CODEOWNERS
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update CODEOWNERS
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Add yoni for omdet-turbo
* Update CODEOWNERS
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
* Refactor / comment the CODEOWNERS file
* Capture modular files as well
* Add dummies without owner
* More cleanup
* Set Niels on a few more models that he added
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Yoni Gozlan <74535834+yonigozlan@users.noreply.github.com >
2025-01-10 17:59:36 +00:00
Yih-Dar
1e3c6c1f7d
Skip MobileNetV1ModelTest::test_batching_equivalence for now ( #35614 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-01-10 18:32:36 +01:00
Yih-Dar
04eae987f3
Fix flaky test_beam_search_low_memory ( #35611 )
...
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-01-10 17:31:03 +01:00
Zach Mueller
b02828e4af
Let EarlyStoppingCallback not require load_best_model_at_end ( #35101 )
...
* Bookmark
* Add warning
2025-01-10 10:25:32 -05:00
Taha Akbari
0aaf124fb9
Added error when sequence length is bigger than max_position_embeddings ( #32156 )
...
* Added error when sequence length is bigger than max_position_embeddings
* Fixed formatting
* Fixed bug
* Changed copies to match
* Fixed bug
* Applied suggestions
* Removed redundant code
* Fixed bugs
* Bug fix
* Bug fix
* Added requested Changes
* Fixed bug
* Fixed unwanted change
* Fixed unwanted changes
* Fixed formatting
2025-01-10 15:23:54 +00:00
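The kind of check the commit above adds can be sketched as below. The function name `check_sequence_length` and the error wording are hypothetical; only the comparison against `max_position_embeddings` comes from the commit title.

```python
def check_sequence_length(seq_len: int, max_position_embeddings: int) -> None:
    """Raise early instead of failing later with an obscure indexing error."""
    if seq_len > max_position_embeddings:
        raise ValueError(
            f"Sequence length ({seq_len}) exceeds max_position_embeddings "
            f"({max_position_embeddings}); truncate the input or use a model "
            "with a larger position limit."
        )


check_sequence_length(128, max_position_embeddings=512)  # within the limit
```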
Zach Mueller
1211e616a4
Use inherit tempdir makers for tests + fix failing DS tests ( #35600 )
...
* Use existing APIs to make tempdir folders
* Fixup deepspeed too
* output_dir -> tmp_dir
2025-01-10 10:01:58 -05:00
Yih-Dar
bbc00046b9
Fix flaky test_custom_4d_attention_mask ( #35606 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2025-01-10 15:40:04 +01:00