15232 Commits

Author SHA1 Message Date
Raushan Turganbay
cc309fd406 pass kwargs in stopping criteria list (#28927) 2024-02-08 15:38:29 +00:00
vodkaslime
0b693e90e0 fix: torch.int32 instead of torch.torch.int32 (#28883) 2024-02-08 16:28:17 +01:00
Matt
693667b8ac Remove dead TF loading code (#28926)
Remove dead code
2024-02-08 14:17:33 +00:00
Arthur
115ac94d06 [Core generation] Adds support for static KV cache (#27931)
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-02-08 11:50:34 +01:00
Javier
4b236aed76 Fix utf-8 yaml load for marian conversion to pytorch in Windows (#28618)
Fix utf-8 yaml in marian conversion
2024-02-08 08:23:15 +01:00
Klaus Hipp
33df036917 [Docs] Revert translation of '@slow' decorator (#28912) 2024-02-08 03:31:47 +01:00
Klaus Hipp
328ade855b [Docs] Fix placement of tilde character (#28913)
Fix placement of tilde character
2024-02-07 17:19:39 -08:00
Huazhong Ji
5f96855761 Add npu device for pipeline (#28885)
add npu device for pipeline

Co-authored-by: unit_test <test@unit.com>
2024-02-07 17:27:01 +00:00
Yih-Dar
308d2b9004 Update the cache number (#28905)
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-07 16:37:09 +01:00
Daniel Korat
abf8f54a01 ⚠️ Raise Exception when trying to generate 0 tokens ⚠️ (#28621)
* change warning to exception

* Update src/transformers/generation/utils.py

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* validate `max_new_tokens` > 0 in `GenerationConfig`

* fix truncation test parameterization in `TextGenerationPipelineTests`

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2024-02-07 13:42:01 +01:00
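The validation described in abf8f54a01 (raise an exception instead of a warning, and require `max_new_tokens` > 0) can be sketched as follows. `ToyGenerationConfig` is a hypothetical stand-in written for this illustration, not the real `GenerationConfig`, and the error message text is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyGenerationConfig:
    """Hypothetical stand-in for a generation config (not the transformers class)."""

    max_new_tokens: Optional[int] = None

    def __post_init__(self) -> None:
        self.validate()

    def validate(self) -> None:
        # None means "not set"; only an explicit non-positive value is rejected.
        if self.max_new_tokens is not None and self.max_new_tokens <= 0:
            raise ValueError(
                f"`max_new_tokens` must be greater than 0, but is {self.max_new_tokens}."
            )
```

With this sketch, `ToyGenerationConfig(max_new_tokens=0)` fails at construction time, so the caller sees the problem before any generation is attempted.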
Matt
349a6e8542 Fix Keras scheduler import so it works for older versions of Keras (#28895)
Fix our schedule import so it works for older versions of Keras
2024-02-07 12:28:24 +00:00
Sourab Mangrulkar
d9deddb4c1 fix Starcoder FA2 implementation (#28891) 2024-02-07 14:10:10 +05:30
Sai-Suraj-27
64d1518cbf fix: Fixed the documentation for logging_first_step by removing "evaluate" (#28884)
Fixed the documentation for logging_first_step by removing evaluate.
2024-02-07 08:46:36 +01:00
Klaus Hipp
1c31b7aa3b [Docs] Add missing language options and fix broken links (#28852)
* Add missing entries to the language selector

* Add links to the Colab and AWS Studio notebooks for ONNX

* Use anchor links in CONTRIBUTING.md

* Fix broken hyperlinks due to spaces

* Fix links to OpenAI research articles

* Remove confusing footnote symbols from author names, as they are also considered invalid markup
2024-02-06 12:01:01 -08:00
Yih-Dar
40658be461 Hotfix - make torchaudio get the correct version in torch_and_flax_job (#28899)
* check

* check

* check

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 21:00:42 +01:00
Klaus Hipp
4830f26965 [Docs] Fix backticks in inline code and documentation links (#28875)
Fix backticks in code blocks and documentation links
2024-02-06 11:15:44 -08:00
Lucain
a1afec9e17 Explicit server error on gated model (#28894) 2024-02-06 17:45:20 +00:00
Yih-Dar
89439fea64 unpin torch (#28892)
* unpin torch

* check

* check

* check

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 17:21:05 +01:00
Yih-Dar
76b4f666f5 Revert "[WIP] Hard error when ignoring tensors." (#28898)
Revert "[WIP] Hard error when ignoring tensors. (#27484)"

This reverts commit 2da28c4b41.
2024-02-06 17:18:30 +01:00
Yih-Dar
6529a5b5c1 Fix FastSpeech2ConformerModelTest and skip it on CPU (#28888)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-06 11:05:23 +01:00
Sourab Mangrulkar
5346db1684 Raise error when using save_only_model with load_best_model_at_end for DeepSpeed/FSDP (#28866)
* Raise error when using `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP

* Update trainer.py
2024-02-06 11:25:44 +05:30
Eran Hirsch
ee2a3400f2 Fix LongT5ForConditionalGeneration initialization of lm_head (#28873) 2024-02-06 04:24:20 +01:00
Klaus Hipp
1ea0bbd73c [Docs] Update project names and links in awesome-transformers (#28878)
Update project names and repository links in awesome-transformers
2024-02-06 04:06:29 +01:00
dependabot[bot]
e83227d76e Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer (#28879)
Bump cryptography in /examples/research_projects/decision_transformer

Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.2 to 42.0.0.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/41.0.2...42.0.0)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-06 03:53:08 +01:00
nakranivaibhav
2e7c942c81 Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777)
* This is a test commit

* testing commit

* final commit with some changes

* Removed copy statement

* Fixed formatting issues

* Fixed error added past_key_values in the forward method

* Fixed a trailing whitespace. Damn the formatting rules are strict

* Added the copy statement
2024-02-06 03:41:42 +01:00
xkszltl
ac51e59e47 Do not use mtime for checkpoint rotation. (#28862)
Resolve https://github.com/huggingface/transformers/issues/26961
2024-02-06 03:21:50 +01:00
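The message for ac51e59e47 does not say what replaces mtime. One mtime-free ordering, shown here purely as an assumption, is to sort checkpoint directories by the step number in their name. The `checkpoint-<step>` naming pattern and both helper functions below are hypothetical and may not match the actual change.

```python
import re
from typing import List

# Assumed directory naming pattern: "checkpoint-<step>".
_STEP_RE = re.compile(r"^checkpoint-(\d+)$")


def sort_checkpoints_by_step(names: List[str]) -> List[str]:
    """Return checkpoint directory names ordered oldest to newest by step number."""
    numbered = []
    for name in names:
        match = _STEP_RE.match(name)
        if match:  # skip anything that does not look like a checkpoint directory
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def checkpoints_to_delete(names: List[str], save_total_limit: int) -> List[str]:
    """Pick the oldest checkpoints so that at most `save_total_limit` remain."""
    ordered = sort_checkpoints_by_step(names)
    excess = max(0, len(ordered) - save_total_limit)
    return ordered[:excess]
```

Sorting on the parsed integer rather than the raw string matters: a plain string sort would place `checkpoint-1000` before `checkpoint-500`.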
eajechiloae
06901162b5 ClearMLCallback enhancements: support multiple runs and handle logging better (#28559)
* add clearml tracker

* support multiple train runs

* remove bad code

* add UI entries for config/hparams overrides

* handle models in different tasks

* run ruff format

* tidy code based on code review

---------

Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>
2024-02-05 20:04:17 +00:00
amyeroberts
ba3264b4e8 Image Feature Extraction pipeline (#28216)
* Draft pipeline

* Fixup

* Fix docstrings

* Update doctest

* Update pipeline_model_mapping

* Update docstring

* Update tests

* Update src/transformers/pipelines/image_feature_extraction.py

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Fix docstrings - review comments

* Remove pipeline mapping for composite vision models

* Add to pipeline tests

* Remove for flava (multimodal)

* safe pil import

* Add requirements for pipeline run

* Account for super slow efficientnet

* Review comments

* Fix tests

* Swap order of kwargs

* Use build_pipeline_init_args

* Add back FE pipeline for Vilt

* Include image_processor_kwargs in docstring

* Mark test as flaky

* Update TODO

* Update tests/pipelines/test_pipelines_image_feature_extraction.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add license header

---------

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-05 14:50:07 +00:00
Yoach Lacombe
7addc9346c Correct wav2vec2-bert inputs_to_logits_ratio (#28821)
* Correct wav2vec2-bert inputs_to_logits_ratio

* correct ratio

* correct ratio, clean asr pipeline

* refactor on one line
2024-02-05 13:14:47 +00:00
Arthur
3f9f749325 [Doc] update contribution guidelines (#28858)
update guidelines
2024-02-05 21:19:21 +09:00
Nicolas Patry
2da28c4b41 [WIP] Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.

* Better selection/error when saving a checkpoint.

- Find all names we should normally drop (those are in the transformers
  config)
- Find all disjoint tensors (for those we can safely trigger a copy to
  get rid of the sharing before saving)
- Clone those disjoint tensors getting rid of the issue
- Find all identical names (those should be declared in the config
  but we try to find them all anyway.)
- For all identical names:
  - If they are in the config, just ignore them; everything is fine.
  - If they are not, warn about them.
- For all remaining tensors which are shared yet neither identical NOR
  disjoint, raise a hard error.

* Adding a failing test on `main` that passes here.

* We don't need to keep the subfolder logic in this test.

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2024-02-05 09:17:24 +01:00
w4ffl35
0466fd5ca2 Ability to override clean_code_for_run (#28783)
* Add clean_code_for_run function

* Call clean_code_for_run from agent method
2024-02-05 03:48:41 +01:00
Zizhao Chen
c430d6eaee [Docs] Fix bad doc: replace save with logging (#28855)
Fix bad doc: replace save with logging
2024-02-05 03:38:08 +01:00
Ziyang
7b702836af Support custom scheduler in deepspeed training (#26831)
Reuse trainer.create_scheduler to create scheduler for deepspeed
2024-02-05 03:33:55 +01:00
dependabot[bot]
ca8944c4e3 Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer (#28845)
Bump dash in /examples/research_projects/decision_transformer

Bumps [dash](https://github.com/plotly/dash) from 2.3.0 to 2.15.0.
- [Release notes](https://github.com/plotly/dash/releases)
- [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md)
- [Commits](https://github.com/plotly/dash/compare/v2.3.0...v2.15.0)

---
updated-dependencies:
- dependency-name: dash
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-05 03:12:30 +01:00
amyeroberts
3d2900e829 Mark test_encoder_decoder_model_generate for vision_encoder_decoder as flaky (#28842)
Mark test as flaky
2024-02-02 16:57:08 +00:00
Sourab Mangrulkar
80d50076c8 Reduce GPU memory usage when using FSDP+PEFT (#28830)
support FSDP+PEFT
2024-02-02 21:18:01 +05:30
Yih-Dar
f497795948 Use -v for pytest on CircleCI (#28840)
use -v in pytest

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 16:44:13 +01:00
Yih-Dar
a7cb92aa03 fix / skip (for now) some tests before switch to torch 2.2 (#28838)
* fix / skip some tests before we can switch to torch 2.2

* style

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 14:11:50 +01:00
Yih-Dar
0e75aeefaf Fix issues caused by natten (#28834)
try

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-02-02 21:11:48 +09:00
Juri Ganitkevitch
ec29d25d9f Add missing None check for hf_quantizer (#28804)
* Add missing None check for hf_quantizer

* Add test, fix logic.

* make style

* Switch test model to Mistral

* Comment

* Update tests/test_modeling_utils.py

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2024-02-02 09:34:12 +01:00
skumar951
1efb21c764 Explicitly check if token IDs are None in TFBertTokenizer constructor (#28824)
Add an explicit None check, since token IDs can be 0
2024-02-02 09:13:36 +01:00
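The reasoning behind 1efb21c764 (a token ID of 0 is valid but falsy) can be shown with a small sketch. Both functions and the `default` parameter are hypothetical names for this illustration and are not taken from `TFBertTokenizer`.

```python
from typing import Optional


def resolve_token_id_truthy(token_id: Optional[int], default: int) -> int:
    # Buggy pattern: 0 is falsy, so a valid ID of 0 is silently replaced.
    return token_id if token_id else default


def resolve_token_id_explicit(token_id: Optional[int], default: int) -> int:
    # Explicit check: only a missing value (None) falls back to the default.
    return default if token_id is None else token_id
```

Calling both with `token_id=0` shows the difference: the truthiness version returns the default, while the explicit version keeps 0.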
Klaus Hipp
721ee783ca [Docs] Fix spelling and grammar mistakes (#28825)
* Fix typos and grammar mistakes in docs and examples

* Fix typos in docstrings and comments

* Fix spelling of `tokenizer` in model tests

* Remove erroneous spaces in decorators

* Remove extra spaces in Markdown link texts
2024-02-02 08:45:00 +01:00
Steven Liu
2418c64a1c [docs] HfQuantizer (#28820)
* tidy

* fix path
2024-02-02 08:22:18 +01:00
Steven Liu
abbffc4525 [docs] Backbone (#28739)
* backbones

* fix path

* fix paths

* fix code snippet

* fix links
2024-02-01 09:16:16 -08:00
Rockerz
23ea6743f2 Add models from deit (#28302)
* Add models

* Add 2 more models

* add models to toctree

* Add models

* Update docs/source/ja/model_doc/detr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/deit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/deplot.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix bugs

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-02-01 09:15:55 -08:00
zspo
d98591a12b [docs] fix some bugs about parameter description (#28806)
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
2024-02-01 16:59:29 +00:00
Sangbum Daniel Choi
e19c12e094 enable gradient checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin (#28615)
* enable gradient checkpointing in DetaObjectDetection

* fix missing part in original DETA

* make style

* make fix-copies

* Revert "make fix-copies"

This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358.

* remove fix-copies of DetaDecoder

* enable swin gradient checkpointing

* fix gradient checkpointing in donut_swin

* add tests for deta/swin/donut

* Revert "fix gradient checkpointing in donut_swin"

This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d.

* change supports_gradient_checkpointing pipeline to PreTrainedModel

* Revert "add tests for deta/swin/donut"

This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b.

* Revert "Revert "fix gradient checkpointing in donut_swin""

This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f.

* Simple revert

* enable deformable detr gradient checkpointing

* add gradient in encoder
2024-02-01 15:07:44 +00:00
Matt
7bc6d76396 Add tip on setting tokenizer attributes (#28764)
* Add tip on setting tokenizer attributes

* Grammar

* Remove the bit that was causing doc builds to fail
2024-02-01 14:44:58 +00:00
fxmarty
709dc43239 Fix symbolic_trace with kv cache (#28724)
* fix symbolic_trace with kv cache

* comment & better test
2024-02-01 09:45:02 +01:00