NielsRogge
762af3e3c7
Add OWLv2, bis ( #26668 )
...
* First draft
* Update conversion script
* Update copied from statements
* Fix style
* Add copied from to config
* Add copied from to processor
* Run make fixup
* Add docstring
* Update docstrings
* Add method
* Improve docstrings
* Fix docstrings
* Improve docstrings
* Remove onnx
* Add flag
* Address comments
* Add copied from to model tests
* Add flag to conversion script
* Add code snippet
* Address more comments
* Address comment
* Improve conversion script
* More improvements
* Add expected objectness logits
* Skip test
* Improve conversion script
* Extend conversion script
* Convert large checkpoint
* Fix doc tests
* Convert all checkpoints, update integration tests
* Add checkpoint_path arg
* Fix repo_id
2023-10-13 16:41:24 +02:00
Matt
bdb391e9c6
Fix Falcon generation test ( #26770 )
2023-10-13 15:10:27 +01:00
Matt
c9785d956b
Disable default system prompt for LLaMA ( #26765 )
...
* Disable default system prompt for LLaMA
* Update test to not expect default prompt
2023-10-13 14:48:38 +01:00
Yih-Dar
21da3b2461
Update expect outputs of IdeficsProcessorTest.test_tokenizer_padding ( #26779 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-13 09:52:10 +02:00
Yih-Dar
3e93dd295b
Skip TrainerIntegrationFSDP::test_basic_run_with_cpu_offload if torch < 2.1 ( #26764 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-12 18:22:09 +02:00
Heinz-Alexander Fuetterer
883ed4b344
chore: fix typos ( #26756 )
2023-10-12 18:00:27 +02:00
Yih-Dar
a243cdca2a
Fix PerceiverModelIntegrationTest::test_inference_masked_lm ( #26760 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-12 17:43:06 +02:00
Yih-Dar
db5e0c3292
Fix MistralIntegrationTest OOM ( #26754 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-12 12:31:11 +02:00
Yih-Dar
72256bc72a
Fix PersimmonIntegrationTest OOM ( #26750 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-12 11:24:18 +02:00
Tom Aarsen
40ea9ab2a1
Add many missing spaces in adjacent strings ( #26751 )
...
Add missing spaces in adjacent strings
2023-10-12 10:28:40 +02:00
Patrick von Platen
da69de17e8
[Assistant Generation] Improve Encoder Decoder ( #26701 )
...
* [Assistant Generation] Improve enc dec
* save more
* Fix logit processor checks
* Clean
* make style
* fix deprecation
* fix generation test
* Apply suggestions from code review
* fix biogpt
* make style
2023-10-11 15:52:20 +02:00
Yih-Dar
5334796d20
Copied from for test files (#26713 )
...
* copied statement for test files
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-11 14:12:09 +02:00
Billy Bradley
dcc49d8a7e
In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) ( #25242 )
...
* In assisted decoding, pass model_kwargs to model's forward call
Previously, assisted decoding would ignore any additional kwargs
that it doesn't explicitly handle. This was inconsistent with other
generation methods, which pass the model_kwargs through
prepare_inputs_for_generation and forward the returned dict to the
model's forward call.
The prepare_inputs_for_generation method needs to be amended in all
models, as previously it only kept the last input ID when a past_key_values
was passed.
* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
2023-10-11 13:18:42 +02:00
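The behavior change described in #25242 can be illustrated with a minimal sketch. This is not the library's code: the helper names `keep_last_token_only` and `trim_unprocessed_tokens` are hypothetical, plain Python lists stand in for tensors, and `past_length` is assumed to be the number of tokens already held in the key/value cache.

```python
from typing import List


def keep_last_token_only(input_ids: List[int]) -> List[int]:
    """Old behavior as described in the commit: with a cache, keep one token."""
    return input_ids[-1:]


def trim_unprocessed_tokens(input_ids: List[int], past_length: int) -> List[int]:
    """Keep every token the cache has not seen yet (hypothetical helper).

    Assisted decoding appends several candidate tokens in one step, so more
    than one token may still need a forward pass.
    """
    if past_length <= 0:
        # No cache yet: the whole prompt must be processed.
        return input_ids
    if past_length >= len(input_ids):
        # Cache already covers everything: fall back to the last token.
        return input_ids[-1:]
    return input_ids[past_length:]


if __name__ == "__main__":
    prompt_plus_candidates = [11, 12, 13, 14, 15]
    # Three tokens are cached, two candidate tokens are new.
    print(trim_unprocessed_tokens(prompt_plus_candidates, past_length=3))
    print(keep_last_token_only(prompt_plus_candidates))
```

With three cached tokens and two candidates, the old rule would drop one of the candidates, which is the inconsistency the commit message refers to.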
Thien Tran
1e3c9ddacc
Make Whisper Encoder's sinusoidal PE non-trainable by default ( #26032 )
...
* set encoder's PE as non-trainable
* freeze flax
* init sinusoids
* add test for non-trainable embed positions
* simplify TF encoder embed_pos
* revert tf
* clean up
* add sinusoidal init for jax
* make consistent sinusoidal function
* fix dtype
* add default dtype
* use numpy for sinusoids. fix jax
* add sinusoid init for TF
* fix
* use custom embedding
* use specialized init for each impl
* fix sinusoids init. add test for pytorch
* fix TF dtype
* simplify sinusoid init for flax and tf
* add tests for TF
* change default dtype to float32
* add sinusoid test for flax
* Update src/transformers/models/whisper/modeling_flax_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* Update src/transformers/models/whisper/modeling_tf_whisper.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* move sinusoidal init to _init_weights
---------
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co >
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
2023-10-11 09:08:54 +01:00
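For readers unfamiliar with the sinusoidal position table that #26032 refers to, here is a rough sketch of how such a table can be built. It uses only the standard library rather than numpy, the function name `sinusoids` and the `max_timescale` default of 10000 are assumptions for illustration, and the exact layout (all sines followed by all cosines) may differ from the repository's implementation.

```python
import math
from typing import List


def sinusoids(length: int, channels: int, max_timescale: float = 10000.0) -> List[List[float]]:
    """Build a fixed (length x channels) table of sinusoidal position values."""
    if channels % 2 != 0 or channels < 4:
        raise ValueError("channels must be an even number >= 4")
    half = channels // 2
    # Timescales are spaced geometrically between 1 and max_timescale.
    log_increment = math.log(max_timescale) / (half - 1)
    inv_timescales = [math.exp(-log_increment * i) for i in range(half)]
    table = []
    for position in range(length):
        scaled = [position * inv for inv in inv_timescales]
        # First half of each row holds sines, second half holds cosines.
        table.append([math.sin(s) for s in scaled] + [math.cos(s) for s in scaled])
    return table


if __name__ == "__main__":
    for row in sinusoids(length=3, channels=4):
        print([round(value, 4) for value in row])
```

Because the table is a deterministic function of position, it carries nothing to learn, which is presumably why the commit marks the encoder's position embedding as non-trainable by default.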
Shreyas S
86a4e5a96b
Fixed malapropism error ( #26660 )
...
Update test_integration.py
Fixed malapropism clone>copy
2023-10-09 11:04:57 +02:00
Arthur
9ad815e412
[LlamaTokenizerFast] Adds edge cases for the template processor ( #26606 )
...
* make sure eos and bos are properly handled for fast tokenizer
* fix code llama as well
* nits
* fix the conversion script as well
* fix failing test
2023-10-06 16:40:54 +02:00
statelesshz
27597fea07
remove SharedDDP as it is deprecated ( #25702 )
...
* remove SharedDDP as it was deprecated
* apply review suggestion
* make style
* Oops, forgot to remove the compute_loss context manager in Seq2SeqTrainer.
* remove the unnecessary conditional statement
* keep the logic of IPEX
* clean code
* mix precision setup & make fixup
---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com >
2023-10-06 16:03:11 +02:00
Yih-Dar
e840aa67e8
Fix failing MusicgenTest.test_pipeline_text_to_audio ( #26586 )
...
* fix
* fix
* Fix
* Fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-06 15:53:59 +02:00
fxmarty
64845307b3
Remove unnecessary unsqueeze - squeeze in rotary positional embedding ( #26162 )
...
* remove unnecessary unsqueeze-squeeze in llama
* correct other models
* fix
* revert gpt_neox_japanese
* fix copies
* fix test
2023-10-06 18:25:15 +09:00
Tianqi Liu
65aabafe2f
Update tokenization_code_llama_fast.py ( #26576 )
...
* Update tokenization_code_llama_fast.py
* Update test_tokenization_code_llama.py
* Update test_tokenization_code_llama.py
2023-10-06 10:49:02 +02:00
Towdo
af38c837ee
Fixed inconsistency in several fast tokenizers ( #26561 )
2023-10-06 10:40:47 +02:00
Marvin Gabler
0a3b9d02fe
#26566 swin2 sr allow in out channels ( #26568 )
...
* feat: close #26566 , changed model & config files to accept arbitrary in and out channels
* updated docstrings
* fix: linter error
* fix: update Copy docstrings
* fix: linter update
* fix: rename num_channels_in to num_channels to prevent breaking changes
* fix: make num_channels_out None per default
* Update src/transformers/models/swin2sr/configuration_swin2sr.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix: update tests to include num_channels_out
* fix: linter
* fix: remove normalization with precomputed rgb values when #input_channels!=#output_channels
---------
Co-authored-by: marvingabler <marvingabler@outlook.de >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-10-05 15:20:38 +02:00
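The two config-related bullets in #26568 (keeping the name `num_channels` and making `num_channels_out` default to `None`) suggest a fallback pattern along the following lines. This is only a sketch: the class `ChannelConfigSketch` is hypothetical, and the assumption that a `None` output count falls back to the input count is an inference from the commit message, not something the log states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelConfigSketch:
    """Hypothetical stand-in for a config with separate in/out channel counts."""

    num_channels: int = 3  # input channels; name kept for backward compatibility
    num_channels_out: Optional[int] = None  # None means "not specified"

    def __post_init__(self) -> None:
        # Assumed fallback: an unspecified output count mirrors the input count,
        # so existing configs without num_channels_out keep working.
        if self.num_channels_out is None:
            self.num_channels_out = self.num_channels


if __name__ == "__main__":
    print(ChannelConfigSketch())
    print(ChannelConfigSketch(num_channels=1, num_channels_out=3))
```

Defaulting the new field to `None` rather than a fixed number lets older configs load unchanged while still allowing asymmetric channel counts.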
Younes Belkada
e6d250e4cd
[core] fix silent bug keep_in_fp32 modules ( #26589 )
...
* fix silent bug `keep_in_fp32` modules
* final fix
* added a common test.
* Trigger CI
* revert
2023-10-05 14:44:31 +02:00
Yih-Dar
54e17a15dc
Fix failing tests on main due to torch 2.1 ( #26607 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-10-05 10:27:05 +02:00
Arthur
c037b2e340
skip flaky hub tests ( #26594 )
...
skip flaky
2023-10-04 17:47:55 +02:00
dg845
9deb18ca1a
Add # Copied from statements to audio feature extractors that use the floats_list function ( #26581 )
...
Add # Copied from statements to audio feature extractors that use the floats_list function.
2023-10-04 17:09:48 +02:00
Sylvain Gugger
03af4c42a6
Docstring check ( #26052 )
...
* Fix number of minimal calls to the Hub with peft integration
* Alternate design
* And this way?
* Revert
* Nits to fix
* Add util
* Print when changes are made
* Add list to ignore
* Add more rules
* Manual fixes
* deal with kwargs
* deal with enum defaults
* avoid many digits for floats
* Manual fixes
* Fix regex
* Fix regex
* Auto fix
* Style
* Apply script
* Add ignored list
* Add check that templates are filled
* Adding to CI checks
* Add back semi-fix
* Ignore more objects
* More auto-fixes
* Ignore missing objects
* Remove temp semi-fix
* Fixes
* Update src/transformers/models/pvt/configuration_pvt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update utils/check_docstrings.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/utils/quantization_config.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Deal with float defaults
* Fix small defaults
* Address review comment
* Treat
* Post-rebase cleanup
* Address review comment
* Update src/transformers/models/deprecated/mctct/configuration_mctct.py
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr >
* Address review comment
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr >
2023-10-04 15:13:37 +02:00
Lysandre Debut
5c66378cea
[Tokenizers] Skip tests temporarily ( #26574 )
...
* Skip tests temporarily
* style
* Add additional test
2023-10-03 19:43:42 +02:00
Sanchit Gandhi
57f44dc428
[Whisper] Allow basic text normalization ( #26149 )
...
* [Whisper] Allow basic text normalization
* up
* style copies
2023-10-03 17:57:16 +01:00
Younes Belkada
2aef9a9601
[PEFT] Final fixes ( #26559 )
...
* fix issues with PEFT
* logger warning futurewarning issues
* fixup
* adapt from suggestions
* oops
* rm test
2023-10-03 14:53:09 +02:00
Younes Belkada
ae9a344cce
[Mistral] Add Flash Attention-2 support for mistral ( #26464 )
...
* add FA-2 support for mistral
* fixup
* add sliding windows
* fixing few nits
* v1 slicing cache - logits do not match
* add comment
* fix bugs
* more mem efficient
* add warning once
* add warning once
* oops
* fixup
* more comments
* copy
* add safety checker
* fixup
* Update src/transformers/models/mistral/modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* copied from
* up
* raise when padding side is right
* fixup
* add doc + few minor changes
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-10-03 13:44:46 +02:00
Sanchit Gandhi
768aa3d9cd
[Wav2Vec2 and Co] Update init tests for PT 2.1 ( #26494 )
2023-10-03 10:52:34 +02:00
Nathan Cahill
b5ca8fcd20
Add tokenizer kwargs to fill mask pipeline. ( #26234 )
...
* add tokenizer kwarg inputs
* Adding tokenizer_kwargs to _sanitize_parameters
* Add truncation=True example to tests
* Update test_pipelines_fill_mask.py
* Update test_pipelines_fill_mask.py
* make fix-copies and make style
* Update fill_mask.py
Replace single tick with double
* make fix-copies
* Style
---------
Co-authored-by: Lysandre <lysandre@huggingface.co >
2023-10-03 10:25:10 +02:00
Arthur
bab3331906
Code-llama-nit ( #26300 )
...
* fix encoding when the fill token is None
* add tests and edge cases
* fixup
* Update tests/models/code_llama/test_tokenization_code_llama.py
2023-10-02 18:29:27 +02:00
Arthur
63864e057f
Fix model integration ci ( #26322 )
...
* fix wav2vec2
* nit
* stash
* one more file to update
* fix byt5
* vocab size is 256, don't change that!
* use other revision
* test persimmon in smaller size
* style
* tests
* nits
* update add tokens from pretrained
* test tokenization
* nits
* potential fnet fix?
* more nits
* nits
* correct test
* assert close
* update
* ouch
* fix it
* some more nits
* FINALLY
* use `adept` checkpoints
* more adept checkpoints
* that was involved!
2023-10-02 13:55:46 +02:00
Younes Belkada
6824461f2a
[core/auto] Fix bnb test with code revision + bug with code revision ( #26431 )
...
* fix bnb test with code revision
* fix test
* Apply suggestions from code review
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
* Update src/transformers/models/auto/auto_factory.py
2023-10-02 11:35:07 +02:00
Lysandre Debut
67239f7360
Revert falcon exception ( #26472 )
...
* Revert "Falcon: fix revision propagation (#26006 )"
This reverts commit 118c676ef3 .
* Revert "Put Falcon back (#25960 )"
This reverts commit 22a69f1d7d .
2023-10-02 09:13:19 +02:00
Yih-Dar
391177441b
Avoid all-zero attention mask used in testing ( #26469 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-09-29 11:06:06 +02:00
Yih-Dar
9b23d0de0e
Skip 2 failing persimmon pipeline tests for now ( #26485 )
...
skip
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-09-29 10:52:18 +02:00
Marc Sun
5e11d72d4d
fix_mbart_tied_weights ( #26422 )
...
* fix_mbart_tied_weights
* add test
2023-09-28 15:08:35 +02:00
Younes Belkada
38e96324ef
[PEFT] introducing adapter_kwargs for loading adapters from different Hub location (subfolder, revision) than the base model ( #26270 )
...
* make use of adapter_revision
* v1 adapter kwargs
* fix CI
* fix CI
* fix CI
* fixup
* add BC
* Update src/transformers/integrations/peft.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixup
* change it to error
* Update src/transformers/modeling_utils.py
* Update src/transformers/modeling_utils.py
* fixup
* change
* Update src/transformers/integrations/peft.py
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-09-28 11:13:03 +02:00
Chris Bamford
72958fcd3c
[Mistral] Mistral-7B-v0.1 support ( #26447 )
...
* [Mistral] Mistral-7B-v0.1 support
* fixing names
* slightly longer test
* fixups
* not_doctested
* wrongly formatted references
* make fixuped
---------
Co-authored-by: Timothee Lacroix <t@eugen.ai >
Co-authored-by: timlacroix <t@mistral.ai >
2023-09-27 18:30:46 +02:00
Younes Belkada
3ca18d6d09
[PEFT] Fix PEFT multi adapters support ( #26407 )
...
* fix PEFT multi adapters support
* refactor a bit
* save pretrained + BC + added tests
* Update src/transformers/integrations/peft.py
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
* add more tests
* add suggestion
* final changes
* adapt a bit
* fixup
* Update src/transformers/integrations/peft.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* adapt from suggestions
---------
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
2023-09-27 16:45:31 +02:00
Younes Belkada
153755ee38
[FA / tests] Add use_cache tests for FA models ( #26415 )
...
* add use_cache tests for FA
* fixup
2023-09-27 12:21:54 +02:00
Shauray Singh
abd2531034
Fix padding for IDEFICS ( #26396 )
...
* fix
* fixup
* tests
* fixup
2023-09-27 10:56:07 +02:00
sanjeevk-os
6ce6a5adb9
added support for gradient checkpointing in ESM models ( #26386 )
2023-09-26 10:15:53 +02:00
NielsRogge
ace74d16bd
Add Nougat ( #25942 )
...
* Add conversion script
* Add NougatImageProcessor
* Add crop margin
* More improvements
* Add docs, READMEs
* Remove print statements
* Include model_max_length
* Add NougatTokenizerFast
* Fix imports
* Improve postprocessing
* Improve image processor
* Fix image processor
* Improve normalize method
* More improvements
* More improvements
* Add processor, improve docs
* Simplify fast tokenizer
* Remove test file
* Fix docstrings
* Use NougatProcessor in conversion script
* Add is_levensthein_available
* Add tokenizer tests
* More improvements
* Use numpy instead of opencv
* Add is_cv2_available
* Fix cv2_available
* Add is_nltk_available
* Add image processor tests, improve crop_margin
* Add integration tests
* Improve integration test
* Use do_rescale instead of hacks, thanks Amy
* Remove random_padding
* Address comments
* Address more comments
* Add import
* Address more comments
* Address more comments
* Address comment
* Address comment
* Set max_model_input_sizes
* Add tests
* Add requires_backends
* Add Nougat to exotic tests
* Use to_pil_image
* Address comment regarding nltk
* Add NLTK
* Improve variable names, integration test
* Add test
* refactor, document, and test regexes
* remove named capture groups, add comments
* format
* add non-markdown fixed tokenization
* format
* correct flakiness of args parse
* add regex comments
* test functionalities for crop_image, align long axis and expected output
* add regex tests
* remove cv2 dependency
* test crop_margin equality between cv2 and python
* refactor table regexes to markdown
add newline
* change print to log, improve doc
* fix high count tables correction
* address PR comments: naming, linting, asserts
* Address comments
* Add copied from
* Update conversion script
* Update conversion script to convert both small and base versions
* Add inference example
* Add more info
* Fix style
* Add require annotators to test
* Define all keyword arguments explicitly
* Move cv2 annotator
* Add tokenizer init method
* Transfer checkpoints
* Add reference to Donut
* Address comments
* Skip test
* Remove cv2 method
* Add copied from statements
* Use cached_property
* Fix docstring
* Add file to not doctested
---------
Co-authored-by: Pablo Montalvo <pablo.montalvo.leroux@gmail.com >
2023-09-26 07:06:04 +02:00
Yih-Dar
d9e4bc2895
Update tiny model information and pipeline tests ( #26285 )
...
* Update tiny model summary file
* add to pipeline tests
* revert
* fix import
* fix import
* fix
* fix
* update
* update
* update
* fix
* remove BarkModelTest
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-09-25 18:08:12 +02:00
LeviVasconcelos
576cd45a57
Add image to image pipeline ( #25393 )
...
* Add image to image pipeline
Add image to image pipeline
* remove swin2sr from tf auto
* make ImageToImage importable
* make style
make style
make style
make style
* remove tf support
* remove unused imports
* fix postprocessing
* add important comments; add unit tests
* add documentation
* remove support for TF
* make fixup
* fix typehint Image.Image
* fix documentation code
* address review request; fix unittest type checking
* address review request; fix unittest type checking
* make fixup
* address reviews
* Update src/transformers/pipelines/image_to_image.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
* enhance docs
* make style
* make style
* improve doctest time
* improve doctest time
* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com >
* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com >
* make fixup
* undo faulty merge
* undo faulty merge
* add image-to-image to test pipeline mixin
* Update src/transformers/pipelines/image_to_image.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update tests/pipelines/test_pipelines_image_to_image.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* improve docs
---------
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-09-22 19:53:55 +03:00
Younes Belkada
368a58e61c
[core] Integrate Flash attention 2 in most used models ( #25598 )
...
* v1
* oops
* working v1
* fixup
* add some TODOs
* fixup
* padding support + try with module replacement
* nit
* alternative design
* oops
* add `use_cache` support for llama
* v1 falcon
* nit
* a bit of refactor
* nit
* nits nits
* add v1 padding support falcon (even though it seemed to work before)
* nit
* falcon works
* fixup
* v1 tests
* nit
* fix generation llama flash
* update tests
* fix tests + nits
* fix copies
* fix nit
* test- padding mask
* style
* add more mem efficient support
* Update src/transformers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
* fixup
* nit
* fixup
* remove it from config when saving
* fixup
* revert docstring
* add more checks
* use values
* oops
* new version
* fixup
* add same trick for falcon
* nit
* add another test
* change tests
* fix issues with GC and also falcon
* fixup
* oops
* Update src/transformers/models/falcon/modeling_falcon.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* add init_rope
* updates
* fix copies
* fixup
* fixup
* more clarification
* fixup
* right padding tests
* add docs
* add FA in docker image
* more clarifications
* add some figures
* add todo
* rectify comment
* Change to FA2
* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* split in two lines
* change test name
* add more tests
* some clean up
* remove `rearrange` deps
* add more docs
* revert changes on dockerfile
* Revert "revert changes on dockerfile"
This reverts commit 8d72a66b4b9b771abc3f15a9b9506b4246d62d8e.
* revert changes on dockerfile
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re >
* address some comments
* docs
* use inheritance
* Update src/transformers/testing_utils.py
Co-authored-by: Lysandre Debut <hi@lysand.re >
* fixup
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/modeling_utils.py
* final comments
* clean up
* style
* add cast + warning for PEFT models
* fixup
---------
Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: Lysandre Debut <hi@lysand.re >
2023-09-22 17:42:10 +02:00