Matthijs Hollemans
2faa09530b
fix Whisper tests on GPU ( #23753 )
...
* move input features to GPU
* skip these tests because undefined behavior
* unskip tests
2023-05-30 09:06:58 -04:00
Eli Simhayev
4b6a5a7caa
[Time-Series] Autoformer model ( #21891 )
...
* ran `transformers-cli add-new-model-like`
* added `AutoformerLayernorm` and `AutoformerSeriesDecomposition`
* added `decomposition_layer` in `init` and `moving_avg` to config
* added `AutoformerAutoCorrelation` to encoder & decoder
* removed canonical self attention `AutoformerAttention`
* added arguments in config and model tester. Init works! 😁
* WIP autoformer attention with autocorrelation
* fixed `attn_weights` size
* wip time_delay_agg_training
* fixing sizes and debug time_delay_agg_training
* aggregation in training works! 😁
* `top_k_delays` -> `top_k_delays_index` and added `contiguous()`
* wip time_delay_agg_inference
* finish time_delay_agg_inference 😎
* added resize to autocorrelation
* bug fix: added the length of the output signal to `irfft`
* `attention_mask = None` in the decoder
* fixed test: changed attention expected size, `test_attention_outputs` works!
* removed unnecessary code
* apply AutoformerLayernorm in final norm in enc & dec
* added series decomposition to the encoder
* added series decomp to decoder, with inputs
* added trend todos
* added autoformer to README
* added to index
* added autoformer.mdx
* remove scaling and init attention_mask in the decoder
* make style
* fix copies
* make fix-copies
* initial fix-copies
* fix from https://github.com/huggingface/transformers/pull/22076
* make style
* fix class names
* added trend
* added d_model and projection layers
* added `trend_projection` source, and decomp layer init
* added trend & seasonal init for decoder input
* AutoformerModel cannot be copied as it has the decomp layer too
* encoder can be copied from time series transformer
* fixed generation and made distrb. out more robust
* use context window to calculate decomposition
* use the context_window for decomposition
* use output_params helper
* clean up AutoformerAttention
* subsequences_length off by 1
* make fix copies
* fix test
* added init for nn.Conv1d
* fix IGNORE_NON_TESTED
* added model_doc
* fix ruff
* ignore tests
* remove dup
* fix SPECIAL_CASES_TO_ALLOW
* do not copy due to conv1d weight init
* remove unused imports
* added short summary
* added label_length and made the model non-autoregressive
* added params docs
* better doc for `factor`
* fix tests
* renamed `moving_avg` to `moving_average`
* renamed `factor` to `autocorrelation_factor`
* make style
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
* fix configurations
* fix integration tests
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* fixing `lags_sequence` doc
* Revert "fixing `lags_sequence` doc"
This reverts commit 21e34911e36a6f8f45f25cbf43584a49e5316c55.
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* model layers now take the config
* added `layer_norm_eps` to the config
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* added `config.layer_norm_eps` to AutoformerLayernorm
* added `config.layer_norm_eps` to all layernorm layers
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* fix variable names
* added initial pretrained model
* added use_cache docstring
* doc strings for trend and use_cache
* fix order of args
* imports on one line
* fixed get_lagged_subsequences docs
* add docstring for create_network_inputs
* get rid of layer_norm_eps config
* add back layernorm
* update fixture location
* fix signature
* use AutoformerModelOutput dataclass
* fix pretrain config
* no need as default exists
* subclass ModelOutput
* remove layer_norm_eps config
* fix test_model_outputs_equivalence test
* test hidden_states_output
* make fix-copies
* Update src/transformers/models/autoformer/configuration_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* removed unused attr
* Update tests/models/autoformer/test_modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/autoformer/modeling_autoformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* use AutoFormerDecoderOutput
* fix formatting
* fix formatting
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com >
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-30 10:23:32 +02:00
Sanchit Gandhi
d8222be57e
[Whisper] Reduce batch size in tests ( #23736 )
2023-05-24 17:31:25 +01:00
Matt
f8b2574416
Better TF docstring types ( #23477 )
...
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
2023-05-24 13:52:52 +01:00
Yih-Dar
de5f86e59d
Skip TFCvtModelTest::test_keras_fit_mixed_precision for now ( #23699 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-23 20:47:47 +02:00
LWprogramming
3d57404464
is_batched fix for remaining 2-D numpy arrays ( #23309 )
...
* Fix is_batched code to allow 2-D numpy arrays for audio
* Tests
* Fix typo
* Incorporate comments from PR #23223
2023-05-23 14:37:35 -04:00
Younes Belkada
42baa58f90
[SAM] Fixes pipeline and adds a dummy pipeline test ( #23684 )
...
* add a dummy pipeline test
* change test name
2023-05-23 17:36:49 +02:00
Yih-Dar
71a5ed3433
Fix a BridgeTower test ( #23694 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-23 17:32:57 +02:00
Yih-Dar
abf691aac0
Fix PyTorch SAM tests ( #23682 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-23 14:48:38 +02:00
Matt
26a06814a1
Fix SAM tests and use smaller checkpoints ( #23656 )
...
* Fix SAM tests and use smaller checkpoints
* Override test_model_from_pretrained to use sam-vit-base as well
* make fixup
2023-05-22 19:42:35 +02:00
LWprogramming
5de2a6d5e5
Fix wav2vec2 is_batched check to include 2-D numpy arrays ( #23223 )
...
* Fix wav2vec2 is_batched check to include 2-D numpy arrays
* address comment
* Add tests
* oops
* oops
* Switch to np array
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* Switch to np array
* condition merge
* Specify mono channel only in comment
* oops, add other comment too
* make style
* Switch list check from falsiness to empty
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
2023-05-22 12:57:45 -04:00
Younes Belkada
3cb9309024
[Blip] Remove redundant shift right ( #23153 )
...
* remove redundant shift right
* fix failing tests
* this time fix tests
2023-05-19 19:14:16 +02:00
Matt
1c460a5273
TF port of the Segment Anything Model (SAM) ( #22970 )
...
* First commit
* Add auto-translation with GPT-4
* make fixup
* Add a functional layernorm for TF
* Add all the auxiliary imports etc.
* Add the extra processor and tests
* rebase to main
* Add all the needed fixes to the GPT code
* make fixup
* Make convolutions channels-last so they run on CPU
* make fixup
* Fix final issues
* Fix other models affected by test change
* Clarify comment on the sparse_prompt_embeddings check
* Refactor functional_layernorm, use shape_list in place of .shape in some places
* Remove deprecated torch-alike code
* Update tests/models/sam/test_modeling_tf_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update tests/models/sam/test_modeling_tf_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Refactor processor with common methods and separated private methods
* make fixup
* Quietly delete the file that didn't do anything (sorry Sylvain)
* Refactor the processor tests into one file
* make fixup
* Clean up some unnecessary indirection
* Fix TF mask postprocessing
* Add more processor equivalence tests
* Refactor generate_crop_boxes to use framework-neutral np code
* Make the serving output correctly conditional
* Fix error message line length
* Use dict keys rather than indices internally in both TF and PT SAM call/forward
* Return dicts internally in the call/forward methods
* Revert changes to common tests and just override check_pt_tf_outputs
* Revert changes to other model tests
* Clarify comments for functional layernorm
* Add missing transpose from PT code
* Removed unused copied from in PT code
* Remove overrides for tests that don't exist in TF
* Fix transpose and update tests for PT and TF to check pred_masks
* Add training flag
* Update tests to use TF checkpoints
* Update index.mdx
* Add missing cross-test decorator
* Remove optional extra asterisks
* Revert return_dict changes in PT code
* Update src/transformers/models/sam/modeling_tf_sam.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Remove None return annotations on init methods
* Update tests/models/sam/test_processor_sam.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Fix input_boxes shapes
* make fixup
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-05-19 14:14:13 +01:00
Connor Henderson
2acedf4721
feat: Whisper prompting ( #22496 )
...
* initial working additions
* clean and rename, add cond stripping initial prompt to decode
* cleanup, edit create_initial_prompt_ids, add tests
* repo consistency, flip order of conditional
* fix error, move the processor fn to the tokenizer
* repo consistency, update test ids to corresponding tokenizer
* use convert_tokens_to_ids not get_vocab...
* use actual conditional in generate
* make style
* initial address comments
* initial working add new params to pipeline
* first draft of sequential generation for condition_on_previous_text
* add/update tests, make compatible with timestamps
* make compatible with diff. input kwargs and max length
* add None check
* add temperature check
* flip temp check operand
* refocusing to prev pr scope
* remove the params too
* make style
* edits, move max length incorporating prompt to whisper
* address comments
* remove asr pipeline prompt decoding, fix indexing
* address comments (more tests, validate prompt)
* un-comment out tests (from debug)
* remove old comment
* address comments
* fix typo
* remove timestamp token from test
* make style
* cleanup
* copy method to fast tokenizer, set max_new_tokens for test
* prompt_ids type just pt
* address Amy's comments
* make style
2023-05-19 09:33:11 +01:00
Yih-Dar
ffad4f1373
Update tiny models and pipeline tests ( #23446 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-18 17:29:04 +02:00
Joao Gante
aea7b23b57
Generate: skip left-padding tests on old models ( #23437 )
2023-05-18 11:04:51 +01:00
Yih-Dar
a8732e09bb
Fix device issue in SwiftFormerModelIntegrationTest::test_inference_image_classification_head ( #23435 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-17 19:48:18 +02:00
Yih-Dar
939a65aba7
Update Bigbird Pegasus tests ( #23431 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-17 18:14:29 +02:00
IMvision12
ebb649a4e3
Add Missing tokenization test [electra] ( #22997 )
...
* Create test_tokenization_electra.py
* Update tests/models/electra/test_tokenization_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-05-17 10:45:15 -04:00
Younes Belkada
3d3c7d4213
[SAM] fix sam slow test ( #23376 )
...
* fix sam slow test
* oops
* fix error message
2023-05-17 14:27:43 +02:00
Yih-Dar
46d2468695
Update ConvNextV2ModelIntegrationTest::test_inference_image_classification_head ( #23402 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-16 23:35:11 +02:00
Joao Gante
918a06e25d
Generate: add test to check KV format ( #23403 )
...
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-16 19:28:19 +01:00
Yih-Dar
21741e8c7e
Update test_batched_inference_image_captioning_conditioned ( #23391 )
...
* fix
* fix
* fix test + add more docs
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
2023-05-16 14:49:24 +02:00
LWprogramming
ee3be05310
Fix test typos - audio feature extractors ( #23310 )
2023-05-15 17:22:10 +01:00
Yih-Dar
8f76dc8e5a
Skip failing AlignModelTest::test_multi_gpu_data_parallel_forward ( #23374 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-05-15 16:46:58 +02:00
Shehan Munasinghe
c045249049
Add swiftformer ( #22686 )
...
* Commit the automatically generated code
using add-new-model-like
* Update description at swiftformer.mdx file
* remove autogenerated code for MaskedImageModeling
* update weight conversion scripts
* Update modeling_swiftformer.py
* update configuration_swiftformer.py
* Update test_modeling_swiftformer.py
* update modeling code - remove einops dependency
* Update _toctree.yml
* update modeling code - remove copied from comments
* update docs
* Revert "update docs"
This reverts commit c2e05e2998fe2cd6eaee8b8cc31aca5222bac9fb.
* update docs
* remove unused reference SwiftFormerImageProcessor
* update dependency_versions_table.py
* update swiftformer.mdx
* update swiftformer.mdx
* change model output type - no attentions
* update model org name
* Fix typo
* fix copies
* Update tests/models/swiftformer/test_modeling_swiftformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/auto/image_processing_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/auto/feature_extraction_auto.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update docs/source/en/model_doc/swiftformer.mdx
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/swiftformer/configuration_swiftformer.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update modeling_swiftformer.py
fix-copies
* make style, make quality, fix-copies
* Apply suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Apply suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* make style
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* make fix-copies
* Update modeling_swiftformer.py
* Update modeling_swiftformer.py
* Add suggestions from code review
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-12 11:52:31 +01:00
amyeroberts
e1eb3efd02
Temporarily increase tol for PT-FLAX whisper tests ( #23288 )
2023-05-11 11:43:18 +01:00
amyeroberts
f82ee109e6
Temporary tolerance fix for flaky whisper PT-TF equiv. test ( #23257 )
...
* Temp tol fix for flaky whisper test
* Add equivalent update to TF tests
2023-05-11 10:04:07 +01:00
Sylvain Gugger
b4d4d6fe87
Add RWKV-4 ( #22797 )
...
* First draft of RWKV-4
* Add support for generate
* Style post-rebase
* Properly use state
* Write doc
* Fix doc
* More math
* Add model to README, dummies and clean config
* Fix init
* multiple fixes:
- fix common tests
- fix configuration default values
- add CI test for checking state computation
- fix some CI tests
* correct tokenizer
* some tweaks
- fix config docstring
- fix failing tests
* fix CI tests
- add output_attention / output_hidden_states
- override test_initialization
- fix failing CIs
* fix conversion script
- fix sharded case
- add new arguments
* add slow tests + more fixes on conversion script
* add another test
* final fixes
* change single name variable
* add mock attention mask for pipeline to work
* correct eos token id
* fix nits
* add checkpoints
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* add `tie_word_embeddings` in docstring
* change tensor name
* fix final nits
* Trigger CI
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-09 13:04:10 -04:00
Matthijs Hollemans
7f91950901
audio_utils improvements ( #21998 )
...
* silly change to allow making a PR
* clean up doc comments
* simplify hertz_to_mel and mel_to_hertz
* fixup
* clean up power_to_db
* also add amplitude_to_db
* move functions
* clean up mel_filter_bank
* fixup
* credit librosa & torchaudio authors
* add unit tests
* tests for power_to_db and amplitude_to_db
* add mel_filter_bank tests
* rewrite STFT
* add convenience spectrogram function
* missing transpose
* fewer transposes
* add integration test to M-CTC-T
* frame length can be either window or FFT length
* rewrite stft API
* add preemphasis coefficient
* move argument
* add log option to spectrogram
* replace M-CTC-T feature extractor
* fix api thing
* replace whisper STFT
* replace whisper mel filters
* replace tvlt's stft
* allow alternate window names
* replace speecht5 stft
* fixup
* fix integration tests
* fix doc comments
* remove manual FFT length calculation
* fix docs
* go away, deprecation warnings
* combine everything into spectrogram function
* add deprecated functions back
* fixup
2023-05-09 09:10:17 -04:00
Bartosz Szmelczynski
6f8a02844a
fix random attention for pytorch's bigbird/pegasus_bigbird ( #23056 )
...
* fix random attention usage for bigbird and pegasus_bigbird
* remove staticmethod, update tests target values
* revert style changes
2023-05-07 18:55:04 -04:00
raghavanone
312b104ff6
Add FlaxWhisperForAudioClassification model ( #23173 )
...
* Add FlaxWhisperForAudioClassification model
* Add models to init
* Add models to init
* Fix copies
* Fix automapping
* Fix failing test
2023-05-05 13:23:46 -04:00
Connor Henderson
17083b9b84
fix: Passing language as acronym to Whisper generate ( #23141 )
...
* add fix
* address comments
* remove error formatting
2023-05-05 11:52:19 -04:00
Sylvain Gugger
01734dba84
Revert "Add FlaxWhisperForAudioClassification model" ( #23154 )
...
Revert "Add FlaxWhisperForAudioClassification model (#22883)"
This reverts commit c8f2c5c56e.
2023-05-04 13:47:07 -04:00
raghavanone
c8f2c5c56e
Add FlaxWhisperForAudioClassification model ( #22883 )
...
* Add FlaxWhisperForAudioClassification model
* Add models to init
* Add models to init
* Fix copies
* Fix automapping
2023-05-04 13:00:16 -04:00
peter-sk
83b38fbea8
GPTNeoXForQuestionAnswering ( #23059 )
...
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* initial commit
* formatting
* adding the class to many places
* towards less unhappy checks
* nearly there
* and gpt neox for qa
* use right model
* forgot this one
* base_model_prefix is "gpt_neox" for GPTNeoX* models
* unnecessary stuff
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* format
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* removed gpt2 stuff
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-04 10:15:15 -04:00
peter-sk
78b7debf56
GPTNeoForQuestionAnswering ( #23057 )
...
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* initial commit
* formatting
* adding the class to many places
* towards less unhappy checks
* nearly there
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* avoid error
* moving to device of start/end_logits
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-03 15:59:19 -04:00
Alara Dirik
441658dd6c
Add focalnet backbone ( #23104 )
...
Adds FocalNet backbone to return features from all stages
2023-05-03 19:32:42 +03:00
Joao Gante
ce31e3c8bf
Generate: slow assisted generation test ( #23125 )
2023-05-03 14:24:50 +01:00
peter-sk
2b0c924568
GPT2ForQuestionAnswering ( #23030 )
...
* first draft - gives index error in question_answering.py
* maturing
* no labels
* pipeline should know about QA
* fixing checks
* formatting
* fixed docstring
* make sure legacy code executes
* comment
* like this
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com >
2023-05-02 09:25:46 -04:00
Ashwin Mathur
487f132a6f
Add BioGPTForSequenceClassification ( #22253 )
...
* added BioGptForSequenceClassification
* added source of copied code
* typo
* Format code with black
* Update comments for copied code
* Remove code copy comment
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Fix failing tests
* Update code copied from comments
* Fix code quality
* Update src/transformers/models/biogpt/modeling_biogpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Fix lint error
* Update src/transformers/models/biogpt/modeling_biogpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Rename model to biogpt for consistency
* Add PipelineTesterMixin to test_modeling_biogpt.py
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Resolve merge conflict
---------
Co-authored-by: Guillem García Subies <37592763+GuillemGSubies@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-05-01 09:17:27 -04:00
s-JoL
c2c99dc7ef
add open-llama model with ckpt ( #22795 )
...
* update Open-Llama model
* update
* update format
* update doc
* update
* update stable embedding test
* update test case
* update format
* update readme
* fix typo
* update name
* remove tokenizer and update format
* remove convert_open_llama_weights_to_hf
* update warning and doc_string
---------
Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com >
2023-04-28 11:01:32 -04:00
Yih-Dar
0bf34b1c9f
Skip pt/flax equivalence tests in pytorch bigbird test file ( #23040 )
...
skip
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-04-28 17:00:13 +02:00
Bartosz Szmelczynski
88399476c3
Fix bigbird random attention ( #21023 )
...
* switch np.random.permutation to jax.random.permutation
* remove comments
* remove leftover comment
* skip similarity tests
* modify indices_prng_key usage, add deterministic behaviour
* update style
* remove unused import
* remove copy statement since classes are not identical
* remove numpy import
* revert removing copied from statements
* make style from copied
* remove copied from statement
* update copied from statement to include only np.ndarray
* add deterministic args, unittestskip equivalence tests
2023-04-27 13:52:28 -04:00
Yih-Dar
27b66bea01
Update BridgeTowerModelTester ( #23029 )
...
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-04-27 18:26:17 +02:00
peter-sk
d65b14ed67
added GPTNeoForTokenClassification ( #22908 )
...
* added GPTNeoForTokenClassification
* add to top-level init
* fixup
* test
* more fixup
* add to gpt_neo.mdx
* repo consistency
* dummy copy
* fix copies
* optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6
* merge with main made this superfluous
* added classifier_dropout
* remove legacy code
* removed fmt:on/off
removed expected_outputs
* doc style fix
* classifier_dropout is always in config
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com >
2023-04-27 12:10:03 -04:00
peter-sk
614e191c4d
added GPTNeoXForTokenClassification ( #23002 )
...
* initial commit
* added GPTNeoXForTokenClassification
* typo
* doc
fixed extra comma that turned into a tuple
* unifying variable names
fixing forward call
* classifier_dropout is in config
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
---------
Co-authored-by: Prof. Peter Schneider-Kamp <jps@ordbogen.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-04-27 11:08:26 -04:00
Younes Belkada
304aacac90
🚨 🚨 🚨 [Pix2Struct] Attempts to fix training issues 🚨 🚨 🚨 ( #23004 )
...
* multiple fixes
- add `add_special_tokens` to `True` by default
- remove label smoothing and labels masking
* fix test
2023-04-26 18:29:25 +02:00
Ritik Nandwal
20ac86c6f1
Add TensorFlow Wav2Vec2 for sequence classification ( #22073 )
...
* Add initial changes for TF wav2vec2 for sequence classification
* Add suggested changes
* Add serving and serving output methods
* Add serving_output implementation and fix layer_weights
* Add fixes
* Fixed test cases
* Fixing test and adding suggested changes
2023-04-26 13:35:30 +01:00
Yih-Dar
3f6a4b5bd7
Decorate test_codegen_sample_max_time as flaky ( #22953 )
...
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-04-24 15:27:31 +02:00