Francisco Kurucz
2f8acfea1c
Fix test_modeling_mpt typo in model id ( #25606 )
...
Fix model id in get_large_model_config on file test_modeling_mpt
2023-08-21 11:11:21 +02:00
ydshieh
1982dd3b15
Hotfix
2023-08-19 11:15:38 +02:00
Stas Bekman
6c811a322f
new model: IDEFICS via HuggingFaceM4 ( #24796 )
...
* rename
* restore
* mappings
* unedited tests+docs
* docs
* fixes
* fix auto-sync breakage
* cleanup
* wip
* wip
* add fetch_images
* remove einops dependency
* update
* fix
* fix
* fix
* fix
* fix
* re-add
* add batching
* rework
* fix
* improve
* add Leo as I am extending his work
* cleanup
* fix
* cleanup
* slow-test
* fix
* fix
* fixes
* deal with warning
* rename modified llama classes
* rework fetch_images
* alternative implementation
* cleanup
* strict version
* cleanup
* [`IDEFICS`] Fix idefics ci (#25056 )
* Fix IDEFICS CI
* fix test file
* fixup
* some changes to make tests pass
* fix
* fixup
* Update src/transformers/models/idefics/configuration_idefics.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
---------
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* remove compat checks
* style
* explain that Idefics is not for training from scratch
* require pt>=2.0
* fix idefics vision config (#25092 )
* fix idefics vision config
* fixup
* clean
* Update src/transformers/models/idefics/configuration_idefics.py
---------
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* cleanup
* style
* cleanup
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* upcase
* sequence of images
* handle the case with no images
* Update src/transformers/image_processing_utils.py
Co-authored-by: Victor SANH <victorsanh@gmail.com >
* support pure lm take 2
* support tokenizer options
* parameterize num_channels
* fix upcase
* s|IdeficsForCausalLM|IdeficsForVisionText2Text|g
* manual to one line
* addressing review
* unbreak
* remove clip dependency
* fix test
* consistency
* PIL import
* Idefics prefix
* Idefics prefix
* hack to make tests work
* style
* fix
* fix
* revert
* try/finally
* cleanup
* clean up
* move
* [`IDEFICS`] Fix idefics config refactor (#25149 )
* refactor config
* nuke init weights
* more refactor
* oops
* remove visual question answering pipeline support
* Update src/transformers/models/idefics/clip.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
* Update src/transformers/models/idefics/modeling_idefics.py
* cleanup
* mv clip.py vision.py
* tidyup
---------
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com >
Co-authored-by: Stas Bekman <stas@stason.org >
* fix
* license
* condition on pt
* fix
* style
* fix
* rm torchvision dependency, allow custom transforms
* address review
* rework device arg
* add_eos_token
* s/transforms/transform/
* fix top level imports
* fix return value
* cleanup
* cleanup
* fix
* style
* license
* license
* Update src/transformers/models/idefics/image_processing_idefics.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* add a wrapper to freeze vision layers
* tidyup
* use the correct std/mean settings
* parameterize values from config
* add tests/models/idefics/test_image_processing_idefics.py
* add test_processor_idefics.py
* cleanup
* cleanups
* fix
* fix
* move to the right group
* style
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* add perceiver config
* reset
* missing arg docs
* Apply suggestions from code review
Co-authored-by: Leo Tronchon <leo.tronchon@gmail.com >
* address review comments
* inject automatic end of utterance tokens (#25218 )
* inject automatic end of utterance tokens
* fix
* fix
* fix
* rework to not use the config
* not end_of_utterance_token at the end
* Update src/transformers/models/idefics/processing_idefics.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* address review
* Apply suggestions from code review
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
* Update src/transformers/image_processing_utils.py
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com >
* [`Idefics`] add image_embeddings option in generate-related methods (#25442 )
* add image_embeddings option in generate-related methods
* style
* rename image_embeddings and allow perceiver embeddings precomputation
* compute embeddings within generate
* make is_encoder_decoder= True the default in config
* nested if else fix
* better triple check
* switch if elif order for pixel values / img embeds
* update model_kwargs perceiver only at the end
* use _prepare_model_inputs instead of encoder_decoder logic
* fix comment typo
* fix config default for is_encoder_decoder
* style
* add typehints
* precompute in forward
* doc builder
* style
* pop instead of get image hidden states
* Trigger CI
* Update src/transformers/models/idefics/modeling_idefics.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Update src/transformers/models/idefics/modeling_idefics.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix * + indentation + style
* simplify a bit the use_resampler logic using comments
* update docstrings
* Trigger CI
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix rebase changes
* unbreak #25237 - to be fixed in follow up PRs
* is_composition = False
* no longer needed
---------
Co-authored-by: leot13 <leo.tronchon@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Victor SANH <victorsanh@gmail.com >
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com >
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com >
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-08-18 14:12:28 -07:00
Arthur
30b3c46ff5
[split_special_tokens] Add support for split_special_tokens argument to encode ( #25081 )
...
* draft changes
* update and add tests
* styling for no
* move test
* path to usable model
* update test
* small update
* update bertbased tokenizers
* don't use kwargs for _tokenize
* don't use kwargs for _tokenize
* fix copies
* update
* update test for special tokenizers
* fixup
* skip two tests
* remove pdb breakpoint()
* wowo
* rewrite custom tests
* nits
* revert change in target keys
* fix markup lm
* update documentation of the argument
2023-08-18 13:26:27 +02:00
Alex McKinney
9d7afd2536
Replaces calls to .cuda with .to(torch_device) in tests ( #25571 )
...
* Replaces calls to `.cuda` with `.to(torch_device)` in tests
`torch.Tensor.cuda()` is a pre-0.4 solution to changing a tensor's device. It is recommended to prefer `.to(...)` for greater flexibility and error handling. Furthermore, this makes it more consistent with other tests (that tend to use `.to(torch_device)`) and ensures the correct device backend is used (if `torch_device` is neither `cpu` nor `cuda`).
* addressing review comments
* more formatting changes in Bloom test
* `make style`
* Update tests/models/bloom/test_modeling_bloom.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fixes style failures
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
2023-08-18 12:40:40 +02:00
Yih-Dar
427adc898a
Skip test_contrastive_generate for TFXLNet ( #25574 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-17 18:56:34 +02:00
Arthur
181d778f83
[NllbMoe] Update code to properly support loss computation ( #25429 )
...
* update nllb_moe
* fix
* doc nits
* nits
* add a small test
* fixup
* remove adapted from
2023-08-17 17:21:56 +02:00
Arthur
b4d5548800
🚨 🚨 🚨 [SPM] Finish fix spm models 🚨 🚨 🚨 ( #25224 )
...
* fix EVERYTHING
* more fixes
* ⚗️ ⚗️ Tokenizer magic ⚗️ ⚗️
* wrong value but test passes for the TODO
* update
* update
* safe protobuf import?
* style
* non gated repo
* update
* fixup
* Update src/transformers/models/llama/tokenization_llama.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update src/transformers/models/llama/tokenization_llama.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* Update tests/models/t5/test_tokenization_t5.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
* nits
* fix t5 too
* use assert equal
* fix llama decoding
* nits on t5
* fixup
* only remove the prefix space, not other spaces
* more decoding tests and more todos
* fix CI as well
* fixup
* skip failing test on CI (it's TF, it's ok)
* skip test_subword_regularization_tokenizer that is also crashing on the CI for TF
* update llama
* revert good fixes
* fixup
* empty
* explain why we need to encode with an additional token
* better warning?
* nits
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com >
2023-08-17 17:08:05 +02:00
Yih-Dar
d2871b2975
Skip test_beam_search_xla_generate_simple for T5 ( #25566 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-17 15:30:46 +02:00
Yih-Dar
ec25306b39
Fix MPT CI ( #25548 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-17 09:06:26 +02:00
amyeroberts
6bca43bb90
Input data format ( #25464 )
...
* Add copied from statements for image processors
* Move out rescale and normalize to base image processor
* Remove rescale and normalize from vit (post rebase)
* Update docstrings and tidy up
* PR comments
* Add input_data_format as preprocess argument
* Resolve tests and tidy up
* Remove num_channels argument
* Update doc strings -> default ints not in code formatting
2023-08-16 17:45:02 +01:00
Yih-Dar
f61f072b61
Fix MaskFormerModelIntegrationTest OOM ( #25544 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-16 18:11:24 +02:00
Marc Sun
0ed23e4db2
fix vit hybrid test ( #25543 )
...
fix test
2023-08-16 17:02:57 +02:00
Joao Gante
0b568291d7
Marian: post-hack-fix correction ( #25459 )
2023-08-16 11:49:29 +01:00
amyeroberts
c41291965f
🚨 🚨 🚨 Remove softmax for EfficientNetForImageClassification 🚨 🚨 🚨 ( #25501 )
...
* Remove softmax for EfficientNet
* Update integration test values
* Fix up
2023-08-14 17:08:47 +01:00
amyeroberts
5e5fa0d88c
Mark flaky tests ( #25463 )
...
Make CI less brittle
2023-08-11 15:26:45 +01:00
Joao Gante
4692d26194
Switch Transformers: remove overwritten beam sample test ( #25458 )
2023-08-11 13:16:01 +01:00
amyeroberts
41d56ea6dd
Refactor image processor testers ( #25450 )
...
* Refactor image processor test mixin
- Move test_call_numpy, test_call_pytorch, test_call_pil to mixin
- Rename mixin to reflect handling of logic more than saving
- Add prepare_image_inputs, expected_image_outputs for tests
* Fix for oneformer
2023-08-11 11:30:18 +01:00
Yoach Lacombe
704bf595eb
Update Bark generation configs and tests ( #25409 )
...
* update bark generation configs for more coherent parameter
* make style
* update bark hub repo
2023-08-09 18:28:02 +02:00
Yih-Dar
5b517e1764
Use small config for OneFormerModelTest.test_model_with_labels ( #25383 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-08 17:15:34 +02:00
Yih-Dar
6ea3ee3cd2
Fix test_model_parallelism ( #25359 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-08 10:48:45 +02:00
Pedro Lira
080a97119c
Add mask2former fp16 support ( #25093 )
...
* Add mask2former fp16 support
* Clear consistency/quality issues
* Fix consistency/quality (2)
* Add integration test for mask2former (fp16 case)
* Fix code quality
* Add integration test for maskformer (fp16 case)
* Add integration test for oneformer (fp16 case)
* Remove slow decorator from fp16 tests
* Fix lint
* Remove usage of full inference and value checks for fp16
* Temporarily comment slow for {mask, mask2, one}former
* Add fp16 support to oneformer
* Revert "Temporarily comment slow for {mask, mask2, one}former"
This reverts commit e5371edabd301cf56079def0421a0a87df307cb0.
* Remove dtype conversion noop
2023-08-07 20:07:29 +01:00
Yih-Dar
c177606fb4
Fix more offload edge cases ( #25342 )
...
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-07 17:45:41 +02:00
Yih-Dar
ce6d153a53
Make bark could have tiny model ( #25290 )
...
* temp
* update
* update
* update
* small dim
* small dim
* small dim
* fix
* update
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-04 15:13:14 +02:00
Yoach Lacombe
6d3f9c1e2e
add generate method to SpeechT5ForTextToSpeech ( #25233 )
...
* add generate method to SpeechT5ForTextToSpeech
* update speecht5forTTS docstrings
* Remove defaults to None in generate docstrings
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-08-03 14:12:07 +01:00
amyeroberts
30409af6e1
Update InstructBLIP & Align values after rescale update ( #25209 )
...
* Update InstructBLIP values
Note: the tests are not independent. Running the test independently produces different logits compared to running all the integration tests
* Update test values after rescale update
* Remove left over commented out code
* Revert to previous rescaling logic
* Update rescale tests
2023-08-03 11:01:10 +01:00
Yih-Dar
bd90cda9a6
CI with num_hidden_layers=2 🚀 🚀 🚀 ( #25266 )
...
* CI with layers=2
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-08-02 20:22:36 +02:00
Patrick von Platen
b28ebb2655
[MMS] Fix mms ( #25267 )
...
* [MMS] Fix mms
* [MMS] Fix mms
* fix mms loading
* Apply suggestions from code review
* make style
* Update tests/models/wav2vec2/test_modeling_wav2vec2.py
2023-08-02 18:11:15 +02:00
Yupeng Jia
8021c684ec
Fix some bugs for two stage training of deformable detr ( #25045 )
...
* Update modeling_deformable_detr.py
Fix bugs for two stage training
* Update modeling_deformable_detr.py
* Add test_two_stage_training to DeformableDetrModelTest
---------
Co-authored-by: yupeng.jia <yupeng.jia@momenta.ai >
2023-08-02 11:30:36 +01:00
amyeroberts
1b35409768
Update rescale tests - cast to float after rescaling to reflect #25229 ( #25259 )
...
Rescale tests - cast to float after rescaling to reflect #25229
2023-08-02 11:29:55 +01:00
Younes Belkada
05ebb0264e
[MPT] Add require_bitsandbytes on MPT integration tests ( #25201 )
...
* add `require_bitsandbytes` on MPT integration tests
* add it on mpt as well
2023-08-01 12:20:34 +02:00
Yih-Dar
1b4f6199c6
Update tiny model info. and pipeline testing ( #25213 )
...
* update tiny_model_summary.json
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-31 19:35:33 +02:00
Yih-Dar
9ca3aa0156
Fix all_model_classes in FlaxBloomGenerationTest ( #25211 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-31 17:32:05 +02:00
amyeroberts
05cda5df34
🚨 🚨 🚨 Fix rescale ViVit Efficientnet ( #25174 )
...
* Fix rescaling bug
* Add tests
* Update integration tests
* Fix up
* Update src/transformers/image_transforms.py
* Update test - new possible order in list
2023-07-28 19:52:51 +01:00
Sanchit Gandhi
03f98f9683
[MusicGen] Fix integration tests ( #25169 )
...
* move to device
* update with cuda values
* fix fp16
* more rigorous
2023-07-28 18:50:15 +01:00
Younes Belkada
dd9d45b6ec
[InstructBlip] Fix instructblip slow test ( #25171 )
...
* fix instruct blip slow test
* Update tests/models/instructblip/test_modeling_instructblip.py
2023-07-28 17:00:10 +02:00
Younes Belkada
add0895dd9
[Mpt] Fix mpt slow test ( #25170 )
...
fix mpt slow test
2023-07-28 16:45:09 +02:00
Sanchit Gandhi
e93103632b
Add bloom flax ( #25094 )
...
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-07-27 18:24:56 +01:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark ( #25037 )
...
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
* remove unnecessary set_seed in Bark tests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com >
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[MptConfig] support from pretrained args ( #25116 )
...
* support from pretrained args
* draft addition of tests
* update test
* use parent assert true
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
2023-07-27 16:24:52 +02:00
amyeroberts
659829b6ae
MaskFormer - enable return_dict in order to compile ( #25052 )
...
* Enable return_dict in order to compile
* Update tests
2023-07-26 16:23:30 +01:00
Yih-Dar
31acba5697
Fix PvtModelIntegrationTest::test_inference_fp16 ( #25106 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com >
2023-07-26 14:57:44 +02:00
Sebastian Husch Lee
8f36ab3e22
[T5, MT5, UMT5] Add [T5, MT5, UMT5]ForSequenceClassification ( #24726 )
...
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
2023-07-25 21:02:49 +02:00
Arthur
dcb183f4bd
[MPT] Add MosaicML's MPT model to transformers ( #24629 )
...
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke ande updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match to 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com >
* fix nit
* add another slow test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com >
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com >
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com >
2023-07-25 14:32:40 +02:00
Xuehai Pan
6bc61aa7af
Set TF32 flag for PyTorch cuDNN backend ( #25075 )
2023-07-25 08:04:48 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. ( #25058 )
...
* Fix last models for common tests that are too big.
* Remove print statement
2023-07-25 07:56:04 -04:00
Rinat
a03d13c83d
Pvt model ( #24720 )
...
* pull and push updates
* add docs
* fix modeling
* Add and run test
* make copies
* add task
* fix tests and fix small issues
* Checks on a Pull Request
* fix docs
* add desc pvt.md
2023-07-24 15:34:19 +01:00
Sylvain Gugger
42571f6eb8
Make more test models smaller ( #25005 )
...
* Make more test models tiny
* Make more test models tiny
* More models
* More models
2023-07-24 10:08:47 -04:00
Arthur
0511369a8b
[LlamaConfig] Nit: pad token should be None by default ( #24958 )
...
* pad token should be None by default
* fix tests
* nits
2023-07-21 14:32:34 +02:00
Tom Aarsen
79444f370f
Deprecate unused OpenLlama architecture ( #24922 )
...
* Resolve typo in check_repo.py
* Specify encoding when opening modeling files
* Deprecate the OpenLlama architecture
* Add disclaimer pointing to Llama
I'm open to different wordings here
* Match the capitalisation of LLaMA
2023-07-20 07:03:24 -04:00