Python 3.9 changed the format of the string serialization of `typing.Optional`.
For example, `str(typing.Optional[str])` is
`typing.Union[str, NoneType]` in Python 3.8 and
`typing.Optional[str]` in Python 3.9.
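
Because the string form differs between versions, code that needs to detect an `Optional` annotation is safer inspecting the type than comparing strings. A minimal sketch, assuming Python 3.8+ (where `typing.get_origin` and `typing.get_args` exist); `is_optional` is a hypothetical helper name, not part of this change:

```python
import typing


def is_optional(tp):
    """Return True if tp is a Union that includes NoneType (i.e. Optional[X])."""
    # Optional[X] is sugar for Union[X, None], so check the origin and the args
    # instead of relying on str(tp), whose format changed in Python 3.9.
    return typing.get_origin(tp) is typing.Union and type(None) in typing.get_args(tp)


# The printed text depends on the interpreter version (see the note above).
print(str(typing.Optional[str]))
print(is_optional(typing.Optional[str]))
print(is_optional(typing.List[str]))
```

The structural check gives the same answer on both versions, while the printed string does not.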
* Merging all duplicated code for Text2TextPipeline while preserving
backward compat.
* Fixing TranslationPipeline Hierarchy + return_name
* torch import guard.
* Update isort version.
* Remove code from other PR disentanglement.
* Changed named example to something more agnostic.
* Main init work
* Add version
* Change from absolute to relative imports
* Fix imports
* One more typo
* More typos
* Styling
* Make quality script pass
* Add necessary replace in template
* Fix typos
* Spaces are ignored in replace for some reason
* Forgot one model.
* Fixes for import
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Add documentation
* Styling
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
* Don't import libs to check they are available
* Don't import integrations at init
* Add importlib_metadata to deps
* Remove old vars references
* Avoid syntax error
* Adapt testing utils
* Try to appease torchhub
* Add dependency
* Remove more private variables
* Fix typo
* Another typo
* Refine the tf availability test
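
The idea behind "Don't import libs to check they are available" can be sketched with the standard library alone. This is an illustrative sketch, not the code from this change; `is_package_available` is a hypothetical name:

```python
import importlib.util


def is_package_available(name):
    """Check whether a top-level package can be found without importing it."""
    # find_spec only locates the package; it does not execute the package's code,
    # so heavy libraries are not loaded just to answer a yes/no question.
    return importlib.util.find_spec(name) is not None


print(is_package_available("json"))
print(is_package_available("surely_not_an_installed_package_xyz"))
```

Note that for dotted names `find_spec` imports the parent package, so the sketch is meant for top-level package names only.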
* Define new output dataclasses for greedy generation
* Add output_[...] flags in greedy generation methods
Added output_attentions, output_hidden_states, output_scores flags in
generate and greedy_search methods in GenerationMixin.
* [WIP] Implement logic and tests for output flags in generation
* Update GreedySearchOutput classes & docstring
* Implement greedy search output accumulation logic
Update greedy_search unittests
Fix generate method return value docstring
Properly init flags with the default config
* Update configuration to add output_scores flag
* Fix test_generation_utils
Sort imports and fix isinstance tests for GreedySearchOutputs
* Fix typo in generation_utils
* Add return_dict_in_generate for backwards compatibility
* Add return_dict_in_generate flag in config
* Fix typo in configuration
* Fix handling of attentions and hidden_states flags
* Make style & quality
* first attempt attentions
* some corrections
* improve tests
* special models require special tests
* disable xlm test for now
* clean tests
* fix for tf
* isort
* Add output dataclasses for other generation methods
* Add logic to return dict in sample generation
* Complete test for sample generation
- Pass output_attentions and output_hidden_states flags to encoder in
encoder-decoder models
- Fix import statements order in test_generation_utils file
* Add logic to return dict in sample generation
- Refactor tests to avoid using self.assertTrue, which provides
scarce information when the test fails
- Add tests for the three beam_search methods: vanilla, sample and
grouped
* Style doc
* Fix copy-paste error in generation tests
* Rename logits to scores and refactor
* Refactor group_beam_search for consistency
* make style
* add sequences_scores
* fix all tests
* add docs
* fix beam search finalize test
* correct docstring
* clean some files
* Made suggested changes to the documentation
* Style doc ?
* Style doc using the Python util
* Update src/transformers/generation_utils.py
* fix empty lines
* fix all test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
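
The interplay of `output_scores` and `return_dict_in_generate` described in the generation commits above can be illustrated with a toy greedy loop. This is a simplified sketch under stated assumptions, not the `GenerationMixin` implementation: `ToyGreedyOutput`, `toy_greedy_search`, and `step_fn` are hypothetical names, and tokens are plain integers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ToyGreedyOutput:
    # Hypothetical stand-in for an output dataclass: sequences plus optional scores.
    sequences: List[int]
    scores: Optional[Tuple[List[float], ...]] = None


def toy_greedy_search(
    step_fn: Callable[[List[int]], List[float]],
    start_token: int,
    max_length: int,
    output_scores: bool = False,
    return_dict_in_generate: bool = False,
):
    sequence = [start_token]
    # Only accumulate scores when a structured output was requested.
    scores = () if (return_dict_in_generate and output_scores) else None
    while len(sequence) < max_length:
        step_scores = step_fn(sequence)
        if scores is not None:
            scores += (step_scores,)
        # Greedy choice: index of the highest score.
        sequence.append(max(range(len(step_scores)), key=step_scores.__getitem__))
    if return_dict_in_generate:
        return ToyGreedyOutput(sequences=sequence, scores=scores)
    # Backward-compatible default: return the bare sequence.
    return sequence
```

With the flag left at its default, callers keep receiving a plain sequence, which is the backward-compatibility concern the `return_dict_in_generate` commits mention.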
* Store transformers version info when saving the model
* fix format
* fix format
* fix format
* Update src/transformers/configuration_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update configuration_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* [t5 doc] typos
a few runaway backticks
@sgugger
* style
* [trainer] put fp16 args together
this PR proposes a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read
@sgugger
* style
* create model
* add integration
* save current state
* make integration tests pass
* add one more test
* add explanation to tests
* remove from bart
* add padding
* remove unnecessary test
* make all tests pass
* re-add cookie cutter tests
* finish PyTorch
* fix attention test
* Update tests/test_modeling_common.py
* revert change
* remove unused file
* add string to doc
* save intermediate
* make tf integration tests pass
* finish tf
* fix doc
* fix docs again
* add led to doctree
* add to auto tokenizer
* added tips for led
* make style
* apply jplus statements
* correct tf longformer
* apply lysandres suggestions
* apply sylvains suggestions
* Apply suggestions from code review
* --model_parallel hasn't been implemented for most models
* make the help clear as well
* implement is_parallelizable; use it
* oops
* remove property
This PR proposes to:
* auto-flush `transformers` logging
When logging is used to trace signals from different parts of the code, and those messages may be interleaved with debug `print` output, auto-flushing helps keep all the logging events synchronized.
I don't think this change will have any performance impact.
If it helps someone, here is the code I used to sync `transformers` logging with various other debug prints.
I was porting BART to MP and needed to trace that the device switching happens correctly, so I added a bunch of `logger.info` calls inside `modeling_bart.py`. I also had some other helper `print` debug messages that weren't logger-based:
```
# auto flush std streams
from sys import stdout, stderr
def stdout_write_flush(args, w=stdout.write): w(args); stdout.flush()
def stderr_write_flush(args, w=stderr.write): w(args); stderr.flush()
stdout.write = stdout_write_flush
stderr.write = stderr_write_flush
from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
import logging
import transformers.utils.logging
import transformers.models.bart.modeling_bart
# I wanted a shorter simpler format
handlers = transformers.utils.logging._get_library_root_logger().handlers
for handler in handlers:
    formatter = logging.Formatter("[%(funcName)s] %(message)s")
    handler.setFormatter(formatter)
transformers.models.bart.modeling_bart.logger.setLevel(transformers.logging.INFO)
```
@LysandreJik, @sgugger, @patrickvonplaten
This PR:
* fixes trainer so that the logger agrees with the actual default `output_dir`, by setting it in one place and passing it as an argument to both places
@sgugger
```
python -c "from apex.normalization import FusedProphetNetLayerNorm"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'FusedProphetNetLayerNorm' from 'apex.normalization' (/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/apex/normalization/__init__.py)
```
It looks like this code has never been tested, so it silently fails inside try/except.
Discovered this by accident in https://github.com/huggingface/transformers/issues/9338#issuecomment-752217708
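
The failure mode described above, an optional import of a name that does not exist being swallowed by try/except, can be reproduced with a small sketch. It is illustrative only and uses the standard `math` module as a stand-in for the optional dependency; `import_or_fallback` and `NoSuchFusedLayerNorm` are hypothetical names:

```python
import importlib


def import_or_fallback(module_name, attr, fallback):
    """Try to fetch module_name.attr; return (object, used_optional_import)."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr), True
    except (ImportError, AttributeError):
        # A wrong attribute name lands here just like a missing package does,
        # so the fallback is used silently unless the flag is checked.
        return fallback, False


# The name is misspelled/nonexistent, yet nothing fails loudly.
layer_norm, used_fused = import_or_fallback("math", "NoSuchFusedLayerNorm", fallback=None)
print(used_fused)
```

A test that asserts the flag is true when the optional package is installed would surface this kind of bug instead of letting the fallback hide it.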
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot
The last commit accidentally deleted these 4 lines, so I restored them
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py
The current version already passes all 27 tests!
Please see the test run at :
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights
They should be 'PyTorch only'
* add config_class back
After removing it, the tests failed, so in total only "use_tf_weights = None" is removed, on Lysandre's suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True
At the moment we don't have TF weights (since we don't have an official TF model)
Previously, I did not run the slow tests, so I missed this bug
* Add simple TFDPRModelIntegrationTest
Note that this is just a test that TF and PyTorch give approximately the same output.
However, I could not test against the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys
* fix RagSeq generate with context_input_ids
* apply style
* delete unused lines
* Add test_rag_sequence_generate_batch_from_context_input_ids
* Readability improved
* styling
* Stylize
* typos
* add check_model_generate_from_context_input_ids
* make style
* Apply suggestions from code review
* make style2
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to false when not decoder, fix attention mask init
* add test for caching
* add past_key_values for Roberta
* fix position embeds
* add caching test for roberta
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* address patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with bert
* make copies consistent
* add use_cache to encoder
* add past_key_values to tapas attention
* apply suggestions from code review
* make copies consistent
* add attn mask in tests
* remove copied from longformer
* apply suggestions from code review
* fix bart test
* nit
* simplify model outputs
* fix doc
* fix output ordering