Yacine Jernite
49c5202522
Eli5 examples ( #4968 )
...
* add eli5 examples
* add dense query script
* query_di
* merging
* merging
* add_utils
* adds nearest neighbor wikipedia
* batch queries
* training_retriever
* new notebooks
* moved retriever training script
* finished wiki40b
* max_len_fix
* train_s2s
* retriever_batch_checkpointing
* cleanup
* merge
* dim_fix
* fix_indexer
* fix_wiki40b_snippets
* fix_embed_for_r
* fp32 index
* fix_sparse_q
* joint_training
* remove obsolete datasets
* add_passage_nn_results
* add_passage_nn_results
* add_batch_nn
* add_batch_nn
* add_data_scripts
* notebook
* notebook
* notebook
* fix_multi_gpu
* add_app
* full_caching
* full_caching
* notebook
* sparse_done
* images
* notebook
* add_image_gif
* with_Gif
* add_contr_image
* notebook
* notebook
* notebook
* train_functions
* notebook
* min_retrieval_length
* pandas_option
* notebook
* min_retrieval_length
* notebook
* notebook
* eval_Retriever
* notebook
* images
* notebook
* add_example
* add_example
* notebook
* fireworks
* notebook
* notebook
* joe's notebook comments
* app_update
* notebook
* notebook_link
* captions
* notebook
* adding RetriBert model
* add RetriBert to Auto
* change AutoLMHead to AutoSeq2Seq
* notebook downloads from hf models
* style_black
* style_black
* app_update
* app_update
* fix_app_update
* style
* style
* isort
* Delete WikiELI5training.ipynb
* Delete evaluate_eli5.py
* Delete WikiELI5explore.ipynb
* Delete ExploreWikiELI5Support.html
* Delete explainlikeimfive.py
* Delete wiki_snippets.py
* children before parent
* children before parent
* style_black
* style_black_only
* isort
* isort_new
* Update src/transformers/modeling_retribert.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
* typo fixes
* app_without_asset
* cleanup
* Delete ELI5animation.gif
* Delete ELI5contrastive.svg
* Delete ELI5wiki_index.svg
* Delete choco_bis.svg
* Delete fireworks.gif
* Delete huggingface_logo.jpg
* Delete huggingface_logo.svg
* Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb
* Delete eli5_app.py
* Delete eli5_utils.py
* readme
* Update README.md
* unused imports
* moved_info
* default_beam
* ftuned model
* disclaimer
* Update src/transformers/modeling_retribert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* black
* add_doc
* names
* isort_Examples
* isort_Examples
* Add doc to index
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2020-06-16 16:36:58 -04:00
Sam Shleifer
c3e607496c
[cleanup] examples test_run_squad uses tiny model ( #5059 )
2020-06-16 14:06:45 -04:00
Sylvain Gugger
d5477baf7d
Convert hans to Trainer ( #5025 )
...
* Convert hans to Trainer
* Tick box
2020-06-16 08:06:31 -04:00
Anthony MOI
36434220fc
[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests ( #4510 )
...
* Use tokenizers pre-tokenized pipeline
* failing pretokenized test
* Fix is_pretokenized in python
* add pretokenized tests
* style and quality
* better tests for batched pretokenized inputs
* tokenizers clean up - new padding_strategy - split the files
* [HUGE] refactoring tokenizers - padding - truncation - tests
* style and quality
* bump up required tokenizers version to 0.8.0-rc1
* switched padding/truncation API - simpler better backward compat
* updating tests for custom tokenizers
* style and quality - tests on pad
* fix QA pipeline
* fix backward compatibility for max_length only
* style and quality
* Various cleans up - add verbose
* fix tests
* update docstrings
* Fix tests
* Docs reformatted
* __call__ method documented
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com >
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr >
2020-06-15 17:12:51 -04:00
Sylvain Gugger
1affde2f10
Make DataCollator a callable ( #5015 )
...
* Make DataCollator a callable
* Update src/transformers/data/data_collator.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-06-15 11:58:33 -04:00
Stefan Schweter
d812e6d76e
NER: fix construction of input examples for RoBERTa ( #4943 )
...
* utils_ner: do not add extra sep token for RoBERTa model
* run_pl_ner: do not add extra sep token for RoBERTa model
2020-06-15 08:30:40 -04:00
Sylvain Gugger
403d309857
Hans data ( #4854 )
...
* Update hans data to be able to use Trainer
* Fixes
* Deal with tokenizers that don't have token_ids
* Clean up things
* Simplify data use
* Fix the input dict
* Formatting + proper path in README
2020-06-13 09:35:13 -04:00
VictorSanh
473808da0d
update mvmt-pruning/saving_prunebert (updating torch to 1.5)
2020-06-11 19:42:45 +00:00
Sylvain Gugger
e8db8b845a
Remove unused arguments in Multiple Choice example ( #4853 )
...
* Remove unused arguments
* Formatting
* Remove second todo comment
2020-06-09 20:05:09 -04:00
songyouwei
29c36e9f36
run_pplm.py bug fix ( #4867 )
...
`is_leaf` may become `False` after `.to(device=device)` function call.
2020-06-09 19:14:27 -04:00
Sam Shleifer
f90bc44d9a
[examples] Cleanup summarization docs ( #4876 )
2020-06-09 17:38:28 -04:00
Amil Khare
02e5f79662
[examples] consolidate summarization examples ( #4837 )
2020-06-09 11:14:12 -04:00
daniel-shan
b6f365a8ed
Updates args in tf squad example. ( #4820 )
...
Co-authored-by: Daniel Shan <daniel.shan@workday.com >
2020-06-08 05:36:09 -04:00
Mr Ruben
ddf9a3dfc7
Updated path "cd examples/text-generation/pplm" ( #4778 )
...
https://github.com/huggingface/transformers/issues/4776
2020-06-05 21:16:48 -04:00
Sam Shleifer
875288b344
[isort] add matplotlib to known 3rd party dependencies ( #4800 )
2020-06-05 17:27:31 -04:00
Julien Chaumond
b9109f2de1
[doc] Make it clearer that text-generation does not involve training
2020-06-05 14:59:22 +02:00
Stefan Schweter
2a4b9e09c0
NER: Add new WNUT’17 example ( #4681 )
...
* ner: add preprocessing script for examples that splits longer sentences
* ner: example shell scripts use local preprocessing now
* ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results
* ner: satisfy black and isort
2020-06-04 19:13:17 -04:00
prajjwal1
48a05026de
removed deprecated use of Variable api from pplm example
2020-06-04 18:07:49 -04:00
Jason Phang
492b352ab6
Remove unnecessary model_type arg in example ( #4771 )
2020-06-04 13:41:24 -04:00
Jin Young Sohn
b231a413f5
Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics ( #4621 )
...
* Glue task cleanup
* Enable writing cache to cache_dir in case dataset lives in readOnly filesystem.
* Differentiate match vs mismatch for MNLI metrics.
* Style
* Fix pytype
* Fix type
* Use cache_dir in mnli mismatch eval dataset
* Small Tweaks
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-06-02 13:40:14 -04:00
Julien Chaumond
b42586ea56
Fix CI after killing archive maps ( #4724 )
...
* 🐛 Fix model ids for BART and Flaubert
2020-06-02 10:21:09 -04:00
Julien Chaumond
d4c2cb402d
Kill model archive maps ( #4636 )
...
* Kill model archive maps
* Fixup
* Also kill model_archive_map for MaskedBertPreTrainedModel
* Unhook config_archive_map
* Tokenizers: align with model id changes
* make style && make quality
* Fix CI
2020-06-02 09:39:33 -04:00
Lysandre Debut
88762a2f8c
Specify PyTorch versions for examples ( #4710 )
2020-06-02 04:29:28 -04:00
Victor SANH
bf760c80b5
finish README
2020-06-01 09:23:31 -04:00
Victor SANH
9d7d9b3ae0
weird import
2020-06-01 09:23:31 -04:00
Victor SANH
2a3c88a659
Update examples/movement-pruning/README.md
...
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-06-01 09:23:31 -04:00
Victor SANH
4ac462bfb8
Update examples/movement-pruning/README.md
...
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-06-01 09:23:31 -04:00
Victor SANH
35fa0bbca0
clarify README
2020-06-01 09:23:31 -04:00
Victor SANH
cc746a5020
flake8 compliance
2020-06-01 09:23:31 -04:00
Victor SANH
b11386e158
less prints in saving prunebert
2020-06-01 09:23:31 -04:00
Victor SANH
8b5d4003ab
complete README
2020-06-01 09:23:31 -04:00
Victor SANH
5c8e5b3709
complying with isort
2020-06-01 09:23:31 -04:00
Victor SANH
db2a3b2e01
space
2020-06-01 09:23:31 -04:00
Victor SANH
5f8f2d849a
add floppy bert model notebook
2020-06-01 09:23:31 -04:00
Victor SANH
b41948f5cd
add requirements
2020-06-01 09:23:31 -04:00
Victor SANH
fb8f4277b2
add scripts
2020-06-01 09:23:31 -04:00
Victor SANH
d489a6d3d5
add masked_run_*
2020-06-01 09:23:31 -04:00
Victor SANH
e4c07faf0a
add sparsity modules
2020-06-01 09:23:31 -04:00
Patrick von Platen
96f57c9ccb
[Benchmark] Memory benchmark utils ( #4198 )
...
* improve memory benchmarking
* correct typo
* fix current memory
* check torch memory allocated
* better pytorch function
* add total cached gpu memory
* add total gpu required
* improve torch gpu usage
* update memory usage
* finalize memory tracing
* save intermediate benchmark class
* fix conflict
* improve benchmark
* improve benchmark
* finalize
* make style
* improve benchmarking
* correct typo
* make train function more flexible
* fix csv save
* better repr of bytes
* better print
* fix __repr__ bug
* finish plot script
* rename plot file
* delete csv and small improvements
* fix in plot
* fix in plot
* correct usage of timeit
* remove redundant line
* remove redundant line
* fix bug
* add hf parser tests
* add versioning and platform info
* make style
* add gpu information
* ensure backward compatibility
* finish adding all tests
* Update src/transformers/benchmark/benchmark_args.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* Update src/transformers/benchmark/benchmark_args_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
* delete csv files
* fix isort ordering
* add out of memory handling
* add better train memory handling
Co-authored-by: Lysandre Debut <lysandre@huggingface.co >
2020-05-27 23:22:16 +02:00
Lysandre Debut
6a17688021
per_device instead of per_gpu/error thrown when argument unknown ( #4618 )
...
* per_device instead of per_gpu/error thrown when argument unknown
* [docs] Restore examples.md symlink
* Correct absolute links so that symlink to the doc works correctly
* Update src/transformers/hf_argparser.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
* Warning + reorder
* Docs
* Style
* not for squad
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-05-27 11:36:55 -04:00
Hao Tan
a9aa7456ac
Add back --do_lower_case to uncased models ( #4245 )
...
The option `--do_lower_case` is currently required by the uncased models (i.e., bert-base-uncased, bert-large-uncased).
Results:
BERT-BASE without --do_lower_case: 'exact': 73.83, 'f1': 82.22
BERT-BASE with --do_lower_case: 'exact': 81.02, 'f1': 88.34
2020-05-26 21:13:07 -04:00
Antonis Maronikolakis
50d1ce411f
add DistilBERT to supported models ( #4558 )
2020-05-25 14:50:45 -04:00
Zhangyx
49296533ca
Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com ( #4463 )
...
* Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website.
* Use Split enum + always output the label name
Co-authored-by: Julien Chaumond <chaumond@gmail.com >
2020-05-21 09:17:44 -04:00
Tobias Lee
271bedb485
[examples] fix no grad in second pruning in run_bertology ( #4479 )
...
* fix no grad in second pruning and typo
* fix prune heads attention mismatch problem
* fix
* fix
* fix
* run make style
* run make style
2020-05-21 09:17:03 -04:00
Patrick von Platen
aa925a52fa
[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch ( #4468 )
...
* fix gpu slow tests in pytorch
* change model to device syntax
2020-05-19 21:35:04 +02:00
Julien Chaumond
5e7fe8b585
Distributed eval: SequentialDistributedSampler + gather all results ( #4243 )
...
* Distributed eval: SequentialDistributedSampler + gather all results
* For consistency only write to disk from world_master
Close https://github.com/huggingface/transformers/issues/4272
* Working distributed eval
* Hook into scripts
* Fix #3721 again
* TPU.mesh_reduce: stay in tensor space
Thanks @jysohn23
* Just a small comment
* whitespace
* torch.hub: pip install packaging
* Add test scenarii
2020-05-18 22:02:39 -04:00
Boris Dayma
d9ece8233d
fix(run_language_modeling): use arg overwrite_cache ( #4407 )
2020-05-18 11:37:35 -04:00
Julien Chaumond
757baee846
Fix un-prefixed f-string
...
see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693
Hat/tip @girishponkiya
2020-05-18 11:20:46 -04:00
Julien Chaumond
15550ce0d1
[skip ci] remove local rank
2020-05-15 17:08:38 -04:00
Lysandre Debut
edf9ac11d4
Should return overflowing information for the log ( #4385 )
2020-05-15 09:49:11 -04:00