HuggingFace_transformer

Author	SHA1	Message	Date
Yacine Jernite	49c5202522	Eli5 examples (#4968 ) * add eli5 examples * add dense query script * query_di * merging * merging * add_utils * adds nearest neighbor wikipedia * batch queries * training_retriever * new notebooks * moved retriever traiing script * finished wiki40b * max_len_fix * train_s2s * retriever_batch_checkpointing * cleanup * merge * dim_fix * fix_indexer * fix_wiki40b_snippets * fix_embed_for_r * fp32 index * fix_sparse_q * joint_training * remove obsolete datasets * add_passage_nn_results * add_passage_nn_results * add_batch_nn * add_batch_nn * add_data_scripts * notebook * notebook * notebook * fix_multi_gpu * add_app * full_caching * full_caching * notebook * sparse_done * images * notebook * add_image_gif * with_Gif * add_contr_image * notebook * notebook * notebook * train_functions * notebook * min_retrieval_length * pandas_option * notebook * min_retrieval_length * notebook * notebook * eval_Retriever * notebook * images * notebook * add_example * add_example * notebook * fireworks * notebook * notebook * joe's notebook comments * app_update * notebook * notebook_link * captions * notebook * assing RetriBert model * add RetriBert to Auto * change AutoLMHead to AutoSeq2Seq * notebook downloads from hf models * style_black * style_black * app_update * app_update * fix_app_update * style * style * isort * Delete WikiELI5training.ipynb * Delete evaluate_eli5.py * Delete WikiELI5explore.ipynb * Delete ExploreWikiELI5Support.html * Delete explainlikeimfive.py * Delete wiki_snippets.py * children before parent * children before parent * style_black * style_black_only * isort * isort_new * Update src/transformers/modeling_retribert.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * typo fixes * app_without_asset * cleanup * Delete ELI5animation.gif * Delete ELI5contrastive.svg * Delete ELI5wiki_index.svg * Delete choco_bis.svg * Delete fireworks.gif * Delete huggingface_logo.jpg * Delete huggingface_logo.svg * Delete Long_Form_Question_Answering_with_ELI5_and_Wikipedia.ipynb * Delete eli5_app.py * Delete eli5_utils.py * readme * Update README.md * unused imports * moved_info * default_beam * ftuned model * disclaimer * Update src/transformers/modeling_retribert.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * black * add_doc * names * isort_Examples * isort_Examples * Add doc to index Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-16 16:36:58 -04:00
Sam Shleifer	c3e607496c	[cleanup] examples test_run_squad uses tiny model (#5059 )	2020-06-16 14:06:45 -04:00
Sylvain Gugger	d5477baf7d	Convert hans to Trainer (#5025 ) * Convert hans to Trainer * Tick box	2020-06-16 08:06:31 -04:00
Anthony MOI	36434220fc	[HUGE] Refactoring tokenizers backend - padding - truncation - pre-tokenized pipeline - fast tokenizers - tests (#4510 ) * Use tokenizers pre-tokenized pipeline * failing pretrokenized test * Fix is_pretokenized in python * add pretokenized tests * style and quality * better tests for batched pretokenized inputs * tokenizers clean up - new padding_strategy - split the files * [HUGE] refactoring tokenizers - padding - truncation - tests * style and quality * bump up requied tokenizers version to 0.8.0-rc1 * switched padding/truncation API - simpler better backward compat * updating tests for custom tokenizers * style and quality - tests on pad * fix QA pipeline * fix backward compatibility for max_length only * style and quality * Various cleans up - add verbose * fix tests * update docstrings * Fix tests * Docs reformatted * __call__ method documented Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-06-15 17:12:51 -04:00
Sylvain Gugger	1affde2f10	Make DataCollator a callable (#5015 ) * Make DataCollator a callable * Update src/transformers/data/data_collator.py Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-15 11:58:33 -04:00
Stefan Schweter	d812e6d76e	NER: fix construction of input examples for RoBERTa (#4943 ) * utils_ner: do not add extra sep token for RoBERTa model * run_pl_ner: do not add extra sep token for RoBERTa model	2020-06-15 08:30:40 -04:00
Sylvain Gugger	403d309857	Hans data (#4854 ) * Update hans data to be able to use Trainer * Fixes * Deal with tokenizer that don't have token_ids * Clean up things * Simplify data use * Fix the input dict * Formatting + proper path in README	2020-06-13 09:35:13 -04:00
VictorSanh	473808da0d	update `mvmt-pruning/saving_prunebert` (updating torch to 1.5)	2020-06-11 19:42:45 +00:00
Sylvain Gugger	e8db8b845a	Remove unused arguments in Multiple Choice example (#4853 ) * Remove unused arguments * Formatting * Remove second todo comment	2020-06-09 20:05:09 -04:00
songyouwei	29c36e9f36	run_pplm.py bug fix (#4867 ) `is_leaf` may become `False` after `.to(device=device)` function call.	2020-06-09 19:14:27 -04:00
Sam Shleifer	f90bc44d9a	[examples] Cleanup summarization docs (#4876 )	2020-06-09 17:38:28 -04:00
Amil Khare	02e5f79662	[examples] consolidate summarization examples (#4837 )	2020-06-09 11:14:12 -04:00
daniel-shan	b6f365a8ed	Updates args in tf squad example. (#4820 ) Co-authored-by: Daniel Shan <daniel.shan@workday.com>	2020-06-08 05:36:09 -04:00
Mr Ruben	ddf9a3dfc7	Updated path "cd examples/text-generation/pplm" (#4778 ) https://github.com/huggingface/transformers/issues/4776	2020-06-05 21:16:48 -04:00
Sam Shleifer	875288b344	[isort] add matplotlib to known 3rd party dependencies (#4800 )	2020-06-05 17:27:31 -04:00
Julien Chaumond	b9109f2de1	[doc] Make it clearer that `text-generation` does not involve training	2020-06-05 14:59:22 +02:00
Stefan Schweter	2a4b9e09c0	NER: Add new WNUT’17 example (#4681 ) * ner: add preprocessing script for examples that splits longer sentences * ner: example shell scripts use local preprocessing now * ner: add new example section for WNUT’17 NER task. Remove old English CoNLL-03 results * ner: satisfy black and isort	2020-06-04 19:13:17 -04:00
prajjwal1	48a05026de	removed deprecared use of Variable api from pplm example	2020-06-04 18:07:49 -04:00
Jason Phang	492b352ab6	Remove unnecessary model_type arg in example (#4771 )	2020-06-04 13:41:24 -04:00
Jin Young Sohn	b231a413f5	Add cache_dir to save features in GLUE + Differentiate match/mismatch for MNLI metrics (#4621 ) * Glue task cleaup * Enable writing cache to cache_dir in case dataset lives in readOnly filesystem. * Differentiate match vs mismatch for MNLI metrics. * Style * Fix pytype * Fix type * Use cache_dir in mnli mismatch eval dataset * Small Tweaks Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-02 13:40:14 -04:00
Julien Chaumond	b42586ea56	Fix CI after killing archive maps (#4724 ) * 🐛 Fix model ids for BART and Flaubert	2020-06-02 10:21:09 -04:00
Julien Chaumond	d4c2cb402d	Kill model archive maps (#4636 ) * Kill model archive maps * Fixup * Also kill model_archive_map for MaskedBertPreTrainedModel * Unhook config_archive_map * Tokenizers: align with model id changes * make style && make quality * Fix CI	2020-06-02 09:39:33 -04:00
Lysandre Debut	88762a2f8c	Specify PyTorch versions for examples (#4710 )	2020-06-02 04:29:28 -04:00
Victor SANH	bf760c80b5	finish README	2020-06-01 09:23:31 -04:00
Victor SANH	9d7d9b3ae0	weird import	2020-06-01 09:23:31 -04:00
Victor SANH	2a3c88a659	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	4ac462bfb8	Update examples/movement-pruning/README.md Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-06-01 09:23:31 -04:00
Victor SANH	35fa0bbca0	clarify README	2020-06-01 09:23:31 -04:00
Victor SANH	cc746a5020	flake8 compliance	2020-06-01 09:23:31 -04:00
Victor SANH	b11386e158	less prints in saving prunebert	2020-06-01 09:23:31 -04:00
Victor SANH	8b5d4003ab	complete README	2020-06-01 09:23:31 -04:00
Victor SANH	5c8e5b3709	commplying with isort	2020-06-01 09:23:31 -04:00
Victor SANH	db2a3b2e01	space	2020-06-01 09:23:31 -04:00
Victor SANH	5f8f2d849a	add floppy bert model notebok	2020-06-01 09:23:31 -04:00
Victor SANH	b41948f5cd	add requirements	2020-06-01 09:23:31 -04:00
Victor SANH	fb8f4277b2	add scripts	2020-06-01 09:23:31 -04:00
Victor SANH	d489a6d3d5	add masked_run_*	2020-06-01 09:23:31 -04:00
Victor SANH	e4c07faf0a	add sparsity modules	2020-06-01 09:23:31 -04:00
Patrick von Platen	96f57c9ccb	[Benchmark] Memory benchmark utils (#4198 ) * improve memory benchmarking * correct typo * fix current memory * check torch memory allocated * better pytorch function * add total cached gpu memory * add total gpu required * improve torch gpu usage * update memory usage * finalize memory tracing * save intermediate benchmark class * fix conflict * improve benchmark * improve benchmark * finalize * make style * improve benchmarking * correct typo * make train function more flexible * fix csv save * better repr of bytes * better print * fix __repr__ bug * finish plot script * rename plot file * delete csv and small improvements * fix in plot * fix in plot * correct usage of timeit * remove redundant line * remove redundant line * fix bug * add hf parser tests * add versioning and platform info * make style * add gpu information * ensure backward compatibility * finish adding all tests * Update src/transformers/benchmark/benchmark_args.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/benchmark/benchmark_args_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * delete csv files * fix isort ordering * add out of memory handling * add better train memory handling Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-05-27 23:22:16 +02:00
Lysandre Debut	6a17688021	per_device instead of per_gpu/error thrown when argument unknown (#4618 ) * per_device instead of per_gpu/error thrown when argument unknown * [docs] Restore examples.md symlink * Correct absolute links so that symlink to the doc works correctly * Update src/transformers/hf_argparser.py Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Warning + reorder * Docs * Style * not for squad Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-27 11:36:55 -04:00
Hao Tan	a9aa7456ac	Add back --do_lower_case to uncased models (#4245 ) The option `--do_lower_case` is currently required by the uncased models (i.e., bert-base-uncased, bert-large-uncased). Results: BERT-BASE without --do_lower_case: 'exact': 73.83, 'f1': 82.22 BERT-BASE with --do_lower_case: 'exact': 81.02, 'f1': 88.34	2020-05-26 21:13:07 -04:00
Antonis Maronikolakis	50d1ce411f	add DistilBERT to supported models (#4558 )	2020-05-25 14:50:45 -04:00
Zhangyx	49296533ca	Adds predict stage for glue tasks, and generate result files which can be submitted to gluebenchmark.com (#4463 ) * Adds predict stage for glue tasks, and generate result files which could be submitted to gluebenchmark.com website. * Use Split enum + always output the label name Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2020-05-21 09:17:44 -04:00
Tobias Lee	271bedb485	[examples] fix no grad in second pruning in run_bertology (#4479 ) * fix no grad in second pruning and typo * fix prune heads attention mismatch problem * fix * fix * fix * run make style * run make style	2020-05-21 09:17:03 -04:00
Patrick von Platen	aa925a52fa	[Tests, GPU, SLOW] fix a bunch of GPU hardcoded tests in Pytorch (#4468 ) * fix gpu slow tests in pytorch * change model to device syntax	2020-05-19 21:35:04 +02:00
Julien Chaumond	5e7fe8b585	Distributed eval: SequentialDistributedSampler + gather all results (#4243 ) * Distributed eval: SequentialDistributedSampler + gather all results * For consistency only write to disk from world_master Close https://github.com/huggingface/transformers/issues/4272 * Working distributed eval * Hook into scripts * Fix #3721 again * TPU.mesh_reduce: stay in tensor space Thanks @jysohn23 * Just a small comment * whitespace * torch.hub: pip install packaging * Add test scenarii	2020-05-18 22:02:39 -04:00
Boris Dayma	d9ece8233d	fix(run_language_modeling): use arg overwrite_cache (#4407 )	2020-05-18 11:37:35 -04:00
Julien Chaumond	757baee846	Fix un-prefixed f-string see https://github.com/huggingface/transformers/pull/4367#discussion_r426356693 Hat/tip @girishponkiya	2020-05-18 11:20:46 -04:00
Julien Chaumond	15550ce0d1	[skip ci] remove local rank	2020-05-15 17:08:38 -04:00
Lysandre Debut	edf9ac11d4	Should return overflowing information for the log (#4385 )	2020-05-15 09:49:11 -04:00

1063 Commits