Thomas Wolf
9beaa85b07
Merge pull request #1055 from qipeng/run_squad_fix
...
Fix #1015 (tokenizer defaults to use_lower_case=True when loading from trained models)
2019-08-21 01:20:46 +02:00
Lysandre
2d042274ac
Sequence special token handling for BERT and RoBERTa
2019-08-20 14:15:28 -04:00
Peng Qi
3bffd2e8e5
more fixes
2019-08-20 10:59:28 -07:00
Thomas Wolf
3b56427a1e
Merge pull request #1040 from FeiWang96/multi_gpu
...
Fix bug of multi-gpu training in lm finetuning
2019-08-20 17:13:44 +02:00
thomwolf
a690edab17
various fix and clean up on run_lm_finetuning
2019-08-20 15:52:12 +02:00
erenup
fc74132598
add best steps to train
2019-08-20 19:06:41 +08:00
Duzeyao
d86b49ac86
swap optimizer.step and scheduler.step
2019-08-20 16:46:34 +08:00
Duzeyao
45ab8bf60e
Revert "Update finetune_on_pregenerated.py"
...
This reverts commit a1359b970c.
2019-08-20 16:40:39 +08:00
erenup
97c30b73d5
add test related code
2019-08-20 16:31:04 +08:00
erenup
d5e60e5b7a
add test related code
2019-08-20 16:25:50 +08:00
Zeyao Du
a1359b970c
Update finetune_on_pregenerated.py
2019-08-20 16:00:07 +08:00
Zeyao Du
28f7ca1f80
swap optimizer.step and scheduler.step
2019-08-20 15:58:42 +08:00
Peng Qi
a368b87791
Fix #1015
2019-08-19 13:07:00 -07:00
Lysandre
f94f1c6016
Distributed training + tokenizer agnostic mask token
2019-08-19 14:58:50 -04:00
Thomas Wolf
5a49b793d9
Merge pull request #1023 from tuvuumass/patch-1
...
fix issue #824
2019-08-19 15:31:46 +02:00
erenup
4270d3da1b
fix a bug of evaluating
2019-08-19 16:38:52 +08:00
Chi-Liang Liu
40acf6b52a
don't save model without training
2019-08-18 05:02:25 -04:00
erenup
47e9aea0fe
add args info to evaluate_result.txt
2019-08-18 17:00:53 +08:00
erenup
5582bc4b23
add multiple choice to roberta and xlnet, test on swag, roberta=0.82.28, xlnet=0.80
2019-08-18 16:01:48 +08:00
wangfei
856a63da4d
Fix: save model/model.module
2019-08-18 11:03:47 +08:00
wangfei
1ef41b8337
Revert "Fix: save model/model.module"
...
This reverts commit 00e9c4cc96.
2019-08-18 11:03:12 +08:00
wangfei
00e9c4cc96
Fix: save model/model.module
2019-08-18 11:02:02 +08:00
erenup
e384ae2b9d
Merge remote-tracking branch 'huggingface/master'
...
merge huggingface/master to update
2019-08-17 12:05:57 +08:00
Jason Phang
d8923270e6
Correct truncation for RoBERTa in 2-input GLUE
2019-08-16 16:30:38 -04:00
Lysandre
5652f54ac2
Simplified data generator + better perplexity calculator
...
GPT-2 now obtains ~20 perplexity on WikiText-2
2019-08-16 13:49:56 -04:00
LysandreJik
7e7fc53da5
Fixing run_glue example with RoBERTa
2019-08-16 11:53:10 -04:00
LysandreJik
715534800a
BERT + RoBERTa masking tokens handling + GPU device update.
2019-08-16 10:10:21 -04:00
LysandreJik
339e556feb
CLM for BERT, beginning of CLM for RoBERTa; still needs a better masking token mechanism.
2019-08-16 10:10:20 -04:00
LysandreJik
5c18825a18
Removed dataset limit
2019-08-16 10:10:20 -04:00
LysandreJik
3e3e145497
Added GPT to the generative fine-tuning.
2019-08-16 10:10:20 -04:00
LysandreJik
47975ed53e
Language Modeling fine-tuning using GPT-2.
2019-08-16 10:10:20 -04:00
wangfei
b8ff56896c
Fix bug of multi-gpu training in lm finetuning
2019-08-16 12:11:05 +08:00
Rabeeh KARIMI
3d47a7f8ab
loads the tokenizer for each checkpoint, to solve the reproducibility issue
2019-08-14 10:58:26 +02:00
LysandreJik
39f426be65
Added special tokens <pad> and <mask> to RoBERTa.
2019-08-13 15:19:50 -04:00
Julien Chaumond
baf08ca1d4
[RoBERTa] run_glue: correct pad_token + reorder labels
2019-08-13 12:51:15 -04:00
tuvuumass
ba4bce2581
fix issue #824
2019-08-13 11:26:27 -04:00
Julien Chaumond
912fdff899
[RoBERTa] Update run_glue for RoBERTa
2019-08-12 13:49:50 -04:00
erenup
b219029c45
refactoring old run_swag. This script is mainly refactored from run_squad in pytorch_transformers
2019-08-11 15:20:37 +08:00
Thomas Wolf
b4f9464f90
Merge pull request #960 from ethanjperez/patch-1
...
Fixing unused weight_decay argument
2019-08-07 10:09:55 +02:00
Thomas Wolf
d43dc48b34
Merge branch 'master' into auto_models
2019-08-05 19:17:35 +02:00
thomwolf
70c10caa06
add option mentioned in #940
2019-08-05 17:09:37 +02:00
thomwolf
b90e29d52c
working on automodels
2019-08-05 16:06:34 +02:00
Ethan Perez
28ba345ecc
Fixing unused weight_decay argument
...
Currently the L2 regularization is hard-coded to "0.01", even though there is a --weight_decay flag implemented (that is unused). I'm making this flag control the weight decay used for fine-tuning in this script.
2019-08-04 12:31:46 -04:00
Thomas Wolf
c054b5ee64
Merge pull request #896 from zijunsun/master
...
fix multi-gpu training bug when using fp16
2019-07-26 19:31:02 +02:00
zijunsun
f0aeb7a814
multi-gpu training also should be after apex fp16 (squad)
2019-07-26 15:23:29 +08:00
zijunsun
adb3ef6368
multi-gpu training also should be after apex fp16
2019-07-25 13:09:10 +08:00
Chi-Liang Liu
a7fce6d917
fix squad v1 error (na_prob_file should be None)
2019-07-24 16:11:36 +08:00
thomwolf
6070b55443
fix #868
2019-07-23 17:46:01 +02:00
thomwolf
2c9a3115b7
fix #858
2019-07-23 16:45:55 +02:00
Thomas Wolf
268c6cc160
Merge pull request #845 from rabeehk/master
...
fixed version issues in run_openai_gpt
2019-07-23 15:29:31 +02:00