HuggingFace_transformer/examples/seq2seq/precomputed_pseudo_labels.md at e2bb9abb6ad20e695686dc410ebd398a30cc1943

SUMIN/HuggingFace_transformer

Sam Shleifer e2bb9abb6a [s2s] release pseudolabel links and instructions (#7639)

2020-10-07 11:20:44 -04:00

Precomputed pseudolabels

Decompress with `tar -xzvf`. The extracted directory name may differ from the archive filename.
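Because the extracted directory name may not match the archive name, it can help to read the top-level entries from the archive itself. The following is a minimal sketch using Python's standard `tarfile` module instead of the `tar` command; the helper name `extract_archive` is hypothetical and not part of the repository.

```python
import tarfile
from pathlib import Path, PurePosixPath


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzipped tar archive and return its sorted top-level names.

    Hypothetical helper, not part of the seq2seq examples.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # Collect the first path component of every member, so the caller
        # learns the real directory name even if it differs from the filename.
        top_level = set()
        for name in tar.getnames():
            parts = PurePosixPath(name).parts
            if parts:
                top_level.add(parts[0])
        # Only extract archives from a source you trust.
        tar.extractall(dest)
    return sorted(top_level)
```

For example, `extract_archive("pegasus_cnn_cnn_pls.tgz")` would return the name of the directory that the archive actually produces.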

| Dataset | Model | Rouge Scores | Notes | Link |
|---|---|---|---|---|
| XSUM | facebook/bart-large-xsum | 49.8/28.0/42.5 | | download |
| XSUM | google/pegasus-xsum | 53.3/32.7/46.5 | | download |
| XSUM | facebook/bart-large-xsum | ? | Bart pseudolabels filtered to those with Rouge2 > 10.0 against the ground truth (GT). | download |
| | | | | download |
| CNN/DM | sshleifer/pegasus-cnn-ft-v2 | 47.316/26.65/44.56 | Do not worry that train.source is one line shorter. | download |
| CNN/DM | facebook/bart-large-cnn | | 5K (2%) of the pseudolabels are missing; there should be 282173. | download |
| CNN/DM | google/pegasus-xsum | 21.5/6.76/25 | Extra labels for xsum distillation. Used max_source_length=512 (and all other pegasus-xsum configuration). | download |
| EN-RO | Helsinki-NLP/opus-mt-en-ro | | | download |
| EN-RO | facebook/mbart-large-en-ro | | | download |
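Two of the notes in the table concern line counts (a `train.source` that is one line shorter, and a set with roughly 5K labels missing). A quick way to see such gaps after extraction is to count lines on both sides of a split. This is a minimal sketch; it assumes the labels sit in a `train.target` file next to `train.source`, which the table does not state, and the function names are hypothetical.

```python
from pathlib import Path


def count_lines(path):
    """Count lines in a file by streaming it, without loading it into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def compare_split(data_dir, split="train"):
    """Return (source_lines, target_lines, target_minus_source) for one split."""
    data_dir = Path(data_dir)
    n_source = count_lines(data_dir / f"{split}.source")
    # Assumption: pseudolabels are stored as "<split>.target".
    n_target = count_lines(data_dir / f"{split}.target")
    return n_source, n_target, n_target - n_source
```

A nonzero third value flags a mismatch, which can then be checked against the notes in the table.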

Generating Pseudolabels

These commands take a while to run. For example, pegasus_cnn_cnn_pls.tgz took 8 hours on 8 GPUs.
Pegasus does not work in fp16 :( but Bart, mBART, and Marian do.

python -m torch.distributed.launch --nproc_per_node=8 run_distributed_eval.py \
    --model_name facebook/bart-large-xsum --save_dir bart_xsum_pl --data_dir xsum \
    --fp16 --bs 32 --sync_timeout 60000 --max_source_length 1024
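Since Pegasus does not work in fp16, the same command needs `--fp16` dropped for Pegasus models. The sketch below assembles the argument list and omits that flag when the model name contains "pegasus". The helper name, the name-matching heuristic, and the `pegasus_xsum_pl` save directory are assumptions made for illustration; every flag is taken from the command above.

```python
import shlex


def build_eval_command(model_name, save_dir, data_dir,
                       num_gpus=8, batch_size=32, max_source_length=1024):
    """Build the run_distributed_eval.py launch command as a list of arguments."""
    cmd = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={num_gpus}", "run_distributed_eval.py",
        "--model_name", model_name,
        "--save_dir", save_dir,
        "--data_dir", data_dir,
        "--bs", str(batch_size),
        "--sync_timeout", "60000",
        "--max_source_length", str(max_source_length),
    ]
    # Assumption: a model is a Pegasus model if its name contains "pegasus".
    # Pegasus does not work in fp16, so the flag is added only for other models.
    if "pegasus" not in model_name.lower():
        cmd.append("--fp16")
    return cmd


if __name__ == "__main__":
    # Print a shell-ready command for a Pegasus model (hypothetical save_dir).
    print(shlex.join(build_eval_command(
        "google/pegasus-xsum", "pegasus_xsum_pl", "xsum", max_source_length=512)))
```

The printed command can be pasted into a shell, or the list can be passed directly to `subprocess.run`.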
