Precomputed pseudolabels

  • Decompress with tar -xzvf. The produced directory name may differ from the archive filename.
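Because the extracted directory name may not match the archive name, it can help to look up what the archive actually contains. The following is a minimal sketch using Python's standard `tarfile` module; the function name `extract_pseudolabels` and the example paths are hypothetical and not part of this repository.

```python
import tarfile
from pathlib import PurePosixPath


def extract_pseudolabels(archive_path, dest="."):
    """Extract a gzipped tarball and return the top-level names it contains.

    Hypothetical helper: only extract archives you trust, since tar
    members can carry arbitrary paths.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        top_level = set()
        for member in tar.getmembers():
            parts = PurePosixPath(member.name).parts
            if parts:  # skip a bare "." entry
                top_level.add(parts[0])
        tar.extractall(dest)
    return sorted(top_level)


# Example (hypothetical archive path):
# print(extract_pseudolabels("pegasus_cnn_cnn_pls.tgz"))
```

The returned list tells you which directory to pass on as the data directory, whatever the archive itself was called.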
| Dataset | Model | Rouge Scores | Notes | Link |
|---|---|---|---|---|
| XSUM | facebook/bart-large-xsum | 49.8/28.0/42.5 | | download |
| XSUM | google/pegasus-xsum | 53.3/32.7/46.5 | | download |
| XSUM | facebook/bart-large-xsum | ? | Bart pseudolabels filtered to those with Rouge2 > 10.0 against the ground truth. | download, download |
| CNN/DM | sshleifer/pegasus-cnn-ft-v2 | 47.316/26.65/44.56 | Do not worry about the fact that train.source is one line shorter. | download |
| CNN/DM | facebook/bart-large-cnn | | 5K (2%) are missing; there should be 282173. | download |
| CNN/DM | google/pegasus-xsum | 21.5/6.76/25 | Extra labels for XSUM distillation. Used max_source_length=512 (and all other pegasus-xsum configuration). | download |
| EN-RO | Helsinki-NLP/opus-mt-en-ro | | | download |
| EN-RO | facebook/mbart-large-en-ro | | | download |
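One entry above keeps only pseudolabels whose Rouge2 against the ground truth exceeds 10.0. The sketch below illustrates that kind of filter. It is not the script used to produce the archive: `rouge2_f` is a simplified whitespace-token bigram F1 on a 0-100 scale, standing in for a real ROUGE implementation, and all function names are hypothetical.

```python
from collections import Counter


def bigram_counts(text):
    """Count lowercase whitespace-token bigrams (no stemming)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f(prediction, reference):
    """Simplified ROUGE-2 F1 on a 0-100 scale (an approximation only)."""
    pred, ref = bigram_counts(prediction), bigram_counts(reference)
    overlap = sum((pred & ref).values())  # clipped bigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


def filter_pseudolabels(sources, pseudolabels, references, min_rouge2=10.0):
    """Keep (source, pseudolabel) pairs scoring above the threshold."""
    return [
        (src, pl)
        for src, pl, ref in zip(sources, pseudolabels, references)
        if rouge2_f(pl, ref) > min_rouge2
    ]
```

For real filtering you would swap `rouge2_f` for the scorer used in your evaluation, since tokenization and stemming change the numbers.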

Generating Pseudolabels

  • These commands take a while to run. For example, pegasus_cnn_cnn_pls.tgz took 8 hours on 8 GPUs.
  • Pegasus does not work in fp16 :( but Bart, mBART and Marian do.
python -m torch.distributed.launch --nproc_per_node=8 run_distributed_eval.py \
    --model_name facebook/bart-large-xsum --save_dir bart_xsum_pl --data_dir xsum \
    --fp16 --bs 32 --sync_timeout 60000 --max_source_length 1024
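When generating pseudolabels for several models, the fp16 caveat is easy to forget. The following sketch builds the same command as above but omits `--fp16` for Pegasus checkpoints. The helper name `build_eval_command` and the substring check on the model name are assumptions for illustration; the flags are the ones shown in the command above.

```python
import shlex


def build_eval_command(model_name, save_dir, data_dir, n_gpus=8,
                       bs=32, sync_timeout=60000, max_source_length=1024):
    """Return the run_distributed_eval.py invocation as an argument list."""
    cmd = [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={n_gpus}",
        "run_distributed_eval.py",
        "--model_name", model_name,
        "--save_dir", save_dir,
        "--data_dir", data_dir,
        "--bs", str(bs),
        "--sync_timeout", str(sync_timeout),
        "--max_source_length", str(max_source_length),
    ]
    # Assumption: Pegasus checkpoints have "pegasus" in their name.
    if "pegasus" not in model_name.lower():
        cmd.append("--fp16")
    return cmd


# Print the command for inspection instead of running it.
print(shlex.join(build_eval_command("facebook/bart-large-xsum",
                                    "bart_xsum_pl", "xsum")))
```

The list can be passed to `subprocess.run` once the printed command looks right. Other settings, such as `max_source_length=512` for google/pegasus-xsum, still need to be chosen per model.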