Add DeeBERT (entropy-based early exiting for *BERT) (#5477)
* Add deebert code * Add readme of deebert * Add test for deebert Update test for Deebert * Update DeeBert (README, class names, function refactoring); remove requirements.txt * Format update * Update test * Update readme and model init methods
This commit is contained in:
54
examples/deebert/README.md
Normal file
54
examples/deebert/README.md
Normal file
@@ -0,0 +1,54 @@
|
||||
# DeeBERT: Early Exiting for *BERT
|
||||
|
||||
This is the code base for the paper [DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference](https://www.aclweb.org/anthology/2020.acl-main.204/), modified from its [original code base](https://github.com/castorini/deebert).
|
||||
|
||||
The original code base also has information for downloading sample models that we have trained in advance.
|
||||
|
||||
## Usage
|
||||
|
||||
There are three scripts in the folder which can be run directly.
|
||||
|
||||
In each script, there are several things to modify before running:
|
||||
|
||||
* `PATH_TO_DATA`: path to the GLUE dataset.
|
||||
* `--output_dir`: path for saving fine-tuned models. Default: `./saved_models`.
|
||||
* `--plot_data_dir`: path for saving evaluation results. Default: `./results`. Results are printed to stdout and also saved to `npy` files in this directory to facilitate plotting figures and further analyses.
|
||||
* `MODEL_TYPE`: bert or roberta
|
||||
* `MODEL_SIZE`: base or large
|
||||
* `DATASET`: SST-2, MRPC, RTE, QNLI, QQP, or MNLI
|
||||
|
||||
#### train_deebert.sh
|
||||
|
||||
This is for fine-tuning DeeBERT models.
|
||||
|
||||
#### eval_deebert.sh
|
||||
|
||||
This is for evaluating each exit layer for fine-tuned DeeBERT models.
|
||||
|
||||
#### entropy_eval.sh
|
||||
|
||||
This is for evaluating fine-tuned DeeBERT models, given a number of different early exit entropy thresholds.
|
||||
|
||||
|
||||
|
||||
## Citation
|
||||
|
||||
Please cite our paper if you find the resource useful:
|
||||
```
|
||||
@inproceedings{xin-etal-2020-deebert,
|
||||
title = "{D}ee{BERT}: Dynamic Early Exiting for Accelerating {BERT} Inference",
|
||||
author = "Xin, Ji and
|
||||
Tang, Raphael and
|
||||
Lee, Jaejun and
|
||||
Yu, Yaoliang and
|
||||
Lin, Jimmy",
|
||||
booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
|
||||
month = jul,
|
||||
year = "2020",
|
||||
address = "Online",
|
||||
publisher = "Association for Computational Linguistics",
|
||||
url = "https://www.aclweb.org/anthology/2020.acl-main.204",
|
||||
pages = "2246--2251",
|
||||
}
|
||||
```
|
||||
|
||||
Reference in New Issue
Block a user