Add TF multiple choice example (#12865)
* Add new multiple-choice example, remove old one
@@ -1,5 +1,5 @@
<!---
Copyright 2020 The HuggingFace Team. All rights reserved.
Copyright 2021 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -13,26 +13,31 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Multiple-choice training (e.g. SWAG)

# Multiple Choice
This folder contains the `run_swag.py` script, showing an example of *multiple-choice answering* with the
🤗 Transformers library. For straightforward use-cases you may be able to use this script without modification,
although we have also included comments in the code to indicate areas that you may need to adapt to your own projects.

## Fine-tuning on SWAG
### Multi-GPU and TPU usage

By default, the script uses a `MirroredStrategy` and will use multiple GPUs effectively if they are available. TPUs
can also be used by passing the name of the TPU resource with the `--tpu` argument.
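
One practical consequence of running under a `MirroredStrategy` is that every replica processes its own batch, so the number of examples consumed per optimizer step grows with the number of devices. The snippet below is a minimal sketch of that arithmetic and is not code from `run_swag.py`. The helper name `global_batch_size` and the numbers are hypothetical, and it assumes the per-device batch size is simply multiplied by the replica count (the value `tf.distribute` exposes as `strategy.num_replicas_in_sync`).

```python
def global_batch_size(per_device_batch_size: int, num_replicas: int) -> int:
    """Examples consumed per optimizer step across all replicas.

    Assumes synchronous data parallelism: each replica gets its own
    batch of `per_device_batch_size` examples.
    """
    if per_device_batch_size < 1 or num_replicas < 1:
        raise ValueError("both arguments must be positive integers")
    return per_device_batch_size * num_replicas


# Hypothetical setup: 16 examples per device on 1, 4 and 8 replicas.
sizes = {n: global_batch_size(16, n) for n in (1, 4, 8)}
print(sizes)  # {1: 16, 4: 64, 8: 128}
```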

### Memory usage and data loading

One thing to note is that all data is loaded into memory in this script. Most multiple-choice datasets are small
enough that this is not an issue, but if you have a very large dataset you will need to modify the script to handle
data streaming. This is particularly challenging for TPUs, given the stricter requirements and the sheer volume of data
required to keep them fed. A full explanation of all the possible pitfalls is a bit beyond this example script and
README, but for more information you can see the 'Input Datasets' section of
[this document](https://www.tensorflow.org/guide/tpu).
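
To make the idea of data streaming concrete, here is a minimal sketch that reads rows lazily and groups them into batches, so only one batch is held in memory at a time. It is not taken from `run_swag.py`: the function names `stream_rows` and `batched` and the column names are hypothetical, and the in-memory string stands in for a large file on disk. With TensorFlow installed, a generator like this could in principle be wrapped with `tf.data.Dataset.from_generator`, though the TPU-specific requirements mentioned above would still apply.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def stream_rows(handle: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one CSV row at a time instead of reading the whole file."""
    yield from csv.DictReader(handle)


def batched(rows: Iterable[Dict[str, str]], batch_size: int) -> Iterator[List[Dict[str, str]]]:
    """Group a lazy stream of rows into lists of at most `batch_size` rows."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Tiny in-memory stand-in for a large file on disk (hypothetical columns).
fake_file = io.StringIO(
    "context,ending0,ending1,label\n"
    "She opens the door,and walks in,and flies away,0\n"
    "He picks up a pen,and eats it,and starts writing,1\n"
    "They sit down,and order food,and melt,0\n"
)

# list() is used only to inspect the result here; a training loop would
# iterate over the generator directly so batches are produced on demand.
batches = list(batched(stream_rows(fake_file), batch_size=2))
print([len(b) for b in batches])  # [2, 1]
```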

### Example command
```bash
export SWAG_DIR=/path/to/swag_data_dir
python ./examples/multiple-choice/run_tf_multiple_choice.py \
    --task_name swag \
    --model_name_or_path bert-base-cased \
    --do_train \
    --do_eval \
    --data_dir $SWAG_DIR \
    --learning_rate 5e-5 \
    --num_train_epochs 3 \
    --max_seq_length 80 \
    --output_dir models_bert/swag_base \
    --per_gpu_eval_batch_size=16 \
    --per_device_train_batch_size=16 \
    --logging-dir logs \
    --gradient_accumulation_steps 2 \
    --overwrite_output
python run_swag.py \
    --model_name_or_path distilbert-base-cased \
    --output_dir output \
    --do_eval \
    --do_train
```