<!---
Copyright 2020 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
## Multiple Choice

Based on the script [`run_swag.py`]().

## PyTorch script: fine-tuning on SWAG
`run_swag` allows you to fine-tune any model from our [hub](https://huggingface.co/models) (as long as its architecture has a `ForMultipleChoice` version in the library) on the SWAG dataset or on your own csv/jsonlines files, as long as they are structured the same way. To make it work on another dataset, you will need to tweak the `preprocess_function` inside the script.

```bash
python examples/multiple-choice/run_swag.py \
--model_name_or_path roberta-base \
--do_train \
--do_eval \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--output_dir /tmp/swag_base \
--per_device_eval_batch_size=16 \
--per_device_train_batch_size=16 \
--overwrite_output_dir
```

Training with the defined hyper-parameters yields the following results:

```
***** Eval results *****
eval_acc = 0.8338998300509847
eval_loss = 0.44457291918821606
```
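
To adapt the PyTorch script to another dataset, it helps to see what the preprocessing step has to produce. The snippet below is a minimal sketch, not the script's actual `preprocess_function`: it assumes SWAG-style columns (`sent1`, `sent2`, `ending0` to `ending3`, `label`), uses a made-up example row, and introduces a hypothetical helper `build_choice_pairs`. It only builds the text pairs for each candidate ending; in a real run these pairs would then be tokenized and grouped per example before being fed to a `ForMultipleChoice` model.

```python
from typing import Dict, List

# Assumption: four candidate endings per example, as in SWAG.
NUM_CHOICES = 4
ENDING_COLUMNS = [f"ending{i}" for i in range(NUM_CHOICES)]


def build_choice_pairs(examples: Dict[str, List]) -> Dict[str, List[List[str]]]:
    """Hypothetical helper: turn a batch of SWAG-style rows into, for each
    example, NUM_CHOICES (context, candidate continuation) text pairs."""
    first_sentences, second_sentences = [], []
    for i, context in enumerate(examples["sent1"]):
        header = examples["sent2"][i]
        # The context is repeated once per candidate ending.
        first_sentences.append([context] * NUM_CHOICES)
        # Each candidate is the start of the second sentence plus one ending.
        second_sentences.append(
            [f"{header} {examples[column][i]}" for column in ENDING_COLUMNS]
        )
    return {"first_sentences": first_sentences, "second_sentences": second_sentences}


# Made-up row, for illustration only.
batch = {
    "sent1": ["A man picks up a guitar."],
    "sent2": ["He"],
    "ending0": ["starts to play."],
    "ending1": ["eats the guitar."],
    "ending2": ["flies away."],
    "ending3": ["turns into a tree."],
    "label": [0],
}

pairs = build_choice_pairs(batch)
for first, second in zip(pairs["first_sentences"][0], pairs["second_sentences"][0]):
    print(f"{first} | {second}")
```

A dataset with differently named columns or a different number of choices would need the column names and `NUM_CHOICES` changed accordingly, which is the kind of tweak the `preprocess_function` in the script requires.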

## TensorFlow

```bash
export SWAG_DIR=/path/to/swag_data_dir
python ./examples/multiple-choice/run_tf_multiple_choice.py \
--task_name swag \
--model_name_or_path bert-base-cased \
--do_train \
--do_eval \
--data_dir "$SWAG_DIR" \
--learning_rate 5e-5 \
--num_train_epochs 3 \
--max_seq_length 80 \
--output_dir models_bert/swag_base \
--per_device_eval_batch_size=16 \
--per_device_train_batch_size=16 \
--logging_dir logs \
--gradient_accumulation_steps 2 \
--overwrite_output_dir
```

# Run it in colab

[Open in Colab](https://colab.research.google.com/github/ViktorAlm/notebooks/blob/master/MPC_GPU_Demo_for_TF_and_PT.ipynb)