[XLSR-Wav2Vec2 Info doc] Add a couple of lines (#10806)

* finish

* fix

* fix

* fix

* fix
Patrick von Platen
2021-03-19 12:52:54 +03:00
committed by GitHub
parent 117dba9948
commit e8968bd03a

@@ -25,12 +25,13 @@ It is very much possible that prizes will be given to groups of people instead o
- [Google colab setup](#google-colab-setup) - [Google colab setup](#google-colab-setup)
- [Local machine](#local-machine) - [Local machine](#local-machine)
- [How to upload my trained checkpoint](#how-to-upload-my-trained-checkpoint) - [How to upload my trained checkpoint](#how-to-upload-my-trained-checkpoint)
- [How to create the README](#how-to-create-the-README) - [How to create the README](#how-to-create-the-readme)
- [How to evaluate my trained checkpoint](#how-to-evaluate-my-trained-checkpoint) - [How to evaluate my trained checkpoint](#how-to-evaluate-my-trained-checkpoint)
- [Rules of training and evaluation](#rules-of-training-and-evaluation) - [Rules of training and evaluation](#rules-of-training-and-evaluation)
- [Tips and tricks for training](#tips-and-tricks-for-training) - [Tips and tricks](#tips-and-tricks)
- [How to combine multiple datasets into one](#how-to-combine-multiple-datasets-into-one) - [How to combine multiple datasets into one](#how-to-combine-multiple-datasets-into-one)
- [How to effectively preprocess the data](#how-to-effectively-preprocess-the-data) - [How to effectively preprocess the data](#how-to-effectively-preprocess-the-data)
- [How to efficiently preprocess the data](#how-to-do-efficiently-load-datasets-with-limited-ram-and-hard-drive-space)
- [How to do hyperparameter tuning](#how-to-do-hyperparameter-tuning) - [How to do hyperparameter tuning](#how-to-do-hyperparameter-tuning)
- [How to preprocess and evaluate character based languages](#how-to-preprocess-and-evaluate-character-based-languages) - [How to preprocess and evaluate character based languages](#how-to-preprocess-and-evaluate-character-based-languages)
- [Further reading material](#further-reading-material) - [Further reading material](#further-reading-material)
@@ -284,7 +285,7 @@ result = test_dataset.map(evaluate, batched=True, batch_size=8)
print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"]))) print("WER: {:2f}".format(100 * wer.compute(predictions=result["pred_strings"], references=result["sentence"])))
``` ```
**Result**: XX.XX % # TODO: write output of print here **Test Result**: XX.XX % # TODO: write output of print here
## Training ## Training
@@ -325,23 +326,26 @@ done, *e.g.* [here](https://discuss.huggingface.co/t/spanish-asr-fine-tuning-wav
## Tips and tricks ## Tips and tricks
TODO... This section summarizes a couple of tips and tricks across various topics. It will continuously be updated during the week.
### How to combine multiple datasets into one ### How to combine multiple datasets into one
Check out [this](https://discuss.huggingface.co/t/how-to-combine-local-data-files-with-an-official-dataset/4685) post.
### How to effectively preprocess the data ### How to effectively preprocess the data
### How to do hyperparameter turing for my language ### How to do efficiently load datasets with limited ram and hard drive space
Check out [this](https://discuss.huggingface.co/t/german-asr-fine-tuning-wav2vec2/4558/8?u=patrickvonplaten) post.
### How to do hyperparameter tuning
### How to preprocess and evaluate character based languages ### How to preprocess and evaluate character based languages
### How to do lazy data loading
## Further reading material ## Further reading material
It is recommended that you take some time to read up on how Wav2vec2 works in theory. It is recommended that you take some time to read up on how Wav2vec2 works in theory.