Update TF text classification example (#11496)

Big refactor, fixes and multi-GPU/TPU support
2021-04-30 13:45:33 +01:00
parent 8b945ef03e
commit 20d6931e32
3 changed files with 199 additions and 185 deletions
--- a/examples/tensorflow/text-classification/README.md
+++ b/examples/tensorflow/text-classification/README.md
@@ -54,6 +54,20 @@ After training, the model will be saved to `--output_dir`. Once your model is tr
 by calling the script without a `--train_file` or `--validation_file`; simply pass it the output_dir containing
 the trained model and a `--test_file` and it will write its predictions to a text file for you.

+### Multi-GPU and TPU usage
+
+By default, the script uses a `MirroredStrategy` and will use multiple GPUs effectively if they are available. TPUs
+can also be used by passing the name of the TPU resource with the `--tpu` argument.
+
+### Memory usage and data loading
+
+This script loads all data into memory. Most text classification datasets are small
+enough that this is not an issue, but if you have a very large dataset you will need to modify the script to handle
+data streaming. This is particularly challenging for TPUs, given their stricter requirements and the sheer volume of
+data required to keep them fed. A full explanation of the possible pitfalls is beyond the scope of this example
+script and README, but for more information see the 'Input Datasets' section of the
+[TensorFlow TPU guide](https://www.tensorflow.org/guide/tpu).
+
 ### Example command
 ```
 python run_text_classification.py \