[docs] Performance docs tidy up, part 1 (#23963)
* first pass at the single gpu doc
* overview: improved clarity and navigation
* WIP
* updated intro and deepspeed sections
* improved torch.compile section
* more improvements
* minor improvements
* make style
* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* feedback addressed
* mdx -> md
* link fix
* feedback addressed

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
@@ -111,36 +111,40 @@
- sections:
- local: performance
title: Overview
- local: perf_train_gpu_one
title: Training on one GPU
- local: perf_train_gpu_many
title: Training on many GPUs
- local: perf_train_cpu
title: Training on CPU
- local: perf_train_cpu_many
title: Training on many CPUs
- local: perf_train_tpu
title: Training on TPUs
- local: perf_train_tpu_tf
title: Training on TPU with TensorFlow
- local: perf_train_special
title: Training on Specialized Hardware
- local: perf_infer_cpu
title: Inference on CPU
- local: perf_infer_gpu_one
title: Inference on one GPU
- local: perf_infer_gpu_many
title: Inference on many GPUs
- local: perf_infer_special
title: Inference on Specialized Hardware
- local: perf_hardware
title: Custom hardware for training
- sections:
- local: perf_train_gpu_one
title: Methods and tools for efficient training on a single GPU
- local: perf_train_gpu_many
title: Multiple GPUs and parallelism
- local: perf_train_cpu
title: Efficient training on CPU
- local: perf_train_cpu_many
title: Distributed CPU training
- local: perf_train_tpu
title: Training on TPUs
- local: perf_train_tpu_tf
title: Training on TPU with TensorFlow
- local: perf_train_special
title: Training on Specialized Hardware
- local: perf_hardware
title: Custom hardware for training
- local: hpo_train
title: Hyperparameter Search using Trainer API
title: Efficient training techniques
- sections:
- local: perf_infer_cpu
title: Inference on CPU
- local: perf_infer_gpu_one
title: Inference on one GPU
- local: perf_infer_gpu_many
title: Inference on many GPUs
- local: perf_infer_special
title: Inference on Specialized Hardware
title: Optimizing inference
- local: big_models
title: Instantiating a big model
- local: debugging
title: Debugging
- local: hpo_train
title: Hyperparameter Search using Trainer API
title: Troubleshooting
- local: tf_xla
title: XLA Integration for TensorFlow Models
title: Performance and scalability
@@ -182,6 +186,8 @@
title: Perplexity of fixed-length models
- local: pipeline_webserver
title: Pipelines for webserver inference
- local: model_memory_anatomy
title: Model training anatomy
title: Conceptual guides
- sections:
- sections: