[docs] Fix FlashAttention link (#35171)

Fix the Flash Attention link by removing the `.md` extension from the relative path.
2024-12-10 11:36:25 -08:00
parent 91b8ab18b7
commit 5290f6a62d
5 changed files with 5 additions and 5 deletions
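
For reviewers who want to reproduce this kind of fix locally, here is a minimal sketch of the rewrite this commit applies by hand: dropping the `.md` extension from a relative Markdown link while keeping its `#anchor`. The helper name `strip_md_extension` and the regular expression are hypothetical illustrations, not part of the repository's tooling, and the pattern assumes simple inline links of the form `[text](path.md#anchor)`.

```python
import re

# Hypothetical pattern (an assumption, not the repo's tooling): an inline
# Markdown link whose relative target ends in ".md", optionally followed by
# a "#anchor" fragment. Absolute http(s) URLs are deliberately skipped.
_MD_LINK = re.compile(r"(\]\((?!https?://)[^)\s#]*?)\.md(#[^)\s]*)?\)")


def strip_md_extension(text: str) -> str:
    """Remove the '.md' extension from relative inline Markdown links."""
    # group(1) is "](" plus the path without ".md"; group(2) is the anchor, if any.
    return _MD_LINK.sub(lambda m: f"{m.group(1)}{m.group(2) or ''})", text)


before = "[Flash Attention](../perf_train_gpu_one.md#flash-attention-2)"
after = strip_md_extension(before)
print(after)
```

Applied to the removed line in the hunk below, this should yield the same link target as the added line. Links that already lack the extension, and absolute URLs, are left unchanged.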
--- a/docs/source/en/model_doc/mixtral.md
+++ b/docs/source/en/model_doc/mixtral.md
@@ -93,7 +93,7 @@ As can be seen, the instruction-tuned model requires a [chat template](../chat_t

 ## Speeding up Mixtral by using Flash Attention

-The code snippets above showcase inference without any optimization tricks. However, one can drastically speed up the model by leveraging [Flash Attention](../perf_train_gpu_one.md#flash-attention-2), which is a faster implementation of the attention mechanism used inside the model.
+The code snippets above showcase inference without any optimization tricks. However, one can drastically speed up the model by leveraging [Flash Attention](../perf_train_gpu_one#flash-attention-2), which is a faster implementation of the attention mechanism used inside the model.

 First, make sure to install the latest version of Flash Attention 2 to include the sliding window attention feature.