From dff6185d612c89cfa32edfab62eabc14583a5fbb Mon Sep 17 00:00:00 2001
From: Minseo Kim <75977640+luckyvickyricky@users.noreply.github.com>
Date: Wed, 6 Aug 2025 23:52:43 +0900
Subject: [PATCH] docs: fix typo in 'quantization-aware training' (#39904)

---
 docs/source/en/quantization/fp_quant.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/source/en/quantization/fp_quant.md b/docs/source/en/quantization/fp_quant.md
index a89e35da5c..0177cb0555 100644
--- a/docs/source/en/quantization/fp_quant.md
+++ b/docs/source/en/quantization/fp_quant.md
@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.
 
 # FP-Quant
 
-[FP-Quant](https://github.com/IST-DASLab/FP-Quant) is a family of quantization algorithms tailored for the Blackwell generation of Nvidia GPUs. The goal is to allow for efficient post-training quantization (PTQ) and quantization-aware trainin (QAT) of LLMs in the [MXFP4 and NVFP4 data-types](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
+[FP-Quant](https://github.com/IST-DASLab/FP-Quant) is a family of quantization algorithms tailored for the Blackwell generation of Nvidia GPUs. The goal is to allow for efficient post-training quantization (PTQ) and quantization-aware training (QAT) of LLMs in the [MXFP4 and NVFP4 data-types](https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf).
 
 Currently, only PTQ with MXFP4 is supported. Models can either be quantized on the fly with `quantization_config=FPQuantConfig()`:
 
@@ -63,4 +63,4 @@ model.forward = torch.compile(model.forward, mode="max-autotune", fullgraph=True
 
 FP-Quant currently performs best for very large batch size processing.
 
-See [QuTLASS README](https://github.com/IST-DASLab/qutlass/blob/main/README.md) for speedups.
\ No newline at end of file
+See [QuTLASS README](https://github.com/IST-DASLab/qutlass/blob/main/README.md) for speedups.