Qdqbert example: add benchmark script with ORT-TRT (#16592)

* add ort-trt benchmark script

* Update README.md

* ort version can be newer

* formatting

* specify ORT version
Shang Zhang
2022-04-12 08:13:59 -07:00
committed by GitHub
parent db3edd050b
commit 14daa6102a
3 changed files with 58 additions and 0 deletions

@@ -101,6 +101,12 @@ Recalibrating will affect the accuracy of the model, but the change should be mi
trtexec --onnx=model.onnx --explicitBatch --workspace=16384 --int8 --shapes=input_ids:64x128,attention_mask:64x128,token_type_ids:64x128 --verbose
```
### Benchmark the INT8 QAT ONNX model inference with [ONNX Runtime-TRT](https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html) using dummy input
```
python3 ort-infer-benchmark.py
```
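
For orientation, a dummy-input latency benchmark against the ONNX Runtime TensorRT execution provider might look roughly like the sketch below. This is not the contents of `ort-infer-benchmark.py`. The helper names `time_calls` and `benchmark_ort_trt`, the warm-up and iteration counts, and the `int64` input dtype are assumptions made for illustration. The model path, input names, and the 64x128 shape are taken from the `trtexec` command above.

```python
import statistics
import time


def time_calls(fn, warmup=10, iters=100):
    """Call fn() repeatedly and return latency statistics in milliseconds."""
    # Warm-up runs are excluded from timing (TensorRT engine build, caches).
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return {
        "iters": iters,
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
        "max_ms": max(latencies),
    }


def benchmark_ort_trt(model_path="model.onnx", batch_size=64, seq_len=128):
    """Time session.run() on dummy inputs (hypothetical helper, not from the repo)."""
    # Imported lazily so time_calls() stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    # Request the TensorRT execution provider with INT8 enabled,
    # falling back to CUDA for nodes TensorRT does not take.
    providers = [
        ("TensorrtExecutionProvider", {"trt_int8_enable": True}),
        "CUDAExecutionProvider",
    ]
    session = ort.InferenceSession(model_path, providers=providers)

    # Dummy inputs; the int64 dtype is an assumption about the exported model.
    shape = (batch_size, seq_len)
    feeds = {
        "input_ids": np.ones(shape, dtype=np.int64),
        "attention_mask": np.ones(shape, dtype=np.int64),
        "token_type_ids": np.zeros(shape, dtype=np.int64),
    }
    return time_calls(lambda: session.run(None, feeds))
```

Calling `benchmark_ort_trt()` on a machine with a TensorRT-enabled ONNX Runtime build would return a dictionary of latency statistics. The timing loop is kept separate from session setup so the same helper can time any callable.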
### Evaluate the INT8 QAT ONNX model inference with TensorRT
```