What is TensorRT?
AI Infrastructure
NVIDIA's SDK for optimizing and deploying deep learning models with maximum performance on NVIDIA GPUs.
TensorRT applies layer fusion, reduced-precision (FP16 and INT8) calibration, kernel auto-tuning, and other optimizations to lower inference latency and raise throughput. It is widely used in production deployments where NVIDIA GPUs serve latency-sensitive AI workloads.
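To make the idea of precision calibration concrete, here is a minimal sketch of symmetric INT8 quantization using a simple max-based calibration rule. It is plain Python and does not use the TensorRT API; the function names (`calibrate_scale`, `quantize`, `dequantize`) are hypothetical and chosen only for illustration, and TensorRT's own calibrators may choose ranges differently.

```python
def calibrate_scale(calibration_values):
    """Derive a symmetric INT8 scale from the largest magnitude observed.

    Assumption: simple max calibration, used here only as an illustration.
    """
    max_abs = max(abs(v) for v in calibration_values)
    # Map [-max_abs, max_abs] onto the signed 8-bit range [-127, 127].
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Round each float to the nearest INT8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map INT8 codes back to approximate float values."""
    return [q * scale for q in q_values]


if __name__ == "__main__":
    # Representative activations seen during a calibration pass (made-up data).
    calibration_batch = [-2.54, -0.3, 0.0, 0.75, 1.27]
    scale = calibrate_scale(calibration_batch)

    inputs = [0.5, -2.54, 10.0]  # 10.0 lies outside the calibrated range
    codes = quantize(inputs, scale)
    print("scale:", scale)
    print("int8 codes:", codes)
    print("dequantized:", dequantize(codes, scale))
```

Values inside the calibrated range are recovered to within half a quantization step, while values outside it saturate at the range limit. That is why the calibration data should be representative of real inputs.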