Stable versions of Torch-TensorRT are published on PyPI:
    pip install torch-tensorrt
Nightly versions of Torch-TensorRT are published on the PyTorch package index:
    pip install --pre torch-tensorrt --index-url https://download.pytorch.org/whl/nightly/cu128
Torch-TensorRT is also distributed in the ready-to-run NVIDIA NGC PyTorch Container, which includes all dependencies at the proper versions along with example notebooks.
For more advanced installation methods, please see here
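After installing, it can help to confirm which versions actually landed in the environment. The following is a minimal sketch that uses only the standard library; the helper names `installed_version` and `report_versions` are hypothetical, and the distribution names `torch` and `tensorrt` are assumptions about what sits alongside `torch-tensorrt` in a typical install.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report_versions(dist_names=("torch", "torch-tensorrt", "tensorrt")):
    """Map each distribution name to its installed version (None if missing).

    The default names are assumptions; adjust them to match your environment.
    """
    return {name: installed_version(name) for name in dist_names}


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Because the lookup reads package metadata rather than importing the packages, it works even on a machine without a GPU.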
You can use Torch-TensorRT anywhere you use torch.compile:
    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()  # define your model here
    x = torch.randn((1, 3, 224, 224)).cuda()  # define what the inputs to the model will look like

    optimized_model = torch.compile(model, backend="tensorrt")
    optimized_model(x)  # compiled on first run

    optimized_model(x)  # this will be fast!
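Since compilation happens on the first call, the first invocation is expected to be much slower than later ones. One way to observe this is to time consecutive calls. The sketch below is not part of Torch-TensorRT: `time_calls` is a hypothetical helper that works with any callable, and its `sync` parameter is an assumption added so that GPU work can be flushed before each clock read.

```python
import time


def time_calls(fn, *args, calls=3, sync=None):
    """Time `calls` consecutive invocations of fn(*args); return seconds per call.

    `sync` is an optional zero-argument callable run before each clock read.
    On a GPU, pass torch.cuda.synchronize so queued kernels finish before timing.
    """
    durations = []
    for _ in range(calls):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn(*args)
        if sync is not None:
            sync()
        durations.append(time.perf_counter() - start)
    return durations


# Usage with the compiled model from the example above (requires a CUDA device):
#   first, *rest = time_calls(optimized_model, x, sync=torch.cuda.synchronize)
#   print(f"first call: {first:.3f}s, later calls: {min(rest):.5f}s")
```

Without the synchronization step, CUDA calls may return before the work completes, which would make the steady-state timings look better than they are.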
If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).
Step 1: Optimize + serialize
    import torch
    import torch_tensorrt

    model = MyModel().eval().cuda()  # define your model here
    inputs = [torch.randn((1, 3, 224, 224)).cuda()]  # define a list of representative inputs here

    trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)

    # PyTorch only supports the Python runtime for an ExportedProgram.
    torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs)
    # For C++ deployment, use a TorchScript file.
    torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)
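The choice between the two `torch_tensorrt.save` calls depends on where the module will run. The sketch below only captures that decision as data; `save_plan` is a hypothetical helper rather than a Torch-TensorRT API, and the target labels "python" and "cpp" are assumptions made for illustration.

```python
def save_plan(target, stem="trt"):
    """Return (file_path, extra_kwargs) to pass to torch_tensorrt.save.

    target: "python" for an ExportedProgram loaded from Python,
            "cpp" for a TorchScript file loaded with libtorch.
    """
    if target == "python":
        # No extra arguments, matching the "trt.ep" call in Step 1.
        return f"{stem}.ep", {}
    if target == "cpp":
        return f"{stem}.ts", {"output_format": "torchscript"}
    raise ValueError(f"unknown deployment target: {target!r}")


# Usage with the objects from Step 1:
#   path, extra = save_plan("cpp")
#   torch_tensorrt.save(trt_gm, path, inputs=inputs, **extra)
```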
Step 2: Deploy
Deployment in PyTorch:
    import torch
    import torch_tensorrt

    inputs = [torch.randn((1, 3, 224, 224)).cuda()]  # your inputs go here

    # You can run this in a new python session!
    model = torch.export.load("trt.ep").module()
    # model = torch_tensorrt.load("trt.ep").module()  # this also works
    model(*inputs)
Deployment in C++:
    #include "torch/script.h"
    #include "torch_tensorrt/torch_tensorrt.h"

    auto trt_mod = torch::jit::load("trt.ts");
    auto input_tensor = [...];  // fill this with your inputs
    auto results = trt_mod.forward({input_tensor});
Note: Refer to the NVIDIA L4T PyTorch NGC container for PyTorch libraries on JetPack.
The following dependencies are used to verify the test cases. Torch-TensorRT can work with other versions, but the tests are not guaranteed to pass.
Deprecation is used to inform developers that some APIs and tools are no longer recommended for use. Beginning with version 2.3, Torch-TensorRT has the following deprecation policy:
Deprecation notices are communicated in the Release Notes. Deprecated API functions will have a statement in the source documenting when they were deprecated. Deprecated methods and classes will issue deprecation warnings at runtime, if they are used. Torch-TensorRT provides a 6-month migration period after the deprecation. APIs and tools continue to work during the migration period. After the migration period ends, APIs and tools are removed in a manner consistent with semantic versioning.
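To make the runtime-warning part of this policy concrete, here is a minimal sketch of how a deprecated function can emit a `DeprecationWarning`, and how calling code can promote such warnings to errors during the migration period. It is not Torch-TensorRT's actual implementation; the `deprecated` decorator, `old_api`, `new_api`, and `call_strictly` are hypothetical names used only for illustration.

```python
import functools
import warnings


def deprecated(since, alternative):
    """Decorator that emits a DeprecationWarning each time the function is called."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {since}; use {alternative} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(since="2.3", alternative="new_api")
def old_api(x):
    """Hypothetical deprecated function; it still works during the migration period."""
    return x * 2


def call_strictly(fn, *args, **kwargs):
    """Call fn with DeprecationWarning promoted to an exception, e.g. in a CI job."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return fn(*args, **kwargs)
```

Running a test suite with deprecation warnings treated as errors is one way to find deprecated calls before the migration period ends.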
To contribute, take a look at CONTRIBUTING.md.
The Torch-TensorRT license can be found in the LICENSE file. It is licensed under a BSD-style license.