TensorRT is NVIDIA's SDK for accelerating deep learning inference. Each generated TensorRT engine is platform-specific, so an engine must be built on the same platform it will run on. To do this, we used tkDNN, created by the High-Performance Real-Time Laboratory (HiPeRT) at the University of Modena and Reggio Emilia (UniMORE). This code base allows a TensorRT engine to be generated from either a darknet configuration and weights file or a PyTorch weights file.
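tkDNN handles the engine build internally, but a minimal sketch of the underlying TensorRT C++ calls (TensorRT 7-era API; the network-definition step, which tkDNN performs by translating the darknet layers, is elided) illustrates why the output is platform-specific: kernel auto-tuning during the build runs against the local GPU, and its results are baked into the serialized engine.

#include <NvInfer.h>
#include <cstdio>
#include <fstream>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::printf("%s\n", msg);
    }
};

// Sketch only: assumes `network` has already been populated
// (tkDNN does this from the darknet cfg/weights files).
void buildAndSaveEngine(nvinfer1::IBuilder* builder,
                        nvinfer1::INetworkDefinition* network,
                        const char* path) {
    nvinfer1::IBuilderConfig* config = builder->createBuilderConfig();
    config->setMaxWorkspaceSize(1 << 28);  // 256 MiB build workspace

    // Kernel selection and timing happen here, on the local GPU --
    // the resulting engine is therefore tied to this platform.
    nvinfer1::ICudaEngine* engine =
        builder->buildEngineWithConfig(*network, *config);

    // Serialize the tuned engine to disk for later inference runs.
    nvinfer1::IHostMemory* blob = engine->serialize();
    std::ofstream out(path, std::ios::binary);
    out.write(static_cast<const char*>(blob->data()), blob->size());
}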

Once an engine has been generated, tkDNN provides a C++ library for running inference with it. Using these functions, a timing script was created to measure the time taken for a specified number of inferences. tkDNN also provides a script for calculating mAP.
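A minimal sketch of such a timing loop is shown below. It calls the TensorRT runtime directly rather than tkDNN's wrappers, and it assumes the execution context, device buffers, and input data have already been set up; the function name and parameters are illustrative.

#include <NvInfer.h>
#include <cuda_runtime.h>
#include <chrono>

// Time `iters` inferences on an already-loaded engine.
// `buffers` holds a device pointer for every engine binding.
double timeInference(nvinfer1::IExecutionContext* context,
                     void** buffers, cudaStream_t stream, int iters) {
    // Warm-up run so one-time CUDA initialisation is not measured.
    context->enqueueV2(buffers, stream, nullptr);
    cudaStreamSynchronize(stream);

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i)
        context->enqueueV2(buffers, stream, nullptr);
    cudaStreamSynchronize(stream);  // wait for all queued work
    auto stop = std::chrono::steady_clock::now();

    std::chrono::duration<double, std::milli> total = stop - start;
    return total.count() / iters;  // mean latency in milliseconds
}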

tkDNN was only used to generate TensorRT engines on the Jetson AGX. Engine generation was also attempted on the Jetson Nano, but a problem occurred during the process that caused the smaller models (YOLOv4-tiny, MobileNet) to fail to generate, and the larger model (YOLOv4) failed to build because the Jetson Nano ran out of memory.