Run TensorFlow models on the Jetson Nano with TensorRT

Object Detection Prediction Example

Run TensorFlow models on the Jetson Nano by converting them to the TensorRT format.

TensorFlow models can be converted to TensorRT using TF-TRT.

TensorFlow™ integration with TensorRT™ (TF-TRT) optimizes and executes compatible subgraphs, allowing TensorFlow to execute the remaining graph. While you can still use TensorFlow's wide and flexible feature set, TensorRT will parse the model and apply optimizations to the portions of the graph wherever possible. - https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html

Installation

If you are using JetPack as your OS, TensorRT comes preinstalled. NVIDIA's TensorFlow containers are also built with TensorRT enabled, which means TF-TRT is part of the TensorFlow binary inside the container and can be used out of the box. A quick way to check that both pieces are in place is shown below.
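A minimal sanity-check sketch, assuming JetPack's Python bindings for TensorRT are installed alongside your TensorFlow build:

# the tensorrt package ships with JetPack's Python bindings
import tensorrt
print('TensorRT version:', tensorrt.__version__)

# TF-TRT is part of the TensorFlow binary, so this import should succeed
from tensorflow.python.compiler.tensorrt import trt_convert as trt
print('TF-TRT converter module available')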

Using TF-TRT

TensorFlow 1.x

SavedModel format:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# point the converter at the SavedModel, convert, and save the optimized model
converter = trt.TrtGraphConverter(input_saved_model_dir=input_saved_model_dir)
converter.convert()
converter.save(output_saved_model_dir)
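The converted SavedModel can then be loaded into a session for inference. A minimal sketch; the tensor names 'input:0' and 'logits:0' are hypothetical and must match your model's actual signature:

import numpy as np
import tensorflow as tf

with tf.Session() as sess:
    # load the TF-TRT converted SavedModel into the session
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], output_saved_model_dir)
    # placeholder input; replace with a real preprocessed batch
    input_data = np.random.rand(1, 224, 224, 3).astype(np.float32)
    output = sess.run('logits:0', feed_dict={'input:0': input_data})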

Frozen graph:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# nodes_blacklist marks the output nodes that must not be folded into TensorRT engines
converter = trt.TrtGraphConverter(
    input_graph_def=frozen_graph,
    nodes_blacklist=['logits', 'classes'])
frozen_graph = converter.convert()
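The returned graph definition can be imported into a fresh graph and run directly. A minimal sketch, again with hypothetical tensor names:

import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    # import the TensorRT-optimized graph definition
    tf.import_graph_def(frozen_graph, name='')

with tf.Session(graph=graph) as sess:
    input_data = np.random.rand(1, 224, 224, 3).astype(np.float32)  # placeholder
    logits = sess.run('logits:0', feed_dict={'input:0': input_data})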

TensorFlow 2.x

SavedModel format:

from tensorflow.python.compiler.tensorrt import trt_convert as trt
converter = trt.TrtGraphConverterV2(input_saved_model_dir=input_saved_model_dir)
converter.convert()
converter.save(output_saved_model_dir)
Note: Converting frozen graphs is no longer supported in TensorFlow 2.0.
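In TensorFlow 2.x, the converted SavedModel is loaded and called through its signature. A minimal sketch; the 'serving_default' signature key and the input shape are assumptions that must match your model:

import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants

# load the converted model and grab its inference signature
saved_model_loaded = tf.saved_model.load(
    output_saved_model_dir, tags=[tag_constants.SERVING])
infer = saved_model_loaded.signatures['serving_default']

x = tf.constant(np.random.rand(1, 224, 224, 3).astype(np.float32))  # placeholder
output = infer(x)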

For more information, check out NVIDIA's TF-TRT User Guide.

Image Classification / Object Detection Example

NVIDIA also provides a repository containing scripts and documentation to use TensorFlow image classification and object detection models on NVIDIA Jetson. Full credit goes to the authors of the repository.

Installation (taken from the repository)

The installation has five steps. However, you should already have everything from steps 1-3 installed, so you can skip those steps.

1. Flash your Jetson TX2 with JetPack 3.2 (including TensorRT).

2. Install miscellaneous dependencies on Jetson

sudo apt-get install python-pip python-matplotlib python-pil

3. Install TensorFlow 1.7+ (with TensorRT support). Download the pre-built pip wheel and install it using pip.

pip install tensorflow-1.8.0-cp27-cp27mu-linux_aarch64.whl --user

or, if you're using Python 3:

pip3 install tensorflow-1.8.0-cp35-cp35m-linux_aarch64.whl --user

4. Clone this repository

git clone --recursive https://github.com/NVIDIA-Jetson/tf_trt_models.git
cd tf_trt_models

5. Run the installation script

./install.sh

or, if you want to specify a Python interpreter:

./install.sh python3

Image Classification

Figure 1: Classification example

Download pre-trained model

The repository provides you with a function that can be used to download a pretrained model from the TensorFlow model repository.

from tf_trt_models.classification import download_classification_checkpoint

checkpoint_path = download_classification_checkpoint('inception_v2')

You can also manually download a model from the slim folder.

Build frozen graph

from tf_trt_models.classification import build_classification_graph

frozen_graph, input_names, output_names = build_classification_graph(
    model='inception_v2',
    checkpoint=checkpoint_path,
    num_classes=1001
)

Optimize the graph with TensorRT

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # 32 MB of workspace for TensorRT
    precision_mode='FP16',             # use half precision on the Jetson's GPU
    minimum_segment_size=50            # only convert subgraphs of at least 50 nodes
)
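A minimal inference sketch for the optimized graph; it assumes the input_names and output_names returned by build_classification_graph and a 224x224 input, which is what Inception v2 expects:

import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(trt_graph, name='')

with tf.Session(graph=graph) as sess:
    # placeholder input; replace with a real preprocessed image batch
    image = np.random.rand(1, 224, 224, 3).astype(np.float32)
    scores = sess.run(
        graph.get_tensor_by_name(output_names[0] + ':0'),
        feed_dict={graph.get_tensor_by_name(input_names[0] + ':0'): image})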

Complete notebook

You can find a comprehensive example under examples/classification/classification.ipynb.

Object Detection

Figure 2: Object detection example

Download pre-trained model

from tf_trt_models.detection import download_detection_model

config_path, checkpoint_path = download_detection_model('ssd_inception_v2_coco')

Build frozen graph

from tf_trt_models.detection import build_detection_graph

frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path
)

Optimize the graph with TensorRT

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)
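A minimal inference sketch for the optimized detection graph; the output tensor names follow the TensorFlow Object Detection API convention, and the 300x300 uint8 input matches ssd_inception_v2_coco, both of which are assumptions to verify against your model:

import numpy as np
import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(trt_graph, name='')

with tf.Session(graph=graph) as sess:
    # placeholder image batch; replace with a real image resized to 300x300
    image = np.random.randint(0, 255, (1, 300, 300, 3), dtype=np.uint8)
    boxes, scores, classes = sess.run(
        ['detection_boxes:0', 'detection_scores:0', 'detection_classes:0'],
        feed_dict={input_names[0] + ':0': image})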

Complete notebook

You can find a comprehensive example under examples/detection/detection.ipynb.

Custom model

The same process works for custom models. Just follow my "Creating your own object detector with the Tensorflow Object Detection API" guide until you finish training. Then use the resulting checkpoint to build the frozen graph and optimize it with TensorRT, as sketched below.
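A minimal sketch of that last step; the config and checkpoint paths are hypothetical and should point to your own training run:

from tf_trt_models.detection import build_detection_graph

frozen_graph, input_names, output_names = build_detection_graph(
    config='model_dir/pipeline.config',      # hypothetical path to your pipeline config
    checkpoint='model_dir/model.ckpt-50000'  # hypothetical path to your checkpoint
)

The resulting frozen_graph can then be passed to trt.create_inference_graph exactly as in the detection example above.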