How to run Coral Edge TPU inference using Python TensorFlow Lite API

1. Article purpose

This article describes how to run an inference on the STM32MP1 using a Google Coral Edge TPU device and the Python TensorFlow Lite API. It is an example based on an image classification application.

Information
There are many ways to achieve this result; this article provides a simple example. You are free to explore other methods that are better adapted to your development constraints.

2. Libedgetpu and TensorFlow Lite Python APIs

The Artificial Intelligence expansion package X-LINUX-AI comes with the TensorFlow Lite Python APIs and the libedgetpu library (which provides support for the Coral Edge TPU), rebuilt from source to be compatible with the embedded TensorFlow Lite runtime.

The next section uses a basic image classification example to show how to run inference on your models on the board using the Coral Edge TPU device.

3. Running an inference on the Coral Edge TPU using the TensorFlow Lite Python API

3.1. Installing prerequisites on the target

Warning
The software package is provided AS IS, and by downloading it, you agree to be bound by the terms of the software license agreement (SLA). The detailed content licenses can be found here.

After having configured the AI OpenSTLinux package, you can install the X-LINUX-AI components and the packages needed to run our example.
The main packages are Python NumPy[1], Python OpenCV[2], the Python TensorFlow Lite runtime[3] and libedgetpu:

 apt-get install python3-numpy python3-opencv python3-tensorflow-lite libedgetpu
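
A quick way to check that the Python packages are available on the target is to import them and print their versions. This is an optional sanity check, written as a minimal sketch and assuming the same module names as the classification script used later in this article.

# Optional sanity check: import the packages used by the classification script
import numpy as np
import cv2
import tflite_edgetpu_runtime.interpreter as tflite

print("NumPy  :", np.__version__)
print("OpenCV :", cv2.__version__)
print("TFLite Interpreter available:", hasattr(tflite, "Interpreter"))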

3.2. Preparing the workspace on the target

 cd /usr/local/ && mkdir -p workspace
 cd workspace && mkdir -p models testdata 

In this example, we use the mobilenet_v1_1.0_224_quant_edgetpu.tflite model to classify downloaded images, along with its labels file, both obtained from the Coral[4] website.

 wget https://github.com/google-coral/edgetpu/raw/master/test_data/mobilenet_v1_1.0_224_quant_edgetpu.tflite -O models/mobilenet_v1_1.0_224_quant_edgetpu.tflite
 wget https://github.com/google-coral/edgetpu/raw/master/test_data/imagenet_labels.txt -O models/labels.txt
 wget https://github.com/google-coral/edgetpu/raw/master/test_data/bird.bmp -O testdata/bird.bmp
Information
You can run your own model, but you must make sure that your .tflite model has been compiled for inference on the Coral Edge TPU. Refer first to Compile your custom model.

3.3. Running the inference

Copy the following Python script to a file named classify_on_stm32mp1.py. If you are already familiar with running inference on TensorFlow Lite models, you can use it directly; otherwise, refer to the explanations in the subsequent sections.

#!/usr/bin/python3
#
# Copyright (c) 2020 STMicroelectronics. All rights reserved.
#
# This software component is licensed by ST under BSD 3-Clause license,
# the "License"; You may not use this file except in compliance with the
# License. You may obtain a copy of the License at:
#                        opensource.org/licenses/BSD-3-Clause

import sys
import numpy as np
import tflite_edgetpu_runtime.interpreter as tflite
import time
import cv2

# Load the labels used to translate the output class index into a human-readable name
label_file = "/usr/local/workspace/models/labels.txt"
with open(label_file, 'r') as f:
    labels = [line.strip() for line in f.readlines()]

# Load the model and instantiate the interpreter with the Edge TPU delegate
model_file = "/usr/local/workspace/models/mobilenet_v1_1.0_224_quant_edgetpu.tflite"
interpreter = tflite.Interpreter(model_path=model_file,
                                 experimental_delegates=[tflite.load_delegate('libedgetpu-max.so.1.0')])
interpreter.allocate_tensors()
#Getting the model input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]
# Read the input image, convert it from BGR to RGB and resize it to the model input size
image = cv2.imread(sys.argv[1])
nn_img_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
nn_img_rgb_resized = cv2.resize(nn_img_rgb, (width, height))
input_data = np.expand_dims(nn_img_rgb_resized, axis=0)
interpreter.set_tensor(input_details[0]['index'], input_data)
start = time.perf_counter()
interpreter.invoke()
inference_time = time.perf_counter() - start
print("inference time:", inference_time)
results = np.squeeze(interpreter.get_tensor(output_details[0]['index']))
top_k = results.argsort()[-5:][::-1]
for i in top_k:
    print('{0:08.6f}'.format(float(results[i]*100/255.0))+":", labels[i])
print("\n")

Now that our Python script is ready for execution, we copy it to the board:

 scp path/to/your/script/classify_on_stm32mp1.py root@<board_ip_address>:/usr/local/workspace/

3.4. Running the inference from the board on the Coral Edge TPU

After booting the board and connecting to it from the host PC over SSH, we are ready to run the inference using the following commands:

 cd /usr/local/workspace
 python3 classify_on_stm32mp1.py testdata/<picture to classify>

4. Explanation of the parts of the script

4.1. Instantiating the TensorFlow Lite interpreter

We first load the labels from the label file by adding the following code lines:

label_file = "/usr/local/workspace/models/labels.txt"
with open(label_file, 'r') as f:
    labels = [line.strip() for line in f.readlines()]

We now load the model and feed it to the interpreter that we instantiate using the Interpreter API[5]. When creating the interpreter, we pass a TensorFlow Lite delegate: an API that hands off all or part of the graph execution to the Edge TPU accelerator hardware. After loading the Edge TPU library through the delegate, we allocate the tensors for the graph execution through our interpreter.

model_file = "/usr/local/workspace/models/mobilenet_v1_1.0_224_quant_edgetpu.tflite"
interpreter = tflite.Interpreter(model_path=model_file,
                                 experimental_delegates=[tflite.load_delegate('libedgetpu-max.so.1.0')])
interpreter.allocate_tensors()
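
If the Coral device is not plugged in, or if the delegate library cannot be found, load_delegate() raises a ValueError. The following sketch makes that failure explicit before instantiating the interpreter; it is an optional robustness addition, not part of the original script, and reuses the same module and library names as above (plus the sys module already imported at the top of the script).

# Optional: fail with a clear message if the Edge TPU delegate cannot be loaded
try:
    delegate = tflite.load_delegate('libedgetpu-max.so.1.0')
except ValueError:
    sys.exit("Could not load the Edge TPU delegate: check that the Coral device is plugged in and that libedgetpu is installed")
interpreter = tflite.Interpreter(model_path=model_file, experimental_delegates=[delegate])
interpreter.allocate_tensors()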

4.2. Getting the model details and processing the image

Now that the interpreter is ready to be fed with input images, we retrieve the model details so that each image can be adjusted to fit the model input.

#Getting the model input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]
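
To check what the model expects before feeding it, you can print the input tensor description. For this quantized MobileNet model, the shape should be [1, 224, 224, 3] with a uint8 data type; the short sketch below only reuses the input_details variable obtained above.

# Inspect the model input tensor: shape, data type and quantization parameters
print("input shape :", input_details[0]['shape'])
print("input dtype :", input_details[0]['dtype'])
print("quantization:", input_details[0]['quantization'])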

We then read the image passed as a parameter from our testdata directory. The image is converted from BGR to RGB encoding, resized to fit the model input size, and expanded with an extra batch dimension.

image = cv2.imread(sys.argv[1])
nn_img_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
nn_img_rgb_resized = cv2.resize(nn_img_rgb, (width, height))
input_data = np.expand_dims(nn_img_rgb_resized, axis=0)
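
Note that cv2.imread() returns a uint8 array, which already matches the uint8 input of this quantized model, so no scaling is needed here. If you reuse this script with a float model, the input data has to be cast and normalized first. The sketch below assumes the usual [-1, 1] MobileNet normalization; other float models may expect a different range.

# Only needed for a float (non-quantized) model: cast to float32 and
# normalize pixel values from [0, 255] to [-1, 1] (assumed convention)
if input_details[0]['dtype'] == np.float32:
    input_data = (np.float32(input_data) - 127.5) / 127.5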

4.3. Invoking the interpreter and displaying results

Now that our input data has been processed to fit the model input size, we feed the image to the interpreter input and launch the inference. We use the time library to measure the inference duration, which gives a good indication of the Edge TPU performance compared to that of the CPU.

interpreter.set_tensor(input_details[0]['index'], input_data)
start = time.perf_counter()
interpreter.invoke()
inference_time = time.perf_counter() - start
print("inference time:", inference_time)
results = np.squeeze(interpreter.get_tensor(output_details[0]['index']))
top_k = results.argsort()[-5:][::-1]
for i in top_k:
    print('{0:08.6f}'.format(float(results[i]*100/255.0))+":", labels[i])
print("\n")

5. References