How to run Coral Edge TPU inference using Python TensorFlow Lite API


1 Article purpose

This article describes how to run an inference on the STM32MP1 using a Google Coral EdgeTPU device and the Python TensorFlow Lite API. It is an example based on an image classification application.

Information: There are many ways to achieve this result; this article provides a simple example. You are free to explore other methods that are better adapted to your development constraints.

2 Difference between libedgetpu and TensorFlow Lite Python APIs

The Artificial Intelligence expansion package X-LINUX-AI comes with two versions of TensorFlow Lite.

The first runtime is based on TensorFlow Lite[1] 2.3.1, whereas the second runtime is based on the TensorFlow Lite runtime[2] 1.12.1 and is dedicated to Coral EdgeTPU use.

This is because TensorFlow Lite 2.3.1 does not yet support the Coral EdgeTPU runtime. The following figure explains the software structure.

[Figure: TensorFlow Lite runtime package structure]

If you wish to use TensorFlow Lite 2.3.1, you have to import the following library in your Python script:

import tflite_runtime.interpreter as tflite

If you wish to run inferences on your Coral EdgeTPU device, you need to make the following call in your Python script:

import tflite_edgetpu_runtime.interpreter as tflite

In addition, the TensorFlow Lite Python APIs come with libedgetpu (providing the support of the Coral Edge TPU), which has been rebuilt from source to be compatible with the embedded TensorFlow Lite runtime.
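Since both runtimes expose the same Interpreter API, a script can simply pick whichever one is installed. The following lines are a minimal sketch (the module names are the ones given above; the fallback logic itself is only an illustration, not part of the X-LINUX-AI package):

# Prefer the Edge TPU flavoured runtime when it is available, otherwise
# fall back to the standard TensorFlow Lite runtime (CPU execution only)
try:
    import tflite_edgetpu_runtime.interpreter as tflite
    EDGETPU_AVAILABLE = True
except ImportError:
    import tflite_runtime.interpreter as tflite
    EDGETPU_AVAILABLE = False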

In the next section we explore, with a basic image-classification example, how to run inference on your models on the board using the Coral EdgeTPU device.

3 Running an inference on Coral EdgeTPU using the TensorFlow Lite Python API

3.1 Installing prerequisites on the target

After having configured the AI OpenSTLinux package repository, install the X-LINUX-AI components and the packages needed to run this example.
The main packages are Python NumPy[3], Python OpenCV[4] 4.1.x, the Python TensorFlow Lite Edge TPU runtime[2] 1.12.1 and libedgetpu.

Warning: The software package is provided AS IS, and by downloading it, you agree to be bound to the terms of the software license agreement (SLA). The detailed content licenses can be found here.

 apt-get install python3-numpy python3-opencv
 apt-get install python3-tensorflow-lite-edgetpu libedgetpu

3.2 Preparing the workspace on the target

Before running the inference, make sure that your .tflite model is compiled for inferencing on Coral EdgeTPU. Refer first to Compile your custom model, then send the model to the board.

 cd /usr/local/ && mkdir -p workspace
 cd workspace && mkdir -p models testdata 

After preparing the workspace on the target and sending the compiled model to the models directory in the workspace, we send the associated label file and input images to the workspace so that the inference can be executed. Any number of .jpeg, .jpg and .png pictures of any size can be added; image processing operations are used later to adapt the picture sizes to the model input size. In this example, we use the mobilenet_v1_1.0_224_quant_edgetpu.tflite model, downloaded together with its labels file from the Coral[5] website using the following commands:

 wget https://github.com/google-coral/edgetpu/raw/master/test_data/mobilenet_v1_1.0_224_quant_edgetpu.tflite -O models/mobilenet_v1_1.0_224_quant_edgetpu.tflite
 wget https://github.com/google-coral/edgetpu/raw/master/test_data/imagenet_labels.txt -O models/labels.txt

If the files were prepared on the host PC instead (for example your own compiled model and test pictures), send them to the board via scp using the following commands:

 scp path/to/your/compiled/model/mobilenet_v1_1.0_224_quant_edgetpu.tflite root@<board_ip_address>:/usr/local/workspace/models/
 scp path/to/your/labels.txt root@<board_ip_address>:/usr/local/workspace/models/
 scp path/to/your/pictures root@<board_ip_address>:/usr/local/workspace/testdata/

Now that our workspace is ready with a compiled model file, a label file and some sample pictures, we can run an inference using the Python API. To do this, we create a Python script on the host PC, transfer it to the board via scp and run it on the target. This is a very basic example that classifies images by executing an inference on the Coral Edge TPU.

 gedit classify_on_stm32mp1.py

A sample test image can also be downloaded directly on the board:

 wget https://github.com/google-coral/edgetpu/raw/master/test_data/bird.bmp -O testdata/bird.bmp

Information: You can run your own model, but you have to make sure that your .tflite model is compiled for inferencing on Coral EdgeTPU. Refer first to Compile your custom model.

3.3 Running the inference

If you are already familiar with inferencing TensorFlow Lite models, use the following Python script directly. Otherwise, copy it to a file named classify_on_stm32mp1.py and refer to the explanations in the subsequent sections. Here is a simple Python script example that executes a neural network inference on the Google Coral Edge TPU.

#!/usr/bin/python3
#
# Copyright (c) 2020 STMicroelectronics. All rights reserved.
#
# This software component is licensed by ST under BSD 3-Clause license,
# the "License"; You may not use this file except in compliance with the
# License. You may obtain a copy of the License at:
#                        opensource.org/licenses/BSD-3-Clause

import sys
import numpy as np
import tflite_edgetpu_runtime.interpreter as tflite
import time
import cv2

label_file = "/usr/local/workspace/models/labels.txt"
with open(label_file, 'r') as f:
    labels = [line.strip() for line in f.readlines()]
model_file = "/usr/local/workspace/models/mobilenet_v1_1.0_224_quant_edgetpu.tflite"

#Create the interpreter and allocate tensors
interpreter = tflite.Interpreter(model_path=model_file, experimental_delegates=[tflite.load_delegate('libedgetpu-max.so.1.0')])
interpreter.allocate_tensors()

#Getting the model input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]

# Read the picture, convert it from BGR to RGB encoding,
# resize it to fit the size of the model input and expand
# its dimensions by one
image = cv2.imread(sys.argv[1])
nn_img_rgb = cv2.cvtColor(np.array(image), cv2.COLOR_BGR2RGB)
nn_img_rgb_resized = cv2.resize(nn_img_rgb, (width, height))
input_data = np.expand_dims(nn_img_rgb_resized, axis=0)

# Set the input data and execute the first inference (this one can take
# longer since the model is being loaded into the Coral Edge TPU RAM)
interpreter.set_tensor(input_details[0]['index'], input_data)
start = time.perf_counter()
interpreter.invoke()
inference_time = time.perf_counter() - start
print("1st inference:", inference_time, "s")

# Execute the second inference and measure the inference duration
start = time.perf_counter()
interpreter.invoke()
inference_time = time.perf_counter() - start
print("2nd inference:", inference_time, "s")

#Print the results
results = np.squeeze(interpreter.get_tensor(output_details[0]['index']))
top_k = results.argsort()[-5:][::-1]
for i in top_k:
    print('{0:08.6f}'.format(float(results[i]*100/255.0))+":", labels[i])
print("\n")

Now that our Python script is ready for execution, we copy it to the board:

 scp path/to/your/script/classify_on_stm32mp1.py root@<board_ip_address>:/usr/local/workspace/

3.4 Running the inference from the board on the Coral Edge TPU

After booting the board and connecting to it from the host PC through SSH, we are ready to run the inference using the following commands:

 cd /usr/local/workspace
 python3 classify_on_stm32mp1.py testdata/<picture to classify>

Using the AI hardware accelerator speeds up the inferencing.
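If you want to classify all the pictures stored in the testdata directory in one go, the same steps can be wrapped in a loop. The sketch below is only an illustration (the classify_all.py name is hypothetical); it reuses the model, label and delegate library names introduced above:

#!/usr/bin/python3
# Hypothetical classify_all.py: classify every picture found in the
# workspace testdata directory using the same steps as classify_on_stm32mp1.py
import glob
import numpy as np
import cv2
import tflite_edgetpu_runtime.interpreter as tflite

model_file = "/usr/local/workspace/models/mobilenet_v1_1.0_224_quant_edgetpu.tflite"
label_file = "/usr/local/workspace/models/labels.txt"

with open(label_file, 'r') as f:
    labels = [line.strip() for line in f.readlines()]

interpreter = tflite.Interpreter(model_path=model_file,
                                 experimental_delegates=[tflite.load_delegate('libedgetpu-max.so.1.0')])
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]

for picture in sorted(glob.glob("/usr/local/workspace/testdata/*")):
    image = cv2.imread(picture)
    if image is None:
        # Skip files that OpenCV cannot decode
        continue
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    input_data = np.expand_dims(cv2.resize(rgb, (width, height)), axis=0)
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    results = np.squeeze(interpreter.get_tensor(output_details[0]['index']))
    best = int(results.argmax())
    print("{}: {} ({:.2f}%)".format(picture, labels[best], results[best] * 100 / 255.0))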

4 Explanation of the parts of the script

4.1 Instantiating the TensorFlow Lite interpreter

We first load the labels from the label file by adding the following code lines:

label_file = "/usr/local/workspace/models/labels.txt"
with open(label_file, 'r') as f:
    labels = [line.strip() for line in f.readlines()]

We now load the model and feed it to the interpreter that we instantiate using the interpreter API[6]. In this interpreter we call a TensorFlow Lite delegate: an API that delegates all or part of the graph execution to the Edge TPU accelerator hardware. After calling the Edge TPU library inside the delegate, we allocate the tensors for the graph execution through our interpreter.

model_file = "/usr/local/workspace/models/mobilenet_v1_1.0_224_quant_edgetpu.tflite"
interpreter = tflite.Interpreter(model_path=model_file, experimental_delegates=[tflite.load_delegate('libedgetpu-max.so.1.0')])
interpreter.allocate_tensors()
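Note that tflite.load_delegate() raises a ValueError when the delegate library cannot be loaded, which typically happens when the Coral Edge TPU is not plugged in or libedgetpu is not installed. A small defensive sketch (same library name as above; the error message wording is ours):

try:
    delegate = tflite.load_delegate('libedgetpu-max.so.1.0')
except ValueError:
    # Typical causes: the Coral Edge TPU is not connected, or the
    # libedgetpu package is missing on the target
    raise SystemExit("Could not load the Edge TPU delegate: check that the "
                     "Coral device is connected and libedgetpu is installed")
interpreter = tflite.Interpreter(model_path=model_file, experimental_delegates=[delegate])
interpreter.allocate_tensors()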

4.2 Getting the model details and processing the image

Now that the interpreter is ready to be fed with the input images, it is important to get the details of the model in order to adjust the image to fit into the model.

# Getting the model input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]

Now we point to our image directory testdata and pass the image as a parameter. The images are converted from BGR to RGB encoding, resized to fit the size of the model input, and have their dimensions expanded by one.

image = cv2.imread(sys.argv[1])
nn_img_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
nn_img_rgb_resized = cv2.resize(nn_img_rgb, (width, height))
input_data = np.expand_dims(nn_img_rgb_resized, axis=0)
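The model used in this example is quantized, so its input tensor expects uint8 data with shape [1, height, width, 3], which is exactly what cv2.imread() followed by the resize produces. The short sketch below, continuing the excerpt above, checks this before feeding the image; the float32 branch is only a hypothetical illustration for non-quantized models:

# Inspect the input tensor expected by the model
print("input shape:", input_details[0]['shape'])   # e.g. [1 224 224 3]
print("input dtype:", input_details[0]['dtype'])   # uint8 for this quantized model

if input_details[0]['dtype'] == np.uint8:
    # Quantized model: the uint8 image can be fed directly
    input_data = np.expand_dims(nn_img_rgb_resized, axis=0)
else:
    # Hypothetical float32 model: normalize the pixel values to [0, 1]
    input_data = np.expand_dims(nn_img_rgb_resized.astype(np.float32) / 255.0, axis=0)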

4.3 Invoking the interpreter and displaying results

Now that our input data has been processed to fit in the model input size, we feed the image to the interpreter input and launch the inference. We use the time library to record the inference duration, which gives a good indication of the Edge TPU performance compared to that of the CPU.

interpreter.set_tensor(input_details[0]['index'], input_data)
start = time.perf_counter()
interpreter.invoke()
inference_time = time.perf_counter() - start
print("inference time:", inference_time, "s")
results = np.squeeze(interpreter.get_tensor(output_details[0]['index']))
top_k = results.argsort()[-5:][::-1]
for i in top_k:
    print('{0:08.6f}'.format(float(results[i]*100/255.0))+":", labels[i])
print("\n")
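The script converts the raw uint8 scores into percentages by dividing by 255. A slightly more general alternative, sketched below as a continuation of the excerpt above, is to dequantize the output with the scale and zero point reported by the interpreter, which also works for models with a different output quantization:

# Dequantize the output using the quantization parameters stored in the model
scale, zero_point = output_details[0]['quantization']
if scale:   # scale is 0.0 when the output tensor is not quantized
    scores = scale * (results.astype(np.float32) - zero_point)
else:
    scores = results
top_k = scores.argsort()[-5:][::-1]
for i in top_k:
    print('{0:.6f}'.format(float(scores[i])) + ":", labels[i])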

Running the script on the bird.bmp test image gives an output similar to the following:

 testdata/bird.bmp
 inference time: 0.1181572640016384 s
 88.627451: 20  chickadee
 4.313725: 19  magpie
 3.137255: 18  jay
 2.745098: 21  water ouzel, dipper
 1.176471: 14  junco, snowbird

Information: The first inference may take longer since the model is being loaded into the Coral Edge TPU RAM.

5 References

1. TensorFlow Lite - https://www.tensorflow.org/lite
2. TensorFlow Lite runtime - https://github.com/google-coral/edgetpu/blob/master/WORKSPACE
3. NumPy - https://numpy.org/
4. OpenCV - https://opencv.org/
5. Coral AI - https://coral.ai/models
6. Interpreter Class - https://www.tensorflow.org/api_docs/python/tf/lite/Interpreter

