Semantic segmentation

Revision as of 18:11, 2 July 2024 by Registered User
Applicable for STM32MP25x lines

This article explains how to use the stai_mpu API for semantic segmentation applications supporting the OpenVX [1] back-end.

1. Description[edit source]

The semantic segmentation neural network model categorizes each pixel of an image into a class or object, with the final objective of producing a dense pixel-wise segmentation map in which each pixel is assigned to a specific class or object. DeepLabV3 is a state-of-the-art deep learning model for semantic image segmentation, where the goal is to assign semantic labels (such as person, dog, or cat) to every pixel in the input image.
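The dense segmentation map described above is typically obtained by taking, for each pixel, the class with the highest score in the model output. A minimal sketch with NumPy (using random values in place of real model output, and a hypothetical three-class palette) could look like:

```python
import numpy as np

# Hypothetical model output: per-pixel class scores of shape
# (height, width, num_classes), as produced by a DeepLabV3-style head.
h, w, num_classes = 4, 4, 3
rng = np.random.default_rng(0)
scores = rng.random((h, w, num_classes))

# The dense segmentation map assigns each pixel its most likely class.
seg_map = np.argmax(scores, axis=-1)   # shape (h, w), values in [0, num_classes)

# A simple palette maps class indices to RGB display colors.
palette = np.array([[0, 0, 0],         # background
                    [255, 0, 0],       # e.g. person
                    [0, 255, 0]])      # e.g. dog
overlay = palette[seg_map]             # shape (h, w, 3), ready to be blended

print(seg_map.shape, overlay.shape)
```

In the real application the overlay is blended with the camera frame before being displayed in the GTK user interface.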

Semantic segmentation application

The application enables three main features:

  • A camera streaming preview implemented using GStreamer. All the image processing is done on the ISP level using the DCMIPP main pipe.
  • An NN inference based on the camera inputs (or test data pictures) is run on the NPU using the OpenVX back-end.
  • A user interface implemented using Python™ GTK, where the NN inference results are drawn and displayed.

The performance depends on the NN model used and on the execution engine. Using the NPU/GPU hardware acceleration for NN model inference gives better performance. In this case, the CPU is dedicated to the camera stream and is not involved in the NN inference. For a software inference, the CPU resources are shared between the handling of the camera stream and the NN inference.

The model used with this application is the DeepLabV3 downloaded from the TensorFlow™ Lite Hub[2].

Info white.png Information
For this application, a TensorFlow™ Lite per-tensor asymmetric quantized model is used, which is accelerated using the neural processing unit (NPU). The model has then been converted to NBG format using the ST Edge AI tool. For more information about this tool, refer to the dedicated article.

2. Installation[edit source]

2.1. Install from the OpenSTLinux AI package repository[edit source]

Warning white.png Warning
The software package is provided AS IS, and by downloading it, you agree to be bound to the terms of the software license agreement (SLA0048). The detailed content licenses can be found here.

After having configured the AI OpenSTLinux package repository, you can install the X-LINUX-AI components for the semantic segmentation application:

2.1.1. Install on STM32MP2x board[edit source]

The OpenVX application will be installed to take advantage of the neural processing unit (NPU) and graphics processing unit (GPU). It is only available in Python and on STM32MP2x boards.

  • To install this application, please use the following command:
 x-linux-ai -i stai-mpu-semantic-segmentation-python-ovx
Warning DB.png Important
The semantic segmentation application is only provided for STM32MP2x boards as a Python application based on the OpenVX back-end.


  • Then, restart the demo launcher:
 systemctl restart weston-graphical-session.service

2.2. Source code location[edit source]

  • in the OpenSTLinux Distribution with X-LINUX-AI Expansion Package:
<Distribution Package installation directory>/layers/meta-st/meta-st-x-linux-ai/recipes-samples/semantic-segmentation/files/stai_mpu
  • on the target:
/usr/local/x-linux-ai/semantic-segmentation/stai_mpu_semantic_segmentation.py
  • on GitHub:
recipes-samples/semantic-segmentation/files/stai_mpu


3. How to use the application[edit source]

3.1. Launching via the demo launcher[edit source]

You can click on the icon to run the Python OpenVX application installed on your STM32MP2x board.

Demo launcher

3.2. Executing with the command line[edit source]

The semantic segmentation Python application is located in the userfs partition:

/usr/local/x-linux-ai/semantic-segmentation/stai_mpu_semantic_segmentation.py

It accepts the following input parameters:

 
usage: stai_mpu_semantic_segmentation.py [-h] [-m MODEL_FILE] [-i IMAGE] 
                                    [-v VIDEO_DEVICE] [--conf_threshold CONF_THRESHOLD] [--iou_threshold IOU_THRESHOLD] 
                                    [--frame_width FRAME_WIDTH] [--frame_height FRAME_HEIGHT] [--framerate FRAMERATE] 
                                    [--input_mean INPUT_MEAN] [--input_std INPUT_STD] [--normalize NORMALIZE] 
                                    [--validation] [--val_run VAL_RUN]
options:
  -h, --help  show this help message and exit
  -m MODEL_FILE, --model_file MODEL_FILE Neural network model to be executed
  -i IMAGE, --image IMAGE image directory with images to be segmented
  -v VIDEO_DEVICE, --video_device VIDEO_DEVICE video device ex: video0
  --conf_threshold CONF_THRESHOLD Confidence threshold
  --iou_threshold IOU_THRESHOLD IoU threshold, used to compute NMS
  --frame_width FRAME_WIDTH width of the camera frame (default is 640)
  --frame_height FRAME_HEIGHT height of the camera frame (default is 480)
  --framerate FRAMERATE framerate of the camera (default is 15fps)
  --input_mean INPUT_MEAN input mean
  --input_std INPUT_STD input standard deviation
  --normalize NORMALIZE enable or disable input normalization
  --validation enable the validation mode
  --val_run VAL_RUN set the number of draws in the validation mode
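The input_mean, input_std, and normalize parameters control how camera frames are preprocessed before inference. A minimal sketch of this step (assuming a floating-point input tensor; the function below is illustrative, not the application's actual code) could be:

```python
import numpy as np

def preprocess(frame: np.ndarray, input_mean: float = 127.5,
               input_std: float = 127.5, normalize: bool = True) -> np.ndarray:
    """Scale an 8-bit RGB frame to the range expected by the model.

    With the defaults, pixel values in [0, 255] are mapped to [-1, 1].
    For a quantized model, normalization is typically skipped.
    """
    data = frame.astype(np.float32)
    if normalize:
        data = (data - input_mean) / input_std
    # Add the batch dimension expected by most NN runtimes.
    return np.expand_dims(data, axis=0)

# Example: a white 257x257 frame, matching the DeepLabV3 input resolution.
frame = np.full((257, 257, 3), 255, dtype=np.uint8)
tensor = preprocess(frame)
print(tensor.shape, tensor.max())
```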

4. Testing with DeepLabV3 on STM32MP2x[edit source]

The model used for testing is deeplabv3_257_int8_per_tensor.nb.


To ease the launch of the application, two shell scripts are available for the Python application on the board:

  • launch semantic segmentation based on camera frame inputs:
/usr/local/x-linux-ai/semantic-segmentation/launch_python_semantic_segmentation.sh
Info white.png Information
In camera mode, you need to click on the screen, which will freeze the last frame and use it for DeepLabV3 neural network inference. The image with the segmentation will be displayed. To display the camera preview again, click on the screen a second time.
  • launch semantic segmentation based on the pictures located in the /usr/local/demo-ai/semantic-segmentation/models/deeplabv3/testdata directory:
/usr/local/x-linux-ai/semantic-segmentation/launch_python_semantic_segmentation_testdata.sh


Warning DB.png Important
Note that you need to populate the testdata directory with your own data sets. The pictures are then randomly read from the testdata directory.

5. Going further[edit source]

The two shell scripts described above can select the framework automatically, depending on the model provided, or run with a framework specified explicitly. To run the application with a specific framework, the models for each framework must be available in the /usr/local/x-linux-ai/semantic-segmentation/models/deeplabv3/ directory. Then, specify the framework as an argument of the launch scripts as follows.
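Selecting a framework from the model provided amounts to matching the model file extension against the supported back-ends. A hypothetical sketch of this selection logic (the actual launch scripts may differ) could be:

```python
from pathlib import Path

# Hypothetical mapping from model file extension to inference back-end;
# on STM32MP2x only the NBG (.nb) back-end runs on the NPU, while
# TFLite and ONNX models run on the CPU.
FRAMEWORKS = {".nb": "nbg", ".tflite": "tflite", ".onnx": "onnx"}

def select_framework(model_path: str) -> str:
    """Return the back-end name for a given model file."""
    suffix = Path(model_path).suffix.lower()
    try:
        return FRAMEWORKS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported model format: {suffix}")

print(select_framework("deeplabv3_257_int8_per_tensor.nb"))
```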

Info white.png Information
For applications like semantic segmentation that target STM32MP2x boards with NPU acceleration, only an NBG model is provided as an example. However, you can still add TFLite or ONNX models yourself and run the application on the CPU.
  • Run semantic segmentation based on camera input with the chosen framework. The available framework option is: nbg.
/usr/local/x-linux-ai/semantic-segmentation/launch_python_semantic_segmentation.sh nbg
Info white.png Information
In camera mode, you need to click on the screen, which will freeze the last frame and use it for DeepLabV3 neural network inference. The image with the segmentation will be displayed. To display the camera preview again, click on the screen a second time.
  • Run semantic segmentation based on the pictures located in the /usr/local/demo-ai/semantic-segmentation/models/deeplabv3/testdata directory with the chosen framework. The available framework option is: nbg.
/usr/local/x-linux-ai/semantic-segmentation/launch_python_semantic_segmentation_testdata.sh nbg
Warning DB.png Important
Note that you need to populate the testdata directory with your own data sets. The pictures are then randomly read from the testdata directory.

6. References[edit source]