How to automate code generation and validation with X-CUBE-AI CLI


This article describes how to automate machine learning model code generation and validation through the X-CUBE-AI Command Line Interface (CLI). The example is provided for Windows using a batch script; however, it can easily be adapted to other operating systems or to Python. More information on the CLI can be found in the embedded documentation. For Python, refer to the article How to run locally a c-model.

Information
  • X-CUBE-AI is an STM32Cube expansion package that generates optimized C code for running neural network inference on STM32 microcontrollers. It is delivered under the Mix Ultimate Liberty+OSS+3rd-party V1 software license agreement (SLA0048).

1. Requirements & installations

  • X-CUBE-AI latest version (latest tested v7.2)
  • STM32CubeIDE latest version (latest tested v1.10.1)
  • STM32CubeProgrammer latest version (latest tested v1.10.1)
  • An STM32 evaluation board; for this example, the NUCLEO-H723ZG
  • A model (Keras .h5, TensorFlow Lite .tflite, or ONNX). For this example, we use a quantized MobileNet v1 0.25 with an image input of 128x128x3.

You can download the model: mobilenet_v1_0.25_128_quantized.tflite.

Once X-CUBE-AI and STM32CubeIDE are installed, note the installation paths. By default on Windows they are usually the following (replace username with your Windows user account name and adapt to your tool versions); they are reused in the path variables sketched after this list:

  • For X-CUBE-AI: C:\Users\username\STM32Cube\Repository\Packs\STMicroelectronics\X-CUBE-AI\7.2.0
  • For STM32CubeIDE: C:\ST\STM32CubeIDE_1.10.1\STM32CubeIDE\
  • For Arm GCC installed through STM32CubeIDE: C:\ST\STM32CubeIDE_1.10.1\STM32CubeIDE\plugins\com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.win32_1.0.0.202111181127\tools\bin
  • For STM32CubeProgrammer: C:\Program Files\STMicroelectronics\STM32Cube\STM32CubeProgrammer\bin
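
For convenience, these paths can be grouped at the top of the batch script as environment variables. The sketch below is only an illustration: the variable names are arbitrary, and it assumes that the stm32ai executable sits under Utilities\windows inside the X-CUBE-AI pack (check this against your installation).

  rem Tool installation paths (adapt to your user name and tool versions)
  set CUBE_AI_PATH=C:\Users\username\STM32Cube\Repository\Packs\STMicroelectronics\X-CUBE-AI\7.2.0
  set CUBE_IDE_PATH=C:\ST\STM32CubeIDE_1.10.1\STM32CubeIDE
  set ARM_GCC_PATH=%CUBE_IDE_PATH%\plugins\com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.win32_1.0.0.202111181127\tools\bin
  set CUBE_PROG_PATH=C:\Program Files\STMicroelectronics\STM32Cube\STM32CubeProgrammer\bin
  rem Make stm32ai, the Arm GCC toolchain, and STM32CubeProgrammer reachable from the script
  set PATH=%CUBE_AI_PATH%\Utilities\windows;%ARM_GCC_PATH%;%CUBE_PROG_PATH%;%PATH%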

2. Validation application generation

To generate the initial validation project for the targeted board, we use STM32CubeMX with the X-CUBE-AI plugin.

2.1. Create a new project

Open STM32CubeMX and start the project using the board selector:

STM32CubeMX start project

Select the board to use, in our case the NUCLEO-H723ZG, and create a project without initializing all peripherals with their default Mode:

STM32CubeMX initialization

2.2. Add X-CUBE-AI software pack

In the X-CUBE-AI software pack, select the Core component and the Validation application:

STM32CubeAI software pack

Click on X-CUBE-AI software pack:

STM32CubeAI software pack

If the default peripheral parameters are not set for the best performance, the system warns you. Select 'Yes' to make sure that the maximal frequency is used.

STM32CubeAI best performance

X-CUBE-AI configures the default parameters for the best performance and also configures the UART used to report the performance measurements.

2.3. Generate the validation application

Load a model; in our example, mobilenet_v1_0.25_128_quantized.tflite.

STM32CubeAI load model

For our example, we keep the default memory options (the default since version 7.2 of X-CUBE-AI), which allocate the input and output buffers inside the activations buffer.

STM32CubeAI options

You can then generate the corresponding validation application as an STM32CubeIDE project in a dedicated directory:

STM32CubeAI generate validation

Open the project in STM32CubeIDE and build it:

build validation
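
The build can also be run without the IDE graphical interface, using the STM32CubeIDE headless build. The sketch below relies on the Eclipse CDT headless build application that STM32CubeIDE embeds; the workspace path, project path, and project name validation_project are placeholders to adapt, and the console executable name (stm32cubeidec.exe) should be checked in your STM32CubeIDE installation.

  rem Headless (command line) build of the generated STM32CubeIDE project
  "%CUBE_IDE_PATH%\stm32cubeidec.exe" --launcher.suppressErrors -nosplash ^
    -application org.eclipse.cdt.managedbuilder.core.headlessbuild ^
    -data "C:\Project\workspace" -import "C:\Project\validation_project" ^
    -cleanBuild validation_project/Debug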

To check that the validation application is working as expected, first program the board (either through STM32CubeIDE or STM32CubeProgrammer), then reset it. Open the validation-on-target panel and either select the board COM port or leave the automatic detection. Do not select the automatic compilation and download option, as the board is already programmed with the right firmware.
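
The programming step itself can be scripted with the STM32CubeProgrammer CLI. This is only a sketch: it assumes an ST-LINK/SWD connection and an .elf file path matching your actual build output.

  rem Flash the validation firmware over ST-LINK (SWD), verify it, then reset the board
  "%CUBE_PROG_PATH%\STM32_Programmer_CLI.exe" -c port=SWD -w "C:\Project\validation_project\Debug\validation_project.elf" -v -rst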

STM32CubeAI validation on target

Press OK and check that the validation runs correctly.

STM32CubeAI validation done

3. Automatic validation

The batch script below uses the ST tools through their command line interfaces to automatically:

  • get the full memory footprint of a given model
  • get the inference time
  • get the validation metrics (validation done on random data)
Important

The script might need to be adapted depending on the targeted configuration, for instance if the input model is a quantized Keras model, or if specific options are selected, such as using external memory. Refer to the Command Line Interface section of the embedded documentation.

In the script below, the local working directory is "C:\Project", which must correspond to the directory of the initial validation application project. There are two modes, defined by the variable "mode":

  • The analyze mode (mode=analyze) only runs an analysis to get the main memory footprint contributors (that is, the weights and the activations buffer). In this mode there is no need to generate a validation application (section 2 can be skipped) and only X-CUBE-AI needs to be installed.
  • The validation mode (mode=validate, the default) runs a validation to get the full memory footprint (weights, activations, and runtime) as well as the inference time and the validation metrics.

In the script, a temporary directory "tmp" is created with two subdirectories, "stm32ai_ws" and "stm32ai_output". These are used by the X-CUBE-AI stm32ai CLI executable as workspace and output directories. The main outputs and reports are then copied from these directories to a result directory, in our example C:\Project\Results\STM32H723\mobilenet_v1_0.25_128_quant.tflite.
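
A minimal sketch of such a script is given below. It reuses the path variables defined in section 1, assumes the board is already flashed with the validation application, and uses the stm32ai command and option names (analyze, validate, --mode stm32, --workspace, --output) as documented in the embedded Command Line Interface documentation for v7.2; check them against your X-CUBE-AI version and adapt the model and result paths.

  @echo off
  rem Sketch of an automated analyze / validate run
  set mode=validate
  set workdir=C:\Project
  set model=%workdir%\mobilenet_v1_0.25_128_quantized.tflite
  set results=%workdir%\Results\STM32H723\mobilenet_v1_0.25_128_quant.tflite
  set stm32ai=%CUBE_AI_PATH%\Utilities\windows\stm32ai.exe

  rem Temporary workspace and output directories used by the stm32ai CLI
  mkdir %workdir%\tmp\stm32ai_ws 2>nul
  mkdir %workdir%\tmp\stm32ai_output 2>nul
  mkdir %results% 2>nul

  if "%mode%"=="analyze" (
      rem Analyze only: main memory footprint contributors (weights, activations)
      "%stm32ai%" analyze -m "%model%" --workspace %workdir%\tmp\stm32ai_ws --output %workdir%\tmp\stm32ai_output
  ) else (
      rem Validate on target over the serial link opened by the validation application
      "%stm32ai%" validate -m "%model%" --mode stm32 --workspace %workdir%\tmp\stm32ai_ws --output %workdir%\tmp\stm32ai_output
  )

  rem Keep the generated reports next to the other results for this model and board
  copy %workdir%\tmp\stm32ai_output\*.txt %results% >nul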

The total memory footprint for a given model must consider not only the size of the model weights (stored in flash memory) and the size of the activations, input, and output buffers, but also the size of the runtime code (flash memory and RAM). The runtime memory footprint can only be measured by building the code with a toolchain; a convenient way to measure it is to generate the code using the relocatable option.
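
As an illustration, and assuming the --relocatable option of the generate command described in the embedded CLI documentation (and that the Arm GCC toolchain added to the PATH in section 1 is available), the runtime part could be generated and its size read from the generation report:

  rem Generate the network as a relocatable binary so that the report also covers the runtime code size
  "%stm32ai%" generate -m "%model%" --relocatable --workspace %workdir%\tmp\stm32ai_ws --output %workdir%\tmp\stm32ai_output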

The column Flash reports the flash memory occupancy, including the model weights and the runtime code generated by X-CUBE-AI to run the neural network together with its constants (including the initialized tables). The column RAM reports the occupancy of the RAM buffers used to store the model activations as well as the input and output buffers, plus the RAM required by the runtime to run inference on the model. Note that to save RAM, the "Use activation buffer for input buffer" and "Use activation buffer for the output buffer" options are selected (through the X-CUBE-AI Advanced Settings panel).