Difference between revisions of "AI:X-CUBE-AI documentation"

[checked revision] [quality revision]
m (Escoda Michael moved page AI:X-CUBE-AI documentation? to AI:X-CUBE-AI documentation without leaving a redirect)
m

This article describes the documentation related to X-CUBE-AI and especially the X-CUBE-AI embedded documentation contents and how to access it. The embedded documentation is installed with X-CUBE-AI, which ensures to provide the accurate documentation for the considered version of X-CUBE-AI.

Info white.png Information
  • X-CUBE-AI is a software that generates optimized C code for STM32 microcontrollers and Neural Network inference. It is delivered under the Mix Ultimate Liberty+OSS+3rd-party V1 software license agreement (SLA0048).

1 X-CUBE-AI getting started and user guide[edit]

There are four main documentation items for X-CUBE-AI completed by WiKi articles:

2 X-CUBE-AI embedded documentation contents[edit]

The embedded documentation describes the following topics (for X-CUBE-AI version 7.0.0) in detail:

  • User guide
    • Installation: specific environment settings to use the command-line interface in a console.
    • Command-line interface: the stm32ai application is a console utility, which provides a complete and unified command-line interface (CLI) to generate the X-CUBE-AI optimized library for an STM32 device family from a pre-trained DL/ML model. This section describes the CLI in detail.
    • Embedded inference client API: this section describes the embedded inference client API, which must be used by a C application layer (AI client) to use a deployed C model.
    • Evaluation report and metrics: this section describes the different metrics (and associated computing flow) that are used to evaluate the performance of the generated C files (or C model), mainly through the validate command.
    • Quantized model and quantize command: the X-CUBE-AI code generator can be used to deploy a quantized model (8-bit integer format). Quantization (also called calibration) is an optimization technique to compress a 32-bit floating-point model by reducing its size, by improving the CPU/MCU usage and latency, at the expense of a small degradation of the accuracy. This section describes the way X-CUBE-AI supports quantized models and the CLI internal post-training quantization process.
  • Advanced features
    • Relocatable binary network support: a relocatable binary model designates a binary object, which can be installed and executed anywhere in an STM32 memory sub-system. It contains a compiled version of the generated NN C files including the requested forward kernel functions and the weights. The principal objective is to provide a flexible way to upgrade an AI-based application without generating and programming the whole end-user firmware again. For example, this is the primary element needed to use the FOTA (Firmware Over-The-Air) technology. This section describes how to build and use a relocatable binary.
    • Keras LSTM stateful support: this section describes how X-CUBE-AI provides an initial support for the Keras stateful LSTM.
    • Keras Lambda/custom layer support: the goal of this section is to explain how to import a Keras model containing Lambda or custom layers, following one way or another depending on the nature of the model.
    • Platform Observer API: for advanced run-time, debug or profiling purposes, an AI client application can register a call-back function to be notified before or/end after the execution of a C node. The call-back can be used to measure the execution time, dump the intermediate values, or both. This section describes how to use and take advantage of this feature.
    • STM32 CRC IP as shared resources: to use the network_runtime library, the STM32 CRC IP must be enabled (or clocked); otherwise, the application hangs. To improve the usage of the CRC IP and consider it as a shared resource, two optional specific hooks or call-back functions are defined to facilitate its usage with a resource manager. This section describes how it works.
    • TensorFlow™ Lite for Microcontrollers support: the X-CUBE-AI Expansion Package integrates a specific path, which allows to generate a ready-to-use STM32 IDE project embedding a TensorFlow™ Lite for Microcontrollers runtime and its associated TFLite (TensorFlow™ Lite) model. This can be considered as an alternative to the default X-CUBE-AI solution to deploy an AI solution based on a TFLite model. This section describes how X-CUBE-AI supports TensorFlow™ Lite for Microcontrollers.
  • How to
    • How to use USB-CDC driver for validation: this article explains how to enable the USB-CDC profile to perform faster the validation on the board. A client USB device with the STM32 Communication Device Class (namely the Virtual COM port) is used as communication link with the host. It avoids the overhead of the ST-LINK bridge connecting the UART pins to or from the ST-LINK USB port. However, an STM32 Nucleo or Discovery board with a built-in USB device peripheral is requested.
    • How to run locally a C model: this article explains how to run locally the generated C model. The first goal is to enhance an end-user validation process with a large data set including the specific pre- and post-processing functions with the X-CUBE-AI inference run-time. It is also useful to integrate an X-CUBE-AI validation step in a CI/CD/MLOps flow without an STM32 board.
    • How to upgrade an STM32 project: this article describes how to upgrade manually or with the CLI an STM32CubeMX-based or proprietary source tree with a new version of the X-CUBE-AI library.
  • Supported DL/ML frameworks: lists the supported Deep Learning frameworks, and the operators and layers supported for each of them.
    • Keras toolbox: lists the Keras layers (or operators) that can be imported and converted. Keras is supported through the TensorFlow™ backend with channels-last dimension ordering. Keras.io 2.0 up to version 2.5.1 is supported, while networks defined in Keras 1.x are not officially supported. Up to TF Keras 2.5.0 is supported.
    • TensorFlow™ Lite toolbox: lists the TensorFlow™ Lite layers (or operators) that can be imported and converted. TensorFlow™ Lite is the format used to deploy a Neural Network model on mobile platforms. STM32Cube.AI imports and converts the .tflite files based on the flatbuffer technology. The official ‘schema.fbs’ definition (tags v2.5.0) is used to import the models. A number of operators from the supported operator list are handled, including the quantized models and/or operators generated by the Quantization Aware Training process, by the post-training quantization process, or both.
    • ONNX toolbox: lists the ONNX layers (or operators) that can be imported and converted. ONNX is an open format built to represent Machine Learning models. A subset of operators from Opset 7, 8, 9 and 10 of ONNX 1.6 is supported.
  • Frequently asked questions
    • Generic aspects
      • How to know the version of the Deep Learning framework components used?
      • Channel first support for ONNX model
      • How is used the CMSIS-NN library?
      • What is the EABI used for the network_runtime libraries?
      • X-CUBE-AI Python™ API availability?
      • Stateful LSTM support?
      • How is used the ONNX optimizer?
      • How is used the TFLite interpreter?
      • TensorFlow™ Keras (tf.keras) vs. Keras.io
      • It is possible to update a model on the firmware without having to do a full firmware update?
      • Keras model or sequential layer support?
      • Is it possible to split the weights buffer?
      • Is it possible to place the “activations” buffer in different memory segments?
      • How to compress the non-dense/fully-connected layers?
      • Is it possible to apply a compression factor different from x8, x4?
      • How to specify or indicate a compression factor by layer?
      • Why is a small negative ratio reported for the weights size with a model without compression?
      • Is it possible to dump/capture the intermediate values during inference execution?
    • Validation aspects
      • Validation on target vs. validation on desktop
      • How to interpret the validation results?
      • How to generate npz/npy files from an image data set?
      • How to validate a specific network when multiple networks are embedded into the same firmware?
      • Reported STM32 results are incoherent
      • Unable to perform automatic validation on target
      • Long time process or crash with a large test data set
    • Quantization and post-training quantization process
      • Backward compatibility with X-CUBE-AI 4.0 and X-CUBE-AI 4.1
      • Is it possible to use the Keras post-training quantization process through the UI?
      • Is it possible to use the Keras post-training quantization process with a non-classifier model?
      • Is it possible to use the compression for a quantized model?
      • How to apply the Keras post-training quantization process on a non-Keras model?
      • TensorFlow™ Lite, OPTIMIZE_FOR_SIZE option support

3 X-CUBE-AI embedded documentation access[edit]

To access the embedded documentation, first install X-CUBE-AI. The installation process is described in the getting started user manual. Once the installation is done, the documentation is available in the installation directory under X-CUBE-AI/7.0.0/Documentation/index.html (adapt the example, given here for version 7.0.0, to the version used). For Windows®, by default, the documentation is located here (replace the string username by your Windows® username): file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Documentation/index.html. The release notes are available here: file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Release_Notes.html.

The embedded documentation can also be accessed through the STM32CubeMX UI, once the X-CUBE-AI Expansion Package has been selected and loaded, by clicking on the "Help" menu and then on "X-CUBE-AI documentation":

X-CUBE-AI documenation


This article describes the documentation related to [https://www.st.com/en/product/x-cube-ai X-CUBE-AI] and especially the [https://www.st.com/en/product/x-cube-ai X-CUBE-AI] embedded documentation contents and how to access it. 
The embedded documentation is installed with [https://www.st.com/en/product/x-cube-ai X-CUBE-AI], which ensures to provide the accurate documentation for the considered version of [https://www.st.com/en/product/x-cube-ai X-CUBE-AI].

{{Info| 
* [https://www.st.com/en/embedded-software/x-cube-ai.html X-CUBE-AI] is a software that generates optimized C code for STM32 microcontrollers and Neural Network inference. It is delivered under the Mix Ultimate Liberty+OSS+3rd-party V1 software license agreement ([https://st.com/SLA0048 SLA0048]).}}

= X-CUBE-AI getting started and user guide =

There are four main documentation items for [https://www.st.com/en/product/x-cube-ai X-CUBE-AI] completed by WiKi articles:
* Documentation available on [https://www.st.com/en/product/x-cube-ai#documentation ''st.com'']:
** [https://www.st.com/resource/en/data_brief/x-cube-ai.pdf X-CUBE-AI data brief]
** [https://www.st.com/resource/en/user_manual/dm00570145.pdf X-CUBE-AI getting started user manual]
* The release notes for each version available within [https://www.st.com/en/product/x-cube-ai X-CUBE-AI]
* Embedded documentation available within the [https://www.st.com/en/product/x-cube-ai X-CUBE-AI] Expansion Package and described below.

= X-CUBE-AI embedded documentation contents =

The embedded documentation describes the following topics (for [https://www.st.com/en/product/x-cube-ai X-CUBE-AI]  version '''7.0.0''') in detail:
* {{Highlight|User guide}}
** {{Highlight|Installation}}: specific environment settings to use the command-line interface in a console.
** {{Highlight|Command-line interface}}: the stm32ai application is a console utility, which provides a complete and unified command-line interface (CLI) to generate the X-CUBE-AI optimized library for an STM32 device family from a pre-trained DL/ML model. This section describes the CLI in detail.
** {{Highlight|Embedded inference client API}}: this section describes the embedded inference client API, which must be used by a C application layer (AI client) to use a deployed C model.
** {{Highlight|Evaluation report and metrics}}: this section describes the different metrics (and associated computing flow) that are used to evaluate the performance of the generated C files (or C model), mainly through the validate command.
** {{Highlight|Quantized model and quantize command}}: the X-CUBE-AI code generator can be used to deploy a quantized model (8-bit integer format). Quantization (also called calibration) is an optimization technique to compress a 32-bit floating-point model by reducing its size, by improving the CPU/MCU usage and latency, at the expense of a small degradation of the accuracy. This section describes the way X-CUBE-AI supports quantized models and the CLI internal post-training quantization process.
* {{Highlight|Advanced features}}
** {{Highlight|Relocatable binary network support}}: a relocatable binary model designates a binary object, which can be installed and executed anywhere in an STM32 memory sub-system. It contains a compiled version of the generated NN C files including the requested forward kernel functions and the weights. The principal objective is to provide a flexible way to upgrade an AI-based application without generating and programming the whole end-user firmware again. For example, this is the primary element needed to use the FOTA (Firmware Over-The-Air) technology. This section describes how to build and use a relocatable binary.
** {{Highlight|Keras LSTM stateful support}}: this section describes how X-CUBE-AI provides an initial support for the Keras stateful LSTM. 
** {{Highlight|Keras Lambda/custom layer support}}: the goal of this section is to explain how to import a Keras model containing Lambda or custom layers, following one way or another depending on the nature of the model.
** {{Highlight|Platform Observer API}}: for advanced run-time, debug or profiling purposes, an AI client application can register a call-back function to be notified before or/end after the execution of a C node. The call-back can be used to measure the execution time, dump the intermediate values, or both. This section describes how to use and take advantage of this feature.
** {{Highlight|STM32 CRC IP as shared resources}}: to use the network_runtime library, the STM32 CRC IP must be enabled (or clocked); otherwise, the application hangs. To improve the usage of the CRC IP and consider it as a shared resource, two optional specific hooks or call-back functions are defined to facilitate its usage with a resource manager. This section describes how it works.
** {{Highlight|TensorFlow™ Lite for Microcontrollers support}}: the X-CUBE-AI Expansion Package integrates a specific path, which allows to generate a ready-to-use STM32 IDE project embedding a TensorFlow™ Lite for Microcontrollers runtime and its associated TFLite (TensorFlow™ Lite) model. This can be considered as an alternative to the default X-CUBE-AI solution to deploy an AI solution based on a TFLite model. This section describes how X-CUBE-AI supports TensorFlow™ Lite for Microcontrollers.
* {{Highlight|How to}}
** {{Highlight|How to use USB-CDC driver for validation}}: this article explains how to enable the USB-CDC profile to perform faster the validation on the board. A client USB device with the STM32 Communication Device Class (namely the Virtual COM port) is used as communication link with the host. It avoids the overhead of the ST-LINK bridge connecting the UART pins to or from the ST-LINK USB port. However, an STM32 Nucleo or Discovery board with a built-in USB device peripheral is requested.
** {{Highlight|How to run locally a C model}}: this article explains how to run locally the generated C model. The first  goal is to enhance an end-user validation process with a large data set including the specific pre- and post-processing functions with the X-CUBE-AI inference run-time. It is also useful to integrate an X-CUBE-AI validation step in a CI/CD/MLOps flow without an STM32 board.
** {{Highlight|How to upgrade an STM32 project}}: this article describes how to upgrade manually or with the CLI an [https://www.st.com/en/product/stm32cubemx STM32CubeMX]-based or proprietary source tree with a new version of the X-CUBE-AI library. 
* {{Highlight|Supported DL/ML frameworks}}: lists the supported Deep Learning frameworks, and the operators and layers supported for each of them.
** {{Highlight|Keras toolbox}}: lists the Keras layers (or operators) that can be imported and converted. Keras is supported through the TensorFlow™ backend with channels-last dimension ordering. Keras.io 2.0 up to version 2.5.1 is supported, while networks defined in Keras 1.x are not officially supported. Up to TF Keras 2.5.0 is supported.
** {{Highlight|TensorFlow™ Lite toolbox}}: lists the TensorFlow™ Lite layers (or operators) that can be imported and converted. TensorFlow™ Lite is the format used to deploy a Neural Network model on mobile platforms. STM32Cube.AI imports and converts the .tflite files based on the flatbuffer technology. The official ‘schema.fbs’ definition (tags v2.5.0) is used to import the models. A number of operators from the supported operator list are handled, including the quantized models and/or operators generated by the Quantization Aware Training process, by the post-training quantization process, or both.
** {{Highlight|ONNX toolbox}}: lists the ONNX layers (or operators) that can be imported and converted. ONNX is an open format built to represent Machine Learning models. A subset of operators from Opset 7, 8, 9 and 10 of ONNX 1.6 is supported.
* {{Highlight|Frequently asked questions}}
** {{Highlight|Generic aspects}}
*** How to know the version of the Deep Learning framework components used?
*** Channel first support for ONNX model
*** How is used the CMSIS-NN library?
*** What is the EABI used for the network_runtime libraries?
*** X-CUBE-AI Python™ API availability?
*** Stateful LSTM support?
*** How is used the ONNX optimizer?
*** How is used the TFLite interpreter?
*** TensorFlow™ Keras (tf.keras) vs. Keras.io
*** It is possible to update a model on the firmware without having to do a full firmware update?
*** Keras model or sequential layer support?
*** Is it possible to split the weights buffer?
*** Is it possible to place the “activations” buffer in different memory segments?
*** How to compress the non-dense/fully-connected layers?
*** Is it possible to apply a compression factor different from x8, x4?
*** How to specify or indicate a compression factor by layer?
*** Why is a small negative ratio reported for the weights size with a model without compression?
*** Is it possible to dump/capture the intermediate values during inference execution?
** {{Highlight|Validation aspects}}
*** Validation on target vs. validation on desktop
*** How to interpret the validation results?
*** How to generate npz/npy files from an image data set?
*** How to validate a specific network when multiple networks are embedded into the same firmware?
*** Reported STM32 results are incoherent
*** Unable to perform automatic validation on target
*** Long time process or crash with a large test data set
** {{Highlight|Quantization and post-training quantization process}}
*** Backward compatibility with X-CUBE-AI 4.0 and X-CUBE-AI 4.1
*** Is it possible to use the Keras post-training quantization process through the UI?
*** Is it possible to use the Keras post-training quantization process with a non-classifier model?
*** Is it possible to use the compression for a quantized model?
*** How to apply the Keras post-training quantization process on a non-Keras model?
*** TensorFlow™ Lite, OPTIMIZE_FOR_SIZE option support

= X-CUBE-AI embedded documentation access =

To access the embedded documentation, first install [https://www.st.com/en/product/x-cube-ai X-CUBE-AI].
The installation process is described in the [https://www.st.com/resource/en/user_manual/dm00570145.pdf getting started] user manual.
Once the installation is done, the documentation is available in the installation directory under X-CUBE-AI/7.0.0/Documentation/index.html (adapt the example, given here for version 7.0.0, to the version used).
For Windows<sup>®</sup>, by default, the documentation is located here (replace the string ''username'' by your Windows<sup>®</sup> username): file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Documentation/index.html.
The release notes are available here: file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Release_Notes.html.

The embedded documentation can also be accessed through the [https://www.st.com/en/product/stm32cubemx STM32CubeMX] UI, once the X-CUBE-AI Expansion Package has been selected and loaded, by clicking on the "Help" menu and then on "X-CUBE-AI documentation":
<div class="res-img">

[[File:CubeAI_Help_Doc.png|center|alt=X-CUBE-AI documenation|X-CUBE-AI documentation]]</div>

<noinclude>
[[Category:STM32Cube.AI | 0]]{{PublicationRequestId | 20964 | 2021-09-01 }}	</noinclude>
Line 89: Line 89:
   
 
<noinclude>
 
<noinclude>
  +
[[Category:STM32Cube.AI | 0]]
 
{{PublicationRequestId | 20964 | 2021-09-01 }}
 
{{PublicationRequestId | 20964 | 2021-09-01 }}
 
</noinclude>
 
</noinclude>