X-CUBE-AI documentation

This article describes the documentation related to X-CUBE-AI, in particular the contents of the X-CUBE-AI embedded documentation and how to access it. The embedded documentation is installed together with X-CUBE-AI, which guarantees that the documentation matches the installed version of X-CUBE-AI.

Information
  • X-CUBE-AI is a software package that generates optimized C code for STM32 microcontrollers to run neural network inference. It is delivered under the Mix Ultimate Liberty+OSS+3rd-party V1 software license agreement (SLA0048).

1. X-CUBE-AI getting started and user guide

There are four main documentation items for X-CUBE-AI, complemented by wiki articles:

2. X-CUBE-AI embedded documentation contents

The embedded documentation describes the following topics in detail (for X-CUBE-AI version 7.0.0):

  • User Guide
    • Installation: specific environment settings to use the Command Line Interface in a console
    • Command Line Interface: the stm32ai application is a console utility that provides a complete and unified Command Line Interface (CLI) to generate, from a pre-trained DL/ML model, the optimized X-CUBE-AI library for the STM32 device family. This section describes the CLI in detail.
    • Embedded inference client API: this section describes the embedded inference client API that a C application layer (AI client) must use to run a deployed C-model (a minimal usage sketch is shown after this list).
    • Evaluation report and metrics: this section describes the different metrics (and the associated computing flow) used to evaluate the performance of the generated C-files (or C-model), mainly through the validate command.
    • Quantized model and quantize command: the X-CUBE-AI code generator can be used to deploy a quantized model (8-bit integer format). Quantization (also called calibration) is an optimization technique that compresses a 32-bit floating-point model: it reduces the model size and improves CPU/MCU usage and latency, with a small degradation of accuracy. This section describes how X-CUBE-AI supports quantized models and the CLI internal post-training quantization process.
  • Advanced features
    • Relocatable binary network support: a relocatable binary model designates a binary object that can be installed and executed anywhere in an STM32 memory sub-system. It contains a compiled version of the generated NN C-files, including the requested forward kernel functions and the weights. The main objective is to provide a flexible way to upgrade an AI-based application without regenerating and flashing the whole end-user firmware. This is, for example, a key enabler for FOTA (Firmware Over-The-Air) updates. This section describes how to build and use a relocatable binary.
    • Keras LSTM stateful support: this section describes how X-CUBE-AI v7 provides initial support for Keras stateful LSTM layers.
    • Keras Lambda/custom layer support: this section explains how to import a Keras model containing Lambda or custom layers. Depending on the nature of the model, one of two workflows must be followed.
    • Platform Observer API: for advanced run-time, debug, or profiling purposes, an AI client application can register a callback function that is notified before and/or after the execution of a c-node. The callback can be used to measure the execution time and/or to dump intermediate values. This section describes how to use and take advantage of this feature.
    • STM32 CRC IP as a shared resource: to use the network runtime library, the STM32 CRC IP must be enabled (clocked), otherwise the application hangs (see the snippet after this list). To improve the usage of the CRC IP and treat it as a shared resource, two optional hooks (callback functions) are defined to facilitate its usage with a resource manager. This section describes how it works.
    • TensorFlow™ Lite for Microcontrollers support: the X-CUBE-AI Expansion Package integrates a specific path to generate a ready-to-use STM32 IDE project embedding a TensorFlow™ Lite for Microcontrollers run-time and its associated TFLite model. This can be considered as an alternative to the default X-CUBE-AI solution for deploying an AI application based on a TFLite model. This section describes how X-CUBE-AI supports TensorFlow™ Lite for Microcontrollers.
  • HowTo
    • How to use the USB-CDC driver for validation: this article explains how to enable the USB-CDC profile to speed up the validation on the board. A USB device client with the STM32 Communication Device Class (that is, a Virtual COM port) is used as the communication link with the host, which avoids the overhead of the ST-LINK bridge connecting the UART pins to/from the ST-LINK USB port. However, an STM32 Nucleo or Discovery board with a built-in USB device peripheral is required.
    • How to run a c-model locally: this article explains how to run the generated c-model locally. The first goal is to enhance an end-user validation process with a large data set, including the specific pre- and post-processing functions, together with the X-CUBE-AI inference run-time. It also makes it possible to integrate an X-CUBE-AI validation step in a CI/CD/MLOps flow without an STM32 board.
    • How to upgrade an STM32 project: this article describes how to upgrade an STM32CubeMX-based or proprietary source tree with a new version of the X-CUBE-AI library, either manually or with the CLI.
  • Supported DL/ML frameworks: lists the supported deep learning frameworks and the operators/layers supported for each of them.
    • Keras toolbox: lists the Keras layers (or operators) that can be imported and converted. Keras is supported through the TensorFlow™ backend with channels-last dimension ordering. Keras.io 2.0 up to version 2.5.1 is supported; networks defined in Keras 1.x are not officially supported. TF Keras is supported up to version 2.5.0.
    • TensorFlow™ Lite toolbox: lists the TensorFlow™ Lite layers (or operators) that can be imported and converted. TensorFlow™ Lite is the format used to deploy a neural network model on mobile platforms. STM.ai imports and converts .tflite files, which are based on the flatbuffer technology. The official ‘schema.fbs’ definition (tag v2.5.0) is used to import the models. A number of operators from the supported operator list are handled, including quantized models and/or operators generated by the Quantization Aware Training and/or post-training quantization processes.
    • ONNX toolbox: lists the ONNX layers (or operators) that can be imported and converted. ONNX is an open format built to represent machine learning models. A subset of operators from opsets 7, 8, 9, and 10 of ONNX 1.6 is supported.
  • Frequently Asked Questions
    • Generic aspects:
      • How to know the version of the deep-learning framework components which are used?
      • Channel-first support for ONNX models
      • How is the CMSIS-NN library used?
      • What is the EABI used for the network_runtime libraries?
      • X-CUBE-AI Python API availability?
      • Stateful LSTM support?
      • How is the ONNX optimizer used?
      • How is the TFLite interpreter used?
      • TensorFlow™ Keras (tf.keras) vs Keras.io
      • Is it possible to update a model in the firmware without doing a full firmware update?
      • Keras Model or Sequential layer support?
      • Is it possible to split the weights buffer?
      • Is it possible to place the “activations” buffer in different memory segments?
      • How to compress the non-dense/fully-connected layers?
      • Is it possible to apply a compression factor different from x8 or x4?
      • How to specify a compression factor per layer?
      • Why is a small negative ratio reported for the weights size with a model without compression?
      • Is it possible to dump/capture the intermediate values during the execution of the inference?
    • Validation aspects
      • Validation on target vs validation on desktop
      • How to interpret the validation results?
      • How to generate npz/npy files from an image data set?
      • How to validate a specific network when multiple networks are embedded into the same firmware?
      • Reported STM32 results are incoherent
      • Unable to perform automatic validation on-target
      • Long processing time or crash with a large test data set
    • Quantization and post-training quantization process
      • Backward compatibility with X-CUBE-AI 4.0 and X-CUBE-AI 4.1
      • Is it possible to use the Keras post-training quantization process through the UI?
      • Is it possible to use the Keras post-training quantization process with a non-classifier model?
      • Is it possible to use the compression for a quantized model?
      • How to apply the Keras post-training quantization process on a non-Keras model?
      • TensorFlow™ Lite, OPTIMIZE_FOR_SIZE option support
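
The following is a minimal sketch of how an AI client typically drives a deployed C-model through the embedded inference client API mentioned above. It assumes a model generated with the default name network; the ai_network_* entry points, macros, and buffer helpers shown here follow the usual naming of the generated network.h/network_data.h files, but the exact names and signatures must be checked against the generated code and the embedded documentation of the installed X-CUBE-AI version.

  #include "network.h"        /* generated C-model interface (model named "network") */
  #include "network_data.h"   /* generated weights/parameters */

  /* Activations (scratch) buffer; the size macro name is an assumption based on
   * the usual generated headers. */
  AI_ALIGNED(4)
  static ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];

  static ai_handle network = AI_HANDLE_NULL;

  int ai_bootstrap(void)
  {
    /* Create the network instance from the generated configuration. */
    ai_error err = ai_network_create(&network, AI_NETWORK_DATA_CONFIG);
    if (err.type != AI_ERROR_NONE)
      return -1;

    /* Bind the weights and the activations buffer to the instance. */
    const ai_network_params params = AI_NETWORK_PARAMS_INIT(
        AI_NETWORK_DATA_WEIGHTS(ai_network_data_weights_get()),
        AI_NETWORK_DATA_ACTIVATIONS(activations));

    if (!ai_network_init(network, &params))
      return -1;

    return 0;
  }

  int ai_run(const void *in_data, void *out_data)
  {
    ai_buffer ai_input[AI_NETWORK_IN_NUM]   = AI_NETWORK_IN;
    ai_buffer ai_output[AI_NETWORK_OUT_NUM] = AI_NETWORK_OUT;

    ai_input[0].data  = AI_HANDLE_PTR(in_data);
    ai_output[0].data = AI_HANDLE_PTR(out_data);

    /* Run one inference (batch of 1); returns the number of processed batches. */
    return ai_network_run(network, ai_input, ai_output);
  }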
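
As noted in the "STM32 CRC IP as a shared resource" item above, the network runtime library requires the STM32 CRC IP to be clocked. A minimal sketch, assuming an STM32 HAL-based project: the clock is enabled with the standard HAL macro before the network is created and initialized (the optional shared-resource hooks described in the embedded documentation are not shown here).

  #include "main.h"   /* assumed to pull in the HAL header of the target STM32 family */

  void ai_hw_init(void)
  {
    /* Enable the CRC IP clock: the network runtime checks it at initialization
     * time and the application hangs if the IP is not clocked. */
    __HAL_RCC_CRC_CLK_ENABLE();

    /* ... then create and initialize the network, as in the API sketch above ... */
  }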

3. X-CUBE-AI embedded documentation access

To access the embedded documentation, first install X-CUBE-AI. The installation process is described in the Getting Started document. Once installed, the documentation is located in the installation directory under X-CUBE-AI/7.0.0/Documentation/index.html (adapt the version number to the installed one, here 7.0.0).
On Windows, the documentation is located by default at the following path (replace username with your Windows user name): file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Documentation/index.html
The release notes are available here: file:///C:/Users/username/STM32Cube/Repository/Packs/STMicroelectronics/X-CUBE-AI/7.0.0/Release_Notes.html

The embedded documentation can also be accessed through the UI, once the X-CUBE-AI Software Package has been selected and loaded, by clicking the "Help" menu then "X-CUBE-AI documentation":

X-CUBE-AI documentation