This article provides benchmark results for a set of well-known or reference pre-trained neural network models.
1. Benchmark results
STM32 | Board | Model | Source | Memory Config | Flash Weights | RAM Model | Inference Time | Current (mA) | Energy (mJ) @ 3.3V | Version | RAM Activations | RAM Input | RAM Output |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
STM32H723 | STM32H723 DK | mobilenet | | All internal | 500 KB | 200 KB | 10 ms | NA | NA | X-CUBE-AI v7.0.0, STM32CubeIDE v1.7.0 | 200 KB | 100 KB | 3 B |
STM32H723 | STM32H723 DK | mobilenet | | All internal | 500 KB | 200 KB | 10 ms | NA | NA | X-CUBE-AI v7.0.0, STM32CubeIDE v1.7.0 | 200 KB | 100 KB | 3 B |
STM32H723 | STM32H723 DK | mobilenet | | All internal | 500 KB | 200 KB | 10 ms | NA | NA | X-CUBE-AI v7.0.0, STM32CubeIDE v1.7.0 | 200 KB | 100 KB | 3 B |
2. Measurement process
Only the machine learning inference is considered. In a complete application, the sensor acquisition, the data conditioning, and the pre-processing must also be taken into account.
The memory footprints are those reported by the X-CUBE-AI "Analyze" function (the X-CUBE-AI version used is given in the table). The input / output buffers are included, but the options that allow these buffers to be overlaid with the activations have been selected. The input / output buffer sizes are also reported.
RAM Model: the buffers required to run the model, that is, the activations, input, and output buffers, with the "" option activated.
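The relationship between the RAM columns of the table can be sketched as follows. This is an illustrative calculation only. It assumes that overlaying the input / output buffers means they are placed inside the activations buffer, so that the RAM Model figure equals the reported activations size, and that without the overlay the three buffers are simply added. The helper `ram_model_bytes`, its parameters, and the choice of 1 KB = 1024 bytes are hypothetical and are not part of X-CUBE-AI.

```python
KB = 1024  # assumption: the table's "KB" is interpreted as 1024 bytes


def ram_model_bytes(activations, input_size, output_size, overlay_io=True):
    """Estimate the RAM needed to run a model (hypothetical helper).

    All sizes are in bytes.
    """
    if overlay_io:
        # Assumption: the input/output buffers live inside the activations
        # buffer, and the reported activations size already covers them.
        return activations
    # Assumption: without overlay, each buffer is allocated separately.
    return activations + input_size + output_size


# Figures taken from the table rows above.
with_overlay = ram_model_bytes(200 * KB, 100 * KB, 3)
without_overlay = ram_model_bytes(200 * KB, 100 * KB, 3, overlay_io=False)

print(with_overlay // KB, "KB with overlay")
print(without_overlay, "bytes without overlay")
```

Under these assumptions, the overlaid case matches the 200 KB reported in the RAM Model column.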
The inference time and the X-Cross error are those reported by "Validation on target". STM32Cube.AI does not modify the DL/ML model topology. The impact on accuracy should therefore be limited, and the X-Cross error ensures that the difference...
The validation can also be done with a dataset...
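To illustrate the kind of comparison a cross-check error expresses, the sketch below computes two common error measures, the root mean square error and the L2 relative error, between the outputs of the original model and the outputs of the generated C model for the same input. The exact metrics and formulas used by X-CUBE-AI are not stated in this article, so the formulas, the function names, and the sample output values are assumptions chosen for illustration.

```python
import math


def rmse(reference, generated):
    """Root mean square error between two equally sized output vectors."""
    if len(reference) != len(generated):
        raise ValueError("output vectors must have the same length")
    squared = sum((r - g) ** 2 for r, g in zip(reference, generated))
    return math.sqrt(squared / len(reference))


def l2_relative_error(reference, generated):
    """L2 norm of the difference divided by the L2 norm of the reference."""
    if len(reference) != len(generated):
        raise ValueError("output vectors must have the same length")
    diff = math.sqrt(sum((r - g) ** 2 for r, g in zip(reference, generated)))
    norm = math.sqrt(sum(r ** 2 for r in reference))
    if norm == 0.0:
        # Degenerate case: an all-zero reference has no meaningful scale.
        return 0.0 if diff == 0.0 else math.inf
    return diff / norm


# Hypothetical class scores: original model vs. generated C model.
reference_output = [0.70, 0.20, 0.10]
generated_output = [0.69, 0.21, 0.10]

print("RMSE:", rmse(reference_output, generated_output))
print("L2 relative error:", l2_relative_error(reference_output, generated_output))
```

A value close to zero for both measures indicates that the generated code reproduces the original model's outputs closely for that input.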
Quantized case through CLI scripts + data compression.
For power measurements, see https://wiki.st.com/stm32mcu/wiki/AI:How_to_measure_machine_learning_model_power_consumption_with_STM32Cube.AI_generated_application
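The Energy (mJ) @ 3.3V column relates to the Current (mA) and Inference Time columns through E = V × I × t. The sketch below applies this relation, assuming that the current is the average value over one inference and that the supply voltage is constant at 3.3 V. The function name and the sample values (100 mA, 10 ms) are hypothetical and are not measured results.

```python
SUPPLY_VOLTAGE_V = 3.3  # supply voltage stated in the table header


def inference_energy_mj(current_ma, inference_time_ms, voltage_v=SUPPLY_VOLTAGE_V):
    """Energy of one inference in millijoules (hypothetical helper).

    mA * ms gives microcoulombs; multiplying by volts gives microjoules;
    dividing by 1000 converts microjoules to millijoules.
    """
    return voltage_v * current_ma * inference_time_ms / 1000.0


# Hypothetical example: 100 mA average current during a 10 ms inference.
energy = inference_energy_mj(current_ma=100.0, inference_time_ms=10.0)
print(f"{energy:.2f} mJ per inference")
```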