AI:How to perform motion sensing on STM32L4 IoTnode


In this guide, you will learn how to create a motion sensing application to recognize human activities using machine learning on an STM32 microcontroller.

The model used classifies activities such as stationary, walking, or running from accelerometer data provided by the LSM6DSL sensor.

We will be creating a Human Activity Recognition (HAR) application for the STM32L4 Discovery kit IoT node B-L475E-IOT01A development board.

The board is powered by an STM32L475VG microcontroller (Arm® Cortex®-M4F MCU operating at 80 MHz with 1 Mbyte of Flash memory and 128 Kbytes of SRAM).

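The following videos can also be used to follow along with this article:

  • Part 1: https://www.youtube.com/watch?v=Djj2WMPgdsQ
  • Part 2: https://www.youtube.com/watch?v=VznrX-Uyv3U
  • Part 3: https://www.youtube.com/watch?v=dd6DdY4To9Q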

Human Activity Recognition (HAR)

1 What you will learn

  • How to read motion sensor data.
  • How to generate neural network code for STM32 using X-CUBE-AI.
  • How to input sensor data into a neural network code.
  • How to output inference results.

2 Requirements

  • B-L475E-IOT01A: STM32L4 Discovery kit IoT node.
  • STM32CubeIDE (v1.6.1 or later) with:
    • X-CUBE-MEMS1 (v8.3.0 or later) - for motion sensor component drivers.
    • X-CUBE-AI (v7.0.0 or later) - for the neural network conversion tool & network runtime library.
    • Note: X-CUBE-AI and X-CUBE-MEMS1 installation is not covered in this tutorial. Installation instructions can be found in UM1718, Section 3.4.4 (Installing embedded software packs).
  • A serial terminal application (such as Tera Term, PuTTY, GNU Screen or others).
  • A Keras model. You can either download the pre-trained model.h5 or create your own model using your own data captures.
B-L475E-IOT01Ax

3 Create a new project

Create a new STM32 project: File > New > STM32 Project.

STM32CubeIDE New STM32 Project

3.1 Select your board

  1. Open the Board Selector tab.
  2. Search for B-L475E-IOT01A1.
  3. Select the board in the bottom-right board list table.
  4. Click on Next to configure the project name and location.
  5. When prompted to initialize all peripherals with their default mode, select No to avoid generating unused code.
Board selection

3.2 Add the LSM6DSL sensor driver from X-CUBE-MEMS Components

Open the Software Packs Component Selector: Software Packs > Select Components.

  • In the Pinout & Configuration tab, click on Software Packs > Select Components.
Select Components

Select the LSM6DSL inertial module with an I2C interface.

  • Apply the Board Part filter.
  • Under the STMicroelectronics.X-CUBE-MEMS1 bundle, select Board Part > AccGyr / LSM6DSL I2C and click on OK.
MEMS Board Component

3.3 Configure the I2C bus interface

  1. Back in STM32CubeMX, choose the I2C2 Connectivity interface.
  2. Enable the I2C bus interface by choosing mode I2C.
  3. Set the I2C mode to Fast Mode and speed to 400 kHz (LSM6DSL maximum I2C clock frequency).
  4. Under the GPIO settings tab, associate the PB10 and PB11 pins with the I2C2 interface.
I2C2 configuration

3.4 Configure the LSM6DSL interrupt line

To receive INT1 signals from the LSM6DSL sensor, make sure that the MCU GPIO pin PD11 is configured to receive external interrupts.

  1. In the GPIO settings (under System Core), set PD11 to External Interrupt Mode with Rising edge trigger detection (EXTI_RISING).
  2. In the NVIC settings (under System Core), enable EXTI line[15:10] interrupts, which include the EXTI11 interrupt.

3.5 Configure X-CUBE-MEMS1

  1. Expand Software Packs to select STMicroelectronics.X-CUBE-MEMS1.8.3.0.
  2. Enable Board Part AccGyr.
  3. Configure the Platform settings with I2C2 from the dropdown menu.
STM32CubeMX MEMS Component configuration

3.6 Configure the UART communication interface

Configure the USART1 interface in asynchronous mode with the default parameters (115200 baud, 8 data bits, no parity and 1 stop bit). You can also check the GPIO settings to view the associated USART1 GPIOs: PB6 and PB7 should be configured as Alternate Function Push-Pull.

  1. Select the USART1 Connectivity interface.
  2. Enable Asynchronous mode.
  3. If not already configured, set the Baud Rate to 115200 bit/s.
  4. In the Pinout view or under the GPIO Settings tab, ensure PB6 and PB7 are associated with USART1.
UART1 configuration

3.7 Generate code

The MCU configuration is now complete. Generate the code either by saving your project or by selecting Project > Generate Code.

Code generation

4 Bootstrap the code in main.c

4.1 Open main.c

In the Project Explorer pane, double-click on Core/Src/main.c to open the code editor for the user application code.

STM32CubeIDE Project Explorer

4.2 Include headers for the LSM6DSL sensor

Add the LSM6DSL driver and I2C bus header files. stdio.h is also included because printf is used for output.

This snippet is provided AS IS, and by taking it, you agree to be bound to the license terms which can be found here for the component: Application.
/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "lsm6dsl.h"
#include "b_l475e_iot01a1_bus.h"
#include <stdio.h>
/* USER CODE END Includes */

4.3 Create a global LSM6DSL motion sensor instance and data available status flag

The dataRdyIntReceived status flag is marked as volatile because it is modified by an interrupt service routine:

/* Private variables ---------------------------------------------------------*/
UART_HandleTypeDef huart1;

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
/* USER CODE END PV */

4.4 Define the MEMS_Init() function to configure the LSM6DSL motion sensor

Add a MEMS_Init() function declaration:

/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
/* USER CODE END PFP */

Use the following sequence to configure the LSM6DSL sensor with these settings:

  • Range: ±4 g
  • Output Data Rate (ODR): 26 Hz
  • Linear acceleration sensitivity: 0.122 mg/LSB (FS = ±4 g)
  • Resolution: 16 bits (little endian by default)

/* USER CODE BEGIN 4 */
static void MEMS_Init(void)
{
  LSM6DSL_IO_t io_ctx;
  uint8_t id;
  LSM6DSL_AxesRaw_t axes;

  /* Link I2C functions to the LSM6DSL driver */
  io_ctx.BusType     = LSM6DSL_I2C_BUS;
  io_ctx.Address     = LSM6DSL_I2C_ADD_L;
  io_ctx.Init        = BSP_I2C2_Init;
  io_ctx.DeInit      = BSP_I2C2_DeInit;
  io_ctx.ReadReg     = BSP_I2C2_ReadReg;
  io_ctx.WriteReg    = BSP_I2C2_WriteReg;
  io_ctx.GetTick     = BSP_GetTick;
  LSM6DSL_RegisterBusIO(&MotionSensor, &io_ctx);

  /* Read the LSM6DSL WHO_AM_I register */
  LSM6DSL_ReadID(&MotionSensor, &id);
  if (id != LSM6DSL_ID) {
    Error_Handler();
  }

  /* Initialize the LSM6DSL sensor */
  LSM6DSL_Init(&MotionSensor);

  /* Configure the LSM6DSL accelerometer (ODR, scale and interrupt) */
  LSM6DSL_ACC_SetOutputDataRate(&MotionSensor, 26.0f); /* 26 Hz */
  LSM6DSL_ACC_SetFullScale(&MotionSensor, 4);          /* [-4000mg; +4000mg] */
  LSM6DSL_ACC_Set_INT1_DRDY(&MotionSensor, ENABLE);    /* Enable DRDY */
  LSM6DSL_ACC_GetAxesRaw(&MotionSensor, &axes);        /* Clear DRDY */

  /* Start the LSM6DSL accelerometer */
  LSM6DSL_ACC_Enable(&MotionSensor);
}
/* USER CODE END 4 */

Info: For code simplicity and readability, status code return value checking has been omitted. It is strongly advised to check if the return status is equal to LSM6DSL_OK.
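
The status checks recommended in the note above could be added with a small helper macro. The following is only a sketch under stated assumptions: CHECK_STATUS, SENSOR_OK, error_count and fake_sensor_call() are hypothetical names introduced so that the example builds on a host without the driver. On the target, the LSM6DSL_* calls would be compared against LSM6DSL_OK and the failure branch would typically call Error_Handler().

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in for LSM6DSL_OK so the sketch builds without the driver (assumption). */
#define SENSOR_OK 0

/* Number of failed calls seen so far; on the target, Error_Handler() could be called instead. */
static uint32_t error_count = 0;

/* Hypothetical helper: evaluate a driver call once and record any non-OK status. */
#define CHECK_STATUS(call)                                   \
  do {                                                       \
    int32_t status_ = (call);                                \
    if (status_ != SENSOR_OK) {                              \
      printf("%s failed: %ld\r\n", #call, (long) status_);   \
      error_count++;                                         \
    }                                                        \
  } while (0)

/* Hypothetical stand-in for a driver function that returns a status code. */
static int32_t fake_sensor_call(int32_t status_to_return)
{
  return status_to_return;
}
```

With such a macro, each call in MEMS_Init() could be wrapped, for example CHECK_STATUS(LSM6DSL_Init(&MotionSensor)), without changing the order of the configuration sequence.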


4.5 Add a callback to the LSM6DSL sensor interrupt line (INT1 signal on GPIO PD11)

The callback sets the dataRdyIntReceived status flag when a new set of measurement data is available to be read:

/* USER CODE BEGIN 4 */
/*...*/

void HAL_GPIO_EXTI_Callback(uint16_t GPIO_Pin)
{
  if (GPIO_Pin == GPIO_PIN_11) {
    dataRdyIntReceived++;
  }
}
/* USER CODE END 4 */

4.6 Retarget printf to a UART serial port

Retarget the printf output to the serial UART interface. Insert the following code snippet if you are working with a GCC toolchain:

/*...*/

int _write(int fd, char * ptr, int len)
{
  HAL_UART_Transmit(&huart1, (uint8_t *) ptr, len, HAL_MAX_DELAY);
  return len;
}
/* USER CODE END 4 */

Info: stdout redirection is toolchain dependent. Example implementations for other compilers can be found in aiSystemPerformance.c from the SystemPerformance application.

4.7 Implement an Error_Handler()

To protect your application against potential issues, it is recommended to implement a trapping mechanism in the Error_Handler() function.

  • Create an infinite while loop to trap errors and blink the board’s LED.
void Error_Handler(void)
{
  /* USER CODE BEGIN Error_Handler_Debug */
  while(1) {
    HAL_GPIO_TogglePin(LED2_GPIO_Port, LED2_Pin);
    HAL_Delay(50); /* wait 50 ms */
  }
  /* USER CODE END Error_Handler_Debug */
}

5 Read accelerometer data

5.1 Call the previously implemented MEMS_Init() function

int main(void)
{
  /* ... */

  /* USER CODE BEGIN 2 */

  dataRdyIntReceived = 0;
  MEMS_Init();

  /* USER CODE END 2 */

5.2 Add code to read acceleration data

int main(void)
{
  /* ... */

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
    if (dataRdyIntReceived != 0) {
      dataRdyIntReceived = 0;
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      printf("% 5d, % 5d, % 5d\r\n",  (int) acc_axes.x, (int) acc_axes.y, (int) acc_axes.z);
    }
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}

5.3 Compile, download and run

  • Build your project: Project > Build All
  • Run > Run As > 1 STM32 Cortex-M C/C++ Application. Then click Run.
  • Open a serial monitor application (such as Tera Term, PuTTY or GNU Screen) and connect to the ST-LINK Virtual COM port (COM port number may differ on your machine).
TeraTerm: New Connection
  • Configure the communication settings to 115200 baud, 8-bit, no parity.
TeraTerm Serial port settings

5.4 Visualize and capture live sensor data

When running your application, a new set of three acceleration values (x, y, z) should be displayed in the serial output every 1/(26 Hz) ≈ 38 ms. When the board is sitting upright on your desk (not moving), the x and y values should be close to 0 g and z should be close to 1 g (1000 mg).

TeraTerm xyz output
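
The relationship between raw sensor counts and the mg values discussed above can be sketched as follows. This is an illustration only, assuming the ±4 g full scale and the 0.122 mg/LSB sensitivity listed in the sensor configuration; raw_to_mg() and is_close_mg() are hypothetical helper names, and the application code above relies on the driver to perform this scaling.

```c
#include <stdint.h>

/* Sensitivity at FS = +/-4 g, as listed in the sensor configuration: 0.122 mg/LSB. */
#define ACC_SENSITIVITY_MG_PER_LSB 0.122f

/* Hypothetical helper: convert one raw 16-bit accelerometer sample to milli-g. */
static float raw_to_mg(int16_t raw)
{
  return (float) raw * ACC_SENSITIVITY_MG_PER_LSB;
}

/* Hypothetical helper: returns 1 if a reading in mg is within tol_mg of the expected value. */
static int is_close_mg(float value_mg, float expected_mg, float tol_mg)
{
  float diff = value_mg - expected_mg;
  if (diff < 0.0f) {
    diff = -diff;
  }
  return diff <= tol_mg;
}
```

For example, a raw z reading of about 8197 LSB corresponds to roughly 1000 mg, which is the value expected on the axis aligned with gravity when the board is at rest.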
  • To capture data, you can copy and paste the serial output into a .csv text file or use the following command on a Linux machine:
$ cat /dev/ttyXXXX > capture.csv

With every new capture, you can plot the data using matplotlib or any other tool capable of reading CSV data. Below is an example of a simple Python script to plot x, y, z data captures:

# plotter.py
# Copyright (c) STMicroelectronics
# License: BSD-3-Clause
import argparse
import numpy as np
import matplotlib.pyplot as plt

parser = argparse.ArgumentParser()
parser.add_argument('filename')
args = parser.parse_args()

data = np.loadtxt(args.filename, delimiter=',')

plt.figure(figsize=(10, 3))
plt.plot(data[:, 0], label='x')
plt.plot(data[:, 1], label='y')
plt.plot(data[:, 2], label='z')
plt.legend()
plt.show()

  • You can then visualize the captured data with the following command:
$ python plotter.py capture.csv
Motion data plot

If desired, the .csv file can be cleaned up manually using a text editor. Once you are satisfied with your captures, they can be used to train a machine learning model.
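
Capture cleanup can also be automated. The sketch below shows one possible way to validate a captured line on a host machine, assuming the "x, y, z" layout produced by the printf format used earlier; parse_capture_line() is a hypothetical helper and not part of any ST package.

```c
#include <stdio.h>

/* Hypothetical helper: parse one "x, y, z" capture line (values in mg).
 * Returns 1 on success, 0 if the line does not hold exactly three integers. */
static int parse_capture_line(const char *line, int *x, int *y, int *z)
{
  char trailing;
  /* The final " %c" only matches if non-whitespace text follows the third value,
   * so a well-formed line yields exactly three conversions. */
  int n = sscanf(line, " %d , %d , %d %c", x, y, z, &trailing);
  return n == 3;
}
```

Lines that fail this check, such as partial lines captured when the terminal was opened mid-transmission, can be dropped before the data is used for training.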

6 Create an STM32Cube.AI application using X-CUBE-AI

6.1 Create your model

For this example, we will be using a Keras model trained using a small dataset created specifically for this example. Either download the pre-trained model.h5 or create your own model using your own data captures.

  • dataset.zip is a ready-to-use dataset of 3-axis acceleration data for various human activities.

Info: The model uses unprocessed data. For increased accuracy, data preprocessing such as gravity rotation and suppression can be added. This requires retraining the model to include the data preprocessing. Data preprocessing usage and alternative models can be found in FP-AI-SENSING1. If using a different model, make sure to adjust the sensor settings (data rate and scale) to match the training dataset.

6.2 Add STM32Cube.AI to your project

X-CUBE-AI v7.1.0 introduced multi-heap support, which allows the activations buffer to be split across multiple memory segments. As a result, the initialization sequence used to instantiate a C model was modified so that the addresses of the different activations buffers can be set. The following sections describe both the v7.0.0 and v7.1.0 APIs, but the v7.0.0 API should be considered deprecated.

Back in STM32CubeIDE, go back to the Software Components selection:

Software components

Add the X-CUBE-AI Core component to include the library and code generation options:

  1. Use the Artificial Intelligence filter.
  2. Enable Core.
  3. Click OK.
CubeAI components

Next, configure the X-CUBE-AI component to use your Keras model:

  1. Expand Additional Software to select STMicroelectronics.X-CUBE-AI.7.1.0
  2. Check to make sure the X-CUBE-AI component is selected
  3. Click Add network
  4. Change the Network Type to Keras
  5. Browse to select the model
  6. (optional) Click on Analyze to view the model memory footprint, occupation and complexity.
  7. Save or Project > Generate Code
X-CUBE-AI: Add model

Info: For complex models, it is recommended to increase the application stack size. Stack size usage can be found using the X-CUBE-AI System Performance application component.

6.3 Include headers for STM32Cube.AI

Add both the Cube.AI runtime interface header file (ai_platform.h) and model-specific header files generated by Cube.AI (network.h and network_data.h).

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "lsm6dsl.h"
#include "b_l475e_iot01a1_bus.h"
#include "ai_platform.h"
#include "network.h"
#include "network_data.h"
#include <stdio.h>
/* USER CODE END Includes */

6.4 Declare neural-network buffers

With default generation options, three additional buffers should be allocated: the activations, input and output buffers. The activations buffer is a private memory space for the Cube.AI runtime; during inference, it is used to store intermediate results. Declare a neural-network input and output buffer (aiInData and aiOutData). The corresponding output labels must also be added to activities.

X-CUBE-AI v7.1.0 version:

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
ai_handle network;
float aiInData[AI_NETWORK_IN_1_SIZE];
float aiOutData[AI_NETWORK_OUT_1_SIZE];
ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
const char* activities[AI_NETWORK_OUT_1_SIZE] = {
  "stationary", "walking", "running"
};
ai_buffer * ai_input;
ai_buffer * ai_output;
/* USER CODE END PV */

X-CUBE-AI v7.0.0 version (deprecated):

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
ai_handle network;
float aiInData[AI_NETWORK_IN_1_SIZE];
float aiOutData[AI_NETWORK_OUT_1_SIZE];
uint8_t activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
const char* activities[AI_NETWORK_OUT_1_SIZE] = {
  "stationary", "walking", "running"
};
/* USER CODE END PV */

6.5 Add AI bootstrapping functions

In the list of function prototypes, add the following declarations:

X-CUBE-AI v7.1.0 version:

/* Private function prototypes -----------------------------------------------*/
void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_USART1_UART_Init(void);
static void MX_CRC_Init(void);
/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
static void AI_Init(void);
static void AI_Run(float *pIn, float *pOut);
static uint32_t argmax(const float * values, uint32_t len);
/* USER CODE END PFP */

X-CUBE-AI v7.0.0 version (deprecated):

/* Private function prototypes -----------------------------------------------*/
void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_USART1_UART_Init(void);
static void MX_CRC_Init(void);
/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
static void AI_Init(ai_handle w_addr, ai_handle act_addr);
static void AI_Run(float *pIn, float *pOut);
static uint32_t argmax(const float * values, uint32_t len);
/* USER CODE END PFP */

Then add the following code snippets to use the STM32Cube.AI library for models with float32 inputs.

X-CUBE-AI v7.1.0 version:

/* USER CODE BEGIN 4 */
/*...*/

static void AI_Init(void)
{
  ai_error err;

  /* Create a local array with the addresses of the activations buffers */
  const ai_handle act_addr[] = { activations };
  /* Create an instance of the model */
  err = ai_network_create_and_init(&network, act_addr, NULL);
  if (err.type != AI_ERROR_NONE) {
    printf("ai_network_create error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
  ai_input = ai_network_inputs_get(network, NULL);
  ai_output = ai_network_outputs_get(network, NULL);
}
/* USER CODE END 4 */

/* USER CODE BEGIN 4 */
/*...*/

static void AI_Run(float *pIn, float *pOut)
{
  ai_i32 batch;
  ai_error err;

  /* Update IO handlers with the data payload */
  ai_input[0].data = AI_HANDLE_PTR(pIn);
  ai_output[0].data = AI_HANDLE_PTR(pOut);

  batch = ai_network_run(network, ai_input, ai_output);
  if (batch != 1) {
    err = ai_network_get_error(network);
    printf("AI ai_network_run error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */

X-CUBE-AI v7.0.0 version (deprecated):

/* USER CODE BEGIN 4 */
/*...*/

static void AI_Init(ai_handle w_addr, ai_handle act_addr)
{
  ai_error err;

  /* 1 - Create an instance of the model */
  err = ai_network_create(&network, AI_NETWORK_DATA_CONFIG);
  if (err.type != AI_ERROR_NONE) {
    printf("ai_network_create error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }

  /* 2 - Initialize the instance */
  const ai_network_params params = AI_NETWORK_PARAMS_INIT(
    AI_NETWORK_DATA_WEIGHTS(w_addr),
    AI_NETWORK_DATA_ACTIVATIONS(act_addr)
  );

  if (!ai_network_init(network, &params)) {
    err = ai_network_get_error(network);
    printf("ai_network_init error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */

/* USER CODE BEGIN 4 */
/*...*/

static void AI_Run(float *pIn, float *pOut)
{
  ai_i32 batch;
  ai_error err;

  /* 1 - Create the AI buffer IO handlers with the default definition */
  ai_buffer ai_input[AI_NETWORK_IN_NUM] = AI_NETWORK_IN;
  ai_buffer ai_output[AI_NETWORK_OUT_NUM] = AI_NETWORK_OUT;

  /* 2 - Update IO handlers with the data payload */
  ai_input[0].n_batches = 1;
  ai_input[0].data = AI_HANDLE_PTR(pIn);
  ai_output[0].n_batches = 1;
  ai_output[0].data = AI_HANDLE_PTR(pOut);

  batch = ai_network_run(network, ai_input, ai_output);
  if (batch != 1) {
    err = ai_network_get_error(network);
    printf("AI ai_network_run error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */

6.6 Create an argmax function

Create an argmax function to return the index of the highest scored output.

/* USER CODE BEGIN 4 */
/*...*/

static uint32_t argmax(const float * values, uint32_t len)
{
  float max_value = values[0];
  uint32_t max_index = 0;
  for (uint32_t i = 1; i < len; i++) {
    if (values[i] > max_value) {
      max_value = values[i];
      max_index = i;
    }
  }
  return max_index;
}
/* USER CODE END 4 */

6.7 Call the previously implemented AI_Init() function

int main(void)
{
  /* ... */

  /* USER CODE BEGIN 2 */

  dataRdyIntReceived = 0;
  MEMS_Init();
  AI_Init();
  /* X-CUBE-AI v7.0.0 deprecated : comment line above and uncomment line below */
  /* AI_Init(ai_network_data_weights_get(), activations); */

  /* USER CODE END 2 */

6.8 Update the main while loop

Finally, put everything together with the following changes in your main while loop:

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  uint32_t write_index = 0;
  while (1)
  {
    if (dataRdyIntReceived != 0) {
      dataRdyIntReceived = 0;
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      // printf("% 5d, % 5d, % 5d\r\n",  (int) acc_axes.x, (int) acc_axes.y, (int) acc_axes.z);

      /* Normalize data to [-1; 1] and accumulate into input buffer */
      /* Note: window overlapping can be managed here */
      aiInData[write_index + 0] = (float) acc_axes.x / 4000.0f;
      aiInData[write_index + 1] = (float) acc_axes.y / 4000.0f;
      aiInData[write_index + 2] = (float) acc_axes.z / 4000.0f;
      write_index += 3;

      if (write_index == AI_NETWORK_IN_1_SIZE) {
        write_index = 0;

        printf("Running inference\r\n");
        AI_Run(aiInData, aiOutData);

        /* Output results */
        for (uint32_t i = 0; i < AI_NETWORK_OUT_1_SIZE; i++) {
          printf("%8.6f ", aiOutData[i]);
        }
        uint32_t class = argmax(aiOutData, AI_NETWORK_OUT_1_SIZE);
        printf(": %d - %s\r\n", (int) class, activities[class]);
      }
    }
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}
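
The comment in the loop above notes that window overlapping can be managed when samples are accumulated. One possible approach is sketched below, assuming a 50% overlap; WINDOW_SIZE, OVERLAP_SIZE, window_push() and window_slide() are hypothetical names, and the window size is a placeholder for AI_NETWORK_IN_1_SIZE. Whether overlapping helps depends on how the model was trained.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for illustration; on the target, AI_NETWORK_IN_1_SIZE gives the real input size. */
#define WINDOW_SIZE 12u                  /* 4 samples x 3 axes */
#define OVERLAP_SIZE (WINDOW_SIZE / 2u)  /* keep the newest 50% (must be a multiple of 3) */

static float window[WINDOW_SIZE];
static uint32_t write_index = 0;

/* Append one normalized (x, y, z) sample. Returns 1 when the window is full
 * and ready for inference, 0 otherwise. */
static int window_push(float x, float y, float z)
{
  window[write_index + 0] = x;
  window[write_index + 1] = y;
  window[write_index + 2] = z;
  write_index += 3;
  return write_index == WINDOW_SIZE;
}

/* After inference, move the newest OVERLAP_SIZE values to the start of the
 * buffer so that the next window reuses them. */
static void window_slide(void)
{
  memmove(window, &window[WINDOW_SIZE - OVERLAP_SIZE], OVERLAP_SIZE * sizeof(float));
  write_index = OVERLAP_SIZE;
}
```

In the main loop, window_slide() would replace the write_index = 0 reset, so that an inference runs after every half window of new samples instead of every full window. This doubles the inference rate and should be accounted for in the timing budget.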

6.9 Enable float with printf in the build settings

When using GCC and Newlib-nano, formatted input/output of floating-point numbers is implemented as a weak symbol. To use %f, you must pull in the symbol by explicitly specifying the -u _printf_float option in the GCC linker flags. This option can be added to the project build settings:

  1. Open the project properties in Project > Properties.
  2. Expand C/C++ Build and go to Settings.
  3. Under the Tool Settings tab, enable Use float with printf from newlib-nano (-u _printf_float).
STM32CubeIDE Build settings to enable printf with float

6.10 Compile, download and run

You can now compile, download and run your project to test the application using live sensor data. Try to move the board around at different speeds to simulate human activities.

  • At idle, when the board is at rest, the serial output should display “stationary”.
  • If you move the board up and down slowly to moderately fast, the serial output should display “walking”.
  • If you shake the board quickly, the serial output should display “running”.
TeraTerm: HAR Output

7 Real-time scheduling considerations

In order to provide real-time results and allow the STM32 application to read accelerometer data without interrupting the model inference, all tasks inside the main while loop should be able to execute under 38 ms (1/26 Hz). For example, we can have the following scheduling breakdown on the STM32L4 @ 80 MHz:

  • 0.33 ms - Sensor data acquisition over I2C (3x2 bytes of data (16-bit x, y, z values) + 1 byte for sensitivity read).
  • 4 ms - Sensor data output UART printf (40 bytes; 3-axis values in ASCII)
  • < 0.1 ms - Preprocessing (data normalization)
  • AI model inference time:
    • ~6 ms for the provided model.h5 (used here)
    • ~4 ms for the IGN_WSDM model
    • ~11 ms for the GMP model
  • 4 ms - Output inference results over UART (40 bytes in ASCII)
logic mems.png

Info: To provide more breathing space for time-consuming tasks such as running a neural network inference and sending blocking serial output messages with printf, the LSM6DSL sensor can be configured to use its internal FIFO to accumulate accelerometer data and wake up the MCU once all the acceleration values required for the model inference have been captured. The FIFO threshold size can be adjusted to the neural network model input size.

Info: Inference time and memory footprint can also be reduced when using a quantized model.
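
The scheduling breakdown above can be checked with a few lines of arithmetic. The sketch below assumes 8N1 UART framing (10 bits on the wire per byte) and uses the approximate timings listed above for the provided model.h5; the function names are hypothetical and the figures are estimates, not measurements.

```c
#include <stdint.h>

/* Time in ms to send n_bytes over a UART with 8N1 framing (10 bits per byte on the wire). */
static float uart_time_ms(uint32_t n_bytes, uint32_t baud)
{
  return (float) n_bytes * 10.0f * 1000.0f / (float) baud;
}

/* Sample period in ms for a given output data rate in Hz. */
static float sample_period_ms(float odr_hz)
{
  return 1000.0f / odr_hz;
}

/* Sum of the per-iteration task times listed above, in ms, for the provided model.h5. */
static float loop_budget_ms(void)
{
  const float i2c_ms = 0.33f;
  const float print_samples_ms = 4.0f;
  const float preprocessing_ms = 0.1f;  /* upper bound */
  const float inference_ms = 6.0f;      /* approximate */
  const float print_results_ms = 4.0f;
  return i2c_ms + print_samples_ms + preprocessing_ms + inference_ms + print_results_ms;
}
```

With these figures, 40 bytes at 115200 baud take about 3.5 ms (rounded up to 4 ms in the list), and the longest iteration takes about 14.4 ms, which fits within the sample period of about 38.5 ms at 26 Hz.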

7.1 Overrun protection

If you want to guarantee real-time execution of your model and application, you can add sensor reading overrun detection by configuring the LSM6DSL data-ready interrupt signal to pulse even if its values have not yet been read. The dataRdyIntReceived counter is incremented each time the STM32 sees a pulse from the sensor, even if the main thread is busy with a time-consuming task. The counter can then be checked prior to each data read to make sure there has not been more than one pulse since the last reading.

  • Change the sensor initialization routine to configure DRDY INT1 signal in pulse mode:
static void MEMS_Init(void)
{
  /* ... */

  /* Configure the LSM6DSL accelerometer (ODR, scale and interrupt) */
  LSM6DSL_ACC_SetOutputDataRate(&MotionSensor, 26.0f); /* 26 Hz */
  LSM6DSL_ACC_SetFullScale(&MotionSensor, 4);          /* [-4000mg; +4000mg] */
  LSM6DSL_Set_DRDY_Mode(&MotionSensor, 1);             /* DRDY pulsed mode */
  LSM6DSL_ACC_Set_INT1_DRDY(&MotionSensor, ENABLE);    /* Enable DRDY */
  LSM6DSL_ACC_GetAxesRaw(&MotionSensor, &axes);        /* Clear DRDY */

  /* Start the LSM6DSL accelerometer */
  LSM6DSL_ACC_Enable(&MotionSensor);
}

  • And check that there has not been more than one DRDY pulse since the last sensor reading:
  while (1)
  {

    if (dataRdyIntReceived != 0) {
      if (dataRdyIntReceived != 1) {
        printf("Overrun error: new data available before reading previous data.\r\n");
        Error_Handler();
      }
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      dataRdyIntReceived = 0;

      /* ... */

    }
  }

If you don't get any errors, you are good to go. Otherwise, you might want to use a smaller model, limit the number of printf calls, or consider another approach such as using a FIFO or an RTOS. To check that your overrun detection is working correctly, you can try adding a simple HAL_Delay() to trigger an overrun error.
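
The overrun check can also be exercised on a host before testing on the board. The sketch below simulates DRDY pulses with a plain function call; simulate_drdy_pulse() and take_sample_and_count_missed() are hypothetical names, and only the dataRdyIntReceived counter comes from the application code above.

```c
#include <stdint.h>

/* Counter incremented once per DRDY pulse; on the target this is done in HAL_GPIO_EXTI_Callback(). */
static volatile uint32_t dataRdyIntReceived = 0;

/* Simulates one DRDY pulse from the sensor (host-side stand-in for the interrupt). */
static void simulate_drdy_pulse(void)
{
  dataRdyIntReceived++;
}

/* Hypothetical helper: returns the number of samples missed since the last read
 * (0 means no overrun) and clears the counter. Returns -1 if no pulse is pending.
 * Note: on the target, a pulse arriving between the read and the clear would be lost;
 * masking the interrupt around these two statements is one way to avoid that (not shown). */
static int32_t take_sample_and_count_missed(void)
{
  uint32_t pending = dataRdyIntReceived;
  if (pending == 0) {
    return -1;
  }
  dataRdyIntReceived = 0;
  return (int32_t) (pending - 1u);
}
```

Calling simulate_drdy_pulse() several times before take_sample_and_count_missed() plays the same role as the HAL_Delay() experiment suggested above: the helper then reports how many samples were missed instead of zero.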

8 Closing thoughts

Now that you have seen how to capture and record data, you can:

  • create additional data captures to make the dataset more robust against model overfitting. It is a good idea to vary the sensor position and user.
  • capture new classes of activities (such as cycling, automotive, skiing and others) to enrich the dataset.
  • experiment with different model architectures for other use-cases.

The board offers many other sensing and connectivity options:

  • MEMS microphones: for audio and voice applications
  • Other motion sensors (gyroscope, magnetometer)
  • Environmental sensors (temperature & humidity)
  • VL53L0X Time-of-Flight (ToF) proximity sensor
  • Connectivity (Bluetooth® Low Energy, Wi-Fi® and Sub-GHz)
  • ARDUINO® and Pmod™ connectors

9 References


In this guide, you will learn how to create a motion sensing application to recognize human activities using machine learning on an STM32 microcontroller.

The model used classifies activities such as ''stationary'', ''walking'', or ''running'' from accelerometer data provided by the [https://www.st.com/en/product/lsm6dsl LSM6DSL] sensor.

We will be creating a '''Human Activity Recognition (HAR)''' application for the '''STM32L4 Discovery kit IoT node''' [https://www.st.com/en/product/b-l475e-iot01a B-L475E-IOT01A] development board.

The board is powered by an [https://www.st.com/en/product/stm32l475vg STM32L475VG] microcontroller (Arm<sup>®</sup> Cortex<sup>®</sup>-M4F MCU operating at 80 MHz with 1 Mbyte of Flash memory and 128 Kbytes of SRAM).

The following videos can also be used to follow along with this article:
* Part 1: https://www.youtube.com/watch?v=Djj2WMPgdsQ 
* Part 2: https://www.youtube.com/watch?v=VznrX-Uyv3U
* Part 3: https://www.youtube.com/watch?v=dd6DdY4To9Q
<div class="res-img">

[[File:har.png|900px|center|Human Activity Recognition (HAR)]]</div>


== What you will learn ==

* How to read motion sensor data.
* How to generate neural network code for STM32 using X-CUBE-AI.
* How to input sensor data into a neural network code.
* How to output inference results.

== Requirements ==

* [https://www.st.com/en/product/b-l475e-iot01a B-L475E-IOT01A]: '''STM32L4 Discovery kit IoT node'''.
* [https://www.st.com/en/product/stm32cubeide STM32CubeIDE] (v1.6.1 or later) with:
** [https://www.st.com/en/product/x-cube-mems1 X-CUBE-MEMS1] (v8.3.0 or later) - for motion sensor component drivers.
** [https://www.st.com/en/product/x-cube-ai X-CUBE-AI] (v7.0.0 or later) - for the neural network conversion tool &amp; network runtime library.
** <u>Note</u>: X-CUBE-AI and X-CUBE-MEMS1 installation is not covered in this tutorial. Installation instructions can be found in [https://www.st.com/resource/en/user_manual/dm00104712.pdf UM1718], Section 3.4.4 (''Installing embedded software packs'').
* A serial terminal application (such as [https://ttssh2.osdn.jp/ Tera Term], [https://www.putty.org/ PuTTY], [https://www.gnu.org/software/screen/ GNU Screen] or others).
* A Keras model. You can either download the pre-trained [https://github.com/STMicroelectronics/stm32ai/raw/master/AI_resources/HAR/model.h5 model.h5] or create your own model using your own data captures.
<div class="res-img">

[[File:ST7309 B-L475E-IOT01Ax-scr.jpg|500px|center|B-L475E-IOT01Ax]]</div>


== Create a new project ==

Create a new STM32 project: '''File > New > STM32 Project'''.
<div class="res-img">

[[File:cubeide_newproject.png|center|STM32CubeIDE New STM32 Project]]</div>


=== Select your board ===

# Open the '''Board Selector''' tab.
# Search for '''B-L475E-IOT01A1'''.
# Select the board in the bottom-right board list table.
# Click on '''Next''' to configure the project name and location.
# When prompted to initialize all peripherals with their default mode, select '''No''' to avoid generating unused code.
<div class="res-img">

[[File:cubeide_selectboard.png|900px|center|Board selection]]</div>


=== Add the LSM6DSL sensor driver from X-CUBE-MEMS Components ===
Open the Software Packs Component Selector: '''Software Packs > Select Components'''.

* In the '''Pinout & Configuration''' tab, click on '''Software Packs > Select Components'''.
<div class="res-img">

[[File:cubeide3.png|center|Select Components]]</div>


Select the LSM6DSL inertial module with an I2C interface.

* Apply the '''Board Part''' filter.
* Under the '''STMicroelectronics.X-CUBE-MEMS1''' bundle, select '''Board Part''' > '''AccGyr / LSM6DSL I2C''' and click on '''OK'''.
<div class="res-img">

[[File:cubeide_swpacks.png|center|900px|MEMS Board Component]]</div>


=== Configure the I<sup>2</sup>C bus interface ===

# Back in STM32CubeMX, choose the '''I2C2''' Connectivity interface.
# Enable the I<sup>2</sup>C bus interface by choosing mode '''I2C'''.
# Set the I<sup>2</sup>C mode to '''Fast Mode''' and the speed to '''400 kHz''' (the LSM6DSL maximum I<sup>2</sup>C clock frequency).
# Under the '''GPIO settings''' tab, associate the '''PB10''' and '''PB11''' pins with the I2C2 interface.
<div class="res-img">

[[File:cubeide i2c.png|900px|center|I2C2 configuration]]</div>


=== Configure the LSM6DSL interrupt line ===

To receive INT1 signals from the LSM6DSL sensor, make sure that the MCU GPIO pin PD11 is configured to receive external interrupts.

# In the '''GPIO''' settings (under ''System Core''), set '''PD11''' to '''External Interrupt Mode with Rising edge trigger detection''' (EXTI_RISING).
# In the '''NVIC''' settings (under ''System Core''), enable '''EXTI line[15:10] interrupts''' to handle the EXTI11 interrupt.

=== Configure X-CUBE-MEMS1 ===

# Expand '''Software Packs''' to select '''STMicroelectronics.X-CUBE-MEMS1.8.3.0'''.
# Enable '''Board Part AccGyr'''.
# Configure the '''Platform settings''' with '''I2C2''' from the dropdown menu.
<div class="res-img">

[[File:cubeide mems.png|900px|center|STM32CubeMX MEMS Component configuration]]</div>


=== Configure the UART communication interface ===

Configure the USART1 interface in asynchronous mode with the default parameters (115200 baud, 8-bit, no parity and 1 stop bit). You can also check the GPIO settings to view the associated USART1 GPIO. <code>PB6</code> and <code>PB7</code> should be configured as ''Alternate Function Push-Pull''.

# Select the '''USART1''' Connectivity interface.
# Enable '''Asynchronous''' mode.
# If not already configured, set the '''Baud Rate''' to '''115200 bit/s'''.
# In the '''Pinout''' view or under the '''GPIO Settings''' tab, ensure '''PB6''' and '''PB7''' are associated with USART1.
<div class="res-img">

[[File:cubeide uart1.png|900px|center|UART1 configuration]]</div>


=== Generate code ===

The MCU configuration is now complete and you can generate code either by saving your project or by selecting '''Project > Generate Code'''.
<div class="res-img">

[[File:cubeide generate.png|center|Code generation]]</div>


== Bootstrap the code in <code>main.c</code> ==

=== Open <code>main.c</code> ===

In the Project Explorer pane, double-click on <code>Core/Src/main.c</code> to open the code editor for the user application code.
<div class="res-img">

[[File:cubeide main.png|center|STM32CubeIDE Project Explorer]]</div>


=== Include headers for the LSM6DSL sensor ===

Add the LSM6DSL driver and I<sup>2</sup>C bus header files. <code>stdio.h</code> is also included because <code>printf</code> is used for output.

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="3-5">

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "lsm6dsl.h"
#include "b_l475e_iot01a1_bus.h"
#include <stdio.h>

/* USER CODE END Includes */</source>

}}

=== Create a global LSM6DSL motion sensor instance and data available status flag ===

The <code>dataRdyIntReceived</code> flag is marked as <code>volatile</code> because it is modified by an interrupt service routine:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="5,6">

/* Private variables ---------------------------------------------------------*/
UART_HandleTypeDef huart1;

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
/* USER CODE END PV */</source>

}}

=== Define the <code>MEMS_Init()</code> function to configure the LSM6DSL motion sensor ===

Add a <code>MEMS_Init()</code> function declaration:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="2">

/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
/* USER CODE END PFP */</source>

}}

Use the following sequence to configure the LSM6DSL sensor with these settings:

* Range: ±4 g
* Output Data Rate (ODR): 26 Hz
* Linear acceleration sensitivity: 0.122 mg/LSB (FS = ±4 g)
* Resolution: 16 bits (little endian by default)

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="2-35">

/* USER CODE BEGIN 4 */
static void MEMS_Init(void)
{
  LSM6DSL_IO_t io_ctx;
  uint8_t id;
  LSM6DSL_AxesRaw_t axes;

  /* Link I2C functions to the LSM6DSL driver */
  io_ctx.BusType     = LSM6DSL_I2C_BUS;
  io_ctx.Address     = LSM6DSL_I2C_ADD_L;
  io_ctx.Init        = BSP_I2C2_Init;
  io_ctx.DeInit      = BSP_I2C2_DeInit;
  io_ctx.ReadReg     = BSP_I2C2_ReadReg;
  io_ctx.WriteReg    = BSP_I2C2_WriteReg;
  io_ctx.GetTick     = BSP_GetTick;
  LSM6DSL_RegisterBusIO(&MotionSensor, &io_ctx);

  /* Read the LSM6DSL WHO_AM_I register */
  LSM6DSL_ReadID(&MotionSensor, &id);
  if (id != LSM6DSL_ID) {
    Error_Handler();
  }

  /* Initialize the LSM6DSL sensor */
  LSM6DSL_Init(&MotionSensor);

  /* Configure the LSM6DSL accelerometer (ODR, scale and interrupt) */
  LSM6DSL_ACC_SetOutputDataRate(&MotionSensor, 26.0f); /* 26 Hz */
  LSM6DSL_ACC_SetFullScale(&MotionSensor, 4);          /* [-4000mg; +4000mg] */
  LSM6DSL_ACC_Set_INT1_DRDY(&MotionSensor, ENABLE);    /* Enable DRDY */
  LSM6DSL_ACC_GetAxesRaw(&MotionSensor, &axes);        /* Clear DRDY */

  /* Start the LSM6DSL accelerometer */
  LSM6DSL_ACC_Enable(&MotionSensor);
}
/* USER CODE END 4 */</source>

}}

'''<u>Info:</u>''' For code simplicity and readability, status code return value checking has been omitted. It is strongly advised to check if the return status is equal to <code>LSM6DSL_OK</code>.
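
The status check recommended in the note above could be sketched as follows. This is a minimal, host-compilable illustration, not the driver's own mechanism: <code>MEMS_CHECK</code>, <code>error_count</code> and <code>fake_driver_call()</code> are hypothetical names, and <code>DRIVER_OK</code> is a stand-in for <code>LSM6DSL_OK</code> (assumed here to be 0).

```c
#include <stdint.h>

/* Stand-in for LSM6DSL_OK (assumption: the driver reports success as 0). */
#define DRIVER_OK 0

/* Hypothetical error counter; on the target, call Error_Handler() instead. */
static uint32_t error_count = 0;

/* Evaluate a driver call once and record a failure if it is not DRIVER_OK. */
#define MEMS_CHECK(call)          \
  do {                            \
    if ((call) != DRIVER_OK) {    \
      error_count++;              \
    }                             \
  } while (0)

/* Hypothetical driver call, used only to exercise the macro on a host. */
static int32_t fake_driver_call(int32_t status)
{
  return status;
}
```

In <code>MEMS_Init()</code>, the same pattern would wrap each driver call, for example <code>MEMS_CHECK(LSM6DSL_Init(&MotionSensor));</code>, with the failure branch calling <code>Error_Handler()</code>.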

=== Add a callback to the LSM6DSL sensor interrupt line (INT1 signal on GPIO PD11) ===

This callback sets the <code>dataRdyIntReceived</code> status flag when a new set of measurement data is available to be read:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="4-9">

/* USER CODE BEGIN 4 */
/*...*/

void HAL_GPIO_EXTI_Callback(uint16_t GPIO_Pin)
{
  if (GPIO_Pin == GPIO_PIN_11) {
    dataRdyIntReceived++;
  }
}
/* USER CODE END 4 */</source>

}}

=== Retarget <code>printf</code> to a UART serial port ===

Retarget the printf output to the serial UART interface. Insert the following code snippet if you are working with a GCC toolchain:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="3-7">

/*...*/

int _write(int fd, char * ptr, int len)
{
  HAL_UART_Transmit(&huart1, (uint8_t *) ptr, len, HAL_MAX_DELAY);
  return len;
}
/* USER CODE END 4 */</source>

}}

'''<u>Info:</u>'''  stdout redirection is toolchain dependent. Example implementations for other compilers can be found in <code>aiSystemPerformance.c</code> from the ''SystemPerformance'' application.

=== Implement an <code>Error_Handler()</code> ===

To protect your application against potential issues, it is recommended to implement a trapping mechanism in the <code>Error_Handler()</code> function.

* Create an infinite <code>while</code> loop to trap errors and blink the board’s LED.

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="4-7">

void Error_Handler(void)
{
  /* USER CODE BEGIN Error_Handler_Debug */
  while(1) {
    HAL_GPIO_TogglePin(LED2_GPIO_Port, LED2_Pin);
    HAL_Delay(50); /* wait 50 ms */
  }
  /* USER CODE END Error_Handler_Debug */
}</source>

}}

== Read accelerometer data ==

=== Call the previously implemented <code>MEMS_Init()</code> function ===

{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="7-8">

int main(void)
{
  /* ... */

  /* USER CODE BEGIN 2 */

  dataRdyIntReceived = 0;
  MEMS_Init();

  /* USER CODE END 2 */</source>

}}

=== Add code to read acceleration data ===

{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="9-14">

int main(void)
{
  /* ... */

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  while (1)
  {
    if (dataRdyIntReceived != 0) {
      dataRdyIntReceived = 0;
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      printf("% 5d, % 5d, % 5d\r\n",  (int) acc_axes.x, (int) acc_axes.y, (int) acc_axes.z);
    }
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}</source>

}}

=== Compile, download and run ===

* Build your project: '''Project > Build All'''

* '''Run > Run As > 1 STM32 Cortex-M C/C++ Application'''. Then click '''Run'''.

* Open a serial monitor application (such as Tera Term, PuTTY or GNU Screen) and connect to the ST-LINK Virtual COM port (COM port number may differ on your machine).
<div class="res-img">

[[File:teraterm1.png|700px|center|TeraTerm: New Connection]]</div>


* Configure the communication settings to 115200 baud, 8-bit, no parity and 1 stop bit.
<div class="res-img">

[[File:teraterm2.png|300px|center|TeraTerm Serial port settings]]</div>


=== Visualize and capture live sensor data ===

When running your application, a new set of three acceleration axis values (''x'', ''y'', ''z'') should be displayed in the serial output every 1/26 Hz ≈ 38 ms. When the board is sitting upright on your desk (not moving), the ''x'' and ''y'' values should be close to 0 g and ''z'' should be close to 1 g (1000 mg).
<div class="res-img">

[[File:teraterm xyz.png|center|TeraTerm xyz output]]</div>
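
The figures quoted above (a sample roughly every 38 ms and about 1000 mg on the ''z'' axis at rest) follow from the sensor settings chosen earlier. The sketch below reproduces that arithmetic on a host machine; <code>raw_to_mg()</code> and <code>sample_period_ms()</code> are hypothetical helper names, and the raw count used in the usage note is only an illustrative value.

```c
#include <stdint.h>

/* Sensitivity at FS = +/-4 g and output data rate, as configured earlier. */
#define ACC_SENSITIVITY_MG_PER_LSB 0.122f
#define ACC_ODR_HZ                 26.0f

/* Convert one raw 16-bit accelerometer sample to milli-g. */
static float raw_to_mg(int16_t raw)
{
  return (float) raw * ACC_SENSITIVITY_MG_PER_LSB;
}

/* Period between two samples in milliseconds (1/26 Hz is about 38.5 ms). */
static float sample_period_ms(void)
{
  return 1000.0f / ACC_ODR_HZ;
}
```

For example, a raw count of 8197 corresponds to about 1000 mg, which is what the ''z'' axis should report when the board is at rest.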


* To capture data, you can copy and paste the serial output into a <code>.csv</code> text file or use the following command on a Linux machine:
<pre>$ cat /dev/ttyXXXX > capture.csv</pre>


With every new capture, you can plot the data using <code>matplotlib</code> or any other tool capable of reading CSV data. Below is an example of a simple Python script to plot (''x'', ''y'', ''z'') data captures:
<source lang="python">

# plotter.py
# Copyright (c) STMicroelectronics
# License: BSD-3-Clause
import argparse
import numpy as np
import matplotlib.pyplot as plt

parser = argparse.ArgumentParser()
parser.add_argument('filename')
args = parser.parse_args()

data = np.loadtxt(args.filename, delimiter=',')

plt.figure(figsize=(10, 3))
plt.plot(data[:, 0], label='x')
plt.plot(data[:, 1], label='y')
plt.plot(data[:, 2], label='z')
plt.legend()
plt.show()</source>


* You can then visualize the captured data with the following command:<pre>$ python plotter.py capture.csv</pre>

<div class="res-img">

[[File:matplotlib memsread.png|center|Motion data plot]]</div>


If desired, the <code>.csv</code> file can be manually cleaned up using a text editor. Once you are satisfied with your captures, they can be used to train a machine learning model.

== Create an STM32Cube.AI application using X-CUBE-AI ==

=== Create your model ===

For this example, we will be using a Keras model trained using a small dataset created specifically for this example. Either download the pre-trained [https://github.com/STMicroelectronics/stm32ai/raw/master/AI_resources/HAR/model.h5 model.h5] or create your own model using your own data captures.

* Instructions on how to create and train your own model can be found in the following Python Notebook: https://colab.research.google.com/github/STMicroelectronics/stm32ai/blob/master/AI_resources/HAR/Human_Activity_Recognition.ipynb

* [https://github.com/STMicroelectronics/stm32ai/raw/master/AI_resources/HAR/dataset.zip dataset.zip] is a ready-to-use dataset of 3-axis acceleration data for various human activities.

{{ Info | The model will be using unprocessed data. For increased accuracy, data preprocessing such as gravity rotation and suppression can be added. This requires a model re-training to include the data preprocessing. Data preprocessing usage and alternative models can be found in [https://www.st.com/en/embedded-software/fp-ai-sensing1.html FP-AI-SENSING1]. If using a different model, make sure to adjust sensor settings (data rate and scale) accordingly to match the training dataset.}}

=== Add STM32Cube.AI to your project ===

X-CUBE-AI v7.1.0 introduced multi-heap support, which allows the activations buffer to be split across multiple memory segments.
As a result, the initialization sequence used to instantiate a C model has changed so that the addresses of the different activations buffers can be set.
The following sections describe both the v7.0.0 and v7.1.0 APIs, but the v7.0.0 API should be considered deprecated.

Back in STM32CubeIDE, go back to the Software Components selection:
<div class="res-img">

[[File:cubeide swpacks2.png|center|Software components]]</div>


Add the '''X-CUBE-AI''' '''Core''' component to include the library and code generation options:

# Use the '''Artificial Intelligence''' filter.
# Enable '''Core'''.
# Click '''OK'''.
<div class="res-img">

[[File:cubeide_swpacks3.png|center|CubeAI components]]</div>


Next, configure the '''X-CUBE-AI''' component to use your Keras model:

# Expand '''Additional Software''' to select '''STMicroelectronics.X-CUBE-AI.7.1.0'''
# Check to make sure the '''X-CUBE-AI''' component is selected
# Click '''Add network'''
# Change the '''Network Type''' to '''Keras'''
# Browse to select the model
# (optional) Click on '''Analyze''' to view the model memory footprint, occupation and complexity.
# '''Save''' or '''Project > Generate Code'''
<div class="res-img">

[[File:cubeide cubeai.png|900px|center|X-CUBE-AI: Add model]]</div>


{{ Info | For complex models, it is recommended to increase the application stack size. Stack size usage can be found using the ''X-CUBE-AI System Performance'' application component. }}

=== Include headers for STM32Cube.AI ===

Add both the Cube.AI runtime interface header file (<code>ai_platform.h</code>) and model-specific header files generated by Cube.AI (<code>network.h</code> and <code>network_data.h</code>).

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="5-7">

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */
#include "lsm6dsl.h"
#include "b_l475e_iot01a1_bus.h"
#include "ai_platform.h"
#include "network.h"
#include "network_data.h"
#include <stdio.h>

/* USER CODE END Includes */</source>

}}

=== Declare neural-network buffers ===

With default generation options, three additional buffers should be allocated: the activations, input and output buffers.
The activations buffer is a private memory space for the Cube.AI runtime. During the execution of the inference, it is used to store intermediate results.
Declare the neural-network input and output buffers (<code>aiInData</code> and <code>aiOutData</code>). The corresponding output labels must also be added to <code>activities</code>.

X-CUBE-AI v7.1.0 version:
{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="4-12">

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
ai_handle network;
float aiInData[AI_NETWORK_IN_1_SIZE];
float aiOutData[AI_NETWORK_OUT_1_SIZE];
ai_u8 activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
const char* activities[AI_NETWORK_OUT_1_SIZE] = {
  "stationary", "walking", "running"
};
ai_buffer * ai_input;
ai_buffer * ai_output;
/* USER CODE END PV */</source>

}}

X-CUBE-AI v7.0.0 version (deprecated):
{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="4-10">

/* USER CODE BEGIN PV */
LSM6DSL_Object_t MotionSensor;
volatile uint32_t dataRdyIntReceived;
ai_handle network;
float aiInData[AI_NETWORK_IN_1_SIZE];
float aiOutData[AI_NETWORK_OUT_1_SIZE];
uint8_t activations[AI_NETWORK_DATA_ACTIVATIONS_SIZE];
const char* activities[AI_NETWORK_OUT_1_SIZE] = {
  "stationary", "walking", "running"
};
/* USER CODE END PV */</source>

}}

=== Add AI bootstrapping functions ===

In the list of function prototypes, add the following declarations:

X-CUBE-AI v7.1.0 version:
{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="8-10">

/* Private function prototypes -----------------------------------------------*/
void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_USART1_UART_Init(void);
static void MX_CRC_Init(void);
/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
static void AI_Init(void);
static void AI_Run(float *pIn, float *pOut);
static uint32_t argmax(const float * values, uint32_t len);
/* USER CODE END PFP */</source>

}}

X-CUBE-AI v7.0.0 version (deprecated):
{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="8-10">

/* Private function prototypes -----------------------------------------------*/
void SystemClock_Config(void);
static void MX_GPIO_Init(void);
static void MX_USART1_UART_Init(void);
static void MX_CRC_Init(void);
/* USER CODE BEGIN PFP */
static void MEMS_Init(void);
static void AI_Init(ai_handle w_addr, ai_handle act_addr);
static void AI_Run(float *pIn, float *pOut);
static uint32_t argmax(const float * values, uint32_t len);
/* USER CODE END PFP */</source>

}}

And add the following code snippets to use the STM32Cube.AI library for models with float32 inputs.

X-CUBE-AI v7.1.0 version:
{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="4-18">
/* USER CODE BEGIN 4 */
/*...*/

static void AI_Init(void)
{
  ai_error err;

  /* Create a local array with the addresses of the activations buffers */
  const ai_handle act_addr[] = { activations };
  /* Create an instance of the model */
  err = ai_network_create_and_init(&network, act_addr, NULL);
  if (err.type != AI_ERROR_NONE) {
    printf("ai_network_create_and_init error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
  ai_input = ai_network_inputs_get(network, NULL);
  ai_output = ai_network_outputs_get(network, NULL);
}
/* USER CODE END 4 */</source>

}}

{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="4-19">
/* USER CODE BEGIN 4 */
/*...*/

static void AI_Run(float *pIn, float *pOut)
{
  ai_i32 batch;
  ai_error err;

  /* Update IO handlers with the data payload */
  ai_input[0].data = AI_HANDLE_PTR(pIn);
  ai_output[0].data = AI_HANDLE_PTR(pOut);

  batch = ai_network_run(network, ai_input, ai_output);
  if (batch != 1) {
    err = ai_network_get_error(network);
    printf("AI ai_network_run error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */</source>

}}

X-CUBE-AI v7.0.0 version (deprecated):
{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="4-26">
/* USER CODE BEGIN 4 */
/*...*/

static void AI_Init(ai_handle w_addr, ai_handle act_addr)
{
  ai_error err;

  /* 1 - Create an instance of the model */
  err = ai_network_create(&network, AI_NETWORK_DATA_CONFIG);
  if (err.type != AI_ERROR_NONE) {
    printf("ai_network_create error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }

  /* 2 - Initialize the instance */
  const ai_network_params params = AI_NETWORK_PARAMS_INIT(
    AI_NETWORK_DATA_WEIGHTS(w_addr),
    AI_NETWORK_DATA_ACTIVATIONS(act_addr)
  );

  if (!ai_network_init(network, &params)) {
    err = ai_network_get_error(network);
    printf("ai_network_init error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */</source>

}}

{{Snippet | category=AI | component=Application | snippet=<source lang="c"  highlight="4-25">
/* USER CODE BEGIN 4 */
/*...*/

static void AI_Run(float *pIn, float *pOut)
{
  ai_i32 batch;
  ai_error err;

  /* 1 - Create the AI buffer IO handlers with the default definition */
  ai_buffer ai_input[AI_NETWORK_IN_NUM] = AI_NETWORK_IN;
  ai_buffer ai_output[AI_NETWORK_OUT_NUM] = AI_NETWORK_OUT;

  /* 2 - Update IO handlers with the data payload */
  ai_input[0].n_batches = 1;
  ai_input[0].data = AI_HANDLE_PTR(pIn);
  ai_output[0].n_batches = 1;
  ai_output[0].data = AI_HANDLE_PTR(pOut);

  batch = ai_network_run(network, ai_input, ai_output);
  if (batch != 1) {
    err = ai_network_get_error(network);
    printf("AI ai_network_run error - type=%d code=%d\r\n", err.type, err.code);
    Error_Handler();
  }
}
/* USER CODE END 4 */</source>

}}

=== Create an <code>argmax</code> function ===

Create an <code>argmax</code> function to return the index of the highest scored output.

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="4-15">
/* USER CODE BEGIN 4 */
/*...*/

static uint32_t argmax(const float * values, uint32_t len)
{
  float max_value = values[0];
  uint32_t max_index = 0;
  for (uint32_t i = 1; i < len; i++) {
    if (values[i] > max_value) {
      max_value = values[i];
      max_index = i;
    }
  }
  return max_index;
}
/* USER CODE END 4 */</source>

}}

=== Call the previously implemented <code>AI_Init()</code> function ===

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="9-11">

int main(void)
{
  /* ... */

  /* USER CODE BEGIN 2 */

  dataRdyIntReceived = 0;
  MEMS_Init();
  AI_Init();
  /* X-CUBE-AI v7.0.0 deprecated : comment line above and uncomment line below */
  /* AI_Init(ai_network_data_weights_get(), activations); */

  /* USER CODE END 2 */</source>

}}

=== Update the main <code>while</code> loop ===

Finally, put everything together with the following changes in your main <code>while</code> loop:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="3,10, 12-32">

  /* Infinite loop */
  /* USER CODE BEGIN WHILE */
  uint32_t write_index = 0;
  while (1)
  {
    if (dataRdyIntReceived != 0) {
      dataRdyIntReceived = 0;
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      // printf("% 5d, % 5d, % 5d\r\n",  (int) acc_axes.x, (int) acc_axes.y, (int) acc_axes.z);

      /* Normalize data to [-1; 1] and accumulate into input buffer */
      /* Note: window overlapping can be managed here */
      aiInData[write_index + 0] = (float) acc_axes.x / 4000.0f;
      aiInData[write_index + 1] = (float) acc_axes.y / 4000.0f;
      aiInData[write_index + 2] = (float) acc_axes.z / 4000.0f;
      write_index += 3;

      if (write_index == AI_NETWORK_IN_1_SIZE) {
        write_index = 0;

        printf("Running inference\r\n");
        AI_Run(aiInData, aiOutData);

        /* Output results */
        for (uint32_t i = 0; i < AI_NETWORK_OUT_1_SIZE; i++) {
          printf("%8.6f ", aiOutData[i]);
        }
        uint32_t class = argmax(aiOutData, AI_NETWORK_OUT_1_SIZE);
        printf(": %d - %s\r\n", (int) class, activities[class]);
      }
    }
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */
  }
  /* USER CODE END 3 */
}</source>
}}
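
The normalization and window accumulation performed in the loop above, together with the window overlapping mentioned in its comment, could be sketched as follows. This is a host-compilable illustration under stated assumptions: <code>WINDOW_SAMPLES</code>, <code>OVERLAP_SAMPLES</code>, <code>window_push()</code> and <code>window_slide()</code> are hypothetical names and values that are not part of the generated code. In the project, the window size is given by <code>AI_NETWORK_IN_1_SIZE</code>.

```c
#include <stdint.h>
#include <string.h>

#define AXES            3u
#define WINDOW_SAMPLES  4u   /* hypothetical; use AI_NETWORK_IN_1_SIZE / 3 on target */
#define WINDOW_SIZE     (WINDOW_SAMPLES * AXES)
#define OVERLAP_SAMPLES 2u   /* hypothetical 50 % overlap between windows */
#define FULL_SCALE_MG   4000.0f

static float window[WINDOW_SIZE];
static uint32_t write_index = 0;

/* Normalize one sample (in mg) to [-1; 1] and append it to the window.
 * Returns 1 when the window is full and ready for inference, else 0. */
static int window_push(int32_t x_mg, int32_t y_mg, int32_t z_mg)
{
  window[write_index + 0] = (float) x_mg / FULL_SCALE_MG;
  window[write_index + 1] = (float) y_mg / FULL_SCALE_MG;
  window[write_index + 2] = (float) z_mg / FULL_SCALE_MG;
  write_index += AXES;
  return (write_index == WINDOW_SIZE) ? 1 : 0;
}

/* After an inference, move the newest OVERLAP_SAMPLES samples to the start
 * of the window so that the next window overlaps the previous one. */
static void window_slide(void)
{
  const uint32_t keep = OVERLAP_SAMPLES * AXES;
  memmove(window, &window[WINDOW_SIZE - keep], keep * sizeof(float));
  write_index = keep;
}
```

On the target, <code>window_push()</code> would be called for each new sample. When it returns 1, the inference runs on the window and <code>window_slide()</code> replaces the reset of the write index to 0. Overlapping windows produce results more often but also leave less time between inferences.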

=== Enable float with printf in the build settings ===

When using GCC and Newlib-nano, formatted input/output of floating-point numbers is implemented as a weak symbol. To use <code>%f</code>, you have to pull in the symbol by explicitly specifying the <code>-u _printf_float</code> option in the GCC linker flags. This option can be added to the project build settings:

# Open the project properties in '''Project > Properties'''.
# Expand '''C/C++ Build''' and go to '''Settings'''.
# Under the '''Tool Settings''' tab, enable '''Use float with printf from newlib-nano (-u _printf_float)'''.
<div class="res-img">

[[File:cubeide settings.png|center|STM32CubeIDE Build settings to enable printf with float]]</div>
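
If enabling float support in <code>printf</code> is not desirable (it increases code size), one possible alternative is to print the scores with integer conversions only. The sketch below is an assumption-laden illustration, not part of the tutorial code: <code>format_score()</code> is a hypothetical helper, and it assumes non-negative scores such as classifier probabilities in [0; 1].

```c
#include <stdint.h>
#include <stdio.h>

/* Format a non-negative score as "d.ddd" using integer conversions only,
 * so that no floating-point printf support is required. */
static int format_score(char *buf, size_t len, float score)
{
  /* Round to the nearest thousandth (assumes score >= 0). */
  uint32_t milli = (uint32_t) (score * 1000.0f + 0.5f);
  return snprintf(buf, len, "%lu.%03lu",
                  (unsigned long) (milli / 1000u),
                  (unsigned long) (milli % 1000u));
}
```

With this helper, a score of 0.25 is formatted as <code>0.250</code>.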


=== Compile, download and run ===

You can now '''compile''', '''download''' and '''run''' your project to test the application using live sensor data. Try to move the board around at different speeds to simulate human activities.

* At idle, when the board is at rest, the serial output should display ''“stationary”''.
* If you move the board up and down slowly to moderately fast, the serial output should display ''“walking”''.
* If you shake the board quickly, the serial output should display ''“running”''.
<div class="res-img">

[[File:teraterm har out.png|center|TeraTerm: HAR Output]]</div>


== Real-time scheduling considerations ==

In order to provide real-time results and allow the STM32 application to read accelerometer data without interrupting the model inference, all tasks inside the main <code>while</code> loop should be able to execute under 38 ms (1/26 Hz). For example, we can have the following scheduling breakdown on the STM32L4 @ 80 MHz:

* 0.33 ms - Sensor data acquisition over I<sup>2</sup>C (3x2 bytes of data (16-bit x, y, z values) + 1 byte for sensitivity read).
* 4 ms - Sensor data output UART printf (40 bytes; 3-axis values in ASCII)
* &lt; 0.1 ms - Preprocessing (data normalization)
* AI model inference time:
** ~6 ms for the provided model.h5 (used here)
** ~4 ms for the IGN_WSDM model
** ~11 ms for the GMP model
* 4 ms - Output inference results over UART (40 bytes in ASCII)
<div class="res-img">

[[File:logic mems.png|900px|center]]</div>
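
The budget above can be checked with a few lines of arithmetic. The sketch below is a host-side illustration using the figures listed for the provided model.h5; <code>uart_tx_time_ms()</code> and <code>worst_case_loop_ms()</code> are hypothetical helper names, and the UART estimate assumes 10 bits on the wire per byte (1 start bit, 8 data bits, 1 stop bit).

```c
#include <stdint.h>

/* Time in ms to send n_bytes over a UART, assuming 10 bits per byte
 * on the wire (1 start + 8 data + 1 stop bit). */
static float uart_tx_time_ms(uint32_t n_bytes, uint32_t baud)
{
  return (float) n_bytes * 10.0f * 1000.0f / (float) baud;
}

/* Worst-case duration in ms of one loop iteration in which every task runs,
 * using the figures listed above for the provided model.h5. */
static float worst_case_loop_ms(void)
{
  const float sensor_read_ms = 0.33f;
  const float uart_print_ms  = 4.0f;  /* one 40-byte message, rounded up */
  const float preprocess_ms  = 0.1f;
  const float inference_ms   = 6.0f;
  return sensor_read_ms + uart_print_ms + preprocess_ms
       + inference_ms + uart_print_ms;
}
```

Sending 40 bytes at 115200 baud takes about 3.5 ms, which is consistent with the 4 ms listed above. The worst-case total is about 14.4 ms, which fits within the 38 ms sample period.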


{{ Info | To provide more breathing space for time-consuming tasks such as running a neural network inference and sending blocking serial output messages with printf, the [https://www.st.com/en/product/lsm6dsl LSM6DSL] sensor can be configured to use its internal FIFO to accumulate accelerometer data and wake up the MCU once all the acceleration values required for the model inference have been captured. The FIFO threshold size can be adjusted to the neural network model input size.}}

{{ Info | Inference time and memory footprint can also be reduced when using a quantized model.}}

=== Overrun protection ===
If you want to guarantee real-time execution of your model and application, you can add sensor reading overrun detection by changing the LSM6DSL Data-Ready interrupt signal to pulse even if its values have not yet been read. The <code>dataRdyIntReceived</code> counter is then incremented each time the STM32 sees a pulse from the sensor, even if the main thread is locked in a time-consuming task. The counter can then be checked prior to each data read to make sure there has not been more than one pulse since the last reading.

* Change the sensor initialization routine to configure '''DRDY INT1''' signal in '''pulse mode''':

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="8">

static void MEMS_Init(void)
{
  /* ... */

  /* Configure the LSM6DSL accelerometer (ODR, scale and interrupt) */
  LSM6DSL_ACC_SetOutputDataRate(&MotionSensor, 26.0f); /* 26 Hz */
  LSM6DSL_ACC_SetFullScale(&MotionSensor, 4);          /* [-4000mg; +4000mg] */
  LSM6DSL_Set_DRDY_Mode(&MotionSensor, 1);             /* DRDY pulsed mode */
  LSM6DSL_ACC_Set_INT1_DRDY(&MotionSensor, ENABLE);    /* Enable DRDY */
  LSM6DSL_ACC_GetAxesRaw(&MotionSensor, &axes);        /* Clear DRDY */

  /* Start the LSM6DSL accelerometer */
  LSM6DSL_ACC_Enable(&MotionSensor);
}</source>

}}

* And check that there has not been more than one DRDY pulse since the last sensor reading:

{{Snippet | category=AI | component=Application | snippet=<source lang="c" highlight="5-8">

  while (1)
  {

    if (dataRdyIntReceived != 0) {
      if (dataRdyIntReceived != 1) {
        printf("Overrun error: new data available before reading previous data.\r\n");
        Error_Handler();
      }
      LSM6DSL_Axes_t acc_axes;
      LSM6DSL_ACC_GetAxes(&MotionSensor, &acc_axes);
      dataRdyIntReceived = 0;

      /* ... */

    }
  }</source>

}}

If you don't get any errors, you are good to go. Otherwise, you might want to use a smaller model, limit the number of <code>printf</code> calls, or consider another approach such as using a FIFO or an RTOS. To check that your overrun detection is working correctly, you can try adding a simple <code>HAL_Delay()</code> to trigger an overrun error.
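
The overrun logic can also be exercised on a host machine before testing on the board. The sketch below simulates the interrupt counter; <code>pending_pulses</code>, <code>fake_drdy_isr()</code>, <code>drdy_poll()</code> and <code>drdy_state_t</code> are hypothetical names standing in for <code>dataRdyIntReceived</code>, the EXTI callback and the check in the main loop.

```c
#include <stdint.h>

typedef enum { DRDY_NONE, DRDY_READY, DRDY_OVERRUN } drdy_state_t;

/* Simulated counter, incremented once per DRDY pulse. */
static volatile uint32_t pending_pulses = 0;

/* Stand-in for the EXTI callback that counts DRDY pulses. */
static void fake_drdy_isr(void)
{
  pending_pulses++;
}

/* Read and clear the counter, then classify what happened since the
 * previous poll: nothing, exactly one new sample, or an overrun. */
static drdy_state_t drdy_poll(void)
{
  uint32_t n = pending_pulses;
  if (n == 0u) {
    return DRDY_NONE;
  }
  pending_pulses = 0;
  return (n == 1u) ? DRDY_READY : DRDY_OVERRUN;
}
```

Note that on the target the read and the clear are two separate operations, so a pulse arriving between them can be lost. Briefly masking the interrupt around that pair is one way to address this.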

== Closing thoughts ==

Now that you have seen how to capture and record data, you can:

* create additional data captures to increase the dataset robustness against model overfitting. It is a good idea to vary the sensor position and user.
* capture new classes of activities (such as cycling, automotive, skiing and others) to enrich the dataset.
* experiment with different model architectures for other use-cases.

The board offers many other sensing and connectivity options:

* MEMS microphones: for audio and voice applications
* Other motion sensors (gyroscope, magnetometer)
* Environmental sensors (temperature &amp; humidity)
* [https://www.st.com/en/product/vl53l0x VL53L0X] Time-of-Flight (ToF) proximity sensor
* Connectivity (Bluetooth<sup>®</sup> Low Energy, Wi-Fi<sup>®</sup> and Sub-GHz)
* ARDUINO<sup>®</sup> and Pmod™ connectors

== References ==

* [https://www.st.com/en/product/fp-ai-sensing1 FP-AI-SENSING1]
* [https://github.com/Shahnawax/HAR-CNN-Keras Shahnawax/HAR-CNN-Keras: Human Activity Recognition Using Convolutional Neural Network in Keras]
* [https://machinelearningmastery.com/cnn-models-for-human-activity-recognition-time-series-classification/ How to Develop 1D Convolutional Neural Network Models for Human Activity Recognition]
* [https://towardsdatascience.com/human-activity-recognition-har-tutorial-with-keras-and-core-ml-part-1-8c05e365dfa0 Human Activity Recognition (HAR) Tutorial with Keras and Core ML]
<noinclude>

[[Category:Sensing|60]]
{{PublicationRequestId | 15629 | 2020-04-07 }}</noinclude>