STM32WB Bluetooth® LE – Heart Rate Sensor project Migration to Azure RTOS ThreadX OS

Revision as of 08:47, 11 May 2022 by Registered User (→‎Modifications in existing files)

1. Overview of Azure RTOS ThreadX OS

  • Unique features: picokernel architecture, preemption-threshold scheduling, event chaining
  • Execution profiling and performance metrics
  • Fully available as source code (ANSI C and assembler)
  • Safety Certifications (TÜV, MISRA, UL)
  • Integrated with other Azure RTOS components:
    • USBX
    • NETX and NETXDUO
    • FILEX
    • LEVELX
    • TRACEX

A detailed description of ThreadX OS features and benefits can be found here: Microsoft AzureRTOS ThreadX Documentation

1.1. ThreadX code organization

Figure: ThreadX folders organization

Main folder descriptions:

  • cmake: original build system (not mandatory)
  • common: MCU architecture-independent source code
  • common_modules: module feature (see: ThreadX Modules)
  • common_smp: symmetric multiprocessing feature (see: ThreadX SMP)
  • docs: dependency tree of the Azure RTOS components
  • ports: MCU architecture-dependent code (M3, M4, M33, ...)
  • ports_module: architecture-dependent module code
  • ports_smp: architecture-dependent SMP code
  • samples: Microsoft example code (demo_threadx.c)
  • utility:
    • benchmarks: Thread-Metric test suite
    • execution_profile_kit: thread execution time tracker
    • low_power: low-power management files
    • rtos_compatibility_layers: adaptation layers (FreeRTOS, POSIX, OSEK)

1.2. ThreadX way of working

A typical ThreadX application could be similar to this one (courtesy of Microsoft):

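#include "tx_api.h"

unsigned long my_thread_counter = 0;
TX_THREAD my_thread;

/* Thread entry function, defined below. */
void my_thread_entry(ULONG thread_input);

int main(void)
{
    /* Enter the ThreadX kernel. This call does not return. */
    tx_kernel_enter();
}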
void tx_application_define(void *first_unused_memory)
{
    /* Create my_thread! */
    tx_thread_create(&my_thread, "My Thread",
    my_thread_entry, 0x1234, first_unused_memory, 1024,
    3, 3, TX_NO_TIME_SLICE, TX_AUTO_START);
}
void my_thread_entry(ULONG thread_input)
{
    /* Enter into a forever loop. */
    while(1)
    {
        /* Increment thread counter. */
        my_thread_counter++;
        /* Sleep for 1 tick. */
        tx_thread_sleep(1);
    }
}

tx_kernel_enter
This API hands control over to the OS: tx_application_define is called, then the scheduler selects the first ready thread to run. tx_kernel_enter does not return.

tx_application_define
This function is where the initial ThreadX resources (threads, semaphores, mutexes, event flags, queues, ...) are created. At least one thread must be created here; other threads and resources can be created later. Pay attention to the context in which this function is called: interrupts are disabled beforehand (inside _tx_initialize_low_level) and re-enabled only afterwards (inside _tx_thread_schedule). Do not put any code that relies on interrupts inside this function.

1.3. ThreadX important files

  • tx_api.h: C header file containing all public definitions, data structures and service prototypes usable in the application.
    For a list of all available APIs, see: threadx APIs
  • tx_port.h: C header file containing all development-tool and target-specific data definitions and structures.
    For more details on target integration, see: target considerations
  • tx_user.h: Several configuration options can be set when building the ThreadX library and the application. These options can be used to obtain the smallest code size or the fastest execution, or to enable performance evaluation, error checking, coding-quality rules and other extra features. They can be defined in the application source, on the command line or inside tx_user.h (in the latter case, TX_INCLUDE_USER_DEFINE_FILE must be defined). A "tx_user_sample.h" template is provided for reference.
    For more info see: threadx configuration options
  • tx_initialize_low_level.S: Low-level processor initialization, including setting up the interrupt vectors, setting up a periodic timer interrupt source, saving the system stack pointer for later use in ISR processing, and finding the first available RAM address to pass to tx_application_define.
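
As an illustration of the tx_user.h mechanism described in the list above, a minimal configuration fragment might look like the following. The option names are standard ThreadX build options; the values are placeholders chosen for this sketch and are not the settings of the BLE_HeartRate_ThreadX project.

```c
/* tx_user.h: illustrative sketch only. It is taken into account only when
 * TX_INCLUDE_USER_DEFINE_FILE is defined for the whole build
 * (compiler command line or IDE preprocessor settings). */
#ifndef TX_USER_H
#define TX_USER_H

/* System tick rate seen by the ThreadX services (assumed value). */
#define TX_TIMER_TICKS_PER_SECOND   100

/* Number of priority levels, a multiple of 32 (assumed value). */
#define TX_MAX_PRIORITIES           32

/* Uncomment to remove API parameter checking (smaller, faster code). */
/* #define TX_DISABLE_ERROR_CHECKING */

/* Uncomment to enable run-time stack checking (extra overhead). */
/* #define TX_ENABLE_STACK_CHECKING */

#endif /* TX_USER_H */
```

The same options could instead be passed on the compiler command line; tx_user.h simply keeps them in one place.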

The initialization process (between processor reset and the entry point of the thread scheduling loop) and the ThreadX way of working are fully described here: ThreadX way of working. Please read this chapter before porting your application to ThreadX.

2. ST Sequencer VS ThreadX

The sequencer executes registered functions one by one. It can:

  • Support up to 32 functions
  • Request a function to be executed
  • Pause / resume the execution of a function
  • Wait for a specific event (the wait is not necessarily blocking)
  • Assign a priority to each function

The sequencer is an optimized packaging of the classic bare-metal while loop. It is not intended to compete with standard operating systems but with standard bare-metal implementations. It avoids the race conditions that bare-metal implementations often face, especially when low-power modes are implemented.
For a detailed description of the ST Sequencer, see AN5289, paragraph 4.4.

2.1. List of Sequencer API used in Heart Rate project

void UTIL_SEQ_Run (UTIL_SEQ_bm_t mask_bm)

Requests the sequencer to execute functions that are pending and enabled in the mask mask_bm.

void UTIL_SEQ_RegTask(UTIL_SEQ_bm_t task_id_bm, uint32_t flags, void (*task)( void ))

Registers a function (task) associated with a signal (task_id_bm) in the sequencer. The task_id_bm must have a single bit set.

void UTIL_SEQ_SetTask(UTIL_SEQ_bm_t task_id_bm, uint32_t task_prio)

Requests the function associated with task_id_bm to be executed. The task_prio is evaluated by the sequencer only when a function has finished. If several functions are pending at the same time, the one with the highest priority (0) is executed first.

void UTIL_SEQ_PauseTask(UTIL_SEQ_bm_t task_id_bm )

Prevents the sequencer from executing the function associated with task_id_bm.

void UTIL_SEQ_ResumeTask(UTIL_SEQ_bm_t task_id_bm )

Allows the sequencer to execute the function associated with task_id_bm again.

void UTIL_SEQ_SetEvt(UTIL_SEQ_bm_t evt_id_bm )

Notifies the sequencer that the event evt_id_bm occurred (the event must have been first requested).

void UTIL_SEQ_WaitEvt(UTIL_SEQ_bm_t evt_id_bm )

Requests the sequencer to wait for a specific event evt_id_bm and does not return until the event is set with UTIL_SEQ_SetEvt().

void UTIL_SEQ_EvtIdle(UTIL_SEQ_bm_t task_id_bm, UTIL_SEQ_bm_t evt_waited_bm)

Called while the sequencer is waiting for a specific event.

void UTIL_SEQ_Idle( void )

Function called internally by the sequencer when there is nothing to execute (typically used to enter low-power mode).
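
To make the behavior of the calls listed above more concrete, here is a minimal model of a bitmask-based cooperative sequencer in plain C. It is not the ST implementation: every name prefixed with seq_ is hypothetical and invented for this sketch, and events (UTIL_SEQ_SetEvt / UTIL_SEQ_WaitEvt) are left out. It only illustrates the rules stated above: one bit per task, at most 32 tasks, pause/resume through an enable mask, and priority evaluated only when a function has finished.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEQ_MAX_TASKS 32u             /* one task per bit of a 32-bit mask */

typedef void (*seq_task_fn)(void);

typedef struct {
    seq_task_fn fn[SEQ_MAX_TASKS];    /* registered functions, indexed by bit */
    uint32_t    prio[SEQ_MAX_TASKS];  /* priority given at request time (0 = highest) */
    uint32_t    pending_bm;           /* tasks requested but not yet executed */
    uint32_t    enabled_bm;           /* tasks that are not paused */
} seq_t;

/* Index of the single bit set in bm; asserts that exactly one bit is set. */
static unsigned seq_bit_index(uint32_t bm)
{
    unsigned idx = 0;
    assert(bm != 0u && (bm & (bm - 1u)) == 0u);
    while ((bm >>= 1) != 0u) { idx++; }
    return idx;
}

static void seq_init(seq_t *s)
{
    for (unsigned i = 0; i < SEQ_MAX_TASKS; i++) { s->fn[i] = NULL; s->prio[i] = 0u; }
    s->pending_bm = 0u;
    s->enabled_bm = 0xFFFFFFFFu;
}

static void seq_reg_task(seq_t *s, uint32_t task_id_bm, seq_task_fn fn)
{
    s->fn[seq_bit_index(task_id_bm)] = fn;
}

static void seq_set_task(seq_t *s, uint32_t task_id_bm, uint32_t task_prio)
{
    s->prio[seq_bit_index(task_id_bm)] = task_prio;
    s->pending_bm |= task_id_bm;
}

static void seq_pause_task(seq_t *s, uint32_t bm)  { s->enabled_bm &= ~bm; }
static void seq_resume_task(seq_t *s, uint32_t bm) { s->enabled_bm |= bm; }

/* Execute every task that is pending, enabled and selected by mask_bm.
 * The priority is compared only between two functions: a running
 * function is never preempted by another one. */
static void seq_run(seq_t *s, uint32_t mask_bm)
{
    for (;;) {
        uint32_t ready = s->pending_bm & s->enabled_bm & mask_bm;
        int best = -1;
        for (unsigned i = 0; i < SEQ_MAX_TASKS; i++) {
            if ((ready & ((uint32_t)1 << i)) != 0u && s->fn[i] != NULL &&
                (best < 0 || s->prio[i] < s->prio[best])) {
                best = (int)i;
            }
        }
        if (best < 0) { return; }     /* nothing to execute: an idle hook would go here */
        s->pending_bm &= ~((uint32_t)1 << best);
        s->fn[best]();
    }
}

/* Demo tasks: each one appends a letter to a small log. */
static char   seq_log[8];
static size_t seq_log_len;
static void task_a(void) { if (seq_log_len < sizeof seq_log - 1u) { seq_log[seq_log_len++] = 'A'; } }
static void task_b(void) { if (seq_log_len < sizeof seq_log - 1u) { seq_log[seq_log_len++] = 'B'; } }

static const char *seq_demo(void)
{
    seq_t s;
    seq_init(&s);
    seq_log_len = 0;
    seq_reg_task(&s, 1u << 0, task_a);
    seq_reg_task(&s, 1u << 1, task_b);
    seq_set_task(&s, 1u << 0, 1u);    /* A requested, lower priority */
    seq_set_task(&s, 1u << 1, 0u);    /* B requested, highest priority */
    seq_pause_task(&s, 1u << 1);      /* B is paused: only A may run */
    seq_run(&s, 0xFFFFFFFFu);
    seq_resume_task(&s, 1u << 1);
    seq_set_task(&s, 1u << 0, 1u);    /* A requested again */
    seq_run(&s, 0xFFFFFFFFu);         /* B (priority 0) runs before A */
    seq_log[seq_log_len] = '\0';
    return seq_log;                   /* "ABA" */
}
```

Calling seq_demo() runs A alone while B is paused, then B before A once B is resumed, because B was requested with priority 0.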

2.2. API mapping between ST Sequencer and ThreadX

A table could be inserted to explain API mappings

3. Required modifications

Two types of modification can be identified.

Platform specific:

  • Microcontroller architecture
  • Development environment (IDE)
  • Memory allocation strategy used by the OS
  • Low-power management

Application specific:

  • Usage of the ThreadX API according to the application logic

3.1. STM32WB platform integration

ThreadX is tightly coupled to the hardware platform through direct use of assembler code, so the integration phase requires particular attention.

3.1.1. Initialization Code Integration

The most important role is played by tx_initialize_low_level.S, located in "Application/User/Core" of each ThreadX project.
This file tells the OS the clock frequency at which the microcontroller runs (through the SYSTEM_CLOCK variable) and how often a system tick interrupt is raised (through the SYSTICK_CYCLES variable). Note that the SysTick timer (Cortex core IP) is used exclusively by the OS, while the TIM17 hardware timer is used by the ST HAL drivers.
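
As a worked example of the relation between these two variables, the sketch below computes a SysTick reload value from a core clock and a tick rate. The function name systick_cycles, the 64 MHz clock and the 100 Hz tick rate are assumptions chosen for illustration; check the values actually set in your tx_initialize_low_level.S.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu   /* the SysTick reload register is 24 bits wide */

/* Reload value giving one SysTick interrupt every 1/ticks_per_second seconds.
 * The counter counts from the reload value down to 0, hence the "- 1". */
static uint32_t systick_cycles(uint32_t system_clock_hz, uint32_t ticks_per_second)
{
    uint32_t cycles;

    assert(ticks_per_second != 0u && system_clock_hz >= ticks_per_second);
    cycles = (system_clock_hz / ticks_per_second) - 1u;
    assert(cycles <= SYSTICK_MAX_RELOAD);  /* otherwise the tick rate is too low for this clock */
    return cycles;
}
```

With the assumed values, systick_cycles(64000000u, 100u) gives 639999, which fits in the 24-bit reload register.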

The memory allocation strategy can be static or dynamic.
The dynamic approach is selected by defining the USE_DYNAMIC_MEMORY_ALLOCATION macro and sizing the heap accordingly.
By default, a static approach is used, through a static array declared at application level.
This file also contains all the low-level initialization code required by ThreadX. It is split into three sections, one per supported IDE, each selected by a specific macro:
- __clang__ for Keil (AC6)
- __IAR_SYSTEMS_ASM__ for IAR
- defined(__GNUC__) && !defined(__clang__) for CubeIDE

3.1.2. Scheduler Integration

The ThreadX scheduler is written entirely in assembler, so it depends strictly on the MCU architecture (M3/M4/M33/...) and on the compiler (IAR/Keil/GCC/...). These files can be found in the "Middlewares/AzureRTOS/threadx/ports/<mcu_arch>/<compiler>" directory of each ThreadX project, where <mcu_arch> is cortex_m4 and <compiler> is "iar" for IAR, "gnu" for CubeIDE or "keil" for Keil.

3.1.3. Low Power Integration

The low-power mechanism is implemented through a mix of assembler code already available in the scheduler and C helper functions already defined by the OS; neither needs to be modified.
These C helper functions (tx_low_power.c and tx_low_power.h) can be found in "Middlewares/AzureRTOS/threadx/utility/low_power".
ThreadX has no idle task. Instead, an assembler loop (__tx_ts_wait) defined in the scheduler can invoke the low-power routines when no thread is ready to run. The low-power framework is fully customizable through specific macros and hook functions:

ThreadX LP macro               | Usage                                              | Implementation used in BLE_HeartRate_ThreadX
TX_LOW_POWER                   | To enable low power support                        | DEFINED
TX_ENABLE_WFI                  | To enable WFI inside ThreadX scheduler             | NOT DEFINED
TX_LOW_POWER_TICKLESS          | To enable tickless mode in LP                      | NOT DEFINED
TX_LOW_POWER_USER_ENTER        | Function to enter in LP                            | UTIL_LPM_EnterLowPower
TX_LOW_POWER_USER_EXIT         | Function to exit from LP                           | NOT DEFINED
TX_LOW_POWER_TIMER_SETUP       | Function to start a timer before entering in LP    | APP_BLE_ThreadX_Low_Power_Setup
TX_LOW_POWER_USER_TIMER_ADJUST | Function to calculate time spent in LP             | APP_BLE_Threadx_Low_Power_Adjust_Ticks
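
The timer setup and timer adjust hooks both need to convert between ThreadX ticks and the counts of whatever timer keeps running in low power. The sketch below shows only that unit conversion; it is not the code of APP_BLE_ThreadX_Low_Power_Setup or APP_BLE_Threadx_Low_Power_Adjust_Ticks. The function names and the 32768 Hz timer frequency used in the usage note are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a number of ThreadX ticks into low-power timer counts
 * (what a timer setup hook needs before entering low power).
 * The result saturates at UINT32_MAX. */
static uint32_t lp_ticks_to_counts(uint32_t tx_ticks, uint32_t timer_hz,
                                   uint32_t ticks_per_second)
{
    uint64_t counts;

    assert(ticks_per_second != 0u);
    counts = ((uint64_t)tx_ticks * timer_hz) / ticks_per_second;
    return (counts > UINT32_MAX) ? UINT32_MAX : (uint32_t)counts;
}

/* Convert elapsed low-power timer counts into whole ThreadX ticks
 * (what a timer adjust hook needs after wake-up). Rounds down, so a
 * partial tick is not reported. The result saturates at UINT32_MAX. */
static uint32_t lp_counts_to_ticks(uint32_t elapsed_counts, uint32_t timer_hz,
                                   uint32_t ticks_per_second)
{
    uint64_t ticks;

    assert(timer_hz != 0u);
    ticks = ((uint64_t)elapsed_counts * ticks_per_second) / timer_hz;
    return (ticks > UINT32_MAX) ? UINT32_MAX : (uint32_t)ticks;
}
```

For example, with an assumed 32768 Hz timer and 100 ticks per second, 16384 elapsed counts correspond to 50 ticks, and 327 counts (less than one tick) are rounded down to 0.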

3.2. Required files to be added at application level

The following files must be present inside "Application/User/Core" of your project:
- stm32wbxx_hal_timebase_tim.c
- tx_initialize_low_level.S
- tx_user.h

The ThreadX component must be present in the "Middlewares/AzureRTOS" section.


3.3. Modifications in existing files

3.3.1. Core\Src\main.c

  ...
  /* Init code for STM32_WPAN */
  /* Contains all the application initialization that can run before the kernel is launched */
  MX_APPE_Init();
  /* Launching ThreadX kernel */
  MX_ThreadX_Init();
  /* Infinite loop */
  while(1)
  {
  }
  ...
 
/**
  * @brief  Period elapsed callback in non blocking mode
  * @note   This function is called  when TIM17 interrupt took place, inside
  * HAL_TIM_IRQHandler(). It makes a direct call to HAL_IncTick() to increment
  * a global variable "uwTick" used as application time base.
  * @param  htim : TIM handle
  * @retval None
  */
void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim)
{
  /* USER CODE BEGIN Callback 0 */

  /* USER CODE END Callback 0 */
  if (htim->Instance == TIM17) {
    HAL_IncTick();
  }
  /* USER CODE BEGIN Callback 1 */

  /* USER CODE END Callback 1 */
}

3.3.2. Core\Src\app_entry.c

...
#include "tx_api.h"
...
...
/* USER CODE BEGIN PV */
CHAR* p_pointer = TX_NULL;
static TX_MUTEX     mutex_shci;
static TX_SEMAPHORE semaphore_shci;
static TX_SEMAPHORE semaphore_shci_tl_notify_async_evt;
static TX_THREAD    thread_ShciUserEvtProcess;
static TX_BYTE_POOL byte_pool_ble;
static UCHAR a_memory_area[DEMO_BYTE_POOL_SIZE];
/* USER CODE END PV */

/* Private functions prototypes-----------------------------------------------*/
static void thread_ShciUserEvtProcess_entry(ULONG thread_input);
...
...
/**
  * @brief  MX_ThreadX_Init
  * @param  None
  * @retval None
  */
void MX_ThreadX_Init(void)
{
  /* USER CODE BEGIN  Before_Kernel_Start */

  /* USER CODE END  Before_Kernel_Start */

  tx_kernel_enter();

  /* USER CODE BEGIN  Kernel_Start_Error */

  /* USER CODE END  Kernel_Start_Error */
}
void tx_application_define(void* first_unused_memory)
{
  UNUSED(first_unused_memory);

  /* Here we should declare all the initial ThreadX resources
   * We should have at least one thread to be launched from the scheduler */

  /* Create a byte memory pool from which to allocate the thread stacks, 
   * using static memory coming from a_memory_area array  */
  tx_byte_pool_create(&byte_pool_ble, "byte pool 0", a_memory_area, DEMO_BYTE_POOL_SIZE);

  tx_mutex_create(&mutex_shci, "mutex_shci", TX_NO_INHERIT);
  tx_semaphore_create(&semaphore_shci, "semaphore_shci", 0);
  tx_semaphore_create(&semaphore_shci_tl_notify_async_evt,
                      "semaphore_shci_tl_notify_async_evt",
                      0);

  tx_byte_allocate(&byte_pool_ble, (VOID**) &p_pointer, DEMO_STACK_SIZE_LARGE, TX_NO_WAIT);
  tx_thread_create(&thread_ShciUserEvtProcess,
                   "thread_ShciUserEvtProcess",
                   thread_ShciUserEvtProcess_entry,
                   0,
                   p_pointer,
                   DEMO_STACK_SIZE_LARGE,
                   16,
                   16,
                   TX_NO_TIME_SLICE,
                   TX_AUTO_START);
}
...
...
static void APPE_SysStatusNot( SHCI_TL_CmdStatus_t status )
{
    switch (status) {
        case SHCI_TL_CmdBusy:
            tx_mutex_get(&mutex_shci, TX_WAIT_FOREVER);
            break;

        case SHCI_TL_CmdAvailable:
            tx_mutex_put(&mutex_shci);
            break;

        default:
            break;
    }
    return;
}
...
 
...
static void thread_ShciUserEvtProcess_entry(ULONG thread_input)
{
  UNUSED(thread_input);
  appe_Tl_Init();
  while (1) {
    tx_semaphore_get(&semaphore_shci_tl_notify_async_evt, TX_WAIT_FOREVER);
    shci_user_evt_proc();
  }
}

void shci_notify_asynch_evt(void* pdata)
{
  UNUSED(pdata);
  tx_semaphore_put(&semaphore_shci_tl_notify_async_evt);
  return;
}

void shci_cmd_resp_release(uint32_t flag)
{
  UNUSED(flag);
  tx_semaphore_put(&semaphore_shci);
  return;
}

void shci_cmd_resp_wait(uint32_t timeout)
{
  UNUSED(timeout);
  tx_semaphore_get(&semaphore_shci, TX_WAIT_FOREVER);
  return;
}
...

3.3.3. STM32_WPAN/App/app_ble.c

...
#include "tx_api.h"
...
#include "limits.h"
...
...
#if (CFG_LPM_SUPPORTED == 1)
typedef struct
{
  uint32_t LpTXTimeLeftOnEntry;
  uint8_t LpTXTimerThreadx_Id;
} LpTXTimerContext_t;
#endif
...
...
#if (CFG_LPM_SUPPORTED == 1)
static LpTXTimerContext_t LpTXTimerContext;
#endif
...
...
static TX_MUTEX     mtx_hci;
static TX_SEMAPHORE sem_hci;
static TX_THREAD    thread_HciUserEvtProcess;
static TX_SEMAPHORE sem_HciUserEvtProcessSignal;
static TX_THREAD    thread_AdvUpdateProcess;
static TX_SEMAPHORE sem_AdvUpdateProcessSignal;


/* Private function prototypes -----------------------------------------------*/
static void thread_AdvUpdateProcess_entry(ULONG thread_input);
static void thread_HciUserEvtProcess_entry(ULONG thread_input);
...
/* USER CODE BEGIN PFP */
#if (CFG_LPM_SUPPORTED == 1)
static void APP_BLE_Threadx_LpTimerCb(void);
#endif
/* USER CODE END PFP */
...
...
/* Functions Definition ------------------------------------------------------*/
void APP_BLE_Init(TX_BYTE_POOL* p_byte_pool)
{
  SHCI_CmdStatus_t status;
  /* USER CODE BEGIN APP_BLE_Init_1 */
...


 ...
  CHAR* p_pointer;
  tx_mutex_create(&mtx_hci, "mtx_hci", TX_NO_INHERIT);
  tx_semaphore_create(&sem_hci, "sem_hci", 0);
  tx_semaphore_create(&sem_HciUserEvtProcessSignal, "sem_HciUserEvtProcessSignal", 0);
  tx_byte_allocate(p_byte_pool, (VOID**) &p_pointer, DEMO_STACK_SIZE_LARGE, TX_NO_WAIT);
  tx_thread_create(&thread_HciUserEvtProcess,
                   "thread_HciUserEvtProcess",
                   thread_HciUserEvtProcess_entry,
                   0,
                   p_pointer,
                   DEMO_STACK_SIZE_LARGE,
                   16,
                   16,
                   TX_NO_TIME_SLICE,
                   TX_AUTO_START);
...


  ...
  tx_semaphore_create(&sem_AdvUpdateProcessSignal, "sem_AdvUpdateProcessSignal", 0);
  tx_byte_allocate(p_byte_pool, (VOID**) &p_pointer, DEMO_STACK_SIZE_REDUCED, TX_NO_WAIT);
  tx_thread_create(&thread_AdvUpdateProcess,
                   "thread_AdvUpdateProcess",
                   thread_AdvUpdateProcess_entry,
                   0,
                   p_pointer,
                   DEMO_STACK_SIZE_REDUCED,
                   16,
                   16,
                   TX_NO_TIME_SLICE,
                   TX_AUTO_START);
 ...
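
The thread stacks in the listings of this section are allocated from the byte pool with TX_NO_WAIT, so an undersized pool makes tx_byte_allocate fail immediately rather than block. As a rough sizing aid, the sketch below adds up the requested stack sizes plus a per-allocation overhead. The function names, the 8-byte overhead and the sizes in the usage note are assumptions for illustration: the real overhead depends on the ThreadX port and version, and the pool itself also consumes some control bytes, so keep a margin.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed bookkeeping cost of one byte-pool allocation, in bytes.
 * This is a placeholder: check the value for your ThreadX port. */
#define POOL_BLOCK_OVERHEAD 8u

/* Lower bound on the pool size needed to hold all the listed stacks. */
static size_t pool_bytes_needed(const size_t *stack_sizes, size_t count)
{
    size_t total = 0;

    assert(stack_sizes != NULL || count == 0u);
    for (size_t i = 0; i < count; i++) {
        total += stack_sizes[i] + POOL_BLOCK_OVERHEAD;
    }
    return total;
}

/* Returns 1 when the pool can hold every stack (under the overhead assumption). */
static int pool_is_large_enough(size_t pool_size, const size_t *stack_sizes, size_t count)
{
    return pool_bytes_needed(stack_sizes, count) <= pool_size;
}
```

With three stacks of 1024, 1024 and 512 bytes (hypothetical values for DEMO_STACK_SIZE_LARGE and DEMO_STACK_SIZE_REDUCED), pool_bytes_needed returns 2584, so a 4096-byte pool passes the check while a 2560-byte pool does not.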

3.4. ThreadX resources declaration

3.5. Synchronization mechanism replacement

3.6. Low Power management

3.7. ThreadX configuration

4. References