Introduction to X-CUBE-AUDIO-KIT

1. Overview

This page is a condensed version of the Audio-Kit reference manual. It focuses on architecture, workflow, and integration checkpoints. This summary was derived from the reference manual chapters under the folder Middlewares\ST\Audio-Kit\docs\refman of the X-CUBE-AUDIO-KIT package.


1.1. Why Audio-Kit is the fastest way to develop audio on STM32

Audio-Kit shortens the path from first prototype to production-ready firmware. During development, you can assemble and run an audio chain quickly, tune parameters live on the target board, and validate behavior immediately without rebuilding the full application each time. Once the tuning stage is stable, the same design can be exported as C code and integrated into Tuner or Release profiles for productization with controlled footprint and deterministic runtime behavior. This continuity between design, tuning, and deployment reduces development cycle time and lowers integration risk.

The practical value is short iteration cycles during design, then controlled footprint and behavior for production.


1.2. Why dataflow creation is fast

Building a first working flow usually takes only a few actions in LiveTune:

  • Drag a source (for example SysIn-Microphones), an algorithm, and a sink (for example SysOut-Codec).
  • Connect matching pins; LiveTune blocks invalid links and highlights most inconsistencies.
  • Press Start to run immediately, then tune parameters live without rebuilding firmware.

This short loop (drag, connect, start, tune) is the main productivity benefit of the Audio-Kit workflow.

1.3. System overview

  • Host side: LiveTune lets you compose and tune flows live on real hardware, and generate C code that bridges prototype to product without redesigning the full pipeline.
  • Device side: AudioChain executes the audio data flow on target.
  • The package also provides utility components, examples, and algorithm wrappers.


1.4. Targeted audio use cases

This Expansion Package is intended for use cases such as:

  • Voice denoising for speech recognition or voice communication
  • Audio output enhancement
  • Audio conditioning
  • Sound generation
  • Audio effects
  • Audio analysis
  • Any other audio processing use case

1.5. Audio formats

  • Temporal or spectral domain
  • PCM (Pulse Code Modulation), fixed or floating point; PDM (Pulse Density Modulation), 1-bit LSB first or MSB first
  • Interleaved or non-interleaved
  • Mono, stereo, or a wider range of channels
  • Multiple sampling frequencies
  • Transparent format conversion

1.6. Key features in practice

  • Live tuning without rebuild: tweak denoiser, EQ, or gain parameters while audio is running to accelerate end-to-end validation.
  • Deterministic callback model: isolate data movement (dataInOut) from heavier processing (process) to keep real-time behavior stable.
  • Profile before productization: inspect CPU load, task activity, and memory usage early to reduce late integration risk.
  • Three deployment profiles: start with Designer for velocity, move to Tuner for focused tuning, then Release for compact deployment.
  • Custom algorithm onboarding: generate wrapper scaffolding, then focus only on algorithm-specific logic and consistency checks.

For the full feature list, refer to https://www.st.com/en/embedded-software/x-cube-audio-kit.html

1.7. Supported STM32 discovery kits

X-CUBE-AUDIO-KIT includes one generic designer firmware for each of the following:

  • STM32H573I-DK
  • STM32H735G-DK
  • STM32H7S78-DK
  • STM32N6570-DK

Each firmware features a set of algorithms, the AudioChain framework, and the LiveTune interface.

1.8. Software overview

The following figure illustrates a high-level architecture view of an application implemented around AudioChain.


1.9. LiveTune at a glance

  • LiveTune can run a design immediately, flash a design, or generate C code.
  • Communication with target uses JSON messages over UART.
  • Terminal commands help inspect memory, CPU load, tasks, traces and audio configuration.
  • LiveTune runs in an HTML5-compliant browser; communication with the device uses the Virtual COM port of the ST-LINK USB interface.
  • The livetune.html file is located at the root of the package.

Typical workflow:

  1. Connect LiveTune to a Designer firmware target.
  2. Drag and connect system I/O and algorithm elements.
  3. Start execution and verify behavior with terminal traces and viewers.
  4. Tune parameters and check memory/CPU headroom.
  5. Generate C code for Tuner or Release firmware.

The following figure illustrates the Host/Target Interaction Sequence during this workflow.

The LiveTune designer canvas is shown below: elements from the left panel are simply dragged onto the canvas, where users connect them.

1.10. AudioChain architecture

AudioChain is hardware-independent and organizes processing with:

  • AudioBuffer: sample format and audio description
  • AudioChunk: multi-frame transport between elements
  • AudioAlgo: algorithm wrapper with common lifecycle callbacks
  • AudioChain scheduler: executes dataInOut, process, and control callbacks

AudioChain Architecture

AudioChain implements a generic callback model:

  • checkConsistency: validates I/O compatibility and configuration
  • init / deinit: lifecycle and memory setup/cleanup
  • configure: runtime parameter updates
  • dataInOut: high-priority movement of audio data
  • process: heavier compute stage
  • control: non-audio control/events exchange

1.11. Build profiles

Profile  | Purpose                                             | Main define           | When to use
Designer | Full LiveTune + AudioChain for design/debug         | USE_LIVETUNE_DESIGNER | Bring-up, architecture exploration, and rapid iteration
Tuner    | Tuning of the generated flow with reduced footprint | USE_LIVETUNE_TUNER    | Parameter finalization close to production constraints
Release  | Minimal runtime for the generated flow only         | AUDIO_CHAIN_RELEASE   | Product firmware where footprint and determinism are priorities

1.12. Integration checklist

  • Decide integration path:
    • Start from an Audio-Kit project and port to your board.
    • Or add Audio-Kit components to an existing project.
  • Keep mandatory components for selected mode (Designer/Tuner/Release).
  • Validate:
    • Audio I/O startup and synchronization
    • UART/JSON link (if LiveTune is used)
    • Memory pools and linker layout
    • Thread priorities and OS wrappers

Audio-Kit software architecture


1.13. Algorithm integration

  • Use AcIntegrate.py to scaffold wrappers from an ID card JSON file.
  • Generate notifications for data management and runtime variables.
  • Complete algorithm-specific processing and validate consistency callbacks.
  • Rebuild and confirm the new plugin appears in LiveTune.

1.14. Common pitfalls and quick fixes

  • No audio output: verify SysIn/SysOut routing, channel count, and sample-format compatibility.
  • LiveTune cannot connect: make sure a LiveTune compliant firmware was flashed, check ST-LINK VCP availability, UART mapping, and JSON link configuration.
  • Dropouts or glitches: review frame size, scheduling mode, and CPU headroom under realistic load.
  • Plugin not visible in LiveTune: confirm wrapper registration, build flags, and successful firmware rebuild.
  • Unexpected memory pressure: inspect pool sizing and linker layout before increasing algorithm complexity.
  • Compilation issues: make sure your toolchain version is the one listed in the release notes of the X-CUBE-AUDIO-KIT package. For instance, STM32CubeIDE v2.1.1 uses GCC 14, which is not compatible with X-CUBE-AUDIO-KIT v1.4.1 and lower.
  • Flashing issues: make sure the STM32CubeProgrammer version is the one listed in the release notes. For instance, for the STM32N6570-DK, STM32CubeProgrammer v2.22.0 is not compatible with X-CUBE-AUDIO-KIT v1.4.1 and lower.

1.15. Latency notes

  • End-to-end latency depends on SysIO buffering, frame duration, and scheduling mode.
  • Low-latency mode reduces delay by running dataInOut/process in a tighter execution model.
  • Use practical loopback measurement and stress tests before final tuning.

Latency illustration

1.16. Benefit from optimized algorithms

The table below shows the required CPU frequency (MHz) on each target for a 48 kHz FIR_Equalizer with 200 coefficients in representative configurations. Lower values mean the same workload can run at a lower clock, leaving more headroom for other tasks. Cells marked "out" indicate configurations beyond the device's available clock range.

Data format | Filter phase  | Channels | STM32N6 (MHz) | STM32H7 (MHz) | STM32H5 (MHz)
float       | linear phase  | 1 ch     | 20.6          | 48.9          | 59.6
float       | linear phase  | 2 ch     | 44.3          | 58.7          | 89.2
float       | linear phase  | 3 ch     | 68.7          | 147.2         | 178.2
float       | linear phase  | 4 ch     | 111.7         | 196.4         | out
float       | minimum phase | 1 ch     | 26.2          | 53.7          | 64.4
float       | minimum phase | 2 ch     | 52.0          | 76.0          | 103.7
float       | minimum phase | 3 ch     | 79.0          | 147.1         | 192.7
float       | minimum phase | 4 ch     | 125.4         | 198.5         | out
fixed16     | linear phase  | 1 ch     | 13.0          | 27.4          | 49.6
fixed16     | linear phase  | 2 ch     | 20.7          | 47.2          | 79.4
fixed16     | linear phase  | 3 ch     | 40.2          | 134.0         | 193.9
fixed16     | linear phase  | 4 ch     | 53.6          | 178.3         | out
fixed16     | minimum phase | 1 ch     | 16.7          | 24.4          | 44.9
fixed16     | minimum phase | 2 ch     | 26.6          | 36.3          | 65.0
fixed16     | minimum phase | 3 ch     | 50.5          | 93.3          | 142.2
fixed16     | minimum phase | 4 ch     | 54.2          | 178.0         | 189.8
fixed32     | linear phase  | 1 ch     | 20.0          | 23.9          | 46.6
fixed32     | linear phase  | 2 ch     | 33.2          | 49.7          | 83.1
fixed32     | linear phase  | 3 ch     | 60.3          | 108.7         | 184.8
fixed32     | linear phase  | 4 ch     | 100.5         | 145.2         | out
fixed32     | minimum phase | 1 ch     | 24.7          | 33.5          | 46.8
fixed32     | minimum phase | 2 ch     | 41.0          | 37.9          | 66.1
fixed32     | minimum phase | 3 ch     | 74.7          | 119.0         | 170.2
fixed32     | minimum phase | 4 ch     | 101.0         | 146.0         | out

How this translates to per-target optimized code in Audio-Kit:

  • For Cortex-M55 targets, Helium-enabled kernels are selected to reduce compute cost and clock requirements.
  • For non-Helium targets (for example Cortex-M7/Cortex-M33), compatible optimized implementations are selected for that core family.
  • The same audio chain design can therefore be productized per target with architecture-specific optimization while preserving behavior and tuning intent.
  • In practice, this means one functional design flow with target-dependent performance optimization at integration/build time.

2. Step-by-step example (host/target sequence): microphone echo to loudspeaker

This example follows the interaction pattern illustrated in the Host/Target Interaction figure in the earlier "LiveTune at a glance" section.

It is based on the LiveTune generated dataflow in: Middlewares\ST\Audio-Kit\examples\usecases\livetune\Src\audio_chain_generated_code.c.

Reference processing branch used for this walkthrough (microphone to speaker):

  • SysIn-Microphones
  • stereo2mono-1
  • echo-1 (delay=0.5 s, feedback=0.3, level=1)
  • gain-1 (gain=12 dB)
  • mono2stereo-2
  • SysOut-Codec

2.1. Preconditions

  • A supported board is flashed with Designer firmware and connected through ST-LINK USB.
  • The board Virtual COM port is visible on host.
  • Audio input/output path is available (on-board microphones + codec/speaker output).
  • A headset or loudspeaker is connected to the audio-out jack connector.

2.2. Procedure

2.2.1. STEP 0- Flash the firmware

  • Refer to the readme.txt file inside the Bin folder for details on how to proceed.
  • Select the 'Designer' binary for the chosen STM32 DK.

2.2.2. STEP 1- Open LiveTune and connect to target

LiveTune connected to target

Host action:

  • Open LiveTune in the browser.
  • Select the board VCP/UART endpoint and click Connect.

Target reaction:

  • JSON handshake succeeds and target capabilities are reported.

LiveTune connected

2.2.3. STEP 2- Create and save a dataflow from microphone to loudspeaker

Host action:

  • Drag SysIn-Microphones, stereo2mono, echo, gain, mono2stereo, and SysOut-Codec.
  • Connect them in this order:
    • SysIn-Microphones -> stereo2mono
    • stereo2mono -> echo
    • echo -> gain
    • gain -> mono2stereo
    • mono2stereo -> SysOut-Codec
  • To save or back up your dataflow, you can use:
    • the 'Flash' button so that the dataflow is stored in the ROM and will be present after reset.
    • the 'Save' button to keep a version on the host that can be reopened with the 'Load' button.

Target reaction:

  • Canvas with connected audio chain.

Microphone echo chain connected on canvas


2.2.4. STEP 3- Start execution and update algorithm parameters

Host action:

  • Start the dataflow using the 'Start' button; the LiveTune top banner then turns green.
    • On the STM32 side, AudioChain creates/starts the pipeline and transitions to the playing state.
    • In the Terminal tab, the console trace typically indicates successful playback. If a dataflow error occurs (for example, wrong channel count or sampling frequency), the console also provides details to help fix the issue.
  • Speak into the microphones and listen on the speaker output.
  • Modify echo-1 with a higher or lower delay, level, or feedback; also try lowering gain-1.
  • Speak into the microphones again to hear the effect of your dataflow modifications on the speaker output.

Target reaction:

  • All parameter values are updated at runtime.

Echo and gain parameters configured


2.2.5. STEP 4- Generate C code for integration

Host action:

  • Once tuning is done, use the LiveTune 'Generate code' button. It opens a new tab with the generated C content.
  • Copy the content using the 'Copy' button.
  • Paste it into the file audio_chain_generated_code.c.
  • Open the Release target in your preferred environment (IAR Embedded Workbench or STM32CubeIDE).
  • Compile and listen on the speaker output.

Target/project result:

  • Generated C initializes and connects the same logical path used above.
  • In this package example, generated code is available in:

Middlewares\ST\Audio-Kit\examples\usecases\livetune\Src\audio_chain_generated_code.c.


AudioChain Generated Code