How to exchange data buffers with the coprocessor

Revision as of 11:26, 9 June 2021 by Registered User

1. Article purpose

This article gives an example of high-rate transfers of data chunks from the Arm® Cortex®-M core to the Arm® Cortex®-A core.

2. Introduction

Using a logic analyzer example, this article describes the mechanism and the software implemented to perform high-rate transfers. In this example, the Cortex-M core continuously performs:

  • real-time operations
  • a simple data algorithm (bit masking)
  • a copy of the resulting data flow to DDR buffers, or to TTY over RPMsg buffers, depending on the sampling frequency.

Copying the data to DDR requires:

  • contiguous memory allocation in DDR memory
  • Cortex-M awareness of the physical address and size of the memory buffers
  • mmapping of the buffers so that a Linux® user land application can access them.

A specific Linux driver, rpmsg_sdb (shared data buffer), has been developed to handle these constraints.
For details on the buffer exchange mechanisms, refer to the article "How to exchange large data buffers with the coprocessor - principle".

3. Example of context description

Let us implement a logic analyzer running on the STM32MP1 discovery kit.

From the user interface, press the START button to start the logic analyzer sampling. The logic analyzer samples bits 8 to 14 of GPIO port E, which are available on the Arduino connector. This gives 7 bits of data. The 8th bit is reset by the M4 algorithm.
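
The masking step described above can be sketched as follows. This is only an illustration, not the firmware's actual code: it assumes that a sample is the raw 16-bit input value of GPIO port E and that the firmware keeps the upper byte of that value. The helper name and the macros are hypothetical.

```c
#include <stdint.h>

/* Assumption: the sample is the raw 16-bit input value of GPIO port E,
 * and the logic analyzer byte is built from its upper byte (bits 8 to 15). */
#define LA_SAMPLE_SHIFT 8u
#define LA_SAMPLE_MASK  0x7Fu  /* keep bits 8 to 14, clear the 8th bit */

/* Hypothetical helper: turn one port read into one logic analyzer byte. */
static inline uint8_t la_pack_sample(uint16_t port_e_input)
{
    return (uint8_t)((port_e_input >> LA_SAMPLE_SHIFT) & LA_SAMPLE_MASK);
}
```

With this layout, port bit 8 ends up in bit 0 of the byte and port bit 14 in bit 6, while bit 7 of the byte always reads as 0.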

The amount of data received is displayed on the screen in bytes and Mbytes.

4. Example of static architecture for exchanging large data buffers

The example of large data buffer exchange includes:

  • A Cortex-M firmware
  • A Linux user land application
  • A Linux rpmsg_sdb (shared data buffer) driver
  • A Linux rpmsg_tty driver

In the figure above, the numbers indicate the chronological order of data flows.

5. Cortex-M firmware

The Cortex-M firmware is responsible for:

  • receiving a command giving the number of DDR buffers through the TTY RPMsg channel, from the Linux application
  • receiving messages containing the physical address and size of DDR buffer(s), from the Linux rpmsg_sdb driver
  • receiving a Start/Stop sampling command (including the sampling frequency) through the TTY RPMsg channel, from the Linux application
  • On start request:
    • sampling the data at the requested sampling frequency
    • masking the data and transferring it to the Linux application:
      • by copying it to a DDR buffer in packets of 1024 bytes if the sampling frequency is above 5 MHz. The firmware informs the Cortex-A user interface (through the SDB RPMsg channel) each time a 1-Mbyte DDR buffer is filled, then rolls over to the next DDR buffer.
      • through TTY over RPMsg buffers in packets of 256 bytes if the sampling frequency is less than or equal to 5 MHz.
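
The transport choice and the DDR buffer rollover listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the firmware's actual implementation: all names are hypothetical, 1 Mbyte is assumed to be 1024 x 1024 bytes, and the notification toward Linux is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LA_TTY_MAX_FREQ_HZ 5000000u         /* at or below: TTY over RPMsg */
#define LA_DDR_PACKET_SIZE 1024u            /* bytes copied per DDR packet */
#define LA_TTY_PACKET_SIZE 256u             /* bytes sent per TTY packet */
#define LA_DDR_BUFFER_SIZE (1024u * 1024u)  /* assumed size of 1 Mbyte */

enum la_transport { LA_TRANSPORT_TTY, LA_TRANSPORT_DDR };

/* Above 5 MHz the data goes to DDR, otherwise to TTY over RPMsg. */
static enum la_transport la_select_transport(uint32_t freq_hz)
{
    return (freq_hz > LA_TTY_MAX_FREQ_HZ) ? LA_TRANSPORT_DDR : LA_TRANSPORT_TTY;
}

static size_t la_packet_size(enum la_transport t)
{
    return (t == LA_TRANSPORT_DDR) ? LA_DDR_PACKET_SIZE : LA_TTY_PACKET_SIZE;
}

/* Hypothetical write cursor over the DDR buffers announced by Linux. */
struct la_ddr_cursor {
    unsigned buffer_count;  /* number of DDR buffers */
    unsigned buffer_index;  /* buffer currently being filled */
    size_t   offset;        /* bytes already written in that buffer */
};

/* Account for one 1024-byte packet. Returns true when the current buffer
 * has just been filled; *filled_index then holds its index and the cursor
 * has already rolled over to the next buffer. */
static bool la_ddr_advance(struct la_ddr_cursor *c, unsigned *filled_index)
{
    c->offset += LA_DDR_PACKET_SIZE;
    if (c->offset < LA_DDR_BUFFER_SIZE)
        return false;

    *filled_index = c->buffer_index;
    c->buffer_index = (c->buffer_index + 1u) % c->buffer_count;
    c->offset = 0;
    return true;
}
```

In such a design, the caller would send the "buffer full" message over the SDB RPMsg channel whenever la_ddr_advance() returns true.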

6. Linux user land application

The Cortex-A Linux application includes a GTK user interface.

It allows controlling:

  • the sampling frequency
  • the start / stop of the sampling
  • the data to be sampled, using the "Set data" notch UI widget

The user interface displays:

  • statistics: the amount of data received by the user interface, in bytes and Mbytes
  • the first data of every newly received Mbyte

7. Linux drivers

  • The rpmsg_sdb Linux driver is responsible for the shared buffer management.
  • The rpmsg_tty driver is used to communicate (transport commands and status/events) between the Cortex-M firmware and the Cortex-A user land application.

8. Dynamic view

At startup, the Linux application performs the following actions:

  • It loads the rpmsg_sdb.ko module.
  • It loads the Cortex-M firmware, then starts it.
  • It opens an rpmsg_tty channel for Cortex-M firmware control.
  • It opens an rpmsg_tty channel for Cortex-M firmware trace debug.
  • It opens the rpmsg_sdb driver, then uses the rpmsg_sdb IOCTL interface to allocate and mmap 10 buffers of 1 Mbyte each in DDR memory.

When the user presses User button2, the Linux application starts.

How2bigdataSTARTmsc.png


When the START button is pressed, the application sends the sampling command to the Cortex-M firmware (including the sampling frequency).

Case 1: the user selects a sampling frequency of 8 MHz => copy to DDR buffers.

When the Cortex-M firmware sends a "buffer full" signal via the rpmsg_sdb driver, the Linux application updates the statistics.

How2bigdataDDRmsc.png


Case 2: the user selects a sampling frequency of 5 MHz or less => TTY buffers.

When the Linux application receives a data buffer over TTY, it checks whether a new Mbyte has been fully received and, if so, updates the statistics.

How2bigdataTTYmsc.png
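
The Mbyte check performed by the Linux application in the TTY case can be sketched as follows. This is a hedged illustration only: the structure and function names are hypothetical, and 1 Mbyte is assumed to be 1024 x 1024 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define LA_MBYTE (1024u * 1024u)  /* assumed size of 1 Mbyte */

/* Hypothetical reception statistics kept by the user land application. */
struct la_rx_stats {
    uint64_t total_bytes;   /* bytes received since START */
    uint64_t total_mbytes;  /* complete Mbytes already accounted for */
};

/* Add one received chunk. Returns the number of new complete Mbytes
 * reached by this chunk (0 when no Mbyte boundary was crossed). */
static unsigned la_rx_account(struct la_rx_stats *s, size_t chunk_len)
{
    s->total_bytes += chunk_len;

    uint64_t mbytes = s->total_bytes / LA_MBYTE;
    unsigned newly_completed = (unsigned)(mbytes - s->total_mbytes);

    s->total_mbytes = mbytes;
    return newly_completed;
}
```

With 256-byte TTY packets, the function returns a non-zero value once every 4096 packets, which is when the displayed statistics would be refreshed.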


When the STOP button is pressed, the application sends the stop command to the Cortex-M firmware.

When the user presses User button2, the Linux application stops.

How2bigdataENDmsc.png


9. Results

The Cortex-M CPU masks and copies 1024 bytes in 75.4 µs. This implies a maximum achievable sampling frequency of 1 / (75.4e-6 / 1024) => 13.58 MHz. To leave a margin, the maximum sampling frequency implemented in this example is set to 12 MHz.

How2bigdataChrono.png

On this oscilloscope snapshot, a GPIO is set at the beginning of the data masking and copying algorithm, and reset at the end of the algorithm. The pulse width shows that 75.4 µs are spent masking and copying 1024 bytes of data to DDR.
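
The figures of this section can be written out as a short worked calculation. It assumes one byte per sample, as the frequency computation in this section implies. The symbols N (bytes per packet), t_copy (measured mask and copy time) and t_12 (packet period at 12 MHz) are introduced here for illustration only.

```latex
\[
f_{\max} = \frac{N}{t_{\text{copy}}}
         = \frac{1024}{75.4 \times 10^{-6}\,\text{s}}
         \approx 13.58 \times 10^{6}\ \text{samples/s}
         = 13.58\ \text{MHz}
\]
\[
t_{12} = \frac{1024}{12 \times 10^{6}\,\text{Hz}} \approx 85.33\,\mu\text{s},
\qquad
t_{12} - t_{\text{copy}} \approx 85.33\,\mu\text{s} - 75.4\,\mu\text{s} = 9.93\,\mu\text{s}
\]
```

Under these assumptions, capping the sampling frequency at 12 MHz leaves roughly 9.9 µs of slack per 1024-byte packet.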

10. Source code

The source code corresponding to this use case is available as a Yocto layer at:

https://github.com/STMicroelectronics/meta-st-stm32mpu-app-logicanalyser.git

The firmware is included in the Yocto layer as an .elf file.

The source code of the Cortex-M firmware is available at:

https://github.com/STMicroelectronics/logicanalyser

To compile the firmware, refer to the Developer Package for STM32CubeMP1.

In the source code example, 10 buffers of 1 Mbyte each are allocated for the exchange. Three buffers is the minimum needed to guarantee the real-time behavior of the application. If more than 10 buffers are needed, the rpmsg_sdb driver, the M4 firmware, and the Linux application must be modified, because the messaging relies on a single digit for the buffer index: the "BxLyyyyyyyy" format would have to become "BxxLyyyyyyyy".
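
The single-digit limitation can be illustrated with a small parser for the "BxLyyyyyyyy" format. This is a sketch under assumptions, not the code of the driver or of the firmware: it assumes that x is one decimal digit and that yyyyyyyy is a 32-bit value written as eight hexadecimal digits. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of "BxLyyyyyyyy": 'B', one decimal digit x (buffer index),
 * 'L', then eight hexadecimal digits y (a 32-bit value). */
static bool sdb_parse_msg(const char *msg, unsigned *index, uint32_t *value)
{
    if (strlen(msg) != 11 || msg[0] != 'B' || msg[2] != 'L')
        return false;
    if (msg[1] < '0' || msg[1] > '9')
        return false;

    uint32_t v = 0;
    for (int i = 3; i < 11; i++) {
        char ch = msg[i];
        uint32_t digit;

        if (ch >= '0' && ch <= '9')
            digit = (uint32_t)(ch - '0');
        else if (ch >= 'a' && ch <= 'f')
            digit = (uint32_t)(ch - 'a') + 10u;
        else if (ch >= 'A' && ch <= 'F')
            digit = (uint32_t)(ch - 'A') + 10u;
        else
            return false;

        v = (v << 4) | digit;
    }

    *index = (unsigned)(msg[1] - '0');
    *value = v;
    return true;
}
```

A message such as "B3L00100000" is accepted by this parser, whereas "B10L00100000" is rejected because the fixed layout leaves room for one index digit only. This is why all three components would need to change together to support more than 10 buffers.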

11. Usage

Follow the README.md of the Yocto layer to install the application.

The logicanalyser application is launched and stopped by pressing the User2 button of the STM32MP1 Discovery board.

Select the sampling frequency and click on Start to start the use case.

Snapshot of the user interface:

ScreenshotLA.png