1. Article purpose
The purpose of this document is to provide information on how to measure and improve the boot-time[1] of a typical STM32MP15 Linux system. This article is not an exhaustive list of possible optimizations: those considered not reliable enough for industrial-grade usage are omitted on purpose.
2. Overview
On a typical STM32MP1 Linux system, the boot-chain is executed, in order, by the ROM code, TF-A, U-Boot, the Linux kernel, and the user-land[2]. All these components except the ROM code can be modified, and thus configured to start more quickly. For each of them, the procedure is always the same: features that are not required at boot-time must be removed or deferred, and features that improve the boot-time must be enabled.
3. Measuring boot-time
Before optimizing the performance of any piece of software, one should know how much time each part of it takes, that is, where the effort must be focused.
3.1. Using a serial console
One of the easiest ways to measure the boot-time of a Linux system is to observe the traces emitted on a serial console with a timing tool, such as serialgrab or the script File:Measure-timing.txt, which is based on microcom and p2f:
user@pc$ microcom -p /dev/ttyACM0 | bash Measure-timing.txt
Waiting for board reset...
NOTICE: Model: STMicroelectronics STM32MP157C eval daughter on eval mother
NOTICE: Board: MB1263 Var1 Rev.C-01
…
U-Boot 2018.11-stm32mp-r2 (Nov 14 2018 - 16:10:06 +0000)
CPU: STM32MP157AAA Rev.B
Model: STMicroelectronics STM32MP157C eval daughter on eval mother
…
Starting kernel ...
[ 0.000000] Booting Linux on physical CPU 0x0
…
Freeing unused kernel memory: 1024K
Run /sbin/init as init process
…
ST OpenSTLinux - Weston - (A Yocto Project Based Distro) 2.6-openstlinux-4.19-thud-mp1-19-02-20 stm32mp1 ttySTM0
stm32mp1 login:
Timing results: FSBL: 1.23s SSBL: 2.34s Linux: 2.34s init: 1.23s total: 7.14s
Please note these are not actual figures since the boot-time depends on too many parameters to provide something meaningful here.
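The per-stage split shown in the timing results can be obtained by timestamping each console line on the host and looking for the first trace of each stage. The following Python sketch illustrates the idea; it is not the content of Measure-timing.txt. The marker strings are taken from the trace above, while the function name `stage_durations` and the stage-to-marker mapping are assumptions made for illustration and must be adapted to the actual traces.

```python
# Sketch: derive per-stage boot times from host-timestamped console lines.
# Marker strings come from the example trace; the mapping of markers to
# stages is an assumption, not the logic of Measure-timing.txt.

STAGE_MARKERS = [
    ("FSBL", "NOTICE:"),             # first TF-A trace
    ("SSBL", "U-Boot "),             # U-Boot banner
    ("Linux", "Starting kernel"),    # last U-Boot trace before the kernel
    ("init", "Run /sbin/init"),      # kernel hands over to user-land
    (None, "login:"),                # end of boot, closes the "init" stage
]


def stage_durations(timestamped_lines):
    """Return {stage: seconds} from an iterable of (timestamp, line) pairs."""
    starts = []      # timestamp at which each marker was first seen, in order
    pending = 0      # index of the next marker to look for
    for timestamp, line in timestamped_lines:
        if pending < len(STAGE_MARKERS) and STAGE_MARKERS[pending][1] in line:
            starts.append(timestamp)
            pending += 1

    durations = {}
    for index in range(len(starts) - 1):
        stage_name = STAGE_MARKERS[index][0]
        durations[stage_name] = starts[index + 1] - starts[index]
    if len(starts) >= 2:
        # Span between the first and the last marker actually seen.
        durations["total"] = starts[-1] - starts[0]
    return durations
```

The timestamps can be taken, for example, with `time.monotonic()` each time a full line is read from the serial port. Note that this measures the arrival time of the traces on the host, so it includes the serial transmission delay.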
3.2. Using hardware timers
When measuring traces from a serial console is not precise enough, or when traces are not available, hardware timers can be used instead. Some components of the boot-chain provide timing information based on hardware timers; for instance with U-Boot, the following options can be enabled:
CONFIG_BOOTSTAGE=y
CONFIG_BOOTSTAGE_REPORT=y
Just before booting the OS, this prints something like the following on the serial console (not actual figures):
Starting kernel ...
Timer summary in microseconds (11 records):
Mark Elapsed Stage
0 0 reset
123,456 123,456 board_init_f
2,345,678 2,222,222 board_init_r
3,456,789 1,111,111 id=64
3,567,890 111,101 id=65
3,678,901 111,011 main_loop
4,567,890 888,989 bootm_start
4,678,901 111,011 id=15
4,789,012 110,111 start_kernel
Accumulated time:
23,456 dm_r
567,890 dm_f
Booting Linux on physical CPU 0x0
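Such a report is easy to post-process once captured. The Python sketch below parses the "Mark / Elapsed / Stage" records of a report shaped like the example above and returns the slowest stages. The function names `parse_bootstage` and `slowest_stages` are hypothetical, and the parser assumes the exact layout shown here (two comma-grouped numbers followed by the stage name); the "Accumulated time" entries are deliberately ignored.

```python
import re

# Matches record lines such as "2,345,678 2,222,222 board_init_r".
# Lines with a single number (accumulated times) do not match.
_RECORD = re.compile(r"^\s*(\d[\d,]*)\s+(\d[\d,]*)\s+(\S.*?)\s*$")


def parse_bootstage(report):
    """Return a list of (mark_us, elapsed_us, stage) tuples from a report."""
    records = []
    for line in report.splitlines():
        match = _RECORD.match(line)
        if match:
            mark, elapsed, stage = match.groups()
            records.append((int(mark.replace(",", "")),
                            int(elapsed.replace(",", "")),
                            stage))
    return records


def slowest_stages(records, count=3):
    """Return the `count` records with the largest elapsed time."""
    return sorted(records, key=lambda record: record[1], reverse=True)[:count]
```

Sorting by elapsed time points directly at the stages worth optimizing first.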
If the serial console is not available, this timing information can also be stored in the device-tree or in memory:
CONFIG_BOOTSTAGE_FDT=y
CONFIG_BOOTSTAGE_STASH=y
Of course it is possible to add this feature to boot-chain components that do not implement it by default, like TF-A. See the file arch/arm/cpu/armv7/arch_timer.c from U-Boot for an example.
3.3. Using GPIOs
When the console is disabled, the components of the boot-chain can be modified to trigger events on GPIOs according to the current stage of the boot-process. A logic analyzer can then be used to measure the time between these events.
4. Optimizing boot-time
4.1. TF-A
The execution time of TF-A can be noticeably reduced just by disabling features that are not required. To achieve this, the right options have to be specified when building TF-A; for instance with a system that boots exclusively from a NAND connected through FMC:
user@pc$ make STM32MP1_DEBUG_ENABLE=0 \
STM32MP1_UART_PROGRAMMER=0 \
STM32MP1_USB=0 \
STM32MP1_QSPI_NOR=0 \
STM32MP1_QSPI_NAND=0 \
STM32MP_FMC_NAND=1 \
STM32MP_EMMC=0 \
STM32MP_SDMMC=0 \
…
Of course these build options have to be adjusted according to the expected usage and version of TF-A.
4.2. U-Boot
4.2.1. Delay
Before loading the Linux kernel and its device tree, U-Boot waits by default one second for possible user input. This behavior, undesirable when boot-time matters, can easily be removed by specifying bootdelay=0 in include/configs/stm32mp1.h.
4.2.2. Configuration & device tree
More broadly, removing support for all unused devices from the U-Boot configuration (see U-Boot overview#U-Boot configuration) and removing unused nodes from the device-tree drastically reduces U-Boot execution time, since this saves both the time taken to initialize those devices and the time taken to parse the device-tree.
Also, there are a couple of features in U-Boot that are specially designed to improve the boot-time; for example:
CONFIG_MTD_UBI_FASTMAP=y
CONFIG_SYS_MALLOC_CLEAR_ON_INIT=n
Regarding support for UBI fastmap (for NOR and NAND storage media), please see the "Linux" section below for more information.
4.2.3. Configuration access
By default U-Boot searches for a boot configuration file extlinux.conf or a boot script boot.scr.uimg on a bootable partition ("bootfs" for OpenSTLinux); see U-Boot_overview#Generic_Distro_configuration for details. Such an access to a file-system on a storage medium takes a small amount of time, but fortunately it is possible to embed a boot command into the U-Boot binary instead. To accomplish that, the U-Boot environment must be wholly specified from scratch:
CONFIG_USE_DEFAULT_ENV_FILE=y
CONFIG_DEFAULT_ENV_FILE="path/to/env.txt"
Or the U-Boot "distro" mode must be disabled:
CONFIG_DISTRO_DEFAULTS=n
In this latter case, be aware that if you are booting from an ext2/4 partition (typically when booting from an SD card or from eMMC), then the following options have to be selected explicitly, since they were previously selected implicitly by CONFIG_DISTRO_DEFAULTS:
CONFIG_CMD_EXT2=y
CONFIG_CMD_EXT4=y
In both cases, CONFIG_ENV_IS_NOWHERE must be the only CONFIG_ENV_IS... option set to y (remove all the others), and the environment variable bootcmd must contain the expected boot command, for example:
CONFIG_BOOTCOMMAND="run bootcmd_ubi"
bootcmd_ubi=env set bootargs ubi.mtd=UBI root=ubi0:boot \
 rootfstype=ubifs rootwait \
 rw console=ttySTM0,115200; \
 ubi part UBI; \
 ubifsmount ubi0:boot; \
 ubifsload 0xc2000000 /zImage; \
 ubifsload 0xc4000000 /stm32mp157c-ev1.dtb; \
 bootz 0xc2000000 - 0xc4000000
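The requirement that CONFIG_ENV_IS_NOWHERE be the only environment location left enabled is easy to get wrong when editing a defconfig by hand. As a minimal sketch, the following Python helpers scan the text of a U-Boot .config file and report which CONFIG_ENV_IS_* options are set to y. The helper names are hypothetical and the parsing assumes the usual `NAME=value` and `# NAME is not set` line formats.

```python
def enabled_env_locations(config_text):
    """Return the CONFIG_ENV_IS_* options set to y in a .config-style text."""
    enabled = []
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#") or "=" not in line:
            continue  # skips comments such as "# CONFIG_FOO is not set"
        name, value = line.split("=", 1)
        if name.startswith("CONFIG_ENV_IS_") and value.strip() == "y":
            enabled.append(name)
    return enabled


def env_is_embedded_only(config_text):
    """True when CONFIG_ENV_IS_NOWHERE is the only environment location."""
    return enabled_env_locations(config_text) == ["CONFIG_ENV_IS_NOWHERE"]
```

Running `env_is_embedded_only(open(".config").read())` in the U-Boot build directory would then confirm that no storage-backed environment remains enabled.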
4.3. Linux
4.3.1. Configuration & device-tree
As for U-Boot, one of the most efficient ways to decrease the boot-time of Linux is to remove support for all unused devices from its configuration and device-tree, since this saves the time taken to initialize those devices. Also, support for devices and features that are required but not needed at boot-time can be compiled as modules, and then loaded by the user-land once the boot-process is done (see "Init" subsection below).
4.3.2. Traces
Linux boot traces are about 20 kB in size, so they take about 2 seconds to be transferred over a serial link configured at 128 kbps. If this becomes an issue, the following kernel parameters (bootargs variable in U-Boot) can be used to remove these traces:
quiet loglevel=0
U-Boot does this automatically when silent mode is requested, that is, when silent=1 is set in the U-Boot environment in use (see CONFIG_SILENT_CONSOLE and CONFIG_SILENT_U_BOOT_ONLY).
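The cost of console traces follows directly from the amount of data and the speed of the link. As a rough sketch in Python, with a hypothetical helper name and assuming 8N1 framing (10 bits on the wire per byte, an assumption that is not stated in this article):

```python
def uart_transfer_seconds(payload_bytes, baud_rate, bits_per_byte=10):
    """Estimate the time needed to send payload_bytes over a UART.

    bits_per_byte defaults to 10, i.e. 8 data bits plus one start and one
    stop bit (8N1). This is an assumption; adjust it to the real framing.
    """
    return payload_bytes * bits_per_byte / baud_rate
```

For instance, `uart_transfer_seconds(11520, 115200)` gives 1.0: every 11520 bytes of traces cost one second on a console running at 115200 baud, the rate used in the bootargs example of this article.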
4.3.3. UBI volume
When partitions like "bootfs" and "rootfs" are stored on a UBI volume, it is recommended to perform the following actions to greatly reduce the time of the volume attachment performed at boot-time by both U-Boot and the Linux kernel:
1. decrease the size of the UBI volume; and
2. enable the fastmap feature. For that, the option CONFIG_MTD_UBI_FASTMAP must be set to y for both Linux and U-Boot, and ubi.fm_autoconvert=1 has to be added to the kernel boot parameters. Please note that the very first boot is used to create the fastmap information, so that boot does not get faster.
Many other volume and file-system types exist for Flash media, like JFFS, LogFS, F2FS, ... Depending on the use-case, some can be faster than others.
4.4. User-land
4.4.1. Init
The very first user-land process launched by the Linux kernel is init; it is in charge of launching the other processes in the right order. As a consequence, removing all unnecessary services can greatly speed up this part of the boot-process. The way services can be removed highly depends on the init system in use (systemd, OpenRC, ...), so please refer to its documentation.
Ultimately, when no services are required at all, the init system can be replaced by the final application itself, either by installing its binary as /sbin/init or by passing the following Linux boot parameter:
init=/path/to/application/binary
4.4.2. Framework
For graphical applications, the choice of the display framework can seriously impact the startup time of the application itself. For instance, one could use the Linux framebuffer instead of Weston or Xorg, since the former is set up faster than the latter two.
Likewise for video applications: if the boot-time really matters, it is recommended to use the V4L2 interface directly instead of higher-level interfaces like GStreamer.
5. Conclusion
There is no universal recipe to improve the boot-time of a Linux system, since it highly depends on the constraints induced by the final use-case. However, there is one universal rule: always benchmark, before and after optimizing.
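To make the "before and after" comparison systematic, per-stage measurements can be kept as simple dictionaries and compared after each change. The Python sketch below is one possible way to do it; the function name and the dictionary layout (stage name mapped to seconds) are assumptions made for illustration.

```python
def compare_boot_times(before, after):
    """Return {stage: (before_s, after_s, gain_s)} for stages in both runs.

    A positive gain means the stage got faster; a negative one is a
    regression introduced by the change under test.
    """
    report = {}
    for stage, old in before.items():
        if stage in after:
            new = after[stage]
            report[stage] = (old, new, old - new)
    return report
```

Since boot-time varies from one boot to the next, it is safer to feed this comparison with values taken over several boots rather than a single run.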