How to optimize the boot time

Revision as of 16:26, 28 June 2019 by Registered User (→‎Delay)


1. Article purpose

The purpose of this document is to provide information on how to measure and improve the boot-time[1] of a typical STM32MP15 Linux system. This article is not an exhaustive list of possible optimizations: those that are considered not reliable enough for industrial-grade usage are omitted on purpose.


2. Overview

On a typical STM32MP1 Linux system, the boot-chain is performed, in order, by the ROM code, TF-A, U-Boot, the Linux kernel, and the user-land[2]. All these components except the ROM code can be modified, and thus configured to start more quickly. For each of them, the procedure is always the same: features that are not required at boot-time must be removed or deferred, and features that improve the boot-time must be enabled.


3. Measuring boot-time

Before optimizing the performance of any piece of software, one should know how much time each part of it takes, that is, where the effort must be focused.


3.1. Using a serial console

One of the easiest ways to measure the boot-time of a Linux system is to observe the traces emitted on a serial console using timing software, like grabserial or the script File:Measure-timing.txt based on microcom and p2f:

 user@pc$ microcom -p /dev/ttyACM0 | bash measure-timing.txt
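
When neither tool is at hand, the principle can be sketched in a few lines of Python: each line is timestamped on the host as it arrives, and printed with the time elapsed since the first line and since the previous one. This is only a minimal sketch under that assumption, not the content of measure-timing.txt; the names timestamp_lines and main are illustrative.

```python
import sys
import time


def timestamp_lines(lines, clock=time.monotonic):
    """Yield (elapsed_s, delta_s, text) for each line as it is received.

    `lines` is any iterable of bytes or str; `clock` is injectable so the
    function can be exercised without a real serial link.
    """
    start = previous = None
    for raw in lines:
        now = clock()
        if start is None:
            start = previous = now  # the first line defines time zero
        if isinstance(raw, bytes):
            # Serial output may contain garbage bytes; never fail on them.
            raw = raw.decode("utf-8", errors="replace")
        yield now - start, now - previous, raw.rstrip("\r\n")
        previous = now


def main():
    """Timestamp whatever arrives on standard input (e.g. piped from microcom)."""
    for elapsed, delta, text in timestamp_lines(sys.stdin.buffer):
        print(f"{elapsed:9.3f} {delta:8.3f}  {text}", flush=True)
```

Saved to a file with a final call to main(), the sketch can take the place of the script at the end of the pipeline above. The timestamps are taken on the host, so they include the latency of the serial adapter.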


3.2. Using hardware timers

When measuring traces from a serial console turns out not to be precise enough, or when traces are not available, hardware timers can be used instead. Some components of the boot-chain provide timing information based on hardware timers; for instance with U-Boot, the following options can be enabled:

 CONFIG_BOOTSTAGE=y
 CONFIG_BOOTSTAGE_REPORT=y

Of course it is possible to add this feature to boot-chain components that do not implement it by default, like TF-A. See the file arch/arm/cpu/armv7/arch_timer.c from U-Boot for an example.
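
Raw counter values captured this way still have to be converted into time on the host. The following Python sketch shows the arithmetic; the function name ticks_to_us and the 24 MHz frequency are assumptions for illustration only, and the real frequency must be read from the platform (on Armv7-A, the CNTFRQ register).

```python
def ticks_to_us(start_ticks, end_ticks, freq_hz):
    """Convert a free-running counter interval to whole microseconds."""
    if freq_hz <= 0:
        raise ValueError("counter frequency must be positive")
    if end_ticks < start_ticks:
        raise ValueError("end stamp precedes start stamp")
    return (end_ticks - start_ticks) * 1_000_000 // freq_hz


# Hypothetical stamps captured at the entry and exit of a boot stage,
# with an assumed (not measured) 24 MHz counter.
ASSUMED_FREQ_HZ = 24_000_000
print(ticks_to_us(1_200_000, 3_600_000, ASSUMED_FREQ_HZ), "us")
```

Integer arithmetic is used on purpose, so that long intervals do not lose precision.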


3.3. Using GPIOs

If no serial console is available, then the components of the boot-chain can be modified to trigger events on GPIOs according to the current stage of the boot-process. A logic analyzer can then be used to measure the time between these events.
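
Once the edges are exported from the logic analyzer, the duration of each stage is simply the difference between consecutive timestamps. A minimal Python sketch, assuming the capture has already been reduced to (stage name, timestamp in microseconds) pairs; the stage names and values below are made up for illustration.

```python
def stage_durations(edges):
    """Return [(from_stage, to_stage, duration_us), ...] for consecutive edges.

    `edges` is a list of (stage_name, timestamp_us) pairs, in any order.
    """
    ordered = sorted(edges, key=lambda edge: edge[1])
    return [
        (a_name, b_name, b_time - a_time)
        for (a_name, a_time), (b_name, b_time) in zip(ordered, ordered[1:])
    ]


# Hypothetical capture: one GPIO toggle at the entry of each boot stage.
edges = [("tf-a", 0), ("u-boot", 180_000), ("kernel", 950_000), ("init", 2_400_000)]
for src, dst, duration_us in stage_durations(edges):
    print(f"{src} -> {dst}: {duration_us / 1000:.1f} ms")
```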


4. Optimizing boot-time

4.1. TF-A

The execution time of TF-A can be noticeably reduced just by disabling features that are not required. To achieve this, the right options have to be specified when building TF-A; for instance with a system that boots exclusively from eMMC:

  user@pc$ make STM32MP1_DEBUG_ENABLE=0    \
                STM32MP1_UART_PROGRAMMER=0 \
                STM32MP1_USB=0             \
                STM32MP1_QSPI_NOR=0        \
                STM32MP1_QSPI_NAND=0       \
                STM32MP_FMC_NAND=0         \
                STM32MP_EMMC=1             \
                STM32MP_SDMMC=0            \
                …

Of course these build options have to be adjusted according to the expected usage and version of TF-A.


4.2. U-Boot

4.2.1. Delay

Before loading the Linux kernel and its device tree, U-Boot waits by default one second for possible user input. This behavior, undesirable when boot-time matters, can easily be removed by specifying bootdelay=0 in include/configs/stm32mp1.h.


4.2.2. Configuration & device tree

More broadly, removing support for all unused devices from the U-Boot configuration and device-tree can drastically reduce its execution time, since the time taken to initialize those devices is saved.

Also, a couple of U-Boot features are specifically designed to improve the boot-time; for example:

 CONFIG_MTD_UBI_FASTMAP=y
 CONFIG_SYS_MALLOC_CLEAR_ON_INIT=n

Regarding support for UBI fastmap (for NOR and NAND storage media), please see the "Linux" section below for more information.


4.2.3. File-system access

By default U-Boot searches for a boot configuration file extlinux.conf or a boot script boot.scr.uimg on a bootable partition ("bootfs" for OpenSTLinux); see U-Boot_overview#Generic_Distro_configuration for details. Such file-system access on a storage medium takes some time, but fortunately it is possible to embed a boot command into the U-Boot binary instead. To accomplish that, either the U-Boot "distro" mode must be disabled (and the required features that were automatically selected by this option must be re-enabled explicitly), for instance:

 CONFIG_DISTRO_DEFAULTS=n
 CONFIG_CMD_EXT2=y

Or the U-Boot environment must be wholly specified from scratch:

 CONFIG_USE_DEFAULT_ENV_FILE=y
 CONFIG_DEFAULT_ENV_FILE="path/to/env.txt"

In both cases, CONFIG_ENV_IS_NOWHERE must be the only environment location option set to y (all the other CONFIG_ENV_IS... options must be removed), and the environment variable bootcmd must contain the expected boot command, for example:

 CONFIG_BOOTCOMMAND="run bootcmd_ubi"
 bootcmd_ubi=env set bootargs ubi.mtd=UBI root=ubi0:boot \
                          rootfstype=ubifs rootwait  \
                          rw console=ttySTM0,115200; \
          ubi part UBI;                              \
          ubifsmount ubi0:boot;                      \
          ubifsload 0xc2000000 /zImage;              \
          ubifsload 0xc4000000 /stm32mp157c-ev1.dtb; \
          bootz 0xc2000000 - 0xc4000000


4.3. Linux

4.3.1. Configuration & device-tree

As for U-Boot, one of the most efficient ways to decrease the boot-time of Linux is to remove support for all unused devices from its configuration and device-tree, since the time taken to initialize those devices is saved. Also, support for devices and features that are required but not needed at boot-time can be compiled as modules, and then loaded by the user-land once the boot-process is done (see "Init" subsection below).


4.3.2. Traces

Linux boot traces take time to be transferred over the serial link: with 8N1 framing (10 bits per byte), 2 kB of traces take about 0.16 seconds on a link configured at 128 kbit/s, so a verbose boot log of a few tens of kilobytes can cost seconds. If this becomes an issue, the following kernel parameters (bootargs variable in U-Boot) remove these traces:

 quiet loglevel=0

U-Boot does this automatically when silent mode is required (see CONFIG_SILENT_CONSOLE and CONFIG_SILENT_U_BOOT_ONLY), that is, when silent=1 is set in the U-Boot environment in use.
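
Whether console traces are worth removing can be estimated from their size and the link speed. The Python sketch below does this arithmetic, assuming 8N1 framing; the function name and the log sizes are hypothetical, and 115200 baud is the console speed used in the bootargs example of this article.

```python
def serial_transfer_seconds(n_bytes, baud, bits_per_byte=10):
    """Time needed to push n_bytes through a UART.

    bits_per_byte=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    """
    if baud <= 0:
        raise ValueError("baud rate must be positive")
    return n_bytes * bits_per_byte / baud


# Hypothetical log sizes, sent at 115200 baud.
for size in (2_000, 20_000):
    print(f"{size} bytes: {serial_transfer_seconds(size, 115_200):.2f} s")
```

The actual size of the traces can be measured on a running target by counting the bytes printed by dmesg.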

4.3.3. UBI volume

When partitions like "bootfs" and "rootfs" are stored on a UBI volume, it is recommended to perform the following actions to greatly reduce the time of the volume attachment performed at boot-time both by U-Boot and by the Linux kernel:

1. decrease the size of the UBI volume; and

2. enable the fastmap feature. For that, the option CONFIG_MTD_UBI_FASTMAP must be set to y both for Linux and U-Boot, and ubi.fm_autoconvert=1 has to be added to the kernel boot parameters. Please note that the very first boot is used to create the fastmap information, so that boot is not faster.

A lot of other volume/file-systems exist for Flash media, like JFFS, LogFS, F2FS, ... Depending on the use-case, some can be faster than others.


4.4. User-land

4.4.1. Init

The very first user-land process launched by the Linux kernel is init; it is in charge of launching the other processes in the right order. As a consequence, removing all unnecessary services can greatly speed up this part of the boot-process. The way services can be removed highly depends on the init system in use (systemd, OpenRC, ...), so please refer to its documentation.
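
To decide which services to look at first, it helps to rank them by start-up time. As an illustration for systemd only, the Python sketch below parses text in the format printed by systemd-analyze blame (a time span followed by a unit name); the helper name parse_blame and the sample lines are made up, and other init systems need a different approach.

```python
import re

# Time-span suffixes handled by this sketch, in seconds.
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
_TOKEN = re.compile(r"^(\d+(?:\.\d+)?)(h|min|ms|us|s)$")


def parse_blame(text):
    """Return [(unit_name, seconds), ...] sorted from slowest to fastest."""
    results = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        *spans, unit = parts
        total = 0.0
        for span in spans:
            match = _TOKEN.match(span)
            if match is None:
                break  # unknown token: skip the whole line
            total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        else:
            results.append((unit, total))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Made-up sample in the "<time span> <unit>" layout.
SAMPLE = """
    1min 2.000s slow-example.service
         5.727s network-example.service
          324ms quick-example.service
"""
for unit, seconds in parse_blame(SAMPLE):
    print(f"{seconds:8.3f} s  {unit}")
```

On a target running systemd, the output of systemd-analyze blame can be saved to a file and passed to parse_blame in place of SAMPLE.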


Ultimately, when no services are required at all, the init system can be replaced by the final application itself, either by installing its binary as /sbin/init, or by passing the following Linux boot parameter:

 init=/path/to/application/binary


4.4.2. Framework

For graphical applications, the choice of the display framework can seriously impact the startup time of the application itself. For instance, one could use the Linux framebuffer instead of Weston or Xorg, since the former is set up faster than the latter.

Likewise for video applications: if the boot-time really matters, then it is recommended to use the V4L2 interface directly instead of higher-level interfaces like GStreamer.


5. Conclusion

There is no universal recipe to improve the boot-time of a Linux system, since it highly depends on the constraints induced by the final use-case. However there is one universal rule: always benchmark, before and after optimizing.
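
Such a before/after comparison can be kept very simple. The Python sketch below compares the median of several boots per stage, so that a single outlier does not distort the result; the function name, stage names and figures are hypothetical.

```python
from statistics import median


def compare_boot_times(before, after):
    """Compare per-stage timings measured before and after an optimization.

    Both arguments map a stage name to a list of samples in milliseconds,
    one sample per boot. Returns
    {stage: (median_before, median_after, gain_percent)}.
    """
    report = {}
    for stage, samples in before.items():
        old = median(samples)
        new = median(after[stage])
        gain = 100.0 * (old - new) / old if old else 0.0
        report[stage] = (old, new, gain)
    return report


# Hypothetical measurements, three boots each.
before = {"u-boot": [1510, 1495, 1502], "kernel": [2300, 2310, 2290]}
after = {"u-boot": [480, 470, 475], "kernel": [2295, 2305, 2300]}
for stage, (old, new, gain) in compare_boot_times(before, after).items():
    print(f"{stage:8s} {old:6.0f} ms -> {new:6.0f} ms ({gain:+.1f} %)")
```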


  1. Sometimes referred to as startup-time.
  2. Sometimes referred to as user-space.