How to start with Coupling Chaining Bridge with Key Wrapped for an ECDSA Signature example on STM32U3
85min
Literature
- RM0487 STM32U3 Reference Manual
- UM2237 STM32CubeProgrammer software description
- AN5054 Secure programming using STM32CubeProgrammer
- AN2606 STM32 microcontroller system memory boot mode
- AN6205 Introduction to the use of PKA Key wrapping
- It is advised to read the STM32U3 reference manual chapter dedicated to the Coupling and Chaining Bridge (CCB)
- Start by reading the introduction article CCB KW ECDSA Signature How to Introduction
Target description
The purpose of this article is to provide an example use case involving the CCB feature.
You will see how the private key is protected through wrapping and that, once it is wrapped, only the wrapped key is used: the private key in clear is no longer needed.
The example runs on the STM32U385 or STM32U3C5 Nucleo board; both devices embed hardware cryptographic accelerators.
Introduction
Through this practical example, the user will learn how to:
- Use an application provided in the MbedTLS_HW_KWE directory of STM32Cube_FW_U3.
- Wrap the private key
- Perform a digital signature of a message using the Elliptic Curve Digital Signature Algorithm (ECDSA) and the wrapped private key.
- Modify the application to generate a public key from the wrapped private key, and use this computed public key to verify the signature.
All the steps using the hidden (wrapped) private key implicitly involve the embedded CCB hardware.
- Note: this provided code is an example and needs to be adapted for a commercial product. For instance, the protection of the embedded flash and RAM needs to be implemented.
- Warning: this example uses a private key declared as a constant, and therefore unprotected. It shows that, once the key is wrapped, only the wrapped key is used and the original private key is no longer needed. In a commercial product, the private key must never be visible in clear in the field.
Prerequisites
- Hardware
- U385RG-Q Nucleo board (MB1841) or U3C5ZI-Q Nucleo board (MB2222)
Note: to run this example, a device with a hardware crypto accelerator is required (STM32U385 or STM32U3C5).
- Required tools
- STM32Cube_FW_U3_V1.3.0 [1] or later
- IAR Embedded Workbench rev 9.20.1 + patch STM32U37x-38x or STM32U3Bx-3Cx
- The patch is available in the STM32CubeFW: STM32Cube_FW_U3_Vx.x.x\Utilities\PC_Software
- Environment setup
- Download the STM32CubeU3 package and install it
- A directory NUCLEO-U385RG-Q is included in the CubeFW Projects directory (see figure below).
- The Applications\MbedTLS_HW_KWE directory contains the different application examples using wrapped keys.
- For the NUCLEO-U3C5ZI-Q, the Applications\MbedTLS_HW_KWE folder is not included, but the NUCLEO-U385RG-Q examples can be used.
- Create the MbedTLS_HW_KWE folder in Projects\NUCLEO-U3C5ZI-Q\Applications and copy the NUCLEO-U385RG-Q\Applications\MbedTLS_HW_KWE\ECC_ECDSA_Sign_KWE example into it.
- Proceed in the same way if you want to execute other provided Key Wrap examples.
1. Execution of the ECC_ECDSA_Sign_KWE application example
The following example is shown for IAR Embedded Workbench, but it also applies to the other supported toolchains.
- Open the ECC_ECDSA_Sign_KWE example.
- The firmware is available at: STM32Cube_FW_U3_Vx.x.x\Projects\NUCLEO-U385RG-Q\Applications\MbedTLS_HW_KWE\ECC_ECDSA_Sign_KWE
- Or in the copied folder (see Prerequisites) STM32Cube_FW_U3_Vx.x.x\Projects\NUCLEO-U3C5ZI-Q\Applications\MbedTLS_HW_KWE\ECC_ECDSA_Sign_KWE
- STM32U385RG should be automatically selected in the product option (see figure below)
- For IAR: menu: Project -> Option -> General Option
- If this is not the case, check that your IDE version supports this product and that the STM32U3 patch is correctly installed (see the prerequisites at the beginning of this article).
The example provided for the NUCLEO-U385RG-Q should have the correct compilation switch, as shown in the right part of the figure below.
(If this is not the case, the switch needs to be updated.)
For the NUCLEO-U3C5ZI-Q, since the example has been copied, the switch needs to be updated as shown in the left part of the figure below.
- To step through this example instruction by instruction, it is advised to set the optimization level to "none", as shown in the figure below:
- Compile the project: Project -> Rebuild All
- Connect the Nucleo and start the debug session.
- Do a step by step execution to follow the explanations below.
1.1. ECDSA configuration
After initialization of the HAL, the clocks and the BSP, the configuration of the ECDSA is done through the key_attributes structure.
- The digital signature is produced by signing the hash generated from the input message; this is selected with the attribute PSA_KEY_USAGE_SIGN_HASH.
- The ECDSA signature is selected with the attribute PSA_ALG_ECDSA(PSA_ALG_SHA_224).
- Several hashing algorithms are supported; for this example SHA2-224 is chosen, which generates a fixed-length 224-bit hash.
- In this example a pair of keys is used, one private (to sign) and one public (to verify the signature), selected with attribute: PSA_KEY_TYPE_ECC_KEY_PAIR
- The argument PSA_ECC_FAMILY_SECP_R1 defines the family of the chosen elliptic curve.
- The size of the private key defines which curve of this family is selected (256 bits for this example, see main.c, Private_Key declaration).
- This results in the elliptic curve secp256r1.
See the appendix of this article for more details about the elliptic curve used for this example.
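The way the key size selects one curve inside the SECP R1 family can be sketched in plain C. This is only an illustration: secp_r1_curve_name and key_bits_from_bytes are hypothetical helpers, not part of the PSA API or of the example project, and the list of sizes is the set of curves usually associated with PSA_ECC_FAMILY_SECP_R1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a PSA function): converts a key length in bytes
 * into a key size in bits, as done for the 32-byte Private_Key array. */
size_t key_bits_from_bytes(size_t key_bytes)
{
    assert(key_bytes <= SIZE_MAX / 8u);   /* guard against overflow */
    return key_bytes * 8u;
}

/* Hypothetical helper (not a PSA function): returns the SECG name of the
 * SECP R1 curve matching a key size in bits, or NULL if no curve of the
 * family has that size. */
const char *secp_r1_curve_name(size_t key_bits)
{
    switch (key_bits) {
    case 192: return "secp192r1";
    case 224: return "secp224r1";
    case 256: return "secp256r1";
    case 384: return "secp384r1";
    case 521: return "secp521r1";
    default:  return NULL;
    }
}
```

With the 32-byte private key of this example, key_bits_from_bytes(32) gives 256 and the lookup returns secp256r1, which matches the curve named above.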
1.2. Private key configuration and private key wrapping using CCB
The figure below shows the setting of the private key.
The CCB (Coupling and Chaining Bridge) embedded hardware IP is implicitly used to wrap the key.
The CCB uses several hardware IPs: the Random Number Generator (RNG), the Public Key Accelerator (PKA) and the Secure AES coprocessor (SAES).
All these operations are hidden by the CCB.
- PSA_KEY_PERSISTENCE_DEFAULT: defines that the private key is stored in the embedded flash and under which condition the key can be destroyed.
- PSA_CRYPTO_KWE_DRIVER_LOCATION: driver using the CCB to wrap the private key (wrapped with the Derived Hardware Unique Key (DHUK) making the wrapped key only usable on this specific device)
- Note: for products such as STM32U3 having code execution isolated through HDPL levels, every HDPL level has a different associated DHUK. Thus, a key wrapped for a code executed in HDPLx can't be used for a code executed in a different HDPL level than x.
- When the key import has been completed (line 170 in the picture below)
- Add to watch: "key_attributes", "Private_Key", "sizeof(Private_Key)" and "key_handle_private"
- The "key_attributes" contains the private key settings.
- The "Private_Key" is defined as a constant declaration in the main.c, 32x8bits => 256bits (therefore located in flash)
- The "Public_Key" is defined as a constant declaration in the main.c (X and Y coordinates of the public key point on the elliptic curve, also located in flash).
- This gives 512 bits for the public key, preceded by 8 bits (0x04) indicating that the point of the Weierstrass-form elliptic curve is given in uncompressed format (X followed by Y).
- The key_handle_private is the handle containing all the needed information for the private key.
- Note that this handle is located in SRAM (for this example at address 0x2000 A438) (see watch window in the figure above).
- After the key import, the private key attributes are not needed anymore, so the structure can be cleared as shown in the figure below.
- It can be observed in the watch window, that the key_attributes values are removed.
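The 65-byte public key layout mentioned above (one 0x04 byte, then X, then Y) can be checked and split with a few lines of C. This is a minimal sketch: p256_pubkey_split and the two size macros are hypothetical names introduced for illustration and do not exist in the example project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define P256_COORD_SIZE   32u                          /* one 256-bit coordinate */
#define P256_PUBKEY_SIZE  (1u + 2u * P256_COORD_SIZE)  /* 0x04 || X || Y = 65 bytes */

/* Hypothetical helper: splits an uncompressed secp256r1 public key into its
 * X and Y coordinates. Returns 0 on success, -1 if the length or the leading
 * format byte is not the expected one. */
int p256_pubkey_split(const uint8_t *pubkey, size_t pubkey_len,
                      uint8_t x[P256_COORD_SIZE], uint8_t y[P256_COORD_SIZE])
{
    assert(pubkey != NULL && x != NULL && y != NULL);
    if (pubkey_len != P256_PUBKEY_SIZE || pubkey[0] != 0x04u) {
        return -1;
    }
    memcpy(x, &pubkey[1], P256_COORD_SIZE);
    memcpy(y, &pubkey[1u + P256_COORD_SIZE], P256_COORD_SIZE);
    return 0;
}
```

Called on the Public_Key constant, such a helper would return the two 32-byte coordinates that the watch window displays as one array.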
1.3. Computing the hash of the message and performing the digital signature
The next step is to generate the digest of the message and to sign this digest.
1.3.1. Message Hashing
- The digest algorithm used in this example is the SHA2-224 hashing algorithm.
- When the digest is completed (line 191 in this example)
- Right click on Message and select Add to Watch
- Do the same for sizeof(Message), Computed_Hash and sizeof(Computed_Hash)
- In the watch window the input message is displayed as written in the constant declaration (line 62 for this example). The message size is 1024 bits (128 x 8)
- "Computed_Hash" in the watch window shows the Hash generated from the Message. The Hash size is 224 bits (28 x 8).
- Note: with SHA2-224, the hash length is always 224 bits, whatever the input message length (including messages shorter than 224 bits).
1.3.2. Digital signature generation
The ECDSA algorithm is used to perform the digital signature.
The secp256r1 curve is used; this curve was defined during the private key configuration and wrapping.
- Reminder: the basic principle relies on a key pair: one key to sign (the private key, kept secret) and one key to verify the signature (the public key, not secret, distributed in clear). The algorithm, the elliptic curve, all the parameters and the public key are public. The only secret is the private key, so it is essential to protect it against attacks.
- The role of the CCB, implicitly used in this example, is to keep the private key hidden from the user and the CPU (wrapped key).
- When the digital signature has been performed (line 220 for this example)
- Right click on Computed_Signature and select Add to Watch
- Do the same for sizeof(Computed_Signature).
- You can observe the value of the signature result.
- Note: every code execution produces a completely different signature, even with the same input message, parameters and algorithm, because a random number is generated at every run (see explanations in the introduction article).
- Note: more explanations about the computed signature can be found in the appendix of this article.
This completes the digital signature process.
1.4. Digital signature verification
When a signed message is received, the digital signature verification ensures that the message has been issued by the correct author and that the message has not been modified (authenticity and integrity).
Non-repudiation is also guaranteed, since the private key is known only by the signer, who is therefore the only one able to produce the signature. If the public key verifies the signature, the signer cannot deny being the issuer of the signed message.
- Reminder:
- The hash signed with the private key of the author can only be verified with the related public key.
- The public key can be generated by the author and is directly linked to his private key.
- Or the public key can be generated by the STM32 device using the wrapped private key (example is shown in the second part of this article).
- In both cases the verifier never has access to the private key in clear.
The figure below shows the configuration for the public key
- Note:
- For this example, the public key is defined as a constant and stored in the flash (add to watch: Public_Key).
- The key_handle_public is located in the RAM (add to watch: key_handle_public).
The figure below shows the command used to verify the digital signature.
If the verification is successful the remaining steps are:
- The deletion of some sensitive data.
- The green LED is switched ON if all the operations have completed successfully.
More detailed explanations are provided in the next chapters.
2. Rerun of the example with a more in-depth analysis of some points
The previous chapter has given an overview of all the operations performed in this example.
The following sections analyze some of the executed steps in more depth.
- Reset and restart the code execution.
2.1. Private key wrapping
- Open the kwe_core.c and set the breakpoints as indicated in the figure below.
- Middlewares/ST/mbedtls_key_wrap_engine/kwe_core.c
- Search: /* Configure Wrapping Key : DHUK */
- Set two breakpoints as indicated in the figure below (the code line number can differ depending on the CubeFw release)
- Open the psa_crypto.c
- Middlewares/mbedtls/psa_crypto.c
- Search: /* Key material is saved in export representation in the slot
- Set a breakpoint as indicated in the figure below (the code line number can differ depending on the CubeFw release)
- Execute the code till the first breakpoint.
- Open a memory watch window and enter the address 0x200122c0 (View -> Memory -> Memory 1 ).
- The figure below shows the 32 bytes of the private key temporarily stored in the embedded RAM.
- Note: if the RAM address is different in your case, search (ctrl+f) in the memory window as indicated in the figure above.
- Note: the private key copy in RAM is optional since this key is already available in flash. This is a generic example since for some use cases this copy can be useful.
- Add to watch: ecdsa_blob (64 bytes)
- IV (Initialization Vector), 16 bytes
- Tag 16 bytes
- Wrapped key 32 bytes
- Execute the code till the second break point.
- In the memory watch window enter the address indicated by the IV pointer (in this example 0x2000 AA18).
- The memory window shows the IV, the Tag and the wrapped private key stored in the embedded RAM.
- Note:
- The wrapped key is different for every device since it is depending on the RHUK (Root Hardware Unique Key) that is different for every device.
- And the wrapped key is also different for every key wrapping, so if you run twice the code the obtained result is different.
- ecdsa_blob, some explanations:
- IV: the key wrapping mechanism is using AES-GCM (Advanced Encryption Standard, Galois Counter Mode) requiring an Initialization Vector.
- The first 4 bytes of the IV (one 32-bit word) are fixed to 0x2 and the 3 other 32-bit words are generated using the RNG (see appendix for further details).
- Tag: allows the authenticity of the key to be verified during the unwrap process. During the key wrap, the hardware generates a tag related to the key and the IV. During the unwrap, the hardware generates the tag again from the wrapped key and the IV provided by the software. Comparing the two tags verifies the authenticity of the key.
- WrappedKey: is the encrypted private key, 256 bits (same size as the private key).
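The 64-byte blob described above can be pictured as three consecutive fields. The sketch below is an assumption based only on the sizes and the order listed in this article (IV, Tag, wrapped key); ecdsa_blob_sketch_t and ecdsa_blob_parse are hypothetical names, and the real definition in the key wrap engine middleware may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the 64-byte blob: field order and sizes follow the
 * description in the text, not the middleware source. */
typedef struct {
    uint8_t iv[16];           /* AES-GCM initialization vector */
    uint8_t tag[16];          /* AES-GCM authentication tag */
    uint8_t wrapped_key[32];  /* encrypted 256-bit private key */
} ecdsa_blob_sketch_t;

/* Copies a raw 64-byte memory dump (for instance the content seen in the
 * memory window) into the three named fields. */
void ecdsa_blob_parse(const uint8_t raw[64], ecdsa_blob_sketch_t *out)
{
    assert(raw != NULL && out != NULL);
    memcpy(out->iv,          &raw[0],  sizeof out->iv);
    memcpy(out->tag,         &raw[16], sizeof out->tag);
    memcpy(out->wrapped_key, &raw[32], sizeof out->wrapped_key);
}
```

Splitting the dump this way makes it easier to compare two runs: the IV and therefore the wrapped key and the tag are expected to change at every wrapping.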
- Execute the code till the third breakpoint set previously.
- Add a memory watch window and display address 0x080FC034.
- Make a "Step over" to execute the command.
- In the figure above, it can be seen that the wrapped key has been stored in the embedded flash memory.
- The reason is that the key has been defined as persistent (see in main.c /* Setup the key policy for the private key */)
- Keep the two memory windows open.
- Set a breakpoint in the main as indicated in the figure below.
- Execute the code till the newly set breakpoint.
- Make a "Step over" to execute the command.
- The figure above shows:
- The wrapped private key has been removed from the embedded Flash and RAM location.
- Open a memory window 3 and window 4, with Flash and RAM location of the private key in clear (see previous private key wrapping section).
- For this example addresses : 0x200122c0 and 0x8011ac4:
- The figure above shows that the private key is erased neither in RAM nor in flash.
- This code example uses a private key stored unprotected in the flash and copied in RAM.
- For a real application, a customer needs to define his private key strategy to keep it secret.
- Once the private key is wrapped all operations can be done with this wrapped key and the private key can be erased.
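Erasing the RAM copy of the clear private key after wrapping could look like the sketch below. secure_zeroize is a hypothetical helper written for this illustration; it only covers RAM, since a constant stored in flash cannot be cleared by a simple write and needs a flash erase or, better, must never be programmed in clear.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Overwrites a RAM buffer with zeros. The volatile qualifier keeps the
 * compiler from removing the writes when the buffer is not read again,
 * which a plain memset() does not guarantee. */
void secure_zeroize(void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    volatile uint8_t *p = (volatile uint8_t *)buf;
    while (len-- > 0) {
        *p++ = 0;
    }
}
```

Mbed TLS ships a comparable function, mbedtls_platform_zeroize, which may be preferable in a project that already links the library.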
The next section shows how to modify the provided example.
3. Modification of the provided example
It is advised to copy the directory ECC_ECDSA_Sign_KWE before modifying the example.
3.1. Generation of the public key from the wrapped private key
In the previous hands-on steps, the public key was already generated and defined as a constant.
- With the following code modifications:
- The public key is generated from the wrapped private key.
- The generated public key is used to verify the signature.
3.1.1. Heap size increase
The generated public key requires some additional memory allocation.
- Increase the heap size
- Project -> Option -> Linker -> Stack/Heap Sizes : increase the heap to 0x1400 (see figure below)
3.1.2. New private variable and comment out of the old public key constant
- Define a new private variable for the computed public key
- Copy and paste the following:
uint8_t Computed_Pub_Key[65];
- Comment out the public key constant.
3.1.3. Public key generation from the wrapped private key and use of this computed public key
- Add the following code to generate the public key from the wrapped private key
- Copy and paste the following code:
/* Export the public key from the wrapped private key */
retval = psa_export_public_key(key_handle_private, Computed_Pub_Key, sizeof(Computed_Pub_Key), &computed_size);
if (retval != PSA_SUCCESS)
{
  Error_Handler();
}
- Add the following instruction to replace the public key previously defined as a constant.
3.1.4. Code execution with public key generation from the wrapped private key
- Compile the full project and set the following break points.
- Execute till the first break point.
- Add to watch: Computed_Pub_Key.
- Execute a "step over" and check the generated public key.
- Compare the generated public key with the public key previously defined as a constant and that has been commented out.
- Execute a "step over" and check that the code execution does not enter the "Error_Handler".
- Execute till the second break point where the signature is verified using the computed public key.
- Execute a "step over" and verify that the code execution does not enter the "Error_Handler".
- Execute the code till the end. If the green LED is ON and not blinking, the complete code has been executed and the signature has been verified successfully.
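The visual comparison of the computed public key with the former constant can also be done in code. The following sketch assumes that a copy of the original 65-byte constant is kept under another name for the test; pubkey_matches is a hypothetical helper, not a function of the example project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: returns 1 when the exported public key has the same
 * length and content as the reference key, 0 otherwise. A plain memcmp() is
 * acceptable here because a public key is not secret. */
int pubkey_matches(const uint8_t *computed, size_t computed_len,
                   const uint8_t *expected, size_t expected_len)
{
    assert(computed != NULL && expected != NULL);
    if (computed_len != expected_len) {
        return 0;
    }
    return memcmp(computed, expected, expected_len) == 0;
}
```

In the modified example it would be called after psa_export_public_key, with Computed_Pub_Key and computed_size as the first two arguments.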
3.2. Appendix
3.2.1. Elliptic curve parameters
- Open the ecp_curves.c file (located in the Middlewares -> mbedtls folder)
- The different parameters of the curves are defined in this file.
- For instance for the secp256r1 curve as shown in the figure below
- Curve parameters:
- a & b: constants for the curve equation
- a: is defined as p-3, see RM0487 chapter "Supported elliptic curves".
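- n: order of the curve, that is the order of the generator point G; since the cofactor of secp256r1 is 1, it is also the number of points on the curve
- p: large prime number used as the modulus for all curve operations. It keeps all calculation results within a specific range.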
- G: generator point, fixed defined starting point (starting point for the points additions)
Note: the recommended values for the elliptic curves are defined in the SEC2 document: https://www.secg.org/sec2-v2.pdf.
The endianness of the parameters is reversed between the SEC2 document and ecp_curves.c.
Note: for the 16 fixed points defined in the ecp_curves.c file and used to speed up the point addition calculation (MBEDTLS_ECP_FIXED_POINT_OPTIM == 1), see the explanations in https://mbed-tls.readthedocs.io/en/latest/kb/how-to/how-do-i-tune-elliptic-curves-resource-usage/
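The byte-order difference between the SEC2 listing and ecp_curves.c can be reproduced with a simple reversal. The sketch below uses the prime p of secp256r1 as printed in SEC2 (most significant byte first); bignum_reverse_bytes and secp256r1_p_be are names chosen for this illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverses the byte order of a big number: big-endian (as printed in SEC2)
 * to little-endian, or the other way round. in and out must not overlap. */
void bignum_reverse_bytes(const uint8_t *in, uint8_t *out, size_t len)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < len; i++) {
        out[i] = in[len - 1u - i];
    }
}

/* Prime p of secp256r1, most significant byte first, as listed in SEC2:
 * FFFFFFFF 00000001 00000000 00000000 00000000 FFFFFFFF FFFFFFFF FFFFFFFF */
const uint8_t secp256r1_p_be[32] = {
    0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x01,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0xFF,
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF
};
```

After the reversal, the array starts with twelve 0xFF bytes and ends with 0x01 0x00 0x00 0x00 0xFF 0xFF 0xFF 0xFF, which is the least-significant-byte-first order to look for in ecp_curves.c. In the SEC2 order, a = p - 3 differs from p only in its last byte, 0xFC instead of 0xFF.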
3.2.2. ECDSA signature format
The principle of an ECDSA signature and verification can be found on the net.
To verify the signature the receiver needs to have:
- The Original message
- The signature containing parameters "r" and "S"
- The public key.
- The Computed_Signature of this example contains:
- The parameter "r": x coordinate of a point on the elliptic curve (based on a random number), so the size of r is 32 bytes.
- The parameter "S": 32 bytes, computed from the private key and the message Hash ("S" proves that the sender is the owner of the private key).
- So, the Computed_Signature has a total size of 64 bytes in a raw format, but the length can vary if another format is used such as DER (check on the net).
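To illustrate why the length changes with the format, the sketch below converts a raw 64-byte signature (r followed by S) into the DER encoding, a SEQUENCE of two INTEGERs. It is a minimal example limited to 32-byte r and S values; ecdsa_p256_raw_to_der and der_put_integer are hypothetical helpers, not functions of Mbed TLS or of the example project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes one ASN.1 INTEGER from a big-endian unsigned value and returns the
 * number of bytes written (at most 2 + 1 + len). */
static size_t der_put_integer(const uint8_t *val, size_t len, uint8_t *out)
{
    /* Strip leading zero bytes, keeping at least one byte. */
    while (len > 1 && val[0] == 0x00) {
        val++;
        len--;
    }
    /* A set top bit would be read as a negative number: prepend 0x00. */
    size_t pad = (val[0] & 0x80u) ? 1u : 0u;
    out[0] = 0x02;                   /* INTEGER tag */
    out[1] = (uint8_t)(len + pad);   /* short-form length, at most 33 */
    if (pad) {
        out[2] = 0x00;
    }
    memcpy(&out[2 + pad], val, len);
    return 2 + pad + len;
}

/* Converts a raw 64-byte r||S signature to DER. out must hold at least
 * 72 bytes. Returns the DER length, which depends on the values of r and S. */
size_t ecdsa_p256_raw_to_der(const uint8_t raw[64], uint8_t out[72])
{
    assert(raw != NULL && out != NULL);
    size_t n = 2;                                  /* room for the SEQUENCE header */
    n += der_put_integer(&raw[0], 32, &out[n]);    /* r */
    n += der_put_integer(&raw[32], 32, &out[n]);   /* S */
    out[0] = 0x30;                                 /* SEQUENCE tag */
    out[1] = (uint8_t)(n - 2);                     /* content length, at most 70 */
    return n;
}
```

For typical secp256r1 signatures the DER form is 70 to 72 bytes long, because each value gains a two-byte header and an extra 0x00 byte whenever its top bit is set.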
3.2.3. Initialization Vector generation for AES-GCM
One step during this example is the generation of the IV (Initialization Vector) also called Nonce and used for the AES-GCM algorithm (see on the net for more details about the algorithm).
The embedded hardware Random Number Generator (RNG) is used to generate three of the four 32-bit words of the IV.
The first word is fixed and set to 0x2.
The figure below shows the code used.
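Since the figure is not reproduced here, the construction can be sketched as follows, assuming a 16-byte IV handled as four 32-bit words with the fixed 0x2 word first, as stated in this article. kw_build_iv, rng_word_fn and fake_rng_word are hypothetical names; the real middleware code and its word ordering may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KW_IV_WORDS 4u   /* 16-byte IV = four 32-bit words */

/* Hypothetical RNG callback: stores 32 random bits in *word and returns 0 on
 * success. On target it would wrap the hardware RNG driver. */
typedef int (*rng_word_fn)(uint32_t *word);

/* Builds the IV: one fixed word set to 0x2, the other words from the RNG.
 * Returns 0 on success, -1 if the RNG reports an error. */
int kw_build_iv(uint32_t iv[KW_IV_WORDS], rng_word_fn rng)
{
    assert(iv != NULL && rng != NULL);
    iv[0] = 0x2u;
    for (size_t i = 1; i < KW_IV_WORDS; i++) {
        if (rng(&iv[i]) != 0) {
            return -1;   /* do not use an IV with missing randomness */
        }
    }
    return 0;
}

/* Deterministic stand-in for host-side testing only: NOT random. */
int fake_rng_word(uint32_t *word)
{
    static uint32_t counter = 0xA5A50000u;
    *word = counter++;
    return 0;
}
```

Because the random words change at every call, two wrappings of the same private key produce different IVs and therefore different wrapped keys and tags.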
4. References






