1. YOLO for STM32
The STMicroelectronics Ultralytics fork provides a collection of pre-trained and quantized YOLOv8, YOLO11, and YOLO26 models. These models are compatible with STM32 platforms and target efficient edge computing applications.
The tutorial below offers a standalone, end-to-end guide to deploy YOLOv8, YOLO11 and YOLO26 models to STM32N6.
1.1. Benefits
- Offers a set of models compatible with STM32 platforms and stm32ai-modelzoo.
- Offers a step-by-step guide on how to export quantization-friendly YOLO26 ONNX models for use with stm32ai-modelzoo-services and deployment on the STM32N6 Discovery Kit.
- Offers a quantization-friendly pose estimation model (fixed on the latest version of Ultralytics).
- Offers a step-by-step guide on how to use AiRunner to evaluate YOLOv8 models on STM32N6. See here.
- Offers a guide on how to deploy the gesture detection model on STM32N6. See here.
1.2. Notice
If You combine this software (“Software”) with other software from STMicroelectronics ("ST Software"), to generate a software or software package ("Combined Software"), for instance for use in or in combination with STM32 products, You must comply with the license terms under which ST distributed such ST Software ("ST Software Terms"). Since this Software is provided to You under AGPL-3.0-only license terms, in most cases (such as, but not limited to, ST Software delivered under the terms of SLA0044, SLA0048, or SLA0078), ST Software Terms contain restrictions which will strictly forbid any distribution or non-internal use of the Combined Software. You are responsible for compliance with applicable license terms for any Software You use, and as such, You must limit your use of this software and any Combined Software accordingly.
1.3. Available YOLO Models
| Models | Task | Input Resolution | Format | Input Type | Output Type |
|---|---|---|---|---|---|
| YOLOv8n | person_detection | 256x256x3 | per channel int8 | uint8 | float |
| YOLOv8n | person_detection | 320x320x3 | per channel int8 | uint8 | float |
| YOLOv8n | person_detection | 416x416x3 | per channel int8 | uint8 | float |
| YOLO11n | person_detection | 256x256x3 | per channel int8 | uint8 | float |
| YOLOv8n | gesture_detection | 256x256x3 | per channel int8 | uint8 | float |
| YOLOv8n | gesture_detection | 320x320x3 | per channel int8 | uint8 | float |
| YOLOv8n | pose_estimation | 256x256x3 | per tensor int8 | uint8 | float |
| YOLOv8n | pose_estimation | 256x256x3 | per channel int8 | uint8 | float |
| YOLOv8n | pose_estimation | 320x320x3 | per channel int8 | uint8 | float |
| YOLOv8n | pose_estimation | 192x192x3 | per channel int8 | uint8 | float |
| YOLO11n | pose_estimation | 256x256x3 | per channel int8 | uint8 | float |
| YOLO11n | pose_estimation | 320x320x3 | per channel int8 | uint8 | float |
| YOLOv8n | segmentation | 256x256x3 | per channel int8 | int8 | int8 |
| YOLOv8n | segmentation | 320x320x3 | per channel int8 | int8 | int8 |
| YOLO11n | segmentation | 256x256x3 | per channel int8 | int8 | int8 |
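The Format column distinguishes per-channel from per-tensor int8 quantization. As a rough illustration of the difference, the sketch below applies plain symmetric quantization to made-up weights. It is not the quantizer used by the ST tooling, and every function name and value in it is hypothetical.

```python
# Simplified symmetric int8 quantization; illustrative only, not the ST/ONNX quantizer.


def symmetric_scale(values, qmax=127):
    """Scale that maps the largest magnitude in `values` onto qmax."""
    peak = max(abs(v) for v in values)
    return peak / qmax if peak > 0 else 1.0


def roundtrip_error(values, scale, qmax=127):
    """Largest absolute error after quantizing to int8 and dequantizing."""
    worst = 0.0
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))
        worst = max(worst, abs(v - q * scale))
    return worst


# Two hypothetical output channels with very different value ranges.
weights = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]

# Per tensor: one scale shared by every channel.
tensor_scale = symmetric_scale([v for channel in weights for v in channel])
# Per channel: each channel gets its own scale.
channel_scales = [symmetric_scale(channel) for channel in weights]

small = weights[0]
print("per-tensor error :", roundtrip_error(small, tensor_scale))
print("per-channel error:", roundtrip_error(small, channel_scales[0]))
```

With a shared scale, the small-range channel is rounded much more coarsely, which is one reason per-channel granularity often preserves accuracy better.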
2. YOLO Ultralytics deployment guide to STM32N6
This tutorial explains how to train a YOLO26n model using the ST Ultralytics fork, export it to ONNX format, quantize it using STM32AI Model Zoo Services, and deploy it on an STM32N6 board.
2.1. Train and Export in ST Ultralytics Fork
2.1.1. Clone ST Ultralytics fork
git clone https://github.com/stm32-hotspot/ultralytics.git
cd ultralytics
2.1.3. Install Ultralytics
Create a Python 3.12 environment:
python -m venv yolo-env
Activate the environment:
- On Windows:
yolo-env\Scripts\activate
- On Unix or MacOS:
source yolo-env/bin/activate
Then install the package in editable mode. This makes the yolo command line tool available anywhere in the environment and reflects any code changes without reinstalling.
pip install -e .
2.1.4. Go to YOLOv8-STEdgeAI example folder
cd examples/YOLOv8-STEdgeAI
2.1.5. Install ONNX dependencies for export and inference
pip install onnx==1.16.1 onnxruntime==1.20.1 tqdm
2.1.6. Download dataset
This tutorial uses a small example dataset based on the COCO 2017 validation dataset. To use a different dataset, update the paths in the next steps accordingly.
We provide a script to download and prepare the dataset for training and evaluation. Run the following command from the current directory:
python download_dataset.py
2.1.7. Convert dataset to YOLO format
As required by the current version of the training pipeline, we need to convert the COCO annotations to YOLO format.
Run the following command from the current directory:
python convert_dataset.py --coco_images_dir datasets/coco/images/val --coco_annotations_file datasets/coco/annotations/instances_val2017.json
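For orientation, the core of a COCO-to-YOLO conversion is a change of box convention: COCO stores boxes as `[x_min, y_min, width, height]` in pixels, while YOLO label files store `class x_center y_center width height` normalized to the image size. The sketch below shows only that arithmetic. The function names are hypothetical, and the actual convert_dataset.py script may handle more cases (class ID remapping, filtering, and so on).

```python
# Minimal sketch of the COCO -> YOLO box conversion; helper names are hypothetical.


def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert [x_min, y_min, width, height] in pixels to normalized
    (x_center, y_center, width, height)."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / img_w
    y_center = (y_min + box_h / 2) / img_h
    return (x_center, y_center, box_w / img_w, box_h / img_h)


def yolo_label_line(class_index, bbox, img_w, img_h):
    """Format one annotation as a line of a YOLO label file."""
    values = coco_bbox_to_yolo(bbox, img_w, img_h)
    return f"{class_index} " + " ".join(f"{v:.6f}" for v in values)


# A 100x60 box whose top-left corner is at (50, 20) in a 200x100 image.
print(yolo_label_line(0, [50, 20, 100, 60], 200, 100))
# -> 0 0.500000 0.500000 0.500000 0.600000
```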
2.1.8. Train model
This example trains a YOLO26n model for 3 epochs with an image size of 256. You can adjust both parameters as needed, but keep the image size consistent across the training, export, and evaluation steps.
yolo train model=yolo26n.pt data=dataset.yaml epochs=3 imgsz=256
2.1.9. Export to ONNX
Now we will export the trained model to ONNX format, which is compatible with STM32AI Model Zoo Services. Make sure to keep the opset version consistent with the one specified in the quantization configuration in the next steps.
Update the model path in the command below if the model is saved in a different location.
yolo export model=../../runs/detect/train/weights/best.pt format=onnx end2end=False imgsz=256 simplify=True opset=17
2.1.10. Evaluate exported ONNX model
This step verifies that the exported ONNX model retains good accuracy before you proceed with quantization and deployment. Update the model path in the command below if the model is saved in a different location, and keep the image size and dataset paths consistent with the previous steps.
yolo val task=detect model=../../runs/detect/train/weights/best.onnx imgsz=256 data=dataset.yaml
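Because the image size has to match across the train, export, and val commands, a small sanity check can catch a mismatch before it shows up as an accuracy drop. The sketch below simply parses the `key=value` arguments of the three commands used in this tutorial. The helper names are hypothetical and are not part of the Ultralytics CLI.

```python
import shlex

# Hypothetical helpers that compare key=value arguments across yolo commands.


def parse_yolo_args(command):
    """Return the key=value arguments of a `yolo` command as a dict."""
    args = {}
    for token in shlex.split(command):
        if "=" in token:
            key, value = token.split("=", 1)
            args[key] = value
    return args


def mismatched_setting(commands, key="imgsz"):
    """Return the distinct values of `key` if the commands disagree, else an empty set."""
    values = {parse_yolo_args(command).get(key) for command in commands}
    return values if len(values) > 1 else set()


commands = [
    "yolo train model=yolo26n.pt data=dataset.yaml epochs=3 imgsz=256",
    "yolo export model=../../runs/detect/train/weights/best.pt format=onnx "
    "end2end=False imgsz=256 simplify=True opset=17",
    "yolo val task=detect model=../../runs/detect/train/weights/best.onnx "
    "imgsz=256 data=dataset.yaml",
]

print(mismatched_setting(commands) or "imgsz is consistent")
```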
2.2. Quantize and Deploy with STM32AI Model Zoo Services
2.2.1. Clone STM32AI Model Zoo Services
For easier path management, clone the project in the same location as the ST Ultralytics fork. This is not mandatory: you can clone it anywhere and update the paths in the next steps accordingly.
git clone https://github.com/STMicroelectronics/stm32ai-modelzoo-services.git --depth 1
cd stm32ai-modelzoo-services
2.2.3. Initialize and update submodules
git submodule update --init --recursive
If this step fails, rerun it from the repository root and verify network/proxy access.
2.2.4. Create a dedicated environment and install requirements
conda create -n st_zoo python=3.12.9
conda activate st_zoo
pip install -r requirements.txt
cd object_detection
2.2.6. Update user_config.yaml to quantize and evaluate exported ONNX model
- Update the user_config.yaml file with the following content. Set model_path to the ONNX model exported in the previous steps, and point the dataset paths to the COCO validation images and the YOLO-format labels generated earlier:
operation_mode: chain_eqe

model:
  model_type: yolo26n
  model_path: tuto/ultralytics/runs/detect/train/weights/best.onnx

dataset:
  format: darknet_yolo
  dataset_name: darknet_yolo
  exclude_unlabeled: true
  download_data: false
  class_names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
                'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
                'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase',
                'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove',
                'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife',
                'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog',
                'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table',
                'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
                'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
  test_images_path: tuto/ultralytics/examples/YOLOv8-STEdgeAI/datasets/coco/images/val
  test_annotations_path: tuto/ultralytics/examples/YOLOv8-STEdgeAI/datasets/coco/labels/val
  quantization_path: tuto/ultralytics/examples/YOLOv8-STEdgeAI/datasets/coco/images/val
  quantization_split: 0.001

preprocessing:
  rescaling:
    scale: 1/255
    offset: 0
  resizing:
    aspect_ratio: fit
    interpolation: nearest
  color_mode: rgb

postprocessing:
  confidence_thresh: 0.001
  NMS_thresh: 0.5
  IoU_eval_thresh: 0.5
  plot_metrics: False  # Plot precision versus recall curves. Default is False.
  max_detection_boxes: 100

quantization:
  quantizer: onnx_quantizer
  target_opset: 17
  granularity: per_channel
  quantization_type: PTQ
  quantization_input_type: float
  quantization_output_type: float
  export_dir: quantized_models

mlflow:
  uri: ./tf/src/experiments_outputs/mlruns

hydra:
  run:
    dir: ./tf/src/experiments_outputs/${now:%Y_%m_%d_%H_%M_%S}
- Run the quantization and evaluation pipeline:
python stm32ai_main.py
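Two values in the configuration above lend themselves to a quick arithmetic check. The sketch below assumes that rescaling is applied as `pixel * scale + offset` and that quantization_split selects a plain fraction of the images under quantization_path; both are assumptions about the pipeline, and the function names and the image count of 5000 are hypothetical.

```python
from fractions import Fraction

# Hypothetical helpers; the real pipeline may round or sample differently.


def rescale(pixel, scale, offset):
    """Map a raw pixel value to the model input range."""
    return pixel * scale + offset


def calibration_count(num_images, split):
    """Approximate number of images used for calibration."""
    return round(num_images * split)


scale = float(Fraction("1/255"))  # the YAML value "1/255" as a float
offset = 0

print(rescale(0, scale, offset))      # darkest pixel maps to 0.0
print(rescale(255, scale, offset))    # brightest pixel maps to about 1.0
print(calibration_count(5000, 0.001))  # 5 images for a folder of 5000
```

Under these assumptions, a split of 0.001 yields a very small calibration set. If accuracy drops noticeably after quantization, increasing quantization_split is one setting worth trying.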
2.2.7. Deploy quantized ONNX model on STM32N6
- Update user_config.yaml as follows to deploy the ONNX model on the STM32N6, and set model_path to the quantized model generated in the previous step:
operation_mode: deployment

model:
  model_type: yolo26n
  model_path: tf/src/experiments_outputs/<%Y_%m_%d_%H_%M_%S>/quantized_models/best_quant_qdq_pc.onnx  # replace <...> with the timestamped folder created by the previous step

dataset:
  class_names: ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
                'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
                'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase',
                'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove',
                'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife',
                'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog',
                'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table',
                'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
                'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']

preprocessing:
  resizing:
    aspect_ratio: crop
    interpolation: nearest
  color_mode: rgb

postprocessing:
  confidence_thresh: 0.5
  NMS_thresh: 0.5
  IoU_eval_thresh: 0.5
  max_detection_boxes: 10

tools:
  stedgeai:
    optimization: balanced
    on_cloud: False
    path_to_stedgeai: C:/ST/STEdgeAI/4.0/Utilities/windows/stedgeai.exe
  path_to_cubeIDE: C:/ST/STM32CubeIDE_1.17.0/STM32CubeIDE/stm32cubeide.exe

deployment:
  c_project_path: ../application_code/object_detection/STM32N6/
  IDE: GCC
  verbosity: 1
  hardware_setup:
    serie: STM32N6
    board: STM32N6570-DK

mlflow:
  uri: ./tf/src/experiments_outputs/mlruns

hydra:
  run:
    dir: ./tf/src/experiments_outputs/${now:%Y_%m_%d_%H_%M_%S}
- Verify the board setup following this tutorial: https://github.com/STMicroelectronics/stm32ai-modelzoo-services/blob/main/object_detection/docs/README_DEPLOYMENT_STM32N6.md#3-deployment
- Run deployment:
python stm32ai_main.py
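Since each run writes to a new timestamped folder, the quantized model path changes every time. The sketch below is one way to locate the most recent quantized model to paste into model_path. It assumes the folder layout shown in the configuration (`<timestamp>/quantized_models/*.onnx`), and the helper name is hypothetical. The demo runs on a throwaway directory tree, not on real outputs.

```python
import tempfile
from pathlib import Path

# Hypothetical helper; assumes the <timestamp>/quantized_models/*.onnx layout.


def latest_quantized_model(outputs_root, pattern="*/quantized_models/*.onnx"):
    """Return the ONNX file from the newest timestamped run, or None if there is none."""
    root = Path(outputs_root)
    if not root.is_dir():
        return None
    # Folder names follow %Y_%m_%d_%H_%M_%S, so lexicographic order is chronological.
    candidates = sorted(root.glob(pattern), key=lambda p: p.parent.parent.name)
    return candidates[-1] if candidates else None


# Demo on a temporary tree that mimics two timestamped runs.
with tempfile.TemporaryDirectory() as tmp:
    for stamp in ("2025_01_01_10_00_00", "2025_01_02_09_30_00"):
        model_dir = Path(tmp) / stamp / "quantized_models"
        model_dir.mkdir(parents=True)
        (model_dir / "best_quant_qdq_pc.onnx").touch()
    newest = latest_quantized_model(tmp)
    demo_result = newest.relative_to(tmp).as_posix()

print(demo_result)
```

In the tutorial layout, the call would look like `latest_quantized_model("tf/src/experiments_outputs")`, run from the object_detection folder.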
2.3. Notes
- Replace placeholder paths (for example, model_path values) with your real local paths before running.
- Keep ONNX opset aligned between export and quantization settings.
- Use the same class order everywhere (training, evaluation, quantization, deployment).
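The last two notes can be checked mechanically before launching a run. The sketch below compares the export opset with the quantization target and the class lists of two configurations. It works on plain dictionaries that mirror the structure of user_config.yaml, and the function name is hypothetical; loading the YAML files is left out.

```python
# Hypothetical pre-flight check on dictionaries shaped like user_config.yaml.


def alignment_problems(export_opset, quant_cfg, deploy_cfg):
    """Return a list of human-readable mismatches (empty when everything lines up)."""
    problems = []
    target = quant_cfg["quantization"]["target_opset"]
    if target != export_opset:
        problems.append(f"opset mismatch: export={export_opset}, quantization={target}")
    # List comparison is order-sensitive, which is what the class order rule requires.
    if quant_cfg["dataset"]["class_names"] != deploy_cfg["dataset"]["class_names"]:
        problems.append("class_names differ between quantization and deployment configs")
    return problems


quant_cfg = {
    "quantization": {"target_opset": 17},
    "dataset": {"class_names": ["person", "bicycle", "car"]},
}
deploy_cfg = {"dataset": {"class_names": ["person", "bicycle", "car"]}}

print(alignment_problems(17, quant_cfg, deploy_cfg))  # -> []
```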