X-LINUX-AI - face recognition using TensorFlow Lite C++ API

Revision as of 17:47, 9 December 2020 by Registered User (→‎In pratice)

This article explains how to get started with the TensorFlow Lite[1] face recognition application.

1. Description

The face recognition application is capable of recognizing the face of a known (i.e. enrolled) user.

C/C++ TensorFlow Lite face recognition application

The application demonstrates a computer vision use case for face recognition where frames are grabbed from a camera input (/dev/videox) and processed by two neural network models (face detection and face recognition) interpreted by the TensorFlow Lite[1] framework.
A GStreamer pipeline is used to stream camera frames (using v4l2src), to display a preview (using waylandsink), and to execute neural network inference (using appsink).
The result of the inference is displayed in the preview. The overlay is done using GtkWidget with cairo.
This combination is quite simple and efficient in terms of CPU overhead.

1.1. Camera capture

The camera frame capture has the following characteristics:

  • Resolution is set to QVGA (320 x 240)
  • Pixel color format is set to RGB565

1.2. Frame Preprocessing

The main preprocessing stage in the face recognition application is the pixel color format conversion: the captured RGB565 frame is converted into an RGB888 frame.
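
The conversion described above can be sketched as follows. This is only an illustration and not the application's actual code (which is distributed in binary form): the function names are hypothetical, pixels are assumed to be stored as native-endian 16-bit words, and the bit-replication used to fill the low bits is one common choice among several.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand one RGB565 pixel (5 bits red, 6 bits green, 5 bits blue) into three
// 8-bit channels. The high bits of each channel are replicated into the low
// bits so that a full-scale 5-bit value (0x1F) maps to 0xFF rather than 0xF8.
inline void rgb565_to_rgb888_pixel(std::uint16_t px, std::uint8_t out[3])
{
    const std::uint8_t r5 = (px >> 11) & 0x1F;
    const std::uint8_t g6 = (px >> 5) & 0x3F;
    const std::uint8_t b5 = px & 0x1F;
    out[0] = static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2));
    out[1] = static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4));
    out[2] = static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2));
}

// Convert a whole frame of width * height RGB565 pixels (assumed layout:
// one native-endian 16-bit word per pixel) into interleaved RGB888 bytes.
std::vector<std::uint8_t> rgb565_to_rgb888(const std::vector<std::uint16_t>& src,
                                           std::size_t width, std::size_t height)
{
    assert(src.size() == width * height);
    std::vector<std::uint8_t> dst(src.size() * 3);
    for (std::size_t i = 0; i < src.size(); ++i)
        rgb565_to_rgb888_pixel(src[i], &dst[i * 3]);
    return dst;
}
```

For a QVGA frame, `rgb565_to_rgb888(frame, 320, 240)` would return 320 x 240 x 3 = 230400 bytes.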

1.3. Face Detection

The Face Detection block is in charge of finding the faces present in the input frame (QVGA, RGB888). In the current version of the application, the maximum number of faces that can be found is set to one. The output of this block is a frame of resolution 96 x 96 that contains the face found in the input captured frame.
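
To illustrate how a detected face region could be turned into the 96 x 96 output frame, here is a minimal sketch. The `FaceBox` structure and the `crop_and_resize` function are hypothetical, and nearest-neighbour sampling is an assumption made for brevity; the resampling method actually used by the application is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bounding box, in pixel coordinates of the source frame.
struct FaceBox {
    std::size_t x, y, width, height;
};

// Crop `box` out of an interleaved RGB888 frame and rescale it to
// out_size x out_size pixels using nearest-neighbour sampling.
std::vector<std::uint8_t> crop_and_resize(const std::vector<std::uint8_t>& frame,
                                          std::size_t frame_w, std::size_t frame_h,
                                          const FaceBox& box, std::size_t out_size = 96)
{
    assert(frame.size() == frame_w * frame_h * 3);
    assert(box.width > 0 && box.height > 0);
    assert(box.x + box.width <= frame_w && box.y + box.height <= frame_h);

    std::vector<std::uint8_t> out(out_size * out_size * 3);
    for (std::size_t oy = 0; oy < out_size; ++oy) {
        // Map the output row back to a source row inside the box.
        const std::size_t sy = box.y + oy * box.height / out_size;
        for (std::size_t ox = 0; ox < out_size; ++ox) {
            const std::size_t sx = box.x + ox * box.width / out_size;
            const std::uint8_t* src = &frame[(sy * frame_w + sx) * 3];
            std::uint8_t* dst = &out[(oy * out_size + ox) * 3];
            dst[0] = src[0];
            dst[1] = src[1];
            dst[2] = src[2];
        }
    }
    return out;
}
```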

1.4. Face Recognition

The Face Recognition block is in charge of extracting features and computing a signature (feature vector) corresponding to the input face.

1.5. Face Identification

The Face Identification block is in charge of computing the distance between:

  • The vector produced by the Face Recognition block, and
  • Each of the vectors stored in memory (and corresponding to the enrolled faces)

The Face Identification block generates the following two outputs:

  • a User Face ID corresponding to the minimum distance
  • a similarity score
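
The identification step can be sketched as a nearest-neighbour search over the enrolled signatures. The sketch below is not the application's implementation: the type and function names are hypothetical, the Euclidean distance is an assumed metric, and the mapping from distance to similarity (1 / (1 + distance)) is an arbitrary illustrative choice, since the article does not specify either.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical database entry: a user name and its enrolled feature vector.
struct EnrolledFace {
    std::string name;
    std::vector<float> signature;
};

// Hypothetical result of the identification step.
struct Identification {
    int user_id;       // index into the database, -1 if the database is empty
    float distance;    // smallest distance found
    float similarity;  // in (0, 1], higher means closer (assumed mapping)
};

// Euclidean distance between two feature vectors of equal length (assumed metric).
float euclidean_distance(const std::vector<float>& a, const std::vector<float>& b)
{
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Return the enrolled face with the minimum distance to `probe`.
Identification identify(const std::vector<float>& probe,
                        const std::vector<EnrolledFace>& db)
{
    Identification best{-1, std::numeric_limits<float>::infinity(), 0.0f};
    for (std::size_t i = 0; i < db.size(); ++i) {
        const float d = euclidean_distance(probe, db[i].signature);
        if (d < best.distance) {
            best.user_id = static_cast<int>(i);
            best.distance = d;
        }
    }
    if (best.user_id >= 0)
        best.similarity = 1.0f / (1.0f + best.distance);  // assumed mapping
    return best;
}
```

A linear scan is sufficient at this scale: the database holds at most a few hundred faces by default (see the --max_db_faces option).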

2. Installation

Information
The application is only available in binary format from the AI OpenSTLinux package repository.

Please contact your local STMicroelectronics support for more information about this application.

2.1. Install from the OpenSTLinux AI package repository

Warning
The software package is provided AS IS, and by downloading it, you agree to be bound by the terms of the software license agreement (SLA). The detailed content licenses can be found here.

After configuring the AI OpenSTLinux package repository, you can install the X-LINUX-AI components for this application:

 apt-get install tflite-cv-apps-face-recognition-c++

Then restart the demo launcher:

 systemctl restart weston@root

3. How to use the application

3.1. Launching via the demo launcher

Figure: launching the C++ TensorFlow Lite face recognition application from the demo launcher

3.2. Executing with the command line

The facereco_tfl_gst_gtk C/C++ application is located in the userfs partition:

/usr/local/demo-ai/computer-vision/tflite-face-recognition/bin/facereco_tfl_gst_gtk

It accepts the following input parameters:

 
Usage: ./facereco_tfl_gst_gtk [option]

--reco_simultaneous_faces <val>:      number of faces that could be recognized simultaneously (default is 1)
--reco_threshold <val>:               face recognition threshold for face similarity (default is 0.70 = 70%)
--max_db_faces <val>:                 max number of faces to be stored in the data base (default is 200)
-i --image <directory path>:          image directory with image to be classified
-v --video_device <n>:                video device (default /dev/video0)
--frame_width  <val>:                 width of the camera frame (default is 640)
--frame_height <val>:                 height of the camera frame (default is 480)
--framerate <val>:                    framerate of the camera (default is 15fps)
--verbose:                            enable verbose mode
--help:                               show this help


  • Launch face recognition based on the pictures located in the /usr/local/demo-ai/computer-vision/tflite-face-recognition/testdata directory:
 /usr/local/demo-ai/computer-vision/tflite-face-recognition/bin/launch_bin_facereco_tfl_model_testdata.sh
Information
Note that you need to populate the testdata directory with your own data sets.

The pictures are then read in random order from the testdata directory.

3.3. In practice

As soon as a face is detected within the camera captured frame, a rectangle box is drawn around it.

If the system is not able to match the detected face with one of the enrolled faces (either because the user's face is not yet enrolled or because the Face Identification similarity score is lower than the recognition threshold), the rectangle box is drawn in red and labeled with an unknown identity.

Unknown person

To enroll a new user, simply touch (or click) inside the red rectangle. The virtual keyboard is then displayed to enter the user name. To finish the enrollment process, press the return key. Note that the face picture is captured when the red rectangle is touched (or clicked).

Virtual keyboard displayed after touching an unknown face

If the system is able to match the detected face with one of the enrolled faces, the rectangle box is drawn in green and the registered user name is displayed with the similarity score expressed as a percentage (%). The thumbnail (representing the user's enrolled face matching the detected face) is displayed and highlighted with a green rectangle in the banner located at the bottom of the preview. This banner displays thumbnails that reflect the history of previous detections.

user2 is detected
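
The red/green decision described in this section can be summarized by the short sketch below. It is an illustration under stated assumptions rather than the application's code: the `Overlay` type, the `make_overlay` function, and the exact label format are hypothetical. The default threshold of 0.70 is taken from the --reco_threshold option.

```cpp
#include <cassert>
#include <cmath>
#include <string>

enum class BoxColor { Red, Green };

// Hypothetical description of what is drawn around a detected face.
struct Overlay {
    BoxColor color;
    std::string label;
};

// `matched` is false when no enrolled face is available for comparison.
// `similarity` is expected in the range [0, 1].
Overlay make_overlay(bool matched, const std::string& name,
                     float similarity, float threshold = 0.70f)
{
    assert(threshold >= 0.0f && threshold <= 1.0f);
    // Not enrolled, or score below the recognition threshold: unknown person.
    if (!matched || similarity < threshold)
        return {BoxColor::Red, "unknown"};
    // Known person: user name followed by the score as a percentage.
    const long percent = std::lround(similarity * 100.0f);
    return {BoxColor::Green, name + " " + std::to_string(percent) + "%"};
}
```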

In nominal mode, the frame rate in FPS (frames per second) is displayed at the top of the right banner. This value corresponds to the number of frames that the system can process in one second. As a result, the FPS is directly linked to the following timings:

  • Duration of the Face Detection execution
  • Duration of the Face Recognition execution

The camera frame acquisition is performed in parallel with the execution of the Face Detection and Face Recognition algorithms.
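
As a rough worked example of this relationship, the sketch below estimates the frame rate from the two durations. It assumes that detection and recognition run one after the other for each frame and that acquisition adds no time because it runs in parallel; the `estimate_fps` function is hypothetical and the timings in the usage note are made up for illustration.

```cpp
#include <cassert>
#include <chrono>

// Estimate the achievable frame rate when each frame costs one face
// detection plus one face recognition, executed sequentially (assumption).
double estimate_fps(std::chrono::duration<double> detection,
                    std::chrono::duration<double> recognition)
{
    const double seconds_per_frame = detection.count() + recognition.count();
    assert(seconds_per_frame > 0.0);
    return 1.0 / seconds_per_frame;
}
```

With these assumptions, a 40 ms detection followed by a 60 ms recognition gives `estimate_fps(std::chrono::milliseconds(40), std::chrono::milliseconds(60))`, that is, about 10 frames per second.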


As the enrollment database is stored in the file system, it persists across resets.

4. References


