To meet the facial recognition requirements of the project, we use the face_recognition Python library, which is built on dlib’s deep-learning-based face recognition. The library reports an accuracy of 99.38% on the Labeled Faces in the Wild benchmark. It offers two models that can be used to identify faces: a Convolutional Neural Network (CNN) model and a Histogram of Oriented Gradients (HOG) model.

To test the two models, we compare known pictures of various people, which serve as our database, against unknown inputs. We use regular .jpg images of ten people as our known pictures and videos of those same ten people as our unknown inputs. Each input video is of comparable length and includes only the person shown in the corresponding known image. The group of people is made up of the four group members and six notable people from society:

Known Images Used in Model Comparison

To compare the two models, we measure three metrics: the distance from the known photo to the unknown input, accuracy, and run time per frame. The distance between a known and an unknown input is the Frobenius norm of the difference between the numpy ndarray encoding of the known input and the numpy ndarray encoding of the unknown input. For one-dimensional encoding vectors this is the same as the Euclidean distance, so a smaller distance indicates a closer match.
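The distance computation described above can be sketched as follows. This is a minimal illustration, not the project's actual script: the helper name `encoding_distance` is hypothetical, and the two vectors are toy stand-ins for real face encodings (the 128-element length is assumed here to mirror the encodings produced by the library).

```python
import numpy as np


def encoding_distance(known_encoding, unknown_encoding):
    """Return the norm of the difference between two encodings.

    For 1-D vectors, np.linalg.norm defaults to the Euclidean (2-)norm,
    which coincides with the Frobenius norm of the difference.
    """
    known = np.asarray(known_encoding, dtype=float)
    unknown = np.asarray(unknown_encoding, dtype=float)
    return float(np.linalg.norm(known - unknown))


# Toy vectors (assumption): stand-ins for real face encodings.
known = np.zeros(128)
unknown = np.zeros(128)
unknown[0], unknown[1] = 0.3, 0.4

distance = encoding_distance(known, unknown)  # sqrt(0.3**2 + 0.4**2)
```

In a real run, the two arguments would be the encoding of the known picture and the encoding extracted from one frame of the unknown video.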

Accuracy is defined as the percentage of correct match predictions made on the unknown video inputs. A match prediction depends on the match threshold of the system: the maximum distance at which an input frame from the unknown video is still considered a match with the known image. A frame whose distance is at or below the threshold is predicted to be a match, and a frame above it is not. The recommended match threshold is 0.50. The threshold can be raised to 0.60; however, this loosens the match criterion and results in less accurate match predictions.
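As a rough sketch of how accuracy could be computed for a given threshold, the example below applies the rule "distance at or below the threshold means match" to a list of per-frame distances. The function name `match_accuracy`, the input layout, and the sample distances are all assumptions made up for illustration; they are not measurements from this project.

```python
def match_accuracy(frames, threshold):
    """Percentage of frames whose match prediction is correct.

    frames: list of (distance, expected_match) pairs, where
    expected_match is True when the frame really shows the known person.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    correct = 0
    for distance, expected_match in frames:
        predicted_match = distance <= threshold
        if predicted_match == expected_match:
            correct += 1
    return 100.0 * correct / len(frames)


# Made-up distances (assumption), chosen only to exercise the function.
sample_frames = [
    (0.42, True),
    (0.47, True),
    (0.49, True),
    (0.58, False),
    (0.71, False),
]

accuracy_at_050 = match_accuracy(sample_frames, threshold=0.50)
accuracy_at_060 = match_accuracy(sample_frames, threshold=0.60)
```

With these illustrative inputs, the frame at distance 0.58 is correctly rejected at a threshold of 0.50 but wrongly accepted at 0.60, which is the kind of false match a looser threshold allows.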

The run time per frame metric tracks the computational cost of each model and will be a critical factor in deciding which model to use in the final system. We measure it by adding a timer to the Python script that compares the known picture with the unknown video input. The timer resets after each frame is processed, so every frame gets its own measurement.
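One way such a per-frame timer might look is sketched below. The names `time_per_frame` and `fake_compare` are hypothetical, and the placeholder work done per frame stands in for the actual face comparison, which is not shown in this report.

```python
import time


def time_per_frame(frames, process_frame):
    """Run process_frame on each frame and return the run times in seconds."""
    run_times = []
    for frame in frames:
        start = time.perf_counter()  # timer restarts for every frame
        process_frame(frame)
        run_times.append(time.perf_counter() - start)
    return run_times


def fake_compare(frame):
    """Placeholder (assumption) for the known-vs-unknown comparison."""
    return sum(frame)


# Five dummy "frames"; real frames would be images read from the video.
frames = [[1, 2, 3] for _ in range(5)]
run_times = time_per_frame(frames, fake_compare)
average_run_time = sum(run_times) / len(run_times)
```

`time.perf_counter` is used here because it is a monotonic, high-resolution clock intended for measuring short durations; averaging the per-frame values gives one figure per video that can be compared between the HOG and CNN models.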

Facial Recognition Model Comparison Algorithm Flowchart

Results

Average Distance Between Known Picture and Unknown Video Input Frame

Run Time per Frame (in seconds) for Each Video Input

Match Accuracy by Threshold for HOG Model

Match Accuracy by Threshold for CNN Model

Result of Successful Image Match (i.e., the person is authorized to enter the facility)

Result of No Image Match