8 different ways to detect faces in images

Over years of building A.I. algorithms that revolve around humans,
I have needed to detect faces as the first step.
Many packages now offer face detection, and although they may look the same, the differences between them matter for different applications. Some are slower than the others but tolerate occlusion or noise, some are slow with no apparent benefit, and some are fast but only work on relatively noise-free images. Some can only detect frontal faces. Some are CPU based and others use the GPU, which means not all of them can run on every embedded system, the Raspberry Pi for example. Sometimes you only have the option of using one particular package.
So the right choice depends on your application.
One seemingly minor factor that can affect detection significantly is the image scale.
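To make the point about scale concrete, here is a minimal sketch of one common way to handle it: shrink the image before detection and map the boxes back afterwards. The function names and the 640 pixel limit are my own assumptions for illustration, not something any of the packages below requires.

```python
def fit_scale(width, height, max_side=640):
    """Factor that shrinks the longer side to max_side (never upscales)."""
    return min(1.0, max_side / float(max(width, height)))


def scaled_size(width, height, scale):
    """Integer (width, height) to resize the image to before detection."""
    return max(1, round(width * scale)), max(1, round(height * scale))


def box_to_original(box, scale):
    """Map an (x, y, w, h) box found on the resized image back to the original."""
    return tuple(round(value / scale) for value in box)
```

Running the detector at a few different values of max_side on the same image is a quick way to see how much scale changes the number of faces found.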

                                             MTCNN
Although it was introduced in 2016, MTCNN is easily still one of the best.
MTCNN is a package that implements the well-known MTCNN (Multi-task Cascaded Convolutional Networks) face detection paper and exposes functions that make detecting faces extremely easy.
MTCNN requires you to install keras, opencv and tensorflow.
MTCNN can run on the CPU, which is slow and inconvenient for real-time applications, but if you install tensorflow-gpu it performs very well.
The main hurdle when trying to use the GPU, assuming you have an NVIDIA card, is installing the CUDA library.
MTCNN works well for most face orientations, not just frontal faces.
It returns a list of dictionaries, one per face, containing the confidence, the bounding box, and one (x, y) keypoint for each of nose, right_eye, left_eye, mouth_right and mouth_left.
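A minimal sketch of calling MTCNN and reading its output could look like the following. It assumes the mtcnn and opencv-python packages are installed; the helper names and the 0.9 confidence threshold are hypothetical choices of mine. The imports sit inside the function so the filtering helper can be used on its own.

```python
def detect_with_mtcnn(image_path):
    """Run MTCNN on an image file (needs mtcnn and opencv-python)."""
    import cv2
    from mtcnn import MTCNN

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    # OpenCV loads images as BGR, while MTCNN expects RGB.
    rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    return MTCNN().detect_faces(rgb)


def confident_faces(detections, min_confidence=0.9):
    """Keep confident detections as (x1, y1, x2, y2, keypoints) tuples."""
    faces = []
    for det in detections:
        if det["confidence"] < min_confidence:
            continue
        x, y, w, h = det["box"]  # box is [x, y, width, height]
        # The box can start slightly outside the image, so clamp the corner.
        faces.append((max(0, x), max(0, y), x + w, y + h, det["keypoints"]))
    return faces
```

Typical use would be confident_faces(detect_with_mtcnn("people.jpg")), where "people.jpg" stands for any image of your own.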

                                   Cascade classifier
OpenCV offers a Haar feature-based cascade classifier.
Haar feature selection iterates over locations in the image and considers adjacent rectangles at each location. The pixel intensities within each rectangle are summed, and the differences between those sums are used as features to classify the region.
For example,
imagine we take our location as the nose: we will find that the bridge of the nose has a higher average pixel intensity than the two eyes, and that the eyes have a lower pixel intensity than the cheeks.
Those are two discovered features.
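The eyes-versus-cheeks comparison can be sketched in a few lines of plain Python. This is only an illustration of a single two-rectangle feature computed with an integral image (summed-area table), not OpenCV's actual implementation, and the function names are my own. With equal-sized rectangles, comparing sums is the same as comparing averages.

```python
def integral_image(gray):
    """Summed-area table with an extra zero row and column."""
    rows, cols = len(gray), len(gray[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += gray[r][c]
            table[r + 1][c + 1] = table[r][c + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y) and size w x h."""
    return (table[y + h][x + w] - table[y][x + w]
            - table[y + h][x] + table[y][x])


def two_band_feature(table, x, y, w, h):
    """Lower half minus upper half; positive when the upper band is darker."""
    half = h // 2
    upper = rect_sum(table, x, y, w, half)
    lower = rect_sum(table, x, y + half, w, half)
    return lower - upper


# Toy 4x4 patch: a dark "eye" band on top of a bright "cheek" band.
patch = [[10] * 4, [10] * 4, [200] * 4, [200] * 4]
feature = two_band_feature(integral_image(patch), 0, 0, 4, 4)
```

The integral image is what makes this cheap: every rectangle sum costs four lookups no matter how large the rectangle is.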
It can be quite fast and light, and it is multithreaded.
OpenCV is generally CPU oriented.
It needs an XML file containing the trained features of the target to search for, such as a face; you can find and download the one you want from GitHub.
It returns only the bounding boxes of the faces.
haarcascade_frontalface_default.xml detects more faces than haarcascade_frontalface_alt.xml, but it also produced significantly more false detections.
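Here is a rough sketch of running a cascade and comparing the output of two cascade files. It assumes the opencv-python package, which ships the XML files in the folder exposed as cv2.data.haarcascades. The scaleFactor, minNeighbors and minSize values, the helper names and the 0.5 overlap threshold are illustrative assumptions.

```python
def detect_faces_haar(image_path, cascade_file="haarcascade_frontalface_default.xml"):
    """Return a list of (x, y, w, h) face boxes (needs opencv-python)."""
    import cv2

    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + cascade_file)
    if cascade.empty():
        raise IOError("could not load " + cascade_file)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                     minSize=(30, 30))
    return [tuple(int(v) for v in box) for box in boxes]


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    inter_w = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def only_in_first(boxes_a, boxes_b, threshold=0.5):
    """Boxes from boxes_a that no box in boxes_b overlaps well enough."""
    return [a for a in boxes_a
            if all(iou(a, b) < threshold for b in boxes_b)]
```

Passing the boxes from the default cascade as boxes_a and those from the alt cascade as boxes_b lists the extra detections, which are the first ones worth checking by eye for false faces.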

                                                 Caffemodels
Many libraries can handle Caffe models.
I used OpenCV to read several pre-trained Caffe models.
Obviously speed and accuracy are governed by the Caffe model itself, but a quick trick for improving speed is to batch several images with cv2.dnn.blobFromImages.
It's fast and light, and it is multithreaded.
It's CPU based, but GPU support has recently been implemented.
It returns the bounding boxes of the faces along with a confidence value for each.
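As a sketch of what this looks like in practice, the code below loads a Caffe model with OpenCV's dnn module and decodes the output. It assumes an SSD-style face model such as the res10 300x300 one from OpenCV's samples, so the 300x300 input size, the mean values and the output row layout are assumptions tied to that model; other Caffe models will need different settings. The function names are mine.

```python
def decode_ssd(rows, width, height, threshold=0.5):
    """Turn SSD output rows into (x1, y1, x2, y2, confidence) tuples.

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2] with the
    coordinates normalised to the 0..1 range.
    """
    faces = []
    for row in rows:
        confidence = float(row[2])
        if confidence < threshold:
            continue
        x1, y1 = int(row[3] * width), int(row[4] * height)
        x2, y2 = int(row[5] * width), int(row[6] * height)
        faces.append((x1, y1, x2, y2, confidence))
    return faces


def detect_faces_caffe(image_path, prototxt, caffemodel, threshold=0.5):
    """Run a Caffe SSD face model through cv2.dnn (needs opencv-python)."""
    import cv2

    net = cv2.dnn.readNetFromCaffe(prototxt, caffemodel)
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)
    height, width = image.shape[:2]
    # Input size and mean values are assumptions for the res10 SSD model.
    blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()  # shape (1, 1, N, 7) for this kind of model
    return decode_ssd(detections[0, 0], width, height, threshold)
```

cv2.dnn.blobFromImages takes a list of images instead of a single one, so several frames can go through one forward pass.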
                                                   
                                                     Dlib
We can use the function get_frontal_face_detector, which detects faces using HOG features and an SVM.
It's the easiest when it comes to loading, detecting, cropping and plotting faces.
It's reasonably fast on the CPU but not suitable for real-time applications.
It's single threaded.
It's fast when compiled to run on the GPU.
Although it is called a frontal face detector, it can actually detect side views as well, and its accuracy is good and comparable to the others.
It returns a bounding box for each detected face.
It also features a shape predictor that returns 68 landmark points per face, covering the jawline, eyebrows, eyes, nose and mouth.
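A minimal sketch of the Dlib workflow, assuming the dlib package is installed. The landmark model file (commonly distributed as shape_predictor_68_face_landmarks.dat) has to be downloaded separately, so its path is passed in by the caller; the helper names are hypothetical.

```python
def rect_to_bbox(rect):
    """Convert a dlib rectangle to an (x, y, width, height) tuple."""
    return (rect.left(), rect.top(),
            rect.right() - rect.left(), rect.bottom() - rect.top())


def detect_faces_dlib(image_path, predictor_path=None, upsample=1):
    """Return [(bbox, landmarks), ...] using the HOG + SVM detector (needs dlib)."""
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(predictor_path) if predictor_path else None
    image = dlib.load_rgb_image(image_path)

    results = []
    # The second argument upsamples the image to help find smaller faces.
    for rect in detector(image, upsample):
        landmarks = []
        if predictor is not None:
            shape = predictor(image, rect)
            landmarks = [(shape.part(i).x, shape.part(i).y)
                         for i in range(shape.num_parts)]
        results.append((rect_to_bbox(rect), landmarks))
    return results
```

Without a predictor_path the landmark list simply stays empty and only the bounding boxes are returned.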
                 
                                                   Dlib.cnn
Dlib also features a CNN-based function that uses a trained .dat model file to detect objects.
It's quite easy to load and to handle its input and output.
It's not fast on the CPU and it uses a lot of memory, with a peak of 12 GB of RAM for a 2122 x 1415 image.
It's single threaded.
Using the models available on the internet, it detects the most faces of all the methods above.
It returns a bounding box for each detected face.
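For comparison with the HOG detector, here is a sketch of the CNN variant. It assumes the dlib package and a downloaded model file; mmod_human_face_detector.dat is the commonly used face model, and I am assuming that is the kind of .dat file meant here. Each detection carries a rect and a confidence value. The helper names are mine.

```python
def mmod_to_boxes(detections, min_confidence=0.0):
    """Convert CNN detections to (left, top, right, bottom, confidence)."""
    boxes = []
    for det in detections:
        if det.confidence < min_confidence:
            continue
        r = det.rect
        boxes.append((r.left(), r.top(), r.right(), r.bottom(),
                      float(det.confidence)))
    return boxes


def detect_faces_dlib_cnn(image_path, model_path="mmod_human_face_detector.dat",
                          upsample=0):
    """Run dlib's CNN face detector on an image file (needs dlib)."""
    import dlib

    cnn = dlib.cnn_face_detection_model_v1(model_path)
    image = dlib.load_rgb_image(image_path)
    # Upsampling finds smaller faces but increases memory use further.
    return mmod_to_boxes(cnn(image, upsample))
```

Given the memory figures above, shrinking large images before detection and keeping upsample at 0 is a sensible starting point.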

                                                     Faced
It's a free package that uses TensorFlow and a YOLO-based model.
YOLO is one of the fastest models to use when detecting objects.

                                                    Tensorflow
TensorFlow is a very powerful library with many options.
Speed and precision depend on the model used.
It's usually not suitable for real-time applications when run on the CPU, although TensorFlow Lite exists for lighter deployments.
It can be a bit hard and counterintuitive for simple applications, but it makes many complex jobs rather simple for professional applications.
The data it returns depends on the model and is quite flexible.

                                                    Keras
It's an API that runs on top of TensorFlow, CNTK or Theano.
It makes TensorFlow a bit more user friendly without taxing performance.
It's quite extensible.
