Building a Gaze Estimator For Full Face Images Using ResNet-50

3 min read · Sep 11, 2021

1. Introduction

Gaze estimation is the task of predicting where a person is looking from an image of their face or eyes. In this article we use full-face images. Both head pose angles and eye gaze angles determine where a person is looking; here we focus on the eye gaze and predict its pitch and yaw angles (please see the animation in the link).

2. Datasets Used

Cross-dataset evaluation is performed to test generalizability: the model is trained on one dataset and tested on a different one.

Training: GazeCapture Dataset: A large-scale eye-tracking dataset of almost 2.5 million images from over 1,450 people.

Testing: MPIIGaze Dataset: Contains 213,659 images collected from 15 participants during natural everyday laptop use over 3 months.

3. Dataset Preprocessing

For each image, face detection and facial landmark localization are performed, followed by data normalization to yield the processed image. The following pipeline is used for this:

GitHub - swook/faze_preprocess: Preprocessing pipeline for the MPIIGaze and GazeCapture datasets…

This is a repository for the code used to generate input data for the ICCV 2019 oral paper: Few-Shot Adaptive Gaze…

github.com

4. Model

A residual neural network (ResNet-50) is used to perform the prediction. The architecture of the model is linked below.

GazeRedirectionEstimation/gazeheadResnet.py at main · codeastra2/GazeRedirectionEstimation

Gaze Estimation Pipeline for Full Face Images. Contribute to codeastra2/GazeRedirectionEstimation development by…

github.com

5. Training/Evaluation Methodology

Batch Processing: Given that each image is 128x128, training needs to be done in batches. A fixed number of images (say 10) is chosen randomly from each person until the total number of training images is reached. Training is then performed using dataloaders with a batch size of 128.
Losses: The predicted and ground-truth pitch and yaw angles are each converted to 3D direction vectors, and the cosine similarity between the two vectors is used to calculate the angular loss. The L1 loss is used to compute the gradients for backpropagation and update the weights.
Optimizers, Schedulers: The Adam optimizer is used to update the weights from the backpropagated gradients. The learning rate is 0.0003 and the beta values are 0.9 and 0.95.
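The sampling and optimization steps described above might be sketched as follows. This is an illustration under assumptions, not the code from the linked notebook: sample_indices and train_one_epoch are hypothetical helper names, the data is synthetic, and a tiny linear model stands in for the ResNet-50 so the example runs quickly. Only the batch size (128), the L1 loss, the learning rate (0.0003) and the beta values (0.9, 0.95) come from the article.

```python
import random

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def sample_indices(person_to_indices, per_person=10, total=20000, seed=0):
    """Take `per_person` random images from each person in turn, repeating
    until `total` indices are collected or every person is exhausted."""
    rng = random.Random(seed)
    # Shuffle each person's indices once so that slices are random draws.
    pools = {p: rng.sample(idx, len(idx)) for p, idx in person_to_indices.items()}
    chosen = []
    while len(chosen) < total and any(pools.values()):
        for pool in pools.values():
            chosen.extend(pool[:per_person])
            del pool[:per_person]
            if len(chosen) >= total:
                break
    return chosen[:total]


def train_one_epoch(model, loader, optimizer):
    """Run one pass over the loader with L1 loss; return the mean batch loss."""
    criterion = nn.L1Loss()
    model.train()
    total_loss = 0.0
    for images, gaze in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), gaze)  # gaze: (N, 2) pitch/yaw targets
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)


if __name__ == "__main__":
    torch.manual_seed(0)
    # Synthetic stand-in data: 3 hypothetical people with 40 images each.
    people = {p: list(range(p * 40, (p + 1) * 40)) for p in range(3)}
    idx = sample_indices(people, per_person=10, total=60)
    images = torch.randn(120, 3, 128, 128)[idx]
    labels = torch.zeros(len(idx), 2)  # placeholder pitch/yaw labels
    loader = DataLoader(TensorDataset(images, labels), batch_size=128, shuffle=True)

    # Tiny stand-in model; the article uses a ResNet-50 here.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, 2))
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4, betas=(0.9, 0.95))
    print(train_one_epoch(model, loader, optimizer))
```

Cycling over people in fixed-size draws keeps the training subset balanced across subjects, which matters when some people contribute far more images than others.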

GazeRedirectionEstimation/train_gaze_estimator.ipynb at main · codeastra2/GazeRedirectionEstimation

Gaze Estimation Pipeline for Full Face Images. Contribute to codeastra2/GazeRedirectionEstimation development by…

github.com

6. Results

After training on a total of 20,000 images, the model achieves an angular error of 6 degrees. Here is a comparison of the estimator's performance when trained on different numbers of images, and also with data augmentation.

Estimator Performance comparison for different number of images
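For readers who want to reproduce an angular error figure like the one reported here, the metric can be sketched in plain Python. The pitch/yaw-to-vector convention below (zero pitch and yaw pointing along +z, angles in radians) is an assumption that is common in gaze estimation code, and the function names are hypothetical; check the convention used by your preprocessing pipeline before relying on it.

```python
import math


def pitchyaw_to_vector(pitch, yaw):
    """Unit gaze direction for pitch/yaw in radians.

    Assumed convention: pitch = yaw = 0 points along +z.
    """
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


def angular_error_deg(pred, target):
    """Angle in degrees between two gaze directions given as (pitch, yaw)."""
    a = pitchyaw_to_vector(*pred)
    b = pitchyaw_to_vector(*target)
    # Both vectors have unit length, so the dot product is the cosine similarity.
    cos_sim = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_sim = max(-1.0, min(1.0, cos_sim))
    return math.degrees(math.acos(cos_sim))


def mean_angular_error_deg(preds, targets):
    """Average angular error over paired lists of (pitch, yaw) tuples."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    preds = [(0.0, 0.0), (math.radians(10), 0.0)]
    targets = [(0.0, math.radians(6)), (0.0, 0.0)]
    print(mean_angular_error_deg(preds, targets))  # approximately 8.0
```

Averaging this value over the test set gives a single number in degrees that can be compared across training set sizes.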

All the required code and a pretrained model can be found on the GitHub project page!

GitHub - codeastra2/GazeRedirectionEstimation: Gaze Estimation Pipeline for Full Face Images.

This repository contains the implementation to develop a gaze estimator for 128x128 full face images. It also contains…

github.com

7. References

https://www.researchgate.net/publication/344878253_Self-Learning_Transformations_for_Improving_Gaze_and_Head_Redirection

Written by srinivas kumar r
