Face Betrays Your Age

Xin Wang, Yanan Li, Yongfu Lou

Abstract

Although it has received relatively little research attention, age estimation has a wide variety of applications, including access control, human-machine interaction, person identification, and data mining and organization. The human face is the most direct reflection of age: as a person ages, both the shape and the texture of the face change.

This project estimates human age from face images. It was split into four stages: dataset acquisition and preprocessing, shape and texture feature extraction, model training and age prediction, and comparison of the prediction errors of the different models.

Dataset

The database we used is FG-NET (Face and Gesture Recognition Research Network), which was built by the European Union project of the same name. This dataset contains 1002 images of 82 subjects whose ages range from 0 to 69. Each image was manually annotated with 68 landmark points located on the face, and a corresponding points file is available for each image. Figure 1 shows a sample image from the dataset.

Figure 1. Image sample from the dataset

Data Preprocessing

Phase 1. Filtering

To simplify processing and improve the final prediction accuracy, we kept only the images in which the subject roughly faces the camera. Images like the one in Figure 2 were excluded. In the end, 617 of the 1002 images in the dataset met this criterion.

Figure 2. Image sample not chosen

Phase 2. Graying

Because facial texture is an important feature for predicting age, we converted all the images to gray-scale.

Phase 3. Rotation

Each image was rotated until the two eyes lay on a horizontal line.
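The rotation step can be sketched for the landmark coordinates as follows. This is a minimal illustration, not the project's dataPreprocess.m: the function name and the choice of rotating about the midpoint between the eyes are assumptions, and the eye positions are passed in explicitly because the text does not say which of the 68 landmarks mark the eye centres. The image itself would need the same rotation applied.

```python
import math

def rotate_to_level_eyes(points, left_eye, right_eye):
    """Rotate 2-D points about the eye midpoint so that the eye line is horizontal.

    points, left_eye and right_eye are (x, y) tuples; the rotation centre
    (the midpoint between the eyes) is an assumption made for this sketch.
    """
    (lx, ly), (rx, ry) = left_eye, right_eye
    angle = math.atan2(ry - ly, rx - lx)          # current tilt of the eye line
    cx, cy = (lx + rx) / 2.0, (ly + ry) / 2.0     # rotation centre
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + dx * cos_a - dy * sin_a,
                        cy + dx * sin_a + dy * cos_a))
    return rotated

# Example: eyes tilted by 45 degrees end up at the same height.
level_eyes = rotate_to_level_eyes([(0.0, 0.0), (2.0, 2.0)], (0.0, 0.0), (2.0, 2.0))
```

Because the transform is a pure rotation, the distance between the eyes is unchanged, so the later resizing step is unaffected by it.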

Phase 4. Resizing

We resized all the images to a common scale. We went through all the coordinate files to find the shortest distance D between the two eyes over all images, and then resized each image so that its eye distance equals D.
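As a small sketch of this resizing rule, the code below computes D and the per-image scale factor D / d_i, where d_i is the eye distance in image i. The function names are hypothetical, and only landmark coordinates are scaled here; resampling the pixels is left out.

```python
import math

def eye_distance(left_eye, right_eye):
    """Euclidean distance between two (x, y) eye positions."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])

def scale_factors(eye_pairs):
    """Return (D, factors): the shortest eye distance and one scale factor per image."""
    distances = [eye_distance(left, right) for left, right in eye_pairs]
    d_min = min(distances)
    return d_min, [d_min / d for d in distances]

def scale_points(points, factor):
    """Scale landmark coordinates by the factor computed for their image."""
    return [(x * factor, y * factor) for x, y in points]

# Three made-up images with eye distances 4, 8 and 5 pixels.
pairs = [((0.0, 0.0), (4.0, 0.0)), ((0.0, 0.0), (0.0, 8.0)), ((1.0, 1.0), (4.0, 5.0))]
D, factors = scale_factors(pairs)
```

Using the shortest distance as the target means every image is only ever shrunk, never enlarged, so no image has to be upsampled.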

The whole data preprocessing procedure is shown in Figure 3.

Figure 3. Procedure of data preprocessing

Feature Extraction

1. Shape feature vectors obtained from the Shape Model

A. After the data preprocessing stage, we have 617 images and the corresponding coordinate files, each recording the positions of 68 points on the face.

B. We aligned the shape vectors further by applying a depth-angle modification so that every subject faces the camera directly. Figure 4 shows the procedure.

After this step, each shape vector collects the x and y coordinates of the 68 landmarks, giving 136 entries per face.

C. We reduced the dimension of the shape vectors with Principal Component Analysis (PCA) to speed up the later model training.

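PCA is applied to the shape vectors by first computing the mean shape, where x_i is the i-th training shape vector and N is the number of training shapes:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$$

and the covariance matrix:

$$S = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T$$

We then compute the eigenvalues and eigenvectors of the covariance matrix, $S p_k = \lambda_k p_k$, where $\lambda_k$ denotes the eigenvalues, listed in descending order ($\lambda_k \geq \lambda_{k+1}$).

We choose the t largest eigenvalues, which together explain 98% of the variance in the training shapes:

$$\sum_{k=1}^{t} \lambda_k \geq 0.98 \sum_{k} \lambda_k$$

This gives the eigenvalues $\lambda_1, \ldots, \lambda_t$ and the corresponding eigenvectors $P = (p_1, \ldots, p_t)$.

After PCA, each shape vector can be written as a linear combination of $\bar{x}$ and $P$:

$$x \approx \bar{x} + P b$$

In the above formula, $b = (b_1, \ldots, b_t)^T = P^T (x - \bar{x})$, which are the shape parameters and the coefficients of the first t modes.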
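The PCA step can be illustrated with the following NumPy sketch. It is not the project's PCA.m: the function names are hypothetical, the 1/(N-1) covariance normalisation is an assumption, and the data in the usage lines are synthetic. It computes the mean shape, keeps the leading eigenvectors that explain the requested share of the variance, and projects a shape onto them to obtain its parameters.

```python
import numpy as np

def build_shape_model(shapes, variance_kept=0.98):
    """Return (mean, P, eigenvalues) for an (N, d) array with one shape vector per row."""
    X = np.asarray(shapes, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    cov = centered.T @ centered / (X.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]             # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)  # guard against tiny negative round-off
    eigvecs = eigvecs[:, order]
    explained = np.cumsum(eigvals) / eigvals.sum()
    t = int(np.searchsorted(explained, variance_kept)) + 1
    return mean, eigvecs[:, :t], eigvals[:t]

def shape_parameters(x, mean, P):
    """Project one shape vector onto the retained eigenvectors."""
    return P.T @ (np.asarray(x, dtype=float) - mean)

# Synthetic check: 6-dimensional "shapes" that really vary in only 2 directions.
rng = np.random.default_rng(0)
demo = np.zeros((200, 6))
demo[:, :2] = rng.normal(size=(200, 2))
demo += 1e-3 * rng.normal(size=(200, 6))
mean_shape, P, lam = build_shape_model(demo)
b = shape_parameters(demo[0], mean_shape, P)
```

The parameter vector b is what would serve as the shape feature for one face; mean_shape + P @ b gives back an approximation of the original shape vector.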

Figure 4. Mean face shape and some eigenvector variations

(Blue lines show the mean shape and red lines show some eigenvector variations based on the mean shape.)

2. Texture feature vectors obtained from the Appearance Model

To build a statistical texture model that reflects the global texture variation of the faces, we first collect the gray-scale values of all the pixels inside the face contour, which amounts to extracting the shape-independent texture. We then apply principal component analysis to the shape-independent texture to build the model.

A. Texture normalization

Texture normalization compensates for illumination differences between the images. We read the gray-scale value of each pixel in the texture image in a fixed order to generate a vector $g = (g_1, \ldots, g_n)^T$, where n is the number of pixels. The vector is then normalized so that its elements have a mean of 0 and a variance of 1.
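A minimal sketch of this normalization is given below, assuming the texture has already been flattened into a list of gray-scale values in a fixed pixel order. The function name is hypothetical, and the population variance (dividing by n) is an assumption.

```python
import math

def normalize_texture(gray_values):
    """Shift and scale a texture vector to zero mean and unit variance."""
    n = len(gray_values)
    mean = sum(gray_values) / n
    variance = sum((g - mean) ** 2 for g in gray_values) / n
    if variance == 0:
        raise ValueError("a constant texture cannot be scaled to unit variance")
    std = math.sqrt(variance)
    return [(g - mean) / std for g in gray_values]

# The same texture under a brightness offset and contrast gain normalizes identically.
original = normalize_texture([10, 20, 30, 40])
brighter = normalize_texture([2 * g + 5 for g in [10, 20, 30, 40]])
```

This is why the step removes global illumination differences: any linear change g -> a*g + c with a > 0 is cancelled by subtracting the mean and dividing by the standard deviation.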

B. Triangulation of the mean shape

We used Delaunay triangulation to divide the mean shape into a collection of triangles (a triangular mesh).

Figure 5. Delaunay Triangulation of the mean shape

C. Obtaining the shape-independent texture.

First, each training image is warped to the mean shape by a piecewise affine transformation. The warped image contains some black lines, which are pixels with value 0 that the transformation left unfilled. To fill these blank pixels, we apply interpolation to the warped image. The result is the shape-independent texture feature.
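The core of a piecewise affine transformation is an affine map between one triangle and its counterpart, which can be written with barycentric coordinates. The sketch below is illustrative only (it is not triangleAffine.m, and the function names are hypothetical). It maps a point of a mean-shape triangle to the matching location in the image triangle; the direction of the mapping is a choice made for this example.

```python
def barycentric(p, tri):
    """Barycentric coordinates of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    a = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    b = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return a, b, 1.0 - a - b

def inside(p, tri, eps=1e-9):
    """True if p lies in the triangle (all barycentric weights non-negative)."""
    return all(w >= -eps for w in barycentric(p, tri))

def map_point(p, mean_tri, image_tri):
    """Send a point of the mean-shape triangle to the matching point of the image triangle."""
    a, b, c = barycentric(p, mean_tri)
    (x1, y1), (x2, y2), (x3, y3) = image_tri
    return (a * x1 + b * x2 + c * x3, a * y1 + b * y2 + c * y3)

mean_tri = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
image_tri = [(10.0, 10.0), (14.0, 10.0), (10.0, 12.0)]
corner = map_point((1.0, 0.0), mean_tri, image_tri)
centre = map_point((1.0 / 3.0, 1.0 / 3.0), mean_tri, image_tri)
```

Repeating this for every triangle of the mesh, and for every pixel that falls inside it, yields the full warp; vertices map to vertices and the triangle centroid maps to the centroid.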

Figure 6. The procedure to obtain shape-independent texture

D. Principal Component Analysis

Like shape feature extraction, AAM also uses principal component analysis to build a statistical model of texture. As a result, we obtain the eigenvalues, eigenvectors, and mean texture of the appearance model, and after PCA each texture vector can be written as a linear combination of the mean texture and the eigenvectors.

We now have all the feature vectors we need, for both shape and texture. In the next stage, SVC and SVR are used to train models and estimate ages. For SVR, there are two approaches. The first uses the combined shape and texture vectors, whose dimension is reduced again by PCA. The other uses hierarchical SVRs with the two features kept separate.

Training Models

In this stage we applied four models and compared their results.

Model 1: SVC (Support Vector Classification)

The SVC method used in this project is C-SVC. Using the combined shape and texture feature vectors, we built 18 different models, one for each class size: 1 year per class, 2 years per class, ..., 18 years per class. Finally, we calculated and compared the standard error of the models to find the best class size.
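The 18 labelings can be generated with a short helper such as the one below. It is a sketch rather than the project's makelabel.m: the function name is hypothetical, and it assumes that classes are consecutive age bins starting at age 0.

```python
def make_labels(ages, span):
    """Assign each age to a class covering `span` consecutive years, starting at age 0."""
    if span < 1:
        raise ValueError("span must be a positive number of years")
    return [age // span for age in ages]

# One labeling per class size, from 1 year per class to 18 years per class.
ages = [0, 4, 5, 17, 69]
labels_by_span = {span: make_labels(ages, span) for span in range(1, 19)}
```

With a span of 5, for example, ages 0 to 4 share class 0 and ages 5 to 9 share class 1, so a larger span leaves fewer classes with more training samples in each.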

Model 2: SVR (Support Vector Regression)

Specifically, we used epsilon-SVR, one variant of SVR. In this model, the shape feature vectors and the texture feature vectors are again combined to form the overall feature vectors. These vectors are then used to train the model.

Model 3: Hierarchical SVC

Before training with SVC or SVR, we have two kinds of features: shape features and texture features. In the two models above they are combined. However, different kinds of features usually have different magnitudes. Features with a smaller magnitude effectively receive a lower weight in the SVM and are relatively neglected. They do not get the influence on the result that they deserve, which makes the result less accurate.

Therefore, we introduced a hierarchical SVM, as suggested by Professor Lorenzo. A Hierarchical Support Vector Machine (HSVM) for multi-class classification is a decision tree with an SVM at each node. In this model, we used three SVC models, one per node. The first SVC was trained using only the shape feature vectors and the second using only the texture feature vectors. This gave two vectors of predictions, which we combined (the prediction from the first SVC, the prediction from the second SVC, and the labels) to form new feature vectors. Finally, we trained a third SVC on these feature vectors. The prediction procedure of the new model is shown in Figure 7.

Figure 7. Prediction process using Hierarchical SVC
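The two-level scheme can be sketched with scikit-learn, whose SVC class wraps libsvm's C-SVC. This is an illustration under assumptions, not the project's hSVC.m: the function names are hypothetical, the RBF kernel and C=1.0 are placeholder settings, and the data in the usage lines are synthetic.

```python
import numpy as np
from sklearn.svm import SVC

def fit_hierarchical_svc(shape_X, texture_X, y):
    """Train one SVC per feature type, then a third SVC on their predictions."""
    shape_svc = SVC(kernel="rbf", C=1.0).fit(shape_X, y)
    texture_svc = SVC(kernel="rbf", C=1.0).fit(texture_X, y)
    # Second-level features: one column per first-level prediction.
    stacked = np.column_stack([shape_svc.predict(shape_X),
                               texture_svc.predict(texture_X)])
    top_svc = SVC(kernel="rbf", C=1.0).fit(stacked, y)
    return shape_svc, texture_svc, top_svc

def predict_hierarchical_svc(models, shape_X, texture_X):
    """Run both first-level SVCs and feed their outputs to the top SVC."""
    shape_svc, texture_svc, top_svc = models
    stacked = np.column_stack([shape_svc.predict(shape_X),
                               texture_svc.predict(texture_X)])
    return top_svc.predict(stacked)

# Synthetic data: three classes, with texture features on a much larger scale than shape.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
shape_X = y[:, None] + 0.1 * rng.normal(size=(60, 3))
texture_X = 100.0 * y[:, None] + 5.0 * rng.normal(size=(60, 4))
models = fit_hierarchical_svc(shape_X, texture_X, y)
predicted = predict_hierarchical_svc(models, shape_X, texture_X)
```

Because each first-level SVC sees only one feature type, the difference in magnitude between shape and texture cannot let one drown out the other. One caveat of this simple version is that the top SVC is trained on in-sample predictions; held-out predictions would give a less optimistic second level.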

Moreover, hierarchical models are better suited to large amounts of data. The number of levels should be kept under control: with too many levels, there may not be enough data to train each model well, and the path from root to leaf may be too long to reach the correct leaf reliably.

Further, we could use a multi-level hierarchical SVM instead of two levels. The SVM at each node can have different parameters, and we can choose the optimal parameters depending on the result from the previous level.

Model 4: Hierarchical SVR

Hierarchical SVR is very similar to hierarchical SVC. The difference is that the former model uses three SVC nodes while this one uses three SVR nodes.

Experiments and Results

• Result for C-SVC

Figure 8 shows the classification accuracy as a function of class size.

Figure 8. Classification accuracy of C-SVC

• Result for Hierarchical C-SVC

Figure 9 shows the classification accuracy as a function of class size.

Figure 9. Classification accuracy of Hierarchical C-SVC

After applying the hierarchical model to SVC, a slight improvement appears. A larger performance improvement can be observed later, when the hierarchical model is applied to SVR.

Another important idea suggested by the professor is that the standard error of SVC should first go down and then come back up. If the age span is small, there are not enough samples in each class to train the model. If the age span is large, the error is large because of the coarse precision of the classes themselves. By observing the standard error for different age spans, we can balance training maturity against precision and thus find the best age span. From the following plot, the best age span is 14 years.

Figure 10. Classification standard error of Hierarchical C-SVC

• Result for Epsilon-SVR and Hierarchical Epsilon-SVR

Figure 11 shows the mean squared errors and other parameters of SVR and Hierarchical SVR.

Figure 11. Mean squared errors of SVR and Hierarchical SVR

From the results, we can draw some general conclusions:

a. Accuracy increases as the age span of a class grows. The standard error first goes down and then rises, which can be used to decide the best class age span.

b. Hierarchical SVR and SVC perform better than the plain ones. The reason is given in the section on hierarchical models.

c. SVR produces a smaller error than SVC.

When comparing SVC and SVR, we cannot regard SVR as a fine-grained SVC. If we could, SVR would have the lowest accuracy. The opposite holds: because SVC assumes that all misclassifications are equal, it ignores how far a wrong prediction is from the true age, which is a very informative part of the data. It is therefore not as good as SVR for this problem.
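The point about equal misclassification costs can be made concrete with a toy comparison. The numbers below are invented for illustration and are not results from this project; the function names are hypothetical.

```python
def misclassification_rate(true_ages, predicted_ages):
    """Fraction of predictions that differ from the true value (the SVC view)."""
    wrong = sum(1 for t, p in zip(true_ages, predicted_ages) if t != p)
    return wrong / len(true_ages)

def mean_squared_error(true_ages, predicted_ages):
    """Average squared distance between prediction and truth (the SVR view)."""
    return sum((t - p) ** 2 for t, p in zip(true_ages, predicted_ages)) / len(true_ages)

true_ages = [20, 30, 40]
near_miss = [21, 29, 41]   # every prediction is off by one year
far_miss = [60, 5, 2]      # every prediction is off by decades

same_rate = (misclassification_rate(true_ages, near_miss),
             misclassification_rate(true_ages, far_miss))
different_mse = (mean_squared_error(true_ages, near_miss),
                 mean_squared_error(true_ages, far_miss))
```

Both prediction sets are equally bad by misclassification rate (every sample is wrong), while the mean squared error separates them by three orders of magnitude. A classifier trained on the first criterion gets no signal that a one-year miss is better than a forty-year miss.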

Summary

Facial age estimation figures in a variety of applications. However, it is quite a challenging task. In this project, we proposed an age estimation method. First, shape and texture features are extracted from the faces. Then we train four different models and use them to predict human ages. After analyzing and comparing the results (the misclassification rate for SVC and the mean squared error for SVR), we conclude that the whole process produces an acceptable overall result.

External Software Modified

We used external code in two main parts of the project. The first is AAM & ASM for feature extraction, and the second is the SVM algorithms for model training. Our own code, the sources of the external software, and how we modified that software with our own code are listed below.

Data preprocessing

Code written by us:

1. dataPreprocess.m: Data preprocessing procedure. Reads the coordinates, image, identifier and age and stores them in a structure; converts all the images to gray-scale; applies rotation, scaling and depth-angle modification to the coordinates and the images.

2. makelabel.m: Make labels for classes with different age ranges.

3. showImgPnt.m: Visualize the feature points on a picture after data preprocessing.

AAM & ASM

External source: Active Shape Model (ASM) and Active Appearance Model (AAM), by Dirk-Jan Kroon

http://www.mathworks.com/matlabcentral/fileexchange/26706-active-shape-model-asm-and-active-appearance-model-aam

The external project we refer to trains an Active Shape Model and an Active Appearance Model for automatic segmentation and recognition of biomedical objects. Our project, however, extracts features from a training set in which the 68 landmarks are already known. We therefore modified the source code, dropping the search step and adding a piecewise affine transformation.

Code written by us:

1. average_shape.m: Calculate the mean face shape of the original data.

2. triangulation.m: Apply Delaunay triangulation to get the triangular mesh of the mean shape.

3. triangleAffine.m: Apply a piecewise affine transformation to deform each face into the mean shape.

4. GetShapeFeature.m: Get the shape feature vectors of the training data.

5. GetAppearanceFeature.m: Get the texture feature vectors of the training data.

6. GetCombinedFeature.m: Get the combined shape and appearance feature vectors of the training data.

7. PCA.m: Reduce the dimension of the feature vectors to accelerate the model training and age prediction processes.

SVM

External source: libsvm, by Chih-Chung Chang and Chih-Jen Lin

http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Code written by us:

1. combineData.m: Combine the output data files obtained from the feature extraction stage.

2. SVC.m: Use SVC to predict people's ages.

3. SVR.m: Use SVR to predict people's ages.

4. hSVC.m: Use hierarchical SVC to predict people's ages.

5. hSVR.m: Use hierarchical SVR to predict people's ages.

6. DifRangeSVC.m: Plot the accuracy of SVC as a function of class size.

7. DifRangeHSVC.m: Plot the accuracy of hierarchical SVC as a function of class size.

Table 1. Original code and external code modification details
