Inferring Personality from Mobile Phone Behavioral Data

Fanglin Chen, Jianfu Zhou
Department of Computer Science
Dartmouth College
{fanglin.chen.gr, jianfu.zhou.gr}@dartmouth.edu

1  Introduction

Many research papers have used text data to infer human personality [1,2,3]. However, the associations between word categories and personality are relatively weak [1]. To infer personality more accurately, we need information beyond these simple linguistic markers. The massive data generated by pervasive smartphones gives us this opportunity.
In our project, we focus on inferring human personality, in particular the five core traits known as the Big Five Personality Traits [4], from daily mobile phone behavioral data (e.g., social interaction and calling behavior) [5]. We aim to find correlations between mobile phone behavioral data and personality categories.
We use the MIT Friends and Family dataset [6]. This dataset consists of two parts, the sensor data and the survey data, which together show the social structure of a residential community of young families. Specifically, (a) the sensor data records proximity to other people as well as phone calls and text messages (SMS), and (b) the survey data records relations with other participants in the community before the study, social interactions, and personality indicators. 53 participants took the Big Five Personality Test [7], so each of these participants has a personality label in the dataset.

2  Methodology

2.1  Preprocessing

The Bluetooth data measures physical co-location in a person's daily life. When a Bluetooth mobile device enters a user's radio range, a co-location event is recorded. To filter out chance encounters, we ignore co-location events that occur with relatively low frequency.
The text messages and voice calls measure a person's everyday communication behavior. If two people exchange text messages or voice calls, we treat them as being in each other's contacts.
Before calculating the features, we applied Laplace smoothing to the dataset. We scale all features to improve convergence rates and to reduce the influence of differing feature ranges. From the dataset we obtain the questionnaire answers to "The Big Five Inventory" [7] and derive a score on a 100-point scale for each personality dimension. We simplify each personality score into a binary class: a score higher than 50 is assigned to the first class, and a score lower than 50 to the second class.
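The scaling and binarization steps can be sketched as follows. This is only an illustration: the text does not say which scaling method was used, so min-max scaling is an assumption, as are the function names and the choice to place a score of exactly 50 in the second class. Laplace smoothing is not shown.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range (assumed scaling method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def binarize_score(score, threshold=50):
    """Return 1 (first class) if the score is above the threshold, else 0.

    Assumption: a score exactly equal to the threshold goes to the second class.
    """
    return 1 if score > threshold else 0


feature = [3, 10, 17]              # made-up raw feature values
scaled = min_max_scale(feature)
labels = [binarize_score(s) for s in [72, 41, 50]]  # made-up trait scores
```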

2.2  Features

Entropy:  A quantitative measure reflecting how many different categories a given random variable has, while also taking into account how evenly the basic units are distributed among those categories. For example, the entropy of one's contacts is the ratio between one's total number of contacts and the relative frequency at which one interacts with them.
$$H(a) = -\sum_{c} f_c \log f_c$$
$$f_c = \frac{\#\ \text{of times } a \text{ communicates with } c}{\text{total } \#\ \text{of } a\text{'s contacts}}$$
where c is a contact and f_c is the frequency at which a communicates with c. The more equally one interacts with a large number of contacts, the higher the entropy. Our project considers the entropy of Bluetooth, calls, and texts [8].
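A minimal sketch of the entropy feature is given below. It assumes that the frequencies are normalized by the total number of interactions so that they sum to one, and that the natural logarithm is used; the function name and the toy event lists are illustrative only.

```python
import math
from collections import Counter


def contact_entropy(events):
    """Shannon entropy of a user's interactions over contacts.

    `events` is a list of contact identifiers, one entry per interaction.
    """
    counts = Counter(events)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Four interactions spread evenly over four contacts.
even = contact_entropy(["b", "c", "d", "e"])
# Four interactions concentrated on one contact.
skewed = contact_entropy(["b", "b", "b", "c"])
```

In this toy case the evenly spread interactions give the higher entropy, which matches the description above.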
Inter-event time:  The time elapsed between two consecutive events. Our project considers both the average and the variance of the inter-event time of one's calls and texts.
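As an illustration, the two inter-event statistics could be computed as below. The use of the population variance and of timestamps in seconds are assumptions; the text does not specify either.

```python
from statistics import mean, pvariance


def inter_event_stats(timestamps):
    """Return (average, variance) of the gaps between consecutive events.

    `timestamps` are event times in seconds; they need not be sorted.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None, None  # fewer than two events: no gap to measure
    return mean(gaps), pvariance(gaps)


# Made-up call times: gaps of 60, 120 and 60 seconds.
avg_gap, var_gap = inter_event_stats([0, 60, 180, 240])
```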
Response rate and latency (text):   We consider a text from a user (A) to be a response to a text received from another user (B) if it is sent within an hour after user A received the last text from user B. The response rate is the percentage of texts people respond to. The latency is the median time it takes people to answer a text. Note that, by definition, latency is less than or equal to one hour [8].
Number of contacts:  The number of people one communicates with. Our project considers contacts from Bluetooth, calls, and texts.
Percentage of calls during the night:  The fraction of one's calls that are made during the night. In our project, we define the night as 6pm to 6am.
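The response definition above can be sketched as follows. The message format, the function name, and the decision to count every incoming text separately (so that only the latest unanswered text from a contact can be answered) are assumptions made for this illustration.

```python
from statistics import median

WINDOW = 3600  # one hour, in seconds


def response_stats(messages, user):
    """Return (response_rate, median_latency) for `user`.

    `messages` is a list of (timestamp_seconds, sender, receiver) tuples.
    """
    pending = {}    # contact -> time of their latest unanswered text
    received = 0
    latencies = []
    for t, sender, receiver in sorted(messages):
        if receiver == user:
            received += 1
            pending[sender] = t
        elif sender == user and receiver in pending:
            delay = t - pending.pop(receiver)
            if delay <= WINDOW:
                latencies.append(delay)
    rate = len(latencies) / received if received else None
    latency = median(latencies) if latencies else None
    return rate, latency


log = [
    (0, "B", "A"),      # B texts A
    (600, "A", "B"),    # A answers after 10 minutes
    (1000, "C", "A"),   # C texts A
    (9000, "A", "C"),   # A answers too late (more than one hour)
]
rate, latency = response_stats(log, "A")
```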
$$p = \frac{\#\ \text{of calls one made during night}}{\text{total } \#\ \text{of one's calls}}$$
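A short sketch of this ratio, assuming call times are available as `datetime` objects and that the night window includes 6pm and excludes 6am (the boundary handling is not stated in the text):

```python
from datetime import datetime


def is_night(ts):
    """True if the call falls in the 6pm to 6am window."""
    return ts.hour >= 18 or ts.hour < 6


def night_call_fraction(call_times):
    """Fraction of calls made during the night, or None if there are no calls."""
    if not call_times:
        return None
    return sum(is_night(t) for t in call_times) / len(call_times)


# Made-up call log: two daytime calls and two night calls.
calls = [
    datetime(2011, 3, 1, 9, 30),
    datetime(2011, 3, 1, 19, 5),
    datetime(2011, 3, 2, 2, 15),
    datetime(2011, 3, 2, 14, 0),
]
p = night_call_fraction(calls)
```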
Percentage of initiated contacts:  The ratio between the number of contacts one initiated and one's total number of contacts. Our project considers the percentage of initiated calls and texts.
$$p = \frac{\#\ \text{of contacts one initiated}}{\text{total } \#\ \text{of one's contacts}}$$
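The same ratio in code, assuming each call or text is stored as an (initiator, recipient) pair; the event format and names are hypothetical.

```python
def initiated_fraction(events, user):
    """Fraction of `user`'s contact events that `user` initiated.

    `events` is a list of (initiator, recipient) pairs involving `user`.
    """
    if not events:
        return None
    return sum(1 for initiator, _ in events if initiator == user) / len(events)


# Made-up call log for user A: three outgoing calls, one incoming.
call_log = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B")]
p_initiated = initiated_fraction(call_log, "A")
```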
We have 12 features in total in our feature set: Entropy (Bluetooth, Message, Call), Inter-event time (Message, Call), Response rate and latency (Message), Number of Contacts (Bluetooth, Message, Call), Percentage of calls during the night, Percentage of initiated contacts (Message, Call).

2.3  Models

In the proposal, we planned to use random forest and support vector machine (SVM) because they were recommended in our literature review. Since we are implementing both of these approaches in class, we needed to choose a different classifier. We decided to start with SVM to get preliminary results and then move on to Gradient Boosted Decision Tree (GBDT).

2.3.1  SVM

The relationship between personality traits and numerous behavioral and psychological factors is not linear [9]. We therefore start with an SVM with an RBF kernel to build the model.
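For reference, the RBF kernel that gives the SVM its non-linear decision boundary can be written in a few lines. This shows only the kernel function, not the SVM training itself; the `gamma` value is an illustrative parameter, not the setting used in our experiments.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Identical feature vectors have similarity 1; the value decays with distance.
same = rbf_kernel([0.2, 0.7], [0.2, 0.7])
far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
```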

2.3.2  Gradient Boosted Decision Tree

GBDT is a machine learning algorithm that iteratively constructs an ensemble of weak decision tree learners through boosting. The pipeline of this algorithm is shown in Figure 1, and the pseudocode is shown in Figure 2. In essence, we incorporate regression trees into the gradient boosting framework.
Figure 1: Gradient Boosted Decision Tree's Growing Pipeline.
Figure 2: Gradient Boosted Decision Tree's pseudocode.
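The boosting idea can be illustrated with a heavily simplified sketch. It assumes squared loss, one-split regression trees (stumps) as weak learners, and 0/1 labels that are recovered by rounding the output; the number of rounds and the learning rate are arbitrary, and this is not the exact procedure of Figure 2.

```python
def fit_stump(X, residuals):
    """Find the one-split regression tree that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, threshold, left_mean, right_mean)
    return best[1:] if best else None


def predict_stump(stump, row):
    j, threshold, left_mean, right_mean = stump
    return left_mean if row[j] <= threshold else right_mean


def fit_gbdt(X, y, n_rounds=20, learning_rate=0.5):
    """Gradient boosting for squared loss: each stump fits current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [target - pred for target, pred in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        predictions = [pred + learning_rate * predict_stump(stump, row)
                       for pred, row in zip(predictions, X)]
    return base, learning_rate, stumps


def predict_gbdt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Toy data: one scaled feature, binary trait label.
X = [[0.1], [0.2], [0.8], [0.9]]
y = [0, 0, 1, 1]
model = fit_gbdt(X, y)
predicted = [round(predict_gbdt(model, row)) for row in X]
```

Each round shrinks the remaining residual by the learning rate, so the ensemble approaches the training labels gradually instead of fitting them in a single step.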

2.4  Accuracy and Error

The prediction accuracy over the test set and the cross-validation set is shown in Figures 3 and 4. The figures show that openness, conscientiousness, and extroversion are predicted better than agreeableness and neuroticism. To some extent this makes sense, because agreeableness and neuroticism are subtler than traits like extroversion. People who show a strong tendency toward extroversion may seek stimulation in the company of others and are more likely to be outgoing and energetic. This behavioral pattern can be correlated with their level of phone usage.
Figure 3: The average accuracy over the test set data.
Figure 4: The average 5-fold cross-validation accuracy.
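A 5-fold cross-validation loop of the kind behind Figure 4 could look like the sketch below. The contiguous, unshuffled folds and the majority-class baseline that stands in for SVM or GBDT are assumptions for illustration, and the sketch assumes at least as many samples as folds.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_accuracy(X, y, fit, predict, k=5):
    """Average test accuracy over k folds for any fit/predict pair."""
    accuracies = []
    for train, test in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / len(accuracies)


# Majority-class baseline used only to exercise the loop.
def fit_majority(X_train, y_train):
    return max(set(y_train), key=y_train.count)


def predict_majority(model, row):
    return model


# With 53 labeled participants, the folds hold 11, 11, 11, 10 and 10 people.
fold_sizes = [len(test) for _, test in k_fold_indices(53)]
```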

3  Result Analysis

The results in the last section show that predicting personality is a difficult task given the present dataset. Although we have more than 50 subjects, the data is limited in terms of the types and number of complete features. Beyond that, we want to find the best set of indicators for predicting personality traits. We hypothesize that the best indicators differ among personality traits. The performance comparison between SVM and GBDT is shown in Figure 5.
Figure 5: Performance comparison between SVM and GBDT algorithms.

3.1  Feature Selection

An investigation of the most important features for predicting each trait revealed interesting associations. For example, the entropy of participants' contacts helped predict Extroversion. This finding is in line with past research showing that Extroversion and Agreeableness both relate to different aspects of the diversity of one's social network: extroverts tend to seek more friends than introverts, while agreeable individuals tend to be selected more often as friends by other people.
We use the sequential forward selection (SFS) algorithm to find an approximately best set of indicators in the feature set. The algorithm starts with one-feature models and keeps the feature that yields the highest accuracy, then iterates to add the second feature in the same way, and so on. The pseudocode is shown in Figure 6. An example of the algorithm's convergence is shown in Figure 7. We found the top-3 indicators in the feature set for different personality dimensions (shown in Figures 8, 9, and 10).
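The greedy procedure can be sketched as follows. The scoring function is supplied by the caller (for example, cross-validated accuracy of a classifier on the chosen subset); the early stop when no feature improves the score, the feature names, and the toy gains are assumptions made for this illustration, not values from our experiments.

```python
def sequential_forward_selection(features, score, max_features=None):
    """Greedy SFS: repeatedly add the feature that most improves `score`.

    `score(subset)` takes a list of feature names and returns a number,
    such as cross-validated accuracy on that subset.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    limit = max_features if max_features is not None else len(remaining)
    while remaining and len(selected) < limit:
        candidate_scores = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(candidate_scores)
        if top_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy additive scoring function with invented per-feature gains.
toy_gain = {"entropy_call": 0.10, "entropy_sms": 0.07,
            "night_calls": 0.02, "noise": -0.05}


def toy_score(subset):
    return 0.5 + sum(toy_gain[f] for f in subset)


selected, best = sequential_forward_selection(toy_gain, toy_score, max_features=3)
```

Because each step only looks one feature ahead, the result is an approximation and may miss feature combinations that only help together.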
Figure 6: The pseudocode of the SFS algorithm.
Figure 7: With the greedy SFS algorithm, the feature set reaches its highest accuracy after about 20 iterations.
Figure 8: The entropy of the message contact, the entropy of the call contact, and the variance of the inter-event time are the top 3 indicators in the feature set.
Figure 9: The variance of inter-message time, the average inter-message time, and the average inter-call time are the top 3 indicators in the feature set.
Figure 10: The average inter-message time, the percentage of initiated messages, and the average inter-call time are the top 3 indicators in the feature set.

4  Discussion

The poor personality prediction results stem from the sparsity of the dataset, so we want to gather as much data as possible for personality analysis. Our ongoing work is to collect various types of data from over 60 full-time Dartmouth students. This includes, among others, audio features to infer conversation duration and frequency; accelerometer features to infer activity level; GPS and WiFi data to infer mobility patterns; mobile application installation patterns; daily mood level from EMA (ecological momentary assessment) surveys; self-efficiency, as shown by how well they complete the items on their calendars; and time management curves of their coursework over the term. With more data from the student participants, we aim to predict personality with higher accuracy.

References

[1] L. Qiu, H. Lin, J. Ramsay, and F. Yang, "You are what you tweet: Personality expression and perception on Twitter," Journal of Research in Personality, 2012.
[2] F. Mairesse, Learning to Adapt in Dialogue Systems: Data-driven Models for Personality Recognition and Generation. PhD thesis, University of Sheffield, United Kingdom, 2008.
[3] D. Quercia, R. Lambiotte, M. Kosinski, D. Stillwell, and J. Crowcroft, "The personality of popular Facebook users," in Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work (CSCW'12), 2012.
[4] S. Gosling, P. Rentfrow, and W. Swann, "A very brief measure of the big-five personality domains," Journal of Research in Personality, vol. 37, no. 6, pp. 504-528, 2003.
[5] N. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, and A. Campbell, "A survey of mobile phone sensing," IEEE Communications Magazine, vol. 48, no. 9, pp. 140-150, 2010.
[6] N. Aharony, W. Pan, C. Ip, I. Khayal, and A. Pentland, "Social fMRI: Investigating and shaping social mechanisms in the real world," Pervasive Mob. Comput., vol. 7, pp. 643-659, Dec. 2011.
[7] http://www.outofservice.com/bigfive
[8] Y.-A. de Montjoye, J. Quoidbach, F. Robic, and A. S. Pentland, "Predicting people personality using novel mobile phone-based metric," 2012.
[9] M. J. Benson and J. P. Campbell, "To be, or not to be, linear: An expanded representation of personality and its relationship to leadership performance," International Journal of Selection and Assessment, vol. 15, no. 2, pp. 232-249, 2007.