Inferring Personality from Mobile Phone Behavioral Data
Fanglin Chen, Jianfu Zhou
Department of Computer Science
Dartmouth College
{fangli, jianfu.zhou.gr}@dartmouth.edu
1 Introduction
Many recent studies use text data to infer human personality [1,2,3]. However, the associations between word categories and personality are relatively weak [1]. To infer personality more accurately, we need information beyond these simple linguistic markers. The massive data collected by pervasive smartphones give us this opportunity.
In our project, we focus on inferring human personality, in particular the five core traits known as the Big Five Personality Traits [4], from daily mobile phone behavioral data (e.g. social interaction, calling behavior) [5]. Our goal is to identify correlations between mobile phone behavioral data and personality categories.
2 Methodology
In our project, the following indicators [6] will be used for inference:
- basic phone use (i.e. number of phone calls and text messages),
- active user behaviors (e.g. number of calls initiated and time to answer a text),
- location (i.e. radius of gyration and number of places from which calls have been made),
- regularity (e.g. temporal calling routine and time between calls and texts),
- and diversity (i.e. call entropy and the ratio of the number of interactions to the number of contacts).
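Two of the indicators listed above, call entropy and radius of gyration, can be illustrated with a minimal sketch. It assumes that call entropy is the Shannon entropy of a user's calls over contacts and that locations are planar coordinates; the function names and the toy data are hypothetical and are not taken from [6].

```python
import math
from collections import Counter

def contact_entropy(contacts):
    """Shannon entropy (in nats) of how a user's calls spread over contacts."""
    counts = Counter(contacts)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def radius_of_gyration(points):
    """Root-mean-square distance of visited points from their centroid.

    Points are treated as planar (x, y) coordinates, a simplification
    that ignores the curvature of the Earth.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

# Hypothetical call log: one contact identifier per call.
calls = ["alice", "bob", "alice", "carol", "alice", "bob"]
# Hypothetical planar positions (in km) from which calls were made.
places = [(0.0, 0.0), (3.0, 4.0), (0.0, 0.0), (3.0, 4.0)]

print(round(contact_entropy(calls), 3))
print(radius_of_gyration(places))
```

A user who calls a single contact has entropy 0, and the value grows as calls spread more evenly over more contacts.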
In addition, we will extract features from social networks formed by 53 participants. Specifically, these participants form three networks, and from each network we will extract four feature sets: Centrality Measures, Small World and Efficiency Measures, Transitivity Measures, and Triadic Measures [7].
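As a rough illustration of what such network features look like, the sketch below computes one centrality measure (degree centrality) and one transitivity measure (local clustering coefficient) on a toy undirected network. The choice of these two measures, the function names, and the four-node network are assumptions made for illustration; the actual feature sets follow [7].

```python
from itertools import combinations

def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other nodes each node is directly tied to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}

def local_clustering(adj, node):
    """Share of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[node]
    if len(nbrs) < 2:
        return 0.0
    pairs = list(combinations(nbrs, 2))
    closed = sum(1 for a, b in pairs if b in adj[a])
    return closed / len(pairs)

# Hypothetical toy network of four participants.
edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p3", "p4")]
graph = build_graph(edges)

print(degree_centrality(graph)["p3"])
print(local_clustering(graph, "p3"))
```

Each participant then contributes one value per measure and per network, which can be concatenated into a feature vector.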
Furthermore, we will perform binary classification (i.e. LOW or HIGH) on each personality trait, using a Support Vector Machine with an RBF kernel [8] and a random forest [9]. Of the dataset, which is described in the next section, 80% will be used as the training set and the remaining 20% as the test set.
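The data preparation for this step can be sketched as follows. The proposal does not state how trait scores are turned into LOW and HIGH labels, so the median split used here is an assumption, as are the function names, the random seed, and the synthetic scores. The classifiers themselves are left out of the sketch.

```python
import random
import statistics

def binarize_by_median(scores):
    """Label each trait score HIGH if above the sample median, else LOW."""
    median = statistics.median(scores)
    return ["HIGH" if s > median else "LOW" for s in scores]

def split_train_test(samples, train_fraction=0.8, seed=0):
    """Shuffle the samples reproducibly and cut them into train and test parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical trait scores for 53 participants (synthetic, not real data).
rng = random.Random(42)
scores = [round(rng.uniform(1.0, 7.0), 2) for _ in range(53)]
labels = binarize_by_median(scores)

# Pair each participant index with its label, then split 80/20.
participants = list(zip(range(53), labels))
train, test = split_train_test(participants)
print(len(train), len(test))
```

With 53 participants, an 80/20 split leaves 42 samples for training and 11 for testing, so any accuracy measured on the test set will be coarse.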
3 Dataset
In our project, we will use the MIT Friends and Family dataset [10]. This dataset consists of two parts, the sensor data and the survey data, which together describe the social structure of a residential community of young families. Specifically, (a) the sensor data records proximity to other people, phone calls, and text messages (SMS), and (b) the survey data records relations with other participants in the community before the test, social interactions, and personality indicators. 53 participants took the Big Five Personality Test [11], so the personality of each of these participants is labelled in the dataset.
4 Timeline
2013.01.20 - 2013.02.19, 4 weeks in total.
- First week. Network formulation: use Gephi [12] and similar software to construct the three networks from the sensor and survey data;
- Second week. Adjust the features: eliminate redundant features and incorporate new ones. Finish coding the training algorithm and start training;
- Third week. Finish training on the dataset, compare the algorithms used, and start the write-up;
- Last week. Complete the write-up.
References
[1] L. Qiu, H. Lin, J. Ramsay, and F. Yang, "You are what you tweet: Personality expression and perception on Twitter," Journal of Research in Personality, 2012.
[2] F. Mairesse, Learning to Adapt in Dialogue Systems: Data-driven Models for Personality Recognition and Generation. PhD thesis, University of Sheffield, United Kingdom, 2008.
[3] D. Quercia, R. Lambiotte, M. Kosinski, D. Stillwell, and J. Crowcroft, "The personality of popular Facebook users," in Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work (CSCW'12), 2012.
[4] S. Gosling, P. Rentfrow, and W. Swann, "A very brief measure of the Big-Five personality domains," Journal of Research in Personality, vol. 37, no. 6, pp. 504-528, 2003.
[5] N. Lane, E. Miluzzo, H. Lu, D. Peebles, T. Choudhury, and A. Campbell, "A survey of mobile phone sensing," IEEE Communications Magazine, vol. 48, no. 9, pp. 140-150, 2010.
[6] Y.-A. de Montjoye, J. Quoidbach, F. Robic, and A. S. Pentland, "Predicting people personality using novel mobile phone-based metric," 2012.
[7] D. Knoke, S. Yang, and J. Kuklinski, Social Network Analysis, vol. 2. Los Angeles, CA: Sage Publications, 2008.
[8] G. Chittaranjan, J. Blom, and D. Gatica-Perez, "Who's who with Big-Five: Analyzing and classifying personality traits with smartphones," in Wearable Computers (ISWC), 2011 15th Annual International Symposium on, pp. 29-36, IEEE, 2011.
[9] J. Staiano, B. Lepri, N. Aharony, F. Pianesi, N. Sebe, and A. Pentland, "Friends don't lie - inferring personality traits from social network structure," 2012.
[10] N. Aharony, W. Pan, C. Ip, I. Khayal, and A. Pentland, "Social fMRI: Investigating and shaping social mechanisms in the real world," Pervasive Mob. Comput., vol. 7, pp. 643-659, Dec. 2011.
[11] http://www.outofservice.com/bigfive
[12] https://gephi.org