Proposal - Mohammod Mashfiqui & Jyotishman Nag

Predicting Human Conversations using Social Networks

Mohammod Mashfiqui Rabbi, Jyotishman Nag

Introduction:

Due to recent advances in ubiquitous sensing technologies, it is now possible to record fine-grained human activities more precisely than ever before. With this rich behavioral data, we can not only understand someone's activities at the individual level but also identify the entities that influence his or her daily life. Identifying these influences can have a profound impact on the next generation of health care, where these influence channels can be exploited to deliver advice for individual well-being.

In this project, we will try to model face-to-face conversations with the help of social network data. We want to detect whether someone's social ties influence his or her social interactions and, if so, to what degree. Our focus is on face-to-face conversations because they are the primary medium of human communication, and consequently a powerful channel for giving advice to somebody.

Description of Machine Learning Technique

As stated above, in order to model conversations, we want to find the factors that influence them. There can be many influences on somebody's conversations, and social ties are one of them. As an example of another type of influence, whether somebody will be in a conversation at time $t$ largely depends on whether she was in a conversation with a talkative person at time $t-1$.

Although we are predicting human conversations using social networks, we want to use both a person's talkativeness and social ties. We do so because we would like to quantify how much of somebody's conversational behavior is due to the influence of the social network and how much is due to speaking with a talkative person. Separating these factors helps us find appropriate channels when we want to pass sensitive information to somebody.

In this project, the social network structure is known a priori; it was derived from Bluetooth proximity, previous conversations, and self-reported surveys. From our dataset, we also know whether somebody is talking at a given time, so we can find conversations using the techniques described in [2].

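We will model human conversations using multiple interacting Markov chains. We will use the boolean variable $x_t^{(i)}$, which is 1 when subject $i$ is in a conversation at time $t$ and 0 when she is not. Let us consider conversations between $i$ and $j$. To predict whether $i$ is in a conversation at time $t$, we will model $P(x_t^{(i)} \mid x_{t-1}^{(j)}, w_{ij})$, i.e., the probability of $x_t^{(i)}$ conditioned on $x_{t-1}^{(j)}$ and the social tie strength between $i$ and $j$, denoted by $w_{ij}$. To compute the high-dimensional conditional probability table needed for $P(x_t^{(i)} \mid x_{t-1}^{(1)}, \dots, x_{t-1}^{(N)})$, where $N$ is the number of subjects, we will use the method proposed by Saul and Jordan [1], in which the distribution of one variable conditioned on many others is approximated as a convex combination of pairwise conditional tables. We follow their approach and model $P(x_t^{(i)} \mid x_{t-1}^{(1)}, \dots, x_{t-1}^{(N)})$ as a mixture of conditional distributions as follows:

$$P(x_t^{(i)} \mid x_{t-1}^{(1)}, \dots, x_{t-1}^{(N)}) = \sum_{j=1}^{N} \alpha_{ij} \, P(x_t^{(i)} \mid x_{t-1}^{(j)}, w_{ij}), \qquad \alpha_{ij} \ge 0, \quad \sum_{j=1}^{N} \alpha_{ij} = 1 \tag{1}$$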
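To make the convex-combination idea concrete, the following is a minimal sketch, not the project's implementation. It assumes binary conversation states and uses two hypothetical names: `alpha[i][j]` for the mixing weight of subject j on subject i (each row sums to 1), and `pairwise[i][j][s]` for the probability that i is in a conversation now given that j was in state s at the previous time step. All numbers are made up for illustration.

```python
def conversation_prob(i, prev_states, alpha, pairwise):
    """Mixture estimate of P(subject i is in a conversation now).

    prev_states[j]   : 0/1 conversation state of subject j at the previous step
    alpha[i][j]      : mixing weight of subject j on subject i (row sums to 1)
    pairwise[i][j][s]: P(i in conversation now | j was in state s before)
    """
    return sum(
        alpha[i][j] * pairwise[i][j][prev_states[j]]
        for j in range(len(prev_states))
    )


# Hypothetical numbers for three subjects (not taken from the dataset).
alpha = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.25, 0.25, 0.5],
]
SELF = (0.1, 0.8)   # assumed table for a subject's own previous state
OTHER = (0.2, 0.6)  # assumed table for another subject's previous state
pairwise = [[SELF if i == j else OTHER for j in range(3)] for i in range(3)]

prev_states = [0, 1, 1]
p = conversation_prob(0, prev_states, alpha, pairwise)
print(round(p, 3))  # 0.3
```

Because each pairwise table holds only two numbers per pair, the number of parameters grows with the number of pairs of subjects rather than exponentially with the number of subjects, which is the motivation for the approximation.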

We will learn the parameters of (1), which we call the influence parameters, so as to maximize the likelihood of our data. The parameters thus learned will give us the appropriate influence level for a specific social tie. However, inference with this model is challenging: inference even in an ordinary Markov chain has quadratic complexity in the number of states, and devising an inference algorithm for a dependency structure as general as (1) is harder still. We will attempt to solve this problem in this project.
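One common way to maximize the likelihood of mixture weights is expectation-maximization (EM). The sketch below is an illustration under strong assumptions, not the learning procedure of this proposal: it estimates the mixing weights for a single subject while holding the pairwise tables fixed, and the function names, the toy state sequence, and the table values are all hypothetical.

```python
def obs_likelihood(obs, p_one):
    """Probability of a binary observation when P(obs = 1) is p_one."""
    return p_one if obs == 1 else 1.0 - p_one


def em_influence_weights(i, states, pairwise_i, n_iter=50):
    """EM estimate of the mixing weights on subject i (tables held fixed).

    states[t][j]    : 0/1 conversation state of subject j at time t
    pairwise_i[j][s]: P(i in conversation now | j was in state s before),
                      assumed to lie strictly between 0 and 1
    """
    n = len(states[0])
    steps = len(states) - 1
    alpha = [1.0 / n] * n  # start from uniform weights
    for _ in range(n_iter):
        totals = [0.0] * n
        for t in range(1, len(states)):
            # E-step: how responsible is each subject j for this transition?
            joint = [
                alpha[j] * obs_likelihood(states[t][i], pairwise_i[j][states[t - 1][j]])
                for j in range(n)
            ]
            norm = sum(joint)
            for j in range(n):
                totals[j] += joint[j] / norm
        # M-step: new weight is the average responsibility over transitions.
        alpha = [total / steps for total in totals]
    return alpha


# Toy sequence in which subject 0 copies subject 1 with a one-step lag.
states = [[0, 0], [0, 1], [1, 1], [1, 0], [0, 0], [0, 1], [1, 1], [1, 0]]
pairwise_0 = [(0.2, 0.8), (0.2, 0.8)]  # assumed fixed tables for subject 0
weights = em_influence_weights(0, states, pairwise_0)
print(weights)
```

On this toy sequence the previous state of subject 1 explains every transition of subject 0, so the estimated weight on subject 1 ends up larger than the weight on subject 0 itself. A full learner would also re-estimate the pairwise tables.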

The Dataset

The dataset we will be using comes from an experiment conducted at MIT [3] in which 23 people (a mix of students, faculty, and administrative support staff) agreed to wear the sociometer, a wearable data acquisition board [3], [4]. The device stored audio from a single microphone sampled at 8 kHz. During the experiment, the users wore the device both indoors and outdoors for six hours a day for 11 days.

Accomplishment at the Milestone

Learn the parameters of the model and devise an algorithm for inference.

References

[1] Saul, L.K. and M. Jordan, "Mixed Memory Markov Models". Machine Learning, 1999. 37: p. 75-85.
[2] Wyatt, D., T. Choudhury, and J. Bilmes, "Conversation Detection and Speaker Segmentation in Privacy Sensitive Situated Speech Data". Interspeech 2007. August 2007, Antwerp, Belgium.
[3] Choudhury, T., "Sensing and Modeling Human Networks", Doctoral Thesis in Media Arts and Sciences. MIT. Cambridge, MA, 2003.
[4] Gerasimov, V., T. Selker, and W. Bender, "Sensing and Effecting Environment with Extremity Computing Devices". Motorola Offspring, 2002. 1(1).