Next Steps

Next, we'd like to finish analyzing the bigram model, tweak the model to improve its performance, and begin analyzing the trigram model.

If time permits, we'd like to account for the message recipient set, message context, and typed letters to see whether any of these attributes improves prediction performance. In practice, both typed letters and message context seem important: topic-specific words are often repeated within a message, and typed letters significantly narrow the set of candidate words. Furthermore, using typed-letter context could convert a number of near hits and near saves into hits and saves. Conditioning on the typed letters might also let us rank the remaining candidates by posterior probability and so shrink the set of predicted words.
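As a rough illustration of how typed letters could narrow a bigram prediction, the sketch below filters the candidate next words by the typed prefix and renormalizes the remaining counts, which gives the posterior over words given both the previous word and the typed letters. The function names (`train_bigrams`, `predict`), the whitespace tokenization, and the toy messages are assumptions made for this example, not the project's actual code or data.

```python
from collections import Counter, defaultdict


def train_bigrams(messages):
    """Count (previous word -> next word) pairs over whitespace-tokenized messages."""
    counts = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict(counts, prev_word, typed_prefix="", k=3):
    """Return up to k (word, probability) pairs for the word following prev_word.

    Only words starting with typed_prefix are kept, and the probabilities are
    renormalized over the survivors, i.e. P(word | prev_word, typed_prefix).
    """
    followers = counts.get(prev_word, Counter())
    candidates = {w: c for w, c in followers.items() if w.startswith(typed_prefix)}
    total = sum(candidates.values())
    if total == 0:
        return []
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
    return [(word, count / total) for word, count in ranked[:k]]


# Hypothetical toy corpus, for illustration only.
messages = [
    "see you at the meeting",
    "see you at the mall",
    "see you at the meeting tomorrow",
    "the model works",
]
counts = train_bigrams(messages)
print(predict(counts, "the"))        # three candidates before any letter is typed
print(predict(counts, "the", "mo"))  # a single candidate after typing "mo"
```

With no letters typed, the model has to spread its probability over every word seen after "the"; after two typed letters only one candidate remains, which is the kind of narrowing that could turn a near hit into a hit.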

Finally, we'd like to explore using the message recipient set to determine the reply-to email address. For example, I might use a jac@cs.dartmouth.edu reply-to address in messages to academic peers and a Gmail reply-to address in messages to family members.
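One simple way this could work is to let each recipient "vote" for the reply-to addresses used in past messages sent to them. The sketch below is only a starting point under that assumption: the function names, the voting rule, and every address other than jac@cs.dartmouth.edu are hypothetical placeholders.

```python
from collections import Counter, defaultdict


def train_reply_to(sent_messages):
    """Build per-recipient vote counts from (recipients, reply_to) pairs."""
    votes = defaultdict(Counter)
    for recipients, reply_to in sent_messages:
        for recipient in recipients:
            votes[recipient.lower()][reply_to] += 1
    return votes


def choose_reply_to(votes, recipients, default):
    """Pick the reply-to address with the most votes across the recipient set.

    Falls back to `default` when no recipient has been seen before; ties are
    broken alphabetically so the result is deterministic.
    """
    tally = Counter()
    for recipient in recipients:
        tally.update(votes.get(recipient.lower(), Counter()))
    if not tally:
        return default
    return max(sorted(tally), key=lambda address: tally[address])


# Hypothetical sent-mail history; only the Dartmouth address comes from the text.
history = [
    (["peer@cs.example.edu"], "jac@cs.dartmouth.edu"),
    (["peer@cs.example.edu", "other@cs.example.edu"], "jac@cs.dartmouth.edu"),
    (["relative@example.com"], "personal@example.com"),
]
votes = train_reply_to(history)
print(choose_reply_to(votes, ["other@cs.example.edu"], default="jac@cs.dartmouth.edu"))
print(choose_reply_to(votes, ["relative@example.com"], default="jac@cs.dartmouth.edu"))
```

Voting per recipient rather than per exact recipient set lets the rule generalize to combinations of recipients that have not appeared together before, at the cost of ignoring interactions between recipients.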

jac 2010-05-11