Paper Recommendation Based on Linear Model
CS 74/174 Project Proposal

Wenyuan Feng, Huizhe Li, Zheyu Liu

Jan 24, 2013

Introduction

Researchers and academics have long been burdened by the time-consuming search for papers of interest, which generally involves considering a paper's references, publication date, keywords, and author information. Our goal is automated paper recommendation that assists users in searching for papers. In its learning procedure, the system models user behaviour patterns and determines the parameters of that model. It can then predict which papers the user is likely to be interested in, based on certain attributes of the paper the user has just clicked.

Method

We model a user's preference for papers based on the ratings that user has given. Several attributes of a paper, such as its title, authors, keywords, and references, may contribute to the overall rating. Our model observes these attributes and tries to extract the user's interests by looking at a large amount of data and eliminating the redundant, inconsistent, or bogus information present in the rating of any single paper.

For a particular rated paper, each value of the attributes mentioned above (for example, a specific author or keyword) is weighted according to that paper's rating. The total weight of each value of one of the four attributes is then obtained by summing these contributions over all rated papers and normalizing the result. This weight reflects the extent to which the user is likely to be interested in a paper in which that value is present. Having learned the weights of all known values, we can rate a new paper by searching it for known values, and recommend it if the resulting score exceeds a threshold.
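
The weighting scheme described above can be sketched in Java as follows. This is only a minimal sketch under stated assumptions, not the final implementation: the class and method names (PaperScorer, RatedPaper, train, score, recommend), the encoding of attribute values as "attribute:value" strings, normalization by the sum of absolute weights, and scoring a new paper by adding up the weights of its known values are all hypothetical choices that the proposal leaves open.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the value-weighting model; all names are illustrative. */
class PaperScorer {

    /** A rated paper: its attribute values (e.g. "author:Bishop") and the user's rating. */
    static class RatedPaper {
        final List<String> values;
        final double rating;

        RatedPaper(List<String> values, double rating) {
            this.values = values;
            this.rating = rating;
        }
    }

    /** Learned weight of each known attribute value. */
    private final Map<String, Double> weights = new HashMap<>();

    /** Learning: sum the ratings of all papers containing each value, then normalize. */
    void train(List<RatedPaper> papers) {
        weights.clear();
        for (RatedPaper p : papers) {
            for (String v : p.values) {
                weights.merge(v, p.rating, Double::sum);
            }
        }
        // Assumption: normalize so that the absolute weights sum to 1.
        double total = 0.0;
        for (double w : weights.values()) {
            total += Math.abs(w);
        }
        if (total == 0.0) {
            return; // nothing to normalize
        }
        final double norm = total;
        weights.replaceAll((value, w) -> w / norm);
    }

    /** Testing: add up the weights of the known values found in a new paper. */
    double score(List<String> values) {
        double s = 0.0;
        for (String v : values) {
            s += weights.getOrDefault(v, 0.0); // unknown values contribute nothing
        }
        return s;
    }

    /** Recommend the paper if its score exceeds the threshold. */
    boolean recommend(List<String> values, double threshold) {
        return score(values) > threshold;
    }
}
```

For example, after training on one paper rated 4 with values "author:Bishop" and "keyword:learning" and another rated 1 with values "keyword:learning" and "keyword:graphs", the summed ratings are 4, 5, and 1, which normalize to weights of 0.4, 0.5, and 0.1. The threshold passed to recommend is a free parameter that would have to be tuned on the training data.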

Data Set

A member of our team has experience in this field and collected gigabytes of data while in college. That data, combined with data sets available on the Internet, should be adequate for our project.

Milestone Goal

By Feb. 19th, we hope to:

Collect all the data needed for this project, including the training set and the test set.
Build a system framework in Java. The framework should define the input and output formats for both the learning and testing procedures, and should implement our model with its parameters left undetermined.
Read the literature as we proceed, and perhaps draw new ideas from it.

References

[1] Christopher M. Bishop, “Pattern Recognition and Machine Learning”, Springer, 2006.

[2] Nikolaos F. Matsatsinis, Kleanthi Lakiotaki and Pavlos Delias, “A System Based on Multiple Criteria Analysis for Scientific Paper Recommendation”, 2007.

[3] Wikipedia: http://en.wikipedia.org/wiki/Recommendation_system