BIB-VERSION:: CS-TR-v2.0
ID:: ncstrl.dartmouthcs//TR99-352
ENTRY:: June 10, 1999
ORGANIZATION:: Dartmouth College, Computer Science
TITLE:: An Application of Word Sense Disambiguation to Information Retrieval
TYPE:: Technical Report (paper)
REVISION:: 1
AUTHOR:: Whaley, Jason M.
DATE:: June 1999
RETRIEVAL:: For a paper copy, email
RETRIEVAL:: For a paper copy, write to
  Technical Report Librarian
  Department of Computer Science
  Dartmouth College
  6211 Sudikoff Laboratory
  Hanover, NH 03755-3510 USA
RETRIEVAL:: Compressed Postscript at http://www.cs.dartmouth.edu/reports/TR99-352.ps.Z
RETRIEVAL:: PDF at http://www.cs.dartmouth.edu/reports/TR99-352.pdf
ABSTRACT:: The problems of word sense disambiguation and document indexing for information retrieval have been extensively studied. It has been observed that indexing using disambiguated meanings, rather than word stems, should improve information retrieval results. We present a new corpus-based algorithm for performing word sense disambiguation. The algorithm does not need to train on many senses of each word; instead, it uses the probability that certain concepts will occur together. That algorithm is then used to index several corpora of documents. Our indexing algorithm does not generally outperform the traditional stem-based tf.idf model.
NOTE:: Undergraduate Honors Thesis. Advisor: Jay Aslam.
END:: ncstrl.dartmouthcs//TR99-352