Dartmouth College Computer Science Technical Report series
Abstract:
The need for a more effective similarity measure is growing as a
result of the astonishing amount of information being placed
online. Most existing similarity measures are defined by empirically
derived formulas and cannot easily be extended to new applications. We
present a pairwise document similarity measure based on information
theory, along with corpus-dependent and corpus-independent applications
of this measure. When ranked alongside existing similarity measures over
TREC FBIS data, our corpus-dependent information-theoretic similarity
measure ranked first.
Note:
Undergraduate Honors Thesis. Advisor: Jay Aslam.
Bibliographic citation for this report:
Jeffrey D. Isaacs and Javed A. Aslam,
"Investigating Measures for Pairwise Document Similarity."
Dartmouth Computer Science Technical Report PCS-TR99-357, June 1999.
To receive a paper copy of a report by mail, send your address and the TR number to reports AT cs.dartmouth.edu.
Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Technical reports collection maintained by David Kotz.