Computing Dense Clusters On-line for Information Organization Dartmouth Technical Report PCS-TR97-324 J. Aslam K. Pelekhov Daniela Rus Date: October 1997 URL (compressed postscript): (128KB) URL (PDF): (392KB) Abstract: We present and analyze the off-line star algorithm for clustering static information systems and the on-line star algorithm for clustering dynamic information systems. These algorithms partition a document collection into a number of clusters that is naturally induced by the collection. We show a lower bound on the accuracy of the clusters produced by these algorithms. We use the random graph model to show that both star algorithms produce correct clusters in time Theta(V + E). Finally, we provide data from extensive experiments. Note: Submitted to the 1998 SIGIR Conference.