Dartmouth College Computer Science Technical Report series 
CS home TR home TR search TR listserv 

By author:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z  
By number:  2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1998, 1997, 1996, 1995, 1994, 1993, 1992, 1991, 1990, 1989, 1988, 1987, 1986 
Abstract:
Given a collection of strings S={s_1, ..., s_n} over an alphabet \Sigma, a superstring \alpha of S is a string containing each s_i as a substring; that is, for each i, 1<=i<=n, \alpha contains a block of s_i consecutive characters that match s_i exactly. The shortest superstring problem is the problem of finding a superstring \alpha of minimum length.
The shortest superstring problem has applications in both data compression and computational biology. In data compression, the problem is a part of a general model of string compression proposed by Gallant, Maier and Storer (JCSS '80). Much of the recent interest in the problem is due to its application to DNA sequence assembly.
The problem has been shown to be NPhard; in fact, it was shown by Blum et al.(JACM '94) to be MAX SNPhard. The first O(1)approximation was also due to Blum et al., who gave an algorithm that always returns a superstring no more than 3 times the length of an optimal solution. Several researchers have published results that improve on the approximation ratio; of these, the best previous result is our algorithm ShortString, which achieves a 2 3/4approximation (WADS '95).
We present our new algorithm, GShortString, which achieves a ratio of 2 2/3. It generalizes the ShortString algorithm, but the analysis differs substantially from that of ShortString. Our previous work identified classes of strings that have a nested periodic structure, and which must be present in the worst case for our algorithms. We introduced machinery to descibe these strings and proved strong structural properties about them. In this paper we extend this study to strings that exhibit a more relaxed form of the same structure, and we use this understanding to obtain our improved result.
Bibliographic citation for this report: [plain text] [BIB] [BibTeX] [Refer]
Or copy and paste:
Chris Armen and
Clifford Stein,
"A 22/3 Approximation for the Shortest Superstring Problem."
Dartmouth Computer Science Technical Report PCSTR95262,
June 1995.
Notify me about new tech reports.
To receive paper copy of a report, by mail, send your address and the TR number to reports AT cs.dartmouth.edu
Copyright notice: The documents contained in this server are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a noncommercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Technical reports collection maintained by David Kotz.