Chang, W. I., Lampe, J. (1992) Theoretical and Empirical Comparisons of Approximate String Matching Algorithms. Proceedings of the Third Annual Symposium on Combinatorial Pattern Matching, 644. pp. 175-184. ISSN 0302-9743
Abstract
We study in depth a model of non-exact pattern matching based on edit distance, which is the minimum number of substitutions, insertions, and deletions needed to transform one string of symbols to another. More precisely, the k differences approximate string matching problem specifies a text string of length n, a pattern string of length m, the number k of differences (substitutions, insertions, deletions) allowed in a match, and asks for all locations in the text where a match occurs. We have carefully implemented and analyzed various O(kn) algorithms based on dynamic programming (DP), paying particular attention to dependence on b, the alphabet size. An empirical observation on the average values of the DP tabulation makes apparent each algorithm's dependence on b. A new algorithm is presented that computes far fewer entries of the DP table. In practice, its speedup over the previous fastest algorithm is 2.5X for a binary alphabet, 4X for a four-letter alphabet, and 10X for a twenty-letter alphabet. We give a probabilistic analysis of the DP table in order to prove that the expected running time of our algorithm (as well as an earlier "cut-off" algorithm due to Ukkonen) is O(kn) for random text. Furthermore, we give a heuristic argument that our algorithm is O(kn/(square-root b - 1)) on the average, when alphabet size is taken into consideration.
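To make the problem statement concrete, the following is a minimal Python sketch of the k differences problem. It is not the new algorithm announced in the abstract, whose details are not given in this record. The first function is the textbook column-by-column DP, which fills all m entries for each of the n text positions. The second is a cut-off variant in the spirit of the Ukkonen algorithm the abstract mentions, written from the standard description of that idea: it computes each column only down to one row past the last entry that is at most k. The function names `k_differences` and `k_differences_cutoff` are hypothetical and chosen only for illustration.

```python
def k_differences(pattern, text, k):
    """Textbook DP: return 1-based text positions where a match with
    at most k differences ends. Fills every entry of every column."""
    m = len(pattern)
    prev = list(range(m + 1))            # column 0: D[i][0] = i
    ends = []
    for j, tc in enumerate(text, start=1):
        cur = [0] * (m + 1)              # D[0][j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            cur[i] = min(prev[i] + 1,         # text symbol not matched
                         cur[i - 1] + 1,      # pattern symbol not matched
                         prev[i - 1] + cost)  # match or substitution
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


def k_differences_cutoff(pattern, text, k):
    """Cut-off variant (assumed form of the idea): rows below the last
    entry <= k are never computed, since they cannot drop to <= k
    in the next column except for the single row just below it."""
    m = len(pattern)
    col = list(range(m + 1))             # updated in place, column by column
    last = min(k, m)                     # largest row i with col[i] <= k
    ends = []
    for j, tc in enumerate(text, start=1):
        if last < m:
            col[last + 1] = k + 1        # stale entry: only known to exceed k
        top = min(last + 1, m)
        diag = col[0]                    # D[i-1][j-1] for the current row
        for i in range(1, top + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            new = min(col[i] + 1, col[i - 1] + 1, diag + cost)
            diag = col[i]                # save old value before overwriting
            col[i] = new
        last = top
        while col[last] > k:             # col[0] == 0, so this terminates
            last -= 1
        if last == m:
            ends.append(j)
    return ends
```

Both functions report the same end positions. The cut-off version does less work when few DP entries stay at or below k, which is the kind of behaviour the abstract analyses as a function of the alphabet size b.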
Item Type: | Paper |
---|---|
Additional Information: | Meeting Abstract |
Uncontrolled Keywords: | COMMON ANCESTORS |
Subjects: | bioinformatics > computational biology > algorithms; bioinformatics > computational biology |
CSHL Authors: | |
Communities: | CSHL labs |
Depositing User: | Matt Covey |
Date: | 1992 |
Date Deposited: | 18 Sep 2015 14:27 |
Last Modified: | 18 Sep 2015 14:27 |
URI: | https://repository.cshl.edu/id/eprint/31861 |