Using Edit Distance in Point-Pattern Matching

Veli Mäkinen

Department of Computer Science, P.O Box 26 (Teollisuuskatu 23)
FIN-00014 University of Helsinki, Finland.

Abstract. Edit distance is a powerful measure of similarity in string matching, measuring the minimum amount of insertions, deletions, and substitutions to convert a string into another string. This measure is often contrasted with time warping in speech processing, that measures how close two trajectories are by allowing compression and expansion operations on time scale. Time warping can be easily generalized to measure the similarity between 1D point-patterns (ascending lists of real values), as the difference between points (i) and (i-1) in a point-pattern can be considered as the value of a trajectory at the time i. However, we show that edit distance is more natural choice, and derive a measure by calculating the minimum amount of space needed to insert and delete between points to convert a point-pattern into another. We show that this measure defines a metric. We also define a substitution operation such that the distance calculation automatically separates the points into matching and mismatching points. The algorithms are based on dynamic programming. The main motivation for these methods is two and higher dimensional point-pattern matching, and therefore we generalize these methods into the 2D case, and show that this generalization leads to an NP-complete problem. There is also applications for the 1D case; we discuss shortly the matching of tree ring sequences in dendrochronology.