AbstractWe investigate the time complexity of the pattern matching problem for strings which are succinctly described in terms of straight-line programs, in which the constants are symbols and the only operation is the concatenation. Most strings of descriptive size

nare of exponential length with respect ton. A complicated algorithm for theequality-test(testing if two succinctly described strings are the same) in O(n^{4}) time was constructed by Plandowski. Our main aim is to extend this result and we show that a much stronger problem of thepattern-matchingfor succinctly described strings can be solved with similar complexity: in O(n^{4}logn) time. However, Plandowski's algorithm is not used in this paper, and our algorithm gives independent constructive proof. The crucial point in our algorithm is the succinct representation of all periods of a (possibly long) string described in this manner. We also show a (rather straightforward) result that a very simple extension of the pattern-matching problem for shortly described strings is NP-complete.

Categories and Subject Descriptors: E.4 [Coding and Information Theory]; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems

Additional Key Words and Phrases: pattern matching, text search, compression, straight-line program

Selected references

- Martin Farach and Mikkel Thorup. String matching in Lempel-Ziv compressed strings. In Proceedings of the Twenty-Seventh Annual ACM Symposium on the Theory of Computing, pages 703-712, Las Vegas, Nevada, 29 May-1 June 1995.