LCP array construction in external memory

[back to the main page]

We present a C++ implementation of the first external memory algorithm called LCPscan for constructing the LCP array given the text and its suffix array as input. The only previous way to compute the LCP array for data that is bigger than the RAM is to use a suffix array construction algorithm with complex modifications to produce the LCP array as a by-product. Compared to the best prior method, our algorithm needs much less disk space (by more than a factor of three) and is significantly faster. Furthermore, our algorithm can be combined with any suffix array construction algorithm including a better one developed in the future.

Details of the algorithm and experimental evaluation can be found in [1] and [2].
See the README provided in the package for more information.

Note: LCPscan uses the STXXL library for external memory sorting. However, the version used by LCPscan is included in the package and does not require installation. Just follow the compilation instructions from the README included in the package.

News

  • 2016-01-01: New version (0.2.0) available [changelog].

Downloads

Implemented by

  • Juha Kärkkäinen
  • Dominik Kempa

See also

References

  1. Juha Kärkkäinen, Dominik Kempa. LCP Array Construction in External Memory.
    ACM Journal of Experimental Algorithmics (JEA), Volume 21(1), 2016, Article 1.7.
    [ACM]
  2. Juha Kärkkäinen, Dominik Kempa. LCP Array Construction in External Memory.
    In Proc. 13th Symposium on Experimental Algorithms (SEA 2014), Springer, 2014, pp. 412-423.
    [Springer]