Linear-time Computation of DAWGs, Symmetric Indexing Structures, and MAWs for Integer Alphabets

07/04/2023
by   Yuta Fujishige, et al.
0

The directed acyclic word graph (DAWG) of a string y of length n is the smallest (partial) DFA which recognizes all suffixes of y with only O(n) nodes and edges. In this paper, we show how to construct the DAWG for the input string y from the suffix tree for y, in O(n) time for integer alphabets of polynomial size in n. In so doing, we first describe a folklore algorithm which, given the suffix tree for y, constructs the DAWG for the reversed string of y in O(n) time. Then, we present our algorithm that builds the DAWG for y in O(n) time for integer alphabets, from the suffix tree for y. We also show that a straightforward modification to our DAWG construction algorithm leads to the first O(n)-time algorithm for constructing the affix tree of a given string y over an integer alphabet. Affix trees are a text indexing structure supporting bidirectional pattern searches. We then discuss how our constructions can lead to linear-time algorithms for building other text indexing structures, such as linear-size suffix tries and symmetric CDAWGs in linear time in the case of integer alphabets. As a further application to our O(n)-time DAWG construction algorithm, we show that the set 𝖬𝖠𝖶(y) of all minimal absent words (MAWs) of y can be computed in optimal, input- and output-sensitive O(n + |𝖬𝖠𝖶(y)|) time and O(n) working space for integer alphabets.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro