Engineering Rank/Select Data Structures for Big-Alphabet Strings

05/23/2023
by   Diego Arroyuelo, et al.
0

Big-alphabet strings are common in several scenarios such as information retrieval and natural-language processing. The efficient storage and processing of such strings usually introduces several challenges that are not witnessed in smaller-alphabets strings. This paper studies the efficient implementation of one of the most effective approaches for dealing with big-alphabet strings, namely the alphabet-partitioning approach. The main contribution is a compressed data structure that supports the fundamental operations rank and select efficiently. We show experimental results that indicate that our implementation outperforms the current realizations of the alphabet-partitioning approach. In particular, the time for operation select can be improved by about 80 alphabet-partitioning schemes. We also show the impact of our data structure on several applications, like the intersection of inverted lists (where improvements of up to 60 representation of run-length compressed strings, and the distributed-computation processing of rank and select operations.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro