In-database connected component analysis

02/26/2018
by Harald Bögeholz, et al.

We describe a Big Data-practical, SQL-implementable algorithm for efficiently determining connected components of graph data stored in a Massively Parallel Processing (MPP) relational database. The algorithm is a linear-space, randomised algorithm that always terminates with the correct answer but has a stochastic running time: for any ϵ > 0 and any input graph G = ⟨V, E⟩, it terminates after O(log |V|) SQL queries with probability at least 1 − ϵ, which we show empirically to translate to a quasi-linear runtime in practice.
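To make concrete what an SQL-implemented connected-components computation looks like, here is a minimal sketch that uses Python's standard `sqlite3` module. It is not the randomised algorithm of the paper: it is naive min-label propagation, swapped in purely for illustration, and it can need a number of rounds proportional to the graph's diameter rather than a logarithmic number of queries. The table names `edge`, `label` and `new_label` and the function `connected_components` are hypothetical, and vertices with no incident edge are not represented.

```python
import sqlite3


def connected_components(edges):
    """Map each vertex to the smallest vertex id in its component.

    Illustrative only: naive min-label propagation expressed as SQL,
    NOT the paper's randomised algorithm.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE edge (u INTEGER, v INTEGER)")
    # Store every edge in both directions so one join reaches all neighbours.
    rows = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    cur.executemany("INSERT INTO edge VALUES (?, ?)", rows)
    # Each vertex starts out labelled with its own id.
    cur.execute(
        "CREATE TABLE label AS SELECT DISTINCT u AS v, u AS comp FROM edge"
    )
    while True:
        # One round: each vertex takes the minimum of its own label
        # and the labels of its neighbours.
        cur.execute(
            """
            CREATE TABLE new_label AS
            SELECT l.v AS v, MIN(l.comp, MIN(n.comp)) AS comp
            FROM label l
            JOIN edge e ON e.u = l.v
            JOIN label n ON n.v = e.v
            GROUP BY l.v, l.comp
            """
        )
        # Count how many labels decreased in this round.
        cur.execute(
            """
            SELECT COUNT(*)
            FROM label l JOIN new_label m ON m.v = l.v
            WHERE m.comp < l.comp
            """
        )
        changed = cur.fetchone()[0]
        cur.execute("DROP TABLE label")
        cur.execute("ALTER TABLE new_label RENAME TO label")
        if changed == 0:
            break  # fixed point: labels are constant on each component
    result = dict(cur.execute("SELECT v, comp FROM label"))
    conn.close()
    return result


if __name__ == "__main__":
    print(connected_components([(1, 2), (2, 3), (4, 5)]))
```

On a long path graph this sketch needs roughly as many rounds as the path has vertices, which is the kind of query count that the bound stated in the abstract is meant to avoid.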
