Sampling connected subgraphs: nearly-optimal mixing time bounds, nearly-optimal ε-uniform sampling, and perfect uniform sampling

07/23/2020
by   Marco Bressan, et al.
0

We study the connected subgraph sampling problem: given an integer k ≥ 3 and a simple graph G=(V,E), sample a connected induced k-node subgraph of G (also called k-graphlet) uniformly at random. This is a fundamental graph mining primitive, with applications in social network analysis and bioinformatics. In this work we give several results that improve substantially upon the current state of the art: 1) We settle the mixing time of the graphlet random walk. We show it is Θ̃(t(G) ·ρ(G)^k-1), where t(G) is the mixing time of G and ρ(G) is the ratio between its maximum and minimum degree. As a consequence, we obtain state-of-the-art running time bounds for random-walk graphlet sampling. 2) We give the first efficient algorithm for perfectly uniform graphlet sampling. The algorithm has a preprocessing phase that uses time O(n + m logΔ) and space O(n), and a sampling phase that uses k^O(k)O(logΔ) time per sample. 3) We give a nearly-optimal algorithm for ϵ-uniform graphlet sampling. The preprocessing runs in time k^O(k)O(1/ϵ n log n ), which is sublinear when G is not too sparse, and space O(n). The sampling takes k^O(k)O((1/ϵ)^10log1/ϵ) expected time per sample. This is nearly optimal, as any ϵ-uniform algorithm has worst-case expected running time Ω(n).

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro