Multi-Agent Low-Dimensional Linear Bandits

07/02/2020
by   Ronshee Chawla, et al.
0

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector θ^* ∈ℝ^d. The side information consists of a finite collection of low-dimensional subspaces, one of which contains θ^*. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other, and each agent plays a projected variant of LinUCB on the corresponding (low-dimensional) subspace. Through a combination of collaborative best subspace identification, and per-agent learning of an unknown vector in the corresponding low-dimensional subspace, we show that the per-agent regret is much smaller than the case when agents do not communicate. By collaborating to identify the subspace containing θ^*, we show that each agent effectively solves an easier instance of the linear bandit (compared to the case of no collaboration), thus leading to the reduced per-agent regret. We finally complement these results through simulations.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset
Success!
Error Icon An error occurred

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro