k-experts – Online Policies and Fundamental Limits

10/15/2021
by   Samrat Mukhopadhyay, et al.

This paper introduces and studies the k-experts problem – a generalization of the classic Prediction with Expert's Advice (i.e., the Experts) problem. Unlike the Experts problem, where the learner chooses exactly one expert, in the k-experts problem the learner selects a subset of k experts from a pool of N experts at each round. The reward obtained by the learner at any round depends on the rewards of the selected experts. The k-experts problem arises in many practical settings, including online ad placement, personalized news recommendation, and paging. Our primary goal is to design an online learning policy with small regret. In this pursuit, we propose SAGE (Sampled Hedge) – a framework for designing efficient online learning policies by leveraging statistical sampling techniques. We show that, for many related problems, SAGE improves upon the state-of-the-art bounds for regret and computational complexity. Furthermore, going beyond the notion of regret, we characterize the mistake bounds achievable by online learning policies for a class of stable loss functions. We conclude the paper by establishing a tight regret lower bound for a variant of the k-experts problem and carrying out experiments with standard datasets.
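To make the k-experts setting concrete, here is a minimal toy sketch of a Hedge-style policy for it: maintain one multiplicative weight per expert, pick the k heaviest experts each round, and update all weights from the observed rewards. The function name, the deterministic top-k selection, and the sum-of-selected-rewards aggregation are illustrative assumptions for this sketch only; they are not the paper's SAGE policy, which uses statistical sampling.

```python
import math

def hedge_k_experts(rewards, k, eta=0.5):
    """Toy Hedge-style baseline for the k-experts setting (illustrative,
    not the paper's SAGE algorithm).

    rewards: list of rounds, each a list of per-expert rewards in [0, 1]
    k:       number of experts selected per round
    eta:     learning rate for the multiplicative-weights update
    Returns the total reward collected, taking the learner's per-round
    reward to be the sum of the selected experts' rewards (one simple
    choice of reward aggregation).
    """
    n = len(rewards[0])
    w = [1.0] * n  # uniform initial weights
    total = 0.0
    for r in rewards:
        # Select the k currently heaviest experts (a deterministic
        # stand-in for sampling a k-subset from the weight distribution).
        chosen = sorted(range(n), key=lambda i: w[i], reverse=True)[:k]
        total += sum(r[i] for i in chosen)
        # Multiplicative-weights update using every expert's reward
        # (full-information feedback).
        w = [w[i] * math.exp(eta * r[i]) for i in range(n)]
        # Normalize to keep the weights numerically stable.
        z = sum(w)
        w = [x / z for x in w]
    return total
```

With k = 1 this reduces to a greedy variant of the ordinary Experts problem; the point of the sketch is only to show how the per-round decision becomes a k-subset rather than a single expert.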
