Bandit Convex Optimisation Revisited: FTRL Achieves Õ(t^1/2) Regret

02/01/2023
by David Young, et al.

We show that a kernel estimator using multiple function evaluations can be easily converted into a sampling-based bandit estimator with expectation equal to the original kernel estimate. Plugging such a bandit estimator into the standard FTRL algorithm yields a bandit convex optimisation algorithm that achieves Õ(t^1/2) regret against adversarial time-varying convex loss functions.
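
To make the conversion idea concrete, here is a minimal sketch under stated assumptions. It is not the paper's kernel construction: it uses generic importance sampling to show how an estimate built from several function evaluations can be replaced by a single randomly chosen evaluation with the same expectation. The names `multi_point_estimate`, `single_point_estimate`, `exact_expectation`, and the toy points, weights, and probabilities are all hypothetical.

```python
import random


def multi_point_estimate(f, points, weights):
    """Estimate that needs one evaluation of f at every point (full information)."""
    return sum(w * f(x) for x, w in zip(points, weights))


def single_point_estimate(f, points, weights, probs, rng):
    """Bandit-style estimate: evaluate f at ONE sampled point only.

    Assumption (illustrative, not from the paper): index i is drawn with
    probability probs[i] > 0 and the result is reweighted by 1 / probs[i].
    """
    i = rng.choices(range(len(points)), weights=probs, k=1)[0]
    return weights[i] * f(points[i]) / probs[i]


def exact_expectation(f, points, weights, probs):
    """Expectation of single_point_estimate, computed by enumerating outcomes."""
    return sum(p * (w * f(x) / p) for x, w, p in zip(points, weights, probs))


# Toy setup (hypothetical values chosen only for illustration).
loss = lambda x: x ** 2
points = [-1.0, 0.0, 2.0]
weights = [0.5, -1.0, 0.5]
probs = [0.25, 0.5, 0.25]

full = multi_point_estimate(loss, points, weights)
expected = exact_expectation(loss, points, weights, probs)
sample = single_point_estimate(loss, points, weights, probs, random.Random(0))
```

Here `expected` matches `full`, while `sample` is a single noisy draw. The price of using one evaluation per round is extra variance, which is the quantity a regret analysis has to control.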
