A new regret analysis for Adam-type algorithms

03/21/2020
by Ahmet Alacaoglu, et al.

In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter β_1 (typically between 0.9 and 0.99). In theory, however, regret guarantees for online convex optimization require a rapidly decaying schedule β_1 → 0. We show that this requirement is an artifact of the standard analysis and propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant β_1, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of algorithms and settings.
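For readers unfamiliar with the update the abstract refers to, the sketch below shows a minimal AMSGrad-style step with a constant β_1, applied to a stream of online convex (quadratic) losses. It is an illustrative implementation only, not the paper's exact algorithm or step-size schedule: the function name `amsgrad_step`, the α/√t step size, and the quadratic loss in the usage loop are assumptions made for the example.

```python
import numpy as np

def amsgrad_step(x, grad, state, alpha=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad-style update with a constant first-moment parameter beta1.

    `state` holds the first moment m, second moment v, its running maximum
    v_hat, and the iteration counter t. The alpha / sqrt(t) step size is an
    illustrative choice, not the schedule analyzed in the paper.
    """
    m, v, v_hat, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad           # first moment, constant beta1
    v = beta2 * v + (1 - beta2) * grad ** 2      # second moment
    v_hat = np.maximum(v_hat, v)                 # AMSGrad: monotone second moment
    step = alpha / np.sqrt(t)                    # decaying step size
    x = x - step * m / (np.sqrt(v_hat) + eps)
    return x, (m, v, v_hat, t)


# Usage: online convex optimization on a stream of quadratic losses.
rng = np.random.default_rng(0)
d = 5
x = np.zeros(d)
state = (np.zeros(d), np.zeros(d), np.zeros(d), 0)
for _ in range(1000):
    target = rng.normal(size=d)
    grad = x - target            # gradient of 0.5 * ||x - target||^2
    x, state = amsgrad_step(x, grad, state)
```

Note that β_1 stays fixed at 0.9 throughout, which is exactly the practical regime whose regret guarantees the paper establishes.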
