research ∙ 04/20/2023
    Optimal Activation of Halting Multi-Armed Bandit Models
We study new types of dynamic allocation problems, the Halting Bandit mod...
          
research ∙ 09/28/2019
    Accelerating the Computation of UCB and Related Indices for Reinforcement Learning
In this paper we derive an efficient method for computing the indices as...
          
research ∙ 09/13/2019
    Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies
In this paper we consider the basic version of Reinforcement Learning (R...
          
research ∙ 10/07/2015
    Asymptotically Optimal Sequential Experimentation Under Generalized Ranking
We consider the classical problem of a controller activating (or samplin...
          
research ∙ 05/12/2015
    Asymptotic Behavior of Minimal-Exploration Allocation Policies: Almost Sure, Arbitrarily Slow Growing Regret
The purpose of this paper is to provide further understanding into the s...
          
research ∙ 05/08/2015
    An Asymptotically Optimal Policy for Uniform Bandits of Unknown Support
Consider the problem of a controller sampling sequentially from a finite...
          
research ∙ 04/22/2015