      12/12/2022
    Nearly Minimax Optimal Reinforcement Learning for Linear Markov Decision Processes
We study reinforcement learning (RL) with linear function approximation....
          
      02/28/2022
    Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds
We consider learning a stochastic bandit model, where the reward functio...
          
      10/25/2021