research
06/27/2022
Parametrically Retargetable Decision-Makers Tend To Seek Power
If capable AI agents are generally incentivized to seek power in service...
          
06/23/2022
On Avoiding Power-Seeking by Artificial Intelligence
We do not know how to align a very intelligent AI agent's behavior with ...
          
06/23/2022
Formalizing the Problem of Side Effect Regularization
AI objectives are often hard to specify properly. Some approaches tackle...
          
06/11/2020
Avoiding Side Effects in Complex Environments
Reward function specification can be difficult, even in simple environme...
          
12/03/2019
Optimal Farsighted Agents Tend to Seek Power
Some researchers have speculated that capable reinforcement learning (RL...
          
02/26/2019