Search results: 1-2 of 2 records found for “Logarithmic Regret”. Query time: 0.062 seconds.
Adaptive Learning of Uncontrolled Restless Bandits with Logarithmic Regret
Uncontrolled Restless Bandits; Logarithmic Regret; Optimization and Control
2011/9/15
Abstract: In this paper we consider the problem of learning the optimal policy for the uncontrolled restless bandit problem. In this problem only the state of the selected arm can be observed, the sta...
The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret
The Non-Bayesian Restless Multi-Armed Bandit: Near-Logarithmic Regret
2010/11/24
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activa...