Fanglue Subject Navigation (方略学科导航)

Search results: 1-2 of 2 records found for "Logarithmic Regret". Query time: 0.062 seconds.

Adaptive Learning of Uncontrolled Restless Bandits with Logarithmic Regret | Uncontrolled Restless Bandits; Logarithmic Regret; Optimization and Control | 2011/9/15

Abstract: In this paper we consider the problem of learning the optimal policy for the uncontrolled restless bandit problem. In this problem only the state of the selected arm can be observed, the sta...


The Non-Bayesian Restless Multi-Armed Bandit: a Case of Near-Logarithmic Regret | The Non-Bayesian Restless Multi-Armed Bandit; Near-Logarithmic Regret | 2010/11/24

Abstract: In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activa...

