Fanglue Subject Navigation

Search results: 1-3 of 3 records found for "Management Regret Bounds". Query time: 0.093 seconds.

Regret Bounds for Reinforcement Learning with Policy Advice (keywords: Regret Bounds; Reinforcement Learning; Policy Advice) 2013/6/13

In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with p...


Further Optimal Regret Bounds for Thompson Sampling (keywords: Further Optimal Regret Bounds; Thompson Sampling) 2012/11/23

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems. It is a randomized algorithm based on Bayesian ideas, and has recently generated significant interest after several s...


Regret Bounds for Restless Markov Bandits (keywords: Regret Bounds; Restless Markov Bandits) 2012/11/23

We consider the restless Markov bandit problem, in which the state of each arm evolves according to a Markov process independently of the learner's actions. We suggest an algorithm that after $T$ step...

