October 21 | Chengchun Shi (史成春): Reinforcement Learning in Possibly Nonstationary Environment

Source: School of Statistics | Published: 2022-10-20

Time: 15:00-16:30, October 21, 2022

Venue: Tencent Meeting, ID: 194-278-539

Speaker: Chengchun Shi (史成春), Assistant Professor

Host: Huijuan Ma (马慧娟), Assistant Professor

Abstract:

We consider reinforcement learning (RL) methods in offline nonstationary environments. Many existing RL algorithms in the literature rely on the stationarity assumption that requires the system transition and the reward function to be constant over time. However, the stationarity assumption is restrictive in practice and is likely to be violated in a number of applications, including traffic signal control, robotics and mobile health. In this project, we develop a consistent procedure to test the nonstationarity of the optimal policy based on pre-collected historical data, without additional online data collection. Based on the proposed test, we further develop a sequential change point detection method that can be naturally coupled with existing state-of-the-art RL methods for policy optimisation in nonstationary environments. The usefulness of our method is illustrated by theoretical results, simulation studies, and real data examples. 

About the speaker:

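Chengchun Shi is an Assistant Professor in the Department of Statistics at the London School of Economics and Political Science. He has had nearly 20 first-author, peer-reviewed papers accepted by the top statistics journals AOS, JRSSB, JASA, and JMLR. He has also published papers at the top machine learning conferences ICML and NeurIPS. He currently serves as an associate editor of JRSSB and the Journal of Nonparametric Statistics. His research focuses on developing statistical methods for reinforcement learning. He received the Royal Statistical Society Research Prize in 2021 and has received the IMS Travel Award three times.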