Source: School of Statistics

October 28 | Dong Yuchao (董玉超): Randomized Optimal Stopping Problem in Continuous Time and Reinforcement Learning Algorithm

Published: 2025-10-24

Time: Tuesday, October 28, 2025, 9:30-10:30

Venue: Room A1714, Science Building, Putuo Campus

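Speaker: Dong Yuchao (董玉超), Associate Research Fellow, Tongji University

Host: Li Danping (李丹萍), Professor, East China Normal University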

Abstract:

In this paper, we study the optimal stopping problem in the so-called exploratory framework, in which the agent acts randomly conditional on the current state and a regularization term is added to the reward functional. This transformation reduces the optimal stopping problem to a standard optimal control problem. For the American put option model, we derive the associated HJB equation and prove its solvability. Furthermore, we give a convergence rate for policy iteration and compare our solution with that of the classical American put option problem. Our results indicate that the choice of the temperature constant involves a trade-off between convergence rate and bias. Based on the theoretical analysis, we design a reinforcement learning algorithm and present numerical results for several models.
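As a rough illustration of the bias side of the temperature trade-off mentioned in the abstract, the sketch below prices an American put on a discrete-time binomial tree, with and without an entropy bonus for randomized stopping. It is not the continuous-time formulation or the reinforcement learning algorithm of the talk. The function name `american_put`, all parameter values, and the choice to scale the entropy weight by the time step are assumptions made only for this example. At each node the agent stops with probability p and receives an extra reward proportional to the binary entropy of p, which turns the usual max(stop, continue) into a soft maximum.

```python
import numpy as np

def american_put(temp, s0=100.0, strike=100.0, r=0.05, sigma=0.2, T=1.0, n=200):
    """Binomial (CRR) American put; temp > 0 adds an entropy bonus for
    randomized stopping, temp == 0 recovers the classical price."""
    dt = T / n
    u = np.exp(sigma * np.sqrt(dt))
    d = 1.0 / u
    q = (np.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
    disc = np.exp(-r * dt)

    j = np.arange(n + 1)                     # number of up moves at maturity
    value = np.maximum(strike - s0 * u**j * d**(n - j), 0.0)

    for step in range(n - 1, -1, -1):
        j = np.arange(step + 1)
        stop = np.maximum(strike - s0 * u**j * d**(step - j), 0.0)
        cont = disc * (q * value[1:] + (1.0 - q) * value[:-1])
        best = np.maximum(stop, cont)        # classical hard maximum
        if temp > 0.0:
            lam = temp * dt                  # assumed per-step entropy weight
            # Soft maximum lam * log(exp(stop/lam) + exp(cont/lam)),
            # written in a numerically stable form.
            best = best + lam * np.log1p(np.exp(-np.abs(stop - cont) / lam))
        value = best
    return float(value[0])

classical = american_put(0.0)
soft = american_put(1.0)
softer = american_put(5.0)
print(classical, soft, softer)
```

In this discretization the regularized price is never below the classical one, and the gap is at most temp * T * log 2, so a smaller temperature gives a smaller bias. The convergence-rate side of the trade-off concerns policy iteration and is not shown here.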

About the speaker:

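Dr. Dong Yuchao received a PhD from the School of Mathematical Sciences at Fudan University and then carried out postdoctoral research at Fudan University, the University of Angers (France), and the National University of Singapore. Dr. Dong joined the School of Mathematical Sciences at Tongji University in January 2021. Dr. Dong's research focuses on stochastic optimal control theory and its applications in mathematical finance, with work published in well-known international journals including AMO, SICON, SIAP, MaFi, and SIMA.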