Notice: Seminar by Dr. Yuxi Li of the University of Alberta, Canada, on "Learning Exercise Policies for American Options"
Time: April 29, 10:30 a.m.
Venue: Conference Room 1002
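Speaker: Dr. Yuxi Li, University of Alberta, Canada
Topic: Learning Exercise Policies for American Options
Host: Prof. Liu Nan
All faculty and students are welcome to attend.
Department of Management Science and Engineering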
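About Dr. Yuxi Li
Dr. Yuxi Li received a Ph.D. from the Department of Computing Science at the University of Alberta, Canada, in 2006, with a doctoral thesis on optimizing network performance in uncertain environments. Honors received during the doctoral program include the Izaak Walton Killam Memorial Scholarship, the University of Alberta's most prestigious scholarship. Dr. Li then worked as a postdoctoral fellow in the department's artificial intelligence and machine learning research group. Dr. Li's research covers reinforcement learning (sequential decision making), machine learning, and operations research, together with their applications in financial engineering and related areas. Dr. Li has taken part in several research projects funded by NSERC (the Natural Sciences and Engineering Research Council of Canada, comparable to China's Natural Science Foundation) and other agencies, and has published 17 papers in international conferences and journals, including the top-tier conferences IEEE INFOCOM and AI Statistics, all indexed by EI/ISTP. Dr. Li has also served many times as a reviewer for international conferences and journals.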
Title and Abstract
Title: Learning Exercise Policies for American Options
Abstract: Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods---in particular, least-squares policy iteration (LSPI)---for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longstaff-Schwartz method (LSM), the standard least-squares Monte Carlo algorithm from the finance community. Our empirical results show that the exercise policies discovered by LSPI and FQI gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, we find that for all methods the policies learned from real data generally gain similar payoffs to the policies learned from simulated data. Our work shows that solution methods developed in machine learning can advance the state-of-the-art in an important and challenging application area, while demonstrating that computational finance remains a promising area for future applications of machine learning methods.
Moreover, reinforcement learning also has applications in logistics, for example, the inventory routing problem.
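For readers unfamiliar with the Longstaff-Schwartz method (LSM) named in the abstract, the following is a minimal sketch of the idea, not the speaker's implementation. It assumes geometric Brownian motion for the stock price, a finite set of exercise dates, and a quadratic polynomial as the regression basis; the function name `lsm_american_put` and all parameter values are illustrative assumptions.

```python
import numpy as np

def lsm_american_put(s0=36.0, strike=40.0, r=0.06, sigma=0.2, maturity=1.0,
                     n_steps=50, n_paths=20000, seed=0):
    """Estimate an American put price by least-squares Monte Carlo (sketch)."""
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    disc = np.exp(-r * dt)  # one-step discount factor

    # Simulate risk-neutral GBM paths (assumed dynamics); column j is time (j+1)*dt.
    z = rng.standard_normal((n_paths, n_steps))
    log_inc = (r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    paths = s0 * np.exp(np.cumsum(log_inc, axis=1))

    # Cash flow per path, initialised with the payoff at maturity.
    cash = np.maximum(strike - paths[:, -1], 0.0)

    # Backward induction over the earlier exercise dates.
    for t in range(n_steps - 2, -1, -1):
        cash *= disc  # discount future cash flows back to date t
        exercise = strike - paths[:, t]
        itm = exercise > 0  # regress only on in-the-money paths
        if itm.sum() > 3:
            # Quadratic fit of discounted future cash flow on the stock price
            # approximates the continuation value (basis choice is an assumption).
            coeffs = np.polyfit(paths[itm, t], cash[itm], 2)
            continuation = np.polyval(coeffs, paths[itm, t])
            ex_now = exercise[itm] > continuation
            idx = np.where(itm)[0][ex_now]
            cash[idx] = exercise[itm][ex_now]  # exercise replaces later cash flows

    # Discount from the first exercise date back to time zero and average.
    return disc * cash.mean()

if __name__ == "__main__":
    print(lsm_american_put())
```

LSPI and FQI, the reinforcement learning methods compared in the talk, also learn an exercise rule from sampled paths, but they are not shown here.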