Special Session 118: Recent advances in mathematical finance

Reinforcement learning for optimal constant proportion portfolio management

Jorge P Zubelli
Khalifa University
United Arab Emirates
Co-Author(s): Giorgio Consigli
Abstract:
We present a reinforcement learning (RL) approach to a multi-period optimization problem in which the portfolio manager seeks an optimal constant proportion portfolio strategy by minimizing a tail risk measure consistent with second-order stochastic dominance (SSD) principles. As the risk measure, we consider the Interval Conditional Value-at-Risk (ICVaR). By including the ICVaR in the reward function of an RL method, we show that an optimal fixed-mix policy can be derived as the solution of short- to medium-term allocation problems through an accurate specification of the learning parameters under general statistical assumptions. The methodology is tested in- and out-of-sample on market data and shows good performance relative to the S&P 500, adopted as the benchmark policy. The talk is based on joint work with Giorgio Consigli and Alvaro Gomez, which appeared in the journal Engineering Applications of Artificial Intelligence.
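
As a rough illustration of the ingredients named in the abstract, the following Python sketch computes the terminal wealth of a fixed-mix (constant proportion) strategy and a tail-risk-based reward. It is not the authors' implementation. The function names, the confidence levels alpha and beta, and in particular the empirical interval CVaR used here (the mean loss between two quantile levels) are assumptions made for illustration; the paper's exact ICVaR definition and reward specification may differ.

```python
import numpy as np

def fixed_mix_wealth(returns, weights, w0=1.0):
    """Terminal wealth of a constant-proportion strategy rebalanced every period.

    returns: array of shape (T, n) with simple asset returns per period.
    weights: array of shape (n,) with fixed portfolio proportions summing to 1.
    """
    returns = np.asarray(returns, dtype=float)
    weights = np.asarray(weights, dtype=float)
    period_growth = 1.0 + returns @ weights  # portfolio gross return each period
    return w0 * np.prod(period_growth)

def interval_cvar(losses, alpha, beta):
    """ASSUMED empirical interval CVaR: mean loss between the alpha and beta quantiles."""
    losses = np.asarray(losses, dtype=float)
    lo, hi = np.quantile(losses, [alpha, beta])
    band = losses[(losses >= lo) & (losses <= hi)]
    return band.mean()

def reward(scenarios, weights, alpha=0.90, beta=0.99):
    """Hypothetical reward: negative interval CVaR of terminal losses.

    scenarios: array of shape (S, T, n), one return path per scenario.
    """
    terminal = np.array([fixed_mix_wealth(path, weights) for path in scenarios])
    losses = 1.0 - terminal  # loss relative to unit initial wealth
    return -interval_cvar(losses, alpha, beta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic scenarios (illustrative only, not market data): 500 paths, 12 periods, 3 assets.
    scenarios = rng.normal(0.005, 0.04, size=(500, 12, 3))
    for w in ([1.0, 0.0, 0.0], [1 / 3, 1 / 3, 1 / 3]):
        print(w, reward(scenarios, np.array(w)))
```

An RL agent, or any other search procedure, would then compare candidate weight vectors by this reward and prefer the fixed mix with the smallest tail risk.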