Abstract: |
  This talk discusses mean-variance problems for finite-horizon continuous-time Markov decision processes. The state and action spaces are general spaces, while the reward functions and transition rates are allowed to be unbounded. Using a first-jump analysis, we convert the variance of the finite-horizon reward into the mean of a finite-horizon reward with new reward functions. We then design a new successive-approximation method, by which we prove the existence of a solution to the Hamilton-Jacobi-Bellman equation and of an optimal policy under suitable growth and compactness-continuity conditions.
|