Abstract: |
Generalising the idea of the classical EM algorithm, which is widely used for computing maximum likelihood estimates, we propose an EM-Control (EM-C) algorithm for solving multi-period, finite-time-horizon stochastic control problems. The new algorithm sequentially updates the control policies in each time period using Monte Carlo simulation in a forward-backward manner; in other words, in each iteration the algorithm goes forward in simulation and backward in optimization. Like the EM algorithm, the EM-C algorithm improves performance monotonically in each iteration, which leads to good convergence properties. We demonstrate the effectiveness of the algorithm by solving stochastic control problems in airline revenue management and in the study of real business cycles. |
|