
Special Session 37: Recent development of stochastic optimal control, applications and deep learning methods

Exponential Convergence of Relative Value Iteration in Ergodic Control Problems in Diffusions

Sumith Reddy Anugu

TU Ilmenau
Germany

Co-Author(s): Sumith Reddy Anugu (TU Ilmenau), Guodong Pang (Rice University)

Abstract:

The relative value iteration (RVI) algorithm is a well-known method for numerically approximating the value function and the optimal value of ergodic cost control problems, and thereby for obtaining approximations of optimal controls. However, the rate of convergence of this algorithm is less explored when the state space is infinite, as in diffusions, even under strong conditions such as exponential ergodicity. In this talk, we present recent work in which we establish that a "slightly modified version" of the RVI algorithm for diffusions converges exponentially, under the condition of uniform exponential ergodicity of the diffusion. The proof relies on a weighted semi-norm that takes the same value on any two functions differing by an additive constant. Under this semi-norm, the diffusion semigroup becomes a contraction, uniformly over all Markov controls. A further consequence is that the problem of convergence to the optimal value is effectively "decoupled" from the problem of convergence to the value function, and we analyze the two convergences separately.
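For readers unfamiliar with RVI, the following is a minimal sketch of the textbook algorithm on a finite-state, finite-action average-cost problem. It is not the modified diffusion version discussed in the talk. The function name, the randomly generated model, and the reference state are all assumptions made for illustration. The unweighted span semi-norm used as the stopping criterion is a finite-state analogue of a semi-norm that ignores additive constants, and the two returned quantities (`rho` for the optimal value, `h` for the relative value function) illustrate the two convergences mentioned above.

```python
import numpy as np

def relative_value_iteration(P, c, ref=0, tol=1e-10, max_iter=10_000):
    """Textbook RVI for a finite average-cost MDP (illustrative sketch).

    P: array of shape (A, S, S), P[a, x, y] = transition probability x -> y under action a.
    c: array of shape (S, A), running cost c[x, a].
    ref: reference state whose value is pinned to zero (hypothetical choice).
    """
    h = np.zeros(c.shape[0])
    spans = []
    for _ in range(max_iter):
        # Bellman operator: (Th)(x) = min_a [ c(x, a) + sum_y P(y | x, a) h(y) ]
        q = c + np.einsum("axy,y->xa", P, h)
        Th = q.min(axis=1)
        rho = Th[ref]          # running estimate of the optimal average cost
        h_new = Th - rho       # subtracting the offset keeps h[ref] = 0
        diff = h_new - h
        span = diff.max() - diff.min()  # span semi-norm: zero on constants
        spans.append(span)
        h = h_new
        if span < tol:
            break
    policy = q.argmin(axis=1)  # greedy actions from the last Bellman update
    return rho, h, policy, spans

# Hypothetical model: strictly positive transition matrices, so the Bellman
# operator contracts in the span semi-norm and the iteration converges.
rng = np.random.default_rng(0)
n_actions, n_states = 3, 5
P = rng.random((n_actions, n_states, n_states)) + 0.1
P /= P.sum(axis=2, keepdims=True)
c = rng.random((n_states, n_actions))

rho, h, policy, spans = relative_value_iteration(P, c)
print("estimated optimal average cost:", rho)
print("iterations used:", len(spans))
```

In this finite setting the recorded `spans` shrink geometrically because every transition matrix has strictly positive entries. The talk concerns the analogous rate question for diffusions, where such a uniform contraction has to be established through the weighted semi-norm.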
