Special Session 197: Intelligent Control and Game Theory

Numerical Methods for Mean-Field Forward-Backward Stochastic Differential Equations

Yinggu Chen
Ocean University of China
Peoples Rep of China
Co-Author(s):    Lifeng Wei
Abstract:
Mean-field forward-backward stochastic differential equations (FBSDEs) serve as a core mathematical framework in mean-field game theory, with broad applications in finance and large-population systems. By leveraging the connection between FBSDEs and high-dimensional partial differential equations, deep learning algorithms can effectively circumvent the curse of dimensionality. However, existing numerical methods train neural network parameters separately on each time subinterval, so the number of parameters grows as the time partition becomes finer and the computational accuracy becomes unstable. To address this issue, we note that under Markovian coefficient conditions, there exists a unique function $u$ such that $Y_t = u(t, X_t)$ over the entire time interval, which enables consistent parameter training and significantly improves numerical accuracy. This result also holds when mean-field terms are expressed in distribution form, provided that certain regularity conditions are satisfied. Furthermore, the convergence of the discretized equations to the theoretical solution is studied, and the convergence of neural network solutions with respect to the proposed loss function is analyzed. Several numerical examples are provided to validate the effectiveness of the method.
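To make the parameter-growth point concrete, the sketch below counts trainable parameters for two hypothetical setups: one small fully connected network per time step, versus a single network that takes $(t, x)$ as input and is reused over the whole interval. The layer widths, the dimension 100, and the helper names are illustrative assumptions, not the architecture used in the talk.

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))


def per_step_param_count(dim, hidden, n_steps):
    # Hypothetical setup: one network x -> u(t_n, x) per time point.
    return n_steps * mlp_param_count([dim, hidden, hidden, 1])


def shared_param_count(dim, hidden):
    # Hypothetical setup: a single network (t, x) -> u(t, x),
    # reused at every time point, so the count is independent of n_steps.
    return mlp_param_count([dim + 1, hidden, hidden, 1])


if __name__ == "__main__":
    d, h = 100, 64  # assumed state dimension and hidden width
    for n in (10, 100, 1000):
        print(n, per_step_param_count(d, h, n), shared_param_count(d, h))
```

Under these assumptions the per-step count grows linearly with the number of time steps, while the shared count stays fixed as the partition is refined.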

Indefinite Linear-Quadratic Mean-Field Game of Regime-Switching System

Kai DU
Shandong University
Peoples Rep of China
Co-Author(s):    
Abstract:
In this talk, we introduce an indefinite mean-field game with Markov jump parameters. One notable aspect is the relaxation of the assumption regarding the positivity or non-negativity of the weight matrices in the costs. By virtue of mean-field methods and decomposition techniques, we derive decentralized strategies represented by Hamiltonian systems and a new type of consistency condition system. These systems consist of fully coupled regime-switching forward-backward stochastic differential equations that do not satisfy the monotonicity condition. The well-posedness of these strategies is established by employing a relaxed compensator method with an easily verifiable Condition (RC) and the decomposition technique. Furthermore, using novel estimates of the disturbed state and cost function, we demonstrate that the resulting decentralized strategies achieve an $\epsilon$-Nash equilibrium in the indefinite case without any assumptions on the admissible control sets.

Optimal Cybersecurity Investment with Risk Sharing

Zhuo Jin
Macquarie University
Australia
Co-Author(s):    
Abstract:
We study the optimal timing for a firm to invest in cybersecurity technology while managing cyber risk through insurance. Cyber losses are modeled as a jump process to capture fat-tailed behavior, and investment costs evolve stochastically via a compound Poisson process due to technological uncertainty. The firm aims to minimize total expected costs, including losses, investment, and insurance premiums. By formulating an optimal stopping problem and solving the corresponding Hamilton-Jacobi-Bellman equations, we obtain semi-closed-form solutions for the value function and optimal strategies. Numerical examples illustrate how key factors, such as loss intensity, insurance coverage, and cost volatility, affect investment timing and risk-sharing decisions.

Linear-Quadratic Stochastic Stackelberg Games of N Players for Time-Delay Systems and Related FBSDEs

Na Li
Dalian University of Technology
Peoples Rep of China
Co-Author(s):    Shujun Wang
Abstract:
Motivated by a multi-scheme supply chain problem, we study a linear-quadratic generalized Stackelberg game for time-delay systems, which involves a multi-level hierarchical structure with delay. With the help of the continuity method, we first establish the unique solvability of nonlinear anticipated forward-backward stochastic delayed differential equations with a multi-level self-similar domination-monotonicity structure. Based on this result, we derive the Stackelberg equilibrium in this framework. Using the theoretical results, we study a corporate social responsibility problem from the viewpoint of a multi-scheme supply chain problem. Some simulations are also presented to illustrate the Stackelberg equilibrium in a special case.

The Deep Truncated FBSDE Method: A Robust Solver for High-dimensional PDEs

Yunzhang Li
Fudan University
Peoples Rep of China
Co-Author(s):    
Abstract:
The curse of dimensionality makes it difficult to solve high-dimensional partial differential equations (PDEs) numerically. The nonlinear Feynman-Kac formula gives a probabilistic link between these PDEs and forward-backward stochastic differential equations (FBSDEs), but classical deep-learning solvers often become unstable for strongly coupled problems. We introduce a Deep Truncated FBSDE method for high-dimensional quasilinear PDEs that overcomes these limitations. The method uses gradient truncation and applies iterative decoupling to separate the forward and backward parts, reducing unstable feedback between them. We also rewrite the PDE as equivalent coupled subsystems that are easier to optimize, and we enforce pathwise consistency through repeated refinement to control errors. Theoretical results and numerical tests show that this method improves stability, convergence, and accuracy, offering a scalable and reliable tool for high-dimensional problems.
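For orientation, the probabilistic link mentioned above can be written out in the standard decoupled semilinear case. This is the textbook form, given only as a sketch: the symbols $b$, $\sigma$, $f$, $g$ and $x_0$ are generic placeholders assumed here, and the quasilinear, strongly coupled setting of the talk (where the forward coefficients may also depend on $Y$ and $Z$) is more general.

```latex
% Semilinear parabolic PDE with terminal condition
\partial_t u + b(t,x)\cdot\nabla_x u
  + \tfrac{1}{2}\operatorname{Tr}\!\big(\sigma\sigma^{\top}(t,x)\,\nabla_x^2 u\big)
  + f\big(t,x,u,\sigma^{\top}(t,x)\nabla_x u\big) = 0,
\qquad u(T,x) = g(x).

% Associated decoupled FBSDE
dX_t = b(t,X_t)\,dt + \sigma(t,X_t)\,dW_t, \qquad X_0 = x_0,
dY_t = -f(t,X_t,Y_t,Z_t)\,dt + Z_t^{\top}\,dW_t, \qquad Y_T = g(X_T).

% Nonlinear Feynman-Kac representation (for sufficiently smooth u)
Y_t = u(t,X_t), \qquad Z_t = \sigma^{\top}(t,X_t)\,\nabla_x u(t,X_t).
```

Applying Itô's formula to $u(t, X_t)$ and substituting the PDE recovers the backward equation, which is why a solver for the FBSDE also yields values of $u$ along the simulated paths.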

Vulnerable European and American Options in a Market Model with Optional Hazard Process

Ruyi Liu
University of New South Wales
Australia
Co-Author(s):    Libo Li, Marek Rutkowski
Abstract:
Vulnerable options of a European and American style with a possible occurrence of an exogenous termination are studied under market incompleteness in a hazard process setup. It is proven that the reduced upper price of a vulnerable European option coincides with the unique price of an American option with a properly defined payoff and holder's exercise times constrained to the random set given by the right support of the hazard process. For a vulnerable American option, it is shown that the reduced upper price equals the price of a specific American option with unrestricted exercise times, whereas the reduced lower price coincides with the price of a particular game option in which the issuer's exercise times are constrained to the above-mentioned random set.

Policy Optimization in the Linear Quadratic Gaussian Problem: A Frequency Domain Perspective

Yuan-Hua Ni
Nankai University
Peoples Rep of China
Co-Author(s):    Haoran Li, Xun Li, Yuan-Hua Ni, Xuebo Zhang
Abstract:
The Linear Quadratic Gaussian (LQG) problem is a classic and widely studied model in optimal control, providing a fundamental framework for designing controllers for linear systems subject to process and observation noises. In recent years, researchers have increasingly focused on directly parameterizing dynamic controllers and optimizing the LQG cost over the resulting parameterized set. However, this parameterization typically gives rise to a highly non-convex optimization landscape for the resulting parameterized LQG problem. To our knowledge, there is currently no general method for certifying the global optimality of candidate controller parameters in this setting, and most existing numerical methods lack rigorous guarantees of global convergence. In this work, we address these gaps with the following contributions. First, we derive a necessary and sufficient condition for the global optimality of stationary points in parameterized LQG problems. This condition reduces the verification of optimality to a test of the controllability and observability of a novel, specially constructed transfer function, yielding a precise and computationally tractable certificate. Furthermore, our condition provides a rigorous explanation for why traditional parameterizations can lead to suboptimal stationary points. Second, we elevate the controller parameter space from conventional finite-dimensional settings to the infinite-dimensional $\mathcal{RH}_\infty$ space and develop a gradient-based algorithm in this setting, for which we provide a theoretical analysis establishing global convergence. Finally, representative numerical experiments validate the theoretical findings and demonstrate the practical viability of the proposed approach.

Partial stabilizability of forward-backward systems and stabilizability of game-based control systems

Yuanzhuo Song
Shenzhen University
Peoples Rep of China
Co-Author(s):    Kai Du
Abstract:
We introduce three types of stabilizability for controlled forward-backward stochastic differential equations (FBSDEs): weak stabilizability, stabilizability, and strong stabilizability. To investigate the weak stabilizability and stabilizability of FBSDEs, we focus on the partial stabilizability of the associated controlled linear systems, which can be characterized by a linear matrix inequality. Similarly, to address the strong stabilizability of FBSDEs, we define the concept of invariant stabilizability for controlled linear systems, which is described by some mild algebraic criteria. In addition, the stabilizability of game-based control systems (GBCSs) is studied, and by virtue of the maximum principle, we reduce the analysis of these stabilizability types for GBCSs to the corresponding stabilizability of controlled FBSDEs. Finally, for a specific GBCS, we give numerical simulation results to verify our theoretical results.

Resilient Event-Triggered Data-Driven LFC under DoS Attacks

Weiwei Sun
Qufu Normal University
Peoples Rep of China
Co-Author(s):    
Abstract:
This talk presents a resilient event-triggered load frequency control (LFC) scheme for multi-area power systems under DoS attacks, using only input-output data. Unlike existing methods that rely on average dwell time, we directly analyze how attack patterns interact with event-triggered mechanisms. This allows us to distinguish valid and invalid time intervals within attack periods that truly affect system dynamics. Based on this, we propose an attack compensation recovery strategy within a model-free adaptive control framework, where both the controller and parameter estimator are reconstructed using the last successfully transmitted signal before an attack. A recursive analysis then gives a sufficient condition for tracking error boundedness, linking stability to the ratio of valid to invalid time during attacks. Simulations on multi-area power systems validate the effectiveness of the approach.

Unknown Input State Observer Based on a Closed-Form Transformation and the Kalman Filter

Huaibin Tang
Shandong University
Peoples Rep of China
Co-Author(s):    Qinghua Zhang
Abstract:
This paper considers robust state estimation for continuous time-varying systems involving arbitrary unknown inputs and random noises. After a time-varying transformation decoupling the system from the unknown inputs, the Kalman filter is applied to the transformed system before recovering the state of the original system. With the decoupling transformation in a simple closed form, the computational cost of the whole algorithm is close to that of the classical Kalman filter. Under assumptions involving a Gramian matrix similar to the classical observability Gramian, it is shown that the proposed algorithm is an asymptotically unbiased state estimator, i.e., the mathematical expectation of the state estimation error converges exponentially to zero, and the covariance of the state estimation error is bounded. Simulation examples illustrate the effectiveness of the proposed algorithm.

A new PINNs algorithm for solving MFGs

Shupeng Wang
Shandong University
Peoples Rep of China
Co-Author(s):    Zhen Wu, Hui Zhang
Abstract:
Mean field games are frameworks to study Nash equilibria or social optima in games with a continuum of agents. These problems can be used to approximate competitive or cooperative games with a large finite number of agents and have found a broad range of applications, in particular in economics. In this work, we consider mean field game systems with non-local coupling terms involving M populations. State-of-the-art numerical methods for solving such problems rely on spatial discretization, which suffers from the curse of dimensionality and a large computational scale. We propose a new physics-informed neural network (PINN) algorithm for solving the considered systems. It is proved that standard gradient descent for the proposed method converges to the global optimum of the loss with an appropriate choice of the learning rate. Given the scarcity of studies on multi-population mean field games, the effectiveness of our method is first demonstrated by solving single-population mean field game systems in 1, 2, and 100 dimensions, and the method is then applied to mean field game systems with 50 and 100 populations. These results open the door to much-anticipated applications of multi-population mean field games that are beyond the reach of existing numerical methods.

Linear Quadratic Optimal Control Problems for Conditional Mean-Field Stochastic Differential Equations Under Partial Information

Hua Xiao
Shandong University
Peoples Rep of China
Co-Author(s):    Siqi Feng, Guangchen Wang, Hua Xiao, Zhuangzhuang Xing, Huanjun Zhang
Abstract:
This talk centers on a class of linear quadratic stochastic optimal control problems driven by conditional mean-field stochastic differential equations under partial information. In this context, the cost functional is permitted to be indefinite. At the outset, we present a broad overview of optimal control with the aid of the adjoint equation. However, the Hamiltonian system poses a challenge as it incorporates two distinct conditional expectations, making decoupling unattainable. To tackle this, we extract and analyze three representative cases derived from practical problems, discussing each case separately. We find that, in each case, the uniform convexity of the cost functional, which is a weaker assumption than the standard one, ensures the existence of a unique optimal control in state feedback form. Finally, we apply the obtained results to address specific issues raised in the initial motivations of this paper. These applications demonstrate the practical relevance and effectiveness of our theoretical findings in addressing real-world challenges in the field of stochastic optimal control.

Innovation Incentive Game for Key Core Technology Innovation Based on Multi-Dimensional Private Information

Detao Zhang
Shandong University
Peoples Rep of China
Co-Author(s):    Juankai Ding, Ziang Duan, Detao Zhang
Abstract:
Innovation is the fundamental driver of long-term economic growth and the enhancement of social welfare. However, innovation does not occur spontaneously; the rational decisions of individual firms often deviate from the socially optimal outcome, making government intervention indispensable. Yet, information asymmetry between the government and firms makes it difficult for the government to accurately assess firms' technological capabilities and true costs, thereby hindering the design of effective incentive measures and giving rise to adverse selection. To address this, this paper constructs an innovation incentive game framework that covers the entire process from technological innovation through technology industrialization to downstream demand cultivation, and systematically explores how the government can implement appropriate subsidies when firms possess multi-dimensional private information, thereby inducing firms to voluntarily disclose their true information and engage in innovation activities.

Simultaneous identifiability of piecewise-constant reaction coefficient and initial condition in a reaction-diffusion equation

Zhi-Xue Zhao
School of Mathematical Sciences, Tianjin Normal University
Peoples Rep of China
Co-Author(s):    
Abstract:
This talk addresses the inverse problem of simultaneously identifying a piecewise-constant reaction coefficient and the initial condition in a one-dimensional reaction-diffusion equation, using only boundary control and observation data. We propose a novel ON/OFF control strategy combined with an estimation/cancellation mechanism. This approach decomposes the problem of simultaneous identifiability into two sequential subproblems: first, uniquely determining the reaction coefficient; second, reconstructing the initial value. By applying Property C, we establish the unique identifiability of the piecewise-constant reaction coefficient. The initial value is then recovered via spectral analysis. A key advantage of our framework is its exclusive reliance on boundary data, eliminating the need for interior measurements and thereby enhancing practical applicability.

Output regulation problem for passive systems with strong stability

Huacheng Zhou
Central South University
Peoples Rep of China
Co-Author(s):    
Abstract:
In this paper, we study the output regulation problem for an impedance passive linear plant, using the classical resonant internal model based controller. The reference and disturbance signals are assumed to be linear combinations of sine waves of known frequencies. We prove that (under suitable mild assumptions, but without any stability assumptions for the plant) this controller leads to a strongly stable closed-loop system and solves the output regulation problem if the concept of "the error converges to zero" is defined in a suitable way: low-pass filtering the error with any low-pass filter gives a continuous signal that tends to zero. We give two examples (both models of engineering systems) to illustrate the results.