Special Session 125: Analysis, Algorithms, and Applications of Neural Networks

Neural network, dynamical system and formal language

Yongqiang Cai
Beijing Normal University
Peoples Rep of China
Co-Author(s):    
Abstract:
Deep learning has made significant progress in data science and the natural sciences. Some studies have linked deep neural networks to dynamical systems, but the network structure is restricted to residual networks. It is known that residual networks can be regarded as a numerical discretization of dynamical systems. In this talk, we consider the traditional network structure and prove that vanilla feedforward networks can also serve as a numerical discretization of dynamical systems, where the width of the network equals the input and output dimensions. The proof is based on the properties of the leaky-ReLU function and the numerical technique of the splitting method for solving differential equations. The results provide a new perspective for understanding the approximation properties of feedforward neural networks. In particular, the minimum width of neural networks for universal approximation can be derived, and a relationship between mapping compositions and regular languages can be constructed.
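The role leaky-ReLU plays in such constructions can be illustrated with a small sketch (ours, not the construction from the talk): leaky-ReLU is a bijection on the reals whose inverse is again a leaky-ReLU, so a width-d vanilla layer with an invertible weight matrix is itself an invertible map, the discrete analogue of a flow-map step of a dynamical system. All names below are illustrative.

```python
import numpy as np

def leaky_relu(x, alpha):
    """Leaky-ReLU with slope alpha on the negative axis."""
    return np.where(x > 0, x, alpha * x)

# Leaky-ReLU is a bijection on R; its inverse is a leaky-ReLU with slope 1/alpha.
x = np.linspace(-3.0, 3.0, 101)
alpha = 0.2
y = leaky_relu(leaky_relu(x, alpha), 1.0 / alpha)
assert np.allclose(y, x)

# A width-d vanilla layer z -> leaky_relu(W z + b) is therefore invertible
# whenever W is, so compositions of such layers can realize invertible maps,
# the discrete analogue of integrating an ODE dz/dt = f(z).
rng = np.random.default_rng(0)
d = 3
W = rng.standard_normal((d, d)) + 3 * np.eye(d)  # well-conditioned weights
b = rng.standard_normal(d)
z = rng.standard_normal(d)
h = leaky_relu(W @ z + b, alpha)                            # forward layer
z_rec = np.linalg.solve(W, leaky_relu(h, 1.0 / alpha) - b)  # exact inverse
assert np.allclose(z_rec, z)
```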

Solving the Hughes model for crowds with a Fourier neural operator

Salah Eddine Choutri
New York University Abu Dhabi (CITIES research center)
United Arab Emirates
Co-Author(s):    
Abstract:
This presentation explores the application of Fourier Neural Operators (FNOs) to solve the Hughes pedestrian flow model. The Hughes model, a macroscopic crowd dynamics model, incorporates a velocity field dependent on the density distribution and a turning point representing the direction change due to congestion. The approach leverages the FNO framework to learn solutions of the model efficiently. Training data are generated using the wave-front tracking scheme, ensuring adherence to the entropy condition. The presentation outlines the mathematical formulation of the Hughes model, discusses the training methodology and dataset preparation, and demonstrates the FNO architecture's capacity to predict accurate density profiles and turning points. This study highlights the potential of operator learning in tackling complex PDE-constrained problems in crowd dynamics.
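A minimal single-channel sketch of the core FNO building block, a spectral convolution layer, may help fix ideas (illustrative only; a full FNO interleaves such layers with pointwise linear maps and nonlinearities, and all names here are ours):

```python
import numpy as np

def spectral_conv_1d(u, weights, n_modes):
    """One single-channel Fourier layer: FFT, keep the lowest n_modes
    frequencies, multiply them by learned complex weights, inverse FFT."""
    u_hat = np.fft.rfft(u)                         # complex Fourier coefficients
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = weights * u_hat[:n_modes]  # act only on retained modes
    return np.fft.irfft(out_hat, n=u.size)

rng = np.random.default_rng(0)
n, n_modes = 64, 8
weights = rng.standard_normal(n_modes) + 1j * rng.standard_normal(n_modes)

x = np.linspace(0.0, 1.0, n, endpoint=False)
u = np.sin(2 * np.pi * x)              # stand-in for an input density profile
v = spectral_conv_1d(u, weights, n_modes)
assert v.shape == u.shape              # the operator preserves the resolution
```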

Structure-conforming Operator Learning for Geometric Inverse Problems

Ruchi Guo
Sichuan University
Peoples Rep of China
Co-Author(s):    Long Chen, Shuhao Can, Huayi Wei
Abstract:
The principle of developing structure-conforming numerical algorithms is widespread in scientific computing. In this work, following this principle, we propose an operator learning method for solving a class of geometric inverse problems. The architecture is inspired by Direct Sampling Methods and is also closely related to convolutional networks and the Transformer, the latter being a state-of-the-art architecture for many scientific computing tasks. To obtain the optimal hyperparameters in this method, we propose a joint FEM and operator learning (OpL) training framework and a Learning-Automated FEM package. Numerical examples demonstrate that the proposed architecture outperforms many existing operator learning methods in the literature.

Neural Networks and Operators Based on Convolution and Multigrid Structure

Juncai He
The King Abdullah University of Science and Technology
Saudi Arabia
Co-Author(s):    Jinchao Xu and Xinliang Liu
Abstract:
In this talk, we will present recent results on applying multigrid structures to both neural networks and operators for problems in images and numerical PDEs. First, we will illustrate MgNet as a unified framework for convolutional neural networks and multigrid methods with some preliminary theories and applications. Then, we will discuss some basic background on operator learning, including the problem setup, a uniform framework, and a general universal approximation result. Motivated by the general definition of neural operators and the MgNet structure, we propose MgNO, which utilizes multigrid structures to parameterize these linear operators between neurons, offering a new and concise architecture in operator learning. This approach provides both mathematical rigor and practical expressivity, with many interesting numerical properties and observations.
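The multigrid structure underlying MgNet and MgNO can be sketched in its classical form, one two-grid cycle for a 1D Poisson problem (an illustrative textbook example, not code from the talk):

```python
import numpy as np

def poisson_matrix(n):
    """1D Poisson -u'' = f on (0,1), Dirichlet BCs, n interior points."""
    h = 1.0 / (n + 1)
    return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
            - np.diag(np.ones(n - 1), -1)) / h**2

def jacobi(A, b, u, sweeps=3, omega=2.0 / 3.0):
    """Weighted Jacobi smoothing sweeps."""
    D = np.diag(A)
    for _ in range(sweeps):
        u = u + omega * (b - A @ u) / D
    return u

def two_grid(A, b, u, n):
    """One two-grid cycle: smooth, restrict residual, coarse solve, correct."""
    nc = (n - 1) // 2
    # linear-interpolation prolongation P and full-weighting restriction R
    P = np.zeros((n, nc))
    for j in range(nc):
        i = 2 * j + 1
        P[i - 1, j], P[i, j], P[i + 1, j] = 0.5, 1.0, 0.5
    R = 0.5 * P.T
    u = jacobi(A, b, u)                    # pre-smoothing
    r = b - A @ u
    e = np.linalg.solve(R @ A @ P, R @ r)  # Galerkin coarse-grid correction
    return jacobi(A, b, u + P @ e)         # post-smoothing

n = 31
A, b = poisson_matrix(n), np.ones(n)
u = np.zeros(n)
r0 = np.linalg.norm(b - A @ u)
u = two_grid(A, b, u, n)
assert np.linalg.norm(b - A @ u) < 0.2 * r0  # one cycle contracts strongly
```

In MgNet, the smoothers, restrictions, and prolongations above become learnable convolutional operators; MgNO uses the same hierarchy to parameterize linear operators between neurons.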

Neural Operator for Multidisciplinary Engineering Design

Daniel Zhengyu Huang
Peking University
Peoples Rep of China
Co-Author(s):    
Abstract:
Deep learning surrogate models have shown significant promise in solving partial differential equations. These efficient models enable many-query computations in science and engineering, with particular focus on engineering design optimization, which is the central topic of this talk. I will begin by introducing the neural operator approach for surrogate modeling, followed by a theoretical analysis of Bayesian nonparametric regression of linear functionals to better understand the sample complexity.

DualFL-CS: an accelerated, inexact, and parallel coordinate descent method for federated learning

Boou Jiang
The King Abdullah University of Science and Technology
Peoples Rep of China
Co-Author(s):    Boou Jiang, Jongho Park, Jinchao Xu
Abstract:
This work presents a novel approach to Federated Learning (FL), a collaborative learning model that leverages data distributed across numerous clients. We establish a duality connection between the widely studied FL problem and the parallel subspace correction problem, leading to the development of our accelerated FL algorithm, DualFL-CS. By employing a novel randomized coordinate descent method, our algorithm effectively incorporates client sampling and allows for the use of inexact local solvers, thereby reducing computational costs in both smooth and non-smooth cases. For smooth FL problems, DualFL-CS achieves optimal linear convergence rates, while for non-smooth problems, it attains accelerated sub-linear convergence rates. Numerical experiments demonstrate the superior performance of our algorithm compared to existing state-of-the-art FL algorithms.
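The core mechanism, randomized block coordinate descent with exact or inexact local solvers, can be sketched on a toy quadratic (a generic illustration, not the DualFL-CS algorithm itself; in particular it omits the duality construction and acceleration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, block = 4, 5
d = n_clients * block
M = rng.standard_normal((d, d))
A = M @ M.T / d + np.eye(d)        # SPD quadratic: f(x) = 0.5 x'Ax - b'x
b = rng.standard_normal(d)
x_star = np.linalg.solve(A, b)

def f(x):
    return 0.5 * x @ A @ x - b @ x

x = np.zeros(d)
for _ in range(400):
    i = rng.integers(n_clients)    # sample one "client" (coordinate block)
    s = slice(i * block, (i + 1) * block)
    g = A[s] @ x - b[s]            # gradient restricted to the sampled block
    # Exact block minimization plays the role of an exact local solver;
    # an inexact solver would take a few gradient steps here instead.
    x[s] -= np.linalg.solve(A[s, s], g)

assert f(x) <= f(np.zeros(d))
assert np.linalg.norm(x - x_star) < 1e-3 * (1 + np.linalg.norm(x_star))
```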

A deformation-based framework for learning solution mappings of PDEs defined on varying domains

Pengzhan Jin
Peking University
Peoples Rep of China
Co-Author(s):    
Abstract:
In this work, we establish a deformation-based framework for learning solution mappings of PDEs defined on varying domains. The union of functions defined on varying domains can be identified with a metric space via the deformation; the solution mapping is then regarded as a continuous metric-to-metric mapping, which can subsequently be represented by a continuous metric-to-Banach mapping using two different strategies, referred to as the D2D framework and the D2E framework, respectively. We point out that such a metric-to-Banach mapping can be learned by neural operators, and hence the solution mapping is learned accordingly. Within this framework, a rigorous convergence analysis is built for the problem of learning solution mappings of PDEs on varying domains. Since the theoretical framework rests on several pivotal assumptions that must be verified for each specific problem, we study star domains as a typical example; other situations can be verified similarly. This framework has three important features: (1) The domains under consideration are not required to be diffeomorphic; therefore a wide range of regions can be covered by one model provided they are homeomorphic. (2) The deformation mapping need not be continuous, so it can be flexibly constructed by combining an identity mapping on the main body with a local deformation mapping. This feature makes it possible to achieve inverse design for large systems where only local parts of the shape are tuned. (3) If a linearity-preserving neural operator such as MIONet is adopted, the framework preserves the linearity of the surrogate solution mapping with respect to the source term for linear PDEs, so it can be applied within hybrid iterative methods. We finally present several numerical experiments to validate our theoretical results.
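The basic pullback idea, representing functions on varying domains by their values on a fixed reference domain via a deformation, can be sketched in one dimension (a deliberately simple illustration with hypothetical names; the talk treats star domains and far more general settings):

```python
import numpy as np

s = np.linspace(0.0, 1.0, 101)        # fixed reference domain [0, 1]

def deform(L):
    """Deformation mapping the reference interval onto the domain [0, L]."""
    return lambda t: L * t

def pullback(u, phi):
    """Represent a function on the deformed domain by its values on the
    reference grid: (phi^* u)(s) = u(phi(s))."""
    return u(phi(s))

# Two different domains, one shared reference representation: the learned
# operator only ever sees functions on the reference grid.
u = lambda x: np.sin(x)               # a stand-in "solution" on [0, L]
for L in (1.0, 2.5):
    v = pullback(u, deform(L))
    assert v.shape == s.shape
    assert np.isclose(v[-1], np.sin(L))
```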

Entropy-based convergence rates of greedy algorithms

Yuwen Li
Zhejiang University
Peoples Rep of China
Co-Author(s):    Yuwen Li
Abstract:
In this talk, I will present novel convergence estimates for greedy algorithms, including the reduced basis method for parametrized PDEs, the empirical interpolation method for approximating parametric functions, and the orthogonal/Chebyshev greedy algorithms for nonlinear dictionary approximation. The proposed convergence rates are all based on the metric entropy of underlying compact sets. This talk is partially based on joint work with Jonathan Siegel.
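A finite-dimensional sketch of the orthogonal greedy algorithm may be useful for intuition (illustrative only; the entropy-based rates in the talk concern the infinite-dimensional approximation setting):

```python
import numpy as np

def orthogonal_greedy(f, dictionary, steps):
    """Orthogonal greedy algorithm: at each step select the dictionary
    element most correlated with the residual, then re-project f onto
    the span of all elements selected so far."""
    selected = []
    residual = f.copy()
    for _ in range(steps):
        scores = np.abs(dictionary @ residual)
        selected.append(dictionary[np.argmax(scores)])
        B = np.array(selected).T                  # basis of selected elements
        coef, *_ = np.linalg.lstsq(B, f, rcond=None)
        residual = f - B @ coef
    return residual

rng = np.random.default_rng(0)
dim, n_atoms = 50, 200
dictionary = rng.standard_normal((n_atoms, dim))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
f = rng.standard_normal(dim)

# Re-projection makes the residual norm non-increasing in the step count.
r5 = np.linalg.norm(orthogonal_greedy(f, dictionary, 5))
r20 = np.linalg.norm(orthogonal_greedy(f, dictionary, 20))
assert r20 <= r5 <= np.linalg.norm(f)
```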

Expressivity and Approximation Properties of Deep Neural Networks with ReLUk Activation

Tong Mao
King Abdullah University of Science and Technology
Saudi Arabia
Co-Author(s):    J. He, J. Xu
Abstract:
Deep ReLU$^k$ networks have the capability to represent higher-degree polynomials precisely. We provide a comprehensive constructive proof for polynomial representation using deep ReLU$^k$ networks. This allows us to establish an upper bound on both the size and count of network parameters. Consequently, we are able to demonstrate a suboptimal approximation rate for functions from Sobolev spaces as well as for analytic functions. Additionally, we reveal that deep ReLU$^k$ networks can approximate functions from a range of variation spaces, extending beyond those generated solely by the ReLU$^k$ activation function. This finding demonstrates the adaptability of deep ReLU$^k$ networks in approximating functions within various variation spaces.
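The k = 2 base case of exact polynomial representation is easy to verify directly: ReLU$^2$ networks represent $x^2$ with two neurons, and products follow by polarization (a sketch of the mechanism, not the talk's full construction):

```python
import numpy as np

def relu_k(x, k):
    """ReLU^k activation: max(x, 0) ** k."""
    return np.maximum(x, 0.0) ** k

x = np.linspace(-2.0, 2.0, 201)

# x^2 is an exact two-neuron ReLU^2 network: relu(x)^2 + relu(-x)^2.
assert np.allclose(relu_k(x, 2) + relu_k(-x, 2), x**2)

# Products then come for free via polarization,
#   x * y = ((x + y)^2 - x^2 - y^2) / 2,
# so every quadratic is exactly representable; depth and general k
# extend this to higher-degree polynomials.
y = np.linspace(0.0, 1.0, 201)
sq = lambda t: relu_k(t, 2) + relu_k(-t, 2)
assert np.allclose((sq(x + y) - sq(x) - sq(y)) / 2, x * y)
```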

Adaptive Growing Randomized Neural Networks for Solving Partial Differential Equations

Fei Wang
Xi'an Jiaotong University
Peoples Rep of China
Co-Author(s):    Haoning Dang and Song Jiang
Abstract:
Traditional numerical methods face numerous challenges in handling high-dimensional problems, the meshing of complex domains, and the error accumulation caused by time stepping. Meanwhile, neural network methods based on optimization training suffer from insufficient accuracy, slow training, and uncontrollable errors due to the lack of efficient optimization algorithms. To combine the advantages of these two approaches and overcome their shortcomings, randomized neural network methods have been proposed. These methods not only leverage the strong approximation capabilities of neural networks to circumvent the limitations of classical numerical methods but also aim to resolve the accuracy and training-efficiency issues of neural networks. In this talk, by incorporating a posteriori error estimation as feedback, we propose Adaptive Growing Randomized Neural Networks for solving PDEs. This approach adaptively generates network structures, significantly improving the approximation capabilities.
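The fixed-random-hidden-weights idea behind randomized neural networks can be sketched as a random-feature least-squares fit (illustrative only; the adaptive growing strategy and a posteriori feedback from the talk are not shown, and all parameters below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomized neural network: hidden-layer weights are sampled once and
# frozen; only the linear output layer is trained, by solving a (ridge-
# regularized) least-squares problem instead of running gradient descent.
n_hidden = 300
W = rng.uniform(-10.0, 10.0, n_hidden)   # frozen random hidden weights
c = rng.uniform(-10.0, 10.0, n_hidden)   # frozen random hidden biases

def features(x):
    return np.tanh(np.outer(x, W) + c)   # shape (n_points, n_hidden)

x_train = np.linspace(-1.0, 1.0, 400)
y_train = np.sin(2 * np.pi * x_train)    # target (stand-in for a PDE solve)

# Small ridge term for numerical stability of the normal equations.
F = features(x_train)
coef = np.linalg.solve(F.T @ F + 1e-8 * np.eye(n_hidden), F.T @ y_train)

x_test = np.linspace(-1.0, 1.0, 97)
err = np.max(np.abs(features(x_test) @ coef - np.sin(2 * np.pi * x_test)))
assert err < 0.05
```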

Optimistic Sample Size Estimate for Deep Neural Networks

Yaoyu Zhang
Shanghai Jiao Tong University
Peoples Rep of China
Co-Author(s):    
Abstract:
Estimating the sample size required for a deep neural network (DNN) to accurately fit a target function is a crucial issue in deep learning. In this talk, we introduce a novel sample size estimation method based on the phenomenon of condensation, which we term the optimistic estimate. This method quantitatively characterizes the best possible performance achievable by neural networks through condensation. Our findings suggest that increasing the width and depth of a DNN preserves its sample efficiency. However, increasing the number of unnecessary connections significantly deteriorates sample efficiency. This analysis provides theoretical support for the commonly adopted strategy in practice of expanding network width and depth rather than increasing the number of connections.