Special Session 129: Mathematics of Data Science and Applications

Graph Neural Networks: Principles and Applications

JIA CAI
Guangdong University of Finance & Economics
Peoples Rep of China
Co-Author(s):    Jiahao Lai, Mengzhu Chen, Lin Gao, Ranhui Yan
Abstract:
In real-world scenarios, many data types are naturally represented as graphs with complex topological structures, inherently reflecting real-life systems. In recent years, Graph Neural Networks (GNNs) have attracted widespread attention as powerful tools for processing graph-structured data, with applications spanning diverse domains. This report examines several core issues in applying graph neural networks. First, we address stock movement prediction in the fintech domain. This task demands handling the complexity and dynamic nature of financial markets, making it a representative challenge in financial time series analysis. To tackle this issue, we propose a novel model: the Time-Lag and Edge Feature-Incorporated Graph Attention Network (TILE-GAT). Second, traffic flow prediction is a critical task in intelligent transportation systems, where the central challenge lies in capturing complex spatio-temporal dependencies within road networks. In particular, modeling spatial dependencies requires accurately and efficiently characterizing traffic flow transmission between nodes, especially the causal relationships among them. To address this challenge, we introduce a Spatio-Temporal Causal Graph Neural Network (STCGNN) framework for traffic flow prediction.

Decomposition of Electrodermal Activity Signals Using Matrix Separation

Xuemei Chen
University of North Carolina Wilmington
USA
Co-Author(s):    
Abstract:
We develop a new method for decomposing electrodermal activity (EDA) signals using matrix decomposition. The matrix decomposition framework is new, and we present theoretical guarantees of recovery via a convex optimization problem. We further develop efficient algorithms for this convex program, including a preconditioning technique. Numerical experiments with real data are also presented.

Stable Phase Retrieval for Gaussian Shift-invariant Signals

Cheng Cheng
Sun Yat-sen University
Peoples Rep of China
Co-Author(s):    
Abstract:
Gabor phase retrieval for signals has attracted considerable attention in recent years. For the more general short-time linear canonical transform (STLCT), which arises naturally in optical systems and canonical time-frequency analysis, existing work has so far focused mainly on uniqueness and sampling conditions. Explicit reconstruction formulas, quantitative stability estimates, and robust reconstruction algorithms, however, are still missing. In this paper, we study uniqueness, stability, and robust reconstruction for phase retrieval from phaseless STLCT measurements in the complex Gaussian shift-invariant space $V_\beta^\infty(\varphi)$.

Toward Theoretical Insights into Diffusion Trajectory Distillation via Operator Merging

Weiguo Gao
Fudan University
Peoples Rep of China
Co-Author(s):    Ming Li
Abstract:
Diffusion trajectory distillation accelerates sampling by training a student model to approximate the multi-step denoising trajectories of a pretrained teacher model using far fewer steps. Despite strong empirical results, the trade-off between distillation strategy and generative quality remains poorly understood. We provide a theoretical characterization by reinterpreting trajectory distillation as an operator merging problem, differentiating our analysis between two distinct regimes. In the linear Gaussian regime, where approximation error is zero, we isolate optimization error, specifically signal shrinkage driven by finite training time, as the primary bottleneck. This characterization allows us to derive the theoretically optimal merging strategy, which exhibits a variance-driven phase transition and is computable via a Pareto dynamic programming algorithm. In the nonlinear Gaussian mixture regime, we prove that distilling composite steps incurs unavoidable approximation error due to the exponential growth of mixture components, and we quantify how these errors amplify across merges. Together, these results clarify the distinct theoretical mechanisms governing each regime and provide principled guidance for method selection. This is joint work with Ming Li.

Unbounded Density Ratio Estimation and Its Application to Covariate Shift Adaptation

Zhengchu Guo
Zhejiang University
Peoples Rep of China
Co-Author(s):    Ren-Rui Liu, Jun Fan, Lei Shi
Abstract:
This talk addresses unbounded density ratio estimation, an underexplored yet critical challenge in statistical learning, and its application to covariate shift adaptation. We propose a three-step procedure utilizing unlabeled data from both source and target distributions: (1) estimating a relative density ratio; (2) truncating to control unboundedness; and (3) transforming the estimate back to the standard density ratio, which is then used as importance weights for regression. We establish non-asymptotic convergence guarantees for both the density ratio estimator and the resulting regression estimator, showing that under mild conditions, both achieve optimal or near-optimal rates. This work provides new theoretical insights into density ratio estimation and learning under covariate shift, extending classical theory to more practical settings.
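Steps (2) and (3) of the procedure can be illustrated with a minimal Python sketch. It assumes the commonly used $\alpha$-relative density ratio $r_\alpha = p_T / (\alpha p_T + (1-\alpha) p_S)$, which is bounded by $1/\alpha$ and from which the standard ratio $r = p_T / p_S$ is recovered as $r = (1-\alpha) r_\alpha / (1 - \alpha r_\alpha)$. The abstract does not state the exact definition or the truncation rule, so the parameters `alpha` and `cap` and all function names below are hypothetical, and step (1), the estimator itself, is not shown.

```python
def truncate(r_alpha, cap):
    """Step (2), assumed form: clip a relative-ratio estimate to [0, cap]."""
    return min(max(r_alpha, 0.0), cap)


def to_standard_ratio(r_alpha, alpha):
    """Step (3): invert r_alpha = r / (alpha * r + 1 - alpha) to recover r."""
    return (1.0 - alpha) * r_alpha / (1.0 - alpha * r_alpha)


def importance_weights(r_alpha_estimates, alpha=0.5, cap=1.9):
    """Map relative-ratio estimates to importance weights for regression.

    `cap` must stay strictly below 1 / alpha so that the denominator in
    to_standard_ratio remains positive after truncation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 < cap < 1.0 / alpha:
        raise ValueError("cap must lie in (0, 1 / alpha)")
    return [to_standard_ratio(truncate(v, cap), alpha) for v in r_alpha_estimates]


# Hypothetical relative-ratio estimates at three sample points.
weights = importance_weights([0.0, 1.0, 2.5], alpha=0.5, cap=1.9)
```

In this sketch the truncation level controls the largest possible weight: a `cap` closer to `1 / alpha` allows larger weights, while a smaller `cap` bounds them more tightly.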

Convergence of zeroth-order optimization with stochastic mirror descent

Ting HU
Xi'an Jiaotong University
Peoples Rep of China
Co-Author(s):    
Abstract:
We study a randomized zeroth-order algorithm, based only on differences of function values and combined with mirror descent, for solving stochastic optimization problems. In the convex setting, we present a new convergence analysis of the zeroth-order algorithm with general mirror maps, which relies on the $\ga$-H\"older $(0

Separation of Non-Stationary Multi-Component Signals: Enhanced SST/Chirplet Methods and Their Engineering & Data Science Applications

Qingtang Jiang
Zhejiang Normal University
Peoples Rep of China
Co-Author(s):    
Abstract:
Real-world signals, such as those from mechanical systems or medical data, are typically non-stationary and composed of multiple time-varying components. Traditional methods often struggle with mathematical rigor or lack the flexibility needed for dynamic environments. This talk presents advanced mathematical frameworks for the high-resolution analysis and recovery of these multi-component signals. First, we discuss the Adaptive Synchrosqueezing Transform (SST), which improves upon standard wavelet- and STFT-based SST by adaptively selecting time-varying parameters to sharpen blurry time-frequency representations. Second, we address the challenge of signals with crossover instantaneous frequencies, a scenario where the traditional well-separated condition fails. We introduce a Chirplet Transform-based approach and a mathematically rigorous Signal Separation Operator that acts as an overpass in the time-frequency plane to resolve intersecting components. Finally, we present the robust performance of these methods in machinery fault diagnosis, social media depression screening, and radar signal classification. Together, these advances form a rigorous, adaptive framework for non-stationary signal separation, and its integration with deep learning delivers high performance across these engineering and data science domains.

Resolution Invariant Operator Learning via Encoder Decoder Representations: Limiting Kernels and Convergence Analysis

Lei Shi
Fudan University
Peoples Rep of China
Co-Author(s):    Jiaqi Yang, Dingxuan Zhou
Abstract:
Operator learning aims to approximate mappings between infinite-dimensional function spaces, while practical implementations rely on finite-dimensional encoder-decoder representations that often induce resolution-dependent hypothesis classes and obscure the underlying operator being learned. We overcome this ambiguity by using a kernel-based framework that identifies a resolution-independent class of limiting operators associated with encoder-decoder architectures. Starting from matrix-valued kernel learning at a fixed finite resolution, we show that the encoder-decoder structure admits an isometric lifting to an operator-valued kernel formulation on function spaces. As input and output resolutions increase, encoder-induced kernels converge to a resolution-independent limiting kernel that characterizes the intrinsic operator class. Building on this limiting-kernel perspective, we analyze stochastic gradient descent (SGD) for operator learning in encoder-decoder settings. The framework applies to broad classes of kernels and encoder-decoder constructions, including radial and dot-product kernels; spectral-truncation-based encoders such as Fourier, polynomial, and wavelet-based encoders; empirical principal component encoders; and kernel-interpolation-based encoding schemes. This is joint work with Jiaqi Yang and Prof. Dingxuan Zhou.

Kolmogorov Superposition Theorem: Construction, Approximation and Networks

Li-Lian Wang
Division of Mathematical Sciences
Singapore
Co-Author(s):    
Abstract:
The Kolmogorov Superposition Theorem (KST, 1957) offers a mathematically elegant framework for expressing any high-dimensional continuous function as a superposition of one-dimensional continuous functions. This foundational result has recently gained renewed interest, particularly in neural networks. However, a major challenge remains: the one-dimensional functions resulting from all constructions are highly non-smooth. In this talk, we present a novel approximate version of KST involving $C^2$ inner functions and piecewise $C^2$ outer functions and show its applications in neural network approximations.
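For orientation, the classical statement of KST can be written as follows. This is the standard textbook form, not the approximate version presented in the talk, and the symbols $\Phi_q$ and $\phi_{q,p}$ are notation assumed here: for every continuous $f:[0,1]^n \to \mathbb{R}$ there exist continuous one-dimensional outer functions $\Phi_q$ and inner functions $\phi_{q,p}$, with the inner functions independent of $f$, such that

$$
f(x_1,\dots,x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right).
$$

The structure resembles a two-layer network with $2n+1$ hidden units, which is one reason the theorem has attracted interest in neural network approximation.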

Convolution smoothed outcome weighted learning

Daohong Xiang
Zhejiang Normal University
Peoples Rep of China
Co-Author(s):    Aoli Yang, Jun Fan
Abstract:
Precision medicine seeks to provide personalized healthcare by tailoring treatments to individual patient characteristics, making the development of individualized treatment rules (ITRs) a central task. Outcome weighted learning serves as a powerful framework for estimating optimal ITRs, often relying on the hinge loss function due to its favorable statistical properties. However, the non-smooth nature of the hinge loss function can lead to computational difficulties, particularly in high-dimensional or small-sample settings. In this talk, we introduce and investigate a family of outcome weighted learning algorithms generated by convolution-type smoothed hinge loss functions and varying Gaussian kernels. We derive fast convergence rates for the excess value function of these proposed learning algorithms across different model selection strategies, showing that these rates are optimal under mild noise and margin conditions, up to logarithmic factors. Numerical simulations support our theoretical findings and indicate that the convolution smoothed hinge loss functions outperform the standard hinge loss function in terms of both computational efficiency and treatment value.
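To make the idea of a convolution-smoothed hinge loss concrete, here is a minimal Python sketch for one particular choice of smoothing kernel. The abstract does not specify which smoothing kernels are used, so the Gaussian smoothing density, the bandwidth `h`, and the function names are assumptions made for illustration. Convolving the hinge loss $(1-u)_+$ with a Gaussian density of scale $h$ gives the closed form $a\,\Phi(a/h) + h\,\varphi(a/h)$ with $a = 1 - u$, where $\Phi$ and $\varphi$ are the standard normal CDF and PDF.

```python
import math


def hinge(u):
    """Standard hinge loss (1 - u)_+ evaluated at margin u."""
    return max(0.0, 1.0 - u)


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def smoothed_hinge(u, h):
    """Hinge loss convolved with a Gaussian density of scale h (assumed kernel).

    Equals E[(1 - u + h * Z)_+] for Z ~ N(0, 1), which is smooth in u.
    """
    if h <= 0.0:
        raise ValueError("bandwidth h must be positive")
    a = 1.0 - u
    return a * normal_cdf(a / h) + h * normal_pdf(a / h)


# Compare the two losses near the kink at u = 1 and away from it.
margins = [-1.0, 0.5, 1.0, 1.5, 3.0]
comparison = [(u, hinge(u), smoothed_hinge(u, h=0.25)) for u in margins]
```

In this sketch the smoothed loss is differentiable everywhere, lies above the hinge loss, and approaches it as `h` shrinks or as the margin moves away from the kink at `u = 1`.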