Special Session 115: Topology and Dynamics in Data


Temporal Graph Classification with Topological Machine Learning

Baris Coskunuzer

University of Texas at Dallas
USA

Co-Author(s): Md Joshem Uddin, Soham Changani

Abstract:

Temporal graph classification models time-evolving networks in neuroscience, cybersecurity, and infrastructure. Existing temporal GNNs (TGNNs) often miss global structural evolution and can be sensitive to noise and node permutations. In this talk, we present \textbf{T3former}, a topological machine learning framework that augments temporal graph models with stable global descriptors. T3former extracts spectral summaries and persistent homology features from each snapshot to capture multi-scale dynamics such as connectivity changes and the emergence or disappearance of cycles. These signals are fused with temporal neural architectures to produce robust sequence-level representations. Across real-world datasets including brain, social, and traffic networks, T3former improves classification performance over strong TGNN baselines, especially when labels depend on structural dynamics.
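
As a rough illustration of the per-snapshot structural signals mentioned above (connectivity changes and the appearance or disappearance of cycles), the following Python sketch computes the Betti numbers $\beta_0$ and $\beta_1$ of each graph snapshot. It is not the T3former pipeline, which the abstract does not specify: it uses plain Betti numbers instead of persistent homology, and the function name `betti_numbers` and the toy snapshots are hypothetical.

```python
def betti_numbers(num_nodes, edges):
    """Return (beta_0, beta_1) of an undirected graph on nodes 0..num_nodes-1.

    beta_0 is the number of connected components (found with union-find);
    beta_1 = |E| - |V| + beta_0 is the number of independent cycles.
    """
    parent = list(range(num_nodes))

    def find(a):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    components = num_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components, len(edges) - num_nodes + components


# Hypothetical temporal graph: three snapshots on the same four nodes.
snapshots = [
    [(0, 1), (1, 2), (2, 3)],          # a path
    [(0, 1), (1, 2), (2, 3), (3, 0)],  # an extra edge closes a cycle
    [(0, 1), (2, 3)],                  # the graph splits into two pieces
]
features = [betti_numbers(4, edges) for edges in snapshots]
```

The resulting sequence `features` is the kind of global, permutation-invariant signal that could be passed to a temporal model alongside learned node embeddings.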

Sheaf-Theoretic Models for Signal Interaction on Complex Networks

Chuan-Shen Hu

National University of Kaohsiung
Taiwan

Co-Author(s):

Abstract:

Combinatorial structures such as graphs, simplicial complexes, and cell complexes provide the foundations for geometric and topological deep learning (GDL and TDL), yet the mechanisms governing how features propagate and interact across local and global structures during training are not yet fully understood. In this talk, we present an ongoing project that develops a cellular sheaf-theoretic framework to study the consistency and interaction of learned signals on such mathematical objects. By encoding node features and edge weights as local sections of a cellular sheaf, the proposed approach enables a systematic analysis of local agreements and consistencies, providing a new perspective on feature diffusion and aggregation from a topological viewpoint. In addition, a multiscale extension inspired by topological data analysis is introduced to capture hierarchical feature interactions across different resolutions. This framework aims to provide a unified mathematical perspective on GDL/TDL models, with potential applications to tasks such as node classification, substructure detection, and community detection. This is a work in progress, and preliminary results have been presented at the AAAI 2026 Workshop on Foretell of Future AI from Mathematical Foundation.
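
To make the notion of local agreement concrete, the sketch below uses the standard coboundary of a cellular sheaf on a graph, restricted to one-dimensional stalks so that every restriction map is a scalar. The abstract does not describe the authors' construction, so this is only an assumed toy setting; the names `coboundary`, `disagreement`, `constant_sheaf`, and `twisted_sheaf` are hypothetical.

```python
def coboundary(x, edges):
    """Apply the sheaf coboundary to a 0-cochain x (one scalar per node).

    Each edge is a tuple (u, v, r_u, r_v), where r_u and r_v are the scalar
    restriction maps from the stalks at u and v into the stalk on the edge.
    """
    return [r_v * x[v] - r_u * x[u] for (u, v, r_u, r_v) in edges]


def disagreement(x, edges):
    """Return ||delta x||^2, i.e. x^T L x for the sheaf Laplacian L.

    The value is zero exactly when x is a global section, meaning the node
    signals agree on every edge after restriction.
    """
    return sum(d * d for d in coboundary(x, edges))


# Two sheaves on a triangle with nodes 0, 1, 2.
constant_sheaf = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.0, 1.0)]
twisted_sheaf = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.0, -1.0)]

signal = [2.0, 2.0, 2.0]
energy_constant = disagreement(signal, constant_sheaf)
energy_twisted = disagreement(signal, twisted_sheaf)
```

On the constant sheaf a constant signal is a global section, whereas the sign flip on one edge of the twisted sheaf makes the same signal inconsistent, which is the kind of local disagreement such a framework can quantify.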

Binding Closed Ties: Ensuring Topological Fidelity of Delay-Embedded Attractors via Redundant KNN Coverage

Paul Samuel Ignacio

University of the Philippines Baguio
Philippines

Co-Author(s): Jhunas Paul Viernes

Abstract:

Let $(M,\Phi^t)$ be a dynamical system on a compact manifold $M$, and let $x(t)=\Phi^t(x_0)$ denote a trajectory. Suppose that only a scalar observation $s(t)$ is available. For a fixed delay $\tau>0$ and embedding dimension $m$, define the delay-coordinate map $ F(t)=\bigl(s(t),s(t+\tau),\dots,s(t+(m-1)\tau)\bigr)\in\mathbb{R}^m,$ and let $X=\{F(t_i)\}_{i=1}^n \subset \mathbb{R}^m$ be a finite sample. Under standard embedding assumptions, $X$ approximates a compact invariant set of the underlying dynamics up to smooth embedding. For $\varepsilon>0$, let $\mathrm{VR}_\varepsilon(X)$ denote the Vietoris--Rips complex of $X$, and let $\mathrm{Dgm}_k(X)$ denote the persistence diagram in homological dimension $k$ associated with the filtration $\{\mathrm{VR}_\varepsilon(X)\}_{\varepsilon\ge 0}$. These diagrams provide a multiscale description of the topology of the sampled set $X$, capturing features such as connected components and $1$-dimensional cycles. A major obstacle in this setting is computational: the size of $\mathrm{VR}_\varepsilon(X)$ grows combinatorially with $|X|$, making persistent homology infeasible for large datasets. A common approach is to replace $X$ by a smaller subset $L\subset X$, but arbitrary subsampling may result in substantial discrepancies between $\mathrm{Dgm}_k(X)$ and $\mathrm{Dgm}_k(L)$, as measured, for example, by the bottleneck distance $d_B$. In this work, we construct a subset $L\subset X$ by imposing coverage and overlap conditions on local neighborhoods. Fix $K\in\mathbb{N}$, and for each $x\in X$, let $\mathrm{KNN}_K(x)\subset X$ denote the set of its $K$-nearest neighbors. We select $L\subset X$ such that $X \subset \bigcup_{\ell\in L}\mathrm{KNN}_K(\ell)$, and such that there exists $r\in(0,1]$ with the property that, whenever $\ell_i,\ell_j\in L$ satisfy $\mathrm{KNN}_K(\ell_i)\cap \mathrm{KNN}_K(\ell_j)\neq\varnothing$, their overlap obeys $\bigl|\mathrm{KNN}_K(\ell_i)\cap \mathrm{KNN}_K(\ell_j)\bigr| \ge rK$. 
The first condition ensures that every point of $X$ is contained in at least one landmark neighborhood, while the second imposes a lower bound on the size of pairwise intersections among overlapping neighborhoods. These conditions induce a cover of $X$ with controlled overlap, designed to preserve adjacency relations and cyclic configurations in the sampled data. In particular, they support preservation of homological features in $H_0$, $H_1$, and $H_2$ under subsampling, and thus reduce the discrepancy between $\mathrm{Dgm}_k(X)$ and $\mathrm{Dgm}_k(L)$ for $k=0,1,2$. We evaluate the method on synthetic and real trajectory data by computing $d_B\bigl(\mathrm{Dgm}_k(X),\mathrm{Dgm}_k(L)\bigr)$. We apply the proposed method in two case studies of time series data: delay-embedded electrocardiogram (ECG) signals and ocean drifter trajectories. In the ECG setting, delay-coordinate reconstruction yields point clouds for which we recover persistent 1-cycle features encoding periodicity information, whereas in the ocean drifter setting, sampled trajectories give rise to point clouds whose topological features reflect coherent transport structures in fluid flow. In both cases, controlling the discrepancy between persistence diagrams provides a quantitative mechanism for preserving the topological signatures used in data-driven analysis of the underlying dynamics while significantly reducing computational cost.
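
The delay-coordinate map and the coverage condition can be sketched in a few lines of Python. This is not the authors' selection procedure, which the abstract does not spell out: the sketch uses a plain greedy set cover to satisfy only the coverage condition $X \subset \bigcup_{\ell\in L}\mathrm{KNN}_K(\ell)$ and then measures the overlap ratio $r$ that the chosen landmarks happen to achieve. The function names, the convention that a point counts among its own $K$ nearest neighbors, the integer delay in samples, and the sine test signal are all assumptions.

```python
import math


def delay_embed(s, m, tau):
    """Return the points F(t_i) = (s[i], s[i+tau], ..., s[i+(m-1)*tau])."""
    n = len(s) - (m - 1) * tau
    return [tuple(s[i + j * tau] for j in range(m)) for i in range(n)]


def knn_sets(X, K):
    """For each index i, the set of indices of the K nearest points to X[i].

    Assumption: a point is its own nearest neighbour (ties broken in its favour).
    """
    nbrs = []
    for i, x in enumerate(X):
        order = sorted(range(len(X)), key=lambda j: (math.dist(x, X[j]), j != i))
        nbrs.append(set(order[:K]))
    return nbrs


def greedy_cover(nbrs):
    """Greedily pick landmarks until every point lies in a landmark neighbourhood."""
    uncovered = set(range(len(nbrs)))
    landmarks = []
    while uncovered:
        best = max(range(len(nbrs)), key=lambda i: len(nbrs[i] & uncovered))
        landmarks.append(best)
        uncovered -= nbrs[best]
    return landmarks


def min_overlap_ratio(nbrs, landmarks, K):
    """Smallest |KNN(l_i) & KNN(l_j)| / K over overlapping landmark pairs."""
    sizes = [len(nbrs[a] & nbrs[b])
             for i, a in enumerate(landmarks) for b in landmarks[i + 1:]
             if nbrs[a] & nbrs[b]]
    return min(sizes) / K if sizes else None


# Hypothetical scalar observation: a sampled sine wave with period 50 samples.
s = [math.sin(2 * math.pi * i / 50) for i in range(200)]
X = delay_embed(s, m=2, tau=12)
nbrs = knn_sets(X, K=10)
L = greedy_cover(nbrs)
r = min_overlap_ratio(nbrs, L, K=10)
```

A selection rule enforcing the second condition would additionally reject or repair landmark sets whose measured ratio `r` falls below the target value; comparing $\mathrm{Dgm}_k(X)$ with $\mathrm{Dgm}_k(L)$ would then require a persistent homology library, which is outside the scope of this sketch.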

On Mapper - a TDA approach to visualising data

Cerene Rathilal

University of KwaZulu-Natal (South Africa)
South Africa

Co-Author(s): Maria Vivien Visaya

Abstract:

Topological Data Analysis (TDA) is a modern approach for analysing complex datasets using tools from algebraic topology, the branch of mathematics concerned with studying the shape of spaces. Unlike conventional statistics, which often assumes linear or low-dimensional structures, TDA is designed to uncover hidden patterns and global organisation in high-dimensional and noisy data. There are two main approaches to TDA: Persistent Homology (PH) and Mapper. In this talk, we will focus on Mapper, which creates a simplified graph-based representation of high-dimensional data. It works by partitioning the dataset, analysing local structures, and connecting them into a network called a Mapper graph. We end by discussing an application for breast cancer analysis.
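
The three Mapper steps described above (partition, local analysis, connection) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions and not the implementation used in the talk: the filter is taken to be the first coordinate, the partition is a cover of the filter range by overlapping intervals, local structure is found by single-linkage clustering at a fixed scale `eps`, and all function names and parameter values are hypothetical.

```python
import math


def single_linkage(points, members, eps):
    """Connected components of the graph joining members at distance <= eps."""
    unseen = set(members)
    clusters = []
    while unseen:
        stack = [unseen.pop()]
        cluster = set(stack)
        while stack:
            k = stack.pop()
            near = {j for j in unseen if math.dist(points[k], points[j]) <= eps}
            unseen -= near
            cluster |= near
            stack.extend(near)
        clusters.append(cluster)
    return clusters


def mapper_graph(points, filter_values, n_intervals, overlap, eps):
    """Return (nodes, edges): nodes are sets of point indices, edges index pairs.

    Assumes overlap > 0 so that neighbouring intervals share points.
    """
    lo, hi = min(filter_values), max(filter_values)
    width = (hi - lo) / n_intervals
    nodes = []
    for i in range(n_intervals):
        # Interval i of the cover, widened by a fraction `overlap` on each side.
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        members = [k for k, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(single_linkage(points, members, eps))
    # Two clusters are joined whenever they share at least one data point.
    edges = [(p, q) for p in range(len(nodes)) for q in range(p + 1, len(nodes))
             if nodes[p] & nodes[q]]
    return nodes, edges


# Toy data: 60 points on the unit circle, filtered by their x-coordinate.
circle = [(math.cos(2 * math.pi * k / 60), math.sin(2 * math.pi * k / 60))
          for k in range(60)]
nodes, edges = mapper_graph(circle, [p[0] for p in circle],
                            n_intervals=4, overlap=0.25, eps=0.2)
```

For these parameter choices the sketch yields six nodes joined in a single loop, so the Mapper graph reflects the circular shape of the data; other choices of cover or clustering scale can produce a different graph.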

Topological signatures of admixture in genomic data

Maria Vivien Visaya

University of Johannesburg
South Africa

Co-Author(s): Al Bien Aculan, Rachelle Sambayan, Victoria Mendoza, Ricardo del Rosario

Abstract:

We investigate topological summaries of high-dimensional genomic data to detect and characterise population admixture, without relying on parametric population genetic models. Using persistent homology, we analyse haplotype data from 26 human populations (3202 individuals) from the 1000 Genomes Project. Admixed populations exhibit distinct and reproducible topological signatures, particularly in the distribution and persistence of one-dimensional homology classes. Unlike non-admixed populations, where short-lived cycles emerge only at large filtration scales, admixed populations display cycles distributed across a broad range of scales, reflecting heterogeneous ancestry structure. We formalise this observation through a non-admixture score (NAS) derived from persistence barcode statistics, which robustly separates admixed from non-admixed populations across genome-wide and per-chromosome analyses. Further, by equipping persistence diagrams with the Wasserstein metric, we demonstrate that hierarchical clustering recovers groups of populations with shared admixture signatures, revealing structure not captured by classical measures such as $F_{ST}$. Our results suggest that persistent homology provides an orthogonal, model-free framework for population genetic inference, capturing geometric and topological aspects of genetic variation that complement existing statistical approaches. This positions topological data analysis as a promising tool for studying complex evolutionary processes in large-scale genomic data.