| Abstract: |
| Let $(M,\Phi^t)$ be a dynamical system on a compact manifold $M$, and let $x(t)=\Phi^t(x_0)$ denote a trajectory. Suppose that only a scalar observation $s(t)$ of this trajectory is available. For a fixed delay $\tau>0$ and embedding dimension $m$, define the delay-coordinate map $F(t)=\bigl(s(t),s(t+\tau),\dots,s(t+(m-1)\tau)\bigr)\in\mathbb{R}^m$, and let $X=\{F(t_i)\}_{i=1}^n \subset \mathbb{R}^m$ be a finite sample. Under standard embedding assumptions, $X$ approximates the image of a compact invariant set of the underlying dynamics under a smooth embedding. For $\varepsilon>0$, let $\mathrm{VR}_\varepsilon(X)$ denote the Vietoris--Rips complex of $X$, and let $\mathrm{Dgm}_k(X)$ denote the persistence diagram in homological dimension $k$ associated with the filtration $\{\mathrm{VR}_\varepsilon(X)\}_{\varepsilon\ge 0}$. These diagrams provide a multiscale description of the topology of the sampled set $X$, capturing features such as connected components and $1$-dimensional cycles. A major obstacle in this setting is computational: the size of $\mathrm{VR}_\varepsilon(X)$ grows combinatorially with $|X|$, making persistent homology computations infeasible for large datasets. A common approach is to replace $X$ by a smaller subset $L\subset X$, but arbitrary subsampling may result in substantial discrepancies between $\mathrm{Dgm}_k(X)$ and $\mathrm{Dgm}_k(L)$, as measured, for example, by the bottleneck distance $d_B$.
In this work, we construct a subset $L\subset X$ of landmarks by imposing coverage and overlap conditions on local neighborhoods. Fix $K\in\mathbb{N}$, and for each $x\in X$, let $\mathrm{KNN}_K(x)\subset X$ denote the set of its $K$ nearest neighbors. We select $L\subset X$ such that $X \subset \bigcup_{\ell\in L}\mathrm{KNN}_K(\ell)$, and such that there exists $r\in(0,1]$ with the property that, whenever $\ell_i,\ell_j\in L$ satisfy $\mathrm{KNN}_K(\ell_i)\cap \mathrm{KNN}_K(\ell_j)\neq\varnothing$, their overlap obeys $\bigl|\mathrm{KNN}_K(\ell_i)\cap \mathrm{KNN}_K(\ell_j)\bigr| \ge rK$. The first condition ensures that every point of $X$ is contained in at least one landmark neighborhood, while the second imposes a lower bound on the size of pairwise intersections among overlapping neighborhoods. These conditions induce a cover of $X$ with controlled overlap, designed to preserve adjacency relations and cyclic configurations in the sampled data. In particular, they support the preservation of homological features in $H_0$, $H_1$, and $H_2$ under subsampling, and thus reduce the discrepancy between $\mathrm{Dgm}_k(X)$ and $\mathrm{Dgm}_k(L)$ for $k=0,1,2$. We evaluate the method on synthetic and real trajectory data by computing $d_B\bigl(\mathrm{Dgm}_k(X),\mathrm{Dgm}_k(L)\bigr)$. We apply the proposed method in two case studies of time series data: delay-embedded electrocardiogram (ECG) signals and ocean drifter trajectories. In the ECG setting, delay-coordinate reconstruction yields point clouds from which we recover persistent $1$-cycles that encode periodicity, whereas in the ocean drifter setting, sampled trajectories give rise to point clouds whose topological features reflect coherent transport structures in the fluid flow.
In both cases, controlling the discrepancy between persistence diagrams provides a quantitative mechanism for preserving the topological signatures used in data-driven analysis of the underlying dynamics while significantly reducing computational cost. |
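
The delay-coordinate map and the two landmark conditions can be illustrated with a minimal sketch in plain Python. It is not the construction used in this work. The function names (`delay_embed`, `knn_sets`, `greedy_landmarks`, `min_overlap_ratio`) are hypothetical. The signal is assumed to be uniformly sampled, so that the delay $\tau$ corresponds to an integer number of samples `lag`, and each point is assumed to belong to its own neighborhood $\mathrm{KNN}_K(x)$. The landmarks are chosen by a greedy set cover, a swapped-in heuristic that enforces only the coverage condition; the overlap ratio $r$ is then measured rather than imposed.

```python
import math

def delay_embed(s, m, lag):
    """Delay vectors F(t_i) = (s[i], s[i+lag], ..., s[i+(m-1)*lag])."""
    n = len(s) - (m - 1) * lag
    return [tuple(s[i + j * lag] for j in range(m)) for i in range(n)]

def knn_sets(X, K):
    """Index sets KNN_K(x) for every x in X (assumption: x counts as its own neighbor)."""
    idx = range(len(X))
    # The key (j != i, distance) sorts the point itself first, then by Euclidean distance.
    return [set(sorted(idx, key=lambda j: (j != i, math.dist(X[i], X[j])))[:K])
            for i in idx]

def greedy_landmarks(nbrs):
    """Greedy set cover: pick the neighborhood covering the most uncovered points."""
    uncovered = set(range(len(nbrs)))
    landmarks = []
    while uncovered:
        best = max(range(len(nbrs)), key=lambda i: len(nbrs[i] & uncovered))
        landmarks.append(best)
        uncovered -= nbrs[best]
    return landmarks

def min_overlap_ratio(nbrs, landmarks, K):
    """Largest r such that every nonempty pairwise intersection has size >= r*K."""
    sizes = [len(nbrs[a] & nbrs[b])
             for i, a in enumerate(landmarks) for b in landmarks[i + 1:]]
    sizes = [c for c in sizes if c > 0]
    return min(sizes) / K if sizes else None  # None: no two neighborhoods overlap

# Toy periodic signal standing in for s(t); its delay embedding traces a closed curve.
s = [math.sin(2 * math.pi * i / 50) for i in range(200)]
X = delay_embed(s, m=2, lag=12)
nbrs = knn_sets(X, K=10)
L = greedy_landmarks(nbrs)
r = min_overlap_ratio(nbrs, L, K=10)
```

Here `L` holds indices into `X`, and `r` is the overlap ratio realized by this particular selection. A selection rule that enforces a prescribed $r$ would have to reject or adjust landmarks whose neighborhoods intersect in fewer than $rK$ points, which the greedy cover above does not do.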
|