Special Session 112: Nonlinear Dynamics: Methods, Models, and Applications

Sparse modeling of high dimensional time series and the role of lag causality
Dimitris Kugiumtzis
Aristotle University of Thessaloniki
Greece
Co-Author(s): Petros Petridis
Abstract:
Machine learning and deep learning models are widely used for modeling and forecasting multivariate time series, but they have no closed form and give no insight into the underlying dynamics. This study explores methods that reveal the structural form of the equations governing the components of a multivariate time series and compares their forecasting errors. Specifically, it evaluates regression models with analytic form: 1) polynomial regression models, 2) the Sparse Identification of Nonlinear Dynamics (SINDy) algorithm, and 3) symbolic regression models. In addition, we consider sparse versions of these models, obtained by restricting the input lag variables to those most informative about the response, as selected by the Partial Mutual Information from Mixed Embedding (PMIME) algorithm. The models are compared with respect to multi-step-ahead prediction on multivariate time series of stochastic and nonlinear deterministic systems of varying dimension. The simulation results highlight the importance of dimensionality reduction through lag variable selection in autoregressive models of all three types.
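As a rough illustration of the kind of model the abstract describes, the sketch below fits a sparse polynomial autoregressive model on a restricted set of lag variables and iterates it for multi-step-ahead prediction. It is not the authors' implementation. The lag set is chosen by hand and only stands in for a PMIME selection (PMIME itself is not implemented here), the sparsification is a plain sequentially thresholded least squares of the kind used in SINDy, and the Hénon map, the threshold value, and all function names are assumptions made for the example. For brevity the same lag set is used for every response component.

```python
import numpy as np
from itertools import combinations_with_replacement

def lag_matrix(x, lags):
    """Build the regressor matrix from (variable index, lag) pairs.
    x has shape (n, K); rows of the result correspond to t = max_lag..n-1."""
    max_lag = max(lag for _, lag in lags)
    n = x.shape[0]
    cols = [x[max_lag - lag:n - lag, k] for k, lag in lags]
    return np.column_stack(cols), max_lag

def poly_features(Z, degree=2):
    """All monomials of the columns of Z up to the given degree, constant first."""
    cols = [np.ones(Z.shape[0])]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(Z.shape[1]), d):
            cols.append(np.prod(Z[:, list(idx)], axis=1))
    return np.column_stack(cols)

def stlsq(Theta, y, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: zero small coefficients, refit the rest."""
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coef) < threshold
        coef[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        coef[keep] = np.linalg.lstsq(Theta[:, keep], y, rcond=None)[0]
    return coef

def forecast(x_hist, lags, coefs, degree, steps):
    """Iterated multi-step-ahead prediction; coefs holds one vector per component."""
    x = np.array(x_hist, dtype=float)
    for _ in range(steps):
        z = np.array([[x[-lag, k] for k, lag in lags]])
        phi = poly_features(z, degree)
        nxt = np.array([phi @ c for c in coefs]).ravel()
        x = np.vstack([x, nxt])
    return x[-steps:]

def henon_series(n, a=1.4, b=0.3):
    """Hénon map in delay form: x_t = 1 - a*x_{t-1}^2 + b*x_{t-2} (assumed test system)."""
    x = np.zeros((n, 1))
    for t in range(2, n):
        x[t, 0] = 1.0 - a * x[t - 1, 0] ** 2 + b * x[t - 2, 0]
    return x

# Hand-picked lag variables (x at lags 1 and 2), standing in for a PMIME selection.
lags = [(0, 1), (0, 2)]
series = henon_series(600)
train, test = series[:500], series[500:]

Z, max_lag = lag_matrix(train, lags)
Theta = poly_features(Z, degree=2)       # columns: 1, z0, z1, z0^2, z0*z1, z1^2
coef = stlsq(Theta, train[max_lag:, 0])  # sparse coefficient vector
pred = forecast(train, lags, [coef], degree=2, steps=5)
```

On noise-free data the nonzero entries of `coef` should sit on the constant, `z1`, and `z0^2` terms, which is the closed-form structure such methods aim to expose. With noisy or higher-dimensional data the threshold and the lag set would need tuning, which is where a data-driven lag selection becomes relevant.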