Recurrent Neural Networks

Forecasting Chaotic Dynamics with Recurrent Neural Networks (RNNs)

Natural systems, ranging from climate and ocean circulation to organisms and cells, involve complex dynamics extending over multiple spatio-temporal scales. Centuries-old efforts to comprehend and forecast the dynamics of such systems have spurred developments in large-scale simulations, dimensionality reduction techniques and a multitude of forecasting methods. The goals of understanding and prediction complement each other, but both have been hindered by the high dimensionality and chaotic behavior of these systems. We propose to utilise state-of-the-art recurrent neural network architectures to forecast spatio-temporal chaotic systems, build surrogate models and analyse dynamical characteristics (e.g. the Lyapunov spectrum) of such systems. These methods can be applied either in the original space, which contains the full information of the system’s state, or in a reduced order space, and either in a fully data-driven scheme or complementing equations derived from first principles.

We introduced a data-driven forecasting method for high-dimensional chaotic systems using Long Short-Term Memory (LSTM) recurrent neural networks. The proposed LSTM networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of non-linear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) on time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. For each application, we first collect data and perform dimensionality reduction, keeping the modes with the largest energy. In this way, we construct a reduced order observable of the dynamical system, and we train the network to forecast the dynamics of this observable. The history of the observable is exploited (in the memory cells of the LSTM) to account for the information missing from the reduced order state. This is inspired by Takens' theorem, which states that the dynamics of the original system can be reconstructed in an embedding given by delayed versions of the reduced order observable. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered, owing to the scaling of their training procedure to large data sets. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure in fully turbulent scenarios. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.
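To make the delay-embedding idea behind Takens' theorem concrete, the following is a minimal sketch that stacks delayed copies of an observable into embedding vectors. It is an illustration only: the LSTM described above keeps the history implicitly in its memory cells rather than building this matrix, and the function name `delay_embed` and its parameters `n_delays` and `lag` are assumptions introduced here, not part of the published method.

```python
import numpy as np


def delay_embed(series, n_delays, lag=1):
    """Stack delayed copies of an observable into embedding vectors.

    series: array of shape (T,) or (T, d), the observable over time.
    Returns an array of shape (T - (n_delays - 1) * lag, n_delays * d)
    whose rows are [z_t, z_{t-lag}, ..., z_{t-(n_delays-1)*lag}].
    """
    series = np.asarray(series, dtype=float)
    if series.ndim == 1:
        series = series[:, None]
    span = (n_delays - 1) * lag
    n_rows = series.shape[0] - span
    if n_rows <= 0:
        raise ValueError("series too short for the requested embedding")
    # Column block k holds the observable delayed by k * lag steps.
    blocks = [series[span - k * lag: span - k * lag + n_rows]
              for k in range(n_delays)]
    return np.hstack(blocks)


# Toy scalar observable (hypothetical data, for illustration only).
z = np.arange(6.0)
emb = delay_embed(z, n_delays=3)
```

Each row of `emb` pairs the current value of the observable with its two predecessors, which is the kind of history a recurrent network has access to when the reduced order state alone is incomplete.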

We utilize data from the dynamical system to construct a reduced order observable. The LSTM is trained in this reduced order space to forecast the dynamics. During testing, we forecast iteratively, feeding each predicted state back as the next input to the LSTM. The forecasted evolution of the reduced order observable is then expanded to the original space and compared with the ground truth.
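The iterative forecasting loop described above can be sketched as follows. This is a minimal illustration under stated assumptions: `step_fn` stands for any trained one-step predictor, and `toy_step` is a hypothetical linear extrapolator used here in place of a trained LSTM so that the example runs on its own.

```python
import numpy as np


def iterative_forecast(step_fn, history, n_steps):
    """Roll a one-step predictor forward, feeding predictions back as input.

    step_fn: callable mapping a window of shape (w, d) to the next state (d,).
    history: array of shape (w, d) with the last w observed states.
    Returns an array of shape (n_steps, d) with the predicted states.
    """
    window = np.array(history, dtype=float)
    preds = []
    for _ in range(n_steps):
        nxt = np.asarray(step_fn(window), dtype=float)
        preds.append(nxt)
        # Drop the oldest state and append the prediction as the newest one.
        window = np.vstack([window[1:], nxt])
    return np.array(preds)


def toy_step(window):
    # Hypothetical stand-in for a trained network: linear extrapolation
    # from the last two states in the window.
    return 2.0 * window[-1] - window[-2]


history = np.array([[0.0], [1.0], [2.0]])
forecast = iterative_forecast(toy_step, history, n_steps=4)
```

Because every prediction becomes part of the next input, errors compound over the forecast horizon, which is why short-term accuracy and long-term stability are assessed separately for chaotic systems.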

Moreover, we analysed the effectiveness of various state-of-the-art RNN cells in forecasting the spatiotemporal dynamics of chaotic systems, both in their original space and in a reduced order observable. We performed a comparative study of Reservoir Computing (RC) and backpropagation through time (BPTT) algorithms for gated network architectures on a number of benchmark problems. We quantified their relative prediction accuracy on the long-term forecasting of the Lorenz-96 system and the Kuramoto-Sivashinsky equation, and on the calculation of its Lyapunov spectrum. We discuss their implementation on parallel computers and highlight the advantages and limitations of each method. We find that, when the full state dynamics are available for training, RC outperforms BPTT approaches in terms of predictive performance and in capturing the long-term statistics, while requiring much less training time. However, in the case of reduced order data, large RC models can be unstable and are more likely than the BPTT algorithms to diverge in the long term. In contrast, RNNs trained via BPTT capture the dynamics of these reduced order models well. This study confirms that RNNs present a potent computational framework for the forecasting of complex spatio-temporal dynamics.
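One reason RC trains quickly is that the recurrent weights stay fixed and only a linear readout is fitted, whereas BPTT adjusts all weights by gradient descent through time. The sketch below illustrates this with a small echo state network and a ridge-regression readout on a toy sine signal. The reservoir size, spectral radius, input scaling, washout length and ridge parameter are all assumptions chosen for illustration, not the settings used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_reservoir(n_in, n_res, spectral_radius=0.9, input_scale=0.5):
    """Draw fixed random input and recurrent weights (never trained)."""
    W_in = rng.uniform(-input_scale, input_scale, size=(n_res, n_in))
    W = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W


def run_reservoir(W_in, W, inputs):
    """Drive the reservoir with the inputs and record its hidden states."""
    h = np.zeros(W.shape[0])
    states = np.empty((len(inputs), W.shape[0]))
    for t, u in enumerate(inputs):
        h = np.tanh(W @ h + W_in @ u)
        states[t] = h
    return states


def fit_readout(states, targets, ridge=1e-4):
    """Solve the ridge-regression normal equations for the linear readout."""
    A = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(A, states.T @ targets)


# Toy task (hypothetical data): predict the next sample of a sine wave.
t = np.linspace(0.0, 20.0 * np.pi, 2000)
signal = np.sin(t)[:, None]
inputs, targets = signal[:-1], signal[1:]

W_in, W = make_reservoir(n_in=1, n_res=100)
states = run_reservoir(W_in, W, inputs)
washout = 100  # discard the initial transient of the reservoir
W_out = fit_readout(states[washout:], targets[washout:])
train_mse = float(np.mean((states[washout:] @ W_out - targets[washout:]) ** 2))
```

Training here reduces to a single linear solve, which is consistent with the much shorter training times reported for RC; the trade-off is that the fixed reservoir cannot adapt its internal dynamics to the data.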

Forecasting the full 512-dimensional state of the Kuramoto-Sivashinsky equation using RNNs with Reservoir Computing (RC), Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) cells.

Forecasting the evolution of the reduced order observable of the Lorenz-96 system. The observable is constructed by keeping the 35 most energetic SVD modes (out of 40 in total). GRU and LSTM networks outperform Reservoir Computers and are less prone to divergence, emphasising their superiority in capturing temporal dependencies in the dynamics.
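The construction of such a reduced order observable can be sketched as follows: simulate a 40-dimensional Lorenz-96 system, compute the SVD of the centered data and keep the 35 most energetic modes. This is a minimal sketch under stated assumptions; the forcing value, time step, trajectory length, initial condition and all function names are chosen here for illustration and are not taken from the study.

```python
import numpy as np


def lorenz96_rhs(x, forcing=8.0):
    # dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F, periodic in i.
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate(n_dim=40, n_steps=2000, dt=0.01, forcing=8.0, seed=0):
    """Integrate from a slightly perturbed uniform state (assumed setup)."""
    rng = np.random.default_rng(seed)
    x = forcing + 0.01 * rng.standard_normal(n_dim)
    traj = np.empty((n_steps, n_dim))
    for t in range(n_steps):
        x = rk4_step(x, dt, forcing)
        traj[t] = x
    return traj


def reduce_with_svd(data, n_modes):
    """Project centered data onto its n_modes most energetic SVD modes."""
    mean = data.mean(axis=0)
    centered = data - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    modes = vt[:n_modes].T                  # shape (n_dim, n_modes)
    reduced = centered @ modes              # reduced order observable
    energy = float(np.sum(s[:n_modes] ** 2) / np.sum(s ** 2))
    return reduced, modes, mean, energy


data = simulate()
reduced, modes, mean, energy = reduce_with_svd(data, n_modes=35)
# Expand the reduced observable back to the original 40-dimensional space.
reconstructed = reduced @ modes.T + mean
```

A forecast produced in the reduced space can be expanded to the original space with the same `modes` and `mean`, which is how a reduced order prediction is compared with the full-state ground truth.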

Publications

2018

P. R. Vlachas, W. Byeon, Z. Y. Wan, T. P. Sapsis, and P. Koumoutsakos, “Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks,” P. Roy. Soc. A-Math. Phy., vol. 474, iss. 2213, p. 20170844, 2018.

BibTex

@article{vlachas2018a,
author = {Pantelis R. Vlachas and Wonmin Byeon and Zhong Y. Wan and Themistoklis P. Sapsis and Petros Koumoutsakos},
doi = {10.1098/rspa.2017.0844},
journal = {{P. Roy. Soc. A-Math. Phy.}},
month = {may},
number = {2213},
pages = {20170844},
publisher = {The Royal Society},
title = {Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks},
url = {https://drive.google.com/file/d/1fFvdBojGd0ozXBh9WoWlF9ZVi19ZLAg5/view?usp=drive_link},
volume = {474},
year = {2018}
}

Abstract

We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto–Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM–LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.

Z. Y. Wan, P. R. Vlachas, P. Koumoutsakos, and T. P. Sapsis, “Data-assisted reduced-order modeling of extreme events in complex dynamical systems,” PLoS ONE, vol. 13, iss. 5, pp. 1-22, 2018.

BibTex

@article{wan2018a,
author = {Zhong Y. Wan and Pantelis R. Vlachas and Petros Koumoutsakos and Themistoklis P. Sapsis},
doi = {10.1371/journal.pone.0197704},
journal = {{PL}o{S} {ONE}},
month = {may},
number = {5},
pages = {1-22},
publisher = {Public Library of Science},
title = {Data-assisted reduced-order modeling of extreme events in complex dynamical systems},
url = {https://drive.google.com/file/d/1C3Hc7dd_NiWQ433jyg3ZdNT32oWcVgJj/view?usp=drive_link},
volume = {13},
year = {2018}
}

Abstract

The prediction of extreme events, from avalanches and droughts to tsunamis and epidemics, depends on the formulation and analysis of relevant, complex dynamical systems. Such dynamical systems are characterized by high intrinsic dimensionality with extreme events having the form of rare transitions that are several standard deviations away from the mean. Such systems are not amenable to classical order-reduction methods through projection of the governing equations due to the large intrinsic dimensionality of the underlying attractor as well as the complexity of the transient events. Alternatively, data-driven techniques aim to quantify the dynamics of specific, critical modes by utilizing data-streams and by expanding the dimensionality of the reduced-order model using delayed coordinates. In turn, these methods have major limitations in regions of the phase space with sparse data, which is the case for extreme events. In this work, we develop a novel hybrid framework that complements an imperfect reduced order model with data-streams that are integrated through a recurrent neural network (RNN) architecture. The reduced order model has the form of projected equations into a low-dimensional subspace that still contains important dynamical information about the system and it is expanded by a long short-term memory (LSTM) regularization. The LSTM-RNN is trained by analyzing the mismatch between the imperfect model and the data-streams, projected to the reduced-order space. The data-driven model assists the imperfect model in regions where data is available, while for locations where data is sparse the imperfect model still provides a baseline for the prediction of the system state. We assess the developed framework on two challenging prototype systems exhibiting extreme events. We show that the blended approach has improved performance compared with methods that use either data streams or the imperfect model alone. Notably, the improvement is more significant in regions associated with extreme events, where data is sparse.

Address

Herbert S. Winokur Jr. Professor of Computing in Science and Engineering

323 Pierce Hall, 29 Oxford Street Cambridge, MA 02138

Recent Publications

D. Waelchli, P. Weber, M. Chatzimanolakis, R. Katzschmann, and P. Koumoutsakos, “Inverse reinforcement learning for objective discovery in collective behavior of artificial swimmers,” Physical Review Fluids, vol. 10, no. 6, pp. 064901, 2025.
M. Balcerak, et al., “Individualizing glioma radiotherapy planning by optimization of a data and physics-informed discrete loss,” Nature Communications, vol. 16, pp. 5982, 2025.
L. Amoudruz, P. Karnakov, and P. Koumoutsakos, “Contactless precision steering of particles in a fluid inside a cube with rotating walls,” Journal of Fluid Mechanics, vol. 1014, no. A15, 2025.
