Transformer-based Atmospheric Density Forecasting

Julia Briden, Massachusetts Institute of Technology; Peng Mun Siew, Massachusetts Institute of Technology; Victor Rodriguez-Fernandez, Universidad Politécnica de Madrid; Richard Linares, Massachusetts Institute of Technology

Keywords: atmospheric density modeling, atmospheric drag, deep learning, orbit prediction, space weather, transformers

Abstract:

Thermospheric mass density serves as the largest source of uncertainty for low Earth orbit (LEO) satellite orbit prediction. This uncertainty is largely due to fluctuations in solar and geomagnetic activity that can occur on the order of hours. Solar and geomagnetic storms take place when charged particles, usually from coronal mass ejections, travel to Earth’s atmosphere and increase atmospheric heating and transient solar wind activity. These storms are often marked by high nonlinearities in the evolution of atmospheric density over time and, therefore, prove to be difficult to predict with even the most advanced forecasting techniques. With the ability of a single geomagnetic storm to significantly alter the orbit of satellites, techniques for atmospheric density forecasting are vital for space situational awareness.
Current forecasting methods use physics-based atmospheric density models, such as the Global Ionosphere-Thermosphere Model (GITM) and the Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM), which solve the full continuity, energy, and momentum equations required for the propagation of atmospheric density. While these models are ideal for short-term forecasting, the computational cost required to run physics-based models exceeds computing capabilities for most real-time applications. Conversely, empirical models, including NRLMSISE-00 and JB2008, are fast to evaluate, since they are derived only from measurements, such as total mass density, temperature, and oxygen number density. Unfortunately, empirical models cannot translate their computational efficiency to atmospheric density prediction, since they lack the underlying dynamics required for forecasting. Recent work in reduced-order modeling (ROM) and dynamic mode decomposition with control (DMDc) has addressed this gap in computationally efficient atmospheric density prediction methods; ROMs use proper orthogonal decomposition (POD) or convolutional autoencoder-based machine learning (ML) to reduce the dimensionality of physics-based models and generate reduced-order snapshots of empirical models to enable atmospheric density propagation. To propagate a reduced-order atmospheric density state forward in time, DMDc models atmospheric density as a linear dynamical system with a control input, where the dynamics and input matrices are estimated from a dataset of state snapshots. While this approach captures the nominal atmospheric density dynamics relatively well, DMDc often fails when mildly nonlinear conditions occur.
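As a rough illustration of the POD and DMDc steps described above, the following is a minimal sketch in Python with NumPy, not the authors' implementation. The function names (`pod_basis`, `fit_dmdc`, `propagate`) and the array layouts are assumptions made for this example. It projects density snapshots onto a truncated POD basis, fits the linear model z[k+1] = A z[k] + B u[k] by least squares, and steps the reduced state forward.

```python
import numpy as np

def pod_basis(snapshots, r):
    """Return the temporal mean and the first r POD modes.

    snapshots: (n_grid, n_time) matrix, one density field per column (assumed layout).
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    # Left singular vectors of the mean-removed snapshots are the POD modes.
    modes, _, _ = np.linalg.svd(snapshots - mean, full_matrices=False)
    return mean, modes[:, :r]

def fit_dmdc(Z, U):
    """Least-squares fit of z[k+1] = A z[k] + B u[k].

    Z: (r, m) reduced-order states; U: (p, m-1) control inputs
    (for example, space weather indices; an assumption of this sketch).
    """
    Z0, Z1 = Z[:, :-1], Z[:, 1:]
    omega = np.vstack([Z0, U])          # stacked state and input snapshots
    G = Z1 @ np.linalg.pinv(omega)      # G = [A B]
    r = Z.shape[0]
    return G[:, :r], G[:, r:]

def propagate(A, B, z0, U):
    """Step the fitted linear model forward from z0 under the inputs U."""
    states = [z0]
    for k in range(U.shape[1]):
        states.append(A @ states[-1] + B @ U[:, k])
    return np.stack(states, axis=1)
```

Because the fitted model is linear, it cannot represent the storm-time nonlinearities discussed above, which is the limitation the learned propagators aim to address.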
The effectiveness of a machine learning approach for modeling the nonlinear dynamics of atmospheric density has been assessed by Turner et al. By utilizing a deep feedforward neural network (NN) for atmospheric density forecasting, an error reduction of over 99%, when compared to DMDc, was achieved. While effective for short-term forecasts, multi-step prediction performance was hindered by the inability of the feedforward NN to optimize with previous data during training. Moreover, atmospheric density forecasting during a significant space weather event requires an algorithm that can capture long-term dependencies in the dataset to prevent compounding propagation errors. The NN must handle large inputs for the atmospheric density state and focus only on the relevant part of the input for accurate forecasting. Without the sequential processing component of Recurrent Neural Networks (RNNs), transformers train on sequential data in less time and with longer inputs by using a mechanism known as attention. By passing all hidden states, derived from the encoded sequential atmospheric density reduced-order states and space weather input, to the decoder network of the transformer propagator, the decoder can focus its attention on only the most relevant hidden states. Since the occurrence of a solar or geomagnetic storm represents an anomaly in the atmospheric density dynamical system, the transformer’s ability to focus attention on only the relevant propagation dynamics is imperative for fast and accurate atmospheric density forecasting.
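The attention mechanism mentioned above can be illustrated with standard scaled dot-product attention. This is a minimal NumPy sketch of the generic operation, not the specific architecture used in this work. The function names and the single-head, unbatched shapes are assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention.

    Q: (n_queries, d_k) decoder queries
    K: (n_keys, d_k)    keys derived from encoder hidden states
    V: (n_keys, d_v)    values derived from encoder hidden states
    Returns the attended output (n_queries, d_v) and the weights (n_queries, n_keys).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights
```

Each row of `weights` shows how strongly one query attends to each encoded time step, which is the sense in which the decoder can concentrate on the most relevant hidden states.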
After reducing the dimensionality of the current atmospheric density state using a POD or ML ROM, the transformer-based atmospheric density forecasting algorithm takes the current and past reduced-order states and the current space weather indices as input to the transformer’s encoder network, and then uses the transformer’s decoder network to generate a new reduced-order atmospheric density state at the next time step. The encoder generates a set of embeddings for the control input and the time-series reduced-order density states, and the decoder uses these embeddings to generate the predicted reduced-order density states. The training process includes seasonal-trend decomposition, which has been shown to boost model performance by 50%-80%. The final transformer propagation model provides a mapping from the time-series reduced-order states and control input to the next predicted state, serving as a surrogate dynamical system.
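The abstract does not specify which seasonal-trend decomposition is used. As one common possibility, the sketch below splits a single reduced-order state time series into a moving-average trend and a seasonal remainder. The function name, the odd window length, and the edge padding are assumptions of this example rather than details from the paper.

```python
import numpy as np

def moving_average_decomposition(x, window):
    """Split a 1-D series into trend and seasonal parts.

    x: (T,) time series of one reduced-order state component.
    window: odd moving-average length (hypothetical tuning parameter).
    Returns (trend, seasonal) with trend + seasonal == x.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so the average is centered")
    pad = window // 2
    # Repeat the end values so the trend has the same length as x.
    padded = np.pad(x, (pad, pad), mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")
    seasonal = x - trend
    return trend, seasonal
```

In decomposition-based forecasters, the two components are typically modeled separately and summed to form the prediction. Whether this work follows that pattern is not stated here.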
To evaluate the performance of the transformer-based forecasting model, transformer-based atmospheric density predictions are compared with DMDc for satellite orbit prediction during significant space weather events, including the 2003 Halloween solar storms and the geomagnetic storm in February 2022. The empirical NRLMSISE-00, JB2008, JB2008 POD ROM, and JB2008 ML ROM atmospheric density models, as well as the physics-based TIE-GCM and TIE-GCM POD ROM models, are compared for forecasting with DMDc and with the transformer-based propagator. The High Precision Orbital Propagator (HPOP) architecture described in Briden et al. is extended to include transformer NN-based propagation and to forecast orbits during the following solar and geomagnetic storm test cases: orbit propagation of the Quakesat, XSS-10, and Coriolis satellites during the 2003 Halloween solar storms and orbit propagation of the Starlink-3396, Kepler-19, and TechEdSat-13 satellites during the February 2022 geomagnetic storm. Ballistic coefficients (BCs) are estimated by comparing TLE data to the semimajor axis change due to drag for each test case. The final orbit propagation architecture is used to evaluate our nonlinear transformer-based atmospheric density forecasting methods for use in real-time LEO satellite orbit propagation in the presence of significant space weather events. Uncertainty in the space weather indices during orbit propagation is assessed through a sensitivity analysis and error bounds on the satellites’ position for each test case. With almost 5,500 active satellites currently in orbit and the predicted launch of an additional 58,000 by 2030, forecasting the nonlinear dynamics of space weather-induced changes in atmospheric density is essential for Resident Space Object (RSO) deorbit prediction and collision avoidance.

Date of Conference: September 19-22, 2023

Track: Atmospherics/Space Weather
