Delay/Disruption Tolerant Reinforcement Learning Aurora based Communication System (DREAMS)

Richard Stottler, Stottler Henke Associates, Inc.; Gregory Howe, Stottler Henke Associates, Inc.

Keywords: Cislunar SSA/SDA, Satellite Communication Scheduling, Satellite Network Data Packet Routing, Satellite RF Link Optimization, Machine Learning, Artificial Intelligence, Distributed Scheduling

Abstract:

Space Situational Awareness (SSA) / Space Domain Awareness (SDA) requires sensing objects in a huge volume of space from sensors located in a very wide variety of orbital regimes, including lunar orbits and beyond-lunar Earth orbits. Large volumes of sensor data, health and status telemetry, commanding, tipping and cueing data, etc. will need to be communicated throughout this very large-volume enterprise, including, of course, to human commanders on the ground. A variety of satellites in a variety of orbital regimes will relay this data and, because of the distances involved, data bandwidth will often be severely limited, so making optimum use of these communication resources is paramount. Speed-of-light delays through this very large volume and the dynamic nature of the communications links between satellites in different orbital regimes mean that ordinary methods for routing data through terrestrial networks are not applicable. Furthermore, there is a strong desire to tune the transmission configuration parameters to maximize the bandwidth achievable on each link. Interestingly, these same issues arise with NASA's LunaNet, an Internet that spans both the Earth and the Moon, will eventually extend beyond them, and includes lunar and Earth-orbiting satellites.

Cognitive communication techniques are needed on spacecraft and ground stations to ensure optimal use of very expensive communications capabilities. The vast distances and numerous satellites necessitate multi-hop communications utilizing temporary stores. Space communication networks are often characterized by intermittent links, high latencies, low bandwidths, ad hoc connections, mobile physical nodes, asymmetric data rates, higher error rates, and heterogeneous node types (e.g., Earth-orbiting TDRS satellites versus a Lunar Relay Satellite (LRS)). All of these characteristics are variable (they may depend on the time, the node, etc.) and may have both a predictable component and an unpredictable component (e.g., line-of-sight (LOS)-based network disruption is normally predictable, but sporadic space weather is not). To this end, NASA has established a Delay/Disruption Tolerant Network (DTN) service onboard the International Space Station (ISS). This service is currently in use, and NASA hopes to integrate DTNs into all three of NASA's networks via the Space Communications and Navigation (SCaN) program.

Our project, DREAMS, the Delay/disruption tolerant REinforcement learning and Aurora based coMmunication System, focuses on two separate but related aspects of optimal communications: routing and link optimization. Link optimization refers to the ability to optimize a given link by tuning the modulation scheme, coding scheme, transmit power, symbol rate, and roll-off factor, where optimality is determined by a weighted sum of Bit Error Rate (BER), throughput, occupied bandwidth, spectral efficiency, transmit power efficiency, and Direct Current (DC) power consumption. Routing refers to the ability, given that links are already optimized and known (to the extent possible), to optimally schedule storage and transmission of data to maximize throughput. In other words, routing is a high-level capability that chains links together (i.e., bundle transmissions) separated by periods of storage (i.e., bundle storage).
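The weighted-sum notion of link optimality described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the weight values, the configuration names, and the convention that each metric has already been normalized to the range 0 to 1 with 1 being best (so a low BER or a low DC power draw maps to a high score). The abstract does not state how DREAMS normalizes or weights these metrics.

```python
from dataclasses import dataclass


@dataclass
class LinkMetrics:
    """Metrics for one candidate link configuration.

    Assumption: every field is pre-normalized to [0, 1], where 1.0 is best.
    """
    ber: float
    throughput: float
    occupied_bandwidth: float
    spectral_efficiency: float
    tx_power_efficiency: float
    dc_power: float


# Hypothetical weights; the source gives no values. They sum to 1.0 here.
WEIGHTS = {
    "ber": 0.3,
    "throughput": 0.3,
    "occupied_bandwidth": 0.1,
    "spectral_efficiency": 0.1,
    "tx_power_efficiency": 0.1,
    "dc_power": 0.1,
}


def link_score(metrics, weights=WEIGHTS):
    """Weighted sum of the normalized metrics (higher is better)."""
    return sum(w * getattr(metrics, name) for name, w in weights.items())


def best_config(candidates):
    """Return the name of the candidate configuration with the highest score."""
    return max(candidates, key=lambda name: link_score(candidates[name]))


# Two made-up configurations (names and numbers are illustrative only).
candidates = {
    "qpsk_r12": LinkMetrics(0.9, 0.4, 0.8, 0.4, 0.7, 0.8),
    "16apsk_r34": LinkMetrics(0.6, 0.9, 0.6, 0.8, 0.5, 0.5),
}
chosen = best_config(candidates)
```

With these illustrative numbers the higher-order configuration wins because its throughput gain outweighs its BER penalty; changing the weights changes the choice, which is the point of expressing optimality as a weighted sum.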

To optimally fulfill the router role, we are developing a distributed, optimized transmission and storage scheduler to be tested on a large number of very diverse scenarios. The distributed nature of the scheduler will be accomplished by both a straightforward division of labor between separate computational nodes and the exchange of extremely low-volume information (resource status and the current schedule). (A proof of correctness, termination, and other important properties of the distributed version of the scheduler exists.) The scheduler is based on the bottleneck avoidance algorithm, which uses information from the entire schedule and current resource status and congestion to utilize ALL possible links, maximizing the number of packets successfully transmitted; i.e., the bottleneck avoidance algorithm optimally spreads the required data packet traffic across all possible routes to get the most packets successfully delivered. By using global information across both space and time, the bottleneck avoidance scheduler can greatly outperform conventional routers, which inherently make decisions based only on very local, very immediate information (which leads to underutilization of some links and a corresponding failure to deliver all the packets that could have been delivered in time).
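The idea of spreading traffic across all possible routes, rather than loading one preferred route, can be sketched with a simple greedy heuristic. This is not the bottleneck avoidance algorithm itself, whose details the abstract does not give; it is a stand-in that places each packet on the route whose most heavily utilized link would end up least loaded. The route names, link names, capacities, and packet sizes are all hypothetical, and the sketch ignores time, storage, and deadlines.

```python
def assign_routes(packets, routes, capacity):
    """Greedy min-bottleneck assignment (illustrative stand-in only).

    packets:  list of (packet_id, size)
    routes:   dict route_name -> list of link names along that route
    capacity: dict link_name -> capacity, in the same units as size
    Returns (assignment, load); assignment maps packet_id -> route or None.
    """
    load = {link: 0.0 for link in capacity}
    assignment = {}
    # Place large packets first so smaller ones can fill remaining gaps.
    for pid, size in sorted(packets, key=lambda p: -p[1]):
        best_route, best_util = None, None
        for name, links in routes.items():
            # Utilization of this route's most loaded link if the packet were added.
            util = max((load[l] + size) / capacity[l] for l in links)
            if util <= 1.0 and (best_util is None or util < best_util):
                best_route, best_util = name, util
        assignment[pid] = best_route  # None means no route had room
        if best_route is not None:
            for l in routes[best_route]:
                load[l] += size
    return assignment, load


# Hypothetical topology: one direct link and one two-hop relay path.
routes = {"direct": ["A-G"], "relay": ["A-R", "R-G"]}
capacity = {"A-G": 10, "A-R": 8, "R-G": 8}
packets = [("p1", 6), ("p2", 5), ("p3", 4)]

assignment, load = assign_routes(packets, routes, capacity)
```

In this example a router that used only the direct link could carry at most 10 units and would leave one packet undelivered, whereas using the relay path as well delivers all three.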

While the scheduler could function entirely based on information about packets that have already arrived at some node on the network, its performance can be further improved by predicting packets that have not yet arrived. By predicting future packets, along with their size, priority, and urgency, the scheduler can make better decisions about possibly waiting to transmit some packets that have loose deadlines. It can even schedule transmission of packets that have not yet arrived. Machine Learning (ML) techniques can be used to predict two types of packets: those that are tied to existing daily activity schedules and those that are not. In the latter case, only rough order-of-magnitude volumes of traffic can be predicted, but this can still be used by the scheduler to avoid probable congestion caused by the arrival of these predicted packets.
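As a rough illustration of how the two kinds of prediction might feed a scheduling decision, the sketch below combines known volumes from a daily activity schedule with a per-hour average of past unscheduled traffic, then picks the least-loaded hour before a packet's deadline. The per-hour average is a deliberately simple stand-in for the ML predictor, which the abstract does not describe; the function names, data layout, and numbers are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def hourly_baseline(history):
    """Mean unscheduled volume per hour of day.

    history: iterable of (hour_of_day, megabytes) past observations.
    """
    by_hour = defaultdict(list)
    for hour, mb in history:
        by_hour[hour].append(mb)
    return {hour: mean(vals) for hour, vals in by_hour.items()}


def predicted_load(hour, baseline, activity_schedule):
    """Scheduled volume for the hour plus the rough unscheduled estimate.

    activity_schedule: list of (hour_of_day, megabytes) planned transfers.
    Hours with no history contribute 0.0, which is optimistic.
    """
    scheduled = sum(mb for h, mb in activity_schedule if h == hour)
    return scheduled + baseline.get(hour, 0.0)


def pick_send_hour(now, deadline, baseline, activity_schedule):
    """Hour in [now, deadline] with the lowest predicted load (earliest on ties)."""
    hours = range(now, deadline + 1)
    return min(hours, key=lambda h: predicted_load(h, baseline, activity_schedule))


# Made-up history and schedule for illustration.
history = [(9, 40), (9, 60), (10, 10), (10, 20)]
activity_schedule = [(9, 30), (11, 5)]
baseline = hourly_baseline(history)

tight = pick_send_hour(9, 10, baseline, activity_schedule)   # deadline at hour 10
loose = pick_send_hour(9, 11, baseline, activity_schedule)   # deadline at hour 11
```

A packet with a looser deadline is pushed to a later, quieter hour, which mirrors the behavior described above of waiting to transmit packets that can afford the delay.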

Finally, optimizing throughput includes optimizing the volume of data that can be successfully transmitted through each link. A machine learning system will utilize data from past real and simulated RF data transmissions to periodically optimize each RF link to maximize its throughput. We will use a high-fidelity simulator to generate data for this purpose.

The DREAMS system consists of three primary components: 1) an ML-based RF link optimizer, which periodically optimizes each RF link and apprises the distributed schedulers of what data bandwidth each link can support; 2) a packet predictor, which predicts for the distributed schedulers both how many specific future packets are expected from specific applications (based on the daily activity schedule) and the expected volume of additional packets not tied to the schedule; and 3) the distributed schedulers themselves, which exchange network status and schedule information with one another to continuously update and re-optimize the transmission and storage schedule. By rescheduling frequently, these schedulers react quickly and in an optimal manner to unexpected events such as links or nodes going down or the sudden arrival of unexpectedly high volumes of data packets.

Date of Conference: September 27-30, 2022

Track: Machine Learning for SSA Applications
