A Deep Reinforcement Learning Application to Space-based Sensor Tasking for Space Situational Awareness

Thomas G. Roberts, Massachusetts Institute of Technology; Peng Mun Siew, Massachusetts Institute of Technology; Daniel Jang, Massachusetts Institute of Technology; Richard Linares, Massachusetts Institute of Technology

Keywords: Deep Reinforcement Learning, Reinforcement Learning, Population Based Training, Space Sensor Tasking, Proximal Policy Optimization

Abstract:

To maintain a robust catalog of resident space objects (RSOs), space situational awareness (SSA) mission operators depend on ground- and space-based sensors to repeatedly detect, characterize, and track objects in orbit. Although some space sensors are capable of monitoring large swaths of the sky with wide fields of view (FOV), others, such as maneuverable optical telescopes, narrow-band and imaging radars, and satellite laser ranging (SLR) systems, are restricted to relatively narrow FOVs and must slew at a finite rate from object to object as they observe them. Since there are many objects that a narrow FOV sensor could choose to observe within its field of regard (FOR), it must algorithmically create a schedule that dictates which direction to point and for how long: a combinatorial optimization problem known as the sensor tasking problem (Erwin et al. 2010). As more RSOs are added to the United States Space Command's (USSPACECOM) RSO catalog with the advent of proliferated satellite constellations and the deployment of more accurate sensors that can detect smaller objects, the problem of tasking narrow FOV sensors becomes more pressing. For example, there are currently fewer than 3,000 active satellites in low Earth orbit (LEO) (Union of Concerned Scientists 2021), and it is estimated that by 2025 over 1,000 satellites could be launched each year (MIT Technology Review 2019). The growth of the satellite population will likely far outpace any increase in SSA sensor capacity, making efficient tasking of existing sensors extremely valuable.
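To make the tasking problem concrete, the sketch below illustrates a myopic greedy baseline of the kind the paper compares against: at each decision epoch, the sensor selects the visible RSO whose state estimate is most uncertain, here scored by the trace of its covariance matrix. This is a minimal illustration under assumed names (`RSO`, `in_field_of_regard`, `greedy_task`) and an assumed covariance-trace reward; it is not the authors' simulation code, and the paper's exact greedy criterion may differ.

```python
# Minimal, hypothetical sketch of a myopic "greedy" tasking baseline:
# at each decision epoch the narrow FOV sensor points at the visible RSO
# with the largest state-covariance trace (i.e., the most uncertain object).
from dataclasses import dataclass
from typing import List, Optional

import numpy as np


@dataclass
class RSO:
    """A resident space object and its current state-estimate covariance."""
    rso_id: int
    covariance: np.ndarray  # 6x6 position/velocity covariance

    @property
    def uncertainty(self) -> float:
        # Trace of the covariance as a scalar proxy for state uncertainty.
        return float(np.trace(self.covariance))


def in_field_of_regard(rso: RSO, t: float) -> bool:
    """Placeholder visibility test; a real version would propagate orbits
    and check the sensor's field of regard and slew-rate constraints."""
    return True


def greedy_task(catalog: List[RSO], t: float) -> Optional[RSO]:
    """Return the visible object with the largest covariance trace."""
    visible = [rso for rso in catalog if in_field_of_regard(rso, t)]
    if not visible:
        return None
    return max(visible, key=lambda rso: rso.uncertainty)
```

A reinforcement-learning scheduler replaces the one-step `max` above with a learned policy that can trade immediate reward for longer-horizon catalog coverage.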

Although most of the sensors in the U.S. Space Surveillance Network (SSN) are ground-based, space-based sensors have contributed observations since the late 1990s. Over the past decade, several additional LEO satellites have been launched to augment the SSN, including the Space Based Space Surveillance (SBSS) system, the Operationally Responsive Space 5 (ORS-5 or SensorSat) satellite, and the Canadian military's Sapphire system. While SBSS and Sapphire carry taskable, gimbaled sensors that can be pointed at objects of interest, SensorSat is a body-fixed satellite designed for surveillance of the geosynchronous (GEO) belt. Much like taskable, narrow FOV terrestrial sensors, gimbaled space-based sensors benefit from optimized tasking, which helps maximize their utility to the SSN.

In this paper, we describe the application of a trained scheduler, developed using deep reinforcement learning with the proximal policy optimization (PPO) algorithm (Jang et al. 2020), to a space-based narrow FOV sensor in LEO. The sensor's performance, both acting alone and as a complement to a network of taskable, narrow FOV ground-based sensors, is compared against that of a greedy scheduler across several figures of merit, including the cumulative number of RSOs observed and the mean trace of the state covariance matrices of all objects in the scenario. Results from several simulations with 100 objects randomly selected from the USSPACECOM RSO catalog, with no additional satellites launched or de-orbited during the study period, are presented and discussed. Additionally, results for the space-based SSA sensor placed in different orbits, as well as for various combinations of space-based sensors, are evaluated and discussed.
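As a reading aid, the snippet below sketches how the two figures of merit named above could be computed from a simulation log: the cumulative count of unique RSOs observed and the mean trace of the per-object covariance matrices. The data structures (a per-step set of observed IDs and a per-object covariance dictionary) are assumptions for illustration, not the paper's actual simulation interface.

```python
# Hypothetical computation of the two figures of merit discussed above.
from typing import Dict, List, Set

import numpy as np


def cumulative_rsos_observed(observed_ids_per_step: List[Set[int]]) -> List[int]:
    """Cumulative number of unique RSOs observed through each time step."""
    seen: Set[int] = set()
    counts: List[int] = []
    for ids in observed_ids_per_step:
        seen |= ids
        counts.append(len(seen))
    return counts


def mean_covariance_trace(covariances: Dict[int, np.ndarray]) -> float:
    """Mean trace of the state covariance over all catalog objects,
    evaluated at a single time step."""
    return float(np.mean([np.trace(P) for P in covariances.values()]))


# Example usage with toy data (two time steps, three catalog objects):
log = [{101, 102}, {102, 103}]
print(cumulative_rsos_observed(log))                                   # [2, 3]
print(mean_covariance_trace({i: np.eye(6) for i in (101, 102, 103)}))  # 6.0
```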

Date of Conference: September 14-17, 2021

Track: Dynamic Tasking
