Strategic Satellite Custody Maintenance with AlphaZero

Tyler Becker, University of Colorado Boulder; Zachary Sunberg, University of Colorado Boulder

Keywords: Game theory, adversarial, custody maintenance, AlphaZero, deep learning, reinforcement learning, Monte Carlo tree search, pursuit-evasion

Abstract:

Custody maintenance of resident space objects (RSOs) can be complicated by adversarial targets that deliberately maneuver to evade tracking. Traditional probabilistic estimation and optimization methods do not provide guarantees under strategic uncertainty, where an adversary may behave far outside the assumed noise or behavior distribution. We formulate custody maintenance as a two-player zero-sum Markov game between an observer satellite and an adversarial target. To solve this game, we introduce Simultaneous AlphaZero, a variant of AlphaZero adapted to Markov games in which both agents select actions simultaneously. We evaluate the resulting strategies using exploitability, which quantifies robustness to adversarial deviations. In simulated orbital scenarios, Simultaneous AlphaZero synthesizes observer strategies that align with intuitive physical constraints (e.g., illumination and occlusion) while maintaining robustness against adversarial targets. The best-response value increases over training, demonstrating convergence toward minimax strategies with worst-case guarantees. These results establish a foundation for game-theoretic learning methods in space domain awareness, providing a scalable framework for custody maintenance of potentially adversarial RSOs beyond classical differential games.
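
As an illustration of the exploitability metric named in the abstract, the sketch below computes it for a single-stage zero-sum matrix game. This is a simplification and an assumption: the paper's setting is a sequential Markov game, and the function names, the payoff matrix, and the known game value here are hypothetical choices made for the example, not taken from the paper. The observer picks a row, the target picks a column, and entries are the observer's payoff.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def best_response_value(payoff: Matrix, observer_strategy: Sequence[float]) -> float:
    """Payoff the observer secures when the target best-responds.

    payoff[i][j] is the observer's payoff for observer action i and
    target action j. The target minimizes the observer's expected payoff.
    """
    n_target_actions = len(payoff[0])
    expected_per_column = [
        sum(p * row[j] for p, row in zip(observer_strategy, payoff))
        for j in range(n_target_actions)
    ]
    return min(expected_per_column)


def exploitability(
    payoff: Matrix, observer_strategy: Sequence[float], game_value: float
) -> float:
    """Gap between the game value and the observer's best-response value.

    Zero means the strategy is minimax-optimal; larger values mean a
    best-responding target can gain more against it.
    """
    return game_value - best_response_value(payoff, observer_strategy)


# Hypothetical 2x2 game (matching pennies), whose minimax value is 0.
PAYOFF = [[1.0, -1.0], [-1.0, 1.0]]
GAME_VALUE = 0.0

uniform_gap = exploitability(PAYOFF, [0.5, 0.5], GAME_VALUE)  # minimax strategy
pure_gap = exploitability(PAYOFF, [1.0, 0.0], GAME_VALUE)     # predictable strategy
```

In this toy game the uniform strategy has zero exploitability, while always playing the first action can be exploited by a full unit of payoff. A best-response value that rises toward the game value, as the abstract reports over training, corresponds to this gap shrinking.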

Date of Conference: September 16-19, 2025

 

Track: Space Domain Awareness
