Discussing: Satellite Pattern-of-Life Characterization Dataset and Benchmark Suite

My current research interest in computer science is the application of machine learning in space and on satellites. To learn more about this intersection of space exploration and machine learning, I am participating in the MIT ARCLab Prize for AI Innovation in Space, a challenge to use machine learning to predict satellites' patterns of life from the trajectory and propulsion data in the Satellite Pattern-of-Life Identification Dataset (SPLID). I decided to read the paper that introduces SPLID, "AI SSA Challenge Problem: Satellite Pattern-of-Life Characterization Dataset and Benchmark Suite", to better understand how to use the dataset and to learn more about the goals of the competition.

What is a Satellite Pattern of Life?

This competition and paper introduced me to satellite patterns of life, which are "sequences of behavioral modes - periods of consistent on-orbit behavior, such as those in which satellites adhere to various station-keeping protocols - that they pursue throughout their operational lifetimes". It is fascinating that satellites with different missions, propulsion systems, and hardware can all be described by a pattern of life. The patterns of life in SPLID are mainly expressed as combinations of nodes (operational transitions) and station-keeping types. Reducing each satellite's history to nodes and station-keeping types gives every satellite a pattern of life drawn from the same small set of labels, which makes the data easier to use for machine learning. The paper doesn't seem to explain why only nodes and station-keeping types were chosen to represent patterns of life, and that is something I would like to learn more about.
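To make the idea of "a sequence of behavioral modes" concrete for myself, here is a minimal Python sketch of how time-stamped nodes could be turned into the modes that lie between them. The class names, field names, and label strings (`"begin-drift"`, `"station-keeping"`, and so on) are my own placeholders for illustration; they are not the label vocabulary or file format that SPLID actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A time-stamped transition (hypothetical structure, not the SPLID schema)."""
    time_index: int   # index of the time step at which the transition happens
    label: str        # placeholder name for the kind of transition
    mode_type: str    # placeholder name for the behavior that starts at this node


@dataclass(frozen=True)
class Mode:
    """A period of consistent behavior between two consecutive nodes."""
    start: int
    end: int
    mode_type: str


def nodes_to_modes(nodes, last_index):
    """Turn a list of nodes into the sequence of modes they delimit.

    Each mode runs from its node up to the next node; the final mode
    runs up to last_index, the end of the observed period.
    """
    ordered = sorted(nodes, key=lambda n: n.time_index)
    modes = []
    for current, following in zip(ordered, ordered[1:] + [None]):
        end = following.time_index if following is not None else last_index
        modes.append(Mode(current.time_index, end, current.mode_type))
    return modes


# Illustrative values only; these are not taken from the dataset.
nodes = [
    Node(0, "start", "station-keeping"),
    Node(120, "begin-drift", "drifting"),
    Node(300, "begin-station-keeping", "station-keeping"),
]
pattern_of_life = nodes_to_modes(nodes, last_index=400)
for mode in pattern_of_life:
    print(mode)
```

Seen this way, predicting a pattern of life comes down to predicting where the nodes fall and which type of behavior follows each one.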

Dataset Generation and Format

The SPLID dataset consists of synthetic data generated by "an in-house satellite simulation tool developed by the MIT Lincoln Laboratory" as well as real-world data from Vector Covariance Messages (VCMs) "manually annotated by a human expert to generate a list of time-stamped satellite PoL nodes". There don't seem to be any novel methods involved in the dataset's generation, although it is hard to be sure, because the paper doesn't go into much detail about the in-house high-fidelity simulator or how it simulates mission objectives and propulsion systems. I am interested in whether the raw simulator data was classified into node labels and type labels afterwards, or whether the simulator output already included those labels.

Discussions and Applications

One thing the paper didn't go into much depth about was the applications of the dataset and of machine learning models trained on it. It states that the end goal is to "enhance tracking and orbit prediction capabilities to safeguard space assets from the threat of object-on-object collision". The dataset and related models probably wouldn't be suited to machine learning on board a satellite, for two reasons: the dataset mainly contains compiled and pre-classified data rather than raw data from onboard sensors or detectors, and its two-hour temporal resolution is likely too coarse for the near real-time responses that onboard machine learning needs. However, I could envision using a satellite pattern-of-life model to compile information about where satellites may be overcrowded, so that operators know to avoid certain orbits. Another application could be identifying better orbits for satellites that rely on clear line-of-sight optical (laser) communications, to lessen the chance of those communications being blocked. Those are the cases where I think the dataset and related machine learning models would be most interesting and useful.
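As a rough illustration of the overcrowding idea, the sketch below counts how many satellites fall into each slice of longitude and reports the slices that reach a threshold. Everything here is an assumption of mine rather than something from the paper: summarizing each satellite by a single longitude, the one-degree bin width, the threshold of three, and the function name `crowded_bins`.

```python
from collections import Counter


def crowded_bins(longitudes_deg, bin_width_deg=1.0, threshold=3):
    """Return {bin start in degrees: satellite count} for crowded bins.

    longitudes_deg: one longitude per satellite (a simplifying assumption).
    Longitudes are wrapped into [0, 360) before binning.
    """
    counts = Counter(
        int((lon % 360.0) // bin_width_deg) for lon in longitudes_deg
    )
    return {
        bin_index * bin_width_deg: count
        for bin_index, count in sorted(counts.items())
        if count >= threshold
    }


# Made-up longitudes for illustration; not taken from SPLID.
longitudes = [10.2, 10.7, 10.9, 45.0, 45.3, 200.1, -75.5]
print(crowded_bins(longitudes))  # {10.0: 3}
```

A pattern-of-life model could add to a count like this by indicating which satellites are expected to hold their position and which are expected to drift out of (or into) a crowded region.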

I won't be delving into the details of how the competition was designed or how the baseline solutions were created, since I am actively participating in the competition at the moment. I plan to discuss the competition itself once it ends.