A Spoken Language Interface for SSA/SDA based on Modern Speech Processing Technology

Richard Stottler, Stottler Henke Associates, Inc.

Keywords: spoken language interface (SLI), speech to text, intents, Machine Learning (ML), User Experience/User Interface (UX/UI), SSA/SDA

Abstract:

Good spoken language interfaces are a natural, intuitive, and often efficient interface mechanism for a variety of different applications.  Unfortunately, bad spoken language systems are frustrating and inefficient.  Meanwhile, several large online companies have made huge investments in their spoken language interface technologies.  People have become used to interacting with their personal assistants by voice in everyday life and have come to expect very high levels of performance from spoken language interface systems.  To the degree that the large investments by online companies can be easily and inexpensively leveraged for Space Situational Awareness (SSA)/Space Domain Awareness (SDA) applications, they represent a significant opportunity for greatly improving SSA/SDA applications and their corresponding user acceptance.  We have been working with Amazon, Stanford, and others to create a spoken language interface for SSA/SDA based on modern spoken language interface technology.  We are working to understand the opportunities and limitations, and have put together a demonstration that runs on any Alexa device with a screen, as well as on Internet-connected PCs, and walks through different SSA/SDA scenarios involving queries, commands, and notifications.  The demonstration is intended to give conference attendees an understanding of what is possible so they can start determining how they would best use spoken language interfaces in their own SSA/SDA applications.  We can also share our lessons learned and the highly efficient nature of the development of the spoken language interface.

There are trade-offs between the different SLI development tools, and these will be discussed.  Some require Internet connections to operate, for example, and some do not.  The computational requirements for both development and real-time operation are typically modest: updating the application takes about 30 seconds, and the real-time, online component can be fielded with the computational power of a standard desktop without specialized processors.  Robustness decreases as the breadth and diversity of the discourse grow and as different intents become more similar to one another.  We believe SSA/SDA applications are well suited because their discourse breadth is fairly narrow and predictable, consisting of the different kinds of plausible events that occur in space; applicable tactics, techniques, and procedures; the different spacecraft of interest, their subsystems and components; and indications, warnings, and cautions.

Our presentation will consist primarily of demonstrating the spoken language interface and letting anyone who would like to try it for themselves.  So far, the resulting spoken language interface appears fairly robust.  A Machine Learning (ML) paradigm is used to match utterances to the user's intent.  The user may say either one of the exact utterances associated with an intent or something similar.  In the first case, intent recognition is near 100%; in the second, it is still very high, approximately 95% or better.  When an intent is not recognized, the utterance that was incorrectly understood can be quickly added to the correct intent's list so that it is classified correctly in the future.  Thus, the application can easily and naturally improve over time.
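The exact-versus-similar matching and the add-the-missed-utterance improvement loop described above can be sketched as follows.  This is a toy illustration only: the class, threshold, and sample utterances are our own invention for exposition, and simple string similarity stands in for the ML intent model used in the actual Alexa-based system.

```python
from difflib import SequenceMatcher

class IntentMatcher:
    """Toy intent matcher: maps an utterance to the intent whose sample
    utterances it most resembles, illustrating the exact/similar matching
    and incremental-improvement loop described in the abstract."""

    def __init__(self, threshold=0.75):
        self.threshold = threshold   # minimum similarity to accept a match
        self.intents = {}            # intent name -> list of sample utterances

    def add_utterance(self, intent, utterance):
        # Adding a previously misclassified utterance to the correct intent's
        # list is how the application improves over time.
        self.intents.setdefault(intent, []).append(utterance.lower())

    def classify(self, utterance):
        # Score the utterance against every sample of every intent and
        # return the best intent, or None if nothing is similar enough.
        utterance = utterance.lower()
        best_intent, best_score = None, 0.0
        for intent, samples in self.intents.items():
            for sample in samples:
                score = SequenceMatcher(None, utterance, sample).ratio()
                if score > best_score:
                    best_intent, best_score = intent, score
        return best_intent if best_score >= self.threshold else None

matcher = IntentMatcher()
matcher.add_utterance("query_status", "what is the status of satellite alpha")
matcher.add_utterance("task_sensor", "task the radar to observe object 12345")

# An exact sample utterance matches with score 1.0; a similar (not exact)
# phrasing still matches; an unrelated phrase is rejected.
print(matcher.classify("what's the status of satellite alpha"))
```

An unrecognized utterance (a `None` result) would be routed back through `add_utterance` with the correct intent, closing the improvement loop.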

We believe this is a real educational opportunity for conference attendees to learn what is possible and practical, at low cost, with spoken language interfaces for SSA/SDA applications, so that they can start thinking about SSA/SDA user interface/user experience design and what role spoken language should play in it.

Date of Conference: September 14-17, 2021

Track: Machine Learning for SSA Applications
