A Machine Learning Method for Object Localization

Mridul Gupta, Purdue University; Mary Comer, Purdue University; Edward Delp, Purdue University; Jonathan Chan, Lockheed Martin; Mitchell Krouss, Lockheed Martin; Paul Martens, Lockheed Martin Space; Michael Jacobs, Lockheed Martin; Corbin Spells, Lockheed Martin; Moses Chan, Lockheed Martin

Keywords: Machine Learning, Object Localization, Closely-Spaced Objects (CSOs), Remote Sensing

Abstract:

Detection and localization of small objects are important in many applications, such as surveillance, reconnaissance, microscopy, and astronomy. In satellite images, the spatial resolution of the imaging system may be insufficient to localize a small object or to resolve two closely-spaced objects (CSOs). Detection and localization become more difficult for objects with a low signal-to-noise ratio (SNR).

In this work, the imaging system is modeled as an array of square detectors, and the center of each detector corresponds to a single pixel in the observed image. We assume that an object is a point source and that the gray-level distribution of an object can be approximated by a point spread function (PSF). We model the noise as additive zero-mean Gaussian noise with identical variance for all pixels; it is added to the gray level of each pixel in the observed image. We assume that we know the approximate 2D spatial neighborhood containing the objects in an observed image. We also consider the case of two spatially registered spectral bands with different ensquared energy, which is the fraction of an object’s energy that falls on a detector when the object lies at its center. For our experiments, we assume the PSF of the imaging system is a 2D symmetric Gaussian function with its center at the object’s location and its width dependent on the ensquared energy of the spectral band.
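The observation model described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' simulator: each pixel value is assumed to be the Gaussian PSF integrated over a unit-width square detector, the PSF width is recovered from the ensquared energy by bisection, and all function names (`pixel_fraction`, `sigma_from_ensquared_energy`, `render`, `observe`) are hypothetical.

```python
import math
import random


def pixel_fraction(offset, sigma):
    """Fraction of a unit-area 1D Gaussian (std `sigma`, in pixels) that falls on a
    unit-width detector whose center lies `offset` pixels from the Gaussian center."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((offset + 0.5) / s) - math.erf((offset - 0.5) / s))


def ensquared_energy(sigma):
    """Energy fraction on a detector when the object sits at the detector center."""
    return pixel_fraction(0.0, sigma) ** 2


def sigma_from_ensquared_energy(ee, lo=1e-3, hi=50.0):
    """Invert ensquared_energy by bisection (it decreases monotonically with sigma)."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if ensquared_energy(mid) > ee:
            lo = mid  # PSF too narrow: too much energy on the center detector
        else:
            hi = mid
    return 0.5 * (lo + hi)


def render(objects, sigma, size=7):
    """Noise-free image. `objects` is a list of (x, y, amplitude); pixel (row r,
    column c) is assumed to be centered at x = c, y = r (a labeling convention)."""
    img = [[0.0] * size for _ in range(size)]
    for x, y, amp in objects:
        for r in range(size):
            fy = pixel_fraction(r - y, sigma)
            for c in range(size):
                img[r][c] += amp * fy * pixel_fraction(c - x, sigma)
    return img


def observe(objects, sigma, noise_std, size=7, rng=None):
    """Add i.i.d. zero-mean Gaussian noise with the same variance at every pixel."""
    if rng is None:
        rng = random.Random(0)
    clean = render(objects, sigma, size)
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in clean]
```

With this convention, an object of amplitude A placed exactly at a detector center contributes A times the ensquared energy to that pixel, which matches the definition of ensquared energy given above. Two CSOs are simulated by passing two tuples in `objects`.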

Sub-pixel location estimator for small objects (SPLEO) is a lightweight convolutional neural network-based method that estimates the location and amplitude of a single object in an observed image of size 7 x 7 in a single spectral band. We propose an improvement to SPLEO, called SPLEOv2, for localizing one object or two CSOs using observed 7 x 7 images in one or two spectral bands. SPLEOv2 can also be easily modified to localize more than two CSOs. The input to the network is an observed image, which is processed in parallel by an average pooling layer with filter size 2 x 2 and by two 2D convolutional layers, each having 32 filters, with filter sizes 3 x 3 and 5 x 5, respectively. The outputs from the two convolutional layers are then processed with average pooling layers with filter size 2 x 2. The outputs from the three average pooling layers are row-concatenated to obtain a 1D vector, which is processed with three fully connected layers with 512, 20, and “p” neurons, where “p” is the number of object parameters to be determined. For example, a single object in a single spectral band has 3 parameters (2D spatial location and amplitude), and a single object in two spectral bands has 4 parameters (2D spatial location and an amplitude in each band). The rectified linear unit (ReLU) is used as the activation function throughout the network except for the last fully connected layer, which uses the hyperbolic tangent function (tanh) to scale the output to the range (-1, 1). The network is trained by minimizing the negative log-likelihood of the observed image given the locations and amplitudes of the estimated objects. Like SPLEO, SPLEOv2 is a lightweight model with low computational requirements.
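To make the layer description above concrete, the following sketch does the shape bookkeeping for the three parallel branches and counts the fully connected parameters. Several details are not stated in the abstract and are assumptions here: stride-1 convolutions, non-overlapping 2 x 2 pooling without padding, the choice between "valid" and "same" convolution padding, and stacking two spectral bands as input channels. The function names are hypothetical.

```python
def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution (stride 1 is an assumption)."""
    return size if padding == "same" else size - kernel + 1


def pool_out(size, window=2):
    """Spatial size after non-overlapping average pooling without padding (assumption)."""
    return size // window


def flattened_length(bands=1, size=7, filters=32, padding="valid"):
    """Length of the 1D vector obtained by concatenating the three pooled branches."""
    pooled_input = bands * pool_out(size) ** 2                       # pooled raw image
    branch_3x3 = filters * pool_out(conv_out(size, 3, padding)) ** 2  # 3 x 3 conv branch
    branch_5x5 = filters * pool_out(conv_out(size, 5, padding)) ** 2  # 5 x 5 conv branch
    return pooled_input + branch_3x3 + branch_5x5


def dense_parameter_count(n_in, p, hidden=(512, 20)):
    """Weights plus biases of the fully connected layers with 512, 20, and p neurons."""
    sizes = [n_in, *hidden, p]
    return sum((a + 1) * b for a, b in zip(sizes, sizes[1:]))
```

Under the "valid" padding assumption a single-band 7 x 7 input flattens to 9 + 128 + 32 = 169 features, and under "same" padding to 9 + 288 + 288 = 585; either way the fully connected stack stays small, which is consistent with the description of the model as lightweight.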

In this paper, we use simulated data to evaluate our approach. The objects are assumed to be contained in a 7 x 7-pixel window (the observed image) around the approximate object location. We also assume that we know the number of objects present in the image. The observed image is input to the neural network, which then estimates the sub-pixel locations and amplitudes of the objects. The network is trained on high-SNR objects but generalizes well to low-SNR images for both a single object and two CSOs. We also propose using the parameter estimates from SPLEOv2 as initial estimates for three conventional object localization methods. These methods minimize the negative log-likelihood iteratively, but their performance depends on the quality of the initial estimates. With the outputs from SPLEOv2 as the initial estimates, we achieve better mean localization error, standard deviation of the localization error, and 95th-percentile localization error. We also compare the variance of the parameter estimates with the Cramér-Rao lower bound (CRLB) on the variance of an unbiased estimator.
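The role of the initial estimate in iterative negative log-likelihood minimization can be illustrated with a small sketch. Under i.i.d. zero-mean Gaussian noise, the negative log-likelihood reduces, up to an additive constant, to the sum of squared residuals divided by twice the noise variance. The optimizer below is a simple coordinate pattern search chosen only for illustration; it is not one of the three conventional methods referred to above, and the pixel-integrated PSF model and all names are assumptions. The `initial` argument stands in for an estimate that a network such as SPLEOv2 would supply.

```python
import math


def pixel_fraction(offset, sigma):
    """1D Gaussian mass on a unit-width detector centered `offset` pixels away."""
    s = sigma * math.sqrt(2.0)
    return 0.5 * (math.erf((offset + 0.5) / s) - math.erf((offset - 0.5) / s))


def model_image(params, sigma, size=7):
    """Noise-free image of one object with params = (x, y, amplitude)."""
    x, y, amp = params
    return [[amp * pixel_fraction(r - y, sigma) * pixel_fraction(c - x, sigma)
             for c in range(size)] for r in range(size)]


def neg_log_likelihood(params, observed, sigma, noise_std):
    """Gaussian negative log-likelihood, dropping the parameter-independent constant."""
    model = model_image(params, sigma, len(observed))
    sq = sum((o - m) ** 2
             for orow, mrow in zip(observed, model)
             for o, m in zip(orow, mrow))
    return sq / (2.0 * noise_std ** 2)


def refine(initial, observed, sigma, noise_std, step=0.5, tol=1e-6):
    """Coordinate pattern search: try +/- step on each parameter and halve the
    step whenever no single-parameter move lowers the objective."""
    best = list(initial)
    best_val = neg_log_likelihood(best, observed, sigma, noise_std)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] += delta
                val = neg_log_likelihood(trial, observed, sigma, noise_std)
                if val < best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step *= 0.5
    return best, best_val
```

A call such as `refine([3.0, 3.0, 8.0], observed, sigma, noise_std)` starts from a rough guess and returns the refined `[x, y, amplitude]` together with the final objective value. For two CSOs the objective generally has more local minima, which is where a good starting point matters most.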

Date of Conference: September 19-22, 2023

Track: Machine Learning for SDA Applications
