CLASS_STAR classifier

Note

The CLASS_STAR classifier has been superseded by the SPREAD_MODEL estimator (see Model-based star-galaxy separation: SPREAD_MODEL), which offers better performance by making explicit use of the full, variable PSF model.

Good discrimination between stars and galaxies is essential for both galactic and extragalactic statistical studies. The common assumption is that galaxy images look more extended or fuzzier than those of stars (or QSOs). SExtractor provides the CLASS_STAR catalog parameter for separating both types of sources. The CLASS_STAR classifier relies on a multilayer feed-forward neural network, trained using supervised learning, to estimate the a posteriori probability [6][7] that a SExtractor detection is a point source or an extended object. Below is a shortened description of the estimator; see [8] for more details.

Inputs and outputs

The neural network is a multilayer Perceptron with a single, fully connected hidden layer. Of all neural networks it is probably the best studied, and it has been applied intensively and successfully to many classification tasks.

The classifier (Fig. 5) has 10 inputs:

8 isophotal areas \(A_0..A_7\), measured at isophotes exponentially spaced between the analysis threshold (which may be modified with the ANALYSIS_THRESH configuration parameter) and the object’s peak pixel value
The object’s peak pixel value above the local background \(I_{\mathrm{max}}\)
A seeing input, which acts as a tuning knob.
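As a rough illustration of the first eight inputs, the sketch below counts the pixels of an object cutout that lie above eight levels spaced geometrically between a threshold and the peak. The exact spacing formula used by SExtractor is not given here, so the expression `thresh * (peak / thresh) ** (i / n_levels)` is an assumption, and the function name `isophotal_areas` and the toy Gaussian cutout are hypothetical.

```python
import numpy as np

def isophotal_areas(stamp, thresh, n_levels=8):
    """Count pixels above n_levels geometrically spaced isophotes.

    stamp  : 2-D background-subtracted cutout containing one object
    thresh : analysis threshold, in the same units as stamp
    """
    peak = float(stamp.max())
    if peak <= thresh:
        raise ValueError("peak must exceed the analysis threshold")
    # ASSUMPTION: exponential (geometric) spacing from thresh up to,
    # but not including, the peak value.
    i = np.arange(n_levels)
    levels = thresh * (peak / thresh) ** (i / n_levels)
    # Area of each isophote = number of pixels brighter than the level.
    areas = np.array([int((stamp > lv).sum()) for lv in levels])
    return levels, areas, peak

# Toy object: a circular Gaussian blob with a peak value of 100.
y, x = np.mgrid[-15:16, -15:16]
stamp = 100.0 * np.exp(-(x**2 + y**2) / (2 * 3.0**2))
levels, areas, peak = isophotal_areas(stamp, thresh=1.0)
```

The areas shrink monotonically from the lowest to the highest isophote; how quickly they shrink relative to the peak value is what carries the shape information.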

The output layer contains only one neuron, as “star” and “galaxy” are two mutually exclusive classes. The output value is a “stellarity index”, which, for images that reasonably match those of the training sample, is an estimate of the a posteriori probability that the classified object is a point source. Hence a CLASS_STAR close to 0 means that the object is very likely a galaxy, and a value close to 1 that it is very likely a star. In practice, real data always differ at least slightly from the training sample, and the CLASS_STAR output is often a poor approximation of the expected a posteriori probabilities. Nevertheless, a CLASS_STAR value closer to 0 or 1 normally indicates a higher confidence in the classification, and the balance between sample completeness and purity may still be adjusted by tweaking the decision threshold.
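The trade-off between completeness and purity can be explored on any sample with known classes (for example a simulation). The following is a minimal sketch under that assumption; the function name `star_selection_stats` and the toy values are hypothetical and not part of SExtractor.

```python
import numpy as np

def star_selection_stats(class_star, is_star, threshold):
    """Completeness and purity of a 'CLASS_STAR >= threshold' star sample."""
    class_star = np.asarray(class_star, dtype=float)
    is_star = np.asarray(is_star, dtype=bool)
    selected = class_star >= threshold
    n_true = is_star.sum()                # true stars in the sample
    n_sel = selected.sum()                # objects selected as stars
    n_hit = (selected & is_star).sum()    # true stars that were selected
    completeness = n_hit / n_true if n_true else float("nan")
    purity = n_hit / n_sel if n_sel else float("nan")
    return float(completeness), float(purity)

# Toy catalog with known labels (hypothetical values).
class_star = [0.98, 0.91, 0.75, 0.55, 0.40, 0.10, 0.03]
is_star = [True, True, False, True, False, False, False]

for t in (0.5, 0.9):
    c, p = star_selection_stats(class_star, is_star, t)
    print(f"threshold={t:.1f}  completeness={c:.2f}  purity={p:.2f}")
```

In this toy sample, raising the threshold from 0.5 to 0.9 removes the one misclassified galaxy (purity goes from 0.75 to 1.00) at the cost of losing one star (completeness goes from 1.00 to 0.67).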

Fig. 5. Architecture of SExtractor’s CLASS_STAR classifier

The seeing input is set by the user with the SEEING_FWHM configuration parameter. If SEEING_FWHM is set to 0, the seeing is instead measured automatically from the PSF model, which must then be provided (using the PSF_NAME configuration parameter).

If no PSF model is available, the SEEING_FWHM configuration parameter must be adjusted by the user to match the actual average PSF FWHM on the image. The accuracy with which SEEING_FWHM must be set for optimal results ranges from \(\pm 20\%\) for bright sources to about \(\pm 5\%\) for the faintest (Fig. 6). SEEING_FWHM is expressed in arcseconds. The PIXEL_SCALE configuration parameter must therefore also be set by the user if WCS information is missing from the FITS image header. There are several ways to measure, directly or indirectly, the size of point sources in SExtractor; they may lead to slightly discordant results, depending on the exact shape of the PSF. The measurement FWHM_IMAGE (although not the most reliable as an image quality estimator) sets the reference when it comes to setting SEEING_FWHM.
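One possible way to derive a SEEING_FWHM value from a first-pass catalog is to take the median FWHM_IMAGE (in pixels) of sources already known to be bright, unsaturated stars and convert it to arcseconds with the pixel scale. This is only a sketch: how the stars are selected is left to the user, and the function name `seeing_fwhm_arcsec` and the sample values are hypothetical.

```python
import numpy as np

def seeing_fwhm_arcsec(fwhm_image_pix, pixel_scale):
    """Median FWHM_IMAGE of point sources, converted to arcseconds.

    fwhm_image_pix : FWHM_IMAGE values (pixels) of bright, unsaturated stars
    pixel_scale    : pixel scale in arcsec/pixel
    """
    fwhm = np.asarray(fwhm_image_pix, dtype=float)
    # Discard failed measurements (non-finite or non-positive values).
    fwhm = fwhm[np.isfinite(fwhm) & (fwhm > 0)]
    if fwhm.size == 0:
        raise ValueError("no valid FWHM_IMAGE measurements")
    return float(np.median(fwhm) * pixel_scale)

# Hypothetical measurements on five stars, 0.2 arcsec/pixel.
fwhm_pix = [4.1, 3.9, 4.0, 4.3, 3.8]
seeing = seeing_fwhm_arcsec(fwhm_pix, pixel_scale=0.2)
print(f"SEEING_FWHM  {seeing:.2f}")
```

The median is used here because it is less sensitive than the mean to a few blended or misidentified sources in the star list.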

One may check that the SEEING_FWHM is set correctly by making sure that the typical CLASS_STAR value of unclassifiable sources at the faint end of the catalog hovers around the 0.5 mark.
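This check can be scripted once the catalog has been loaded into arrays. The sketch below takes the faintest fraction of the catalog and returns its median CLASS_STAR; the choice of magnitude column (for instance MAG_AUTO), the 10% fraction, and the function name `faint_end_class_star` are assumptions made for illustration.

```python
import numpy as np

def faint_end_class_star(mag, class_star, faint_fraction=0.1):
    """Median CLASS_STAR of the faintest `faint_fraction` of the catalog."""
    mag = np.asarray(mag, dtype=float)
    class_star = np.asarray(class_star, dtype=float)
    # Larger magnitude means fainter: keep sources above the upper quantile.
    cut = np.quantile(mag, 1.0 - faint_fraction)
    faint = mag >= cut
    return float(np.median(class_star[faint]))

# Toy catalog: confident classifications except for the 10 faintest sources.
mag = np.arange(100.0)
class_star = np.where(mag < 90, 0.95, 0.5)
median_faint = faint_end_class_star(mag, class_star)
print(f"median CLASS_STAR at the faint end: {median_faint:.2f}")
```

A median that sits well away from 0.5 suggests that SEEING_FWHM should be revisited.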

Architecture of SExtractor’s CLASS_STAR classifier

Training

This section gives some insight into how the CLASS_STAR classifier was trained. The main issue with supervised machine learning is the labeling of the large training sample. Fortunately, a large fraction of contemporary astronomical images share a common set of generic features, which can be simulated with sufficient realism to create a large training sample together with the ground truth (labels). The CLASS_STAR classifier was trained on such a sample of artificial images.

Six hundred \(512\times512\) simulated images containing stars and galaxies were generated to train the network, using an early prototype of the SkyMaker package [9]. They were simulated in the blue band, where galaxies show the widest variety of appearances. The two parameters governing the shape of the PSF (seeing FWHM and Moffat \(\beta\) parameter [10]) were chosen randomly with \(0.025\le\) FWHM \(\le 5.5\) and \(2\le\beta\le4\). Note that the Moffat function used in the simulations is a poor approximation to diffraction-limited images; hence the CLASS_STAR classifier is not optimized for space-based images. The pixel scale was always chosen to be less than \(\approx 0.7\) FWHM to ensure correct sampling of the image. Bright galaxies are simply too rare in the sky to constitute a significant training sample in such a small number of simulations. So, while keeping a constant comoving number density, we artificially increased the number of nearby galaxies by making the volume element proportional to \(z\,dz\). Stars were given a number-magnitude distribution identical to that of galaxies. Therefore any pattern presented to the network had an equal (50%) chance of corresponding to a star or to a galaxy, irrespective of magnitude [1]. Crowding in the simulated images was higher than what is seen in real images of high galactic latitude fields, allowing for the presence of many “difficult cases” (close double stars, truncated profiles, etc.) that the neural network classifier had to deal with.
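For reference, the standard circular Moffat profile is \(I(r) = I_0\,[1 + (r/\alpha)^2]^{-\beta}\), with \(\mathrm{FWHM} = 2\alpha\sqrt{2^{1/\beta}-1}\). The sketch below evaluates it for one randomly drawn (FWHM, \(\beta\)) pair within the quoted ranges. The text only says the parameters were chosen randomly, so the uniform draw is an assumption, and this is not SkyMaker's actual code.

```python
import numpy as np

def moffat_profile(r, fwhm, beta, i0=1.0):
    """Circular Moffat profile I(r) = i0 * (1 + (r/alpha)^2)^(-beta)."""
    # Scale radius alpha derived from the requested FWHM and beta.
    alpha = fwhm / (2.0 * np.sqrt(2.0 ** (1.0 / beta) - 1.0))
    return i0 * (1.0 + (np.asarray(r, dtype=float) / alpha) ** 2) ** (-beta)

# ASSUMPTION: uniform sampling within the ranges quoted in the text.
rng = np.random.default_rng(42)
fwhm = rng.uniform(0.025, 5.5)
beta = rng.uniform(2.0, 4.0)

# By construction the profile falls to half its peak at r = FWHM / 2.
half = float(moffat_profile(fwhm / 2.0, fwhm, beta))
print(f"FWHM={fwhm:.3f}  beta={beta:.3f}  I(FWHM/2)/I0={half:.3f}")
```

Smaller values of \(\beta\) give more extended wings for the same FWHM, which is why both parameters were varied in the training set.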

SExtractor was run on each image with 8 different extraction thresholds. This led to a catalog with about \(10^6\) entries with the 10 input parameters plus the class label. Back-propagation learning took about 15 min on a SUN SPARC20 workstation. The final set of synaptic weights was saved to the file default.nnw, ready to be used in “feed-forward” mode during source extraction.
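To make “feed-forward” mode concrete, here is a minimal sketch of a forward pass through a perceptron with 10 inputs, one hidden layer and a single output neuron. The hidden-layer size, the sigmoid activations and the random weights are assumptions chosen for illustration; the sketch neither reads default.nnw nor reproduces SExtractor's input normalization.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stellarity_index(inputs, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 10 inputs -> hidden layer -> one output in (0, 1)."""
    hidden = sigmoid(w_hidden @ inputs + b_hidden)
    return float(sigmoid(w_out @ hidden + b_out))

n_in = 10        # 8 isophotal areas + peak value + seeing
n_hidden = 10    # ASSUMPTION: hidden-layer size is not stated in the text

# Random placeholder weights; a trained network would load these from file.
rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(n_hidden, n_in))
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=n_hidden)
b_out = rng.normal()

x = rng.normal(size=n_in)  # stand-in for one (normalized) input pattern
s = stellarity_index(x, w_hidden, b_hidden, w_out, b_out)
print(f"stellarity index: {s:.3f}")
```

Since the weights are fixed after training, classifying a detection costs only two small matrix-vector products.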

[1]	Faint galaxies are less likely to be detected than faint stars, but this has little effect because the ones that are lost at a given magnitude are predominantly the most extended and consequently the easiest to classify.