Automated Unsupervised Classification of the Sloan Digital Sky Survey Stellar Spectra using k-means Clustering

Allende-Prieto, C.; Sánchez-Almeida, J.
Bibliographical reference

The Astrophysical Journal, Volume 763, Issue 1, article id. 50, 18 pp. (2013).

Large spectroscopic surveys require automated methods of analysis. This paper explores the use of k-means clustering as a tool for automated unsupervised classification of massive stellar spectral catalogs. The classification criteria are defined by the data and the algorithm, with no prior physical framework. We work with a representative set of stellar spectra associated with the Sloan Digital Sky Survey (SDSS) SEGUE and SEGUE-2 programs, which consists of 173,390 spectra from 3800 to 9200 Å sampled on 3849 wavelengths. We classify the original spectra as well as the spectra with the continuum removed. The second set only contains spectral lines, and it is less dependent on uncertainties of the flux calibration. The classification of the spectra with continuum renders 16 major classes. Roughly speaking, stars are split according to their colors, with enough finesse to distinguish dwarfs from giants of the same effective temperature, but with difficulty separating stars with different metallicities. There are classes corresponding to particular MK types, intrinsically blue stars, dust-reddened stars, stellar systems, and also classes collecting faulty spectra. Overall, there is no one-to-one correspondence between the classes we derive and the MK types. The classification of spectra without continuum renders 13 classes. The color separation is not as sharp, but this classification distinguishes stars of the same effective temperature and different metallicities. Some classes thus obtained present a fairly small range of physical parameters (200 K in effective temperature, 0.25 dex in surface gravity, and 0.35 dex in metallicity), so that the classification can be used to estimate the main physical parameters of some stars at a minimum computational cost. We also analyze the outliers of the classification. Most of them turn out to be failures of the reduction pipeline, but there are also high-redshift QSOs, multiple stellar systems, dust-reddened stars, galaxies, and, finally, odd spectra whose nature we have not deciphered. The template spectra representative of the classes are publicly available in the online journal and at ftp://stars:kmeans [at]
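To make the idea of k-means classification of spectra concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`kmeans`, `toy_spectrum`), the random initialization, the toy 20-pixel spectra, and the choice of two classes are all assumptions made for illustration. The real data set has 173,390 spectra sampled on 3849 wavelengths. Each spectrum is treated as a vector of fluxes, each class is represented by the mean spectrum of its members (the class template), and spectra are assigned to the nearest template in the least-squares sense.

```python
import random


def kmeans(spectra, k, n_iter=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on equal-length flux vectors.

    Returns (labels, templates): one class index per spectrum and the
    mean spectrum of each class. Initial templates are k spectra drawn
    at random (an assumption; other initializations are possible).
    """
    rng = random.Random(seed)
    templates = [list(s) for s in rng.sample(spectra, k)]
    labels = [0] * len(spectra)
    for _ in range(n_iter):
        # Assignment step: each spectrum goes to the closest template,
        # using the squared Euclidean distance between flux vectors.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(s, templates[j])),
            )
            for s in spectra
        ]
        # Update step: each template becomes the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(spectra, new_labels) if lab == j]
            if members:
                templates[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
    return labels, templates


def toy_spectrum(line_pixel, rng, n_pix=20):
    """Hypothetical spectrum: flat noisy continuum with one absorption line."""
    flux = [1.0 + rng.gauss(0.0, 0.01) for _ in range(n_pix)]
    flux[line_pixel] -= 0.5
    return flux


rng = random.Random(1)
# Two artificial "stellar types" that differ only in the position of the line.
spectra = [toy_spectrum(5, rng) for _ in range(10)]
spectra += [toy_spectrum(14, rng) for _ in range(10)]

labels, templates = kmeans(spectra, k=2)
print(labels)
```

In this toy case the two groups of spectra end up in two different classes, and each class template shows the absorption line of its members. The sketch leaves out continuum removal, the choice of the number of classes, and the handling of outliers, all of which the paper addresses.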
Related projects
Starbursts in Galaxies GEFE
Starbursts play a key role in the cosmic evolution of galaxies, and thus in the star formation (SF) history of the universe, the production of metals, and the feedback coupling galaxies with the cosmic web. Extreme SF conditions prevail early on during the formation of the first stars and galaxies; therefore, the starburst phenomenon constitutes a
Muñoz Tuñón
Chemical Abundances in Stars
Stellar spectroscopy allows us to determine the properties and chemical compositions of stars. From this information for stars of different ages in the Milky Way, it is possible to reconstruct the chemical evolution of the Galaxy, as well as the origin of the elements heavier than boron, created mainly in stellar interiors. It is also possible to
Allende Prieto