Estimation of stellar atmospheric parameters from SDSS/SEGUE spectra

Re Fiorentin, P.; Bailer-Jones, C. A. L.; Lee, Y. S.; Beers, T. C.; Sivarani, T.; Wilhelm, R.; Allende Prieto, C.; Norris, J. E.
Bibliographical reference

Astronomy and Astrophysics, Volume 467, Issue 3, June I 2007, pp.1373-1387

Advertised on:
June 2007
Number of authors
8
IAC number of authors
0
Citations
86
Refereed citations
75
Description
We present techniques for the estimation of stellar atmospheric parameters (T_eff, log g, [Fe/H]) for stars from the SDSS/SEGUE survey. The atmospheric parameters are derived from the observed medium-resolution (R = 2000) stellar spectra using non-linear regression models trained either on (1) pre-classified observed data or (2) synthetic stellar spectra. In the first case we use our models to automate and generalize parametrization produced by a preliminary version of the SDSS/SEGUE Spectroscopic Parameter Pipeline (SSPP). In the second case we directly model the mapping between synthetic spectra (derived from Kurucz model atmospheres) and the atmospheric parameters, independently of any intermediate estimates. After training, we apply our models to various samples of SDSS spectra to derive atmospheric parameters, and compare our results with those obtained previously by the SSPP for the same samples. We obtain consistency between the two approaches, with RMS deviations on the order of 150 K in T_eff, 0.35 dex in log g, and 0.22 dex in [Fe/H]. The models are applied to pre-processed spectra, either via Principal Component Analysis (PCA) or a Wavelength Range Selection (WRS) method, which employs a subset of the full 3850-9000 Å spectral range. This is both for computational reasons (robustness and speed), and because it delivers higher accuracy (better generalization of what the models have learned). Broadly speaking, the PCA is demonstrated to deliver more accurate atmospheric parameters when the training data are the actual SDSS spectra with previously estimated parameters, whereas WRS appears superior for the estimation of log g via synthetic templates, especially for lower signal-to-noise spectra.
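As a rough illustration of the wavelength range selection idea described above, the sketch below keeps only the pixels of a spectrum that fall inside a few wavelength windows. The window limits, the function name `select_wavelength_ranges`, and the toy spectrum are all assumptions made for this example; the abstract does not list the ranges the authors actually selected.

```python
from bisect import bisect_left, bisect_right

# Hypothetical windows in Angstroms, chosen for illustration only
# (roughly around Ca II K, H-beta, and the Mg I b triplet).
# They are NOT the ranges used in the paper.
WINDOWS = [(3900.0, 3970.0), (4830.0, 4890.0), (5150.0, 5200.0)]


def select_wavelength_ranges(wavelengths, flux, windows=WINDOWS):
    """Return (wavelengths, flux) restricted to pixels inside any window.

    Assumes `wavelengths` is sorted in ascending order and that the
    windows do not overlap; window limits are inclusive.
    """
    if len(wavelengths) != len(flux):
        raise ValueError("wavelengths and flux must have the same length")
    sel_w, sel_f = [], []
    for lo, hi in windows:
        i = bisect_left(wavelengths, lo)   # first pixel with wavelength >= lo
        j = bisect_right(wavelengths, hi)  # one past last pixel with wavelength <= hi
        sel_w.extend(wavelengths[i:j])
        sel_f.extend(flux[i:j])
    return sel_w, sel_f


# Toy flat spectrum sampled every 10 Angstroms over 3850-9000 Angstroms.
wl = [3850.0 + 10.0 * k for k in range(516)]
fx = [1.0] * len(wl)
sub_w, sub_f = select_wavelength_ranges(wl, fx)
```

The reduced vector `sub_f` would then stand in for the full spectrum as the regression input; a PCA projection would fill the same role in the alternative pre-processing route.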
From a subsample of some 19 000 stars with previous determinations of the atmospheric parameters, the accuracies of our predictions (mean absolute errors) for each parameter are T_eff to 170/170 K, log g to 0.36/0.45 dex, and [Fe/H] to 0.19/0.26 dex, for methods (1) and (2), respectively. We measure the intrinsic errors of our models by training on synthetic spectra and evaluating their performance on an independent set of synthetic spectra. This yields RMS accuracies of 50 K, 0.02 dex, and 0.03 dex on T_eff, log g, and [Fe/H], respectively. Our approach can be readily deployed in an automated analysis pipeline, and can easily be retrained as improved stellar models and synthetic spectra become available. We nonetheless emphasise that this approach relies on an accurate calibration and pre-processing of the data (to minimize mismatch between the real and synthetic data), as well as sensible choices concerning feature selection. From an analysis of cluster candidates with available SDSS spectroscopy (M 15, M 13, M 2, and NGC 2420), and assuming the age, metallicity, and distances given in the literature are correct, we find evidence for small systematic offsets in T_eff and/or log g for the parameter estimates from the model trained on real data with the SSPP. Thus, this model turns out to derive more precise, but less accurate, atmospheric parameters than the model trained on synthetic data.
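The abstract quotes two different summary statistics: mean absolute errors for the comparison with previous determinations, and RMS values for the model-to-model deviations and intrinsic errors. A minimal sketch of both is given below, assuming paired lists of predicted and reference values; the function names and the four toy T_eff values are invented for illustration and are not data from the paper.

```python
import math


def mean_absolute_error(predicted, reference):
    """Mean of |predicted - reference| over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def rms_deviation(predicted, reference):
    """Square root of the mean squared difference over paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)
    return math.sqrt(mean_sq)


# Made-up effective temperatures in kelvin (illustrative only).
teff_model = [5800.0, 6100.0, 4950.0, 5500.0]
teff_reference = [5700.0, 6250.0, 5000.0, 5500.0]

mae = mean_absolute_error(teff_model, teff_reference)
rms = rms_deviation(teff_model, teff_reference)
```

Because squaring weights large residuals more heavily, the RMS is never smaller than the mean absolute error for the same residuals, which is worth keeping in mind when comparing the two sets of figures quoted above.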