The SDSS spectra can be thought of as "labels" for objects detected in the imaging, each of which has ugriz photometry and some shape and position parameters. Can we train a model with this enormous amount of data to predict the spectra using the photometry? One thing that says "yes" is that photometric redshifts (for galaxies and quasars), photometric distances (for stars), and photometric temperatures and metallicities (for stars) all work well. One thing that says "no" is that there is far more information (in a technical sense) in the spectra than in the photometry. All this said, it is an absolutely great "Data Science" demonstration project, and it might create some new ideas for LSST-era astrophysics projects. In principle, it will also get us predictions about the spectral types and redshifts of many objects that lack spectra!
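To make the idea concrete, here is a minimal, hedged sketch of the simplest version of such a model: a k-nearest-neighbours regressor that transfers spectra between photometrically similar objects. The arrays, sizes, and the choice of regressor are illustrative placeholders, not a description of any existing pipeline.

```python
# Minimal sketch: regress SDSS-like spectra onto ugriz photometry.
# The arrays here are random placeholders standing in for real SDSS data;
# in practice X would be (n_objects, 5) ugriz magnitudes and Y the
# matched spectra resampled onto a common wavelength grid.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
n_objects, n_pixels = 2000, 500             # hypothetical sizes
X = rng.normal(size=(n_objects, 5))         # ugriz photometry (placeholder)
Y = rng.normal(size=(n_objects, n_pixels))  # spectra on a fixed grid (placeholder)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# A k-nearest-neighbours regressor is about the simplest "label transfer":
# each test object's predicted spectrum is an average of the spectra of
# the photometrically most similar training objects.
model = KNeighborsRegressor(n_neighbors=16).fit(X_train, Y_train)
Y_pred = model.predict(X_test)

# Per-object residual RMS as a crude measure of how much of the spectrum
# the photometry actually recovers.
rms = np.sqrt(np.mean((Y_pred - Y_test) ** 2, axis=1))
print("median residual RMS:", np.median(rms))
```

Anything more ambitious (a neural network, a generative model) slots into the same train / predict / residual pattern.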
Asensio Ramos points me to this paper, which is VERY RELEVANT: http://adsabs.harvard.edu/abs/2010ApJ...719.1759A
Professor Hogg,
I'm only 6 years late to this party, but I wonder if you're still interested in this project or have made any progress on it. This kind of study is exactly the sort of thing I've been experimenting with for white dwarfs, though that is probably a much simpler problem. The white dwarf optical spectrum is determined almost entirely by the temperature and surface gravity, which are strongly constrained by the photometry.
In our implementation we used simple artificial neural networks to reconstruct the spectrum from labels or photometry. With a dataset of 5000 SDSS white dwarfs, the reconstructions were quite satisfactory using two-band photometry. My main goal for this project was to generate predicted spectra from photometry and then compare each predicted spectrum to the actual spectroscopic measurement. By characterizing the goodness (or badness) of the fit, we can pick interesting outlier stars out of the dataset. I have since put this project on hold, but I hope to restart it sometime in the future.
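For concreteness, a rough sketch of that kind of pipeline might look like the following. This is illustrative rather than our actual code: the arrays are random placeholders for the ~5000 SDSS white dwarfs, and the network size and band choices are arbitrary.

```python
# Sketch: a small multilayer perceptron maps two-band photometry to a
# white-dwarf spectrum; objects whose observed spectra fit the prediction
# worst are flagged as candidate outliers. All arrays are placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_wd, n_pixels = 5000, 400
phot = rng.normal(size=(n_wd, 2))            # e.g. two magnitudes (placeholder)
spectra = rng.normal(size=(n_wd, n_pixels))  # observed spectra (placeholder)
ivar = np.ones((n_wd, n_pixels))             # inverse variances (placeholder)

p_train, p_test, s_train, s_test, iv_train, iv_test = train_test_split(
    phot, spectra, ivar, random_state=0)

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
net.fit(p_train, s_train)
s_pred = net.predict(p_test)

# Reduced chi^2 of the predicted spectrum against the observed one;
# the largest values point at the "interesting outlier" stars.
chi2 = np.sum(iv_test * (s_test - s_pred) ** 2, axis=1) / n_pixels
worst = np.argsort(chi2)[::-1][:20]
print("candidate outliers (test-set indices):", worst)
```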
Extending this methodology beyond white dwarfs would be very interesting. I think one big improvement could be deconvolutional neural networks, where the deconvolution layers could intelligently reconstruct spectral features from the limited input parameter space. There are only so many constraints we can get from two-band photometry, but I wonder how much predictive power we'd gain by adding photometric bands - I'd imagine certain bands would work well for certain abundances and metallicities.
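As a sketch of what I mean, assuming PyTorch and an arbitrary choice of band count and layer sizes, a decoder built from transposed 1-D convolutions could look something like this:

```python
# Hedged sketch of the "deconvolutional" idea: a decoder that upsamples a
# handful of photometric bands into a spectrum with transposed 1-D
# convolutions. Band count and layer sizes are illustrative only.
import torch
import torch.nn as nn

class PhotToSpecDecoder(nn.Module):
    def __init__(self, n_bands=5, n_channels=32, base_len=25):
        super().__init__()
        self.n_channels = n_channels
        self.base_len = base_len
        # Expand the photometry into a coarse "latent spectrum".
        self.fc = nn.Linear(n_bands, n_channels * base_len)
        # Each transposed convolution doubles the wavelength sampling.
        self.deconv = nn.Sequential(
            nn.ConvTranspose1d(n_channels, 16, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, 8, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(8, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, phot):
        x = self.fc(phot).view(-1, self.n_channels, self.base_len)
        return self.deconv(x).squeeze(1)   # (batch, 8 * base_len) spectrum

phot = torch.randn(10, 5)                  # placeholder batch of ugriz-like inputs
spectrum = PhotToSpecDecoder()(phot)
print(spectrum.shape)                      # torch.Size([10, 200])
```

Adding more photometric bands only widens the input layer; the upsampling structure stays the same.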
These are just some thoughts of my own; I'm curious whether you've developed, or are interested in developing, this idea further.