Article

A machine learning approach to galaxy properties: joint redshift-stellar mass probability distributions with Random Forest

Journal

MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY
Volume 502, Issue 2, Pages 2770-2786

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/mnras/stab164

Keywords

methods: data analysis; methods: statistical; galaxies: evolution; galaxies: fundamental parameters; software: data analysis; software: public release

Funding

  1. STFC UCL Centre for Doctoral Training in Data Intensive Science [ST/P006736/1]
  2. European Research Council [TESTDE FP7/291329]
  3. STFC Consolidated grants [ST/M001334/1, ST/R000476/1]
  4. ERC Advanced grant [695671]
  5. UK STFC
  6. U.S. Department of Energy
  7. U.S. National Science Foundation
  8. Ministry of Science and Education of Spain
  9. Science and Technology Facilities Council of the United Kingdom
  10. Higher Education Funding Council for England
  11. National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign
  12. Kavli Institute of Cosmological Physics at the University of Chicago
  13. Center for Cosmology and Astro-Particle Physics at the Ohio State University
  14. Mitchell Institute for Fundamental Physics and Astronomy at Texas A&M University
  15. Financiadora de Estudos e Projetos
  16. Fundacao Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro
  17. Conselho Nacional de Desenvolvimento Cientifico e Tecnologico
  18. Ministerio da Ciencia, Tecnologia e Inovacao
  19. Deutsche Forschungsgemeinschaft
  20. Argonne National Laboratory
  21. University of California at Santa Cruz
  22. University of Cambridge
  23. Centro de Investigaciones Energeticas, Medioambientales y Tecnologicas-Madrid
  24. University of Chicago
  25. University College London
  26. DES-Brazil Consortium
  27. University of Edinburgh
  28. Eidgenössische Technische Hochschule (ETH) Zürich
  29. Fermi National Accelerator Laboratory
  30. University of Illinois at Urbana-Champaign
  31. Institut de Ciencies de l'Espai (IEEC/CSIC)
  32. Institut de Fisica d'Altes Energies
  33. Lawrence Berkeley National Laboratory
  34. Ludwig-Maximilians Universität München
  35. Excellence Cluster Universe
  36. University of Michigan
  37. NSF's NOIRLab
  38. University of Nottingham
  39. Ohio State University
  40. University of Pennsylvania
  41. University of Portsmouth
  42. SLAC National Accelerator Laboratory
  43. Stanford University
  44. University of Sussex
  45. Texas A&M University
  46. OzDES Membership Consortium
  47. National Science Foundation [AST-1138766, AST-1536171]
  48. MICINN [ESP2017-89838, PGC2018-094773, PGC2018-102021, SEV-2016-0588, SEV-2016-0597, MDM-2015-0509]
  49. ERDF funds from the European Union
  50. CERCA program of the Generalitat de Catalunya
  51. European Research Council under the European Union's Seventh Framework Program (FP7/2007-2013)
  52. Brazilian Instituto Nacional de Ciencia e Tecnologia (INCT) do e-Universo (CNPq) [465376/2014-2]
  53. U.S. Department of Energy, Office of Science, and Office of High Energy Physics [DE-AC02-07CH11359]
  54. ERC [240672, 291329, 306478]
  55. European Research Council (ERC) [291329] Funding Source: European Research Council (ERC)
  56. STFC [ST/S000550/1, ST/I000879/1, ST/R000476/1] Funding Source: UKRI

This study demonstrates that the Random Forest machine learning algorithm can accurately predict joint redshift-stellar mass probability distribution functions (PDFs), even when only a few photometric bands are available. Models built using DES and COSMOS2015 data outperform a basic set-up of the template-fitting code BAGPIPES and are computationally fast. The accompanying GALPRO package gives researchers a fast and intuitive tool for generating multivariate PDFs in cosmology and galaxy evolution studies.
We demonstrate that highly accurate joint redshift-stellar mass probability distribution functions (PDFs) can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the griz bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for 10 699 test galaxies by utilizing the copula probability integral transform and the Kendall distribution function, and their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our predefined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just under 6 min with consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient PYTHON package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies.
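
The validation step named in the abstract can be illustrated with a small sketch. The code below is not GALPRO and does not use its API; it is a toy, standard-library-only illustration that assumes each galaxy's joint PDF is represented by a finite set of (redshift, log stellar mass) samples. The function names, the synthetic `draw_galaxy` distribution, and all numbers are hypothetical. The marginal probability integral transform (PIT) is taken as the empirical CDF evaluated at the true value, and the copula PIT is taken as the empirical Kendall distribution function evaluated at the joint CDF of the true point. For well-calibrated PDFs both quantities should be approximately uniform on [0, 1].

```python
import random


def marginal_pit(samples, truth):
    """Empirical CDF of the 1D PDF samples, evaluated at the true value."""
    return sum(s <= truth for s in samples) / len(samples)


def joint_cdf(samples, point):
    """Empirical joint CDF H(z, m): fraction of samples at or below the point in both dimensions."""
    z, m = point
    return sum(sz <= z and sm <= m for sz, sm in samples) / len(samples)


def copula_pit(samples, truth):
    """Empirical Kendall distribution function K(w) evaluated at w = H(truth)."""
    h_truth = joint_cdf(samples, truth)
    h_samples = [joint_cdf(samples, s) for s in samples]
    return sum(h <= h_truth for h in h_samples) / len(samples)


def draw_galaxy(rng):
    """Hypothetical correlated (redshift, log stellar mass) draw; toy numbers only."""
    z = rng.gauss(0.7, 0.2)
    log_mass = 10.0 + 1.5 * (z - 0.7) + rng.gauss(0.0, 0.3)
    return z, log_mass


rng = random.Random(0)
n_galaxies, n_samples = 150, 100
pits_z, pits_joint = [], []
for _ in range(n_galaxies):
    # Calibrated toy case: the "truth" is drawn from the same distribution as the PDF samples.
    pdf_samples = [draw_galaxy(rng) for _ in range(n_samples)]
    truth = draw_galaxy(rng)
    pits_z.append(marginal_pit([s[0] for s in pdf_samples], truth[0]))
    pits_joint.append(copula_pit(pdf_samples, truth))

# Both means should sit near 0.5 when the PDFs are calibrated.
print(sum(pits_z) / n_galaxies, sum(pits_joint) / n_galaxies)
```

In practice the PIT and copula PIT values would be histogrammed over the test set; departures from a flat histogram point to biased, over-dispersed, or under-dispersed PDFs. The nested loops here scale quadratically with the number of samples per galaxy, so a real analysis would likely vectorize this step.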
