Galaxy Zoo: Clump Scout - Design and first application of a two-dimensional aggregation tool for citizen science

Journal

MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY
Volume 517, Issue 4, Pages 5882-5911

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/mnras/stac2919

Keywords

methods: data analysis; methods: statistical; software: data analysis; software: public release; galaxies: structure

Funding

  1. European Union [824064]
  2. Science and Technology Facilities Council [ST/P000584/1]
  3. Alan Turing Institute [EP/V030302/1]
  4. National Science Foundation [AST 1716602, IIS 2006894]
  5. National Aeronautics and Space Administration (NASA) [HST-AR-15792.002-A]
  6. Google
  7. Alfred P. Sloan Foundation

This article introduces Galaxy Zoo: Clump Scout, a citizen science project that identifies giant star-forming clumps in galaxies, together with a statistically driven software framework that aggregates annotations from multiple volunteers into consensus labels and assesses the reliability of those labels. Applied to a large data set of volunteer annotations, the framework identifies 128 100 potential clumps across 44 126 galaxies.
Galaxy Zoo: Clump Scout is a web-based citizen science project designed to identify and spatially locate giant star-forming clumps in galaxies that were imaged by the Sloan Digital Sky Survey Legacy Survey. We present a statistically driven software framework that is designed to aggregate two-dimensional annotations of clump locations provided by multiple independent Galaxy Zoo: Clump Scout volunteers and generate a consensus label that identifies the locations of probable clumps within each galaxy. The statistical model underlying our framework allows us to assign a false-positive probability to each clump we identify, to estimate the skill level of each volunteer who contributes to Galaxy Zoo: Clump Scout, and to quantitatively assess the reliability of the consensus label derived for each subject. We apply our framework to a data set containing 3 561 454 two-dimensional points, which constitute 1 739 259 annotations of 85 286 distinct subjects provided by 20 999 volunteers. Using this data set, we identify 128 100 potential clumps distributed among 44 126 galaxies. This data set can be used to study the prevalence and demographics of giant star-forming clumps in low-redshift galaxies. The code for our aggregation software framework is publicly available at:
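
To make the idea of aggregating two-dimensional annotations into a consensus label more concrete, the following is a minimal illustrative sketch. It is not the authors' released framework and does not implement their statistical model: it swaps in a simple greedy distance-threshold clustering with a volunteer-agreement cutoff. The function names (`consensus_clumps`, `centroid`), the pixel `radius`, and the `min_fraction` threshold are all assumptions made for this example.

```python
from math import hypot


def centroid(points):
    """Mean (x, y) position of a non-empty list of points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def consensus_clumps(annotations, radius=5.0, min_fraction=0.5):
    """Group volunteer clicks on one image into candidate clumps.

    annotations: dict mapping a volunteer id to a list of (x, y) clicks.
    radius and min_fraction are hypothetical tuning parameters.
    Returns a list of (x, y, agreement) tuples, where agreement is the
    fraction of volunteers who clicked within `radius` of the cluster.
    """
    clusters = []  # each cluster: {"points": [...], "volunteers": set()}
    for volunteer, clicks in annotations.items():
        for x, y in clicks:
            best, best_dist = None, radius
            for cluster in clusters:
                cx, cy = centroid(cluster["points"])
                dist = hypot(x - cx, y - cy)
                # A volunteer contributes at most one click per cluster.
                if dist <= best_dist and volunteer not in cluster["volunteers"]:
                    best, best_dist = cluster, dist
            if best is None:
                clusters.append({"points": [(x, y)], "volunteers": {volunteer}})
            else:
                best["points"].append((x, y))
                best["volunteers"].add(volunteer)

    n_volunteers = len(annotations)
    consensus = []
    for cluster in clusters:
        agreement = len(cluster["volunteers"]) / n_volunteers
        if agreement >= min_fraction:
            cx, cy = centroid(cluster["points"])
            consensus.append((cx, cy, agreement))
    return consensus


# Toy example: three volunteers annotate the same image.
example = {
    "a": [(10, 10), (50, 50)],
    "b": [(11, 9)],
    "c": [(10, 11), (80, 80)],
}
clumps = consensus_clumps(example)
```

In this toy input only the location near (10, 10) is marked by a majority of volunteers, so it is the only cluster retained. A fixed agreement threshold treats every volunteer as equally reliable, whereas the framework described in the abstract estimates per-volunteer skill levels and assigns false-positive probabilities to each clump.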
