4.6 Article

Robust statistical tools for identifying multiple stellar populations in globular clusters in the presence of measurement errors. A case study: NGC 2808

Journal

ASTRONOMY & ASTROPHYSICS
Volume 658

Publisher

EDP SCIENCES S A
DOI: 10.1051/0004-6361/202142454

Keywords

methods: statistical; stars: evolution; stars: abundances; globular clusters: general; globular clusters: individual: NGC 2808

Funding

  1. Czech Science Foundation GACR [21-16583M]

This study explores the analysis methods for multiple stellar populations in globular clusters, using NGC 2808 as a case study. By employing established statistical clustering methods and addressing the issue of measurement errors, the results obtained differ from those reported in previous literature. The findings suggest that the existence of multiple populations is reliable in high-resolution spectroscopy data, but questionable in low-resolution spectroscopy data. Additionally, it is revealed that the commonly used histogram analysis is prone to generating false-positive findings. Therefore, the use of statistically grounded methods is crucial for conducting more robust and reproducible research.
Context. The finding of multiple stellar populations (MPs), which are defined by patterns in the stellar element abundances, is considered today a distinctive feature of globular clusters. However, while data availability and quality have improved in the past decades, this is not always true for the techniques that are adopted to analyse them, which creates problems of objectivity for the claims and reproducibility.
Aims. Using NGC 2808 as test case, we show the use of well-established statistical clustering methods. We focus our analysis on the red giant branch phase, where two data sets are available in the recent literature for low- and high-resolution spectroscopy.
Methods. We adopted hierarchical clustering and partition methods. We explicitly addressed the usually neglected problem of measurement errors, for which we relied on techniques that were recently introduced in the statistical literature. The results of the clustering algorithms were subjected to a silhouette width analysis to compare the performance of the split into different numbers of MPs.
Results. For both data sets the results of the statistical pipeline are at odds with those reported in the literature. Two MPs are detected for both data sets, while the literature reports five and four MPs from high- and low-resolution spectroscopy, respectively. The silhouette analysis suggests that the population substructure is reliable for high-resolution spectroscopy data, while the actual existence of MP is questionable for the low-resolution spectroscopy data. The discrepancy with literature claims can be explained with the different methods that were adopted to characterise MPs. By means of Monte Carlo simulations and multimodality statistical tests, we show that the often adopted study of the histogram of the differences in some key elements is prone to multiple false-positive findings.
Conclusions. The adoption of statistically grounded methods, which adopt all the available information to split the data into subsets and explicitly address the problem of data uncertainty, is of paramount importance to present more robust and reproducible research.
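The silhouette width comparison mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the abundance-like values below are invented for illustration, the group labels are assigned by hand rather than by a clustering algorithm, Euclidean distance is an assumption, and measurement errors are ignored entirely. The function name `mean_silhouette` is hypothetical.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette width of a labelled partition (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to the members of a cluster, excluding i itself.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    widths = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            widths.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within the own cluster
        b = min(mean_dist(i, members)  # separation: nearest other cluster
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        widths.append((b - a) / denom if denom > 0 else 0.0)
    return sum(widths) / len(widths)

# Invented two-element abundance pairs (illustrative only, not real data).
points = [(0.00, 0.30), (0.05, 0.32), (0.10, 0.28), (0.02, 0.35),
          (0.50, -0.20), (0.55, -0.25), (0.60, -0.18), (0.52, -0.22)]

labels_two = [0, 0, 0, 0, 1, 1, 1, 1]    # split into two groups
labels_three = [0, 0, 0, 0, 1, 1, 2, 2]  # second group split further

sil_two = mean_silhouette(points, labels_two)
sil_three = mean_silhouette(points, labels_three)
print(f"two groups:   {sil_two:.3f}")
print(f"three groups: {sil_three:.3f}")
```

In this toy case the two-group split has the larger average silhouette width, so the extra subdivision is not supported. The same comparison across candidate numbers of groups is the kind of check the abstract describes.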
