
Three points to consider when choosing a LM or GLM test for count data

METHODS IN ECOLOGY AND EVOLUTION (2016)

Journal

METHODS IN ECOLOGY AND EVOLUTION

Volume 7, Issue 8, Pages 882-890

Publisher

WILEY

DOI: 10.1111/2041-210X.12552

Keywords

data transformation; generalized linear models; multivariate analysis; power analysis; type I error

Category

Ecology

Funding

Australian Research Council [FT120100501, DP150100823, LP150100972]
US National Science Foundation [DEB-LTREB-1052160]
US National Science Foundation, Directorate for Biological Sciences, Division of Environmental Biology [1052160, 1556208]

Abstract

The two most common approaches for analysing count data are to use a generalized linear model (GLM), or to transform the data and use a linear model (LM). The latter has recently been advocated as more reliably maintaining control of type I error rates in tests for no association, while seemingly losing little in power. We make three points on this issue.

Point 1 - Choice of statistical model should primarily be made on the grounds of data properties. Choice of testing procedure should be considered and addressed as a separate issue, after model choice. If models with the appropriate data properties nonetheless have statistical problems such as poor type I error control (i.e. the type I error rate greatly exceeds the intended significance level), the best solution is to keep the model but fix the problems.

Point 2 - When a test has problems with type I error control, it can usually be corrected, but this may require departure from software default approaches. In particular, resampling is a good solution for small samples and can be easy to implement.

Point 3 - Tests based on models that better fit the data (e.g. a negative binomial for overdispersed count data) tend to have better power properties and in some instances have considerably higher power.

We illustrate these issues for a 2×2 experiment with a count response. This seemingly simple problem becomes hard when the experimental design is unbalanced, and software default procedures using LMs or GLMs can have difficulties, although in both cases the issues can be fixed. We conclude that, when GLMs are thought to fit count data well, and when any necessary steps are taken to correct type I error rates, they should be used rather than LMs. Nonetheless, standard LM tests are often robust and can have good type I error control, so there is an argument for their use for counts when diagnostics are difficult and statistical models are complex, although at some risk of loss of power and interpretability.
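The resampling idea in Point 2 can be illustrated with a minimal sketch. This is not the authors' procedure: it is a plain permutation test on log(y + 1)-transformed counts, reduced to a single two-group comparison rather than the full 2×2 design, and it uses only the Python standard library. The function name `permutation_test`, its parameters, and the count data are all hypothetical and made up for illustration.

```python
import math
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=9999, seed=0):
    """Two-sided permutation test for a difference in mean log(y + 1) counts.

    Illustrative only: a simple label-shuffling scheme, assumed here as a
    stand-in for whatever resampling scheme a real analysis would use.
    """
    rng = random.Random(seed)
    # Transform counts as in the "transform then LM" approach; log1p handles zeros.
    a = [math.log1p(y) for y in group_a]
    b = [math.log1p(y) for y in group_b]
    observed = abs(mean(a) - mean(b))

    pooled = a + b
    n_a = len(a)
    n_extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random, keeping the (unbalanced) group sizes.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance so ties are not lost to floating-point rounding.
        if stat >= observed - 1e-12:
            n_extreme += 1
    # Count the observed labelling as one permutation, so the p-value is never 0.
    return (n_extreme + 1) / (n_perm + 1)


# Hypothetical, made-up counts for an unbalanced two-group comparison.
control = [0, 2, 1, 3, 0, 1, 4, 2]
treated = [5, 9, 3, 12]
p_value = permutation_test(control, treated)
print(f"permutation p-value: {p_value:.4f}")
```

Because the null distribution is built from the data themselves, the test does not rely on large-sample approximations, which is the property the abstract appeals to for small samples. The same shuffling logic could in principle wrap a GLM-based test statistic instead of a difference in transformed means.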