Article

Three points to consider when choosing a LM or GLM test for count data

Journal

METHODS IN ECOLOGY AND EVOLUTION
Volume 7, Issue 8, Pages 882-890

Publisher

WILEY
DOI: 10.1111/2041-210X.12552

Keywords

data transformation; generalized linear models; multivariate analysis; power analysis; type I error

Funding

  1. Australian Research Council [FT120100501, DP150100823, LP150100972]
  2. US National Science Foundation [DEB-LTREB-1052160]
  3. Australian Research Council [LP150100972] Funding Source: Australian Research Council
  4. Division Of Environmental Biology; Direct For Biological Sciences [1052160] Funding Source: National Science Foundation
  5. Division Of Environmental Biology; Direct For Biological Sciences [1556208] Funding Source: National Science Foundation

The two most common approaches for analysing count data are to use a generalized linear model (GLM), or transform data, and use a linear model (LM). The latter has recently been advocated to more reliably maintain control of type I error rates in tests for no association, while seemingly losing little in power. We make three points on this issue.

Point 1 - Choice of statistical model should primarily be made on the grounds of data properties. Choice of testing procedure should be considered and addressed as a separate issue, after model choice. If models with the appropriate data properties nonetheless have statistical problems such as type I error control (i.e. type I error rate greatly exceeds the intended significance level), the best solution is to keep the model but fix the problems.

Point 2 - When a test has problems with type I error control, it can usually be corrected, but this may require departure from software default approaches. In particular, resampling is a good solution for small samples that can be easy to implement.

Point 3 - Tests based on models that better fit the data (e.g. a negative binomial for overdispersed count data) tend to have better power properties and in some instances have considerably higher power.

We illustrate these issues for a 2 × 2 experiment with a count response. This seemingly simple problem becomes hard when the experimental design is unbalanced, and software default procedures using LMs or GLMs can have difficulties, although in both cases the issues can be fixed. We conclude that, when GLMs are thought to fit count data well, and when any necessary steps are taken to correct type I error rates, they should be used rather than LMs. Nonetheless, standard LM tests are often robust and can have good type I error control, so there is an argument for their use for counts when diagnostics are difficult and statistical models are complex, although at some risk of loss of power and interpretability.
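
The resampling idea in Point 2 can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the procedure used in the paper: it applies one common scheme, a Freedman-Lane residual permutation test, to the interaction term of an LM fitted to log(y + 1)-transformed counts from an unbalanced 2 × 2 design. The cell sizes, mean, dispersion, and the function names `f_stat` and `freedman_lane_pvalue` are all hypothetical choices made for the sketch.

```python
import numpy as np


def f_stat(y, X_full, X_null):
    """F statistic comparing two nested least-squares fits."""
    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return float(resid @ resid)

    rss_full, rss_null = rss(X_full), rss(X_null)
    df_num = X_full.shape[1] - X_null.shape[1]
    df_den = len(y) - X_full.shape[1]
    return ((rss_null - rss_full) / df_num) / (rss_full / df_den)


def freedman_lane_pvalue(y, X_full, X_null, n_perm=999, seed=0):
    """Permutation P-value: permute null-model residuals, refit, compare F."""
    rng = np.random.default_rng(seed)
    beta0, *_ = np.linalg.lstsq(X_null, y, rcond=None)
    fitted0 = X_null @ beta0
    resid0 = y - fitted0
    observed = f_stat(y, X_full, X_null)
    exceed = 0
    for _ in range(n_perm):
        y_star = fitted0 + rng.permutation(resid0)
        if f_stat(y_star, X_full, X_null) >= observed:
            exceed += 1
    # Count the observed statistic as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Hypothetical unbalanced 2 x 2 design: cell sizes 3, 8, 8, 3.
a = np.array([0] * 3 + [0] * 8 + [1] * 8 + [1] * 3, dtype=float)
b = np.array([0] * 3 + [1] * 8 + [0] * 8 + [1] * 3, dtype=float)
X_null = np.column_stack([np.ones_like(a), a, b])          # main effects only
X_full = np.column_stack([np.ones_like(a), a, b, a * b])   # plus interaction

# Simulated overdispersed counts with no true interaction (assumed values).
rng = np.random.default_rng(42)
mu, k = 5.0, 1.0                                # mean and NB size parameter
counts = rng.negative_binomial(k, k / (k + mu), size=len(a))

p_value = freedman_lane_pvalue(np.log1p(counts), X_full, X_null)
print(f"Permutation P-value for the interaction: {p_value:.3f}")
```

The same resampling loop could in principle wrap a GLM test statistic instead of the LM F statistic, which is closer in spirit to the abstract's recommendation to keep the better-fitting model and fix the test; that variant needs a GLM fitting routine and is not shown here.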
