Article

Random effects structure for confirmatory hypothesis testing: Keep it maximal

Journal

JOURNAL OF MEMORY AND LANGUAGE
Volume 68, Issue 3, Pages 255-278

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jml.2012.11.001

Keywords

Linear mixed-effects models; Generalization; Statistics; Monte Carlo simulation

Funding

  1. ESRC [RES-062-23-2009]
  2. NSF [IIS-0953870]
  3. NIH [HD065829]
  4. Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0953870] Funding Source: National Science Foundation
  5. Economic and Social Research Council [ES/G045720/1] Funding Source: researchfish
  6. ESRC [ES/G045720/1] Funding Source: UKRI

Abstract

Linear mixed-effects models (LMEMs) have become increasingly prominent in psycholinguistics and related areas. However, many researchers do not seem to appreciate how random effects structures affect the generalizability of an analysis. Here, we argue that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades. Through theoretical arguments and Monte Carlo simulation, we show that LMEMs generalize best when they include the maximal random effects structure justified by the design. The generalization performance of LMEMs including data-driven random effects structures strongly depends upon modeling criteria and sample size, yielding reasonable results on moderately-sized samples when conservative criteria are used, but with little or no power advantage over maximal models. Finally, random-intercepts-only LMEMs used on within-subjects and/or within-items data from populations where subjects and/or items vary in their sensitivity to experimental manipulations always generalize worse than separate F1 and F2 tests, and in many cases, even worse than F1 alone. Maximal LMEMs should be the 'gold standard' for confirmatory hypothesis testing in psycholinguistics and beyond. © 2012 Elsevier Inc. All rights reserved.
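
To make the contrast in the abstract concrete, here is a minimal sketch of the two model types it compares. The design and notation are assumptions made for illustration and are not taken from the abstract: a single two-level predictor X manipulated within subjects and within items, with Y_{si} the response of subject s to item i. Under those assumptions, a maximal model gives every subject and every item both a random intercept and a random slope for X, while a random-intercepts-only model drops the slope terms.

```latex
% Illustrative notation only (assumed, not quoted from the paper).
% Maximal model: by-subject and by-item random intercepts and slopes.
\begin{align*}
Y_{si} &= \beta_0 + S_{0s} + I_{0i}
          + (\beta_1 + S_{1s} + I_{1i})\, X_{si} + e_{si}, \\
(S_{0s}, S_{1s}) &\sim \mathcal{N}(\mathbf{0}, \Sigma_S), \qquad
(I_{0i}, I_{1i}) \sim \mathcal{N}(\mathbf{0}, \Sigma_I), \qquad
e_{si} \sim \mathcal{N}(0, \sigma^2).
\end{align*}

% Random-intercepts-only model: the slope terms S_{1s} and I_{1i} are omitted.
\begin{align*}
Y_{si} &= \beta_0 + S_{0s} + I_{0i} + \beta_1 X_{si} + e_{si}.
\end{align*}
```

In this sketch, S_{1s} and I_{1i} are what let subjects and items "vary in their sensitivity to experimental manipulations". If that variation exists in the population but the model omits these terms, the test of \beta_1 has no term to account for it, which is the situation the abstract reports as generalizing poorly.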
