Article

Effects of categorization method, regression type, and variable distribution on the inflation of Type-I error rate when categorizing a confounding variable

Journal

STATISTICS IN MEDICINE
Volume 34, Issue 6, Pages 936-949

Publisher

WILEY-BLACKWELL
DOI: 10.1002/sim.6387

Keywords

Type-I error; confounding; categorization; dichotomization; simulation; distribution

Funding

  1. CIHR [110789, 120305, 119485]
  2. NSERC [402079-2011]
  3. AAC, FRQ-S
  4. FRQ-S

The loss of signal associated with categorizing a continuous variable is well known, and previous studies have demonstrated that this can lead to an inflation of Type-I error when the categorized variable is a confounder in a regression analysis estimating the effect of an exposure on an outcome. However, it is not known how the Type-I error may vary under different circumstances, including logistic versus linear regression, different distributions of the confounder, and different categorization methods. Here, we analytically quantified the effect of categorization and then performed a series of 9600 Monte Carlo simulations to estimate the Type-I error inflation associated with categorization of a confounder under different regression scenarios. We show that Type-I error is unacceptably high (>10% in most scenarios and often 100%). The only exception was when the variable categorized was a continuous mixture proxy for a genuinely dichotomous latent variable, where both the continuous proxy and the categorized variable are error-ridden proxies for the dichotomous latent variable. As expected, error inflation was also higher with larger sample size, fewer categories, and stronger associations between the confounder and the exposure or outcome. We provide online tools that can help researchers estimate the potential error inflation and understand how serious a problem this is. Copyright (C) 2014 John Wiley & Sons, Ltd.
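The mechanism described in the abstract can be illustrated with a small Monte Carlo sketch for the linear-regression case. This is not the authors' simulation design: the function name `type1_error_rate`, the parameter values (`n`, `rho`, `beta_c`), the median split as the categorization method, and the use of the normal critical value 1.96 in place of an exact t quantile are all assumptions chosen for illustration. The exposure has no true effect on the outcome, so every rejection of the null is a Type-I error.

```python
import numpy as np

def type1_error_rate(n=500, n_sims=1000, rho=0.5, beta_c=0.5,
                     categorize=True, seed=0):
    """Estimate the Type-I error rate for the exposure coefficient.

    Hypothetical setup (not from the paper):
      c ~ N(0, 1)                     continuous confounder
      x = rho*c + noise               exposure, correlated with c
      y = beta_c*c + noise            outcome; x has NO true effect
    The model y ~ x + adj is fitted by OLS, where adj is either c itself
    or c dichotomized at its sample median.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        c = rng.standard_normal(n)
        x = rho * c + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
        y = beta_c * c + rng.standard_normal(n)

        # Adjust for either the continuous or the dichotomized confounder.
        adj = (c > np.median(c)).astype(float) if categorize else c
        X = np.column_stack([np.ones(n), x, adj])

        # Ordinary least squares fit and the usual coefficient covariance.
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)

        # Two-sided test of the exposure coefficient at a nominal 5% level,
        # using the large-sample normal critical value (an approximation).
        t_stat = coef[1] / np.sqrt(cov[1, 1])
        rejections += int(abs(t_stat) > 1.96)
    return rejections / n_sims

if __name__ == "__main__":
    print("continuous confounder: ", type1_error_rate(categorize=False))
    print("dichotomized confounder:", type1_error_rate(categorize=True))
```

Under these assumed settings, adjusting for the continuous confounder should keep the rejection rate near the nominal 5%, while adjusting for the median-split version leaves residual confounding that pushes the rate well above it. Increasing `n`, `rho`, or `beta_c` should increase the inflation, in line with the trends the abstract reports.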
