Estimating Multilevel Logistic Regression Models When the Number of Clusters is Low: A Comparison of Different Statistical Software Procedures

Journal

INTERNATIONAL JOURNAL OF BIOSTATISTICS
Volume 6, Issue 1

Publisher

WALTER DE GRUYTER GMBH
DOI: 10.2202/1557-4679.1195

Keywords

statistical software; multilevel models; hierarchical models; random effects model; mixed effects model; generalized linear mixed models; Monte Carlo simulations; Bayesian analysis; R; SAS; Stata; BUGS

Funding

  1. Institute for Clinical Evaluative Sciences (ICES)
  2. Ontario Ministry of Health and Long-Term Care (MOHLTC)
  3. Canadian Institutes of Health Research (CIHR) [MOP 86508]
  4. Heart and Stroke Foundation of Ontario

Multilevel logistic regression models are increasingly being used to analyze clustered data in medical, public health, epidemiological, and educational research. Procedures for estimating the parameters of such models are available in many statistical software packages. There is currently little evidence on the minimum number of clusters necessary to reliably fit multilevel regression models. We conducted a Monte Carlo study to compare the performance of different statistical software procedures for estimating multilevel logistic regression models when the number of clusters was low. We examined procedures available in BUGS, HLM, R, SAS, and Stata. We found that there were qualitative differences in the performance of different software procedures for estimating multilevel logistic models when the number of clusters was low. Among the likelihood-based procedures, estimation methods based on adaptive Gauss-Hermite approximations to the likelihood (glmer in R and xtlogit in Stata) or adaptive Gaussian quadrature (Proc NLMIXED in SAS) tended to have superior performance for estimating variance components when the number of clusters was small, compared to software procedures based on penalized quasi-likelihood. However, only Bayesian estimation with BUGS allowed for accurate estimation of variance components when there were fewer than 10 clusters. For all statistical software procedures, estimation of variance components tended to be poor when there were only five subjects per cluster, regardless of the number of clusters.
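
To make the simulation setting concrete, the following is a minimal sketch of how one Monte Carlo replicate of clustered binary data could be generated from a random-intercept logistic model, logit P(y_ij = 1) = beta0 + beta1 * x_ij + u_j with u_j ~ Normal(0, tau^2). The function name, the single standard-normal covariate, and all parameter values are illustrative assumptions and are not taken from the paper; the study's actual data-generating process may differ.

```python
import math
import random


def simulate_clustered_binary(n_clusters, n_per_cluster, beta0, beta1, tau, seed=0):
    """Simulate one dataset from a random-intercept logistic model.

    Hypothetical helper for illustration only:
        logit P(y_ij = 1) = beta0 + beta1 * x_ij + u_j,  u_j ~ Normal(0, tau^2)
    Returns a list of (cluster_id, x, y) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_clusters):
        u_j = rng.gauss(0.0, tau)  # cluster-level random intercept
        for _ in range(n_per_cluster):
            x = rng.gauss(0.0, 1.0)  # subject-level covariate (assumed standard normal)
            eta = beta0 + beta1 * x + u_j  # linear predictor on the logit scale
            p = 1.0 / (1.0 + math.exp(-eta))  # inverse logit
            y = 1 if rng.random() < p else 0  # Bernoulli outcome
            rows.append((j, x, y))
    return rows


# Illustrative small-sample scenario: 5 clusters with 5 subjects each.
data = simulate_clustered_binary(
    n_clusters=5, n_per_cluster=5, beta0=-1.0, beta1=0.5, tau=1.0
)
```

In a full study of this kind, such a generator would be called many times per scenario, each dataset would be fitted with the procedure under comparison, and the estimates of the variance component tau^2 would be summarized across replicates.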
