Journal
EPIDEMIOLOGY
Volume 18, Issue 2, Pages 246-252
Publisher
LIPPINCOTT WILLIAMS & WILKINS
DOI: 10.1097/01.ede.0000254708.40228.8b
Funding
- NCI NIH HHS [R01 CA094594] Funding Source: Medline
Multiple imputation can be a good solution for handling missing data when the data are missing at random. However, this assumption is often difficult to verify. We describe an application of multiple imputation that makes this assumption plausible. The procedure requires contacting a random sample of subjects with incomplete data to fill in the missing information, and then adjusting the imputation model to incorporate the new data. Simulations with missing data that were decidedly not missing at random showed, as expected, that the method restored the original beta coefficients, whereas other methods of dealing with missing data failed. Using a dataset with real missing data, we found that different approaches to imputation produced moderately different results. Simulations suggest that filling in 10% of the data that were initially missing is sufficient for imputation in many epidemiologic applications and should produce approximately unbiased results, provided there is a high response on follow-up from the subsample of those with some originally missing data. This response can probably be achieved if the follow-up data collection is planned as an initial approach to dealing with the missing data, rather than at later stages, after further attempts that leave only data that are very difficult to complete.
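The idea of following up a random subsample of incomplete records can be illustrated with a small toy simulation. This is only a sketch under stated assumptions and not the authors' simulation: it estimates a simple prevalence rather than regression beta coefficients, it assumes a 100% response among the followed-up subjects, and the function name `simulate`, the missingness probabilities (0.60 and 0.20), and every other parameter are hypothetical choices made for illustration.

```python
import random


def simulate(n=20000, true_prev=0.30, followup_frac=0.10, m=20, seed=1):
    """Toy comparison of complete-case analysis and follow-up-informed imputation.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)

    # True binary exposure for every subject.
    x = [1 if rng.random() < true_prev else 0 for _ in range(n)]

    # Not missing at random: exposed subjects are far more likely to have x missing.
    missing = [rng.random() < (0.60 if xi == 1 else 0.20) for xi in x]

    observed = [xi for xi, mi in zip(x, missing) if not mi]
    missing_idx = [i for i, mi in enumerate(missing) if mi]

    # Complete-case estimate: ignores every subject with missing x.
    complete_case = sum(observed) / len(observed)

    # Follow up a random subsample of the incomplete subjects
    # (assumes every contacted subject responds).
    k = max(1, round(followup_frac * len(missing_idx)))
    followed = rng.sample(missing_idx, k)
    s = sum(x[i] for i in followed)  # exposed subjects found on follow-up
    n_still_missing = len(missing_idx) - k

    # Multiple imputation for the subjects who are still missing, using the
    # exposure distribution learned from the followed-up subsample.
    estimates = []
    for _ in range(m):
        # Draw the prevalence among initially missing subjects from its
        # Beta posterior (uniform prior) to propagate estimation uncertainty.
        p = rng.betavariate(s + 1, k - s + 1)
        imputed = sum(1 for _ in range(n_still_missing) if rng.random() < p)
        estimates.append((sum(observed) + s + imputed) / n)

    # Pooled point estimate: the mean over the m imputed datasets.
    pooled = sum(estimates) / m

    return {
        "true": sum(x) / n,
        "complete_case": complete_case,
        "followup_mi": pooled,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the complete-case estimate is biased downward because exposed subjects are under-represented among the complete records, while the follow-up-informed estimate should land close to the true prevalence. A lower follow-up response rate, which the abstract flags as the key practical risk, is not modeled here.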