Article

A classification for complex imbalanced data in disease screening and early diagnosis

Journal

STATISTICS IN MEDICINE
Volume 41, Issue 19, Pages 3679-3695

Publisher

WILEY
DOI: 10.1002/sim.9442

Keywords

Alzheimer's disease; AUC; brain imaging data; class imbalance; group LASSO

Funding

  1. Alzheimer's Disease Neuroimaging Initiative (ADNI) (National Institutes of Health) [U01 AG024904]
  2. DOD ADNI (Department of Defense) [W81XWH-12-2-0012]
  3. National Institute on Aging
  4. National Institute of Biomedical Imaging and Bioengineering
  5. AbbVie
  6. Alzheimer's Association
  7. Alzheimer's Drug Discovery Foundation
  8. Araclon Biotech
  9. BioClinica, Inc.
  10. Biogen
  11. Bristol-Myers Squibb Company
  12. CereSpir, Inc.
  13. Cogstate
  14. Eisai Inc.
  15. Elan Pharmaceuticals, Inc.
  16. Eli Lilly and Company
  17. EuroImmun
  18. F. Hoffmann-La Roche Ltd
  19. Genentech, Inc.
  20. Fujirebio
  21. GE Healthcare
  22. IXICO Ltd.
  23. Janssen Alzheimer Immunotherapy Research & Development, LLC.
  24. Johnson & Johnson Pharmaceutical Research & Development LLC.
  25. Lumosity
  26. Lundbeck
  27. Merck & Co., Inc.
  28. Meso Scale Diagnostics, LLC.
  29. NeuroRx Research
  30. Neurotrack Technologies
  31. Novartis Pharmaceuticals Corporation
  32. Pfizer Inc.
  33. Piramal Imaging
  34. Servier
  35. Takeda Pharmaceutical Company
  36. Transition Therapeutics
  37. Canadian Institutes of Health Research
  38. Northern California Institute for Research and Education


This article presents a nonparametric classification approach for imbalanced data in longitudinal and high-dimensional settings. The proposed method shows improvements in imbalanced classification while maintaining low computational complexity and providing meaningful feature selection.
Imbalanced classification has drawn considerable attention in the statistics and machine learning literature. Traditional classification methods often perform poorly when the class distribution is severely skewed, and even more so when the data are also high-dimensional and longitudinal. Given the ubiquity of big data in modern health research, imbalanced classification in disease diagnosis is expected to face an additional level of difficulty imposed by such complex data structures. In this article, we propose a nonparametric classification approach for imbalanced data in longitudinal and high-dimensional settings. Functional principal component analysis is first applied for feature extraction under the longitudinal structure. The univariate exponential loss function coupled with a group LASSO penalty is then adopted in the classification procedure to handle the high-dimensional setting. In addition to improving imbalanced classification, our approach provides meaningful feature selection for interpretation while enjoying remarkably lower computational complexity. The proposed method is illustrated with a real data application to early detection of Alzheimer's disease, and its finite-sample performance is extensively evaluated by simulations.
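The two-stage pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the curves are simulated on a common dense grid, FPCA is approximated by an SVD of the centered curves, the standard exponential loss stands in for the paper's univariate exponential loss, and the solver is a plain proximal gradient loop. The function names (fpca_scores, group_soft_threshold, fit_exp_group_lasso) and all tuning values are hypothetical. The sketch also does not reproduce how the paper handles class imbalance.

```python
import numpy as np

def fpca_scores(curves, n_components=2):
    """Dense-grid FPCA approximation for one longitudinal feature.

    curves: array of shape (n_subjects, n_timepoints).
    Returns the scores on the leading n_components eigenfunctions.
    """
    centered = curves - curves.mean(axis=0)
    # Right singular vectors of the centered data act as discretized eigenfunctions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def group_soft_threshold(beta, groups, thresh):
    """Proximal operator of thresh * sum_g ||beta_g||_2 (group LASSO penalty)."""
    out = beta.copy()
    for g in groups:
        norm = np.linalg.norm(beta[g])
        out[g] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * beta[g]
    return out

def fit_exp_group_lasso(X, y, groups, lam=0.1, step=0.1, n_iter=500):
    """Proximal gradient descent on mean(exp(-y * f(x))) + lam * group penalty.

    y must be coded as -1 / +1; f(x) = b0 + x @ beta. The intercept is not penalized.
    """
    beta, b0 = np.zeros(X.shape[1]), 0.0
    for _ in range(n_iter):
        margin = y * (b0 + X @ beta)
        w = np.exp(np.clip(-margin, -30.0, 30.0))  # per-sample loss, clipped for numerical safety
        grad_beta = -(X * (y * w)[:, None]).mean(axis=0)
        grad_b0 = -(y * w).mean()
        beta = group_soft_threshold(beta - step * grad_beta, groups, step * lam)
        b0 -= step * grad_b0
    return b0, beta

# Simulated example (hypothetical data): about 10% positives, 5 longitudinal features.
rng = np.random.default_rng(0)
n, n_time, n_features, K = 300, 20, 5, 2
y = np.where(rng.random(n) < 0.1, 1.0, -1.0)
t = np.linspace(0.0, 1.0, n_time)
curves = rng.normal(size=(n_features, n, n_time))
curves[0] += (y[:, None] > 0) * 2.0 * np.sin(np.pi * t)  # only feature 0 carries signal

# Stage 1: FPCA scores per feature; each feature's K scores form one group.
X = np.hstack([fpca_scores(c, K) for c in curves])
X = (X - X.mean(axis=0)) / X.std(axis=0)
groups = [np.arange(j * K, (j + 1) * K) for j in range(n_features)]

# Stage 2: exponential loss with a group LASSO penalty.
b0, beta = fit_exp_group_lasso(X, y, groups)
print("group norms:", [float(np.linalg.norm(beta[g])) for g in groups])
```

Because the penalty acts on each feature's block of FPCA scores as a whole, a feature is either kept or dropped entirely once its block is thresholded to zero, which is what makes the selected features interpretable at the level of the original longitudinal measurements.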
