Article

Computer-Aided Detection: The Effect of Training Databases on Detection of Subtle Breast Masses

Journal

ACADEMIC RADIOLOGY
Volume 17, Issue 11, Pages 1401-1408

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.acra.2010.06.009

Keywords

Computer-aided detection; CAD; full-field digital mammography; FFDM; image databases; performance assessment

Funding

  1. National Cancer Institute, National Institutes of Health (Bethesda, MD) [CA77850, CA101733]

Rationale and Objectives: Lesion conspicuity is typically highly correlated with visual difficulty for lesion detection, and computer-aided detection (CAD) has been widely used as a second reader in mammography. Hence, increasing CAD sensitivity in detecting subtle cancers without increasing false-positive rates is important. The aim of this study was to investigate the effect of training database case selection on CAD performance in detecting low-conspicuity breast masses.

Materials and Methods: A full-field digital mammographic image database that included 525 cases depicting malignant masses was randomly partitioned into three subsets. A CAD scheme was applied to detect all initially suspected mass regions and compute region conspicuity. Training samples were iteratively selected from two of the subsets. Four types of training data sets were assembled: (1) "all," which included all available true-positive mass regions in the two subsets; (2) "diverse," which included 350 randomly selected mass regions; (3) "easy," which included 350 high-conspicuity mass regions; and (4) "difficult," which included 350 low-conspicuity mass regions. Each training data set also included as many randomly selected false-positive regions as true-positive regions. Two classifiers, an artificial neural network (ANN) and a k-nearest neighbor (KNN) algorithm, were trained using each of the four training data sets and tested on all suspected regions in the remaining subset. Using a threefold cross-validation method, the performance changes of the CAD schemes trained using each of the four training data sets were computed and compared.

Results: CAD initially detected 1025 true-positive mass regions depicted on 507 cases (97% case-based sensitivity) and 9569 false-positive regions (3.5 per image) in the entire database. Using the "all" training data set, CAD achieved the highest overall performance on the entire testing database. However, CAD detected the highest number of low-conspicuity masses when the "difficult" training data set was used for training. Results were consistent for the ANN-based and KNN-based classifiers in all tests. Compared to the use of the "all" training data set, the sensitivity of the schemes trained using the "difficult" data set decreased by 8.6% and 8.4% for the ANN and KNN algorithms, respectively, on the entire database, but the detection of low-conspicuity masses increased by 7.1% and 15.1% for the ANN and KNN algorithms, respectively, at a false-positive rate of 0.3 per image.

Conclusions: CAD performance depends on the size, diversity, and difficulty level of the training database. To increase CAD sensitivity in detecting subtle cancers, one should increase the fraction of difficult cases in the training database rather than simply increasing the training data set size.
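The training-set assembly described in Materials and Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Region` record, the `conspicuity` field, and both function names are hypothetical, and a simple one-pass ranking by conspicuity stands in for the iterative selection procedure, whose details the abstract does not give. The sketch only shows how the four balanced training sets and the threefold case partition might be formed.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical record for one suspected region flagged by a CAD scheme."""
    region_id: int
    is_mass: bool        # True for a true-positive mass region
    conspicuity: float   # assumed score; higher means easier to see


def build_training_sets(regions, n_select=350, seed=0):
    """Assemble the 'all', 'diverse', 'easy' and 'difficult' training sets.

    Each set pairs its true-positive regions with an equal number of
    randomly selected false-positive regions, as stated in the abstract.
    """
    rng = random.Random(seed)
    masses = [r for r in regions if r.is_mass]
    false_positives = [r for r in regions if not r.is_mass]
    ranked = sorted(masses, key=lambda r: r.conspicuity)  # low to high
    n = min(n_select, len(masses))
    positives = {
        "all": masses,
        "diverse": rng.sample(masses, n),
        "easy": ranked[len(ranked) - n:],   # highest conspicuity
        "difficult": ranked[:n],            # lowest conspicuity
    }
    training_sets = {}
    for name, pos in positives.items():
        k = min(len(pos), len(false_positives))
        training_sets[name] = list(pos) + rng.sample(false_positives, k)
    return training_sets


def threefold_splits(case_ids, seed=0):
    """Yield (train_cases, test_cases) for each of three random folds."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    folds = [shuffled[i::3] for i in range(3)]
    for i in range(3):
        train = [c for j in range(3) if j != i for c in folds[j]]
        yield train, folds[i]


# Synthetic demonstration data only; these are not values from the study.
demo_rng = random.Random(1)
demo_regions = [Region(i, i < 700, demo_rng.random()) for i in range(3000)]
demo_sets = build_training_sets(demo_regions, n_select=350)
demo_splits = list(threefold_splits(range(525)))
```

In this sketch, a classifier would be trained on one of the four sets drawn from the two training folds and then scored on every suspected region in the held-out fold, so that the "difficult" and "all" sets can be compared on the same test cases.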
