Article

Mass Classification in Mammograms Using Selected Geometry and Texture Features, and a New SVM-Based Feature Selection Method

Journal

IEEE SYSTEMS JOURNAL
Volume 8, Issue 3, Pages 910-920

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/JSYST.2013.2286539

Keywords

Breast cancer; feature selection; mammogram; mass classification; mutual information (MI); recursive feature elimination (RFE); support vector machine (SVM)

Funding

  1. Open Foundation of Hubei Province Key Laboratory [znss2013A006]
  2. Natural Science Foundation of China [31201121, 61273225]

Masses are the primary indications of breast cancer in mammograms, and it is important to classify them as benign or malignant. Benign and malignant masses differ in geometry and texture characteristics. However, not every extracted geometry and texture feature improves classification accuracy; thus, selecting the best features from a set is important. In this paper, we examine feature selection methods for mass classification. We integrate a support vector machine (SVM)-based recursive feature elimination (SVM-RFE) procedure with the normalized mutual information feature selection (NMIFS) method to avoid their individual disadvantages (the redundancy in the selected features of the SVM-RFE and the unoptimized classifier for the NMIFS) while retaining their advantages, and we propose a new feature selection method, which is called the SVM-RFE with an NMIFS filter (SRN). In addition to feature selection, we also study the initialization of mass segmentation. Different initialization methods are investigated, and we propose fuzzy c-means (FCM) clustering with spatial constraints as the initialization step. In the experiments, 826 regions of interest (ROIs) from the Digital Database for Screening Mammography were used. All 826 were used in the classification experiments, and 413 ROIs were used in the feature selection experiments. Different feature selection methods, including F-score, Relief, SVM-RFE, SVM-RFE with a minimum redundancy-maximum relevance (mRMR) filter [SVM-RFE (mRMR)], and SRN, were used to select features and to compare mass classification results using the selected features. In the classification experiments, the linear discriminant analysis and the SVM classifiers were investigated.
The accuracies obtained with the SVM classifier using the features selected by the F-score, Relief, SVM-RFE, SVM-RFE (mRMR), and SRN methods are 88%, 88%, 90%, 91%, and 93%, respectively, with a tenfold cross-validation procedure, and 91%, 89%, 92%, 92%, and 94%, respectively, with a leave-one-out (LOO) scheme. We also compared the performance of the different feature selection methods using receiver operating characteristic analysis and the areas under the curve (AUCs). The AUCs for the F-score, Relief, SVM-RFE, SVM-RFE (mRMR), and SRN methods are 0.9014, 0.8916, 0.9121, 0.9236, and 0.9439, respectively, with a tenfold cross-validation procedure, and 0.9312, 0.9178, 0.9324, 0.9413, and 0.9615, respectively, with a LOO scheme. Both the accuracy and AUC values show that the proposed SRN feature selection method has the best performance. In addition to the accuracy and the AUC, we also tested the significance of the difference between the two best feature selection methods, i.e., the SVM-RFE (mRMR) and the proposed SRN method. Experimental results show that the proposed SRN method is significantly more accurate than the SVM-RFE (mRMR) (p = 0.011).
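
To make the redundancy argument concrete, the following is a minimal sketch of an NMIFS-style greedy filter on discrete features. It is not the authors' SRN implementation: the abstract does not give the scoring formula, so the score used here (relevance to the class minus the mean normalized mutual information with already selected features, normalized by the smaller entropy) is an assumption, and the SVM-RFE stage is omitted entirely. The feature names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_info(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def normalized_mi(xs, ys):
    """MI divided by min(H(X), H(Y)). ASSUMED normalization; 0 if a variable is constant."""
    denom = min(entropy(xs), entropy(ys))
    return mutual_info(xs, ys) / denom if denom > 0 else 0.0


def nmifs_rank(features, labels, k):
    """Greedily select k feature names: relevance minus mean redundancy (assumed score)."""
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_info(features[name], labels)
            if not selected:
                return relevance  # first pick: relevance only
            redundancy = sum(
                normalized_mi(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 masses, label 0 = benign, 1 = malignant.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "shape":       [0, 0, 0, 0, 1, 1, 1, 0],  # most relevant feature
    "shape_noisy": [0, 0, 0, 0, 1, 1, 0, 0],  # relevant, but largely duplicates "shape"
    "texture":     [0, 1, 0, 0, 1, 0, 1, 1],  # weaker, but mostly new information
}

picked = nmifs_rank(features, labels, 2)
by_relevance = sorted(features, key=lambda f: mutual_info(features[f], labels), reverse=True)
print(picked, by_relevance[:2])
```

On this toy data, ranking by relevance alone keeps the two shape features, whereas the redundancy penalty replaces the near-duplicate with the texture feature. That is the kind of redundancy a mutual information filter is meant to remove from an SVM-RFE ranking.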