Article

SGL-SVM: A novel method for tumor classification via support vector machine with sparse group Lasso

Journal

JOURNAL OF THEORETICAL BIOLOGY
Volume 486

Publisher

ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
DOI: 10.1016/j.jtbi.2019.110098

Keywords

Tumor classification; Gene expression data; Feature selection; Sparse group Lasso; Support vector machine

Funding

  1. National Natural Science Foundation of China [61863010, 11771188]
  2. Key Research and Development Program of Shandong Province of China [2019GGX101001]
  3. Natural Science Foundation of Shandong Province of China [ZR2017MA014, ZR2018MC007]
  4. Project of Shandong Province Higher Educational Science and Technology Program [J17KA159]

With the in-depth study of gene expression data, the significant role of tumor classification in clinical medicine has become more apparent. In particular, gene expression data exhibit sparsity both within and between groups. This paper therefore focuses on tumor classification based on the sparsity characteristics of genes. On this basis, we propose a new tumor classification method, SGL-SVM, which combines the sparse group Lasso (least absolute shrinkage and selection operator) with a support vector machine. First, a primary selection of feature genes is performed on the normalized tumor datasets using the Kruskal-Wallis rank sum test. Second, a sparse group Lasso is used for further selection. Finally, a support vector machine serves as the classifier. We validate the proposed method on microarray and NGS datasets. First, it is tested by 10-fold cross-validation on three two-class and five multi-class microarray datasets and compared with three other classifiers. SGL-SVM is then applied to the BRCA and GBM datasets and tested by 5-fold cross-validation. These experiments yield satisfactory accuracy, and the results are compared with those of other previously proposed methods. The experimental results show that the proposed method achieves higher classification accuracy while selecting fewer feature genes, and that it can be widely applied to the classification of high-dimensional, small-sample tumor datasets. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/SGL-SVM/. (C) 2019 Elsevier Ltd. All rights reserved.
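The abstract does not state the sparse group Lasso objective, so the following is only a minimal sketch of the shrinkage step under a commonly used formulation, in which the penalty is lam1 * ||w||_1 + lam2 * sum_g sqrt(p_g) * ||w_g||_2 with p_g the size of group g. The function names, the grouping, and the values of `lam1` and `lam2` are hypothetical and are not taken from the paper or its repository. The sketch shows how the penalty can zero out an entire group of genes (between-group sparsity) as well as individual genes inside a retained group (within-group sparsity).

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding: proximal operator of t * ||v||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def sgl_penalty(w, groups, lam1, lam2):
    """Assumed penalty: lam1 * ||w||_1 + lam2 * sum_g sqrt(p_g) * ||w_g||_2."""
    l1_term = np.abs(w).sum()
    group_term = sum(np.sqrt(len(g)) * np.linalg.norm(w[g]) for g in groups)
    return lam1 * l1_term + lam2 * group_term


def sgl_prox(w, groups, lam1, lam2, step=1.0):
    """Proximal operator of step * sgl_penalty for non-overlapping groups.

    Each group is first soft-thresholded elementwise (within-group sparsity),
    then shrunk as a block; a group whose remaining norm is at most the
    group threshold is set to zero entirely (between-group sparsity).
    """
    out = np.zeros_like(w, dtype=float)
    for g in groups:
        u = soft_threshold(w[g], step * lam1)
        norm = np.linalg.norm(u)
        t = step * lam2 * np.sqrt(len(g))
        if norm > t:
            out[g] = (1.0 - t / norm) * u
    return out


# Hypothetical example: six gene weights in three groups of two.
w = np.array([3.0, -0.5, 0.2, 0.1, 4.0, 0.0])
groups = [[0, 1], [2, 3], [4, 5]]
shrunk = sgl_prox(w, groups, lam1=0.5, lam2=0.5)
selected = np.flatnonzero(shrunk).tolist()
print(selected)  # indices of genes with nonzero weight after shrinkage
```

In a full pipeline of the kind the abstract describes, a step like this would sit between the Kruskal-Wallis pre-filter and the support vector machine, with the surviving indices used to subset the expression matrix before classification.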
