Article

Uncertainty measurement for a gene space based on class-consistent technology: an application in gene selection

Journal

APPLIED INTELLIGENCE
Volume 53, Issue 5, Pages 5416-5436

Publisher

SPRINGER
DOI: 10.1007/s10489-022-03657-3

Keywords

Gene space; Gene selection; Class-consistent technology; Information granularity; Information entropy

This paper studies uncertainty measurement for a gene space based on class-consistent technology and discusses its application to gene selection from the perspective of granular computing (GrC). A class-consistent relation between cells in a gene space is established, and information granules are obtained from it. Two metrics for the uncertainty of a gene space are defined, and their effectiveness is verified through numerical experiments and statistical tests. Two gene selection algorithms are then proposed and shown to outperform state-of-the-art feature selection algorithms in clustering experiments.
With the development of data mining, artificial intelligence, neural networks, expert systems and machine learning, information systems (i-systems) have become increasingly important. If the objects, attributes and information values in an i-system are replaced by cells, genes and gene expression values, respectively, then the i-system is called a gene space. Because gene expression data are characterized by small samples, high dimensionality and noise, a gene space carries considerable uncertainty. Traditional machine learning and statistical methods often handle a gene space poorly. Granular computing (GrC) can deal effectively with various kinds of uncertainty. This paper studies uncertainty measurement for a gene space based on class-consistent technology and discusses its application to gene selection from the perspective of GrC. First, a class-consistent relation between cells in a gene space is established from the cells' gene expression values. Then, information granules (i-granules) are obtained from the gene space by using this relation. Next, two metrics for the uncertainty of a gene space, information granularity and information entropy, are defined and their properties are investigated. Numerical experiments and statistical tests verify their effectiveness. As an application, two gene selection algorithms are proposed. Finally, clustering experiments and statistical tests on 16 gene spaces show that the proposed gene selection algorithms outperform several state-of-the-art feature selection algorithms on three clustering performance indicators.
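
The abstract does not give the definition of the class-consistent relation or the exact formulas for the two metrics, so the following is only a minimal sketch of the general idea. It assumes a stand-in relation (two cells are related when their expression values fall into the same fixed-width bins on every gene) and the granularity and entropy forms commonly used in the GrC literature, G = (1/n^2) * sum |R(x)| and H = -(1/n) * sum log2(|R(x)| / n). The function names, the relation, and the toy data are all hypothetical and may differ from what the paper actually uses.

```python
import math


def granules(cells, related):
    """For each cell, collect the indices of all cells related to it."""
    n = len(cells)
    return [{j for j in range(n) if related(cells[i], cells[j])}
            for i in range(n)]


def information_granularity(grans):
    """Assumed form: G = (1/n^2) * sum_i |R(x_i)|. Larger means coarser."""
    n = len(grans)
    return sum(len(g) for g in grans) / (n * n)


def information_entropy(grans):
    """Assumed form: H = -(1/n) * sum_i log2(|R(x_i)| / n). Larger means finer."""
    n = len(grans)
    return -sum(math.log2(len(g) / n) for g in grans) / n


def same_discretized_profile(a, b, width=1.0):
    """Stand-in relation (NOT the paper's class-consistent relation):
    two cells are related if every gene falls into the same bin."""
    return all(math.floor(x / width) == math.floor(y / width)
               for x, y in zip(a, b))


# Toy gene space: 4 cells (rows) measured on 2 genes (columns).
cells = [(0.2, 1.1), (0.7, 1.9), (1.4, 1.3), (1.6, 1.8)]
grans = granules(cells, same_discretized_profile)

print(grans)
print(information_granularity(grans))
print(information_entropy(grans))
```

Under these assumptions the two measures move in opposite directions: a finer set of granules lowers the granularity and raises the entropy, which is the kind of monotonic behavior an uncertainty measure for gene selection would be expected to show.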

