4.7 Article

Row and Column Structure-Based Biclustering for Gene Expression Data

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2020.3022085

Keywords

Gene expression; Complexity theory; Clustering algorithms; Greedy algorithms; Clustering methods; Computational biology; Bioinformatics; Biclustering; checkerboard pattern; row and column selection

Funding

  1. Hong Kong Research Grants Council [C1007-15G, 11200818]

This paper proposes a biclustering method called RCSBC, which aims to find checkerboard patterns within gene expression data. By exploiting the relationship between the row/column structure of a gene expression matrix and the structure of biclusters, the method achieves low time and space complexity and outperforms existing algorithms in terms of clustering accuracy and time/space complexity.
With the development of high-throughput technologies for gene analysis, biclustering methods have attracted much attention. However, existing methods suffer from high time and space complexity. This paper proposes a biclustering method, called Row and Column Structure-based Biclustering (RCSBC), that finds checkerboard patterns within microarray data with low time and space complexity. First, the paper describes the structure of a bicluster in terms of the structure of the rows and columns. Second, the paper chooses representative rows and columns with two algorithms. Finally, the gene expression data are biclustered on the space spanned by the representative rows and columns. To the best of our knowledge, this paper is the first to exploit the relationship between the row/column structure of a gene expression matrix and the structure of biclusters. Both synthetic datasets and real-life gene expression datasets are used to validate the effectiveness of the method. The experimental results show that RCSBC outperforms state-of-the-art algorithms in both clustering accuracy and time/space complexity. This study offers new insights into biclustering large-scale gene expression data without loading the whole dataset into memory.
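
The three steps in the abstract can be illustrated with a small toy sketch. It is not the RCSBC algorithm: the abstract does not specify the two selection algorithms, so greedy farthest-point selection is swapped in as an assumption, and every function name below (farthest_point_indices, assign, checkerboard_bicluster, make_checkerboard) is hypothetical. The sketch only shows the general idea of labelling rows by comparing them on a few representative columns, and columns on a few representative rows.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def farthest_point_indices(vectors, k):
    """Pick k representatives by greedy farthest-point selection.

    Assumption: this is a stand-in for the paper's selection algorithms,
    which the abstract does not describe.
    """
    chosen = [0]
    while len(chosen) < k:
        # Take the vector whose nearest chosen representative is farthest away.
        best = max(
            range(len(vectors)),
            key=lambda i: min(sq_dist(vectors[i], vectors[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


def assign(vectors, reps, coords):
    """Label each vector by its nearest representative, using only `coords`."""
    rep_points = [[vectors[r][j] for j in coords] for r in reps]
    labels = []
    for v in vectors:
        point = [v[j] for j in coords]
        labels.append(
            min(range(len(reps)), key=lambda r: sq_dist(point, rep_points[r]))
        )
    return labels


def checkerboard_bicluster(matrix, n_row_groups, n_col_groups):
    """Return (row_labels, col_labels) for a checkerboard-like matrix."""
    columns = [list(c) for c in zip(*matrix)]
    rep_rows = farthest_point_indices(matrix, n_row_groups)
    rep_cols = farthest_point_indices(columns, n_col_groups)
    # Rows are compared only on the representative columns, and vice versa.
    row_labels = assign(matrix, rep_rows, rep_cols)
    col_labels = assign(columns, rep_cols, rep_rows)
    return row_labels, col_labels


def make_checkerboard(row_sizes, col_sizes, block_means, noise=0.1, seed=0):
    """Build a synthetic matrix whose blocks have the given mean values."""
    rng = random.Random(seed)
    matrix = []
    for r, n_rows in enumerate(row_sizes):
        for _ in range(n_rows):
            row = []
            for c, n_cols in enumerate(col_sizes):
                row.extend(
                    block_means[r][c] + rng.gauss(0, noise) for _ in range(n_cols)
                )
            matrix.append(row)
    return matrix


# Toy data: 2 row groups (3 and 4 rows) by 2 column groups (2 and 3 columns).
matrix = make_checkerboard([3, 4], [2, 3], [[0.0, 5.0], [5.0, 0.0]])
row_labels, col_labels = checkerboard_bicluster(matrix, 2, 2)
print(row_labels, col_labels)
```

In this sketch the assignment step reads each row only at the representative columns (and each column only at the representative rows), which hints at why working in such a reduced space can save time and memory. The selection step here still scans the full matrix, so the sketch does not reproduce the memory behaviour claimed for RCSBC.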
