4.7 Article

Protein fold recognition using geometric kernel data fusion

Journal

BIOINFORMATICS
Volume 30, Issue 13, Pages 1850-1857

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btu118

Funding

  1. EU [COST Action BM1006]
  2. MIUR [2002014121]
  3. Research Council KU Leuven [OT/11/055, CoE EF/05/006]
  4. Fund for Scientific Research-Flanders (Belgium) [G034212N]
  5. Interuniversity Attraction Poles Programme
  6. [KUL GOA/10/09 MaNet]
  7. [KUL PFV/10/016 SymBioSys]
  8. [KUL IOF 3M120274 Immunosuppressive drugs]

Motivation: Various approaches based on features extracted from protein sequences, often combined with machine learning methods, have been used to predict protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a convex linear combination of base kernels. Besides being limited to linear combinations, such approaches can lose potentially useful information.

Results: We design several techniques to combine kernel matrices by taking more involved, geometry-inspired means of these matrices instead of convex linear combinations. We consider various sequence-based protein features, including information extracted directly from position-specific scoring matrices and local sequence alignment. We evaluate our methods for classification on the SCOP PDB-40D benchmark dataset for protein fold recognition. The best overall accuracy obtained by our methods on the protein fold recognition test set is approximately 86.7%. This is an improvement over the results of the best existing approach. Moreover, our computational model incorporates the functional domain composition of proteins through a hybridization model. With the proposed hybridization model, the protein fold recognition accuracy improves further to 89.30%. Furthermore, we investigate the performance of our approach on the protein remote homology detection problem by fusing multiple string kernels.

Availability and implementation: The MATLAB code used for our proposed geometric kernel fusion frameworks is publicly available at http://people.cs.kuleuven.be/~raf.vandebril/homepage/software/geomean.php?menu=5/
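To make the contrast between a convex linear combination and a geometry-inspired mean concrete, here is a minimal NumPy sketch. It is an illustration only and not the authors' MATLAB code: the abstract does not say which means are used, so the two-matrix geometric mean and the log-Euclidean mean below are assumed as standard examples, and all function names are hypothetical. The inputs are assumed to be symmetric positive-definite kernel matrices.

```python
import numpy as np


def _spd_fun(mat, fun):
    """Apply a scalar function to a symmetric positive-definite matrix
    through its eigendecomposition (assumption: mat is SPD)."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * fun(eigvals)) @ eigvecs.T


def convex_combination(kernels, weights=None):
    """Baseline: weighted arithmetic mean of the kernel matrices."""
    kernels = np.stack(kernels)
    if weights is None:
        weights = np.full(len(kernels), 1.0 / len(kernels))
    return np.tensordot(weights, kernels, axes=1)


def geometric_mean_pair(a, b):
    """Geometric mean of two SPD matrices:
    A^(1/2) (A^(-1/2) B A^(-1/2))^(1/2) A^(1/2)."""
    a_half = _spd_fun(a, np.sqrt)
    a_inv_half = _spd_fun(a, lambda w: 1.0 / np.sqrt(w))
    middle = a_inv_half @ b @ a_inv_half
    middle = (middle + middle.T) / 2.0  # remove round-off asymmetry
    return a_half @ _spd_fun(middle, np.sqrt) @ a_half


def log_euclidean_mean(kernels):
    """Log-Euclidean mean: exp of the average of the matrix logarithms."""
    logs = [_spd_fun(k, np.log) for k in kernels]
    return _spd_fun(np.mean(logs, axis=0), np.exp)


# Toy example with two random, well-conditioned SPD "kernel" matrices.
rng = np.random.default_rng(0)
m1, m2 = rng.normal(size=(2, 4, 4))
k1 = m1 @ m1.T + np.eye(4)
k2 = m2 @ m2.T + np.eye(4)

print(convex_combination([k1, k2]))
print(geometric_mean_pair(k1, k2))
print(log_euclidean_mean([k1, k2]))
```

Any of the fused matrices could then be passed to a kernel classifier that accepts a precomputed kernel. In practice, kernel matrices are often only positive semi-definite, so a small ridge on the diagonal may be needed before taking inverses or logarithms; that step is also an assumption and not something stated in the abstract.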
