Article

A method for handling metabonomics data from liquid chromatography/mass spectrometry: combinational use of support vector machine recursive feature elimination, genetic algorithm and random forest for feature selection

Journal

METABOLOMICS
Volume 7, Issue 4, Pages 549-558

Publisher

SPRINGER
DOI: 10.1007/s11306-011-0274-7

Keywords

Support vector machine; Genetic algorithm; Random forest; Liver diseases; Metabonomics; Metabolomics

Funding

  1. State Key Science & Technology Project for Infectious Diseases [2008ZX10002-019, 2008ZX10002-017]
  2. National Basic Research Program of China [2007CB914701]
  3. State Ministry of Science & Technology of China [2006038079037]
  4. National Natural Science Foundation of China [20835006, 90713032]

Metabolic markers are the core of metabonomic surveys. Hence, the selection of differential metabolites is of great importance for both biological and clinical purposes. Here, a feature selection method was developed for complex metabonomic data sets. As an effective tool for metabonomics data analysis, the support vector machine (SVM) was employed as the basic classifier. To find meaningful features effectively, support vector machine recursive feature elimination (SVM-RFE) was applied first. Then, a genetic algorithm (GA) and random forest (RF), which consider the interactions among the metabolites and the independent performance of each metabolite in all samples, respectively, were used to obtain more informative metabolic differences and to reduce the risk of false positives. A data set from a plasma metabonomics study of rat liver diseases progressing from hepatitis through cirrhosis to hepatocellular carcinoma was used to validate the method. Besides good classification results for the three kinds of liver disease, 31 important metabolites, including lysophosphatidylethanolamine (LPE) C16:0, palmitoylcarnitine, and lysophosphatidylethanolamine (LPE) C18:0, were selected for further study. The results show that the three feature selection methods complement one another. The combined method also identified more differential metabolites and provided more metabolic information for a global understanding of the diseases than any single method. Furthermore, this method is also suitable for other complex biological data sets.
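The three-stage selection described in the abstract can be illustrated with a minimal Python sketch using scikit-learn. This is not the authors' implementation: the synthetic data, the feature counts (40 candidates, 15 RF features), the small hand-written genetic algorithm, its parameters, and the use of a set union to combine the GA and RF results are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for an LC/MS peak table (assumption): 90 samples,
# 200 features, 3 classes standing in for the three disease stages.
X, y = make_classification(n_samples=90, n_features=200, n_informative=10,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=0)

# Stage 1: SVM-RFE with a linear SVM prunes the features to 40 candidates.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=40, step=0.1).fit(X, y)
candidates = np.flatnonzero(rfe.support_)


def fitness(mask):
    """Cross-validated accuracy of a linear SVM on the masked candidates."""
    if not mask.any():
        return 0.0
    cols = candidates[mask]
    return cross_val_score(SVC(kernel="linear"), X[:, cols], y, cv=3).mean()


def run_ga(n_genes, pop_size=12, generations=5, p_mut=0.05):
    """Toy genetic algorithm: each individual is a boolean feature mask."""
    pop = rng.random((pop_size, n_genes)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        # Keep the fitter half of the population as parents.
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.choice(len(parents), 2, replace=False)]
            cut = rng.integers(1, n_genes)             # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_genes) < p_mut       # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]


# Stage 2a: the GA scores feature subsets jointly, so it can reward
# features that are only useful in combination with others.
ga_selected = candidates[run_ga(len(candidates))]

# Stage 2b: random forest importances rank each candidate individually.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X[:, candidates], y)
rf_top = candidates[np.argsort(rf.feature_importances_)[::-1][:15]]

# Combine both views (the union is an assumption, not the paper's rule).
selected = np.union1d(ga_selected, rf_top)
print(f"{len(candidates)} candidates -> {len(selected)} selected features")
```

Using the same linear SVM inside both the RFE step and the GA fitness function mirrors the abstract's statement that SVM serves as the basic classifier; on real data the fitness evaluation would normally use an outer validation split to avoid optimistic estimates.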
