Article

Normalization and Statistical Analysis of Quantitative Proteomics Data Generated by Metabolic Labeling

Journal

MOLECULAR & CELLULAR PROTEOMICS
Volume 8, Issue 10, Pages 2227-2242

Publisher

AMER SOC BIOCHEMISTRY MOLECULAR BIOLOGY INC
DOI: 10.1074/mcp.M800462-MCP200


Funding

  1. Australian Government Systemic Infrastructure Initiative
  2. Major National Research Facilities Program
  3. UNSW Capital Grants Scheme


Comparative proteomics is a powerful analytical method for learning about the responses of biological systems to changes in growth parameters. To make confident inferences about biological responses, proteomics approaches must incorporate appropriate statistical measures of quantitative data. In the present work we applied microarray-based normalization and statistical analysis (significance testing) methods to analyze quantitative proteomics data generated from the metabolic labeling of a marine bacterium (Sphingopyxis alaskensis). Quantitative data were generated for 1,172 proteins, representing 1,736 high confidence protein identifications (54% genome coverage). To test approaches for normalization, cells were grown at a single temperature, metabolically labeled with ¹⁴N or ¹⁵N, and combined in different ratios to give an artificially skewed data set. Inspection of ratio versus average (MA) plots showed that a fixed value median normalization was most suitable for the data. To determine an appropriate statistical method for assessing differential abundance, a fold change approach, Student's t test, unmoderated t test, and empirical Bayes moderated t test were applied to proteomics data from cells grown at two temperatures. Inverse metabolic labeling was used with multiple technical and biological replicates, and proteomics was performed on cells that were combined based on equal optical density of cultures (providing skewed data) or on cell extracts that were combined to give equal amounts of protein (no skew). To account for arbitrarily complex experiment-specific parameters, a linear modeling approach was used to analyze the data using the limma package in R/Bioconductor. A high quality list of statistically significant differentially abundant proteins was obtained by using lowess normalization (after inspection of MA plots) and applying the empirical Bayes moderated t test. The approach also effectively controlled for the number of false discoveries and corrected for the multiple testing problem using the Storey-Tibshirani false discovery rate (Storey, J. D., and Tibshirani, R. (2003) Statistical significance for genome-wide studies. Proc. Natl. Acad. Sci. U. S. A. 100, 9440-9445). The approach we have developed is generally applicable to quantitative proteomics analyses of diverse biological systems. Molecular & Cellular Proteomics 8: 2227-2242, 2009.
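
To illustrate the fixed value median normalization and the MA (ratio versus average) values mentioned in the abstract, here is a minimal Python sketch. It is not the authors' code: the paper reports using the limma package in R/Bioconductor, and the function names (`ma_values`, `median_normalize`) and the intensities below are hypothetical, chosen only to mimic a sample mixed at roughly 2:1.

```python
import math
from statistics import median

def ma_values(light, heavy):
    """Return (M, A) per protein.

    M = log2(light / heavy) is the log ratio;
    A = mean of the two log2 intensities is the average abundance.
    """
    m = [math.log2(l / h) for l, h in zip(light, heavy)]
    a = [0.5 * (math.log2(l) + math.log2(h)) for l, h in zip(light, heavy)]
    return m, a

def median_normalize(m):
    """Subtract one fixed value (the median log ratio) from every M value."""
    shift = median(m)
    return [x - shift for x in m]

# Hypothetical intensities (assumption, not data from the paper):
# five proteins from a mixture skewed about 2:1, one of them at 4:1.
light = [200.0, 410.0, 80.0, 1600.0, 64.0]
heavy = [100.0, 200.0, 40.0, 800.0, 16.0]

m, a = ma_values(light, heavy)
m_norm = median_normalize(m)

for a_i, m_i, mn_i in zip(a, m, m_norm):
    print(f"A={a_i:6.2f}  M={m_i:5.2f}  M_normalized={mn_i:5.2f}")
```

After normalization the bulk of the log ratios sit at zero, so the protein that was mixed at 4:1 stands out with a normalized log ratio of about 1. A single fixed shift like this only suits data whose skew does not depend on intensity; when the MA plot shows an intensity-dependent trend, a curve-based method such as lowess is the usual alternative.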
