Article

Enhanced whole genome sequence and annotation of Clostridium stercorarium DSM8532T using RNA-seq transcriptomics and high-throughput proteomics

Journal

BMC GENOMICS
Volume 15

Publisher

BMC
DOI: 10.1186/1471-2164-15-567

Keywords

Genome; Proteome; Transcriptome; RNA-seq; Tandem mass spectrometry; Proteogenomics; Glycolysis; Pentose phosphate pathway; Transaldolase

Funding

  1. Genome Canada, of MGCB2: Microbial Genomics for Biofuels and Co-products from Biorefining Processes
  2. EU-FP7 COST Action SeqAhead [EC] [BM1006]
  3. Austrian Centre of Industrial Biotechnology (ACIB) - FFG
  4. BMWFJ
  5. BMVIT
  6. ZIT
  7. SFG
  8. Zukunftsstiftung Tirol within the Austrian COMET program [FFG] [824186]


Background: Growing interest in cellulolytic clostridia with potential for consolidated biofuels production is mitigated by low conversion of raw substrates to desired end products. Strategies to improve conversion are likely to benefit from emerging techniques to define molecular systems biology of these organisms. Clostridium stercorarium DSM8532(T) is an anaerobic thermophile with demonstrated high ethanol production on cellulose and hemicellulose. Although several lignocellulolytic enzymes in this organism have been well-characterized, details concerning carbohydrate transporters and central metabolism have not been described. Therefore, the goal of this study is to define an improved whole genome sequence (WGS) for this organism using in-depth molecular profiling by RNA-seq transcriptomics and tandem mass spectrometry-based proteomics.
Results: A paired-end Roche/454 WGS assembly was closed through application of an in silico algorithm designed to resolve repetitive sequence regions, resulting in a circular replicon with one gap and a region of 2 kilobases with 10 ambiguous bases. RNA-seq transcriptomics resulted in nearly complete coverage of the genome, identifying errors in homopolymer length attributable to 454 sequencing. Peptide sequences resulting from high-throughput tandem mass spectrometry of trypsin-digested protein extracts were mapped to 1,755 annotated proteins (68% of all protein-coding regions). Proteogenomic analysis confirmed the quality of annotation and improvement pipelines, identifying a missing gene and an alternative reading frame. Peptide coverage of genes hypothetically involved in substrate hydrolysis, transport and utilization confirmed multiple pathways for glycolysis, pyruvate conversion and recycling of intermediates. No sequences homologous to transaldolase, a central enzyme in the pentose phosphate pathway, were observed by any method, despite demonstrated growth of this organism on xylose and xylan hemicellulose.
Conclusions: Complementary omics techniques confirm the quality of genome sequence assembly, annotation and error-reporting. Nearly complete genome coverage by RNA-seq likely indicates background DNA in RNA extracts; however, these preparations resulted in WGS enhancement and transcriptome profiling in a single Illumina run. The lack of transaldolase detection by any method, despite xylose utilization by this organism, indicates an alternative pathway for sedoheptulose-7-phosphate degradation. This report combines next-generation omics techniques to elucidate previously undefined features of substrate transport and central metabolism for this organism and its potential for consolidated biofuels production from lignocellulose.
