Journal
GENES
Volume 12, Issue 10
Publisher
MDPI
DOI: 10.3390/genes12101645
Keywords
functional annotation; containerization; pipeline; reproducibility
Funding
- ISCIII [PT 13/0001/0021]
- European Regional Development Fund (FEDER)
- Spanish Ministry of Science and Innovation (AEI/FEDER) [PGC2018-094017-B-I00]
- Spanish Ministry of Science and Innovation
- Centro de Excelencia Severo Ochoa
- CERCA Programme/Generalitat de Catalunya
Functional annotation is essential for enhancing the biological relevance of predicted features in genomic sequences, and improving gene structural annotation. The pipeline FA-nf, implemented in Nextflow, integrates various annotation approaches for producing multiple files and reports efficiently. It can be easily parallelized and deployed in a Linux computational environment, ensuring full reproducibility through software containerization.
Functional annotation adds biologically relevant information to predicted features in genomic sequences and is therefore an important step in any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the above-mentioned programs, and final annotation reports. The pipeline can easily be split into smaller processes for parallelization and, thanks to software containerization, easily deployed in a Linux computational environment, which helps to ensure full reproducibility.
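The idea of splitting the protein FASTA input into smaller units that can be annotated in parallel can be illustrated with a minimal Python sketch. This is not FA-nf code (the pipeline itself is written in Nextflow); the function names `parse_fasta` and `chunk_records` and the chunk size are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', concatenated sequence).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    seq_parts: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(seq_parts)
            header = line[1:]
            seq_parts = []
        elif header is not None:
            seq_parts.append(line)
    if header is not None:
        yield header, "".join(seq_parts)


def chunk_records(records: Iterable[Record], chunk_size: int) -> Iterator[List[Record]]:
    """Group records into lists of at most chunk_size entries."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    chunk: List[Record] = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


if __name__ == "__main__":
    # Hypothetical toy input; a real run would read a protein FASTA file.
    example = ">p1\nMKV\nLLA\n>p2\nMST\n>p3\nMAA\n"
    for i, chunk in enumerate(chunk_records(parse_fasta(example.splitlines()), 2)):
        print(i, [header for header, _ in chunk])
```

Each chunk could then be passed to an independent annotation job (for example a BLAST+ or InterProScan run), with the per-chunk results merged afterwards.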