The P10K database: a data portal for the protist 10 000 genomes project

Related references

Note: Only some of the references are listed.
Article Biochemistry & Molecular Biology

Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2023

Yongbiao Xue et al.

Summary: The National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB) provides database resources to support global academic and industrial communities. The NGDC constantly expands and updates core database resources by archiving big data, conducting integrative analysis, and providing value-added curation. New database resources have been developed for infectious diseases and microbiology, cancer-trait association, and tropical plants. Additionally, resources for the monkeypox virus and SARS-CoV-2 have been newly constructed and regularly updated. All resources and services are publicly accessible at https://ngdc.cncb.ac.cn.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

Stop or Not: Genome-Wide Profiling of Reassigned Stop Codons in Ciliates

Wenbing Chen et al.

Summary: By sequencing seven representative ciliate genomes, we discovered two previously undescribed genetic codes, highlighting the prevalence of bifunctional stop codons in ciliates. Evolutionary genomic analyses revealed that the gain or loss of reassigned stop codons in ciliates is influenced by their living environment, eukaryotic release factor 1, and suppressor tRNAs. This study provides new insights into the functional diversity and evolutionary history of stop codons in eukaryotic organisms.

MOLECULAR BIOLOGY AND EVOLUTION (2023)

Article Biochemistry & Molecular Biology

iGDP: An integrated genome decontamination pipeline for wild ciliated microeukaryotes

Chuanqi Jiang et al.

Summary: Researchers developed an integrated Genome Decontamination Pipeline (iGDP) to filter contaminated ciliate genome assemblies from wild specimens, resulting in high-quality ciliate genomes. iGDP showed good performance in filtering contaminants and can be applied to other microeukaryotes.

MOLECULAR ECOLOGY RESOURCES (2023)

Article Multidisciplinary Sciences

Nontriplet feature of genetic code in Euplotes ciliates is a result of neutral evolution

Sofya A. Gaydukova et al.

Summary: This study investigates the phenomenon of ribosomal frameshifting in Euplotes ciliates, which involves frequent stop codons at internal mRNA positions. The authors sequenced transcriptomes of Euplotes species and found that frameshift sites are accumulating rapidly through genetic drift. They also discovered that frameshift sites do not significantly impact the fitness and survival of Euplotes. These findings suggest that the violation of the triplet character of the genetic code can be introduced and maintained solely by neutral evolution.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2023)

Article Biochemical Research Methods

Codetta: predicting the genetic code from nucleotide sequence

Yekaterina Shulgina et al.

Summary: Codetta is a Python program that predicts the genetic code table of an organism from nucleotide sequences. It can analyze any nucleotide sequence without the need for sequence annotation or taxonomic placement. The most likely amino acid decoding for each codon is inferred from alignments of conserved proteins to the input sequence through profile hidden Markov models.

BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center

Beatrice Amos et al.

Summary: The VEuPathDB project is a bioinformatics resource center funded by the National Institutes of Health that supports over 500 organisms, including invertebrate vectors, eukaryotic pathogens, and hosts. It integrates over 1700 pre-analyzed datasets and provides advanced search capabilities, visualizations, and analysis tools, with standardized workflows that give researchers access to omics data and bioinformatic analyses.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemical Research Methods

Sensitive protein alignments at tree-of-life scale using DIAMOND

Benjamin Buchfink et al.

Summary: We are at the beginning of a genomic revolution in which all known species are planned to be sequenced. The improved version of DIAMOND allows for fast protein alignments at tree-of-life scale.

NATURE METHODS (2021)

Letter Biochemistry & Molecular Biology

BUSCO Update: Novel and Streamlined Workflows along with Broader and Deeper Phylogenetic Coverage for Scoring of Eukaryotic, Prokaryotic, and Viral Genomes

Mose Manni et al.

Summary: The BUSCO software provides essential methods for assessing the quality of genomic and metagenomic data, offering new functionalities and improvements to streamline the process. It is capable of evaluating both eukaryotic and prokaryotic species, and can be used across various data types including genome assemblies, metagenomic bins, transcriptomes, and gene sets.

MOLECULAR BIOLOGY AND EVOLUTION (2021)

Article Genetics & Heredity

The Genome Sequence Archive Family: Toward Explosive Data Growth and Diverse Data Types

Tingting Chen et al.

Summary: The Genome Sequence Archive (GSA) family of resources, comprising GSA, GSA-Human, and OMIX, provides data storage and sharing services for scientific communities worldwide. GSA has an updated data model and online functionalities, GSA-Human stores human genetics-related data, and OMIX provides an open archive for miscellaneous data. Together, these resources accept global data submissions and provide free open access for research worldwide.

GENOMICS PROTEOMICS & BIOINFORMATICS (2021)

Article Genetics & Heredity

Genome Warehouse: A Public Repository Housing Genome-scale Data

Meili Chen et al.

Summary: The Genome Warehouse (GWH) is a public repository that houses genome assembly data for various species and offers web services for data submission, storage, release, and sharing. With a uniform quality control procedure, GWH accepts full and partial genome sequences and visualizes released data using JBrowse, serving as an important resource for global research activities.

GENOMICS PROTEOMICS & BIOINFORMATICS (2021)

Article Microbiology

Bacteria-Derived Hemolysis-Related Genes Widely Exist in Scuticociliates

Jing Zhang et al.

MICROORGANISMS (2020)

Review Multidisciplinary Sciences

Dog10K: an international sequencing effort to advance studies of canine domestication, phenotypes and health

Elaine A. Ostrander et al.

NATIONAL SCIENCE REVIEW (2019)

Article Biochemistry & Molecular Biology

Hidden genomic evolution in a morphospecies - The landscape of rapidly evolving genes in Tetrahymena

Jie Xiong et al.

PLOS BIOLOGY (2019)

Article Biochemical Research Methods

fastp: an ultra-fast all-in-one FASTQ preprocessor

Shifu Chen et al.

BIOINFORMATICS (2018)

Editorial Material Biology

10KP: A phylodiverse genome sequencing plan

Shifeng Cheng et al.

GIGASCIENCE (2018)

Editorial Material Microbiology

Earth Microbiome Project and Global Systems Biology

Jack A. Gilbert et al.

MSYSTEMS (2018)

Article Biochemistry & Molecular Biology

Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation

Sergey Koren et al.

GENOME RESEARCH (2017)

Letter Biotechnology & Applied Microbiology

MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets

Martin Steinegger et al.

NATURE BIOTECHNOLOGY (2017)

Article Biochemistry & Molecular Biology

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

Nuala A. O'Leary et al.

NUCLEIC ACIDS RESEARCH (2016)

Article Biochemistry & Molecular Biology

Genetic Codes with No Dedicated Stop Codon: Context-Dependent Translation Termination

Estienne Carl Swart et al.

Letter Multidisciplinary Sciences

Bird sequencing project takes off

Guojie Zhang

NATURE (2015)

Article Biochemical Research Methods

InterProScan 5: genome-scale protein function classification

Philip Jones et al.

BIOINFORMATICS (2014)

Article Biochemical Research Methods

nhmmer: DNA homology search with profile HMMs

Travis J. Wheeler et al.

BIOINFORMATICS (2013)

Article Biochemistry & Molecular Biology

The SILVA ribosomal RNA gene database project: improved data processing and web-based tools

Christian Quast et al.

NUCLEIC ACIDS RESEARCH (2013)

Article Biochemistry & Molecular Biology

The NCBI Taxonomy database

Scott Federhen

NUCLEIC ACIDS RESEARCH (2012)

Article Mathematical & Computational Biology

Tetrahymena genome database Wiki: a community-maintained model organism database

Nicholas A. Stover et al.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2012)

Article Biochemical Research Methods

FACIL: Fast and Accurate Genetic Code Inference and Logo

Bas E. Dutilh et al.

BIOINFORMATICS (2011)

Article Biotechnology & Applied Microbiology

Full-length transcriptome assembly from RNA-Seq data without a reference genome

Manfred G. Grabherr et al.

NATURE BIOTECHNOLOGY (2011)

Article Biochemical Research Methods

BLAST+: architecture and applications

Christiam Camacho et al.

BMC BIOINFORMATICS (2009)

Article Biochemical Research Methods

Using native and syntenically mapped cDNA alignments to improve de novo gene finding

Mario Stanke et al.

BIOINFORMATICS (2008)

Article Computer Science, Information Systems

Engineering a software tool for gene structure prediction in higher organisms

G Gremme et al.

INFORMATION AND SOFTWARE TECHNOLOGY (2005)

Article Biochemical Research Methods

TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders

WH Majoros et al.

BIOINFORMATICS (2004)

Article Biochemical Research Methods

Gene finding in novel genomes

I Korf

BMC BIOINFORMATICS (2004)

Article Biochemistry & Molecular Biology

Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies

BJ Haas et al.

NUCLEIC ACIDS RESEARCH (2003)