4.5 Review

Scalable approaches for functional analyses of whole-genome sequencing non-coding variants

Related references

Note: Only a subset of the references is listed here; see the original article for the complete reference list.
Article Genetics & Heredity

StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants

Andrew G. Sharo et al.

Summary: This article introduces StrVCTVRE, a method for distinguishing pathogenic from benign structural variants (SVs). By integrating multiple features and training on rare variants, the method can filter out about half of the candidate SVs while maintaining high sensitivity. It supports further investigation of unresolved cases and the discovery of new disease mechanisms.

AMERICAN JOURNAL OF HUMAN GENETICS (2022)

Article Biochemical Research Methods

eSCAN: scan regulatory regions for aggregate association testing using whole-genome sequencing data

Yingxi Yang et al.

Summary: This study proposes eSCAN, a method for genome-wide assessment of enhancer regions in sequencing studies. It combines the dynamic window selection of the SCANG method with putative regulatory regions drawn from annotation. By searching within putative enhancers, eSCAN increases statistical power and aids mechanistic interpretation.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemistry & Molecular Biology

A framework to score the effects of structural variants in health and disease

Philip Kleinert et al.

Summary: In this study, the authors introduce CADD-SV, a method that uses a wide set of annotations to predict the effects of structural variants (SVs). They overcome previous limitations of supervised learning approaches by using a surrogate training objective and show that CADD-SV can effectively predict pathogenic and rare population variants.

GENOME RESEARCH (2022)

Article Biochemistry & Molecular Biology

RefSeq Functional Elements as experimentally assayed nongenic reference standards and functional interactions in human and mouse

Catherine M. Farrell et al.

Summary: RefSeq Functional Elements (RefSeqFEs) are experimentally validated human and mouse nongenic elements provided by NCBI, offering rich functional details and transparent experimental evidence, with multiple uses for basic functional discovery, bioinformatics studies, and genetic variant interpretation.

GENOME RESEARCH (2022)

Article Neurosciences

Alzheimer's Disease Variant Portal: A Catalog of Genetic Findings for Alzheimer's Disease

Pavel P. Kuksa et al.

Summary: The Alzheimer's Disease Variant Portal (ADVP) is a comprehensive collection of AD genetic associations curated from over 200 GWAS publications, covering more than 900 loci, 1800 variants, 80 cohorts, and 8 populations. It provides investigators with seamless integration of genomic and publicly available functional annotations, facilitating further understanding and analyses of AD genetics findings. ADVP serves as a valuable resource for quick and systematic exploration of high-confidence AD genetic findings and offers insights into population-specific AD genetic architecture.

JOURNAL OF ALZHEIMERS DISEASE (2022)

Article Biochemistry & Molecular Biology

Prioritization of regulatory variants with tissue-specific function in the non-coding regions of human genome

Shengcheng Dong et al.

Summary: A computational tool called TURF is introduced to prioritize regulatory variants with tissue-specific function, showing overall top performance in prediction. TURF can generate prediction scores for non-coding variants based on functional genomics datasets and pick out regulatory variants with tissue-specific function from candidate lists. Additionally, GWAS traits exhibit enrichment of regulatory variants predicted by TURF scores in trait-relevant organs, suggesting their value for future studies.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

Priority index: database of genetic targets in immune-mediated disease

Hai Fang et al.

Summary: This study introduces a comprehensive and unique database 'Priority index' that provides prioritized genes encoding potential therapeutic targets for major immune-mediated diseases. Target genes receive a 5-star rating based on genomic, annotation, and network evidence, highlighting the importance of pathway crosstalk. The database facilitates cross-disease comparisons and aids in early-stage therapeutic target identification and validation leveraging human genetics.

NUCLEIC ACIDS RESEARCH (2022)

Article Biochemistry & Molecular Biology

VannoPortal: multiscale functional annotation of human genetic variants for interrogating molecular mechanism of traits and diseases

Dandan Huang et al.

Summary: VannoPortal is a comprehensive variant annotation database that integrates extensive genomic/epigenomic data and commonly used annotation databases, with rich features and visualization tools to provide comprehensive and context-specific variant annotations for biologists and clinicians.

NUCLEIC ACIDS RESEARCH (2022)

Article Oncology

Genome Nexus: A Comprehensive Resource for the Annotation and Interpretation of Genomic Variants in Cancer

Ino de Bruijn et al.

Summary: Genome Nexus is a user-friendly tool for variant annotation specifically designed for cancer research and clinical practice, addressing the issue of fragmented variant interpretation information across multiple databases, with high-performance annotation capabilities.

JCO CLINICAL CANCER INFORMATICS (2022)

Article Genetics & Heredity

A multi-dimensional integrative scoring framework for predicting functional variants in the human genome

Xihao Li et al.

Summary: MACIE is an unsupervised multivariate mixed-model framework that integrates diverse annotations to assess multi-dimensional functional roles of coding and non-coding variants. It demonstrates powerful performance in discriminating between functional and non-functional variants, and can be applied to fine-mapping and heritability enrichment analysis.

AMERICAN JOURNAL OF HUMAN GENETICS (2022)

Article Genetics & Heredity

New insights into the genetic etiology of Alzheimer's disease and related dementias

Celine Bellenguez et al.

Summary: By characterizing the genetic landscape of Alzheimer's disease and related dementias, new loci have been identified and a new genetic risk score associated with the risk of future Alzheimer's disease and dementia has been generated.

NATURE GENETICS (2022)

Article Neurosciences

Whole-genome sequencing reveals that variants in the Interleukin 18 Receptor Accessory Protein 3' UTR protect against ALS

Chen Eitan et al.

Summary: In this study, rare variant association analysis was performed on the untranslated regions of the genomes of amyotrophic lateral sclerosis (ALS) patients and non-ALS controls. The study identified genetic variants in the IL18RAP gene's 3' UTR that were significantly associated with a reduced risk of developing ALS. These variants reduce mRNA stability and dampen neurotoxicity in motor neurons, highlighting the importance of studying noncoding genetic associations.

NATURE NEUROSCIENCE (2022)

Review Genetics & Heredity

Best practices for the interpretation and reporting of clinical whole genome sequencing

Christina A. Austin-Tse et al.

Summary: Whole genome sequencing (WGS) has the potential to become a first-tier diagnostic test for patients with rare genetic disorders. However, there is a lack of standards for defining and implementing the best test. To address this issue, the Medical Genome Initiative formed a consortium of experts to publish best practice recommendations and improve the quality of clinical WGS.

NPJ GENOMIC MEDICINE (2022)

Article Genetics & Heredity

FILER: a framework for harmonizing and querying large-scale functional genomics knowledge

Pavel P. Kuksa et al.

Summary: Querying and summarizing large-scale functional genomic and annotation data collections is a crucial step in genetic analysis. However, the heterogeneity and breadth of data sources and formats make this process difficult. FILER is a framework that provides streamlined access to harmonized genomic datasets, a scalable querying interface, and the ability to analyze users' own experimental data. This resource is highly scalable and facilitates reproducible research.

NAR GENOMICS AND BIOINFORMATICS (2022)

Article Biochemistry & Molecular Biology

Successful application of genome sequencing in a diagnostic setting: 1007 index cases from a clinically heterogeneous cohort

Aida M. Bertoli-Avella et al.

Summary: Despite the technical superiority of genome sequencing (GS) over other diagnostic methods, few studies have evaluated its advantages in clinical practice. This study analyzed 1007 consecutive cases in which GS was performed clinically, showing a high diagnostic yield and highlighting the value of GS for exome sequencing (ES)-negative cases, owing to its access to noncoding regions and more uniform coverage.

EUROPEAN JOURNAL OF HUMAN GENETICS (2021)

Article Biochemistry & Molecular Biology

Open Targets Genetics: systematic identification of trait-associated genes using large-scale genetics and functional genomics

Maya Ghoussaini et al.

Summary: Open Targets Genetics is an open-access integrative resource that aggregates human GWAS and functional genomics data to make connections between GWAS-associated loci and likely causal genes. Users can search, prioritize, and explore GWAS signals through data visualizations provided by the portal.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemistry & Molecular Biology

VARAdb: a comprehensive variation annotation database for human

Qi Pan et al.

Summary: The study introduces a comprehensive variation annotation database for human, which offers a wide range of variation information and related annotation details, aiding in selecting potential functional variations and interpreting their effects on human diseases and biological processes.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

Combining artificial intelligence: deep learning with Hi-C data to predict the functional effects of non-coding variants

Xiang-He Meng et al.

Summary: This study combined a novel deep learning model with Hi-C data to predict the functional impact of non-coding variants on chromatin interaction. The model effectively classified and predicted chromatin interactions, and the predicted causal SNPs were more likely to be identified by GWAS and eQTL analyses. Integrating artificial intelligence with experimental evidence of chromatin interaction expedites the prioritization of functional variants in disease-related loci, facilitating the discovery of biological mechanisms underlying genomic study associations.

BIOINFORMATICS (2021)

Article Neurosciences

Large mosaic copy number variations confer autism risk

Maxwell A. Sherman et al.

Summary: This study investigated the contribution of mosaic copy number variants (mCNVs) to the risk of autism spectrum disorder (ASD). Probands with ASD carried a significant burden of mCNVs, particularly large mCNVs, compared with their unaffected siblings. Additionally, the severity of ASD symptoms correlated positively with the size of mCNVs, and no mosaic analogues of common short de novo CNVs associated with ASD were observed.

NATURE NEUROSCIENCE (2021)

Article Biochemistry & Molecular Biology

LincSNP 3.0: an updated database for linking functional variants to human long non-coding RNAs, circular RNAs and their regulatory elements

Yue Gao et al.

Summary: LincSNP 3.0 is an updated comprehensive database that focuses on documenting and annotating disease or phenotype-associated variants in human long non-coding RNAs and circular RNAs, with expanded types of variants and regulatory elements, as well as identified associations among them.

NUCLEIC ACIDS RESEARCH (2021)

Article Biochemical Research Methods

A semisupervised model to predict regulatory effects of genetic variants at single nucleotide resolution using massively parallel reporter assays

Zikun Yang et al.

Summary: The study introduces a presence-only model to predict regulatory effects of genetic variants, showing better performance with experimental data and aiding in prioritizing functional variants. Particularly for the costimulatory locus associated with autoimmune diseases, evidence is presented of regulatory and coding variants acting together to increase disease risk.

BIOINFORMATICS (2021)

Article Oncology

ICGC-ARGO precision medicine: familial matters in pancreatic cancer

Michele Milella et al.

LANCET ONCOLOGY (2021)

Article Multidisciplinary Sciences

Regulatory genomic circuitry of human disease loci by integrative epigenomics

Carles A. Boix et al.

Summary: Annotating the molecular basis of human disease using EpiMap, a compendium of 10,000 epigenomic maps, has revealed the importance of dense, rich, high-resolution epigenomic annotations for investigating complex traits. The study used EpiMap to annotate genetic loci associated with traits and predict trait-relevant tissues and candidate target genes, showing extensive pleiotropy in top-scoring loci.

NATURE (2021)

Article Multidisciplinary Sciences

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

Daniel Taliun et al.

Summary: The TOPMed program aims to study the genetic architecture and biology of heart, lung, blood, and sleep disorders to improve diagnosis, treatment, and prevention of these diseases. Resources include a variant browser, genotype imputation server, and genomic and phenotypic data available through dbGaP. The study detected a large number of rare genetic variants, providing insights into mutation processes and recent human evolutionary history.

NATURE (2021)

Article Multidisciplinary Sciences

Prioritizing non-coding regions based on human genomic constraint and sequence context with deep learning

Dimitrios Vitsios et al.

Summary: The study integrates intolerance to variation, functional genomic annotations, and primary genomic sequences to build a comprehensive deep learning model JARVIS and genome-wide residual variation intolerance score (gwRVIS) for prioritizing non-coding regions. Both JARVIS and gwRVIS capture previously inaccessible human-lineage constraint information and enhance understanding of the non-coding genome.

NATURE COMMUNICATIONS (2021)

Article Genetics & Heredity

CADD-Splice: improving genome-wide variant effect prediction using deep learning-derived splice scores

Philipp Rentzsch et al.

Summary: The study compared several machine learning methods that score variant effects on splicing using an experimental dataset, integrating the best methods into general variant effect prediction models and evaluating the impact on the classification of known pathogenic variants. The inclusion of splicing DNN effect scores substantially improved predictions across multiple variant categories in the new CADD-Splice model, without compromising overall performance.

GENOME MEDICINE (2021)

Article Clinical Neurology

Whole-genome sequencing reveals new Alzheimer's disease-associated rare variants in loci related to synaptic function and neuronal development

Dmitry Prokopenko et al.

Summary: Thirteen new candidate loci for AD were identified through rare variant analysis, highlighting synaptic function as a potential novel pathway for the disease.

ALZHEIMERS & DEMENTIA (2021)

Review Neurosciences

Beyond association: successes and challenges in linking non-coding genetic variation to functional consequences that modulate Alzheimer's disease risk

Gloriia Novikova et al.

Summary: Researchers are working to identify the functional effects of non-coding genetic variants associated with Alzheimer's disease risk, in order to gain a deeper understanding of disease pathogenesis and develop effective treatments.

MOLECULAR NEURODEGENERATION (2021)

Article Biochemical Research Methods

Openness weighted association studies: leveraging personal genome information to prioritize non-coding variants

Shuang Song et al.

Summary: OWAS is a computational approach that leverages and aggregates predictions of chromatin accessibility to prioritize GWAS signals. In simulations and data analysis, OWAS identifies genes/segments more accurately than existing methods, with higher replication rates, and shows tissue-specific patterns.

BIOINFORMATICS (2021)

Article Biochemical Research Methods

A semi-supervised deep learning approach for predicting the functional effects of genomic non-coding variations

Hao Jia et al.

Summary: Understanding the functional effects of non-coding variants is crucial in studying gene-expression regulation and disease development. A novel semi-supervised deep learning model with pseudo labeling has been proposed to improve predictive performance, especially in dealing with limited datasets.

BMC BIOINFORMATICS (2021)

Article Genetics & Heredity

X-CNV: genome-wide prediction of the pathogenicity of copy number variations

Li Zhang et al.

Summary: The research team developed the X-CNV computational framework, integrating multiple informative features to predict the pathogenicity of CNVs, outperforming other tools. They also proposed an MVP score to quantitatively measure the pathogenic effect of CNVs, demonstrating high discriminative power.

GENOME MEDICINE (2021)

Review Multidisciplinary Sciences

Genome-wide association studies

Emil Uffelmann et al.

Summary: Genome-wide association studies (GWAS) involve testing hundreds of thousands of genetic variants to identify those associated with specific traits or diseases, with the number of associated variants expected to increase as sample sizes grow. The results of GWAS have various applications, including understanding the underlying biology of phenotypes, estimating heritability, predicting clinical risks, guiding drug development programs, and inferring causal relationships between risk factors and health outcomes.

NATURE REVIEWS METHODS PRIMERS (2021)

Review Genetics & Heredity

Structural variation in the sequencing era

Steve S. Ho et al.

NATURE REVIEWS GENETICS (2020)

Article Biochemistry & Molecular Biology

ClinVar: improvements to accessing data

Melissa J. Landrum et al.

NUCLEIC ACIDS RESEARCH (2020)

Article Multidisciplinary Sciences

Combined burden and functional impact tests for cancer driver discovery using DriverPower

Shimin Shuai et al.

NATURE COMMUNICATIONS (2020)

Article Multidisciplinary Sciences

The mutational constraint spectrum quantified from variation in 141,456 humans

Konrad J. Karczewski et al.

NATURE (2020)

Article Multidisciplinary Sciences

Expanded encyclopaedias of DNA elements in the human and mouse genomes

Jill E. Moore et al.

NATURE (2020)

Review Genetics & Heredity

Long-read human genome sequencing and its applications

Glennis A. Logsdon et al.

NATURE REVIEWS GENETICS (2020)

Article Biochemistry & Molecular Biology

Ultrafast and scalable variant annotation and prioritization with big functional genomics data

Dandan Huang et al.

GENOME RESEARCH (2020)

Article Multidisciplinary Sciences

A method for scoring the cell type-specific impacts of noncoding variants in personal genomes

Wenran Li et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2020)

Article Biotechnology & Applied Microbiology

SVFX: a machine learning framework to quantify the pathogenicity of structural variants

Sushant Kumar et al.

GENOME BIOLOGY (2020)

Article Biochemistry & Molecular Biology

RegulationSpotter: annotation and interpretation of extratranscriptic DNA variants

Jana Marie Schwarz et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biochemistry & Molecular Biology

CADD: predicting the deleteriousness of variants throughout the human genome

Philipp Rentzsch et al.

NUCLEIC ACIDS RESEARCH (2019)

Article Biochemistry & Molecular Biology

The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019

Annalisa Buniello et al.

NUCLEIC ACIDS RESEARCH (2019)

Review Biotechnology & Applied Microbiology

Structural variant calling: the long and the short of it

Medhat Mahmoud et al.

GENOME BIOLOGY (2019)

Article Biochemical Research Methods

Functional annotation of genomic variants in studies of late-onset Alzheimer's disease

Mariusz Butkiewicz et al.

BIOINFORMATICS (2018)

Article Biochemical Research Methods

GIGGLE: a search engine for large-scale integrated genome analysis

Ryan M. Layer et al.

NATURE METHODS (2018)

Review Genetics & Heredity

From genome-wide associations to candidate causal variants by statistical fine-mapping

Daniel J. Schaid et al.

NATURE REVIEWS GENETICS (2018)

Article Biochemistry & Molecular Biology

INFERNO: inferring the molecular mechanisms of noncoding genetic variants

Alexandre Amlie-Wolf et al.

NUCLEIC ACIDS RESEARCH (2018)

Article Multidisciplinary Sciences

The UK Biobank resource with deep phenotyping and genomic data

Clare Bycroft et al.

NATURE (2018)

Article Multidisciplinary Sciences

Genetic effects on gene expression across human tissues

Francois Aguet et al.

NATURE (2017)

Article Multidisciplinary Sciences

Functional mapping and annotation of genetic associations with FUMA

Kyoko Watanabe et al.

NATURE COMMUNICATIONS (2017)

Article Genetics & Heredity

WGSA: an annotation pipeline for human genome sequencing studies

Xiaoming Liu et al.

JOURNAL OF MEDICAL GENETICS (2016)

Article Computer Science, Hardware & Architecture

Apache Spark: A Unified Engine for Big Data Processing

Matei Zaharia et al.

COMMUNICATIONS OF THE ACM (2016)

Article Biotechnology & Applied Microbiology

Vcfanno: fast, flexible annotation of genetic variants

Brent S. Pedersen et al.

GENOME BIOLOGY (2016)

Article Biotechnology & Applied Microbiology

The Ensembl Variant Effect Predictor

William McLaren et al.

GENOME BIOLOGY (2016)

Article Multidisciplinary Sciences

Integrative analysis of 111 reference human epigenomes

Anshul Kundaje et al.

NATURE (2015)

Article Genetics & Heredity

The support of human genetic evidence for approved drug indications

Matthew R. Nelson et al.

NATURE GENETICS (2015)

Article Biochemical Research Methods

Predicting effects of noncoding variants with deep learning-based sequence model

Jian Zhou et al.

NATURE METHODS (2015)

Article Biochemical Research Methods

Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR

Hui Yang et al.

NATURE PROTOCOLS (2015)

Article Biotechnology & Applied Microbiology

The Ensembl Regulatory Build

Daniel R. Zerbino et al.

GENOME BIOLOGY (2015)

Article Multidisciplinary Sciences

An atlas of active enhancers across human cell types and tissues

Robin Andersson et al.

NATURE (2014)

Review Genetics & Heredity

Transcriptional enhancers: from properties to genome-wide predictions

Daria Shlyueva et al.

NATURE REVIEWS GENETICS (2014)

Article Biochemistry & Molecular Biology

ClinVar: public archive of relationships among sequence variation and human phenotype

Melissa J. Landrum et al.

NUCLEIC ACIDS RESEARCH (2014)

Article Biochemistry & Molecular Biology

Annotation of functional variation in personal genomes using RegulomeDB

Alan P. Boyle et al.

GENOME RESEARCH (2012)

Article Multidisciplinary Sciences

An integrated encyclopedia of DNA elements in the human genome

Ian Dunham et al.

NATURE (2012)

Article Multidisciplinary Sciences

Systematic Localization of Common Disease-Associated Variation in Regulatory DNA

Matthew T. Maurano et al.

SCIENCE (2012)

Article Biochemical Research Methods

Tabix: fast retrieval of sequence features from generic TAB-delimited files

Heng Li

BIOINFORMATICS (2011)

Article Biochemical Research Methods

BigWig and BigBed: enabling browsing of large distributed datasets

W. J. Kent et al.

BIOINFORMATICS (2010)

Article Biochemical Research Methods

BEDTools: a flexible suite of utilities for comparing genomic features

Aaron R. Quinlan et al.

BIOINFORMATICS (2010)

Article Multidisciplinary Sciences

Potential etiologic and functional implications of genome-wide association loci for human diseases and traits

Lucia A. Hindorff et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2009)

Article Multidisciplinary Sciences

Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome

Erez Lieberman-Aiden et al.

SCIENCE (2009)

Article Multidisciplinary Sciences

Genome-wide mapping of in vivo protein-DNA interactions

David S. Johnson et al.

SCIENCE (2007)

Article Biochemistry & Molecular Biology

dbSNP: the NCBI database of genetic variation

S. T. Sherry et al.

NUCLEIC ACIDS RESEARCH (2001)