4.7 Review

Machine learning bridges omics sciences and plant breeding

Related references

Note: Only some of the references are listed.
Review Behavioral Sciences

Next-generation deep learning based on simulators and synthetic data

Celso M. de Melo et al.

Summary: Deep learning has achieved success in various domains, but the requirement for large amounts of labeled data presents a major bottleneck. Synthetic data is emerging as a potential solution, aided by advances in rendering pipelines, generative adversarial models, and fusion models. Domain adaptation techniques are also closing the statistical gap between synthetic and real data. The use of synthetic data and deep neural networks provides insights into the cognitive and neural functioning of biological systems.

TRENDS IN COGNITIVE SCIENCES (2022)

Letter Multidisciplinary Sciences

MODAS: exploring maize germplasm with multi-omics data association studies

Songyu Liu et al.

SCIENCE BULLETIN (2022)

Review Cell Biology

A guide to machine learning for biologists

Joe G. Greener et al.

Summary: This review discusses the application of machine learning to the analysis of biological data and provides guidance for experimentalists. The increasing scale and complexity of biological data have led to a growing use of machine learning in biology.

NATURE REVIEWS MOLECULAR CELL BIOLOGY (2022)

Article Mathematical & Computational Biology

easyMF: A Web Platform for Matrix Factorization-Based Gene Discovery from Large-scale Transcriptome Data

Wenlong Ma et al.

Summary: In this study, we developed easyMF, a web platform that utilizes matrix factorization algorithms for functional gene discovery from large-scale transcriptome data. Compared with existing software, easyMF offers greater functionality, flexibility, and ease of use. The platform is equipped with user-friendly graphical user interfaces and supports various analyses, including transcriptome analysis, multiple-scenario matrix factorization analysis, and multiple-way gene discovery. We applied easyMF to maize RNA-Seq datasets and successfully identified numerous seed-specific genes. Additionally, easyMF outperformed other systems in gene prioritization.

INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES (2022)

Article Computer Science, Artificial Intelligence

Predicting the suitable fertilizer for crop based on soil and environmental factors using various feature selection techniques with classifiers

Ganesan Mariammal et al.

Summary: This article introduces a method of using machine learning techniques to predict suitable fertilizers for crops. Experimental results demonstrate that the combination of recursive feature elimination and the proposed Heterogeneous Stacked Ensemble classifier achieves the best prediction rate.

EXPERT SYSTEMS (2022)

Review Multidisciplinary Sciences

Machine learning in plant science and plant breeding

Aalt Dirk Jan van Dijk et al.

Summary: Machine learning has been widely applied in plant science and breeding to extract meaningful patterns from large, complex plant data sets, particularly in connecting genotypes to different levels of phenotypes such as biochemical traits and yield.

ISCIENCE (2021)

Review Plant Sciences

Machine learning approaches for crop improvement: Leveraging phenotypic and genotypic big data

Hao Tong et al.

Summary: Genomic selection utilizes machine learning and genotyping data to predict agronomically relevant phenotypic traits, thus reducing the reliance on phenotypic data and improving breeding efficiency. To further improve genomic selection, it is necessary to collect intermediate phenotype data and develop modeling techniques, while also considering the transferability of models between different environments.

JOURNAL OF PLANT PHYSIOLOGY (2021)

Article Multidisciplinary Sciences

Integrated omics networks reveal the temporal signaling events of brassinosteroid response in Arabidopsis

Natalie M. Clark et al.

Summary: This study analyzed brassinosteroid (BR) signaling in Arabidopsis by integrating multiple omics datasets and inferring networks, identifying a BR-regulated transcription factor, BRONTOSAURUS, that affects cell division in roots. The research provides insights into the molecular signaling events during the BR response through integrative network analysis applied to multi-omic data.

NATURE COMMUNICATIONS (2021)

Article Plant Sciences

Prediction of Maize Phenotypic Traits With Genomic and Environmental Predictors Using Gradient Boosting Frameworks

Cathy C. Westhues et al.

Summary: Developing crop varieties that perform stably in future environmental conditions is a critical challenge in the context of climate change. Modern predictive modeling approaches, especially machine learning techniques, show promising potential in improving predictive ability. Environmental factors play a key role in genotype-by-environment interactions in genomic prediction models, and gradient boosting methods can enhance the accuracy of forecasting new genotype performance in terms of grain yield.

FRONTIERS IN PLANT SCIENCE (2021)

Article Biochemical Research Methods

High-throughput soybean seeds phenotyping with convolutional neural networks and transfer learning

Si Yang et al.

Summary: This study introduces a novel synthetic image generation and augmentation method based on domain randomization for training an instance segmentation network to achieve high-throughput soybean seed segmentation. With this method, a large labeled image dataset was generated automatically, reducing manual annotation costs and simplifying preparation of the training dataset. The convolutional neural network trained purely on the synthetic image dataset achieved good performance. The robustness and generalization ability of the method were demonstrated by analyzing results from synthetic and real-world datasets.

PLANT METHODS (2021)

Article Genetics & Heredity

A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data

Ruizhi Xiang et al.

Summary: The study compared the performance of different dimensionality reduction methods in scRNA-seq data analysis. t-SNE showed the highest accuracy, though also the highest computing cost, while UMAP demonstrated high stability and preserved the cohesion and separation of cell populations.

FRONTIERS IN GENETICS (2021)

Article Agriculture, Multidisciplinary

Tomato plant disease detection using transfer learning with C-GAN synthetic images

Amreen Abbas et al.

Summary: The paper introduces a deep learning-based method utilizing transfer learning and synthetic image generation, achieving high accuracy in classifying tomato leaf images. This approach demonstrates superior effectiveness and precision compared to existing methods.

COMPUTERS AND ELECTRONICS IN AGRICULTURE (2021)

Article Biochemistry & Molecular Biology

Prioritization of disease genes from GWAS using ensemble-based positive-unlabeled learning

Nikita Kolosov et al.

Summary: Understanding disease biology from GWAS is challenging due to the inability to directly implicate causal genes, but integrating multiple omics data sources can provide important functional links. Machine learning plays a key role in prioritizing disease genes by utilizing various data. The newly developed GPrior tool fills an important gap in GWAS data post-processing methods, significantly improving the ability to pinpoint disease genes.

EUROPEAN JOURNAL OF HUMAN GENETICS (2021)

Article Multidisciplinary Sciences

Crop yield prediction integrating genotype and weather variables using deep learning

Johnathon Shook et al.

Summary: The study demonstrates the use of a long short-term memory (LSTM) recurrent neural network model for predicting crop yield and develops a temporal attention mechanism that identifies key time windows in the growing season for interpretability. The proposed models outperformed other machine learning models and provided valuable insights for plant breeders.

PLOS ONE (2021)

Review Chemistry, Analytical

Review: Application of Artificial Intelligence in Phenomics

Shona Nabwire et al.

Summary: Plant phenomics has significantly advanced in recent years due to innovations in new technologies and the widespread application of artificial intelligence. The integration of AI has improved the efficiency of data collection and analysis in high-throughput phenotyping and non-invasive imaging techniques, fostering the development of software and tools for field phenotyping.

SENSORS (2021)

Article Multidisciplinary Sciences

The regulatory landscape of Arabidopsis thaliana roots at single-cell resolution

Michael W. Dorrity et al.

Summary: The study reports the regulatory landscape of Arabidopsis thaliana roots at single-cell resolution and identifies thousands of differentially accessible sites. It finds that a cell's regulatory landscape and transcriptome independently capture cell type identity, and leveraging this information to integrate data helps characterize developmental progression and cell division. The approach provides an analytical framework to infer the gene regulatory networks that execute plant development.

NATURE COMMUNICATIONS (2021)

Article Genetics & Heredity

Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance

Chirag Gupta et al.

Summary: The Gene Regulation and Association Network (GRAiN) of rice is an interactive platform that helps study the functional relationships between transcription factors and genetic modules underlying plant response to abiotic stress. A supervised machine learning framework is proposed to prioritize genes regulating stress signal transduction and gene expression under drought conditions. This approach accurately predicts key regulatory genes and has the potential to be applied to other agricultural traits and genetic engineering of rice varieties.

FRONTIERS IN GENETICS (2021)

Review Genetics & Heredity

Improving Genomic Prediction Using High-Dimensional Secondary Phenotypes

Bader Arouisse et al.

Summary: The advances in high-throughput phenotyping have led to a greater number of secondary traits being observed, posing a challenge to improving genomic prediction for the target trait. Existing methods have limitations when dealing with a large number of secondary traits, emphasizing the need for novel approaches to enhance prediction accuracy.

FRONTIERS IN GENETICS (2021)

Article Biochemistry & Molecular Biology

Development of high-resolution multiple-SNP arrays for genetic analyses and molecular breeding through genotyping by target sequencing and liquid chip

Zifeng Guo et al.

Summary: In this study, a novel multiple single-nucleotide polymorphism (mSNP) approach was developed for maize, which proved to be more powerful for genetic diversity detection, linkage disequilibrium analysis, and genome-wide association studies compared to traditional single-amplicon SNPs. The technologies, protocols, and application scenarios established for maize are anticipated to serve as a model for the development of efficient mSNP arrays and GBTS systems in animals, plants, and microorganisms.

PLANT COMMUNICATIONS (2021)

Article Biochemical Research Methods

Revisiting genome-wide association studies from statistical modelling to machine learning

Shanwen Sun et al.

Summary: Genome-wide association studies (GWAS) have made significant progress in discovering genetic variants underlying complex human diseases and agriculturally important traits over the last decade. However, challenges such as detecting epistasis, SNPs with small effects, and distinguishing causal variants still exist. Advancements in statistical modeling and machine learning are driving improvements in GWAS analyses, while state-of-the-art tools are being developed to enhance signal detection and prioritize SNPs.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Biotechnology & Applied Microbiology

LightGBM: accelerated genomically designed crop breeding through ensemble learning

Jun Yan et al.

Summary: LightGBM is an ensemble model of decision trees used for classification and regression prediction, showing superior performance in genomic selection-assisted breeding. Through benchmark tests, it demonstrates advantages in prediction precision, model stability, and computing efficiency.

GENOME BIOLOGY (2021)

Article Biotechnology & Applied Microbiology

The genetic mechanism of heterosis utilization in maize improvement

Yingjie Xiao et al.

Summary: The study demonstrates that although yield heterosis in maize hybrids is correlated with minor-effect epistatic QTLs, it may be the result of major-effect additive and dominant QTLs during early developmental stages. The transition to flowering is identified as a critical stage for heterosis formation, where epistatic QTLs are activated by paternal alleles counteracting deleterious maternal alleles. The proposed molecular breeding approach targets key genes to accelerate maize breeding by reducing deleterious epistatic interactions.

GENOME BIOLOGY (2021)

Article Biotechnology & Applied Microbiology

Hybrid breeding of rice via genomic selection

Yanru Cui et al.

PLANT BIOTECHNOLOGY JOURNAL (2020)

Review Plant Sciences

5Gs for crop genetic improvement

Rajeev K. Varshney et al.

CURRENT OPINION IN PLANT BIOLOGY (2020)

Review Biochemistry & Molecular Biology

Crop Phenomics and High-Throughput Phenotyping: Past Decades, Current Challenges, and Future Perspectives

Wanneng Yang et al.

MOLECULAR PLANT (2020)

Review Agronomy

Genome optimization for improvement of maize breeding

Shuqin Jiang et al.

THEORETICAL AND APPLIED GENETICS (2020)

Review Biochemistry & Molecular Biology

A practical view of fine-mapping and gene prioritization in the post-genome-wide association era

R. V. Broekema et al.

OPEN BIOLOGY (2020)

Review Plant Sciences

Considerations in the analysis of plant chromatin accessibility data

Kerry L. Bubb et al.

CURRENT OPINION IN PLANT BIOLOGY (2020)

Article Plant Sciences

Exploring Deep Learning for Complex Trait Genomic Prediction in Polyploid Outcrossing Species

Laura M. Zingaretti et al.

FRONTIERS IN PLANT SCIENCE (2020)

Article Biochemistry & Molecular Biology

Meta Gene Regulatory Networks in Maize Highlight Functionally Relevant Regulatory Interactions

Peng Zhou et al.

PLANT CELL (2020)

Article Genetics & Heredity

SR4R: An Integrative SNP Resource for Genomic Breeding and Population Research in Rice

Jun Yan et al.

GENOMICS PROTEOMICS & BIOINFORMATICS (2020)

Article Biochemical Research Methods

A scalable SCENIC workflow for single-cell gene regulatory network analysis

Bram van de Sande et al.

NATURE PROTOCOLS (2020)

Article Genetics & Heredity

The Wheat GENIE3 Network Provides Biologically-Relevant Information in Polyploid Wheat

Sophie A. Harrington et al.

G3-GENES GENOMES GENETICS (2020)

Article Biology

Application of deep learning in genomics

Jianxiao Liu et al.

SCIENCE CHINA-LIFE SCIENCES (2020)

Article Chemistry, Analytical

Crop Disease Classification on Inadequate Low-Resolution Target Images

Juan Wen et al.

SENSORS (2020)

Article Plant Sciences

Integrated multi-omics framework of the plant response to jasmonic acid

Mark Zander et al.

NATURE PLANTS (2020)

Review Plant Sciences

Computational prediction of gene regulatory networks in plant growth and development

Samiul Haque et al.

CURRENT OPINION IN PLANT BIOLOGY (2019)

Review Genetics & Heredity

Deep learning: new computational modelling techniques for genomics

Gokcen Eraslan et al.

NATURE REVIEWS GENETICS (2019)

Article Multidisciplinary Sciences

Solving Current Limitations of Deep Learning Based Approaches for Plant Disease Detection

Marko Arsenovic et al.

SYMMETRY-BASEL (2019)

Article Genetics & Heredity

Benchmarking Parametric and Machine Learning Models for Genomic Prediction of Complex Traits

Christina B. Azodi et al.

G3-GENES GENOMES GENETICS (2019)

Article Environmental Sciences

Disentangling Information in Artificial Images of Plant Seedlings Using Semi-Supervised GAN

Simon Leminen Madsen et al.

REMOTE SENSING (2019)

Article Biochemical Research Methods

GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks

Thomas Moerman et al.

BIOINFORMATICS (2019)

Article Biochemistry & Molecular Biology

A guide to deep learning in healthcare

Andre Esteva et al.

NATURE MEDICINE (2019)

Article Biotechnology & Applied Microbiology

Visualizing structure and transitions in high-dimensional biological data

Kevin R. Moon et al.

NATURE BIOTECHNOLOGY (2019)

Editorial Material Biochemical Research Methods

Points of Significance: Statistics versus machine learning

Danilo Bzdok et al.

NATURE METHODS (2018)

Article Biochemistry & Molecular Biology

TGMI: an efficient algorithm for identifying pathway regulators through evaluation of triple-gene mutual interaction

Chathura Gunasekara et al.

NUCLEIC ACIDS RESEARCH (2018)

Article Biochemical Research Methods

iDREM: Interactive visualization of dynamic regulatory networks

Jun Ding et al.

PLOS COMPUTATIONAL BIOLOGY (2018)

Editorial Material Multidisciplinary Sciences

Deep learning for biology

Sarah Webb

NATURE (2018)

Review Genetics & Heredity

On the Road to Breeding 4.0: Unraveling the Good, the Bad, and the Boring of Crop Quantitative Genomics

Jason G. Wallace et al.

ANNUAL REVIEW OF GENETICS, VOL 52 (2018)

Article Mathematical & Computational Biology

BTNET: boosted tree based gene regulatory network inference algorithm using time-course measurement data

Sungjoon Park et al.

BMC SYSTEMS BIOLOGY (2018)

Article Multidisciplinary Sciences

Predicting gene regulatory networks by combining spatial and temporal gene expression data in Arabidopsis root stem cells

Maria Angels de Luis Balaguer et al.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2017)

Review Plant Sciences

Genomic Selection in Plant Breeding: Methods, Models, and Perspectives

Jose Crossa et al.

TRENDS IN PLANT SCIENCE (2017)

Article Plant Sciences

Inference of Transcription Regulatory Network in Low Phytic Acid Soybean Seeds

Neelam Redekar et al.

FRONTIERS IN PLANT SCIENCE (2017)

Article Biochemistry & Molecular Biology

Computational inference of gene regulatory networks: Approaches, limitations and opportunities

Michael Banf et al.

BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS (2017)

Article Biochemical Research Methods

A roadmap to multifactor dimensionality reduction methods

Damian Gola et al.

BRIEFINGS IN BIOINFORMATICS (2016)

Article Biochemical Research Methods

ARACNe-AP: gene network reverse engineering through adaptive partitioning inference of mutual information

Alexander Lachmann et al.

BIOINFORMATICS (2016)

Article Biochemical Research Methods

Dimension reduction techniques for the integrative analysis of multi-omics data

Chen Meng et al.

BRIEFINGS IN BIOINFORMATICS (2016)

Article Multidisciplinary Sciences

Integration of omic networks in a developmental atlas of maize

Justin W. Walley et al.

SCIENCE (2016)

Article Biochemical Research Methods

Combining tree-based and dynamical systems for the inference of gene regulatory networks

Van Anh Huynh-Thu et al.

BIOINFORMATICS (2015)

Article Agronomy

A reaction norm model for genomic selection using high-dimensional genomic and environmental data

Diego Jarquin et al.

THEORETICAL AND APPLIED GENETICS (2014)

Review Plant Sciences

Machine learning for Big Data analytics in plants

Chuang Ma et al.

TRENDS IN PLANT SCIENCE (2014)

Article Cell Biology

Epistemological issues in omics and high-dimensional biology: give the people what they want

Tapan S. Mehta et al.

PHYSIOLOGICAL GENOMICS (2006)

Review Plant Sciences

Genomics-assisted breeding for crop improvement

Rajeev K. Varshney et al.

TRENDS IN PLANT SCIENCE (2005)