期刊
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
卷 20, 期 5, 页码 2736-2747出版社
IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2021.3118358
关键词
RNA classification; non-coding RNA; secondary structure; deep learning; k-mers
Non-coding RNAs (ncRNAs) are transcribed RNA elements that do not code for proteins but play important roles in biological processes and disorders. Their structure is closely associated with their functions and conservation. This study proposes the use of secondary structure fingerprints to improve classification accuracy by considering global and local structural matches. By combining these fingerprints with k-mers, a deep learning architecture achieved high accuracy, precision, and recall.
RNA elements that are transcribed but not translated into proteins are called non-coding RNAs (ncRNAs). They play wide-ranging roles in biological processes and disorders. Just like proteins, their structure is often intimately linked to their function. Many examples have been documented where structure is conserved across taxa despite sequence divergence. Thus, structure is often used to identify function. Specifically, the secondary structure is predicted and ncRNAs with similar structures are assumed to have same or similar functions. However, a strand of RNA can fold into multiple possible structures, and some strands even fold differently in vivo and in vitro. Furthermore, ncRNAs often function as RNA-protein complexes, which can affect structure. Because of these, we hypothesized using one structure per sequence may discard information, possibly resulting in poorer classification accuracy. Therefore, we propose using secondary structure fingerprints, comprising two categories: a higher-level representation derived from RNA-As-Graphs (RAG), and free energy fingerprints based on a curated repertoire of small structural motifs. The fingerprints take into account the difference between global and local structural matches. We also evaluated our deep learning architecture with k-mers. By combining our global-local fingerprints with 6-mer, we achieved an accuracy, precision, and recall of 91.04%, 91.10%, and 91.00%.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据