Journal
MOLECULAR BIOLOGY AND EVOLUTION
Volume 38, Issue 7, Pages 2958-2966
Publisher
OXFORD UNIV PRESS
DOI: 10.1093/molbev/msab062
Keywords
processed pseudogene; bioinformatics; humans and apes
Funding
- National Human Genome Research Institute (NHGRI) [R01 HG010040, U01 HG010961, U41 HG010972]
The study introduces a more sensitive and accurate method for identifying processed pseudogenes. Applied to 22 human individuals, it pinpoints 40 processed pseudogenes that are not present in the human reference genome GRCh38. It also provides an overview of lineage-specific retrocopies in the chimpanzee, gorilla, and orangutan genomes.
LINE-1-mediated retrotransposition of protein-coding mRNAs is an active process in modern humans, in both germline and somatic genomes. Prior work that surveyed human data mostly relied on detecting discordant mappings of paired-end short reads, or exon junctions contained in short reads. Moreover, there have been few genome-wide comparisons between gene retrocopies in great apes and humans. In this study, we introduced a more sensitive and accurate method to identify processed pseudogenes. Our method utilizes long-read assemblies and, more importantly, is able to provide full-length retrocopy sequences as well as flanking regions, which are missed by short-read-based methods. From 22 human individuals, we pinpointed 40 processed pseudogenes that are not present in the human reference genome GRCh38 and identified 17 pseudogenes that are in GRCh38 but absent from some input individuals. This represents a significantly higher discovery rate than previous reports (39 pseudogenes not in the reference genome out of 939 individuals). We also provided an overview of lineage-specific retrocopies in chimpanzee, gorilla, and orangutan genomes.
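The discovery-rate comparison in the abstract can be made concrete with a little arithmetic. The sketch below is only a rough illustration: it assumes a naive per-individual normalization (non-reference pseudogenes found divided by individuals surveyed), which is not necessarily how the authors quantify the difference, and the function and constant names are hypothetical. Only the four counts come from the abstract.

```python
# Counts quoted in the abstract.
NEW_PSEUDOGENES_THIS_STUDY = 40   # not present in GRCh38
INDIVIDUALS_THIS_STUDY = 22
NEW_PSEUDOGENES_PREVIOUS = 39     # previous reports
INDIVIDUALS_PREVIOUS = 939


def per_individual_rate(n_pseudogenes: int, n_individuals: int) -> float:
    """Naive rate: non-reference pseudogenes found per surveyed individual.

    Assumption for illustration only; discoveries do not scale linearly
    with cohort size, so this is a coarse comparison.
    """
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    return n_pseudogenes / n_individuals


rate_this_study = per_individual_rate(NEW_PSEUDOGENES_THIS_STUDY, INDIVIDUALS_THIS_STUDY)
rate_previous = per_individual_rate(NEW_PSEUDOGENES_PREVIOUS, INDIVIDUALS_PREVIOUS)
fold_difference = rate_this_study / rate_previous

print(f"this study: {rate_this_study:.3f} per individual")
print(f"previous:   {rate_previous:.3f} per individual")
print(f"fold difference: {fold_difference:.1f}x")
```

Under this simple normalization the long-read approach recovers on the order of tens of times more non-reference processed pseudogenes per individual, though shared variants across individuals mean the true gain depends on cohort composition.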