4.7 Article

Classification of Promoter Sequences from Human Genome

Journal

Publisher

MDPI
DOI: 10.3390/ijms241612561

Keywords

human genome; promoter; genetic algorithm; multiple alignment

We have developed a new method for classifying promoter sequences using a genetic algorithm and the MAHDS sequence alignment method. Four classes of human promoters were created, using 17,310 sequences from the EPD database. A search for potential promoter sequences (PPSs) in the human genome was conducted, resulting in 3,065,317 PPSs, with only 1,241,206 located in unannotated regions. Intersections were found between PPSs and true promoters, transposable elements, and interspersed repeats.
We have developed a new method for promoter sequence classification based on a genetic algorithm and the MAHDS sequence alignment method. We created four classes of human promoters, which together cover 17,310 of the 29,598 sequences in the EPD database. We searched the human genome for potential promoter sequences (PPSs) using dynamic programming and position weight matrices representing each of the promoter classes. A total of 3,065,317 PPSs were found, of which only 1,241,206 were located in unannotated parts of the human genome. All of the remaining PPSs intersected with true promoters, transposable elements, or interspersed repeats. We found a strong intersection between PPSs and Alu elements as well as transcription start sites. The number of false-positive PPSs is estimated to be 3 × 10^-8 per nucleotide, which is several orders of magnitude lower than for any other promoter prediction method. The developed method can be used to search for PPSs in various eukaryotic genomes.
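
To make the idea of searching with a position weight matrix more concrete, the following is a minimal sketch and not the authors' method: it builds a log-odds matrix from a toy set of aligned sequences and scans a sequence with a simple ungapped sliding window, whereas the abstract describes a dynamic programming search (MAHDS). The function names `build_pwm` and `scan`, the toy sequences, the uniform background of 0.25, the pseudocount, and the threshold are all assumptions chosen for illustration.

```python
import math

BASES = "ACGT"

def build_pwm(aligned_seqs, pseudocount=1.0, background=0.25):
    """Build column-wise log2-odds scores from equal-length aligned sequences.

    Assumes only the characters A, C, G, T occur (no gaps or N).
    """
    length = len(aligned_seqs[0])
    pwm = []
    for i in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {b: pseudocount for b in BASES}
        for seq in aligned_seqs:
            counts[seq[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def scan(sequence, pwm, threshold):
    """Return (start, score) for each ungapped window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits

# Toy "class" of aligned sequences (hypothetical, not taken from EPD).
toy_class = ["TATAAA", "TATATA", "TATAAA", "TTTAAA"]
pwm = build_pwm(toy_class)
hits = scan("GGCGTATAAAGGCC", pwm, threshold=5.0)
for start, score in hits:
    print(start, round(score, 2))
```

In a sketch like this, one matrix per promoter class would be scanned over the genome and the threshold would control the false-positive rate; the ungapped window is the main simplification relative to an alignment-based search that allows insertions and deletions.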
