Journal
HUMAN MUTATION
Volume 30, Issue 3, Pages 485-492
Publisher
WILEY
DOI: 10.1002/humu.20917
Keywords
machine-learning; random-forest; Swiss-Prot variants; nsSNPs; text-mining; phenotype similarity
Funding
- Biotechnology and Biological Sciences Research Council [BBS/B/16585] Funding Source: researchfish
Abstract
A method has been developed for predicting proteins involved in genetic disorders. It combines deleterious SNP prediction with a system based on protein interactions and phenotype distances; this is the first time that deleterious SNP prediction has been used to make predictions across linkage intervals. At each step we tested and selected the best procedure, revealing that the computationally expensive method of assigning medical meta-terms to create a phenotype distance matrix was outperformed by a simple word-counting technique. We carried out in-depth benchmarking with increasingly stringent data sets, reaching precision values of up to 75% (19% recall) for 10-Mb linkage intervals (averaging 100 genes). For the most stringent (worst-case) data we attained an overall recall of 6%, yet still achieved precision values of up to 90% (4% recall). At all levels of stringency and precision, the addition of predicted deleterious SNPs was shown to increase recall.
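The abstract does not say how the word-counting phenotype distance was computed. As a rough illustration only, the sketch below builds a distance matrix from bag-of-words counts using cosine distance. The choice of cosine distance, the tokenization, the function names, and the example descriptions are all assumptions made for illustration, not the authors' procedure.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty description as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def phenotype_distance_matrix(descriptions):
    """Pairwise distances between free-text phenotype descriptions."""
    counts = [word_counts(d) for d in descriptions]
    n = len(counts)
    return [[cosine_distance(counts[i], counts[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical phenotype descriptions, invented for this example.
descriptions = [
    "progressive muscle weakness and wasting",
    "muscle weakness with cardiac involvement",
    "retinal degeneration, night blindness",
]
matrix = phenotype_distance_matrix(descriptions)
```

In this toy input the first two descriptions share the words "muscle" and "weakness", so their distance is smaller than the distance from either to the third, which shares no words with them.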