Journal
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Volume 15, Issue 3, Pages 839-849
Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2017.2689762
Keywords
Gene ontology; semantic similarity; information content; protein-protein interaction
Funding
- CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India
- PURSE-II
- UPE-II project
- FASTTRACK grant of DST [SR/FTP/ETA-04/2012]
- UGC, Government of India [F.30-31/2016(SA-II)]
The semantic similarity between two interacting proteins can be estimated by combining the similarity scores of the Gene Ontology (GO) terms associated with the proteins. A greater number of similar GO annotations between two proteins indicates greater interaction affinity. Existing semantic similarity measures make use of the GO graph structure, the information content of GO terms, or a combination of both. In this paper, we present a hybrid approach that uses both the topological features of the GO graph and the information content of the GO terms. More specifically, we 1) consider a fuzzy clustering of the GO graph based on the level of association of the GO terms, 2) estimate the GO term memberships to each cluster center based on the respective shortest path lengths, and 3) assign weights to GO term pairs on the basis of their dissimilarity with respect to the cluster centers. We test the performance of our semantic similarity measure against seven other previously published similarity measures using benchmark protein-protein interaction datasets of Homo sapiens and Saccharomyces cerevisiae, based on sequence similarity, Pfam similarity, area under the ROC curve, and F1 measure.
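The abstract does not give the formulas behind steps 2) and 3), so the following Python sketch only illustrates the general idea on a toy graph. Everything specific in it is an assumption made for illustration: the term names, the choice of cluster centers, the membership rule (proportional to 1 / (1 + shortest path length), normalised to sum to 1), and the dissimilarity rule (half the L1 distance between membership vectors). It should not be read as the paper's actual method.

```python
from collections import deque

# Toy, undirected view of a GO-like graph. Term names are hypothetical.
EDGES = [("root", "a"), ("root", "b"), ("a", "c"), ("a", "d"), ("b", "e")]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: hop count from source to every reachable term."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def memberships(adj, centers):
    """Fuzzy membership of each term to each cluster center.

    Assumed rule (not from the paper): weight 1 / (1 + path length),
    normalised so each term's memberships sum to 1.
    Assumes the graph is connected.
    """
    dists = {c: shortest_path_lengths(adj, c) for c in centers}
    result = {}
    for term in adj:
        raw = [1.0 / (1.0 + dists[c][term]) for c in centers]
        total = sum(raw)
        result[term] = [r / total for r in raw]
    return result


def dissimilarity(m1, m2):
    """Assumed rule: half the L1 distance between membership vectors (0 to 1)."""
    return 0.5 * sum(abs(x - y) for x, y in zip(m1, m2))


# Hypothetical cluster centers "a" and "b".
m = memberships(build_adjacency(EDGES), ["a", "b"])
pair_weight_cd = dissimilarity(m["c"], m["d"])  # siblings under the same center
pair_weight_ce = dissimilarity(m["c"], m["e"])  # terms under different centers
```

Under these assumed rules, two terms that sit at the same distances from every center receive identical membership vectors and therefore zero dissimilarity, while terms closest to different centers receive a larger value that could then be used to weight the term pair.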