4.7 Article

Assessment of Semantic Similarity between Proteins Using Information Content and Topological Properties of the Gene Ontology Graph

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2017.2689762

Keywords

Gene ontology; semantic similarity; information content; protein-protein interaction

Funding

  1. CMATER research laboratory of the Computer Science and Engineering Department, Jadavpur University, India
  2. PURSE-II
  3. UPE-II project
  4. FASTTRACK grant of DST [SR/FTP/ETA-04/2012]
  5. UGC, Government of India [F.30-31/2016(SA-II)]

The semantic similarity between two interacting proteins can be estimated by combining the similarity scores of the GO terms associated with the proteins. A greater number of similar GO annotations between two proteins indicates greater interaction affinity. Existing semantic similarity measures make use of the GO graph structure, the information content of GO terms, or a combination of both. In this paper, we present a hybrid approach that utilizes both the topological features of the GO graph and the information content of the GO terms. More specifically, we 1) consider a fuzzy clustering of the GO graph based on the level of association of the GO terms, 2) estimate the GO term memberships to each cluster center based on the respective shortest path lengths, and 3) assign weights to GO term pairs on the basis of their dissimilarity with respect to the cluster centers. We test the performance of our semantic similarity measure against seven previously published similarity measures using benchmark protein-protein interaction datasets of Homo sapiens and Saccharomyces cerevisiae, based on sequence similarity, Pfam similarity, area under the ROC curve, and F1 measure.
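Steps 2 and 3 of the abstract can be illustrated with a small sketch. This is not the paper's formulation: the toy graph, the inverse-distance membership formula, and the L1-based dissimilarity are all assumptions chosen for illustration, and the function names (`shortest_path_lengths`, `memberships`, `membership_dissimilarity`) are hypothetical.

```python
from collections import deque

# Hypothetical toy graph with made-up term labels (not real GO identifiers).
# Edges are treated as undirected, and the graph is assumed to be connected.
TOY_EDGES = [("root", "a"), ("root", "b"), ("a", "c"), ("a", "d"), ("b", "e")]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def memberships(adj, centers):
    """Fuzzy membership of every term to every cluster center.

    Assumption: membership is proportional to 1 / (1 + shortest path length),
    normalized so that each term's memberships sum to 1.
    """
    dists = [shortest_path_lengths(adj, center) for center in centers]
    result = {}
    for term in adj:
        raw = [1.0 / (1 + d[term]) for d in dists]
        total = sum(raw)
        result[term] = [value / total for value in raw]
    return result


def membership_dissimilarity(mu, term1, term2):
    """Assumed dissimilarity: half the L1 distance between membership vectors.

    Ranges from 0 (identical memberships) to 1 (disjoint memberships).
    """
    return 0.5 * sum(abs(x - y) for x, y in zip(mu[term1], mu[term2]))
```

With `"a"` and `"b"` chosen as cluster centers, terms `"c"` and `"d"` lie at the same distances from both centers and so receive a dissimilarity of 0, while the two centers themselves differ the most among this pair. How such a dissimilarity is turned into a weight on a GO term pair is specified in the paper itself and is not reproduced here.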
