Article

A Unified Probabilistic Framework for Name Disambiguation in Digital Library

Journal

IEEE Transactions on Knowledge and Data Engineering

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2011.13

Keywords

Digital libraries; information search and retrieval; database applications; heterogeneous databases

Funding

  1. Natural Science Foundation of China [61073073]
  2. Chinese National Key Foundation Research [60933013, 61035004]
  3. Special Fund for FSSP

Abstract

Despite years of research, the name ambiguity problem remains largely unresolved. Outstanding issues include how to capture all information relevant to name disambiguation in a unified approach, and how to determine the number of people K in the disambiguation process. In this paper, we formalize the problem in a unified probabilistic framework that incorporates both attributes and relationships. Specifically, we define a disambiguation objective function for the problem and propose a two-step parameter estimation algorithm. We also investigate a dynamic approach for estimating the number of people K. Experiments show that the proposed framework significantly outperforms four baseline methods based on clustering algorithms and two previously proposed methods. Experiments also indicate that the number K found automatically by our method is close to the actual number.
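The abstract does not say how the number of people K is estimated, so the following is only a generic illustration of the idea, not the paper's algorithm. It is a minimal sketch that assumes each paper has already been reduced to a numeric feature vector, clusters the vectors with plain k-means for several candidate values of K, and keeps the K with the lowest BIC-style score under a spherical Gaussian assumption. The technique (k-means plus BIC), the function names `kmeans`, `bic_score` and `estimate_k`, and the toy data are all assumptions introduced here for illustration.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's old center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


def bic_score(points, labels, centers):
    """BIC of a hard clustering under one shared spherical Gaussian variance."""
    n, d, k = len(points), len(points[0]), len(centers)
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    variance = max(sse / (n * d), 1e-6)  # floor avoids log(0) for perfect fits
    log_lik = -0.5 * n * d * (math.log(2 * math.pi * variance) + 1)
    n_params = k * d + 1  # k centers plus the shared variance
    return -2 * log_lik + n_params * math.log(n)


def estimate_k(points, k_max):
    """Try K = 1..k_max and return (best K, its labels) by lowest BIC."""
    best_k, best_labels, best_score = 1, None, float("inf")
    for k in range(1, k_max + 1):
        labels, centers = kmeans(points, k)
        score = bic_score(points, labels, centers)
        if score < best_score:
            best_k, best_labels, best_score = k, labels, score
    return best_k, best_labels


# Hypothetical toy data: 12 "papers" as 2-D feature vectors forming three groups.
toy_papers = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
k_found, paper_labels = estimate_k(toy_papers, k_max=5)
print(k_found, paper_labels)
```

The penalty term grows with K, so splitting a tight group further only pays off when it reduces the within-cluster error enough. A real disambiguation system would also need to use the relationships between papers that the abstract mentions, which this sketch ignores.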
