Article

A novel similarity measure for spatial entity resolution based on data granularity model: Managing inconsistencies in place descriptions

Journal

APPLIED INTELLIGENCE
Volume 51, Issue 8, Pages 6104-6123

Publisher

SPRINGER
DOI: 10.1007/s10489-020-01959-y

Keywords

Data fusion; Data integration; Data granulation; Entity recognition; Spatial data

This paper focuses on spatial data granulation to address differences in the level of detail of location descriptions from different data sources, improving the quality of data fusion. A granular similarity measure manages apparent differences between place descriptions, and a data blocking method based on geographical features reduces the number of pairwise comparisons.
Tremendous amounts of data are generated every day by different sources and stored in heterogeneous databases. Providing an integrated view through data fusion is essential to enhance data utilization. An indispensable type of data is spatial data, with diverse application domains, including GIS, e-commerce, military, and tourism. The concept of location forms a key part of user-generated data and poses serious challenges, including uncertainty. A particular location may have different names, and conversely, various locations may have the same name. Furthermore, the geographical coordinates of locations may not be expressed accurately in datasets. Other challenges have received less attention. Various data sources might describe locations at different levels of detail. This increases data inconsistency and decreases the quality of data fusion. This paper focuses on spatial data granulation to deal with this variety. If these differences are not taken into consideration, the different descriptions of a location may be interpreted as different places and, in turn, not be fused. The contributions of this paper are: (a) introducing a granular approach to measure the similarity between two place descriptions for managing apparent differences, which improves the quality of the geocoding and data fusion phases; and (b) introducing a novel data blocking method that decreases pairwise comparisons based on geographical features. For evaluation, we developed a dataset from two real aviation accident datasets. The evaluation shows that the quality of entity recognition and data fusion improved when using our proposed data granulation technique.
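
The two ideas named in the abstract, blocking on geographical features and comparing place descriptions across levels of detail, can be illustrated with a minimal sketch. This is not the authors' method: the grid-cell blocking, the three-level hierarchy in `LEVELS`, the record fields, and the scoring rule are all assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import combinations
import math

# Hypothetical granularity hierarchy, coarse to fine (an assumption, not from the paper).
LEVELS = ("country", "state", "city")


def grid_key(lat, lon, cell_deg=1.0):
    """Map a coordinate to a grid cell; records in the same cell form one block."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def candidate_pairs(records, cell_deg=1.0):
    """Yield index pairs only for records that fall in the same grid cell."""
    blocks = defaultdict(list)
    for i, rec in enumerate(records):
        blocks[grid_key(rec["lat"], rec["lon"], cell_deg)].append(i)
    for members in blocks.values():
        yield from combinations(members, 2)


def granular_similarity(a, b):
    """Compare two place descriptions only on the levels both of them provide.

    Returns the fraction of shared levels whose values agree
    (case-insensitive), or 0.0 when no level is shared.
    """
    shared = [lvl for lvl in LEVELS if a.get(lvl) and b.get(lvl)]
    if not shared:
        return 0.0
    agree = sum(a[lvl].strip().lower() == b[lvl].strip().lower() for lvl in shared)
    return agree / len(shared)


# Made-up example records; the second omits the city level.
records = [
    {"lat": 40.64, "lon": -73.78, "country": "USA", "state": "New York", "city": "New York"},
    {"lat": 40.70, "lon": -73.90, "country": "usa", "state": "New York"},
    {"lat": 51.47, "lon": -0.45, "country": "UK", "city": "London"},
]

pairs = list(candidate_pairs(records))
scores = {(i, j): granular_similarity(records[i], records[j]) for i, j in pairs}
```

With these records, only the first two share a grid cell, so one comparison is made instead of three, and the missing city in the second record does not count against the match. A fixed grid can separate nearby records that straddle a cell border, so a practical blocking scheme would likely also need to check neighbouring cells.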
