Article

Assessing similarity matching for possible integration of feature classifications of geospatial data from official and informal sources

Journal

Publisher

TAYLOR & FRANCIS LTD
DOI: 10.1080/13658816.2011.636012

Keywords

spatial data integration; semantic similarity; structural similarity; VGI data; classification schema integration

One difficulty in integrating geospatial data sets from different sources is variation in feature classification and semantic content of the data. One step towards achieving beneficial semantic interoperability is to assess the semantic similarity among objects that are categorised within data sets. This article focuses on measuring semantic and structural similarities between categories of formal data, such as Ordnance Survey (OS) cartographic data, and volunteered geographic information (VGI), such as that sourced from OpenStreetMap (OSM), with the intention of assessing possible integration. The model involves 'tokenisation' to search for common roots of words, and the feature classifications have been modelled as an XML schema labelled rooted tree for hierarchical analysis. The semantic similarity was measured using the WordNet::Similarity package, while the structural similarities between sub-trees of the source and target schemas have also been considered. Along with dictionary and structural matching, the data type of the category itself is a comparison variable. The overall similarity is based on a weighted combination of these three measures. The results reveal that the use of a generic similarity matching system leads to poor agreement between the semantics of OS and OSM data sets. It is concluded that a more rigorous peer-to-peer assessment of VGI data, increasing numbers and transparency of contributors, the initiation of more programs of quality testing and the development of more directed ontologies can improve spatial data integration.
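The weighted combination of the three measures described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: a simple token-overlap (Jaccard) score stands in for the WordNet::Similarity measure, structural similarity is reduced to the overlap of child labels, the weights are arbitrary, and all function names and category labels are hypothetical.

```python
import re


def tokenise(label):
    """Split a category label into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", label.lower()))


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets are treated as no match."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def name_similarity(src_label, tgt_label):
    """Stand-in for dictionary matching: token overlap of the two labels."""
    return jaccard(tokenise(src_label), tokenise(tgt_label))


def structural_similarity(src_children, tgt_children):
    """Stand-in for sub-tree matching: overlap of direct child labels."""
    src = {c.lower() for c in src_children}
    tgt = {c.lower() for c in tgt_children}
    return jaccard(src, tgt)


def type_similarity(src_type, tgt_type):
    """1.0 when the declared data types agree, otherwise 0.0."""
    return 1.0 if src_type == tgt_type else 0.0


def overall_similarity(name_sim, struct_sim, type_sim, weights=(0.5, 0.3, 0.2)):
    """Weighted mean of the three measures (weights are assumed values)."""
    w_name, w_struct, w_type = weights
    total = w_name + w_struct + w_type
    return (w_name * name_sim + w_struct * struct_sim + w_type * type_sim) / total


# Hypothetical category pair, for illustration only.
score = overall_similarity(
    name_similarity("Primary Road", "primary_road"),
    structural_similarity(["A Road", "B Road"], ["a road", "track"]),
    type_similarity("string", "string"),
)
print(round(score, 3))
```

In practice the weights would need to be tuned, and the name measure would be replaced by a lexical resource such as WordNet so that synonyms with no shared tokens can still match.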
