Article

Postal address extraction from the web: a comprehensive survey

Journal

ARTIFICIAL INTELLIGENCE REVIEW
Volume 55, Issue 2, Pages 1085-1120

Publisher

SPRINGER
DOI: 10.1007/s10462-021-09983-1

Keywords

Postal Address Extraction; Web Information Extraction; Gazetteers; Machine Learning

The study highlights that LBS applications lack postal addresses for many of their users' POIs, and that some of the missing addresses could be retrieved from the Web. No prior survey has compared existing Web postal address extraction approaches, and the problem remains unaddressed in many countries.
The Web is a source of information for Location-Based Service (LBS) applications. These applications lack postal addresses for users' Points of Interest (POIs), such as schools, hospitals, and restaurants, because these locations are annotated manually from the yellow pages or by the location owners (users or companies). Our study confirms that Google Maps, a common LBS application, contains only about 32.5% of the public schools officially registered in the documents provided by the Directorate of Education in Egypt. However, the remaining missing school addresses could be retrieved from the Web (e.g., social media). To the best of our knowledge, no prior survey has compared previous Web postal address extraction approaches. Additionally, all proposed address extraction approaches are local: they work in specific countries or locations with particular languages and cannot be used, or even adapted, for other countries, locations, or languages. Furthermore, Web postal address extraction has not been addressed in many countries, such as Arab countries (e.g., Egypt). This paper discusses the problem of address extraction and highlights and compares the techniques recently used to extract addresses from Web pages. It also investigates the discrepancy of knowledge among existing systems and provides a comprehensive review of the geographical gazetteers used in Web postal address extraction approaches, comparing their data quality dimensions.
