4.6 Article

Unsupervised Numerical Information Extraction via Exploiting Syntactic Structures

Journal

ELECTRONICS
Volume 12, Issue 9

Publisher

MDPI
DOI: 10.3390/electronics12091977

Keywords

numerical information; information extraction; syntactic parsing

Numerical information is important in many fields, but existing extraction methods either rely on complex handcrafted rules that scale poorly or require costly annotated training data. This study proposes QuantityIE, an approach that uses syntactic features from constituency and dependency parsing to extract structured representations of numerical information. Experiments on information extraction and retrieval show that QuantityIE extracts numerical information with high fidelity.
Numerical information plays an important role in scientific, financial, social, statistical, and news domains. Most prior studies adopt unsupervised methods built on complex handcrafted pattern-matching rules, which can be difficult to scale to the open domain. Supervised methods, in turn, require extra time, cost, and knowledge to design, understand, and annotate the training data. To address these limitations, we propose QuantityIE, a novel approach that extracts numerical information as structured representations by exploiting syntactic features from both constituency parsing (CP) and dependency parsing (DP). The extraction results may also serve as distant supervision for zero-shot model training. Our approach improves on existing methods in two respects: (1) its rules are simple yet effective, and (2) its results are more self-contained. We further propose a numerical information retrieval approach based on QuantityIE to answer analytical queries. Experimental results on information extraction and retrieval demonstrate the effectiveness of QuantityIE in extracting numerical information with high fidelity.
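
To make the idea of extracting numerical information from syntactic structure concrete, here is a minimal sketch of a single dependency-based rule. It is not QuantityIE itself: the abstract does not spell out the actual rules, and constituency parsing is not covered here. The `Token` and `QuantityFact` types, the `extract_quantities` function, and the hand-written parse in `SENTENCE` are all hypothetical and exist only for illustration. The dependency labels follow common conventions (`nummod`, `nsubj`); a real pipeline would obtain them from a parser.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    text: str
    pos: str   # coarse part-of-speech tag
    dep: str   # dependency label linking this token to its head
    head: int  # index of the head token (the root points to itself)


@dataclass(frozen=True)
class QuantityFact:
    subject: Optional[str]
    predicate: str
    value: float
    unit: str


def find_child(tokens: List[Token], head_idx: int, dep: str) -> Optional[str]:
    """Return the text of the first child of head_idx with the given label."""
    for i, tok in enumerate(tokens):
        if i != head_idx and tok.head == head_idx and tok.dep == dep:
            return tok.text
    return None


def extract_quantities(tokens: List[Token]) -> List[QuantityFact]:
    """Illustrative rule: a numeric modifier, the noun it modifies (unit),
    the governing verb (predicate), and that verb's subject."""
    facts = []
    for tok in tokens:
        if tok.dep != "nummod" or tok.pos != "NUM":
            continue
        try:
            value = float(tok.text.replace(",", ""))
        except ValueError:
            continue  # skip spelled-out numbers in this sketch
        unit = tokens[tok.head]
        # Climb the tree until a verb or the root is reached.
        pred_idx = unit.head
        while tokens[pred_idx].pos != "VERB" and tokens[pred_idx].head != pred_idx:
            pred_idx = tokens[pred_idx].head
        facts.append(
            QuantityFact(
                subject=find_child(tokens, pred_idx, "nsubj"),
                predicate=tokens[pred_idx].text,
                value=value,
                unit=unit.text,
            )
        )
    return facts


# Hand-written parse of "The company sold 300 cars in 2020." (assumed, not parser output)
SENTENCE = [
    Token("The", "DET", "det", 1),
    Token("company", "NOUN", "nsubj", 2),
    Token("sold", "VERB", "ROOT", 2),
    Token("300", "NUM", "nummod", 4),
    Token("cars", "NOUN", "dobj", 2),
    Token("in", "ADP", "prep", 2),
    Token("2020", "NUM", "pobj", 5),
    Token(".", "PUNCT", "punct", 2),
]

if __name__ == "__main__":
    for fact in extract_quantities(SENTENCE):
        print(fact)
```

In this toy parse, "300" is attached to "cars" as a numeric modifier and yields one fact, while "2020" is attached as a prepositional object and is ignored by this particular rule. Handling dates, ranges, and units spread across several tokens would need further rules.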
