4.7 Article

Rule-based information extraction for mechanical-electrical-plumbing-specific semantic web

Journal

AUTOMATION IN CONSTRUCTION
Volume 135

Publisher

ELSEVIER
DOI: 10.1016/j.autcon.2021.104108

Keywords

Information extraction; MEP; Rule match; Named entity recognition; Relation extraction; Natural language understanding; Semantic web

Funding

  1. National Natural Science Foundation of China [51778336, 72091512]
  2. Tsinghua University - Glodon Joint Research Center for Building Information Modeling
  3. Social computing and information retrieval research center of Harbin Institute of Technology (HIT-SCIR)
  4. Stanford NLP Group

This paper proposes a rule-based approach for MEP information extraction and verifies its feasibility and efficiency through experiments.
Information extraction (IE), which aims to retrieve meaningful information from plain text, has been widely studied in general and professional domains to support downstream applications. However, because labeled data are scarce and professional mechanical, electrical and plumbing (MEP) information is complex, it is challenging to apply common deep learning IE methods to the MEP domain. To solve this problem, this paper proposes a rule-based approach for the MEP IE task, comprising a snowball strategy to collect large-scale MEP corpora, a suffix-based matching algorithm on text segments for named entity recognition (NER), and a dependency-path-based matching algorithm on the dependency tree for relation extraction (RE). Two ideas for RE, called meta linking and path filtering, are also proposed to discover as many out-of-pattern entities and relationships as possible. To verify the feasibility of the proposed approach, 65 MB of MEP corpora were collected as its input, and an MEP semantic web consisting of 15,978 entities and 65,110 relationship triples was established, with an accuracy of 81% for entities and 75% for relationship triples. A comparison experiment between classical deep learning models and the proposed rule-based approach was also carried out, showing that, in terms of extraction precision, the proposed method performs 37% and 49% better than the selected deep learning NER and RE models, respectively.
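
The abstract names suffix-based matching on text segments as the NER step but does not give its details. The following is only a minimal sketch of that general idea, not the authors' algorithm: a token that appears in a suffix lexicon is treated as the head of an entity, and the span is extended leftwards over modifier tokens. The lexicon `MEP_SUFFIXES`, the stop list `BOUNDARY_TOKENS`, the function `extract_entities`, the `max_len` limit and the English example sentence are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical suffix lexicon: head nouns that typically end an MEP entity name.
MEP_SUFFIXES: Set[str] = {"pump", "valve", "duct", "pipe", "fan"}

# Hypothetical boundary tokens: an entity span may not extend leftwards past these.
BOUNDARY_TOKENS: Set[str] = {
    "the", "a", "an", "and", "to", "is", "of", "by", "through", "with",
}


def extract_entities(tokens: List[str], max_len: int = 4) -> List[Tuple[int, int, str]]:
    """Return (start, end, text) spans whose last token is a known suffix.

    The scan runs right to left so that each suffix token claims its
    modifiers before any earlier token is examined.
    """
    entities: List[Tuple[int, int, str]] = []
    i = len(tokens) - 1
    while i >= 0:
        if tokens[i].lower() in MEP_SUFFIXES:
            start = i
            # Extend left over modifiers until a boundary token,
            # a non-alphabetic token, or the length limit is reached.
            while (
                start > 0
                and i - start + 1 < max_len
                and tokens[start - 1].lower() not in BOUNDARY_TOKENS
                and tokens[start - 1].isalpha()
            ):
                start -= 1
            entities.append((start, i + 1, " ".join(tokens[start:i + 1])))
            i = start - 1  # continue to the left of the matched span
        else:
            i -= 1
    entities.reverse()  # restore left-to-right order
    return entities


sentence = "The chilled water pump is connected to the supply air duct through a check valve"
spans = extract_entities(sentence.split())
for start, end, text in spans:
    print(start, end, text)
```

A sketch this simple over-extends whenever a verb or other non-modifier directly precedes a suffix token, which is one reason a real rule set would need richer segment boundaries than a fixed stop list.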
