Article

Enhanced Named Entity Recognition algorithm for financial document verification

Journal

JOURNAL OF SUPERCOMPUTING
Volume -, Issue -, Pages -

Publisher

SPRINGER
DOI: 10.1007/s11227-023-05371-4

Keywords

Automatic document verification; Named Entity Recognition; Document summarization; Spell-checker; Natural language processing

Many enterprise systems require extensive manual verification for document-intensive processes. To address this challenge, this research proposes a general automatic or semi-automatic document verification system. A verification model based on entities within financial documents was tested, reaching a document verification accuracy of 88.80% with an average verification time of 2.48 s.
Many enterprise systems are document-intensive and require extensive manual verification. Manual verification is time-consuming and still leaves errors undetected. A general automatic or semi-automatic document verification system would therefore be useful. However, because of the nature of natural language, context is an important factor. In this research, the target context is financial documents, which have attracted considerable interest recently. An automatic document verification model based only on entities (those most commonly encountered in financial documents) was tested. A summary report was verified against the original documents by searching the original documents for matches to the entities in the summary. The success of the verification process was evaluated by comparison with named entity algorithms from the literature. A Kaggle data set suited to this purpose was used for matching entities from the summary within the original documents. For financial documents only, the average document verification accuracy of the named entity recognition algorithms was 85.36%, whereas the proposed entity recognition algorithm reached 88.80%. The average document verification times of the tested algorithms and the developed algorithm were 2.43 s and 2.48 s, respectively. In conclusion, applying both the BERT-base-cased classification model and rule-based approaches in a context-specific way enhances the entity verification process at an insignificant time cost. Consequently, even with limited data and rules, there appears to be an opportunity to automate the document verification process with the support of both the BERT-base-cased classification model and rule-based approaches.
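
The entity-matching step described in the abstract (entities found in the summary are looked up among the entities of the original document) can be illustrated with a minimal sketch. This is not the authors' algorithm: the regular-expression patterns below are hypothetical stand-ins for the BERT-base-cased model and the paper's rules, and the names `ENTITY_PATTERNS`, `extract_entities`, and `verify_summary` are assumptions introduced only for illustration.

```python
import re

# Hypothetical toy rules standing in for a real NER model.
# They are illustrative assumptions, not the rules used in the paper.
ENTITY_PATTERNS = {
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?%"),
    "YEAR": re.compile(r"\b(?:19|20)\d{2}\b"),
}


def extract_entities(text):
    """Return the set of (label, surface form) pairs found by the toy rules."""
    found = set()
    for label, pattern in ENTITY_PATTERNS.items():
        for match in pattern.finditer(text):
            found.add((label, match.group()))
    return found


def verify_summary(summary, original):
    """Look up each summary entity among the entities of the original.

    Returns a pair: the share of summary entities that also occur in the
    original, and the sorted list of summary entities that do not.
    """
    summary_entities = extract_entities(summary)
    original_entities = extract_entities(original)
    if not summary_entities:
        # Nothing to check, so nothing contradicts the original.
        return 1.0, []
    missing = sorted(summary_entities - original_entities)
    matched = len(summary_entities) - len(missing)
    return matched / len(summary_entities), missing


original = "Revenue rose 12.5% to $4,200,000 in 2022."
summary = "In 2022 revenue rose 12.5% to $4,300,000."
score, missing = verify_summary(summary, original)
```

In this made-up example the year and the percentage in the summary match the original, while the monetary amount does not, so the mismatch is returned in `missing` for a reviewer to inspect. A real system would replace `extract_entities` with a trained recognizer and would likely normalize surface forms before comparing them.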
