Proceedings Paper

Intellix - End-User Trained Information Extraction for Document Archiving

Publisher

IEEE
DOI: 10.1109/ICDAR.2013.28

Funding

  1. German Federal Ministry of Education and Research (BMBF) within the Framework Concept KMU Innovativ [01IS10011]

Abstract

Automatic information extraction from scanned business documents is especially valuable in document archiving. However, current systems for automated document processing still require substantial configuration work that only experienced users or administrators can perform. We present an approach to information extraction that builds purely on training examples provided by end users and intentionally omits known, efficient extraction techniques, such as rule-based extraction, that require intensive training and/or information extraction expertise. Our evaluation on a large corpus of business documents shows competitive results of above 85% F1-measure on 10 commonly used fields, such as document type, sender, receiver, and date. The system is deployed and used within the commercial document management system DocuWare.
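
For readers unfamiliar with the metric, the F1-measure quoted above is the harmonic mean of precision and recall. The sketch below shows one plausible way to compute it for a single extracted field. The function name `field_f1`, the exact-match criterion, and the sample values are illustrative assumptions, not the evaluation protocol used in the paper.

```python
def field_f1(predicted, gold):
    """Precision, recall and F1 for one field across a set of documents.

    `predicted` and `gold` are equal-length lists with one entry per
    document; None means "no value". Exact string match is an assumption
    made for this sketch, not the paper's matching rule.
    """
    tp = fp = fn = 0
    for p, g in zip(predicted, gold):
        if p is not None and p == g:
            tp += 1              # correct extraction
        else:
            if p is not None:
                fp += 1          # extracted a wrong or spurious value
            if g is not None:
                fn += 1          # missed the true value
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical "date" field over four documents.
predicted = ["2013-01-05", "2013-02-01", None, "2013-03-09"]
gold = ["2013-01-05", "2013-02-02", "2013-02-20", "2013-03-09"]
precision, recall, f1 = field_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case two of three extracted values are correct and two of four true values are found, so a wrong extraction lowers both precision and recall, while a missing extraction lowers only recall.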
