Proceedings Paper

Intellix - End-User Trained Information Extraction for Document Archiving

2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR) (2013)

Volume -, Issue -, Pages 101-105

Publisher

IEEE

DOI: 10.1109/ICDAR.2013.28

Categories

Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic

Funding

German Federal Ministry of Education and Research (BMBF) within the Framework Concept KMU Innovativ [01IS10011]

Abstract

Automatic information extraction from scanned business documents is especially valuable in document archiving. However, current systems for automated document processing still require substantial configuration work that only experienced users or administrators can perform. We present an approach to information extraction that builds purely on training examples provided by end users and intentionally omits known efficient extraction techniques, such as rule-based extraction, that require intensive training and/or information extraction expertise. Our evaluation on a large corpus of business documents shows competitive results of above 85% F1-measure on 10 commonly used fields such as document type, sender, receiver, and date. The system is deployed and used inside the commercial document management system DocuWare.
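
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how a per-field F1-measure could be computed for extracted document fields. It is not the paper's evaluation code: the function name `f1_per_field`, the dictionary layout, the sample values, and the convention that a wrong value counts as both a false positive and a false negative are all assumptions made for illustration.

```python
from collections import Counter


def f1_per_field(predicted, gold):
    """Return {field: F1} for parallel lists of per-document field dicts.

    Assumed convention (hypothetical, not taken from the paper): an exact
    match is a true positive; a wrong value counts as one false positive
    and one false negative; a missing value counts as a false negative.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for pred, ref in zip(predicted, gold):
        for field in set(pred) | set(ref):
            p, g = pred.get(field), ref.get(field)
            if p is not None and p == g:
                tp[field] += 1
            else:
                if p is not None:
                    fp[field] += 1
                if g is not None:
                    fn[field] += 1

    scores = {}
    for field in set(tp) | set(fp) | set(fn):
        extracted = tp[field] + fp[field]
        expected = tp[field] + fn[field]
        precision = tp[field] / extracted if extracted else 0.0
        recall = tp[field] / expected if expected else 0.0
        total = precision + recall
        scores[field] = 2 * precision * recall / total if total else 0.0
    return scores


# Illustrative data only: two documents, two fields.
predicted = [
    {"sender": "ACME", "date": "2013-01-05"},
    {"sender": "Foo", "date": "2013-02-01"},
]
gold = [
    {"sender": "ACME", "date": "2013-01-05"},
    {"sender": "Bar", "date": "2013-02-01"},
]
scores = f1_per_field(predicted, gold)
print(scores["sender"], scores["date"])  # prints 0.5 1.0
```

In this toy example the sender field is correct in one of two documents, giving precision and recall of 0.5 each, while the date field is correct in both.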
