Split, Embed and Merge: An accurate table structure recognizer

Related references

Note: Only some of the references are listed.
Article Computer Science, Artificial Intelligence

TextMountain: Accurate scene text detection via instance segmentation

Yixing Zhu et al.

Summary: This paper introduces a novel scene text detection method named TextMountain, which utilizes border-center information and can effectively handle multi-oriented and curved text. Experimental results demonstrate better performance in terms of both accuracy and efficiency.

PATTERN RECOGNITION (2021)

Article Computer Science, Artificial Intelligence

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

Minghui Liao et al.

Summary: The paper introduces an end-to-end trainable neural network named Mask TextSpotter for scene text spotting, which combines text detection and recognition. By utilizing two-dimensional space via semantic segmentation, it simplifies the learning procedure and is able to handle text instances of irregular shapes effectively.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2021)

Article Computer Science, Artificial Intelligence

Accuracy vs. complexity: A trade-off in visual question answering models

Moshiur Farazi et al.

Summary: This paper systematically studies the trade-off between model complexity and performance in VQA models, with a specific focus on the impact of multi-modal fusion. Through thorough experimental evaluation, three proposals are presented, optimized for minimal complexity, balanced complexity-accuracy, and state-of-the-art VQA performance.

PATTERN RECOGNITION (2021)

Article Computer Science, Artificial Intelligence

Linguistically-aware attention for reducing the semantic gap in vision-language tasks

K. V. Gouthaman et al.

Summary: The paper proposes a Linguistically-aware Attention (LAT) mechanism to bridge the semantic gap between the visual and textual modalities in vision-language (V-L) tasks. LAT leverages object attributes and pre-trained language models to make the attention process linguistically aware, and it shows improved performance on various V-L tasks.

PATTERN RECOGNITION (2021)

Article Computer Science, Information Systems

Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition

Jianshu Zhang et al.

IEEE TRANSACTIONS ON MULTIMEDIA (2019)

Article Computer Science, Artificial Intelligence

ASTER: An Attentional Scene Text Recognizer with Flexible Rectification

Baoguang Shi et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2019)

Article Computer Science, Artificial Intelligence

Dense semantic embedding network for image captioning

Xinyu Xiao et al.

PATTERN RECOGNITION (2019)

Article Computer Science, Information Systems

DeCNT: Deep Deformable CNN for Table Detection

Shoaib Ahmed Siddiqui et al.

IEEE ACCESS (2018)

Article Computer Science, Artificial Intelligence

Deep Visual-Semantic Alignments for Generating Image Descriptions

Andrej Karpathy et al.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2017)

Article Computer Science, Artificial Intelligence

Watch, attend and parse: An end-to-end neural network based approach to handwritten mathematical expression recognition

Jianshu Zhang et al.

PATTERN RECOGNITION (2017)

Proceedings Paper Computer Science, Artificial Intelligence

Learning to Extract Semantic Structure from Documents Using Multimodal Fully Convolutional Neural Networks

Xiao Yang et al.

30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017) (2017)

Article Computer Science, Information Systems

Tree edit distance: Robust and memory-efficient

Mateusz Pawlik et al.

INFORMATION SYSTEMS (2016)