Article

Track, Attend, and Parse (TAP): An End-to-End Framework for Online Handwritten Mathematical Expression Recognition

Journal

IEEE TRANSACTIONS ON MULTIMEDIA
Volume 21, Issue 1, Pages 221-233

Publisher

IEEE (Institute of Electrical and Electronics Engineers)
DOI: 10.1109/TMM.2018.2844689

Keywords

Online handwritten mathematical expression recognition (OHMER); end-to-end framework; gated recurrent unit (GRU); guided hybrid attention (GHA); ensemble

Funding

  1. National Key R&D Program of China [2017YFB1002202]
  2. National Natural Science Foundation of China [61671422, U1613211]
  3. Key Science and Technology Project of Anhui Province [17030901005]
  4. MOE-Microsoft Key Laboratory of USTC

Abstract

In this paper, we introduce Track, Attend, and Parse (TAP), an end-to-end neural-network approach to online handwritten mathematical expression recognition (OHMER). The TAP architecture consists of a tracker and a parser. The tracker employs a stack of bidirectional recurrent neural networks with gated recurrent units (GRUs) to model the input handwritten traces, fully exploiting the dynamic trajectory information available in OHMER. Following the tracker, the parser adopts a GRU equipped with guided hybrid attention (GHA) to generate LaTeX notation. The proposed GHA is composed of a coverage-based spatial attention, a temporal attention, and an attention guider. Moreover, we demonstrate the strong complementarity between offline information (static-image input) and online information (ink-trajectory input) by blending a fully convolutional network based watcher into TAP. Unlike traditional methods, this end-to-end framework requires neither explicit symbol segmentation nor a predefined expression grammar for parsing. Validated on benchmarks published by the CROHME competition, the proposed approach outperforms state-of-the-art methods, achieving the best reported expression recognition accuracies of 61.16% on CROHME 2014 and 57.02% on CROHME 2016 using only the official training data.
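The encoder-decoder pipeline described in the abstract, a bidirectional GRU tracker over pen-trace points followed by a GRU decoder whose attention carries a coverage term, can be sketched roughly as follows. This is a minimal NumPy illustration with random stand-in weights, not the authors' implementation: the GRU cell, the coverage-augmented additive attention, the trace features, and all dimensions here are simplifying assumptions for exposition only.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUCell:
    """Minimal GRU cell with random (untrained) weights, illustrative only."""
    def __init__(self, in_dim, hid_dim):
        shape = (hid_dim, in_dim + hid_dim)
        self.Wz = rng.normal(0, 0.1, shape)  # update gate
        self.Wr = rng.normal(0, 0.1, shape)  # reset gate
        self.Wh = rng.normal(0, 0.1, shape)  # candidate state
    def step(self, x, h):
        xh = np.concatenate([x, h])
        z = sigmoid(self.Wz @ xh)
        r = sigmoid(self.Wr @ xh)
        h_tilde = np.tanh(self.Wh @ np.concatenate([x, r * h]))
        return (1 - z) * h + z * h_tilde

def encode(trace, fwd, bwd, hid):
    """Tracker: bidirectional GRU over trace points -> annotation vectors."""
    T = len(trace)
    hf, hb = np.zeros(hid), np.zeros(hid)
    F, B = [], [None] * T
    for t in range(T):                   # forward pass over the trace
        hf = fwd.step(trace[t], hf)
        F.append(hf)
    for t in reversed(range(T)):         # backward pass over the trace
        hb = bwd.step(trace[t], hb)
        B[t] = hb
    return np.stack([np.concatenate([F[t], B[t]]) for t in range(T)])

def coverage_attention(query, ann, cov, Wq, Wa, Wc, v):
    """Additive attention with a coverage term, in the spirit of the
    coverage-based spatial attention inside GHA (weights are stand-ins)."""
    e = np.array([v @ np.tanh(Wq @ query + Wa @ a + Wc * c)
                  for a, c in zip(ann, cov)])
    alpha = np.exp(e - e.max())
    alpha /= alpha.sum()                 # attention weights over trace points
    return alpha @ ann, cov + alpha      # context vector, updated coverage

# Hypothetical dimensions: 4-d point features, 8-d encoder/decoder states.
in_dim, hid, dec_hid, attn = 4, 8, 8, 16
fwd, bwd = GRUCell(in_dim, hid), GRUCell(in_dim, hid)
trace = rng.normal(0, 1, (20, in_dim))   # 20 fake (x, y, dx, dy)-style points
ann = encode(trace, fwd, bwd, hid)       # annotations, shape (20, 16)

dec = GRUCell(2 * hid, dec_hid)          # parser GRU consumes context vectors
Wq = rng.normal(0, 0.1, (attn, dec_hid))
Wa = rng.normal(0, 0.1, (attn, 2 * hid))
Wc = rng.normal(0, 0.1, attn)
v = rng.normal(0, 0.1, attn)

s = np.zeros(dec_hid)                    # decoder state
cov = np.zeros(len(trace))               # coverage: attention seen so far
for _ in range(3):                       # three decoding steps
    ctx, cov = coverage_attention(s, ann, cov, Wq, Wa, Wc, v)
    s = dec.step(ctx, s)
print(ann.shape, round(cov.sum(), 6))
```

In the real system each decoding step would additionally embed the previously emitted LaTeX symbol and feed a softmax over the symbol vocabulary; the coverage vector is what discourages the spatial attention from revisiting trace points it has already parsed.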

