Journal
WEB CONFERENCE 2018: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW2018)
Volume -, Issue -, Pages 1063-1072
Publisher
ASSOC COMPUTING MACHINERY
DOI: 10.1145/3178876.3186005
Keywords
Hierarchical Text Classification; Recursive Regularization; Graph-of-words; Deep Learning; Deep Convolutional Neural Networks
Funding
- NSFC program [61472022, 61772151, 61421003]
- Beijing Advanced Innovation Center for Big Data and Brain Computing
- China 973 Fundamental R&D Program [2014CB340304]
- Research Grants Council of the Hong Kong Special Administrative Region, China [26206717]
- Hong Kong CERG projects [16211214, 16209715, 16244616]
Text classification into a hierarchical taxonomy of topics is a common and practical problem. Traditional approaches simply use bag-of-words and have achieved good results. However, when there are many labels with different topical granularities, the bag-of-words representation may not be sufficient. Deep learning models have proven effective at automatically learning different levels of representations for image data, so it is interesting to study the best way to represent texts. In this paper, we propose a graph-CNN-based deep learning model that first converts texts to graphs-of-words and then applies graph convolution operations to convolve the word graph. The graph-of-words representation of texts has the advantage of capturing non-consecutive and long-distance semantics, while CNN models have the advantage of learning different levels of semantics. To further leverage the hierarchy of labels, we regularize the deep architecture with the dependencies among labels. Our results on both the RCV1 and NYTimes datasets show that we significantly improve large-scale hierarchical text classification over traditional hierarchical text classification methods and existing deep models.
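The graph-of-words idea mentioned in the abstract can be sketched as follows: each unique word becomes a node, and an edge links two words that co-occur within a small sliding window, which is what lets the representation capture non-consecutive, long-distance relations. This is a minimal illustration of the general technique, not the paper's exact construction; the window size and count-based edge weights are assumptions.

```python
from collections import defaultdict

def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph from a token list.

    Nodes are unique words; an edge connects two words appearing within
    `window` positions of each other, weighted by co-occurrence count.
    (Illustrative sketch; window size and weighting are assumptions.)
    """
    edges = defaultdict(int)
    for i, w in enumerate(tokens):
        # pair the current word with the next (window - 1) words
        for j in range(i + 1, min(i + window, len(tokens))):
            if w != tokens[j]:  # skip self-loops
                edges[tuple(sorted((w, tokens[j])))] += 1
    return dict(edges)

g = graph_of_words("deep learning models learn deep representations".split())
```

Here "deep" and "models" co-occur twice within the window, so that edge carries weight 2 even though the two occurrences of "deep" are far apart in the text.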
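The abstract's "regularize the deep architecture with the dependencies among labels" refers to recursive regularization over the label taxonomy: each child label's parameters are pulled toward its parent's, so nearby labels in the hierarchy share statistical strength. A minimal sketch of such a penalty term, with illustrative names and a hypothetical `lam` coefficient:

```python
def recursive_regularizer(weights, hierarchy, lam=1e-3):
    """Sum of squared distances between parent and child label weights.

    `weights` maps each label to its weight vector (list of floats);
    `hierarchy` is a list of (parent, child) edges in the taxonomy.
    (Illustrative sketch of the recursive-regularization idea; the
    names and the coefficient `lam` are assumptions.)
    """
    penalty = 0.0
    for parent, child in hierarchy:
        # penalize the child's deviation from its parent
        penalty += sum((wp - wc) ** 2
                       for wp, wc in zip(weights[parent], weights[child]))
    return lam * penalty
```

In training, this term would be added to the classification loss so that gradient descent jointly fits the data and smooths parameters along the taxonomy edges.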