Article

Exploiting syntactic and neighbourhood attributes to address cold start in tag recommendation

INFORMATION PROCESSING & MANAGEMENT (2019)

Journal

INFORMATION PROCESSING & MANAGEMENT

Volume 56, Issue 3, Pages 771-790

Publisher

ELSEVIER SCI LTD

DOI: 10.1016/j.ipm.2018.12.009

Keywords

Tag recommendation; Syntactic patterns; NLP; Nearest neighbors

Categories

Computer Science, Information Systems; Information Science & Library Science

Funding

Google
Brazilian National Institute of Science and Technology for Web Research (MCT/CNPq/INCT Web Grant) [573871/2008-6]
FAPEMIG-PRONEX-MASWeb project - Models, Algorithms and Systems for the Web [APQ-01400-14]
CNPq
CAPES
FAPEMIG

Abstract

Many state-of-the-art tag recommendation methods were designed under the assumption that an initial set of tags is available for the target object. However, the effectiveness of these methods suffers greatly in a cold start scenario, in which those initial tags are absent (although other features of the target object, such as title and description, may be present). To tackle this problem, previous work extracts candidate terms directly from the text associated with the target object or from similar/related objects, and uses statistical properties of the occurrence of words, such as term frequency (TF) and inverse document frequency (IDF), to rank the candidate tags for recommendation. Yet these properties, in isolation, may not be enough to rank candidate tags effectively, especially when the candidates are extracted from the typically short and possibly low-quality texts associated with Web 2.0 objects. In this work, we analyze various syntactic patterns (e.g., syntactic dependencies between words in a sentence) of the text associated with Web 2.0 objects that can be exploited to identify and recommend tags. We also propose new tag quality attributes based on these patterns, including them as new evidence to be exploited by state-of-the-art Learning-to-Rank (L2R) based tag recommenders. We evaluate our tag recommendation methods using real data from four Web 2.0 applications, finding that, for three of our four datasets, the inclusion of our newly proposed syntactic tag quality attributes improves two L2R-based tag recommenders, with gains of up to 17% in precision. Furthermore, we find that the recommendations provided by these methods can be further expanded by exploiting the target object's neighbourhood (i.e., similar objects).
Our characterization and feature importance analysis results show that our syntactic attributes can indeed help discriminate relevant from non-relevant tags, being complementary to other, more traditional, tag quality attributes, particularly for datasets in which the textual features are short and/or of low quality.
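As an illustration of the TF and IDF statistics that the abstract attributes to previous work, the following is a minimal sketch of ranking candidate terms drawn from a target object's text in a cold start setting. It is not the paper's method: the tokenizer, the smoothed IDF variant, the function names (`tokenize`, `rank_candidates`) and the toy `collection` and `target` strings are all assumptions made for this example, and the syntactic and neighbourhood attributes proposed in the paper are not modelled here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def rank_candidates(target_text, collection, k=3):
    """Return the k terms of target_text with the highest TF * IDF score.

    TF is the raw count of the term in the target text. IDF is estimated
    from `collection` with add-one smoothing (an assumed variant), so that
    terms unseen in the collection still receive a finite weight.
    """
    docs = [set(tokenize(doc)) for doc in collection]
    n_docs = len(docs)
    tf = Counter(tokenize(target_text))
    scores = {}
    for term, freq in tf.items():
        df = sum(1 for doc in docs if term in doc)  # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = freq * idf
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy data: a background collection and one untagged object.
collection = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a jazz concert in the park",
]
target = "live jazz concert: the jazz trio in the park"

print(rank_candidates(target, collection, k=3))
```

With so little text, a frequent but uninformative word such as "the" still scores close to the top candidates, which is the kind of weakness of purely statistical attributes on short texts that the abstract points to.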
