Article

Role of twitter user profile features in retweet prediction for big data streams

Journal

MULTIMEDIA TOOLS AND APPLICATIONS
Volume 81, Issue 19, Pages 27309-27338

Publisher

SPRINGER
DOI: 10.1007/s11042-022-12815-1

Keywords

Twitter; Social media analysis; Retweet prediction; User behavior; User profiling; Big data analysis


This research explores the influence of numerical features extracted from user profiles on information sharing on Twitter. The study finds that user profile features predict retweets and user behavior more accurately than tweet content features, and that combining the two performs better still.
The study of the factors that influence information sharing on Twitter is a very active research area. This paper explores the impact of numerical features extracted from user profiles on retweet prediction from the real-time raw feed of tweets. The originality of this work lies in the proposed model's reliance on simple numerical features with minimal computational complexity, which makes it a scalable solution for big data analysis. The work proposes three new features derived from the tweet author's profile to capture the user's unique behavioral pattern: Author total activity, Author total activity per year, and Author tweets per year. The feature set is tested on a dataset of 100 million random tweets collected through the Twitter API. Binary-label regression achieved an accuracy of 0.98 with user-profile features and 0.99 when these were combined with tweet content features. Regression analysis to predict the retweet count gave an R-squared value of 0.98 with the combined features. Multi-label classification achieved an accuracy of 0.9 with the combined features and 0.89 with user-profile features alone. User-profile features outperformed tweet content features, and the two performed better still when combined. The model is suitable for near real-time analysis of live streaming data from the Twitter API and provides a baseline pattern of user behavior based solely on numerical features available from user profiles.
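The abstract names the three proposed author features but does not define them, so the following is only a minimal sketch of how such features might be derived from numeric profile fields. It assumes that "total activity" is the sum of the author's tweet count and like count, and that the per-year features divide by account age. The `AuthorProfile` class, its field names (modeled on the Twitter user object), and both helper functions are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class AuthorProfile:
    """Hypothetical subset of numeric fields from a tweet author's profile."""
    statuses_count: int    # tweets posted by the author
    favourites_count: int  # tweets the author has liked
    created_at: datetime   # account creation time (timezone-aware)


def account_age_years(profile: AuthorProfile, now: datetime) -> float:
    """Account age in years, floored at one day to avoid division by zero."""
    age_days = max((now - profile.created_at).total_seconds() / 86400.0, 1.0)
    return age_days / 365.25


def author_features(profile: AuthorProfile, now: datetime) -> dict:
    """Compute the three author features under the assumed definitions."""
    years = account_age_years(profile, now)
    # ASSUMPTION: total activity = tweets + likes; the paper may define it differently.
    total_activity = profile.statuses_count + profile.favourites_count
    return {
        "author_total_activity": float(total_activity),
        "author_total_activity_per_year": total_activity / years,
        "author_tweets_per_year": profile.statuses_count / years,
    }


# Example: an account that is exactly two (365.25-day) years old.
created = datetime(2018, 1, 1, tzinfo=timezone.utc)
example = AuthorProfile(statuses_count=1000, favourites_count=500, created_at=created)
features = author_features(example, now=created + timedelta(days=730.5))
print(features)
```

Each feature is a constant-time arithmetic operation on fields that arrive with every tweet, which fits the abstract's emphasis on low computational cost for streaming data.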

