Model-agnostic unsupervised detection of bots in a Likert-type questionnaire

Journal

BEHAVIOR RESEARCH METHODS

Publisher

SPRINGER
DOI: 10.3758/s13428-023-02246-7

Keywords

Aberrant responding; Bots; Outlier detection; Permutation test; Unsupervised learning

This article introduces a model-agnostic, unsupervised bot detection algorithm based on a permutation test with leave-one-out calculations of outlier statistics. For each respondent, the algorithm outputs a p value for the null hypothesis that the respondent is a bot. In a simulation study, the proposed algorithm outperformed naive alternatives in 95% sensitivity calibration and, in many scenarios, in classification accuracy.
A wealth of literature addresses the statistical detection of bots in online survey data using only responses to Likert-type items. This literature follows two traditions. One requires labeled data but forgoes strong model assumptions. The other requires a measurement model but forgoes the collection of labeled data. In the present article, we consider the setting where neither labeled data nor a measurement model is available, for an inventory that has the same number of Likert-type categories for all items. We propose a bot detection algorithm that is both model-agnostic and unsupervised. The algorithm involves a permutation test with leave-one-out calculations of outlier statistics. For each respondent, it outputs a p value for the null hypothesis that the respondent is a bot. Such an algorithm offers nominal sensitivity calibration that is robust to the bot response distribution. In a simulation study, we found that the proposed algorithm improved upon naive alternatives in terms of 95% sensitivity calibration and, in many scenarios, in terms of classification accuracy.
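The abstract does not spell out the outlier statistic or the permutation scheme, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It rests on several assumptions: the hypothetical function `bot_p_values`, the choice of squared Euclidean distance to the leave-one-out item means as the outlier statistic, and the choice to permute a respondent's answers across items (which is only meaningful when all items share the same response categories, as the abstract requires). Because the null hypothesis is "this respondent is a bot", a small p value is read here as evidence that the respondent is not a bot.

```python
import numpy as np

def bot_p_values(responses, n_perm=999, seed=0):
    """Illustrative permutation p values for H0: 'respondent i is a bot'.

    Assumptions (not taken from the paper): the outlier statistic is the
    squared Euclidean distance to the leave-one-out item means, and the
    null distribution comes from shuffling a respondent's own answers
    across items.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(responses, dtype=float)
    n, k = X.shape
    col_sums = X.sum(axis=0)
    p = np.empty(n)
    for i in range(n):
        # Leave-one-out reference: item means computed without respondent i.
        loo_mean = (col_sums - X[i]) / (n - 1)
        observed = np.sum((X[i] - loo_mean) ** 2)
        # Each row is an independent shuffle of respondent i's answers.
        perms = rng.permuted(np.tile(X[i], (n_perm, 1)), axis=1)
        perm_stats = np.sum((perms - loo_mean) ** 2, axis=1)
        # Small p: the observed answers fit the rest of the sample better
        # than content-blind shuffles do. The tolerance treats ties as ties.
        n_as_small = np.sum(perm_stats <= observed + 1e-9)
        p[i] = (1 + n_as_small) / (1 + n_perm)
    return p
```

Under these assumptions, flagging every respondent with p > 0.05 as a bot would aim for roughly 95% sensitivity, since a true bot's answers should look like one of their own shuffles. The statistic, the number of permutations, and the threshold are all placeholders that a real analysis would need to justify.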
