Do data from mechanical Turk subjects replicate accuracy, response time, and diffusion modeling results?

BEHAVIOR RESEARCH METHODS (2021)

Journal

BEHAVIOR RESEARCH METHODS

Volume 53, Issue 6, Pages 2302-2325

Publisher

SPRINGER

DOI: 10.3758/s13428-021-01573-x

Keywords

Mechanical Turk data; Diffusion decision model; Response time and accuracy; Across-session variability

Funding

National Institute on Aging [R01-AG041176, R01-AG057841]


Automated Summary

The study used online data collection to examine subject performance. Subjects performed relatively well in the lexical decision and item recognition tasks, with only a few giving fast guesses or unstable RTs, whereas almost half of the subjects in the numerosity discrimination tasks gave a significant number of fast guesses and/or showed unstable RTs.

Abstract

Online data collection is being used more and more, especially in the face of the COVID crisis. To examine the quality of such data, we chose to replicate lexical decision and item recognition paradigms from Ratcliff et al. (Cognitive Psychology, 60, 127-157, 2010) and numerosity discrimination paradigms from Ratcliff and McKoon (Psychological Review, 125, 183-217, 2018) with subjects recruited from Amazon Mechanical Turk (AMT). Along with these tasks, we collected data from either an IQ test or a math computation test. Subjects in the lexical decision and item recognition tasks were relatively well-behaved, with only a few giving a significant number of responses with response times (RTs) under 300 ms at chance accuracy, i.e., fast guesses, and a few with unstable RTs across a session. But in the numerosity discrimination tasks, almost half of the subjects gave a significant number of fast guesses and/or unstable RTs across the session. Diffusion model parameters were largely consistent with the earlier studies as were correlations across tasks and correlations with IQ and age. One surprising result was that eliminating fast outliers from subjects with highly variable RTs (those eliminated from the main analyses) produced diffusion model analyses that showed patterns of correlations similar to the subjects with stable performance. Methods for displaying data to examine stability, eliminating subjects, and implementing RT data collection on AMT including checks on timing are also discussed.
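The fast-guess criterion in the abstract (responses under 300 ms at chance accuracy) can be illustrated with a minimal Python sketch. Only the 300 ms cutoff comes from the abstract. The `Trial` type, the function names, the 5% proportion threshold, the 0.5 chance level (assuming two-choice tasks), and the 0.1 tolerance are hypothetical choices for illustration, not the authors' exclusion procedure.

```python
from dataclasses import dataclass

FAST_RT_MS = 300  # cutoff stated in the abstract


@dataclass
class Trial:
    rt_ms: float    # response time in milliseconds
    correct: bool   # whether the response was correct


def fast_guess_summary(trials, cutoff_ms=FAST_RT_MS):
    """Count responses faster than the cutoff and compute their accuracy."""
    fast = [t for t in trials if t.rt_ms < cutoff_ms]
    n_fast = len(fast)
    prop_fast = n_fast / len(trials) if trials else 0.0
    # Accuracy is undefined when there are no fast responses.
    acc_fast = sum(t.correct for t in fast) / n_fast if n_fast else None
    return {"n_fast": n_fast, "prop_fast": prop_fast, "acc_fast": acc_fast}


def flag_fast_guesser(trials, max_prop_fast=0.05, chance=0.5, tol=0.1):
    """Flag a subject whose fast responses are frequent and near chance.

    max_prop_fast, chance, and tol are illustrative assumptions,
    not values reported in the paper.
    """
    summary = fast_guess_summary(trials)
    if summary["acc_fast"] is None:
        return False
    frequent = summary["prop_fast"] > max_prop_fast
    near_chance = abs(summary["acc_fast"] - chance) <= tol
    return frequent and near_chance


# Hypothetical subject: 10 fast responses (half correct) and 90 slower, correct ones.
guesser = (
    [Trial(200, True)] * 5
    + [Trial(200, False)] * 5
    + [Trial(650, True)] * 90
)
steady = [Trial(650, True)] * 100

print(flag_fast_guesser(guesser))  # flagged under these assumed thresholds
print(flag_fast_guesser(steady))   # not flagged: no fast responses
```

In practice the thresholds would need to be set per task, and a check like this would sit alongside an inspection of RT stability across the session, which the abstract treats as a separate criterion.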
