Proceedings Paper

Garbage In, Garbage Out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?

Publisher

Association for Computing Machinery (ACM)
DOI: 10.1145/3351095.3372862

Keywords

machine learning; data labeling; human annotation; content analysis; training data; research integrity; meta-research

Funding

  1. Gordon & Betty Moore Foundation [GBMF3834]
  2. Alfred P. Sloan Foundation [2013-10-27]
  3. UC-Berkeley's Undergraduate Research Apprenticeship Program (URAP)

Abstract

Many machine learning projects for new application areas involve teams of humans who label data for a particular purpose, from hiring crowdworkers to the paper's authors labeling the data themselves. Such a task is quite similar to (or a form of) structured content analysis, a longstanding methodology in the social sciences and humanities with many established best practices. In this paper, we investigate to what extent a sample of machine learning application papers in social computing - specifically papers from ArXiv and traditional publications performing an ML classification task on Twitter data - give specific details about whether such best practices were followed. Our team conducted multiple rounds of structured content analysis of each paper, making determinations such as: whether the paper reports who the labelers were and what their qualifications were; whether labelers independently labeled the same items; whether inter-rater reliability metrics were disclosed; what level of training and/or instructions labelers were given; whether compensation for crowdworkers was disclosed; and whether the training data is publicly available. We find a wide divergence in whether such practices were followed and documented. Much of machine learning research and education focuses on what is done once a gold standard of training data is available, but we discuss issues around the equally important question of whether such data is reliable in the first place.
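One of the best practices the paper examines is disclosure of inter-rater reliability metrics. A common such metric is Cohen's kappa, which corrects raw agreement between two labelers for agreement expected by chance. A minimal sketch (the labels and labeler data below are hypothetical, for illustration only):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two labelers."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed proportion of items the two labelers agree on.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each labeler assigned labels independently
    # according to their own marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical labelers annotating the same ten tweets.
a = ["spam", "ham", "ham", "spam", "ham", "ham", "spam", "ham", "spam", "ham"]
b = ["spam", "ham", "ham", "ham", "ham", "ham", "spam", "spam", "spam", "ham"]
print(round(cohens_kappa(a, b), 3))  # → 0.583
```

Here raw agreement is 80%, but kappa is only 0.583 once chance agreement is subtracted, which is why reporting kappa (or a similar chance-corrected statistic) is more informative than reporting percent agreement alone.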
