Article

How to make causal inferences using texts

Journal

SCIENCE ADVANCES
Volume 8, Issue 42, Pages -

Publisher

AMER ASSOC ADVANCEMENT SCIENCE
DOI: 10.1126/sciadv.abg2652

Keywords

-

Funding

  1. Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health [P2CHD047879]
  2. National Science Foundation under the Resource Implementations for Data Intensive Research program [1738411, 1738288]
  3. Divn Of Social and Economic Sciences
  4. Direct For Social, Behav & Economic Scie [1738288, 1738411] Funding Source: National Science Foundation


Text-as-data techniques can be used to test social science theories with large collections of text. However, estimating the latent representation of the text from the same data can introduce an identification problem or lead to overfitting. To address these risks, a split-sample workflow is introduced for making rigorous causal inferences.
Text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories with large collections of text. Nearly all text-based causal inferences depend on a latent representation of the text, but we show that estimating this latent representation from the data creates underacknowledged risks: we may introduce an identification problem or overfit. To address these risks, we introduce a split-sample workflow for making rigorous causal inferences with discovered measures as treatments or outcomes. We then apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic responsiveness.
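The abstract does not spell out the steps of the split-sample workflow, so the following Python sketch is only one plausible reading of it: split the units once, discover the text measure on the training half, freeze it, and estimate the effect on the held-out half. The toy data, the keyword-based discovery step, and every function name here (`split_sample`, `discover_measure`, `difference_in_means`) are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter
from statistics import mean


def split_sample(units, train_frac=0.5, seed=0):
    """Randomly partition units into a discovery (train) half and an estimation (test) half."""
    rng = random.Random(seed)
    shuffled = units[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def discover_measure(train):
    """Hypothetical discovery step: pick the word whose document frequency
    differs most between treated and control units in the training half,
    and return a fixed function g(text) -> 0/1 that flags that word."""
    def doc_freq(group):
        counts = Counter(w for u in group for w in set(u["text"].split()))
        return {w: c / len(group) for w, c in counts.items()}

    treated_freq = doc_freq([u for u in train if u["treated"]])
    control_freq = doc_freq([u for u in train if not u["treated"]])
    vocab = sorted(set(treated_freq) | set(control_freq))
    keyword = max(
        vocab,
        key=lambda w: abs(treated_freq.get(w, 0) - control_freq.get(w, 0)),
    )
    return lambda text: int(keyword in text.split())


def difference_in_means(test, g):
    """Estimate the treatment effect on the text-based outcome g, using only held-out units."""
    treated = [g(u["text"]) for u in test if u["treated"]]
    control = [g(u["text"]) for u in test if not u["treated"]]
    return mean(treated) - mean(control)


# Toy data (fabricated for illustration only): half of the treated
# responses mention "fear"; control responses never do.
units = [
    {
        "treated": i % 2 == 0,
        "text": "fear border policy" if i % 4 == 0 else "border policy",
    }
    for i in range(200)
]

train_half, test_half = split_sample(units)
g = discover_measure(train_half)          # measure is chosen on the training half only
effect = difference_in_means(test_half, g)  # effect is estimated on the untouched half
print(f"Estimated effect on the discovered measure: {effect:.2f}")
```

The design choice worth noting is that `g` is fixed before any test-half unit is examined, so the choice of measure cannot adapt to noise in the data used for estimation.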
