Article

How to make causal inferences using texts

SCIENCE ADVANCES (2022)

Journal

SCIENCE ADVANCES

Volume 8, Issue 42

Publisher

AMER ASSOC ADVANCEMENT SCIENCE

DOI: 10.1126/sciadv.abg2652

Category

Multidisciplinary Sciences

Funding

Eunice Kennedy Shriver National Institute of Child Health and Human Development of the National Institutes of Health [P2CHD047879]
National Science Foundation under the Resource Implementations for Data Intensive Research program [1738411, 1738288]
National Science Foundation, Directorate for Social, Behavioral & Economic Sciences, Division of Social and Economic Sciences [1738288, 1738411]

Summary

Text-as-data techniques can be used to test social science theories with large collections of text. However, estimating the latent representation of the text from the same data can introduce an identification problem or lead to overfitting. To address these risks, the paper introduces a split-sample workflow for making rigorous causal inferences.

Abstract

Text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories with large collections of text. Nearly all text-based causal inferences depend on a latent representation of the text, but we show that estimating this latent representation from the data creates underacknowledged risks: we may introduce an identification problem or overfit. To address these risks, we introduce a split-sample workflow for making rigorous causal inferences with discovered measures as treatments or outcomes. We then apply it to estimate causal effects from an experiment on immigration attitudes and a study on bureaucratic responsiveness.
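The split-sample idea described in the abstract can be illustrated with a minimal sketch: discover a text measure on one random half of the data, then hold that measure fixed and estimate the effect only on the other half. Everything below is an assumption made for illustration, not the authors' implementation: the function names (`split_sample`, `discover_measure`, `difference_in_means`), the synthetic documents, and in particular the discovery step, which picks a single word as a toy stand-in for whatever latent representation (for example, a topic model) a real analysis would estimate.

```python
import random
from statistics import mean


def split_sample(docs, treatments, train_frac=0.5, seed=0):
    """Randomly partition the data into a discovery (train) set and an estimation (test) set."""
    idx = list(range(len(docs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_frac)

    def pick(ids):
        return [docs[i] for i in ids], [treatments[i] for i in ids]

    return pick(idx[:cut]), pick(idx[cut:])


def difference_in_means(values, treatments):
    """Treated-minus-control difference in the mean of `values`."""
    treated = [v for v, t in zip(values, treatments) if t == 1]
    control = [v for v, t in zip(values, treatments) if t == 0]
    return mean(treated) - mean(control)


def discover_measure(docs, treatments):
    """Toy discovery step (hypothetical): pick the word whose usage differs most across arms."""
    vocab = sorted({w for d in docs for w in d})

    def gap(word):
        usage = [1.0 if word in d else 0.0 for d in docs]
        return abs(difference_in_means(usage, treatments))

    best = max(vocab, key=gap)

    def g(doc):
        # The discovered measure: does the document use the selected word?
        return 1.0 if best in doc else 0.0

    return g, best


# Synthetic data (assumed): randomized treatment, short responses as word sets.
rng = random.Random(42)
filler = ["policy", "people", "country", "law", "work"]
docs, treatments = [], []
for _ in range(400):
    t = rng.randint(0, 1)
    words = set(rng.sample(filler, 3))
    # Treated respondents mention "fear" more often than control respondents.
    if rng.random() < (0.8 if t == 1 else 0.2):
        words.add("fear")
    docs.append(words)
    treatments.append(t)

# Step 1: split. Step 2: discover the measure on the training half only.
(train_docs, train_t), (test_docs, test_t) = split_sample(docs, treatments)
g, word = discover_measure(train_docs, train_t)

# Step 3: apply the fixed measure to the held-out half and estimate the effect there.
effect = difference_in_means([g(d) for d in test_docs], test_t)
print(f"discovered word: {word}; held-out effect estimate: {effect:.2f}")
```

Because the measure is chosen without looking at the held-out half, the estimate in the last step is not inflated by the search over candidate measures, which is the overfitting risk the abstract points to.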
