Article

A Random Digit Search (RDS) Method for Sampling of Blogs and Other User-Generated Content

Journal

SOCIAL SCIENCE COMPUTER REVIEW
Volume 29, Issue 3, Pages 327-339

Publisher

SAGE PUBLICATIONS INC
DOI: 10.1177/0894439310382512

Keywords

random walks; random digit search; web sampling; web crawling

Funding

  1. Hong Kong SAR Research Grants Council [CityU1456/06H]
  2. City University of Hong Kong Research Office [7002396]

Blogs are arguably the most popular genre of user-generated content (UGC), which makes them a gold mine for social science research. However, existing research on blogs has suffered from nonprobability samples collected either manually or by computerized crawling based on the random walks method. The current article presents a probability sampling method for blogs, called random digit search (RDS), that is adapted from the popular random digit dialing (RDD) method used in telephone surveys. The RDS method was tested in a study of Sina Blog, a popular blog service provider (BSP) in China. The results show that, while random walks sampling tends to oversample popular/active blogs, probability samples generated by RDS yield consistent and precise estimates of population parameters. Although RDS takes advantage of the numeric identification (ID) system used on Sina Blog, the general principles may be applicable to other BSPs and many other genres of UGC.
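The core idea of RDS, as the abstract describes it, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the procedure in detail, so the function name `random_digit_search`, the `id_exists` callback, the six-digit ID length, and the simulated population below are all assumptions made for illustration. The sketch draws numeric IDs uniformly at random, much as RDD draws telephone numbers, and keeps only those that correspond to an existing blog, so every existing blog has the same chance of selection regardless of how popular or well linked it is.

```python
import random


def random_digit_search(id_exists, id_digits, sample_size, rng=None, max_draws=1_000_000):
    """Draw uniformly random numeric IDs and keep those that exist.

    id_exists   -- hypothetical callback returning True if an ID maps to a blog
                   (on a real service this might be a lookup request)
    id_digits   -- assumed number of digits in the ID space
    sample_size -- number of valid IDs to collect
    max_draws   -- safety cap on the number of random draws
    """
    rng = rng or random.Random()
    upper = 10 ** id_digits
    sample, seen, draws = [], set(), 0
    while len(sample) < sample_size and draws < max_draws:
        draws += 1
        candidate = rng.randrange(upper)  # uniform over the whole ID space
        if candidate in seen:
            continue  # skip repeats so the sample is without replacement
        seen.add(candidate)
        if id_exists(candidate):
            sample.append(candidate)
    return sample, draws


# Simulated population (an assumption): 50,000 valid IDs in a six-digit space.
population_rng = random.Random(42)
valid_ids = set(population_rng.sample(range(10 ** 6), 50_000))

sample, draws = random_digit_search(
    lambda blog_id: blog_id in valid_ids,
    id_digits=6,
    sample_size=100,
    rng=random.Random(0),
)
hit_rate = len(sample) / draws  # share of draws that landed on an existing ID
```

In this sketch the hit rate serves as a rough estimate of how densely the ID space is populated, which is the kind of quantity a random walks crawler cannot supply because it only reaches blogs through links.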
