Article

ldagibbs: A command for topic modeling in Stata using latent Dirichlet allocation

Journal

STATA JOURNAL
Volume 18, Issue 1, Pages 101-117

Publisher

SAGE PUBLICATIONS INC
DOI: 10.1177/1536867X1801800107

Keywords

st0515; ldagibbs; machine learning; latent Dirichlet allocation; Gibbs sampling; topic model; text analysis

In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
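
The two distributions described above can be illustrated with a toy collapsed Gibbs sampler. This is only a minimal sketch in Python, not the ldagibbs implementation. The function name lda_gibbs, its parameters, and the default values of alpha and beta are assumptions made for illustration. It returns theta (one topic distribution per document) and phi (one word distribution per topic).

```python
import random


def lda_gibbs(docs, n_topics, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not ldagibbs itself).

    docs is a list of tokenized documents (lists of strings).
    alpha and beta are symmetric Dirichlet priors; the defaults are assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables updated during sampling.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Assign every token to a random topic to start.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Each document as a probability distribution over topics.
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Each topic as a probability distribution over words.
    phi = [
        [(c + beta) / (topic_total[k] + n_words * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


if __name__ == "__main__":
    docs = [
        "stock market price trade".split(),
        "market trade stock bank".split(),
        "virus vaccine health patient".split(),
        "patient health vaccine doctor".split(),
    ]
    theta, phi, vocab = lda_gibbs(docs, n_topics=2)
    for row in theta:
        print([round(p, 2) for p in row])
```

Each row of theta sums to one across topics, and each row of phi sums to one across the vocabulary. Which documents end up sharing a topic depends on the random seed and on the priors, so the printed values should be read as one possible draw rather than a fixed result.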
