4.7 Review

LDAShiny: An R Package for Exploratory Review of Scientific Literature Based on a Bayesian Probabilistic Model and Machine Learning Tools

Journal

MATHEMATICS
Volume 9, Issue 14

Publisher

MDPI
DOI: 10.3390/math9141671

Keywords

text mining; topic modeling; latent Dirichlet allocation; automatic literature review

Funding

  1. FCT (Fundação para a Ciência e a Tecnologia) [UID/MAR/04292/2020]
  2. Integrated Programme of SRTD SmartBioR [Centro-01-0145-FEDER-000018]
  3. Centro 2020 program, Portugal2020, European Union, through the European Regional Development Fund

LDAShiny is an open source application that provides an interactive graphical user interface for reviewing scientific literature with the latent Dirichlet allocation (LDA) algorithm and machine learning tools. In the demonstration, 14 topics were sufficient to describe the reviewed corpus, and the research topics on the species Oreochromis niloticus were mainly related to growth performance, body weight, heavy metals, genetics, and water quality.
In this paper we propose an open source application called LDAShiny, which provides a graphical user interface for reviewing scientific literature with the latent Dirichlet allocation algorithm and machine learning tools in an interactive, easy-to-use way. The implemented procedures follow the familiar stages of topic modeling: preprocessing, modeling, and postprocessing. The tool can be used by researchers or analysts who are not familiar with the R environment. We demonstrate the application by reviewing the literature published over the last three decades on the species Oreochromis niloticus. In total we reviewed 6196 abstracts of articles indexed in Scopus. LDAShiny allowed us to build the document-term matrix, and preprocessing reduced the vocabulary from 530,143 unique terms to 3268. The implemented options thus reduced both the number of unique terms and the computational cost. The results showed that 14 topics were sufficient to describe the corpus used in the demonstration. We also found that the general research topics on this species were related to growth performance, body weight, heavy metals, genetics, and water quality, among others.
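The preprocessing step described in the abstract (building a document-term matrix and shrinking its vocabulary) can be illustrated with a minimal sketch. This is not LDAShiny's code, which is written in R. The function names `tokenize` and `build_dtm`, the stopword list, the document-frequency thresholds, and the four toy abstracts are all assumptions made up for illustration. The pruning rule used here (drop terms that appear in too few or too many documents) is one common option and may differ from the options the package actually implements.

```python
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "of", "and", "in", "a", "on", "to", "is", "was", "were"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens only, and drop stopwords."""
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return [w for w in cleaned.split() if w not in STOPWORDS]


def build_dtm(docs, min_doc_frac=0.0, max_doc_frac=1.0):
    """Return (vocab, dtm), where dtm[i][j] counts term vocab[j] in docs[i].

    Terms whose document frequency (share of documents containing the term)
    falls outside [min_doc_frac, max_doc_frac] are pruned from the vocabulary.
    """
    tokenized = [tokenize(d) for d in docs]

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(docs)
    vocab = sorted(
        term for term, df in doc_freq.items()
        if min_doc_frac <= df / n_docs <= max_doc_frac
    )
    column = {term: j for j, term in enumerate(vocab)}

    # Fill the count matrix, skipping pruned terms.
    dtm = [[0] * len(vocab) for _ in docs]
    for i, tokens in enumerate(tokenized):
        for term in tokens:
            j = column.get(term)
            if j is not None:
                dtm[i][j] += 1
    return vocab, dtm


# Toy abstracts invented for this example; they are not taken from the corpus.
abstracts = [
    "Growth performance of Nile tilapia fed plant protein diets.",
    "Heavy metals in tilapia tissue and water quality of the lake.",
    "Genetic selection improves body weight and growth of tilapia.",
    "Water quality and heavy metals affect tilapia growth.",
]

full_vocab, _ = build_dtm(abstracts)
vocab, dtm = build_dtm(abstracts, min_doc_frac=0.5, max_doc_frac=0.75)
print(len(full_vocab), "->", len(vocab))  # vocabulary size before and after pruning
print(vocab)
```

On this toy input the vocabulary shrinks from 20 terms to 5, because terms that occur in only one abstract and the term that occurs in every abstract are both dropped. A smaller matrix of this kind is what makes the later topic-model fit cheaper, which is the effect the abstract reports at much larger scale.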
