Article

What Do Programmers Discuss About Blockchain? A Case Study on the Use of Balanced LDA and the Reference Architecture of a Domain to Capture Online Discussions About Blockchain Platforms Across Stack Exchange Communities

Journal

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
Volume 47, Issue 7, Pages 1331-1349

Publisher

IEEE COMPUTER SOC
DOI: 10.1109/TSE.2019.2921343

Keywords

Blockchain; Peer-to-peer computing; Computer architecture; Smart contracts; Programming; Bitcoin; Empirical study; reference architecture; blockchain; Stack Overflow; Stack Exchange

Funding

  1. National Key Research and Development Program of China [2018YFB1003904]
  2. NSFC Program [61602403]
  3. Project of Science and Technology Research and Development Program of China Railway Corporation [P2018X002]
  4. Fundamental Research Funds for the Central Universities


Blockchain-related discussions have become increasingly common on programming Q&A websites, providing insights into practitioner interests and challenges, and aiding research communities in understanding developer needs in this new domain. LDA has been proposed for analyzing Stack Exchange discussions, but a simplistic use may overlook data diversity and domain-specific concepts. A balanced LDA approach combined with domain reference architecture captures and compares discussion topic popularity and impact, revealing interesting observations and trends among different blockchain platforms.
Blockchain-related discussions have become increasingly prevalent on programming Q&A websites, such as Stack Overflow and other Stack Exchange communities. Analyzing and understanding those discussions could provide insights into the topics of interest to practitioners, and help the software development and research communities better understand the needs and challenges facing developers as they work in this new domain. Prior studies propose using LDA to study Stack Exchange discussions. However, a simplistic use of LDA would capture the topics in discussions blindly, without accounting for the variety of the dataset or for domain-specific concepts. Specifically, LDA is biased towards larger corpora, and LDA-derived topics are not linked to higher-level domain-specific concepts. We propose an approach that combines balanced LDA (which ensures that the topics are balanced across a domain) with the reference architecture of a domain to capture and compare the popularity and impact of discussion topics across the Stack Exchange communities. Popularity measures the distribution of interest in discussions, and impact gauges the trend of popularity over time. We made a number of interesting observations, including: (1) Bitcoin, Ethereum, Hyperledger Fabric and Corda are the four most commonly discussed blockchain platforms in the Stack Exchange communities. (2) A broad range of topics is discussed across the various platforms, spanning distinct layers of our derived reference architecture. (3) The Application layer topics exhibit the highest popularity (33.2 percent) and the fastest growth in topic impact since November 2015. (4) The Application, API, Consensus and Network layer topics are discussed across the studied blockchain platforms, but exhibit different distributions in popularity. (5) The impact of architectural layer topics exhibits an upward trend, but grows at different speeds across the studied blockchain platforms. The breakdown of the topic impact across the architectural layers is relatively stable over time, except for the Hyperledger Fabric platform. Based on our findings, we highlight future directions and provide recommendations for practitioners and researchers.
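
The abstract describes balancing, popularity, and impact only informally, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that balancing means downsampling every platform's posts to the size of the smallest platform, that popularity is a topic's share of the total topic weight, and that impact is the mean topic weight per month. The function names and the `(month, {topic: weight})` input shape are hypothetical; the weights stand in for the per-post topic distribution an LDA model would produce.

```python
import random
from collections import defaultdict


def balance_corpora(posts_by_platform, seed=0):
    """Downsample each platform to the smallest platform's size (assumed
    reading of "balanced"), so a large corpus cannot dominate the topics."""
    rng = random.Random(seed)
    n = min(len(posts) for posts in posts_by_platform.values())
    return {name: rng.sample(posts, n)
            for name, posts in posts_by_platform.items()}


def popularity(doc_topics):
    """Share of the total topic weight held by each topic (sums to 1)."""
    totals = defaultdict(float)
    for _month, weights in doc_topics:
        for topic, weight in weights.items():
            totals[topic] += weight
    grand_total = sum(totals.values())
    return {topic: value / grand_total for topic, value in totals.items()}


def impact_by_month(doc_topics):
    """Mean topic weight per month, i.e. how popularity moves over time."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for month, weights in doc_topics:
        counts[month] += 1
        for topic, weight in weights.items():
            sums[month][topic] += weight
    return {month: {topic: value / counts[month]
                    for topic, value in topics.items()}
            for month, topics in sums.items()}


# Hypothetical per-post topic weights, already mapped to architecture layers.
docs = [
    ("2015-11", {"Application": 0.7, "Consensus": 0.3}),
    ("2015-11", {"Application": 0.5, "Network": 0.5}),
    ("2015-12", {"Application": 0.9, "API": 0.1}),
]

layer_popularity = popularity(docs)
layer_impact = impact_by_month(docs)
```

With this toy input, the Application layer holds the largest share of the total weight, and its monthly mean rises from November to December 2015. The numbers are illustrative only and are unrelated to the figures reported in the abstract.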

