Article

Parallel incremental efficient attribute reduction algorithm based on attribute tree

Journal

INFORMATION SCIENCES
Volume 610, Issue -, Pages 1102-1121

Publisher

ELSEVIER SCIENCE INC
DOI: 10.1016/j.ins.2022.08.044

Keywords

Attribute reduction; Attribute tree; Knowledge granularity; Parallel computing; Incremental learning; Spark framework

Funding

  1. National Natural Science Foundation of China [61976120, 62006128, 62102199]
  2. Natural Science Foundation of Jiangsu Province [BK20191445]
  3. Natural Science Key Foundation of Jiangsu Education Department [21KJA510004]
  4. General Program of the Natural Science Foundation of Jiangsu Province Higher Education Institutions [20KJB520009]
  5. Basic Science Research Program of Nantong Science and Technology Bureau [JC2020141, JC2021122]
  6. Postgraduate Research & Practice Innovation Program of Jiangsu Province [SJCX21_1446, SJCX22_1615]


Research on efficient attribute reduction for massive dynamic datasets is important. Traditional incremental methods are inefficient when applied to large datasets. This study proposes an incremental acceleration strategy based on attribute trees, clustering attributes into multiple trees to improve efficiency, and introducing a branch coefficient in the stop criterion to avoid redundant calculations.
Attribute reduction is an important application of rough sets. Efficiently reducing massive dynamic datasets has long been a major goal of researchers. Traditional incremental methods focus on reduction via updated approximations. However, these methods must evaluate all attributes and repeatedly calculate their importance. When such algorithms are applied to large datasets, their high time complexity makes reducing large decision systems inefficient. We propose an incremental acceleration strategy based on attribute trees to solve this problem. The key step is to cluster all attributes into multiple trees for incremental attribute evaluation. Specifically, we first select the appropriate attribute tree for attribute evaluation according to the attribute-tree correlation measure, reducing the time complexity. Next, a branch coefficient is added to the stop criterion; it increases with branch depth and triggers a jump out of the loop once the maximum threshold is reached. This avoids redundant calculation and improves efficiency. Furthermore, we propose an algorithm for incremental attribute reduction based on attribute trees using these improvements. Finally, a Spark parallel mechanism is added to parallelize data processing, implementing parallel incremental efficient attribute reduction based on the attribute tree. Experimental results on the Shuttle dataset show that the time consumption of our algorithm is more than 40% lower than that of the classical IARC algorithm while maintaining its good classification performance. In addition, the time is shortened by more than 87% from the benchmark after adding the Spark parallelization mechanism. (c) 2022 Elsevier Inc. All rights reserved.
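The evaluation loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`select_tree`, `evaluate_tree`), the toy correlation and significance measures, and the rule that a useful attribute resets the branch coefficient are all assumptions made here for clarity; the paper's actual measures are built on knowledge granularity, and the parallel version runs on Spark.

```python
# Hypothetical sketch of the attribute-tree evaluation loop from the abstract.
# The correlation and significance functions below are simplified stand-ins,
# NOT the paper's knowledge-granularity definitions.

def select_tree(trees, correlation):
    """Pick the attribute tree most correlated with the decision attribute,
    so only that tree's attributes need to be evaluated."""
    return max(trees, key=correlation)

def evaluate_tree(tree, significance, threshold=1.0, step=0.25):
    """Walk one attribute tree; the branch coefficient grows with branch
    depth and forces a jump out of the loop once it reaches `threshold`,
    avoiding redundant significance calculations on deep branches."""
    reduct, coeff = [], 0.0
    for attr in tree:
        coeff += step                 # branch coefficient grows with depth
        if significance(attr) > 0:    # attribute adds discriminative power
            reduct.append(attr)
            coeff = 0.0               # assumption: a useful hit resets the coefficient
        if coeff >= threshold:        # stop criterion reached: jump out early
            break
    return reduct

# Toy usage: trees are lists of attribute indices; {1, 2, 5} marks the
# attributes that (in this toy setup) actually matter for the decision.
trees = [[0, 3, 5], [1, 2], [4, 6, 7]]
useful = {1, 2, 5}
best = select_tree(trees, correlation=lambda t: sum(a in useful for a in t) / len(t))
print(best)                                                        # [1, 2]
print(evaluate_tree(best, significance=lambda a: int(a in useful)))  # [1, 2]
```

Note how a tree of entirely insignificant attributes is abandoned after `threshold / step` evaluations instead of being scanned to the end, which is the source of the claimed speed-up over evaluating every attribute.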

