4.7 Article

Interdependence analysis on heterogeneous data via behavior interior dimensions

Journal

KNOWLEDGE-BASED SYSTEMS
Volume 279, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.knosys.2023.110893

Keywords

Interdependence; Heterogeneity; Behavior; Dimensions; Coupling

This paper proposes an interdependence analysis method that captures the varied functional relationships among attributes and among objects in heterogeneous data. By accounting for the coupling context and coupling weights, it builds attribute-based and object-based coupled data representation schemes. Experimental results indicate that the method effectively captures global couplings.
Interdependent dimensions that mix categorical and continuous variables are common in real-world heterogeneous behavioral data, and mixed-type objects are associated to varying degrees through coupling relationships. Such behavioral data is usually represented as an information table with explicit behavior exterior dimensions (i.e., the original attributes that describe data heterogeneity), under the assumption that dimensions are independent of one another and that objects are independent of one another. In practice, however, both variables and objects are very often interdependent, explicitly or implicitly, in functional and semantic ways. Little research has analyzed these interactions among dimensions and relationships among objects, so learning results tend to be more local than global. This paper proposes interdependence analysis to capture the varied functional relationships among attributes and among objects in heterogeneous data by addressing the coupling context and coupling weights in unsupervised learning. These global couplings cover the interactions within discrete dimensions, within numerical attributes, and across the two, as well as the relationships within an individual object and between multiple objects. They yield attribute-based and object-based coupled data representation schemes built on feature conversion and neighborhood calculation. In addition, we interpret both representation models via implicit behavior interior dimensions (i.e., the newly defined attributes that model data interdependence) to explain the intrinsic rationale for the superiority of the proposed methods. This work explicitly models the coupling of multiple attributes and the coupling of multiple objects for heterogeneous data sets, as demonstrated in several data mining and machine learning applications, such as cluster structure analysis, data clustering evaluation, and data density comparison. Moreover, a sensitivity study is carried out to tune the neighborhood parameter and the weight parameter, and a scalability analysis tests the robustness of both models. Extensive experiments on a series of synthetic data sets and multiple UCI data sets show that the proposed framework effectively captures the global couplings of both heterogeneous variables and mixed-type objects, and that it outperforms the traditional approach as well as state-of-the-art approaches, a result also verified by statistical analysis.
© 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
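The abstract gives no formulas, so the following is not the paper's method. As a generic illustration of what interdependence between a discrete and a numerical attribute can mean, this minimal sketch computes the classical correlation ratio (eta squared): the share of a numerical attribute's variance explained by grouping on a categorical attribute. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def correlation_ratio(categories, values):
    """Return eta squared: the fraction of the variance in `values`
    that is explained by grouping on `categories` (0 = none, 1 = all)."""
    if len(categories) != len(values) or not values:
        raise ValueError("inputs must be non-empty and of equal length")

    # Group the numerical values by their categorical label.
    groups = defaultdict(list)
    for cat, val in zip(categories, values):
        groups[cat].append(val)

    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # constant numerical attribute: no variance to explain

    # Between-group sum of squares, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Hypothetical toy data: one categorical and one numerical attribute.
colour = ["red", "red", "blue", "blue"]
height = [1.0, 1.2, 3.0, 3.2]
strongly_coupled = correlation_ratio(colour, height)

# Here both groups have the same mean, so the grouping explains nothing.
uncoupled = correlation_ratio(["a", "b", "a", "b"], [1.0, 1.0, 3.0, 3.0])
```

A value near 1 means the categorical attribute almost fully determines the numerical one, while a value near 0 means the two look independent under this measure. A representation that treats the two attributes as independent would ignore this kind of cross-type dependence.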
