Article

Unravelling the structure of the CSD cocrystal network using a fast near-optimal bipartisation algorithm for large networks

Journal

CRYSTENGCOMM
Volume 26, Issue 2, Pages 192-202

Publisher

ROYAL SOC CHEMISTRY
DOI: 10.1039/d3ce00978e

Networks are important for describing relationships between people, roads between cities, reactions between chemicals, and other interactions. Bipartiteness, the property that a network's vertices can be divided into two groups with edges running only between the groups, can facilitate the study of the network's structure. We have developed an algorithm that can find a near-optimal bipartisation within a reasonable time frame and used it to uncover the hidden structure of the CSD cocrystal network.
Networks, consisting of vertices connected by edges, are an important mathematical concept used to describe relationships between people, roads between cities, reactions between chemicals, and many other interactions. Such a network can be created by extracting cocrystals from the Cambridge Structural Database (CSD). This network describes which compounds can form cocrystals together and can, for example, be used to predict new cocrystals using link-prediction techniques. Bipartiteness is an important property of some networks wherein the vertices can be separated into two groups such that every edge connects a vertex in one group to a vertex in the other. Knowing whether a network is bipartite can make studying its structure considerably easier. If a network is bipartite except for a number of outlying edges, one might want to identify and remove those edges, thereby bipartising the network. The CSD cocrystal network was previously found to be close to bipartite. Truly bipartising it could improve the accuracy of link prediction and give insight into the hidden structure of the network. Many algorithms exist for finding the exact optimal bipartisation of a nearly bipartite network, but their running time increases exponentially with the size of the problem. In some cases, an exact solution is unnecessary and a 'good enough' bipartisation is sufficient. We have developed an algorithm that can find a near-optimal bipartisation within reasonable time, even for very large networks, and used it to unravel the structure of the CSD cocrystal network. We obtained a bipartisation that leaves 96% of the network intact, and we were able to identify 'universal' coformers that do not conform to the bipartite nature of the network. By applying a clustering algorithm to the bipartised network, we were also able to identify anticommunities of coformers.
Analysing the CSD cocrystal network using a fast near-optimal bipartisation algorithm reveals its hidden structures.
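
To make the idea of bipartising a nearly bipartite network concrete, the sketch below shows one simple heuristic in Python. It is not the algorithm developed in the paper, whose details are not given in this abstract. It is a generic greedy local search, assumed here purely for illustration: a breadth-first 2-colouring provides an initial split, vertices are then moved to the other group whenever that reduces the number of edges inside a group, and the edges that remain inside a group are reported as the ones to remove. The function name `greedy_bipartise` and the toy edge list are hypothetical.

```python
from collections import deque


def greedy_bipartise(edges):
    """Heuristically split vertices into two groups (illustrative sketch only).

    Returns (side, removed): `side` maps each vertex to 0 or 1, and `removed`
    lists the edges whose endpoints ended up in the same group, i.e. the
    edges that would have to be deleted to make the network bipartite.
    """
    # Build an undirected adjacency map; self-loops are kept out of the
    # adjacency sets because no assignment of groups can ever fix them.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    # Step 1: breadth-first 2-colouring as an initial guess.
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)

    # Step 2: local search. Moving a vertex that has more neighbours in its
    # own group than in the other strictly lowers the number of conflicting
    # edges, so this loop always terminates.
    improved = True
    while improved:
        improved = False
        for u in adj:
            same = sum(1 for v in adj[u] if side[v] == side[u])
            if 2 * same > len(adj[u]):
                side[u] = 1 - side[u]
                improved = True

    removed = [(u, v) for u, v in edges if side[u] == side[v]]
    return side, removed


if __name__ == "__main__":
    # Toy network: a triangle (odd cycle) with one extra edge attached.
    toy_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
    side, removed = greedy_bipartise(toy_edges)
    kept_fraction = 1 - len(removed) / len(toy_edges)
    print("removed edges:", removed)
    print("fraction of edges kept:", kept_fraction)
```

A local search of this kind only guarantees a locally optimal split, not the optimal one, which is the trade-off the abstract describes: giving up exactness in exchange for a running time that stays practical on large networks.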
