Journal
LEARNING THEORY AND KERNEL MACHINES
Volume 2777, Issue -, Pages 173-187
Publisher
SPRINGER-VERLAG BERLIN
DOI: 10.1007/978-3-540-45167-9_14
Keywords
clustering; comparing partitions; measures of agreement; information theory; mutual information
This paper proposes an information theoretic criterion for comparing two partitions, or clusterings, of the same data set. The criterion, called variation of information (VI), measures the amount of information lost and gained in changing from clustering C to clustering C'. The criterion makes no assumptions about how the clusterings were generated and applies to both soft and hard clusterings. The basic properties of VI are presented and discussed from the point of view of comparing clusterings. In particular, the VI is positive, symmetric and obeys the triangle inequality. Thus, surprisingly enough, it is a true metric on the space of clusterings.
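As an illustration of the criterion described above, here is a minimal Python sketch for hard clusterings. It assumes the standard definition VI(C, C') = H(C | C') + H(C' | C), which the abstract itself does not spell out; the function name `variation_of_information` and the label-list input format are hypothetical choices made for this example, and natural logarithms are used.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Sketch of VI between two hard clusterings of the same data set.

    Each argument is a sequence of cluster labels, one per data point.
    Assumes VI = H(C | C') + H(C' | C), measured in nats.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same data points")
    n = len(labels_a)
    if n == 0:
        raise ValueError("need at least one data point")

    count_a = Counter(labels_a)                  # cluster sizes in C
    count_b = Counter(labels_b)                  # cluster sizes in C'
    count_ab = Counter(zip(labels_a, labels_b))  # sizes of the intersections

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # log(n_ab / count_b[b]) contributes to H(C | C'),
        # log(n_ab / count_a[a]) contributes to H(C' | C).
        vi -= p_ab * (log(n_ab / count_a[a]) + log(n_ab / count_b[b]))
    return vi
```

Under these assumptions, `variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])` evaluates to 2 ln 2, since the two clusterings share no information, while two clusterings that differ only in how their clusters are named give 0.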