Article

Scikit-Dimension: A Python Package for Intrinsic Dimension Estimation

ENTROPY (2021)

Journal

ENTROPY

Volume 23, Issue 10

Publisher

MDPI

DOI: 10.3390/e23101368

Keywords

intrinsic dimension; effective dimension; Python package; method benchmarking

Category

Physics, Multidisciplinary

Funding

Ministry of Science and Higher Education of the Russian Federation [075-15-2020-927]
French government under Agence Nationale de la Recherche, Investissements d'Avenir program [ANR19-P3IA-0001]
Association Science et Technologie
Institut de Recherches Internationales Servier
doctoral school Frontieres de l'Innovation en Recherche et Education Programme Bettencourt
UKRI Turing AI Acceleration Fellowship [EP/V025295/1]
EPSRC [EP/V025295/1] Funding Source: UKRI

Summary

This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension (ID) estimation. The package provides a uniform implementation of various known ID estimators based on the scikit-learn API, supporting evaluation of both global and local intrinsic dimension as well as generation of synthetic datasets. It is developed with tools for assessing code quality, coverage, unit testing and continuous integration.

Abstract

Dealing with uncertainty in applications of machine learning to real-life data critically depends on the knowledge of intrinsic dimensionality (ID). A number of methods have been suggested for the purpose of estimating ID, but no standard package to easily apply them one by one or all at once has been implemented in Python. This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation. The scikit-dimension package provides a uniform implementation of most of the known ID estimators based on the scikit-learn application programming interface to evaluate the global and local intrinsic dimension, as well as generators of synthetic toy and benchmark datasets widespread in the literature. The package is developed with tools assessing the code quality, coverage, unit testing and continuous integration. We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation for real-life and synthetic data.
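To make the idea of a global ID estimator behind a scikit-learn-style interface concrete, the following is a minimal sketch in plain Python. It is not code from scikit-dimension: the class name `TwoNNSketch` and the helper `embedded_square` are hypothetical, and the estimator is a simplified maximum-likelihood variant of the two-nearest-neighbour idea (the ratio of second to first neighbour distance is assumed to follow a Pareto law whose exponent is the intrinsic dimension). The toy dataset is a 2-dimensional unit square padded with zeros into a 5-dimensional ambient space.

```python
import math
import random


class TwoNNSketch:
    """Toy global ID estimator with a scikit-learn-style fit() interface.

    Hypothetical illustration only; not the scikit-dimension implementation.
    """

    def fit(self, X):
        log_ratios = []
        for i, p in enumerate(X):
            # Distances from point i to every other point, sorted ascending.
            dists = sorted(math.dist(p, q) for j, q in enumerate(X) if j != i)
            r1, r2 = dists[0], dists[1]
            if r1 > 0:  # skip exact duplicates, where the ratio is undefined
                log_ratios.append(math.log(r2 / r1))
        # Maximum-likelihood estimate of d under P(mu > x) = x**(-d), x >= 1.
        self.dimension_ = len(log_ratios) / sum(log_ratios)
        return self


def embedded_square(n, ambient_dim=5, seed=0):
    """Sample n points uniformly from a 2D unit square embedded in ambient_dim."""
    rng = random.Random(seed)
    return [
        [rng.random(), rng.random()] + [0.0] * (ambient_dim - 2)
        for _ in range(n)
    ]


X = embedded_square(400)
estimate = TwoNNSketch().fit(X).dimension_
print(f"ambient dimension: {len(X[0])}, estimated intrinsic dimension: {estimate:.2f}")
```

The estimate should land near 2 rather than 5, since the points only vary along two directions. Storing the result in a trailing-underscore attribute after `fit` follows the scikit-learn convention that the abstract refers to; the brute-force neighbour search is quadratic in the number of points and is only meant for small illustrative inputs.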
