Article

Feature selection procedures for combined density functional theory-artificial neural network schemes

Journal

PHYSICA SCRIPTA
Volume 96, Issue 6

Publisher

IOP PUBLISHING LTD
DOI: 10.1088/1402-4896/abf3f7

Keywords

feature selection; neural networks; density functional theory

Funding

  1. Ministry of Education and Scientific Research [PN 19 060 205]
  2. European Regional Development Fund through project CeCBID-EOSC [POC/397/1/1-124405]

The proposed workflow treats feature selection as a key step in optimizing combined density functional theory-machine learning schemes. Energy gaps of hybrid graphene-boron nitride nanoflakes are predicted using artificial neural networks, with training data obtained by associating structural information with the energy gap computed by DFT. Choosing appropriate feature vectors is crucial for accurate and efficient models.
We propose a workflow that includes feature selection as an essential step in optimizing combined density functional theory-machine learning (DFT-ML) schemes. Here, the energy gaps of hybrid graphene-boron nitride nanoflakes with randomly distributed domains are predicted using artificial neural networks (ANNs). The training data are obtained by associating structural information with the target quantity of interest, i.e. the energy gap, obtained from DFT calculations. Selecting proper feature vectors is important for an accurate and efficient ANN model. However, finding an optimal set of features is generally not trivial. We compare different approaches for selecting the feature vectors, ranging from random selection of the features to guided approaches such as removing the features with the lowest variance and using the mutual information regression selection technique. We show that the feature selection procedures provide a significant reduction of the input space dimensionality. In addition, a selection method based on the ranking of the cutting radius is proposed and evaluated. This may not only be important for establishing optimal ANN models, but may also offer insights into the minimum information required to map certain targeted properties.
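
The three generic selection strategies named in the abstract (random selection, removal of low-variance features, and ranking by mutual information with the target) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy data are synthetic and unrelated to the nanoflake dataset, the function names are hypothetical, and the mutual information is estimated with a crude equal-width histogram rather than whatever estimator the authors used. The abstract does not say which implementation was employed; scikit-learn's `VarianceThreshold` and `mutual_info_regression` are one common option.

```python
import math
import random
from collections import Counter
from statistics import pvariance

# Synthetic toy data (NOT from the paper): 40 samples, 4 features.
# f0 drives the target, f1 is constant, f2 is large-scale but irrelevant,
# f3 is small-scale and irrelevant.
X = [[i % 8, 1.0, 10 * (i // 8), 0.1 * ((i // 8) % 2)] for i in range(40)]
y = [0.5 * row[0] for row in X]  # stand-in for the target (e.g. an energy gap)


def top_k(scores, k):
    """Return the sorted indices of the k largest scores."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def random_selection(n_features, k, seed=0):
    """Baseline: pick k feature indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_features), k))


def variance_selection(X, k):
    """Keep the k highest-variance features, i.e. drop the lowest-variance ones."""
    columns = list(zip(*X))
    return top_k([pvariance(col) for col in columns], k)


def _bin(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: everything falls into one bin
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x, y, n_bins=4):
    """Histogram estimate of the mutual information I(x; y), in nats."""
    bx, by = _bin(x, n_bins), _bin(y, n_bins)
    n = len(x)
    px, py, pxy = Counter(bx), Counter(by), Counter(zip(bx, by))
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def mutual_info_selection(X, y, k, n_bins=4):
    """Keep the k features sharing the most information with the target."""
    columns = list(zip(*X))
    return top_k([mutual_information(col, y, n_bins) for col in columns], k)


print("random:  ", random_selection(len(X[0]), 2))
print("variance:", variance_selection(X, 2))
print("mutual information:", mutual_info_selection(X, y, 1))
```

On this toy data the variance criterion retains the large-scale but irrelevant feature alongside the informative one, whereas the mutual information ranking singles out the informative feature. This illustrates why a guided, target-aware criterion can shrink the input space further than an unsupervised one. The selected indices would then define the reduced feature vector passed to the ANN.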
