Article

Building characterization through smart meter data analytics: Determination of the most influential temporal and importance-in-prediction based features

Journal

ENERGY AND BUILDINGS
Volume 234, Issue -, Pages -

Publisher

ELSEVIER SCIENCE SA
DOI: 10.1016/j.enbuild.2020.110671

Keywords

Commercial building characterization; Smart meter data analytics; Machine learning; Feature extraction; Feature selection

The study focuses on extracting influential features from smart meter data to improve machine learning-based classification of non-residential buildings. Using advanced feature selection methods and a custom approach, it reduces the number of features needed for classification while increasing accuracy. Using fewer features also simplifies the feature extraction procedure and makes the influence of the important features easier to interpret.
This paper aims to determine the most influential features to extract from smart meter data to facilitate machine learning-based classification of non-residential buildings. Smart meter-driven remote estimation of the chosen characteristics (the buildings' performance class, use type, and operation group) is significantly helpful in building commissioning, benchmarking, and diagnostics applications. As the first step, state-of-the-art feature selection methods and a proposed customized approach are used to determine the most influential parameters in the pool of temporal features proposed in a previous study. Next, importance-in-prediction based features, generated from an hour-ahead load prediction pipeline, are proposed and added as input parameters to improve the classification accuracy. Finally, interpretations of some of the most influential features for the different classification targets are provided. The results show that, for estimating the buildings' use type, performing feature selection and adding importance-in-prediction based features reduces the number of features from 290 (the initial pool proposed in a previous study) to 29 while increasing the accuracy from 71% to 74%. Similarly, the number of features used for estimating the performance class decreases from 224 to 17, and the accuracy improves from 56% to 62%. Finally, using only 6 selected features, compared with 287 in the initial set, the accuracy for classifying the operation group increases from 98% to 100%. The proposed methodology, by selecting and using notably fewer features, thus simplifies the feature extraction procedure, improves the achieved accuracy, and makes it easier to interpret why some of the most important features are influential.
(C) 2020 Elsevier B.V. All rights reserved.
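
The idea of importance-in-prediction based features can be illustrated with a small sketch. The abstract does not state which predictor or importance measure the authors use, so everything below is an assumption made for illustration: the lag set `LAGS`, the least-squares hour-ahead model, the permutation-based importance score, and the synthetic `demo_load` series are all hypothetical. The sketch fits an hour-ahead predictor on one building's hourly load, measures how much the prediction error grows when each lagged input is shuffled, and returns the normalized scores as a feature vector that could be passed to a building classifier.

```python
import math
import random

# Hypothetical lag choice: previous hour, same hour yesterday, same hour last week.
LAGS = (1, 24, 168)


def lag_matrix(load, lags=LAGS):
    """Build rows [1, load[t-l1], load[t-l2], ...] with target load[t]."""
    start = max(lags)
    X = [[1.0] + [load[t - l] for l in lags] for t in range(start, len(load))]
    y = [load[t] for t in range(start, len(load))]
    return X, y


def fit_ols(X, y, ridge=1e-8):
    """Solve (X'X + ridge*I) w = X'y by Gaussian elimination with pivoting."""
    n = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
            b[r] -= f * b[c]
    w = [0.0] * n
    for c in reversed(range(n)):
        w[c] = (b[c] - sum(A[c][k] * w[k] for k in range(c + 1, n))) / A[c][c]
    return w


def mse(X, y, w):
    """Mean squared error of the linear prediction X.w against y."""
    return sum((sum(a * v for a, v in zip(r, w)) - t) ** 2
               for r, t in zip(X, y)) / len(y)


def importance_features(load, lags=LAGS, seed=0):
    """Normalized permutation importance of each lag in hour-ahead prediction."""
    rng = random.Random(seed)
    X, y = lag_matrix(load, lags)
    w = fit_ols(X, y)
    base = mse(X, y, w)
    raw = []
    for j in range(1, len(lags) + 1):  # column 0 is the intercept
        col = [r[j] for r in X]
        rng.shuffle(col)
        X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
        raw.append(max(mse(X_perm, y, w) - base, 0.0))
    total = sum(raw) or 1.0
    return [v / total for v in raw]


# Synthetic four-week hourly load: daily cycle, weekday offset, and noise.
_noise = random.Random(42)
demo_load = [
    10 + 5 * math.sin(2 * math.pi * t / 24)
    + (2 if (t // 24) % 7 < 5 else 0)
    + _noise.gauss(0, 0.5)
    for t in range(24 * 28)
]
features = importance_features(demo_load)
print(dict(zip(LAGS, features)))
```

Each building would yield one such vector, with one entry per lag. A tree-based predictor with its built-in importance scores would be a natural substitute for the linear model used here; the normalization only makes the scores comparable across buildings with different load magnitudes.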
