Article

The generalizability of pre-processing techniques on the accuracy and fairness of data-driven building models: A case study

ENERGY AND BUILDINGS (2022)

Journal

ENERGY AND BUILDINGS

Volume 268, Issue -, Pages -

Publisher

ELSEVIER SCIENCE SA

DOI: 10.1016/j.enbuild.2022.112204

Keywords

Fairness; Generalizability; Accuracy; Data-driven Model; Building

Categories

Construction & Building Technology; Energy & Fuels; Engineering, Civil

Summary

In recent years, data-driven building models have become an active research topic, driven by the large volume of data collected from buildings. This study proposes a sequentially balanced sampling (SBS) technique to address uneven data volume across conditions and the fairness problems that follow from it. SBS is compared with four existing pre-processing techniques and shows comparable performance in improving accuracy and fairness.

Abstract

In recent years, the massive data collected from buildings have made the development and application of data-driven building models (DDBMs) a hot research topic. Because data volume varies across conditions, existing DDBMs can show different accuracy for different users or periods, which may in turn create fairness problems. Balancing the training dataset across conditions with pre-processing techniques could help solve these issues. In this study, a sequentially balanced sampling (SBS) technique is proposed. Its generalizability in improving the fairness and preserving the accuracy of DDBMs is compared with four existing pre-processing techniques: random sampling (RS), sequential sampling (SS), reversed preferential sampling (RPS), and sequential preferential sampling (SPS). In total, 4960 cases are run, in which these pre-processing techniques are applied to the training dataset before developing 4 types of classifiers for one-week-ahead lighting status prediction of 155 lights in 16 apartments over a year. The collected data show 5 distribution modes. The newly proposed SBS shows performance comparable to RPS: both significantly improve predictive accuracy for minority classes but decrease accuracy for majority classes. SS and SPS, on the other hand, show a slight accuracy improvement for minority classes at the acceptable price of an accuracy decrease for majority classes. In terms of fairness improvement, SBS, RS, and RPS can effectively increase the recall rate. However, RS and RPS have a more negative effect on the accuracy rate and the specificity rate. The results of this study provide guidance for researchers in selecting proper pre-processing techniques to improve the preferred predictive performance under different data distributions. © 2022 Elsevier B.V. All rights reserved.
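The abstract does not describe how SBS, SS, RPS, or SPS work internally, so none of them is reproduced here. As a rough illustration of the general idea only, the sketch below balances a training set by generic random undersampling and computes the three rates the abstract refers to (accuracy, recall, specificity). The function names `random_undersample` and `binary_rates`, the binary 0/1 labels, and the choice of class 1 as the positive class are all assumptions made for this example, not details taken from the paper.

```python
import random


def random_undersample(samples, labels, seed=0):
    """Generic random undersampling (illustrative, not the paper's method).

    Keeps a random subset of each class so that every class ends up with
    as many samples as the smallest class.
    """
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # Size of the smallest (minority) class.
    n_min = min(len(xs) for xs in by_class.values())

    # Draw n_min samples without replacement from every class.
    balanced = []
    for y, xs in by_class.items():
        balanced.extend((x, y) for x in rng.sample(xs, n_min))

    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


def binary_rates(y_true, y_pred, positive=1):
    """Accuracy, recall, and specificity for a binary prediction task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1

    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Recall: share of true positives that were predicted as positive.
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of true negatives that were predicted as negative.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

Under these assumptions, a model trained on the balanced output would typically be evaluated on an untouched test set with `binary_rates`, which makes the trade-off the abstract describes visible: recall on the minority class can rise while accuracy and specificity fall.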
