4.7 Article

Trade-off between accuracy and fairness of data-driven building and indoor environment models: A comparative study of pre-processing methods

Journal

ENERGY
Volume 239, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.energy.2021.122273

Keywords

Fairness; Accuracy; Machine learning; Data-driven model; Building and indoor environment; Privacy

Data-driven models in the building domain have attracted much attention, but fairness-aware prediction with these models is a new research problem, which this paper addresses. Different fairness definitions and pre-processing methods are introduced to improve Type I and Type II fairness while maintaining predictive accuracy. Sequential sampling is found to be a good option for improving Type II fairness with an acceptable decrease in accuracy.
Data-driven models have drawn extensive attention in the building domain in recent years, and their predictive accuracy depends on features or data distribution. Accuracy variation among users or periods creates a certain unfairness to some users. This paper addresses a new research problem called fairness-aware prediction of data-driven building and indoor environment models. First, three types of fairness definitions are introduced in building engineering. Next, Type I and Type II fairness are investigated. To achieve fairness Type I, we study the effect of suppressing the protected attribute (i.e., attribute whose value cannot be disclosed or be discriminated against) from inputs. To improve fairness Type II while preserving the predictive accuracy of data-driven building and indoor environment models, we propose three pre-processing methods for the training dataset: sequential sampling, reversed preferential sampling, and sequential preferential sampling. The proposed methods are compared to two existing pre-processing methods in a case study for lighting status prediction in an apartment building. Overall, 576 study cases were used to study the effect of these pre-processing methods on the accuracy and fairness of 12 series of lighting status prediction based on 2 types of feature combinations and 4 types of classifiers. Predictive results show that suppressing the protected attribute slightly influences overall predictive accuracy, while all pre-processing methods decrease it. However, in general, sequential sampling would be a good option for improving fairness Type II with an acceptable accuracy decrease. Fairness improvement performance of other pre-processing methods varies depending on applied features and classifiers. © 2021 Elsevier Ltd. All rights reserved.
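The abstract does not give the formal fairness definitions or the sampling procedures, so the following Python sketch is illustrative only. It shows the two ideas that can be read directly from the text: dropping a protected attribute from the inputs, and checking how much predictive accuracy varies between groups. The function names, the toy records, and the use of a max-minus-min accuracy gap as a rough unfairness indicator are all assumptions made for this example, not the paper's metrics or methods.

```python
from collections import defaultdict


def suppress_attribute(rows, protected):
    """Return copies of the feature records without the protected attribute."""
    return [{k: v for k, v in row.items() if k != protected} for row in rows]


def group_accuracy(y_true, y_pred, groups):
    """Compute accuracy separately for each group label."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {g: hits[g] / totals[g] for g in totals}


def accuracy_gap(per_group):
    """Spread between the best- and worst-served group (assumed indicator)."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical toy records: "apartment" plays the role of the protected attribute.
rows = [
    {"hour": 8, "occupied": 1, "apartment": "A"},
    {"hour": 13, "occupied": 0, "apartment": "A"},
    {"hour": 19, "occupied": 1, "apartment": "B"},
    {"hour": 22, "occupied": 1, "apartment": "B"},
]
inputs = suppress_attribute(rows, "apartment")  # features passed to a classifier

# Hypothetical lighting-status labels and predictions (1 = on, 0 = off).
y_true = [1, 0, 1, 1]
y_pred = [1, 0, 0, 1]
groups = [row["apartment"] for row in rows]

per_group = group_accuracy(y_true, y_pred, groups)
print(per_group)                # {'A': 1.0, 'B': 0.5}
print(accuracy_gap(per_group))  # 0.5
```

In this toy case the overall accuracy is 0.75, yet apartment B is served noticeably worse than apartment A, which is the kind of accuracy variation among users that the abstract describes as a source of unfairness.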
