Article

Trade-off between accuracy and fairness of data-driven building and indoor environment models: A comparative study of pre-processing methods

Journal

ENERGY
Volume 239

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.energy.2021.122273

Keywords

Fairness; Accuracy; Machine learning; Data-driven model; Building and indoor environment; Privacy

Data-driven models in the building domain have attracted much attention, but fairness-aware prediction of these models is a new research problem addressed in this paper. Different fairness definitions and pre-processing methods are introduced to improve fairness Type I and Type II while maintaining predictive accuracy. Sequential sampling is found to be a good option for improving fairness Type II with an acceptable decrease in accuracy.
Data-driven models have drawn extensive attention in the building domain in recent years, and their predictive accuracy depends on features or data distribution. Accuracy variation among users or periods creates a certain unfairness to some users. This paper addresses a new research problem called fairness-aware prediction of data-driven building and indoor environment models. First, three types of fairness definitions are introduced in building engineering. Next, Type I and Type II fairness are investigated. To achieve fairness Type I, we study the effect of suppressing the protected attribute (i.e., an attribute whose value cannot be disclosed or be discriminated against) from the inputs. To improve fairness Type II while preserving the predictive accuracy of data-driven building and indoor environment models, we propose three pre-processing methods for the training dataset: sequential sampling, reversed preferential sampling, and sequential preferential sampling. The proposed methods are compared to two existing pre-processing methods in a case study for lighting status prediction in an apartment building. Overall, 576 study cases were used to study the effect of these pre-processing methods on the accuracy and fairness of 12 series of lighting status prediction based on 2 types of feature combinations and 4 types of classifiers. Predictive results show that suppressing the protected attribute slightly influences overall predictive accuracy, while all pre-processing methods decrease it. However, in general, sequential sampling would be a good option for improving fairness Type II with an acceptable accuracy decrease. The fairness improvement performance of the other pre-processing methods varies depending on the applied features and classifiers. © 2021 Elsevier Ltd. All rights reserved.
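
The abstract does not spell out the formal definitions of fairness Type I and Type II, so the following is only an illustrative sketch of the two ideas it does describe: suppressing a protected attribute from the model inputs, and checking how much predictive accuracy varies among users. The helper names (`suppress_attribute`, `accuracy_by_group`, `accuracy_gap`), the `apartment_id` field, and the use of a max-minus-min accuracy gap as an unfairness proxy are all assumptions made for illustration, not the paper's metrics or sampling methods.

```python
from collections import defaultdict


def suppress_attribute(rows, protected):
    """Return copies of the feature dicts with the protected attribute removed."""
    return [{k: v for k, v in row.items() if k != protected} for row in rows]


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group label (e.g. a user or a period)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group):
    """Largest accuracy difference between any two groups (0.0 means equal accuracy)."""
    values = list(per_group.values())
    return max(values) - min(values)


# Toy lighting-status records; "apartment_id" plays the role of a
# hypothetical protected attribute.
rows = [
    {"hour": 8, "occupied": 1, "apartment_id": "A"},
    {"hour": 23, "occupied": 0, "apartment_id": "A"},
    {"hour": 8, "occupied": 1, "apartment_id": "B"},
    {"hour": 20, "occupied": 1, "apartment_id": "B"},
]
groups = [row["apartment_id"] for row in rows]
inputs = suppress_attribute(rows, "apartment_id")  # what a classifier would see

y_true = [1, 0, 1, 1]  # observed lighting status (toy values)
y_pred = [1, 0, 0, 1]  # predictions from some classifier (toy values)

per_group = accuracy_by_group(y_true, y_pred, groups)  # {"A": 1.0, "B": 0.5}
gap = accuracy_gap(per_group)  # 0.5
```

A smaller gap means accuracy is spread more evenly across groups. Comparing the gap and the overall accuracy before and after a pre-processing step is one simple way to look at the kind of trade-off the abstract refers to.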
