Review

A Review on Data Preprocessing Techniques Toward Efficient and Reliable Knowledge Discovery From Building Operational Data

Journal

FRONTIERS IN ENERGY RESEARCH
Volume 9

Publisher

FRONTIERS MEDIA SA
DOI: 10.3389/fenrg.2021.652801

Keywords

data preprocessing; building operational data analysis; data science; knowledge discovery; building energy management

Funding

  1. National Natural Science Foundation of China [51908365, 71772125]
  2. Shenzhen Science and Technology Program [KQTD2018040816385085]
  3. Philosophical and Social Science Program of Guangdong Province [GD18YGL07]

This article reviews the role of data preprocessing in analysing massive building operational data, surveys the applications of various preprocessing techniques, and proposes recent data science techniques to address data challenges in the building field.
Rapid developments in data science and the increasing availability of building operational data have created great opportunities for developing data-driven solutions for intelligent building energy management. Data preprocessing is the foundation of valid data analysis. It is an indispensable step in building operational data analysis, given the intrinsic complexity of building operations and deficiencies in data quality. Data preprocessing refers to a set of techniques for enhancing the quality of raw data, such as outlier removal and missing value imputation. This article serves as a comprehensive review of data preprocessing techniques for analysing massive building operational data. A wide variety of data preprocessing techniques are summarised in terms of their applications in missing value imputation, outlier detection, data reduction, data scaling, data transformation, and data partitioning. In addition, three state-of-the-art data science techniques are proposed to tackle practical data challenges in the building field, namely data augmentation, transfer learning, and semi-supervised learning. In-depth discussions cover the pros and cons of existing preprocessing methods, possible directions for future research, and potential applications in smart building energy management. The research outcomes are helpful for the development of data-driven research in the building field.
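
To make three of the preprocessing tasks named above concrete (missing value imputation, outlier detection, and data scaling), the following is a minimal sketch in plain Python. It is not taken from the article: the choice of linear interpolation, the interquartile-range rule, and min-max scaling are assumptions made for illustration, as are the function names and the hourly power readings, which are hypothetical.

```python
import statistics


def impute_linear(values):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours; gaps at either end copy the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def iqr_outlier_flags(values, k=1.5):
    """Return True for each point outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    lower, upper = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < lower or v > upper for v in values]


def min_max_scale(values):
    """Rescale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical hourly power readings in kW: one gap (None) and one spike.
readings = [10.0, 12.0, None, 16.0, 15.0, 14.0, 95.0, 13.0]

filled = impute_linear(readings)
flags = iqr_outlier_flags(filled)
# Treat flagged points as missing and impute them again.
cleaned = impute_linear([None if f else v for v, f in zip(filled, flags)])
scaled = min_max_scale(cleaned)
```

In this sketch the gap is filled before outliers are flagged, so interpolated values feed into the quartiles. A real pipeline might order the steps differently or use more robust methods, depending on the data.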
