4.7 Article

Water end-use consumption in low-income households: Evaluation of the impact of preprocessing on the construction of a classification model

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 185, Issue -, Pages -

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2021.115623

Keywords

Low-income water end use; Demand management; Random forest model; Adaptive KNN model; ERP measure applied to KNN; Dataset preprocessing

Funding

  1. PROSAB
  2. FINEP
  3. Coordination for the Improvement of Higher Education Personnel-CAPES [CAPES/PRINT -41/2017, 88887.467907/2019-00]

Converting massive water flow data into smart information on water end uses is challenging, especially in low-income regions, where hydraulic devices controlled by globe valves make the data highly variable. Commercial software is commonly used to classify water end-use events, but improper preprocessing can lead to incorrect conclusions.
Transforming massive water flow data into disaggregated smart information on water end uses is a challenge that has motivated many researchers. It is even more difficult in low-income regions, where data variability is high because the predominant hydraulic devices are controlled by globe valves and therefore offer users many activation possibilities. Devices with standardized flow rates, such as washing machines or dishwashers, are exceptions. A common practice is to apply commercial software that classifies events at the end-use level and then to develop a personalized classification model that is better aligned with the database. If the preprocessing step is not performed properly, it can distort perceived device behaviors, which may lead to incorrect conclusions. To evaluate how this variability can interfere with commercial software responses, we developed classification models using a dataset preprocessed by Trace Wizard® as training data and then applied the trained models to a test dataset consisting of events that were authenticated by individual flow sensors. Our goal was to identify the degree of difference between the two datasets. The results demonstrate that when Trace Wizard® is applied, the features of each device differ from the original water consumption flow, indicating that data variability interferes with the credibility of feedback. Additionally, preprocessing tended to increase the volume, duration, and flow rates, giving the impression that consumption was higher than it really was. The constructed models were not able to overcome the distortions introduced by the Trace Wizard® classification. For example, fixtures had poor matches for several houses, with statistical measures below 50%.
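The evaluation design described above (train on software-labelled events, test on sensor-authenticated events) can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a plain k-nearest-neighbours classifier with Euclidean distance instead of the random forest and adaptive KNN/ERP models named in the keywords. The three features (volume, duration, mean flow rate), the function names, the choice of per-class recall as the "statistical measure", and every number in the two small datasets are assumptions made up for illustration.

```python
import math
from collections import Counter

# Hypothetical events: (volume in L, duration in s, mean flow in L/min), label.
# All values are invented for illustration; none come from the study.
train_events = [  # stands in for events labelled by commercial software
    ((40.0, 300.0, 8.0), "shower"),
    ((48.6, 360.0, 8.1), "shower"),
    ((1.0, 15.0, 4.0), "tap"),
    ((1.4, 20.0, 4.2), "tap"),
    ((6.0, 30.0, 12.0), "toilet"),
    ((6.5, 32.0, 12.2), "toilet"),
]
test_events = [  # stands in for events authenticated by individual flow sensors
    ((42.0, 310.0, 8.1), "shower"),
    ((0.8, 14.0, 3.5), "tap"),
    ((3.0, 25.0, 7.2), "tap"),  # a tap opened wider via its globe valve
    ((6.2, 31.0, 12.0), "toilet"),
]


def fit_scaler(rows):
    """Return per-feature minimum and span computed from the training rows."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return lows, spans


def scale(row, scaler):
    """Min-max scale one feature row using the training-set scaler."""
    lows, spans = scaler
    return tuple((x - lo) / sp for x, lo, sp in zip(row, lows, spans))


def knn_predict(train, query, k=3):
    """Majority vote among the k training events closest to the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def per_class_recall(train, test, k=3):
    """Fraction of test events of each end use that the model labels correctly."""
    scaler = fit_scaler([features for features, _ in train])
    scaled_train = [(scale(f, scaler), label) for f, label in train]
    hits, totals = Counter(), Counter()
    for features, truth in test:
        totals[truth] += 1
        if knn_predict(scaled_train, scale(features, scaler), k) == truth:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


recall = per_class_recall(train_events, test_events)
for label, value in sorted(recall.items()):
    print(f"{label}: {value:.0%} of sensor-verified events matched")
```

In this framing, a low recall for a fixture means that the features learned from the software-labelled training data do not match what the sensors recorded for that fixture, which is the kind of mismatch the abstract reports for several houses.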
