4.7 Article Data Paper

Developing reliable hourly electricity demand data through screening and imputation

Journal

SCIENTIFIC DATA
Volume 7, Issue 1, Pages -

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/s41597-020-0483-x

Funding

  1. NASA's Interdisciplinary Research in Earth Science (IDS) program [80NSSC17K0416]
  2. Gates Ventures, Inc.
  3. Fund for Innovative Climate and Energy Research

Electricity usage (demand) data are used by utilities, governments, and academics to model electric grids for a variety of planning purposes (e.g., capacity expansion and system operation). The U.S. Energy Information Administration collects hourly demand data from all balancing authorities (BAs) in the contiguous United States. As of September 2019, we find that 2.2% of the demand data in its database are missing. Additionally, 0.5% of reported quantities are either negative or otherwise identified as outliers. With the goal of attaining non-missing, continuous, and physically plausible demand data to facilitate analysis, we developed a screening process to identify anomalous values. We then applied a Multiple Imputation by Chained Equations (MICE) technique to impute replacements for missing and anomalous values. We cross-validated the MICE technique by marking subsets of plausible data as missing and using the remaining data to predict the withheld values. The mean absolute percentage error of imputed values is 3.5% across all BAs. The cleaned data are published open access at DOI 10.5281/zenodo.3690240.
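
The screen, impute, and cross-validate workflow described in the abstract can be illustrated with a small, standard-library-only Python sketch. It is not the authors' implementation. The median/MAD outlier rule, the threshold `k`, the synthetic series, and all function names are assumptions made for illustration, and a simple same-hour-of-day mean imputer stands in for MICE. Only the overall shape (flag anomalies, impute, then hide plausible values and score the predictions with mean absolute percentage error) follows the text.

```python
import math
import random
import statistics


def screen(demand, k=10.0):
    """Return a copy with negative values and extreme outliers set to None.

    The median +/- k * MAD rule is an illustrative assumption,
    not the screening criteria used in the paper.
    """
    valid = [x for x in demand if x is not None and x >= 0]
    med = statistics.median(valid)
    mad = statistics.median([abs(x - med) for x in valid])
    screened = []
    for x in demand:
        if x is None or x < 0:
            screened.append(None)  # missing or negative
        elif mad > 0 and abs(x - med) > k * mad:
            screened.append(None)  # flagged as an outlier
        else:
            screened.append(x)
    return screened


def impute_hour_of_day(demand):
    """Fill None with the mean of observed values at the same hour of day.

    This is a simple stand-in for MICE, chosen to keep the sketch short.
    """
    by_hour = {}
    for i, x in enumerate(demand):
        if x is not None:
            by_hour.setdefault(i % 24, []).append(x)
    overall = statistics.fmean([x for x in demand if x is not None])
    return [
        x if x is not None else statistics.fmean(by_hour.get(i % 24, [overall]))
        for i, x in enumerate(demand)
    ]


def mape_cross_validation(demand, fraction=0.1, seed=0):
    """Hide a random subset of plausible values, impute them, and
    return the mean absolute percentage error (in %) on the hidden points."""
    rng = random.Random(seed)
    observed = [i for i, x in enumerate(demand) if x is not None and x > 0]
    held_out = rng.sample(observed, max(1, int(fraction * len(observed))))
    masked = list(demand)
    for i in held_out:
        masked[i] = None
    filled = impute_hour_of_day(masked)
    errors = [abs(filled[i] - demand[i]) / demand[i] for i in held_out]
    return 100.0 * statistics.fmean(errors)


# Synthetic hourly series: ten days of a daily cycle plus noise (made-up data).
noise = random.Random(42)
raw = [
    1000 + 200 * math.sin(2 * math.pi * (i % 24) / 24) + noise.gauss(0, 20)
    for i in range(240)
]
raw[5] = -50.0    # negative reading
raw[30] = 1.0e6   # implausible spike
raw[100] = None   # missing report

screened = screen(raw)
filled = impute_hour_of_day(screened)
mape = mape_cross_validation(screened)
print(f"flagged or missing: {sum(x is None for x in screened)}")
print(f"cross-validated MAPE: {mape:.2f}%")
```

Scoring only on values that passed screening mirrors the abstract's description of marking plausible data as missing, so the error estimate is not distorted by the anomalies themselves.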
