Article

Data Assimilation with Missing Data in Nonstationary Environments for Probabilistic Machine Learning Models

Journal

JOURNAL OF COMPUTATIONAL SCIENCE
Volume 74, Issue -, Pages -

Publisher

ELSEVIER
DOI: 10.1016/j.jocs.2023.102151

Keywords

Data assimilation; Missing data; Nonstationary; Uncertainty

This study extends the Probabilistic Optimal Interpolation (POI) data assimilation framework to handle nonstationary environments and missing data. The results show that POI can reduce uncertainty, but its performance is constrained by the limited accuracy of machine learning models in nonstationary environments.
In this study, we further develop Probabilistic Optimal Interpolation (POI), a data assimilation framework proposed for probabilistic Machine Learning (ML) models, for nonstationary environments with missing data, both of which are common in real-world situations. The dataset is based on a multi-scale Lorenz 96 chaotic system. Three types of nonstationarity (trend, heteroscedasticity, and random walk) are introduced into the dataset. In addition, the test datasets are masked at different missingness rates to evaluate POI performance when values are missing. Several filters are used to identify background noise for initializing the observation covariance, and the covariance is updated during real-time data assimilation specifically for nonstationary environments. The results show that heteroscedastic noise can be identified well, whereas random-walk noise is very difficult to analyze. Overall, POI can reduce uncertainty, but its performance can be significantly degraded by the limited accuracy of ML models in nonstationary environments. The impact of missing values is then examined and compared between stationary and nonstationary environments. As expected, both predictions and POI updates are more accurate at lower missingness rates, and whether or not POI is bypassed at missing points does not significantly affect overall performance. Finally, input evolution can perform well with POI under high noise levels and missingness rates in stationary environments, but it always yields worse results in nonstationary environments and is therefore not recommended.
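To make the experimental setup above more concrete, the following is a minimal sketch of its three ingredients: adding trend, heteroscedastic, or random-walk noise to a signal, masking observations at a chosen missingness rate, and applying an assimilation update that is bypassed wherever an observation is missing. It uses the textbook optimal interpolation (Kalman-type) analysis step with a direct observation operator, not the POI formulation of the paper, which is not given in this abstract. All function names, noise magnitudes, and the identity observation operator are assumptions made for illustration only.

```python
import numpy as np


def add_nonstationary_noise(signal, kind, sigma=0.1, rng=None):
    """Add one of three illustrative noise types to a 1-D signal.

    The drift slope and variance growth are arbitrary choices,
    not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = signal.shape[0]
    white = rng.normal(0.0, sigma, size=n)
    if kind == "trend":
        # White noise plus a linear drift from 0 to 1.
        return signal + white + np.linspace(0.0, 1.0, n)
    if kind == "heteroscedastic":
        # Noise standard deviation grows from sigma to 3 * sigma.
        return signal + white * np.linspace(1.0, 3.0, n)
    if kind == "random_walk":
        # Cumulative sum of white noise.
        return signal + np.cumsum(white)
    raise ValueError(f"unknown noise kind: {kind}")


def mask_observations(obs, missing_rate, rng=None):
    """Replace a random fraction of observations with NaN."""
    rng = np.random.default_rng() if rng is None else rng
    masked = np.asarray(obs, dtype=float).copy()
    masked[rng.random(masked.shape) < missing_rate] = np.nan
    return masked


def oi_update(x_b, B, y, R):
    """Textbook optimal interpolation update with missing observations.

    x_b: background (predicted) state, shape (n,)
    B:   background error covariance, shape (n, n)
    y:   observations of each state component, NaN where missing
    R:   observation error covariance, shape (n, n)
    Components with missing observations are excluded from the update;
    if everything is missing, the background is returned unchanged.
    """
    observed = ~np.isnan(y)
    if not observed.any():
        return x_b.copy(), B.copy()
    H = np.eye(x_b.size)[observed]            # select observed components
    S = H @ B @ H.T + R[np.ix_(observed, observed)]
    K = B @ H.T @ np.linalg.inv(S)            # gain matrix
    x_a = x_b + K @ (y[observed] - H @ x_b)   # analysis state
    P_a = (np.eye(x_b.size) - K @ H) @ B      # analysis covariance
    return x_a, P_a
```

In this sketch, components with an observation are pulled toward it and their variance shrinks, while components without one keep the background estimate and its uncertainty.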
