4.6 Article

A Toolkit for Data-Driven Discovery of Governing Equations in High-Noise Regimes

Journal

IEEE ACCESS
Volume 10, Issue -, Pages 31210-31234

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/ACCESS.2022.3159335

Keywords

Data-driven discovery; noise mitigation; SINDy

This study presents a toolkit for discovering governing equations from time-series data in high-noise environments. By extending the SINDy regression method and adding an assessment step, the toolkit can extract sparse governing equations and their coefficients from high-noise time-series data.
We consider the data-driven discovery of governing equations from time-series data in the limit of high noise. The algorithms developed constitute an extensive toolkit of methods for circumventing the deleterious effects of noise in the context of the sparse identification of nonlinear dynamics (SINDy) framework. We offer two primary contributions, both focused on noisy data acquired from a system $\dot{x} = f(x)$. First, we propose, for use in high-noise settings, an extensive toolkit of critically enabling extensions for the SINDy regression method, to progressively cull functionals from an over-complete library and yield a set of sparse equations that regress to the derivative $\dot{x}$. This toolkit includes: (regression step) weight timepoints based on estimated noise, use ensembles to estimate coefficients, and regress using FFTs; (culling step) leverage linear dependence of functionals, and restore and protect culled functionals based on Figures of Merit (FoMs). In a novel Assessment step, we define FoMs that compare model predictions to the original time-series (i.e., $x(t)$ rather than $\dot{x}(t)$). These innovations can extract sparse governing equations and coefficients from high-noise time-series data (e.g., 300% added noise). For example, the method discovers the correct sparse libraries in the Lorenz system, with median coefficient estimate errors equal to 1%-3% (for 50% noise), 6%-8% (for 100% noise), and 23%-25% (for 300% noise). The enabling modules in the toolkit are combined into a single method, but the individual modules can be tactically applied in other equation discovery methods (SINDy or not) to improve results on high-noise data. Second, we propose a technique, applicable to any model discovery method based on $\dot{x} = f(x)$, to assess the accuracy of a discovered model in the context of non-unique solutions due to noisy data. Currently, this non-uniqueness can obscure a discovered model's accuracy and thus a discovery method's effectiveness. We describe a technique that uses linear dependencies among functionals to transform a discovered model into an equivalent form that is closest to the true model, enabling more accurate assessment of a discovered model's correctness.
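
For orientation, the baseline that the abstract extends can be sketched as follows. This is a minimal illustration of plain SINDy-style regression (sequentially thresholded least squares over a polynomial library), not the toolkit described in the paper: it uses noise-free simulated Lorenz data, and it omits the noise weighting, ensembling, FFT-based regression, functional restoration, and FoM assessment steps. The function names (`simulate`, `library`, `stlsq`), the quadratic library, the threshold of 0.1, and the step size are all assumptions chosen for this example.

```python
import numpy as np

def lorenz_rhs(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side f(x) of the Lorenz system, dx/dt = f(x)."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def simulate(x0, dt, n_steps):
    """Integrate the Lorenz system with a fixed-step RK4 scheme."""
    traj = np.empty((n_steps, 3))
    x = np.asarray(x0, dtype=float)
    for i in range(n_steps):
        traj[i] = x
        k1 = lorenz_rhs(x)
        k2 = lorenz_rhs(x + 0.5 * dt * k1)
        k3 = lorenz_rhs(x + 0.5 * dt * k2)
        k4 = lorenz_rhs(x + dt * k3)
        x = x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    return traj

def library(X):
    """Over-complete candidate library: all monomials up to degree 2."""
    x, y, z = X.T
    cols = [np.ones_like(x), x, y, z, x * x, x * y, x * z, y * y, y * z, z * z]
    names = ["1", "x", "y", "z", "xx", "xy", "xz", "yy", "yz", "zz"]
    return np.column_stack(cols), names

def stlsq(Theta, dXdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, cull small terms, refit."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                Xi[keep, k] = np.linalg.lstsq(
                    Theta[:, keep], dXdt[:, k], rcond=None
                )[0]
    return Xi

dt = 0.001
X = simulate([-8.0, 7.0, 27.0], dt, 10000)
# Central differences estimate the derivative; the one-sided edge points are dropped.
dXdt = np.gradient(X, dt, axis=0)[1:-1]
Theta, names = library(X[1:-1])
Xi = stlsq(Theta, dXdt, threshold=0.1)

# Print the discovered equations term by term.
for k, var in enumerate(["x", "y", "z"]):
    terms = [f"{Xi[j, k]:+.3f} {names[j]}" for j in range(len(names)) if Xi[j, k] != 0]
    print(f"d{var}/dt =", " ".join(terms))
```

On clean data this baseline should recover the Lorenz terms with coefficients near 10, 28, and 8/3. Finite differencing amplifies measurement noise by roughly a factor of 1/dt, which is why a plain regression of this kind degrades quickly as noise grows and why the abstract's regression, culling, and assessment extensions target that regime.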
