Article

A framework based on symbolic regression coupled with eXtended Physics-Informed Neural Networks for gray-box learning of equations of motion from data

Journal

Publisher

ELSEVIER SCIENCE SA
DOI: 10.1016/j.cma.2023.116258

Keywords

machine learning; gray-box learning; Allen-Cahn equation; phase-field modeling; X-PINN; symbolic regression

This article proposes a framework and algorithm for directly uncovering the unknown parts of nonlinear equations from data. By augmenting the original X-PINN method with flux continuity across domain interfaces, the approach demonstrates excellent accuracy in predicting the unknown part of the equation. The results are further validated through symbolic regression to determine the closed form of the equation's unknown part.
We propose a framework and an algorithm to uncover the unknown parts of nonlinear equations directly from data. The framework is based on eXtended Physics-Informed Neural Networks (X-PINNs) with domain decomposition in space-time, and we augment the original X-PINN method by imposing flux continuity across the domain interfaces. The well-known Allen-Cahn equation is used to demonstrate the approach. The Frobenius matrix norm is used to evaluate the accuracy of the X-PINN predictions, and the results show excellent performance. In addition, symbolic regression is employed to determine the closed form of the unknown part of the equation from the data, and the results confirm the accuracy of the X-PINN-based approach. To test the framework in a situation resembling real-world data, random noise is added to the datasets to mimic scenarios such as the presence of thermal noise or instrument errors. The results show that the framework is stable against a significant amount of noise. Finally, we determine the minimal amount of data required for training the neural network. For the systems studied here, the framework is able to predict the correct form of the underlying dynamical equation when at least 50% of the data is used for training. However, 50% of the data is not enough to predict the unknown coefficients accurately; doing so required training the network with at least 60% of the available data. © 2023 Elsevier B.V. All rights reserved.
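
The abstract mentions two ingredients that are easy to illustrate in isolation: scoring a predicted field with the Frobenius matrix norm, and adding random noise to a dataset. The following is a minimal sketch under stated assumptions, not the authors' implementation. The relative form of the error, the Gaussian noise model scaled by the standard deviation of the field, the function names, and the stand-in field `u_ref` are all hypothetical choices made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def relative_frobenius_error(predicted, reference):
    """Return ||predicted - reference||_F / ||reference||_F for 2-D arrays.

    Assumption: a relative error is used; the abstract only says that the
    Frobenius matrix norm evaluates the accuracy of the predictions.
    """
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diff_norm = np.linalg.norm(predicted - reference, ord="fro")
    return diff_norm / np.linalg.norm(reference, ord="fro")


def add_gaussian_noise(field, noise_level, rng):
    """Add zero-mean Gaussian noise with std = noise_level * std(field).

    Assumption: the noise model is hypothetical; the abstract only states
    that random noise is added to the datasets.
    """
    field = np.asarray(field, dtype=float)
    noise = rng.standard_normal(field.shape)
    return field + noise_level * np.std(field) * noise


# Hypothetical space-time grid and stand-in field (not data from the paper).
x = np.linspace(-1.0, 1.0, 64)
t = np.linspace(0.0, 1.0, 32)
X, T = np.meshgrid(x, t, indexing="ij")
u_ref = np.tanh(X / 0.1) * np.exp(-T)

rng = np.random.default_rng(0)
u_noisy = add_gaussian_noise(u_ref, noise_level=0.05, rng=rng)
err = relative_frobenius_error(u_noisy, u_ref)
print(f"relative Frobenius error at 5% noise: {err:.4f}")
```

In a workflow of this kind, `u_noisy` would stand in for the training data and the same error measure would be applied to the network's prediction against the noise-free reference.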
