Self-Adaptive PINNs using a Soft Attention Mechanism

Additional learning of multiplicative soft attention masks that weight each training point individually

  1. Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism
    L. McClenny and U. Braga-Neto
    In AAAI Symposium on Combining Artificial Intelligence and Machine Learning with Physics Sciences, 2021
TLDR: The paper introduces SA-PINNs, a new adaptive training method for Physics-Informed Neural Networks (PINNs) that allows the neural network to autonomously focus on difficult regions of the solution by using trainable adaptation weights applied to each training point individually.


Context: Depending on the PDE, PINNs may fail to fit the residual, boundary, or initial conditions in “rapidly varying regions” of the solution.

Proposed solution: The neural network learns which regions of the solution are difficult and is forced to focus on them. The self-adaptation weights specify a multiplicative soft attention mask, like the ones used in computer vision. Each training point is associated with its own self-adaptation weight. More formally, trainable, nonnegative self-adaptation weights \(\boldsymbol{\lambda}_0\), \(\boldsymbol{\lambda}_b\), and \(\boldsymbol{\lambda}_r\) are introduced for the initial, boundary, and residual points, respectively. The corresponding loss reads

$$ \begin{equation} \mathcal{L}\left(\boldsymbol{w}, \boldsymbol{\lambda}_r, \boldsymbol{\lambda}_b, \boldsymbol{\lambda}_0\right)=\mathcal{L}_s(\boldsymbol{w})+\mathcal{L}_r\left(\boldsymbol{w}, \boldsymbol{\lambda}_r\right)+\mathcal{L}_b\left(\boldsymbol{w}, \boldsymbol{\lambda}_b\right)+\mathcal{L}_0\left(\boldsymbol{w}, \boldsymbol{\lambda}_0\right) \end{equation} $$

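Both the neural network parameters \(\boldsymbol{w}\) and the self-adaptation weights are trained jointly: the loss is minimized with respect to \(\boldsymbol{w}\) and maximized with respect to the self-adaptation weights,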

$$ \begin{equation} \min _{\boldsymbol{w}} \max _{\boldsymbol{\lambda}_r, \boldsymbol{\lambda}_b, \boldsymbol{\lambda}_0} \mathcal{L}\left(\boldsymbol{w}, \boldsymbol{\lambda}_r, \boldsymbol{\lambda}_b, \boldsymbol{\lambda}_0\right). \end{equation} $$
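
The min-max problem above can be approached with simultaneous gradient descent on the model parameters and gradient ascent on the per-point weights. The following is a minimal sketch of that idea on a toy problem, not the paper's implementation: the "residual" is a plain linear regression error rather than a PDE residual, the mask is assumed to be \(\lambda_i^2\), and the function name, learning rates, and step count are all hypothetical choices made for illustration.

```python
def train_self_adaptive(steps=2000, lr_w=0.1, lr_lam=0.05):
    """Toy gradient descent-ascent with one trainable weight per training point."""
    # Toy data from y = 2x + 1; the "residual" of point i is a + b*x_i - y_i.
    xs = [i / 10 for i in range(11)]
    ys = [2.0 * x + 1.0 for x in xs]
    n = len(xs)

    a, b = 0.0, 0.0   # model parameters (they play the role of w)
    lam = [1.0] * n   # self-adaptation weights, one per training point

    for _ in range(steps):
        res = [a + b * x - y for x, y in zip(xs, ys)]

        # Loss: (1/n) * sum_i lam_i^2 * res_i^2 (the squared mask is an assumption).
        grad_a = sum(2 * l * l * r for l, r in zip(lam, res)) / n
        grad_b = sum(2 * l * l * r * x for l, r, x in zip(lam, res, xs)) / n
        grad_lam = [2 * l * r * r / n for l, r in zip(lam, res)]

        # Descent on the model parameters, ascent on the weights.
        a -= lr_w * grad_a
        b -= lr_w * grad_b
        lam = [l + lr_lam * g for l, g in zip(lam, grad_lam)]

    return (a, b), lam


(a, b), lam = train_self_adaptive()
```

Because the gradient with respect to each weight is proportional to the squared residual at that point, the weights only grow, and they grow fastest where the fit is poor. This is what pushes the descent step to prioritize those points.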

Proposed solution (SGD): an extension that handles collocation points that change during training. The basic idea is to use a spatio-temporal predictor to estimate the self-adaptation weights of newly sampled points. A Gaussian process serves as this predictor.
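
To illustrate how weights could be carried over to resampled points, here is a small sketch of Gaussian process regression from previous \((x, t)\) points to their weights. It is written under assumptions that are not taken from the paper: a squared-exponential kernel, a constant prior mean equal to the average weight, and hypothetical values for the length scale and noise. The names `rbf`, `solve`, and `predict_weights` are illustrative only.

```python
import math


def rbf(p, q, length=0.2):
    """Squared-exponential kernel between two (x, t) points (assumed kernel)."""
    d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return math.exp(-0.5 * d2 / length ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * z[k] for k in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z


def predict_weights(old_pts, old_lam, new_pts, noise=1e-6, length=0.2):
    """GP posterior mean of the weights at new_pts, with a constant prior mean."""
    mean = sum(old_lam) / len(old_lam)
    K = [[rbf(p, q, length) + (noise if i == j else 0.0)
          for j, q in enumerate(old_pts)] for i, p in enumerate(old_pts)]
    alpha = solve(K, [l - mean for l in old_lam])
    return [mean + sum(rbf(p, q, length) * a for q, a in zip(old_pts, alpha))
            for p in new_pts]


# Weights learned on three old points, predicted at two newly sampled points.
old_pts = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.5)]
old_lam = [1.0, 3.0, 2.0]
new_lam = predict_weights(old_pts, old_lam, [(0.5, 0.0), (10.0, 10.0)])
```

A new point that coincides with an old one recovers (up to the noise term) the old weight, while a point far from all old points falls back to the prior mean. The posterior mean is not guaranteed to be nonnegative, so a practical version would likely need to clip or transform the predicted weights.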

Other previously proposed solutions: