.. _learn:

Learning procedure
==================

We work with numerically generated training trajectories that we denote by

.. math::
   :label: ttraj

   \begin{align}
   \{(x_i,y_i^2,...,y_i^M)\}_{i=1,...,N}.
   \end{align}

To obtain an approximation of the Hamiltonian :math:`H`, we define a parametric model :math:`H_{\Theta}` and look for a :math:`\Theta` such that the trajectories generated by :math:`H_{\Theta}` resemble the given ones. In principle, :math:`H_{\Theta}` can be any parametric function depending on the parameters :math:`\Theta`. In our approach, :math:`\Theta` collects a factor of the mass matrix and the weights of a neural network, as described below. We use a numerical one-step method :math:`\Psi_{X_{H_{\Theta}}}^{\Delta t}` to generate the trajectories

.. math::
   :label: ltraj

   \begin{align}
   \hat{y}_i^j(\Theta) :=\Psi_{X_{H_{\Theta}}}^{\Delta t}(\hat{y}_i^{j-1}(\Theta)),\quad \hat{y}_i^1(\Theta) := x_i, \quad j=2,\dots,M, \; i=1,\dots,N.
   \end{align}

We then optimize a loss function measuring the distance between the given trajectories :math:`y^j_i` and the generated ones :math:`\hat{y}_i^j`, defined as

.. math::
   :label: loss

   \begin{align}
   \mathcal{L}(\Theta):=\frac{1}{2n}\frac{1}{NM}\sum_{i=1}^N\mathcal{L}_i(\Theta) = \frac{1}{2n}\frac{1}{NM}\sum_{i=1}^N\sum_{j=1}^M \|\hat{y}_i^j(\Theta)- y_i^j\|^2,
   \end{align}

where :math:`\|\cdot\|` is the Euclidean norm on :math:`\mathbb{R}^{2n}`. This is implemented with the PyTorch ``MSELoss`` loss function. Such a training procedure resembles that of Recurrent Neural Networks (RNNs), as shown for the forward pass of a single training trajectory in the following figure.

.. figure:: /RNN_Diagram.png

   Figure 1. Forward pass of an input training trajectory :math:`(x_i,y_i^2,...,y_i^M)`. The picture highlights the resemblance to an unrolled Recurrent Neural Network. The network outputs :math:`(\hat{y}_i^2,...,\hat{y}_i^M)`.

Indeed, the weight-sharing principle of RNNs is reproduced by the time steps of the numerical integrator, which are all based on the same approximation of the Hamiltonian, and hence on the same weights :math:`\Theta`.

Architecture of the network
---------------------------

In this example, the role of the neural network is to model the Hamiltonian, i.e. a scalar function defined on the phase space :math:`\mathbb{R}^{2n}`. Thus, its domain and codomain are fixed. We leverage the form of the kinetic energy, where :math:`M(q)` is modelled through a constant symmetric positive definite matrix with entries :math:`m_{ij}`. Therefore, we aim to learn a constant matrix :math:`A\in\mathbb{R}^{k\times k}` and a vector :math:`b\in\mathbb{R}^k` such that

.. math::
   :label: mmatr

   \begin{align}
   \begin{bmatrix}
   m_{11} & ... & m_{1k}\\
   m_{21} & ... & m_{2k}\\
   \vdots & \vdots & \vdots \\
   m_{k1} & ... & m_{kk}
   \end{bmatrix}
   \approx A^TA +
   \begin{bmatrix}
   \tilde{b}_{1} & 0 & ... & 0 \\
   0 & \tilde{b}_2 & \ddots & \vdots \\
   \vdots & \ddots & \ddots & 0 \\
   0 & ... & 0 & \tilde{b}_k
   \end{bmatrix}
   \end{align}

where :math:`\tilde{b}_i := \max{(0,b_i)}` are terms added to promote the positive definiteness of the right-hand side. Notice that, in principle, imposing the positive (semi)definiteness of the matrix defining the kinetic energy is not necessary, but it makes the results more interpretable. Indeed, the kinetic energy should define a metric on :math:`\mathbb{R}^n`, and the assumption we are making guarantees this property.
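
To make the parameterization :eq:`mmatr` concrete, here is a minimal PyTorch sketch of how such a constant mass matrix could be assembled from learnable parameters. The names ``A``, ``b`` and ``mass_matrix`` are illustrative only and do not refer to the package's actual implementation, which is documented below.

.. code-block:: python

   import torch

   k = 3  # illustrative dimension

   # learnable factor of the mass matrix and diagonal shift (hypothetical names)
   A = torch.nn.Parameter(torch.randn(k, k))
   b = torch.nn.Parameter(torch.randn(k))

   def mass_matrix(A, b):
       # A^T A is symmetric positive semidefinite; adding the clipped
       # diagonal entries max(0, b_i) promotes positive definiteness.
       return A.T @ A + torch.diag(torch.relu(b))

   M_hat = mass_matrix(A, b)  # symmetric k x k approximation of the mass matrix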
For the potential energy, a possible modelling strategy is to work with standard feedforward neural networks, and hence to define

.. math::
   :label: pot

   \begin{align}
   V(q) \approx V_{\theta}(q) = f_{\theta_m}\circ ...\circ f_{\theta_1}(q)
   \end{align}

.. math::
   :label: pnn

   \begin{align}
   \theta_i = (W_i,b_i)\in\mathbb{R}^{n_i\times n_{i-1}}\times \mathbb{R}^{n_i},\;\theta:=[\theta_1,...,\theta_m],
   \end{align}

.. math::
   :label: fnn

   \begin{align}
   f_{\theta_i}(u) := \Sigma(W_iu + b_i),\;\mathbb{R}^n\ni z\mapsto \Sigma(z) = [\sigma(z_1),...,\sigma(z_n)]\in\mathbb{R}^n,
   \end{align}

for example with :math:`\sigma(x) = \tanh(x)`. Therefore, we have that

.. math::
   :label: tpar

   \begin{align}
   \Theta = [A, \theta], \quad H(q,p) \approx H_{\Theta}(q,p) = K_A(p) + V_{\theta}(q).
   \end{align}

The neural network for the parameterized Hamiltonian :eq:`tpar` is defined in the following PyTorch class.

.. autoclass:: Learning_Hamiltonians.main.Hamiltonian
   :members: __init__, MassMat, Kinetic, Potential, forward
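
As a complement to the class above, the following is a minimal sketch of the RNN-like rollout :eq:`ltraj` and of the loss :eq:`loss`. It assumes a generic one-step map ``one_step`` implementing :math:`\Psi_{X_{H_{\Theta}}}^{\Delta t}` and reference trajectories stored as tensors; the names, signatures and shapes are illustrative and not the package's actual API.

.. code-block:: python

   import torch

   def rollout_loss(one_step, x, y, dt):
       """Unroll the one-step method from the initial states x_i and
       compare the generated trajectory with the reference one.

       x : (N, 2n) tensor of initial conditions x_i
       y : (N, M-1, 2n) tensor of reference states y_i^2, ..., y_i^M
       """
       y_hat, state = [], x
       for _ in range(y.shape[1]):
           # every step reuses the same weights Theta (weight sharing, as in an RNN)
           state = one_step(state, dt)
           y_hat.append(state)
       y_hat = torch.stack(y_hat, dim=1)
       # MSELoss with the default mean reduction corresponds to the loss
       # above up to a constant factor.
       return torch.nn.functional.mse_loss(y_hat, y)

In a training loop, this loss would then be minimized with a standard optimizer, e.g. ``torch.optim.Adam``, by repeatedly calling ``loss.backward()`` and ``optimizer.step()``.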