Example

Problem description

There is increasing interest in modelling and computing physical systems with neural networks, given the outstanding results they have achieved in learning patterns from data. We consider Hamiltonian mechanical systems, whose dynamics is fully determined by one scalar function, the Hamiltonian, which represents the energy of the system. This explains why multiple approaches have been proposed to approximate this energy function. Here we focus on the task of learning the Hamiltonians of constrained mechanical systems with neural networks, given sample data of their solutions.

Let us consider Hamiltonian functions of the form

(1)\[\begin{align} H(q,p) = \frac{1}{2}p^TM^{-1}(q)p + V(q), \end{align}\]

where \(M(q)\) is the mass matrix of the system, possibly depending on the configuration \(q\in\mathbb{R}^n\), and \(V(q)\) is the potential energy of the system. The solution trajectories are often constrained to evolve on a submanifold of a linear vector space. In particular, we focus on systems that are holonomically constrained to some configuration manifold \(\mathcal{Q}=\{q\in\mathbb{R}^n:\,g(q)=0\}\) embedded in \(\mathbb{R}^n\), and we model them by means of a projection operator. As a result, the vector field is written in such a way that it directly respects the constraints, without the addition of algebraic equations. We assume that the components \(g_i:\mathbb{R}^n\rightarrow \mathbb{R}\), \(i=1,...,m\), are functionally independent on the zero level set, so that the Hamiltonian is defined on the \((2n-2m)\)-dimensional cotangent bundle \(\mathcal{M}=T^*\mathcal{Q}\). Working with elements of the tangent space at \(q\), \(T_q\mathcal{Q}\), as vectors in \(\mathbb{R}^n\), we introduce a linear operator that defines the orthogonal projection of an arbitrary vector \(v\in\mathbb{R}^n\) onto \(T_q \mathcal{Q}\), i.e.

(2)\[\begin{align} \forall q\in \mathcal{Q},\text{ we set }P(q):\mathbb{R}^n\rightarrow T_q\mathcal{Q},\;\;v\mapsto P(q)v. \end{align}\]

\(P(q)^T\) can be seen as a map sending vectors of \(\mathbb{R}^n\) to covectors in \(T_q^*\mathcal{Q}\). If \(g\) is differentiable and \(G(q)\in\mathbb{R}^{n\times m}\) denotes the transposed Jacobian matrix of \(g\), whose columns are the gradients \(\nabla g_i(q)\), we have \(T_q \mathcal{Q} = \mathrm{Ker}\,G(q)^T\), and so \(P(q) = I_n - G(q)\left(G(q)^TG(q)\right)^{-1}G(q)^T\), where \(I_n\in\mathbb{R}^{n\times n}\) is the identity matrix. This projection map allows us to define Hamilton’s equations as follows

(3)\[\begin{split}\begin{align} \begin{cases} \dot{q} = P(q)\partial_pH(q,p)\\ \dot{p} = -P(q)^T\partial_qH(q,p) + W(q,p)\partial_pH(q,p), \end{cases} \end{align}\end{split}\]

where

(4)\[\begin{split}\begin{align} \begin{split} W(q,p)&=P(q)^T\Lambda(q,p)^T P(q) + \Lambda(q,p)P(q) -P(q)^T\Lambda(q,p)^T,\\ &\text{with}\quad \Lambda(q,p) = \frac{\partial P(q)^Tp}{\partial q}. \end{split} \end{align}\end{split}\]
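To make the projection concrete, here is a minimal numpy sketch (illustrative only, not part of the package) that assembles \(P(q)\) from the constraint gradients and checks it against the closed form \(I_3-qq^T\) arising from the sphere constraint \(g(q)=(q^Tq-1)/2\), whose gradient is \(q\) itself.

```python
import numpy as np

def projection(G):
    """Orthogonal projector onto Ker G^T, where the columns of G
    span the normal space (G collects the gradients of the g_i)."""
    n = G.shape[0]
    return np.eye(n) - G @ np.linalg.solve(G.T @ G, G.T)

q = np.array([0.0, 0.6, 0.8])       # a point on the unit sphere S^2
P = projection(q.reshape(3, 1))     # a single constraint, so m = 1

# For |q| = 1 the projector reduces to P(q) = I_3 - q q^T.
assert np.allclose(P, np.eye(3) - np.outer(q, q))
```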

Systems defined on products of spheres

Let the phase space of the system be \(\mathcal{M}=(T^*S^2)^k\), where \(S^2\) is the unit sphere in \(\mathbb{R}^3\) and \(k\in \mathbb{N}^+\). We coordinatize \(\mathcal{M}\) with \((q,p)=(q_1,\dots,q_k,p_1,\dots,p_k)\in \mathbb{R}^{6k}\). In this case, when \(p\in \mathbb{R}^{3k}\) is interpreted as the vector of linear momenta, the matrix \(M(q)\) in equation (1) is a block matrix, with

(5)\[\begin{split}\begin{align} i,j = 1,...,k,\quad \mathbb{R}^{3\times 3}\ni M(q)_{ij} = \begin{cases} m_{ii}I_3,\quad i=j\\ m_{ij}(I_3-q_iq_i^T),\quad \text{otherwise,} \end{cases} \end{align}\end{split}\]

The \(k\times k\) matrix of constant entries \(m_{ij}\) is symmetric and positive definite.

For example, in the case of a spherical pendulum we have \(k=1\), hence the Hamiltonian dynamics is defined on the cotangent bundle \(T^*S^2\). With this specific choice of geometry, the formulation presented in equation (3) simplifies considerably. Indeed, \(P(q) = I_3-qq^T\), which implies \(W(q,p) = pq^T-qp^T\). Substituting these expressions into (3) and using the triple product rule, we end up with the following set of ODEs

(6)\[\begin{split}\begin{align} \begin{cases} \dot{q} &= (I-qq^T)\partial_pH(q,p)\\ \dot{p} &= -(I-qq^T)\partial_qH(q,p) + \partial_pH(q,p)\times (p\times q). \end{cases} \end{align}\end{split}\]
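For the reader’s convenience, here is the short computation behind \(W(q,p)=pq^T-qp^T\); it is a sketch that uses \(q^Tq=1\), the cotangent-bundle condition \(q^Tp=0\), and the symmetry \(P(q)^T=P(q)\) of the projector:

\[\begin{split}\begin{align}
\Lambda(q,p) &= \frac{\partial}{\partial q}\left[(I_3-qq^T)p\right] = -\left((q^Tp)I_3 + qp^T\right) = -qp^T,\\
W(q,p) &= P\Lambda^TP + \Lambda P - P\Lambda^T = 0 - qp^T - \left(-pq^T\right) = pq^T - qp^T,
\end{align}\end{split}\]

where the first term vanishes because \(\Lambda^T P = -p\,(q^TP) = 0\).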

This system is implemented in the following function to generate the training data.

Learning_Hamiltonians.trajectories.dynamics(t, z)

System of ODEs defining the dynamics

Parameters:
  • t (float) – 1-D independent variable (time)

  • z (numpy.ndarray) – 6N-component vector of configuration variables and conjugate momenta, with N the number of pendulums

Returns:

vec – 6N-component vector collecting the time derivatives of the configuration variables and conjugate momenta, with N the number of pendulums

Return type:

numpy.ndarray
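As an illustration of how such a right-hand side can be fed to an ODE solver, here is a minimal standalone sketch for a single spherical pendulum (\(k=1\)) with the simplified Hamiltonian \(H(q,p)=\tfrac12 p^Tp + q_3\) (unit constants); it mimics the structure of equation (6) but is not the package’s dynamics function:

```python
import numpy as np
from scipy.integrate import solve_ivp

def dynamics_sketch(t, z):
    """Right-hand side of (6) for H(q, p) = |p|^2 / 2 + q_3."""
    q, p = z[:3], z[3:]
    Hq = np.array([0.0, 0.0, 1.0])     # dH/dq for V(q) = q_3
    Hp = p                             # dH/dp for M(q) = I_3
    qdot = Hp - q * (q @ Hp)           # (I - q q^T) dH/dp
    pdot = -(Hq - q * (q @ Hq)) + np.cross(Hp, np.cross(p, q))
    return np.concatenate([qdot, pdot])

z0 = np.array([0.0, 0.6, 0.8, 0.1, 0.0, 0.0])  # q on S^2, with q^T p = 0
sol = solve_ivp(dynamics_sketch, (0.0, 1.0), z0, rtol=1e-9)
```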

This function in turn makes use of

Learning_Hamiltonians.trajectories.MatrR(q)

Matrix defining the quadratic kinetic energy

Parameters:

q (numpy.ndarray) – 3N-component vector of coordinates, with N the number of pendulums

Returns:

R – 3Nx3N matrix defining the kinetic energy, with N the number of pendulums

Return type:

numpy.ndarray


Learning_Hamiltonians.trajectories.hat(q)

Isomorphism between R3 and so(3). It returns the skew-symmetric matrix hat(q) associated to the vector q, such that hat(a)b=axb for all 3-component vectors a and b, with “x” the cross product

Parameters:

q (numpy.ndarray) – 3-component vector

Returns:

hat(q) – 3x3 skew-symmetric matrix (element of the Lie algebra so(3))

Return type:

numpy.ndarray


Learning_Hamiltonians.trajectories.Massm(nop, m)

Mass matrix collecting the terms mij in the matrix defining the kinetic energy

Parameters:
  • nop (int) – number of pendulums

  • m (float) – masses of the pendulums (assumed to be all equal in this case)

Returns:

M – Mass matrix

Return type:

numpy.ndarray

to define the kinetic energy as in (1) and (5), and of

Learning_Hamiltonians.trajectories.Hq(z)

Gradient of the Hamiltonian with respect to the configuration variables

Parameters:

z (numpy.ndarray) – 6N-component vector of configuration variables and conjugate momenta, with N the number of pendulums

Returns:

Hq – 3N-component vector, with N the number of pendulums

Return type:

numpy.ndarray


Learning_Hamiltonians.trajectories.Hp(z)

Gradient of the Hamiltonian with respect to the conjugate momenta

Parameters:

z (numpy.ndarray) – 6N-component vector of generalized coordinates and momenta, with N the number of pendulums

Returns:

Hp – 3N-component vector, with N the number of pendulums

Return type:

numpy.ndarray

to define the derivatives of the Hamiltonian function (1).
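The following schematic numpy reimplementation (function names are illustrative, not the package’s) shows how these building blocks fit together: the hat map, the block mass matrix of equation (5), and the kinetic energy term of equation (1).

```python
import numpy as np

def hat_sketch(u):
    """hat_sketch(a) @ b == np.cross(a, b) for all 3-vectors a, b."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

def mass_matrix_sketch(q, m):
    """Block matrix M(q) of equation (5); m is the k x k symmetric
    positive definite matrix of constant coefficients m_ij."""
    k = m.shape[0]
    M = np.zeros((3 * k, 3 * k))
    for i in range(k):
        qi = q[3 * i:3 * i + 3]
        for j in range(k):
            if i == j:
                M[3 * i:3 * i + 3, 3 * j:3 * j + 3] = m[i, i] * np.eye(3)
            else:
                M[3 * i:3 * i + 3, 3 * j:3 * j + 3] = (
                    m[i, j] * (np.eye(3) - np.outer(qi, qi)))
    return M

def kinetic_energy_sketch(q, p, m):
    """First term of (1): T(q, p) = p^T M(q)^{-1} p / 2."""
    return 0.5 * p @ np.linalg.solve(mass_matrix_sketch(q, m), p)
```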

System (6) is also implemented in

Learning_Hamiltonians.main.predicted(t, z)

Vector field predicted using the learned Hamiltonian, to be passed to scipy.integrate.solve_ivp to compute trajectory segments with the learned Hamiltonian

Parameters:
  • t (float) – time, the standard first input of a right-hand side passed to scipy.integrate.solve_ivp

  • z (torch.Tensor) – trajectory point

Returns:

vec – vector field (Hamilton’s equations)

Return type:

torch.Tensor

with the Hamiltonian function replaced by the neural network after the training procedure (i.e. the learned Hamiltonian) to evaluate the approximation, and in

Learning_Hamiltonians.nn_functions.predictedVF(x, HH)

Vector field predicted by using the output of the neural network as the Hamiltonian

Parameters:
  • x (torch.Tensor) – solution at time t0

  • HH (function handle) – Hamiltonian

Returns:

vec – vector field (Hamilton’s equations)

Return type:

torch.Tensor

with the Hamiltonian function replaced by the neural network during the training procedure (so that the training equations can be solved with classical Runge–Kutta schemes).
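The idea behind both functions can be sketched as follows, assuming a (hypothetical) torch module H_net mapping a point \((q,p)\in\mathbb{R}^6\) to a scalar; the gradients appearing in (6) are obtained by automatic differentiation:

```python
import torch

def predicted_vf_sketch(z, H_net):
    """Vector field (6) with dH/dq, dH/dp computed by autograd."""
    z = z.detach().requires_grad_(True)
    H = H_net(z).squeeze()
    grad = torch.autograd.grad(H, z)[0]
    q, p = z[:3], z[3:]
    Hq, Hp = grad[:3], grad[3:]
    qdot = Hp - q * torch.dot(q, Hp)
    pdot = (-(Hq - q * torch.dot(q, Hq))
            + torch.linalg.cross(Hp, torch.linalg.cross(p, q)))
    return torch.cat([qdot, pdot])
```

Wrapping such a function so that it converts between numpy arrays and tensors makes it usable as the right-hand side in scipy.integrate.solve_ivp, which is essentially how the learned vector field is evaluated after training.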

We remark briefly that \(T^*S^2\) is a homogeneous manifold, since it is acted upon transitively by the Lie group \(SE(3)\) through the group action

(7)\[\begin{align} \Psi : SE(3)\times T^*S^2\rightarrow T^*S^2,\;\;((R,r),(q,p^T))\mapsto (Rq,(Rp+r\times Rq)^T), \end{align}\]

where the transpose comes from the usual interpretation of covectors as row vectors. This is implemented in

Learning_Hamiltonians.nn_functions.actionSE3NN(g, z)

Group action of SE3 on T*S^2

Parameters:
  • g (torch.Tensor) – element of the group SE3

  • z (torch.Tensor) – trajectory point in the phase space

Returns:

out

Return type:

torch.Tensor
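A compact sketch of the action (7) (names chosen for illustration, not taken from the package):

```python
import torch

def action_se3_sketch(R, r, q, p):
    """Group action (7): ((R, r), (q, p)) -> (Rq, Rp + r x Rq)."""
    Rq = R @ q
    # The condition q^T p = 0 is preserved: rotations preserve inner
    # products, and Rq is orthogonal to r x Rq.
    return Rq, R @ p + torch.linalg.cross(r, Rq)
```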

The hat map is implemented in

Learning_Hamiltonians.nn_functions.hatNN(q)

It returns the skew symmetric matrix associated to the vector q, such that hat(a)b=axb for all 3-component vectors a and b, with “x” the cross product

Parameters:

q (torch.Tensor) – coordinates of the training trajectory points, with shape [batch size, s], s=3

Returns:

res – skew symmetric matrix associated to each vector q in the batch, with shape [batch size, s, s], s=3

Return type:

torch.Tensor

As a consequence of (7), we can also use Lie group integrators to solve system (6) numerically. In the code, both Lie group integrators (Lie–Euler and a commutator-free method of order 4) and classical Runge–Kutta schemes (explicit Euler and Runge–Kutta of order 4) are implemented for the time integration of (6) during the training procedure; a schematic sketch of one step of each family is given after the following entries.

Learning_Hamiltonians.nn_functions.LieEulerNN(x0, f, h, cc, H)

Lie Euler integrator

Parameters:
  • x0 (torch.Tensor) – solution at time t0

  • f (function handle) – map f from the manifold M to the Lie algebra of the group acting on M

  • h (float) – time step

  • cc (int) – M-1

  • H (neural network class) – Hamiltonian function

Returns:

sol – solution at time t1=t0+h

Return type:

torch.Tensor


Learning_Hamiltonians.nn_functions.CF4NN(x0, f, h, cc, H)

Lie group commutator free integrator of order 4

Parameters:
  • x0 (torch.Tensor) – solution at time t0

  • f (function handle) – map f from the manifold M to the Lie algebra of the group acting on M

  • h (float) – time step

  • cc (int) – M-1

  • H (neural network class) – Hamiltonian function

Returns:

sol – solution at time t1=t0+h

Return type:

torch.Tensor


Learning_Hamiltonians.nn_functions.ExpEuler(x0, h, cc, H)

Explicit Euler method

Parameters:
  • x0 (torch.Tensor) – solution at time t0

  • h (float) – time step

  • cc (int) – M-1

  • H (neural network class) – Hamiltonian function

Returns:

sol – solution at time t1=t0+h

Return type:

torch.Tensor


Learning_Hamiltonians.nn_functions.RK4(x0, h, cc, H)

Runge-Kutta method of order 4

Parameters:
  • x0 (torch.Tensor) – solution at time t0

  • h (float) – time step

  • cc (int) – M-1

  • H (neural network class) – Hamiltonian function

Returns:

sol – solution at time t1=t0+h

Return type:

torch.Tensor
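Schematically, one step of each family looks as follows; f, exp_g, action and the vector field F are placeholders for the maps described in the entries above, so this is a sketch of the update rules rather than the package code:

```python
def lie_euler_step_sketch(x0, f, exp_g, action, h):
    """One Lie-Euler step: freeze f at x0, move along the group flow."""
    return action(exp_g(h * f(x0)), x0)

def rk4_step_sketch(x0, F, h):
    """One classical Runge-Kutta step of order 4 for xdot = F(x)."""
    k1 = F(x0)
    k2 = F(x0 + 0.5 * h * k1)
    k3 = F(x0 + 0.5 * h * k2)
    k4 = F(x0 + h * k3)
    return x0 + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```

The Lie group step stays on the manifold by construction, since the update is the group action applied to a point of the manifold; the classical step instead lives in the ambient linear space.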

Lie group integrators make use of the exponential maps on \(SO(3)\) and \(SE(3)\), implemented respectively in

Learning_Hamiltonians.nn_functions.expso3NN(x)

Exponential map on SO(3)

Parameters:

x (torch.Tensor (float32)) – element of the Lie algebra so(3), represented as a vector with 3 components.

Returns:

expA – element of the group SO(3), i.e. 3x3 rotation matrix

Return type:

torch.Tensor


Learning_Hamiltonians.nn_functions.expse3NN(input)

Exponential map on SE(3)

Parameters:

input (torch.Tensor (float32)) – element of the Lie algebra se(3) represented as a 6-component vector, i.e. as a pair (u,v), with the 3-component vector u corresponding to a skew-symmetric matrix hat(u) and the 3-component vector v corresponding to the translational part.

Returns:

expA – element of the group SE(3), represented as a 3x4 matrix [A, b], with A 3x3 rotation matrix and b 3-component translation vector.

Return type:

torch.Tensor
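A common closed form for the SO(3) exponential is Rodrigues’ formula; the following torch sketch (ours, not the package implementation) illustrates it. The SE(3) case additionally maps the translational component v through the left Jacobian of SO(3) to produce the b part of the 3x4 representation [A, b] described above.

```python
import torch

def hat3(u):
    """3x3 skew-symmetric matrix of the 3-vector u."""
    zero = torch.zeros((), dtype=u.dtype)
    return torch.stack([
        torch.stack([zero, -u[2], u[1]]),
        torch.stack([u[2], zero, -u[0]]),
        torch.stack([-u[1], u[0], zero]),
    ])

def expso3_sketch(u):
    """Rodrigues' formula: exp(hat(u)) as a 3x3 rotation matrix."""
    theta = torch.linalg.norm(u)
    U = hat3(u)
    if theta < 1e-8:                 # small angle: first-order expansion
        return torch.eye(3, dtype=u.dtype) + U
    return (torch.eye(3, dtype=u.dtype)
            + torch.sin(theta) / theta * U
            + (1 - torch.cos(theta)) / theta ** 2 * (U @ U))
```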

We represent a generic element of the special Euclidean group \(G=SE(3)\) as an ordered pair \((R,r)\), where \(R\in SO(3)\) is a rotation matrix and \(r\in\mathbb{R}^3\) is a vector. The Hamiltonian vector field \(X(q,p)\) defined by (6) can then be expressed as \(\psi_*(F[H](q,p))(q,p)\) with

(8)\[\begin{align} \psi_*((\xi,\eta))(q,p) = (\xi\times q,\xi\times p + \eta\times q), \quad (\xi,\eta)\in \mathfrak{g= se}(3) \end{align}\]

and

(9)\[\begin{align} F[H](q,p) = (\xi,\eta)=\left(q\times \frac{\partial H(q,p)}{\partial p},\frac{\partial H(q,p)}{\partial q}\times q + \frac{\partial H(q,p)}{\partial p}\times p \right). \end{align}\]
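As a quick consistency check, the first component of \(\psi_*(F[H](q,p))(q,p)\) recovers the \(\dot{q}\) equation of (6): using the triple product rule \((a\times b)\times c = b\,(c^Ta) - a\,(c^Tb)\) and \(q^Tq=1\),

\[\begin{align} \xi\times q = \left(q\times \partial_pH\right)\times q = \partial_pH\,(q^Tq) - q\,(q^T\partial_pH) = (I_3-qq^T)\,\partial_pH, \end{align}\]

and an analogous computation for \(\xi\times p + \eta\times q\) recovers the \(\dot{p}\) equation.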

A similar reasoning can be extended to a chain of \(k\) connected pendula, and hence to a system on \((T^*S^2)^k\). The main idea is to replicate both the equations and the expression \(F[H]\) for all the \(k\) copies of \(T^*S^2\). A more detailed explanation can be found in (Celledoni, Çokaj, Leone, Murari and Owren, 2021).

The map (9) for the double pendulum has been implemented in the following function

Learning_Hamiltonians.nn_functions.fManiAlgebraNN(H, z)

Function f from the manifold M (phase space of the system) to the Lie algebra of the group acting on M

Parameters:
  • H (neural network class) – Hamiltonian function

  • z (torch.Tensor) – training trajectory points, with size [batch size, nop*2s]

Returns:

ff – map f : M -> g (Lie algebra)

Return type:

torch.Tensor

and the extension of (7) to \((T^*S^2)^k\) in

Learning_Hamiltonians.nn_functions.actionse3NNn(g, z)

Concatenate se3 actions

Parameters:
  • g (torch.Tensor) – elements of the group SE3

  • z (torch.Tensor) – trajectory points in the phase space

Returns:

out

Return type:

torch.Tensor

Finally, the extension of the exponential map on \(SE(3)\) to \((T^*S^2)^k\) is implemented in

Learning_Hamiltonians.nn_functions.expse3NNn(input)

Concatenate exponentials on SE(3) in one tensor

Parameters:

input (torch.Tensor) – concatenated elements of the Lie algebra se(3), one 6-component block per copy of T*S^2

Returns:

out

Return type:

torch.Tensor
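For illustration, the blockwise concatenation can be sketched as follows, assuming an SE(3) exponential helper like the Rodrigues-based sketch above (expse3 below is a placeholder for it):

```python
import torch

def expse3n_sketch(xi, expse3):
    """Apply expse3 to each 6-component se(3) block: xi has shape
    [k, 6], one block per copy of T*S^2; returns a [k, 3, 4] tensor."""
    return torch.stack([expse3(block) for block in xi])
```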