dymad.modules¶

class dymad.modules.FlexLinear(in_features, out_features, bias=True, dtype=None, device=None)¶

Bases: Module

A linear layer that can store weights either as a full matrix (MxN) or as low-rank factors (U, V) with efficient matvec operations.

In the low-rank mode, the weight matrix is represented as:

W = U @ V^T

where U is (M x r) and V is (N x r).
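The efficiency of the low-rank mode comes from never forming W explicitly. The following is a minimal sketch in plain PyTorch, not the FlexLinear implementation; the sizes and variable names are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
M, N, r = 6, 5, 2          # out_features, in_features, rank (illustrative sizes)
U = torch.randn(M, r)      # left factor, (M x r)
V = torch.randn(N, r)      # right factor, (N x r)
x = torch.randn(3, N)      # batch of inputs, (..., N)

# Dense route: materialize W = U @ V^T, then apply it.
W = U @ V.T                # (M x N)
y_dense = x @ W.T          # (..., M)

# Factored route: two thin products, W is never formed.
y_lowrank = (x @ V) @ U.T  # (..., r) first, then (..., M)
```

Both routes give the same result, but the factored one costs O(r(M + N)) per input row instead of O(MN).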

forward(x)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type:: Tensor

set_full(W, b)¶: Switch to full mode and copy parameters.

set_lora(U, V, b)¶: Switch to low-rank mode and copy factors. U: (out_features x r), V: (in_features x r).

set_weights(W=None, b=None, U=None, V=None)¶

Return type:: tuple[Tensor, ...]

class dymad.modules.GNN(input_dim, hidden_dim, output_dim, n_layers, *, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: Module

Configurable Graph Neural Network using a choice of GCL (e.g., SAGEConv, ChebConv) and activations.

Due to the implementation, the GNN is applied sequentially to batch data.

To interface with other parts of the code, the model assumes the input to be node-wise, (…, n_nodes, n_input), but the output is reshaped to concatenate features across nodes, (…, n_nodes * n_output). See forward method for details.
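The output layout described above can be illustrated with a small shape exercise. This is only a sketch: a plain Linear layer stands in for the graph layers, and all sizes are assumptions chosen for illustration.

```python
import torch

batch, n_nodes, n_input, n_output = 4, 3, 2, 5   # illustrative sizes
x = torch.randn(batch, n_nodes, n_input)         # node-wise input

# Stand-in for the graph layers: any map acting on the last (feature) axis.
node_map = torch.nn.Linear(n_input, n_output)
h = node_map(x)                                  # (batch, n_nodes, n_output)

# Concatenate the features of all nodes into a single trailing axis.
out = h.reshape(*h.shape[:-2], n_nodes * n_output)
```

With this layout, the first n_output entries of each output row belong to node 0, the next n_output entries to node 1, and so on.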

Parameters:

input_dim (int) – Dimension of input node features.
hidden_dim (int) – Dimension of hidden layers.
output_dim (int) – Dimension of output node features.
n_layers (int) – Number of GCL layers.
gcl (str | nn.Module | type, default='sage') – Graph convolution layer type or instance.
gcl_opts (dict, default=None) – Options passed to the GCL constructor; the default corresponds to no extra options.
activation (str | nn.Module | type, default='prelu') – Activation function.
weight_init (str | callable, default='xavier_uniform') – Weight initializer.
bias_init (str | callable, default='zeros') – Bias initializer.
gain (float, default=1.0) – Extra gain modifier for weight initialization.
end_activation (bool, default=True) – Whether to apply activation after last layer.

diagnostic_info()¶

Return type:: str

forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶

Forward pass through the GNN.

x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).

If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.
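One common way to reduce a batch of graphs to a single graph is to shift node indices so that each graph occupies its own index range. The sketch below assumes that this is what the on-the-fly aggregation amounts to; the source does not spell out the procedure, and all names and sizes are illustrative.

```python
import torch

B, n_nodes, n_features, n_edges = 3, 4, 2, 5
x = torch.randn(B, n_nodes, n_features)
edge_index = torch.randint(0, n_nodes, (B, n_edges, 2))   # per-graph node ids

# Shift node ids of graph b by b * n_nodes so the graphs stay disconnected.
offsets = (torch.arange(B) * n_nodes).reshape(B, 1, 1)
big_edge_index = (edge_index + offsets).reshape(B * n_edges, 2)

# Stack all node features into one long node list.
big_x = x.reshape(B * n_nodes, n_features)
```

The merged graph has B * n_nodes nodes and B * n_edges edges, and no edge connects nodes from different samples.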

class dymad.modules.IdenCatGNN(input_dim, hidden_dim, output_dim, n_layers, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: GNN

Identity concatenation GNN.

This GNN concatenates the input with the output of the GNN.

Note

The output dimension represents the total output features and must be greater than the input dimension.

See GNN for the arguments.

forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶

Forward pass through the GNN.

x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).

If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.

class dymad.modules.IdenCatMLP(input_dim, hidden_dim, output_dim, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: MLP

Identity concatenation MLP.

This MLP concatenates the input with the output of the MLP.

Note

The output dimension represents the total output features and must be greater than the input dimension.

See MLP for the arguments.

forward(x)¶

Forward pass through the identity concatenation MLP.

Parameters:: x (torch.Tensor) – Input tensor of shape (…, input_dim).
Returns:: Output tensor of shape (…, output_dim).
Return type:: torch.Tensor
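The dimension bookkeeping behind the note above can be sketched as follows. This is an illustration in plain PyTorch rather than the IdenCatMLP code; the inner network and the [x, MLP(x)] ordering (taken from the mlp_cat description in make_autoencoder) are assumptions.

```python
import torch
from torch import nn

input_dim, hidden_dim, output_dim = 3, 8, 7   # output_dim > input_dim

# The inner network only has to produce the extra features.
inner = nn.Sequential(
    nn.Linear(input_dim, hidden_dim),
    nn.ReLU(),
    nn.Linear(hidden_dim, output_dim - input_dim),
)

x = torch.randn(10, input_dim)
y = torch.cat([x, inner(x)], dim=-1)          # (..., output_dim)
```

Under this ordering, y[..., :input_dim] recovers x exactly, which is why a decoder that simply takes the first features can invert such an encoder.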

class dymad.modules.KRRBase(kernel, ridge_init=0, jitter=1e-10, device=None)¶

Bases: Module

Base class for Kernel Ridge Regression, in particular:

Multi-output single scalar kernel (the most common case)

Multi-output multiple scalar kernel (i.e., one kernel per output)

True operator-valued kernel (i.e., matrix-valued)

Subclasses must implement:

_ensure_solved(self)

_predict_from_solution(self, Xnew) -> (M, Dy)

fit()¶

Precompute the linear solve; the solve remains differentiable, so gradients can be backpropagated through it.

Return type:: Tensor

forward(Xnew)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

property ridge: Tensor¶

set_train_data(X, Y)¶

Return type:: None

class dymad.modules.KRRMultiOutputIndep(kernel, ridge_init=0, jitter=1e-10, device=None)¶

Bases: KRRBase

Scalar KRR for multiple outputs, and one kernel per output

A ModuleList of Dy scalar kernels (one per output).

Dy independent NxN Choleskys; Dy ridges (vector).
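The independent-kernel case can be sketched with a batched factorization. The RBF kernels, lengthscales, and ridge values below are illustrative assumptions, not the class's defaults.

```python
import torch

torch.manual_seed(0)
N, d, Dy = 15, 2, 3
X = torch.randn(N, d, dtype=torch.float64)
Y = torch.randn(N, Dy, dtype=torch.float64)

ells = torch.tensor([0.5, 1.0, 2.0], dtype=torch.float64)       # one lengthscale per output
ridges = torch.tensor([1e-2, 1e-2, 1e-1], dtype=torch.float64)  # one ridge per output

D2 = torch.cdist(X, X) ** 2                                     # (N, N)
K = torch.exp(-0.5 * D2[None] / ells[:, None, None] ** 2)       # (Dy, N, N)
A = K + ridges[:, None, None] * torch.eye(N, dtype=torch.float64)

L = torch.linalg.cholesky(A)                                    # Dy factorizations, batched
alpha = torch.cholesky_solve(Y.T[..., None], L)[..., 0].T       # (N, Dy)
```

Column j of alpha solves the system built from kernel j and ridge j only, so the outputs do not interact.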

class dymad.modules.KRRMultiOutputShared(kernel, ridge_init=0, jitter=1e-10, device=None)¶

Bases: KRRBase

Scalar KRR for multiple outputs but one single kernel

One NxN Cholesky; solve Dy outputs together. One lambda (scalar) by default.
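A minimal sketch of the shared-kernel solve, assuming an RBF kernel with unit lengthscale and assuming the jitter is simply added to the ridge on the diagonal; neither choice is confirmed by the source.

```python
import torch

torch.manual_seed(0)
N, d, Dy = 20, 2, 3
X = torch.randn(N, d, dtype=torch.float64)
Y = torch.randn(N, Dy, dtype=torch.float64)
lam, jitter = 1e-2, 1e-10

# RBF Gram matrix with unit lengthscale (illustrative choice of kernel).
K = torch.exp(-0.5 * torch.cdist(X, X) ** 2)

# One factorization of the regularized Gram matrix ...
A = K + (lam + jitter) * torch.eye(N, dtype=torch.float64)
L = torch.linalg.cholesky(A)

# ... reused to solve for all Dy output columns at once.
alpha = torch.cholesky_solve(Y, L)                              # (N, Dy)

# Prediction at new points: K(Xnew, X) @ alpha.
Xnew = torch.randn(5, d, dtype=torch.float64)
Ynew = torch.exp(-0.5 * torch.cdist(Xnew, X) ** 2) @ alpha      # (5, Dy)
```

Because every step is a differentiable torch operation, gradients with respect to the ridge or kernel parameters can flow through the solve.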

class dymad.modules.KRROperatorValued(kernel, ridge_init=0, jitter=1e-10, device=None)¶

Bases: KRRBase

Operator-valued kernel K(X,Z) -> (N,M,Dy,Dy).

Solves (Kxx + lambda I) vec(alpha) = vec(Y), using a single (N*Dy)x(N*Dy) Cholesky.
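The vectorized system can be sketched as below. The example assumes the (N, Dy, M, Dy) layout documented for KernelAbstract.forward and uses a simple separable kernel (scalar RBF times a fixed PSD matrix) purely for illustration.

```python
import torch

torch.manual_seed(0)
N, d, Dy = 10, 2, 3
X = torch.randn(N, d, dtype=torch.float64)
Y = torch.randn(N, Dy, dtype=torch.float64)
lam = 1e-2

# Illustrative operator-valued kernel: scalar RBF times a fixed PSD matrix B.
k = torch.exp(-0.5 * torch.cdist(X, X) ** 2)           # (N, N)
Lb = torch.randn(Dy, Dy, dtype=torch.float64)
B = Lb @ Lb.T                                           # (Dy, Dy), PSD
K = torch.einsum("nm,ij->nimj", k, B)                   # (N, Dy, N, Dy)

# Flatten (sample, output) pairs into a single index and solve once.
Kmat = K.reshape(N * Dy, N * Dy)
A = Kmat + lam * torch.eye(N * Dy, dtype=torch.float64)
L = torch.linalg.cholesky(A)
alpha = torch.cholesky_solve(Y.reshape(-1, 1), L).reshape(N, Dy)
```

The flattened system has size N*Dy, so the factorization cost grows cubically in N*Dy; this is the price of coupling the outputs.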

class dymad.modules.KRRTangent(kernel, ridge_init=0, jitter=1e-10, device=None)¶

Bases: KRRBase

KRR for vector fields on a manifold, using a specialized tangent kernel.

The formulation is based on Geometrically constrained Multivariate KRR (GMKRR) from

Huang, He, Harlim, Li, ICLR2025

Solves (Kxx + lambda I) vec(alpha) = vec(Y), but Kxx is given in a factorized form, so effectively we solve a smaller system in intrinsic dimension d << Dy.

(kxx + lambda I) vec(alpha) = vec(T^T * Y)

set_manifold(manifold)¶

Return type:: None

class dymad.modules.KernelAbstract(in_dim, dtype=None)¶

Bases: Module, ABC

Base interface for all kernels (scalar or operator-valued).

abstractmethod forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

abstract property is_operator_valued: bool¶: True for operator-valued kernels; False for scalar kernels.

set_reference_data(Xref)¶

Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.

By default the kernel is data-independent and does nothing.

Return type:: None

class dymad.modules.KernelOpSeparable(kernels, out_dim, Ls=None, dtype=None)¶

Bases: KernelOperatorValuedScalars

Separable operator-valued kernel K(x,z) = sum_i k_i(x,z; ell) * B_i, where B_i = L_i L_i^T is PSD and learnable.

Output shape: (…, N, Dy, M, Dy)

forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

class dymad.modules.KernelOpTangent(kernel, out_dim, dtype=None)¶

Bases: KernelOperatorValued

Operator-valued kernel for vector fields on a manifold

For manifold of intrinsic dimension d and ambient dimension Dy:

K(x,z) = k(x,z; ell) * T(x’) O(x’,z’) T(z’)^T

where O(x’,z’) = T(x’)^T T(z’), T of shape (Dy,d) holds the tangent basis vectors at x’ and z’, and the prime (’) denotes the state part of the input (the first out_dim dimensions). k is a scalar kernel that includes both states and inputs.

Returns a factored representation of the kernel to stay in intrinsic dimension

k(x,z; ell) O(x,z), T(x), T(z)

of shapes: (…, d, M, d), (…, d, Dy), (M, d, Dy)

forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

set_manifold(manifold)¶

Return type:: None

set_reference_data(Xref)¶

Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.

By default the kernel is data-independent and does nothing.

Return type:: None

class dymad.modules.KernelOperatorValued(in_dim, out_dim, dtype=None)¶

Bases: KernelAbstract, ABC

property is_operator_valued: bool¶: True for operator-valued kernels; False for scalar kernels.

class dymad.modules.KernelOperatorValuedScalars(kernels, out_dim, dtype=None)¶

Bases: KernelOperatorValued

Operator-valued kernel induced by scalar kernels.

Output shape: (…, N, Dy, M, Dy)

set_reference_data(Xref)¶

Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.

By default the kernel is data-independent and does nothing.

Return type:: None

class dymad.modules.KernelScDM(in_dim, eps_init=None, t_init=1.0, dtype=None)¶

Bases: KernelScalarValued

Symmetric-normalized diffusion kernel via diffusion maps.

Everything keeps autograd for eps and t.

property eps¶

forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

set_reference_data(Xref)¶

Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.

By default the kernel is data-independent and does nothing.

Return type:: None

property t¶

class dymad.modules.KernelScExp(in_dim, lengthscale_init=None, dtype=None)¶

Bases: KernelScalarValued

Scalar Exponential: k(x,z) = exp(-||x - z|| / ell), with a learnable positive lengthscale.

property ell¶

forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

class dymad.modules.KernelScRBF(in_dim, lengthscale_init=None, dtype=None)¶

Bases: KernelScalarValued

Scalar RBF: k(x,z) = exp(-0.5 * ||x - z||^2 / ell^2), with a learnable positive lengthscale.
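The formula can be evaluated for all pairs at once. The helper below is a hypothetical standalone function written for illustration, not the KernelScRBF implementation; it treats the lengthscale as a plain float rather than a learnable parameter.

```python
import torch

def rbf_kernel(X, Z=None, ell=1.0):
    """k(x, z) = exp(-0.5 * ||x - z||^2 / ell^2) for all pairs; returns (N, M)."""
    if Z is None:          # K(X, X) when no second argument is given
        Z = X
    sq_dist = torch.cdist(X, Z) ** 2
    return torch.exp(-0.5 * sq_dist / ell**2)

X = torch.tensor([[0.0, 0.0], [3.0, 4.0]])
K = rbf_kernel(X, ell=5.0)
```

Here the two points are a distance 5 apart, so with ell = 5 the off-diagonal entries equal exp(-0.5) and the diagonal entries equal 1.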

property ell¶

forward(X, Z=None)¶

Compute kernel between X (N,d) and Z (M,d).

If Z is None, compute K(X,X).

Returns:

Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)

set_reference_data(Xref)¶

Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.

By default the kernel is data-independent and does nothing.

Return type:: None

class dymad.modules.KernelScalarValued(in_dim, dtype=None)¶

Bases: KernelAbstract, ABC

property is_operator_valued: bool¶: True for operator-valued kernels; False for scalar kernels.

class dymad.modules.MLP(input_dim, hidden_dim, output_dim, *, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: Module

Fully-connected feed-forward network

Assuming the following architecture:

in_dim -> (Linear -> Act) x n_hidden -> Linear -> out_dim
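The architecture line can be read as the following construction. This is a sketch with a hypothetical helper, build_mlp, that covers only the hidden-layer pattern; it leaves out the initializers, the gain, and the end_activation option.

```python
import torch
from torch import nn

def build_mlp(input_dim, hidden_dim, output_dim, n_hidden, activation=nn.ReLU):
    """in_dim -> (Linear -> Act) x n_hidden -> Linear -> out_dim."""
    layers, width = [], input_dim
    for _ in range(n_hidden):
        layers += [nn.Linear(width, hidden_dim), activation()]
        width = hidden_dim
    layers.append(nn.Linear(width, output_dim))
    return nn.Sequential(*layers)

net = build_mlp(4, 16, 2, n_hidden=1)
y = net(torch.randn(8, 4))
```

With n_hidden=1 the result is Linear -> activation -> Linear, which matches the n_layers=2 case described in the parameter list.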

Parameters:

input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of every hidden layer.
output_dim (int) – Dimension of the network output.
n_layers (int, default = 2) –
Number of total layers.
- If 0, same as Identity, or TakeFirst.
- If 1, same as Linear.
- If 2, same as Linear -> activation -> Linear.
- Otherwise, hidden layers are inserted.
activation (nn.Module or Callable[[], nn.Module], default = nn.ReLU) – Non-linearity to insert after every hidden Linear. Pass either a class (e.g. nn.Tanh) or an already-constructed module.
weight_init (Callable[[torch.Tensor, float], None], default = nn.init.xavier_uniform_) – Function used to initialise each Linear layer’s weight tensor. Must accept (tensor, gain) signature like the functions in torch.nn.init.
bias_init (Callable[[torch.Tensor], None], default = nn.init.zeros_) – Function used to initialise each Linear layer’s bias tensor.
gain (Optional[float], default = 1.0) – In the linear layers, the weights are initialised using the standard nn.init.calculate_gain(<nonlinearity>); gain is multiplied with that calculated value. With the default gain=1 the calculated gain is unchanged.
end_activation (bool, default = True) –
- If True, the last layer is followed by an activation function.
- Otherwise, the last layer is a plain Linear layer.

diagnostic_info()¶

Return type:: str

forward(x)¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Return type:: Tensor

class dymad.modules.ResBlockGNN(input_dim, hidden_dim, output_dim, n_layers, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: GNN

Residual block with GNN as the nonlinearity.

See GNN for the arguments.

forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶

Forward pass through the GNN.

x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).

If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.

class dymad.modules.ResBlockMLP(input_dim, hidden_dim, output_dim, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶

Bases: MLP

Residual block with MLP as the nonlinearity.

See MLP for the arguments.

forward(x)¶

Forward pass through the residual block.

Parameters:: x (torch.Tensor) – Input tensor of shape (…, input_dim).
Returns:: Output tensor of shape (…, output_dim).
Return type:: torch.Tensor

class dymad.modules.SequentialBase(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶

Bases: Module

Interface module that handles time-delayed input sequences.

The module assumes that the input is given in shape (…, seq_len * input_dim), where seq_len is the length of the time-delay/sequence, and input_dim is the dimension of each step’s input features. The module reshapes the input to (…, seq_len, input_dim), passes it to an internal sequential model (e.g., RNN), and returns either the output at the last step (…, output_dim) or the full sequence flattened (…, seq_len * output_dim).

It supports two types of architecture:

Internally constructed RNN-like models, which are applied to the input in the standard way. This is usually good for defining dynamics.
Externally provided models, which process the sequence step by step and not necessarily recurrently. This is usually good for defining encoders/decoders. Examples are MLP and GNN.
In both cases, the models expect input of shape (-1, seq_len, input_dim) and return output of shape (-1, seq_len, output_dim).
In either case, a subclass must implement the _run_seq() method, which defines how to run the model.

The module can also operate in two modes

last_only=True: returns only the output at the last step (…, output_dim). Usually for dynamics.
last_only=False: returns the outputs at all steps, flattened (…, seq_len * output_dim). Usually for encoders/decoders.
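The reshaping and the two output modes can be sketched as follows. torch.nn.RNN is used here only as a stand-in sequential model, and its hidden size plays the role of output_dim; neither is claimed to match the internals of SequentialBase.

```python
import torch
from torch import nn

batch, seq_len, input_dim, hidden_dim = 4, 5, 3, 6
x_flat = torch.randn(batch, seq_len * input_dim)        # (..., seq_len * input_dim)

# Recover the time axis.
x_seq = x_flat.reshape(batch, seq_len, input_dim)       # (..., seq_len, input_dim)

# Stand-in sequential model; here the output size equals hidden_dim.
rnn = nn.RNN(input_dim, hidden_dim, batch_first=True)
out_seq, _ = rnn(x_seq)                                 # (batch, seq_len, hidden_dim)

out_last = out_seq[:, -1, :]                            # last_only=True
out_all = out_seq.reshape(batch, seq_len * hidden_dim)  # last_only=False
```

The last hidden_dim entries of out_all coincide with out_last, so the flattened output contains the last-step output as its final block.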

Parameters:

seq_len (int) – Length of the input sequences.
last_only (Optional[bool]) – Whether to return only the last step output, default is True.
net (Optional[nn.Module]) – Optional externally provided model. If None, an internal RNN-like model is constructed.
input_dim (Optional[int]) – Dimension of the input features of all steps (for RNN-like).
hidden_dim (Optional[int]) – Width of the hidden layers (for RNN-like).
output_dim (Optional[int]) – Dimension of the output features of all steps (for RNN-like).
n_layers (Optional[int]) – Number of layers (for RNN-like).
activation (Union[str, nn.Module, Callable[[], nn.Module]]) – Activation function (for RNN-like).
weight_init (Union[str, Callable[[torch.Tensor, float], None]]) – Weight initialization method (for RNN-like).
bias_init (Callable[[torch.Tensor], None]) – Bias initialization method (for RNN-like).
gain (Optional[float]) – Gain factor for weight initialization (for RNN-like).
dtype – Data type for the module (for RNN-like).
device – Device for the module (for RNN-like).
**kwargs – Additional keyword arguments passed to the internal model constructor.

forward(x, u=None)¶

The network performs the concatenation internally: x and u are each time-delayed and stacked, so applying the sequential model requires pairing x and u from the same step before concatenating them.

Parameters:

x (Tensor) – Stacked input tensor of shape (…, seq_len * x_dim)
u (Tensor | None) – (Optional) Stacked control tensor of shape (…, seq_len * u_dim); needed when serving as encoder with inputs.

Returns:

Stacked output tensor. If last_only is True, it has shape (…, output_dim) and holds the output of the sequential model at the last step; otherwise the outputs of all steps are concatenated into shape (…, seq_len * output_dim).

Return type:

Tensor
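Why the concatenation has to happen per step can be seen in a small example. The sketch assumes that the flat layout stores the steps contiguously, consistent with the reshape described for the class; the sizes and values are illustrative.

```python
import torch

batch, seq_len, x_dim, u_dim = 2, 3, 2, 1
x = torch.arange(batch * seq_len * x_dim, dtype=torch.float32).reshape(batch, seq_len * x_dim)
u = -torch.ones(batch, seq_len * u_dim)

# Unstack each signal into per-step features ...
x_seq = x.reshape(batch, seq_len, x_dim)
u_seq = u.reshape(batch, seq_len, u_dim)

# ... then join the features belonging to the same step.
xu_seq = torch.cat([x_seq, u_seq], dim=-1)              # (batch, seq_len, x_dim + u_dim)

# Naively concatenating the flat tensors would place all of u after all of x.
xu_naive = torch.cat([x, u], dim=-1).reshape(batch, seq_len, x_dim + u_dim)
```

In xu_seq every step carries its own control value in the last slot, whereas xu_naive mixes states from different steps and pushes the controls into the final row.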

class dymad.modules.SimpleRNN(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶

Bases: SequentialBase

A simple recurrent neural network module

One layer, unidirectional, but supports arbitrary activations with a linear readout.

class dymad.modules.StepwiseModel(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶

Bases: SequentialBase

Naive application of a network to each step of a sequence.

class dymad.modules.VanillaRNN(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶

Bases: SequentialBase

Vanilla RNN from pytorch.

dymad.modules.make_autoencoder(ae_type, input_dim, hidden_dim, latent_dim, enc_depth, dec_depth, output_dim=None, seq_len=None, **kwargs)¶

Factory function to create preset autoencoder models, including:

[mlp_smp] Simple version: MLP-in MLP-out
[mlp_res] Simple version but with ResBlockMLP
[mlp_cat] Concatenation as encoder [x MLP(x)], then TakeFirst as decoder
[mlp_seq_rnn] RNN-in MLP-out, using a 1-layer unidirectional RNN
[mlp_seq_std] RNN-in MLP-out, using standard RNN from pytorch
[mlp_seq_smp] MLP-in MLP-out, applied stepwise to sequences
The graph version of the above, e.g., gnn_smp, gnn_seq_rnn, etc.

Parameters:

ae_type (str) – Type of autoencoder to create.
input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of the hidden layers.
latent_dim (int) – Dimension of the latent/encoded space.
enc_depth (int) – Number of layers in the encoder.
dec_depth (int) – Number of layers in the decoder.
output_dim (int, optional) – Dimension of the output features, defaults to input_dim.
seq_len (int, optional) – Length of the input sequences (for sequence-based autoencoders).
**kwargs – Additional keyword arguments passed to the specific constructors.

Return type:

tuple[Module, Module]

dymad.modules.make_kernel(k_type, input_dim, output_dim=None, kopts=None, dtype=None, **kwargs)¶

Factory function to create preset kernels, including:

[sc_rbf] Scalar: Radial basis function kernel
[sc_dm] Scalar: Diffusion Map kernel
[op_sep] Operator-valued: Separable kernel with multiple scalar kernels

Parameters:

k_type (str) – Type of kernel to create. One of {‘sc_rbf’, ‘sc_dm’, ‘op_sep’}.
input_dim (int) – Dimension of the input features.
output_dim (int, optional) – Dimension of the output features.
kopts (List, optional) – List of scalar kernel options (for operator-valued kernels).
dtype – Data type of the kernel parameters.
**kwargs – Additional keyword arguments passed to the kernel constructors.

Return type:

Module

dymad.modules.make_krr(type, kernel, ridge_init=0, jitter=1e-10, dtype=None, device=None)¶

Factory function to create preset Kernel Ridge Regression (KRR) models, including:

[share] Multi-output KRR with shared scalar kernel
[indep] Multi-output KRR with independent scalar kernels
[opval] Multi-output KRR with operator-valued kernel
[tangent] KRR for vector fields on manifolds

Parameters:

type (str) – Type of KRR model to create. One of {‘krr_shared’, ‘krr_indep’, ‘krr_opval’, ‘krr_tangent’}.
kernel (Union[Dict, List[Dict]]) – Kernel configuration(s).
ridge_init (float, optional) – Initial value for the ridge regularization parameter.
jitter (float, optional) – Jitter added to the diagonal for numerical stability.

Return type:

Module

dymad.modules.make_network(nn_type, input_dim, hidden_dim, output_dim, n_layers, seq_len=None, **kwargs)¶

Factory function to create preset neural network models based on NN_MAP.

Parameters:

nn_type (str) – Type of network to create. One of the keys in NN_MAP: {‘mlp_smp’, ‘mlp_res’, ‘mlp_cat’, ‘mlp_1st’, ‘gnn_smp’, ‘gnn_res’, ‘gnn_cat’, ‘gnn_1st’, ‘seq_std’, ‘seq_rnn’}, or ‘seq’ prefixed versions of MLP and GNN types for sequence models.
input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of the hidden layers.
output_dim (int) – Dimension of the output space.
n_layers (int) – Number of layers in the network.
seq_len (int, optional) – Length of the input sequences (for sequence-based networks).
**kwargs – Additional keyword arguments passed to the specific constructors.

Returns:

The constructed neural network module.

Return type:

nn.Module

dymad.modules.scaled_cdist(X, Z, scale, p)¶

Pairwise distance ||X/scale - Z/scale||^p with broadcasting-friendly scaling.

Parameters:

X (torch.Tensor) – (N,d)
Z (torch.Tensor) – (M,d)
scale (float or torch.Tensor) – (d,) or scalar, positive
p (float) – order of the norm

Return type:

Tensor
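The broadcasting-friendly scaling can be sketched with torch.cdist. The example covers only the scaling step and the pairwise p-norm distance; whether the result is additionally raised to a power, as the formula above may suggest, is not shown because the source does not settle that detail.

```python
import torch

X = torch.tensor([[0.0, 0.0], [2.0, 3.0]])       # (N, d)
Z = torch.tensor([[2.0, 0.0]])                   # (M, d)
scale = torch.tensor([2.0, 3.0])                 # (d,), broadcasts over rows

# Divide each coordinate by its scale, then take pairwise p-norm distances.
dist = torch.cdist(X / scale, Z / scale, p=2.0)  # (N, M)
```

A scalar scale works the same way, since the division broadcasts over both axes.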

Modules

`collections`
`gnn`
`helpers`
`kernel`
`krr`
`linear`
`misc`
`mlp`
`sequential`