dymad.modules¶
- class dymad.modules.FlexLinear(in_features, out_features, bias=True, dtype=None, device=None)¶
Bases: Module
A linear layer that can store its weights either as a full matrix (M x N) or as low-rank factors (U, V) with efficient matvec operations.
In the low-rank mode, the weight matrix is represented as:
W = U @ V^T
where U is (M x r) and V is (N x r).
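The equivalence of the two storage modes can be checked with a small NumPy sketch. This is not the dymad implementation; the sizes, the bias `b`, and all variable names are assumptions chosen for illustration, with M as the output size and N as the input size.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, r = 6, 5, 2                    # out_features, in_features, rank (illustrative)
U = rng.standard_normal((M, r))      # (M x r) factor
V = rng.standard_normal((N, r))      # (N x r) factor
b = rng.standard_normal(M)
x = rng.standard_normal((3, N))      # batch of 3 inputs

# Full mode: materialize W = U @ V^T and apply it, O(M*N) work per input.
W = U @ V.T
y_full = x @ W.T + b

# Low-rank mode: apply the factors in sequence, O(r*(M+N)) work per input.
y_lowrank = (x @ V) @ U.T + b
```

Both paths give the same result up to round-off, since x W^T = x (U V^T)^T = (x V) U^T; the low-rank path never forms the M x N matrix.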
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type:
Tensor
- set_full(W, b)¶
Switch to full mode and copy parameters.
- set_lora(U, V, b)¶
Switch to low-rank mode and copy factors. U: (out x r), V: (in x r).
- set_weights(W=None, b=None, U=None, V=None)¶
- Return type:
tuple[Tensor,...]
- class dymad.modules.GNN(input_dim, hidden_dim, output_dim, n_layers, *, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: Module
Configurable Graph Neural Network using a choice of GCL (e.g., SAGEConv, ChebConv) and activations.
Due to the implementation, the GNN is applied sequentially to batch data.
To interface with other parts of the code, the model assumes the input to be node-wise, (…, n_nodes, n_input), but the output is reshaped to concatenate features across nodes, (…, n_nodes * n_output). See forward method for details.
- Parameters:
input_dim (int) – Dimension of input node features.
hidden_dim (int) – Dimension of hidden layers.
output_dim (int) – Dimension of output node features.
n_layers (int) – Number of GCL layers.
gcl (str | nn.Module | type, default='sage') – Graph convolution layer type or instance.
gcl_opts (dict, optional, default=None) – Options passed to the GCL constructor.
activation (str | nn.Module | type, default='prelu') – Activation function.
weight_init (str | callable, default='xavier_uniform') – Weight initializer.
bias_init (str | callable, default='zeros') – Bias initializer.
gain (float, default=1.0) – Extra gain modifier for weight initialization.
end_activation (bool, default=True) – Whether to apply activation after last layer.
- diagnostic_info()¶
- Return type:
str
- forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶
Forward pass through the GNN.
x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).
If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.
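The shape handling described above can be sketched in NumPy. The offsetting of node ids below is the common way to merge a batch of graphs into one disjoint graph; whether dymad aggregates in exactly this way is an assumption, as are all names and the row-major flattening order.

```python
import numpy as np

B, n_nodes, n_feat = 2, 3, 4
x = np.ones((B, n_nodes, n_feat))
# One edge list per sample, shape (B, n_edges, 2), with node ids local to each graph.
edge_index = np.array([[[0, 1], [1, 2]],
                       [[0, 2], [2, 1]]])

# Merge the B graphs into one disjoint graph by offsetting node ids per sample.
offsets = (np.arange(B) * n_nodes)[:, None, None]   # (B, 1, 1)
big_edges = (edge_index + offsets).reshape(-1, 2)   # (B * n_edges, 2)
big_x = x.reshape(B * n_nodes, n_feat)              # (B * n_nodes, n_feat)

# After a (hypothetical) graph layer, restore the batch axis and
# concatenate features across nodes: (..., n_nodes, n_feat) -> (..., n_nodes * n_feat).
node_out = big_x.reshape(B, n_nodes, n_feat)
flat = node_out.reshape(B, n_nodes * n_feat)
```

With this layout, the features of node j occupy the slice `[j * n_feat, (j + 1) * n_feat)` of the flattened output.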
- class dymad.modules.IdenCatGNN(input_dim, hidden_dim, output_dim, n_layers, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: GNN
Identity concatenation GNN.
This GNN concatenates the input with the output of the GNN.
Note
The output dimension represents the total output features and must be greater than the input dimension.
See GNN for the arguments.
- forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶
Forward pass through the GNN.
x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).
If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.
- class dymad.modules.IdenCatMLP(input_dim, hidden_dim, output_dim, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: MLP
Identity concatenation MLP.
This MLP concatenates the input with the output of the MLP.
Note
The output dimension represents the total output features and must be greater than the input dimension.
See MLP for the arguments.
- forward(x)¶
Forward pass through the identity concatenation MLP.
- Parameters:
x (torch.Tensor) – Input tensor of shape (…, input_dim).
- Returns:
Output tensor of shape (…, output_dim).
- Return type:
torch.Tensor
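A minimal NumPy sketch of identity concatenation, assuming the output is ordered as [x, f(x)] so that the first input_dim slots reproduce the input. The inner map `inner` is a hypothetical stand-in for the MLP and produces output_dim - input_dim features, which is why output_dim must exceed input_dim.

```python
import numpy as np

input_dim, output_dim = 3, 5
rng = np.random.default_rng(0)
# Stand-in for the inner network: maps input_dim -> output_dim - input_dim features.
A = rng.standard_normal((input_dim, output_dim - input_dim))

def inner(x):
    return np.tanh(x @ A)

def iden_cat(x):
    # Keep x unchanged in the leading slots and append the learned features.
    return np.concatenate([x, inner(x)], axis=-1)

x = rng.standard_normal((7, input_dim))
y = iden_cat(x)   # shape (7, output_dim)
```

Under this ordering, the input can be recovered exactly by taking the first input_dim entries of the output.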
- class dymad.modules.KRRBase(kernel, ridge_init=0, jitter=1e-10, device=None)¶
Bases: Module
Base class for Kernel Ridge Regression, in particular:
Multi-output, single scalar kernel (the most common case)
Multi-output, multiple scalar kernels (i.e., one kernel per output)
True operator-valued kernel (i.e., matrix-valued)
Subclasses must implement:
_ensure_solved(self)
_predict_from_solution(self, Xnew) -> (M, Dy)
- fit()¶
Precompute the linear solve, which can be backprop’d.
- Return type:
Tensor
- forward(Xnew)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- property ridge: Tensor¶
- set_train_data(X, Y)¶
- Return type:
None
- class dymad.modules.KRRMultiOutputIndep(kernel, ridge_init=0, jitter=1e-10, device=None)¶
Bases: KRRBase
Scalar KRR for multiple outputs, with one kernel per output.
A ModuleList of Dy scalar kernels (one per output).
Dy independent NxN Choleskys; Dy ridges (vector).
Bases: KRRBase
Scalar KRR for multiple outputs with a single shared kernel.
One NxN Cholesky; solve Dy outputs together. One lambda (scalar) by default.
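The shared-kernel case can be illustrated with a short NumPy sketch: one N x N Cholesky factor is reused for all Dy output columns. This is a generic KRR solve, not the dymad code; the RBF helper, the way `jitter` is added, and all names are assumptions.

```python
import numpy as np

def rbf(X, Z, ell=1.0):
    # Scalar RBF kernel between rows of X (N, d) and Z (M, d) -> (N, M).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

rng = np.random.default_rng(0)
N, d, Dy = 20, 2, 3
X = rng.standard_normal((N, d))
Y = rng.standard_normal((N, Dy))
lam, jitter = 1e-2, 1e-10

# One N x N factorization shared by all Dy output columns.
K = rbf(X, X) + (lam + jitter) * np.eye(N)
L = np.linalg.cholesky(K)
alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))   # (N, Dy)

# Prediction at new points: K(Xnew, X) @ alpha -> (M, Dy).
Xnew = rng.standard_normal((5, d))
Ypred = rbf(Xnew, X) @ alpha
```

The independent-kernel variant would repeat the factorization once per output column, each with its own kernel and ridge value.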
- class dymad.modules.KRROperatorValued(kernel, ridge_init=0, jitter=1e-10, device=None)¶
Bases: KRRBase
Operator-valued kernel K(X,Z) -> (N,M,Dy,Dy).
Solves (Kxx + lambda I) vec(alpha) = vec(Y), using a single (N*Dy)x(N*Dy) Cholesky.
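A sketch of the vectorized solve, using a separable kernel k(x,z) * B as a convenient example of an operator-valued kernel. The (N, Dy, M, Dy) index layout, the row-major vec(), and all names are assumptions made for illustration rather than the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, Dy = 8, 2, 3
X = rng.standard_normal((N, d))
Y = rng.standard_normal((N, Dy))
lam = 1e-2

# Scalar RBF Gram matrix and a PSD output matrix B = Lb Lb^T.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
k = np.exp(-0.5 * d2)
Lb = rng.standard_normal((Dy, Dy))
B = Lb @ Lb.T

# Operator-valued Gram tensor in (N, Dy, N, Dy) layout, flattened to (N*Dy, N*Dy).
K4 = np.einsum("nm,ab->namb", k, B)
Kmat = K4.reshape(N * Dy, N * Dy)

# Single Cholesky solve of (Kxx + lambda I) vec(alpha) = vec(Y).
C = np.linalg.cholesky(Kmat + lam * np.eye(N * Dy))
vec_alpha = np.linalg.solve(C.T, np.linalg.solve(C, Y.reshape(-1)))
alpha = vec_alpha.reshape(N, Dy)

# Fitted values at the training points: sum over m, b of K[n, a, m, b] * alpha[m, b].
Yfit = np.einsum("namb,mb->na", K4, alpha)
```

For a separable kernel the flattened Gram matrix equals the Kronecker product of k and B, which is why the system grows to size N*Dy.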
- class dymad.modules.KRRTangent(kernel, ridge_init=0, jitter=1e-10, device=None)¶
Bases: KRRBase
KRR for vector fields on a manifold, using a specialized tangent kernel.
The formulation is based on Geometrically constrained Multivariate KRR (GMKRR) from
Huang, He, Harlim, Li, ICLR 2025.
Solves (Kxx + lambda I) vec(alpha) = vec(Y), but Kxx is given in a factorized form, so effectively we solve a smaller system in the intrinsic dimension d << Dy:
(kxx + lambda I) vec(alpha) = vec(T^T * Y)
- set_manifold(manifold)¶
- Return type:
None
- class dymad.modules.KernelAbstract(in_dim, dtype=None)¶
Bases: Module, ABC
Base interface for all kernels (scalar or operator-valued).
- abstractmethod forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- abstract property is_operator_valued: bool¶
True for operator-valued kernels; False for scalar kernels.
- set_reference_data(Xref)¶
Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.
By default the kernel is data-independent and does nothing.
- Return type:
None
- class dymad.modules.KernelOpSeparable(kernels, out_dim, Ls=None, dtype=None)¶
Bases: KernelOperatorValuedScalars
Separable operator-valued kernel K(x,z) = sum_i k_i(x,z; ell) * B_i, where B_i = L_i L_i^T is PSD and learnable.
Output shape: (…, N, Dy, M, Dy)
- forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- class dymad.modules.KernelOpTangent(kernel, out_dim, dtype=None)¶
Bases: KernelOperatorValued
Operator-valued kernel for vector fields on a manifold.
For a manifold of intrinsic dimension d and ambient dimension Dy:
K(x,z) = k(x,z; ell) * T(x') O(x',z') T(z')^T
where O(x',z') = T(x')^T T(z'), and T, of shape (Dy, d), holds the tangent basis vectors at x' and z'. The prime (') denotes the state part of the input (the first out_dim dimensions). k is a scalar kernel that includes both states and inputs.
Returns a factored representation of the kernel to stay in the intrinsic dimension:
k(x,z; ell) O(x,z), T(x), T(z)
of shapes: (…, d, M, d), (…, d, Dy), (M, d, Dy)
- forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- set_manifold(manifold)¶
- Return type:
None
- set_reference_data(Xref)¶
Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.
By default the kernel is data-independent and does nothing.
- Return type:
None
- class dymad.modules.KernelOperatorValued(in_dim, out_dim, dtype=None)¶
Bases: KernelAbstract, ABC
- property is_operator_valued: bool¶
True for operator-valued kernels; False for scalar kernels.
- class dymad.modules.KernelOperatorValuedScalars(kernels, out_dim, dtype=None)¶
Bases: KernelOperatorValued
Operator-valued kernel induced by scalar kernels.
Output shape: (…, N, Dy, M, Dy)
- set_reference_data(Xref)¶
Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.
By default the kernel is data-independent and does nothing.
- Return type:
None
- class dymad.modules.KernelScDM(in_dim, eps_init=None, t_init=1.0, dtype=None)¶
Bases: KernelScalarValued
Symmetric-normalized diffusion kernel via diffusion maps.
All operations remain differentiable (autograd) with respect to eps and t.
- property eps¶
- forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- set_reference_data(Xref)¶
Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.
By default the kernel is data-independent and does nothing.
- Return type:
None
- property t¶
- class dymad.modules.KernelScExp(in_dim, lengthscale_init=None, dtype=None)¶
Bases: KernelScalarValued
Scalar Exponential: k(x,z) = exp(-||x - z|| / ell)
Learnable positive lengthscale.
- property ell¶
- forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- class dymad.modules.KernelScRBF(in_dim, lengthscale_init=None, dtype=None)¶
Bases: KernelScalarValued
Scalar RBF: k(x,z) = exp(-0.5 * ||x - z||^2 / ell^2)
Learnable positive lengthscale.
- property ell¶
- forward(X, Z=None)¶
Compute kernel between X (N,d) and Z (M,d).
If Z is None, compute K(X,X).
- Returns:
Scalar kernels: (N, M)
Operator-valued kernels: (N, Dy, M, Dy)
- set_reference_data(Xref)¶
Prepare data-dependent structures from Xref (N,d). Must be differentiable if kernel params are learnable.
By default the kernel is data-independent and does nothing.
- Return type:
None
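The RBF formula, and the exponential formula of KernelScExp, can be evaluated directly in NumPy as a sanity check. The function names here are hypothetical and the lengthscale is a fixed scalar rather than a learnable parameter.

```python
import numpy as np

def pairwise_dist(X, Z):
    # Euclidean distances between rows of X (N, d) and Z (M, d) -> (N, M).
    diff = X[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def k_rbf(X, Z=None, ell=1.0):
    # k(x,z) = exp(-0.5 * ||x - z||^2 / ell^2); Z=None means K(X, X).
    Z = X if Z is None else Z
    return np.exp(-0.5 * pairwise_dist(X, Z) ** 2 / ell**2)

def k_exp(X, Z=None, ell=1.0):
    # k(x,z) = exp(-||x - z|| / ell); Z=None means K(X, X).
    Z = X if Z is None else Z
    return np.exp(-pairwise_dist(X, Z) / ell)

X = np.array([[0.0, 0.0], [3.0, 4.0]])   # the two points are a distance 5 apart
K_rbf = k_rbf(X, ell=5.0)                # off-diagonal entries equal exp(-0.5)
K_exp = k_exp(X, ell=5.0)                # off-diagonal entries equal exp(-1)
```

Both kernels equal 1 on the diagonal and decay with distance; the exponential kernel decays more slowly at large distances and is not smooth at zero distance.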
- class dymad.modules.KernelScalarValued(in_dim, dtype=None)¶
Bases: KernelAbstract, ABC
- property is_operator_valued: bool¶
True for operator-valued kernels; False for scalar kernels.
- class dymad.modules.MLP(input_dim, hidden_dim, output_dim, *, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: Module
Fully-connected feed-forward network.
Assuming the following architecture:
in_dim -> (Linear -> Act) x n_hidden -> Linear -> out_dim
- Parameters:
input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of every hidden layer.
output_dim (int) – Dimension of the network output.
n_layers (int, default = 2) –
Number of total layers.
If 0, same as Identity, or TakeFirst.
If 1, same as Linear.
If 2, same as Linear -> activation -> Linear.
Otherwise, hidden layers are inserted.
activation (nn.Module or Callable[[], nn.Module], default = nn.ReLU) – Non-linearity to insert after every hidden Linear. Pass either a class (e.g. nn.Tanh) or an already-constructed module.
weight_init (Callable[[torch.Tensor, float], None], default = nn.init.xavier_uniform_) – Function used to initialise each Linear layer’s weight tensor. Must accept a (tensor, gain) signature like the functions in torch.nn.init.
bias_init (Callable[[torch.Tensor], None], default = nn.init.zeros_) – Function used to initialise each Linear layer’s bias tensor.
gain (Optional[float], default = 1.0) – In the linear layers, the weights are initialised with the standard nn.init.calculate_gain(<nonlinearity>); this gain is multiplied with the calculated gain. By default gain=1, so there is no change.
end_activation (bool, default = True) – If True, the last layer is followed by an activation function. Otherwise, the last layer is a plain Linear layer.
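The n_layers and end_activation rules above can be spelled out with a small pure-Python sketch that lists the layers a given configuration would contain. The helper `mlp_layout` is hypothetical and only mirrors the documented cases; it assumes that "same as Linear" and "Linear -> activation -> Linear" refer to end_activation=False, and it reports the n_layers=0 case simply as Identity.

```python
def mlp_layout(input_dim, hidden_dim, output_dim, n_layers=2, end_activation=True):
    """Return the layer sequence as a list of strings (illustration only)."""
    if n_layers == 0:
        return ["Identity"]
    # n_layers Linear layers in total; all intermediate widths are hidden_dim.
    dims = [input_dim] + [hidden_dim] * (n_layers - 1) + [output_dim]
    layers = []
    for i in range(n_layers):
        layers.append(f"Linear({dims[i]}, {dims[i + 1]})")
        is_last = i == n_layers - 1
        # Hidden layers always get an activation; the last one only if requested.
        if not is_last or end_activation:
            layers.append("Act")
    return layers

two_layer = mlp_layout(3, 8, 2, n_layers=2, end_activation=False)
# ['Linear(3, 8)', 'Act', 'Linear(8, 2)']
```

Calling `mlp_layout(3, 8, 2, n_layers=1, end_activation=False)` gives a single `Linear(3, 2)`, and larger n_layers values insert further hidden_dim-wide layers.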
- diagnostic_info()¶
- Return type:
str
- forward(x)¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- Return type:
Tensor
- class dymad.modules.ResBlockGNN(input_dim, hidden_dim, output_dim, n_layers, gcl='sage', gcl_opts=None, activation='prelu', weight_init='xavier_uniform', bias_init='zeros', gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: GNN
Residual block with GNN as the nonlinearity.
See GNN for the arguments.
- forward(x, edge_index, edge_weights, edge_attr, **kwargs)¶
Forward pass through the GNN.
x (…, n_nodes, n_features).
edge_index (…, n_edges, 2).
edge_weights (…, n_edges).
edge_attr (…, n_edges, n_edge_features).
Returns (…, n_nodes*n_new_features).
If …=1, we can process the entire batch in one go. Otherwise, we aggregate the graph on the fly so the shapes are reduced to the first case. The aggregation takes a bit more time.
- class dymad.modules.ResBlockMLP(input_dim, hidden_dim, output_dim, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, end_activation=True, dtype=None, device=None)¶
Bases: MLP
Residual block with MLP as the nonlinearity.
See MLP for the arguments.
- forward(x)¶
Forward pass through the residual block.
- Parameters:
x (torch.Tensor) – Input tensor of shape (…, input_dim).
- Returns:
Output tensor of shape (…, output_dim).
- Return type:
torch.Tensor
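A minimal NumPy sketch of a residual block, assuming the usual form y = x + f(x) with f standing in for the inner MLP. The source does not spell out the exact formula, so both this form and the requirement that f preserves the feature dimension are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4
A = rng.standard_normal((dim, dim)) * 0.1

def f(x):
    # Stand-in for the inner MLP; maps dim -> dim so the sum is well defined.
    return np.tanh(x @ A)

def res_block(x):
    # Skip connection: the network only has to learn the correction f(x).
    return x + f(x)

x = rng.standard_normal((5, dim))
y = res_block(x)   # same shape as x
```

When f is small, the block stays close to the identity map, which is one reason residual forms are popular for modelling increments of a state.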
- class dymad.modules.SequentialBase(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶
Bases: Module
Interface module that handles time-delayed input sequences.
The module assumes that the input has shape (…, seq_len * input_dim), where seq_len is the length of the time-delay/sequence and input_dim is the dimension of each step’s input features. The module reshapes the input to (…, seq_len, input_dim), passes it to an internal sequential model (e.g., an RNN), and returns the output either at the last step, (…, output_dim), or as the full sequence flattened, (…, seq_len * output_dim).
It considers two types of architectures:
Internally constructed RNN-like models, which are applied to the input in the standard way. This is usually good for defining dynamics.
Externally provided models, which process the input step by step and not necessarily recurrently. This is usually good for defining encoders/decoders. Examples are MLP and GNN.
In both cases, the models expect input of shape (-1, seq_len, input_dim) and return output of shape (-1, seq_len, output_dim).
In either case, a subclass must implement the _run_seq() method, which defines how to run the model.
The module can also operate in two modes:
last_only=True: returns only the output at the last step (…, output_dim). Usually for dynamics.
last_only=False: returns the outputs at all steps, flattened (…, seq_len * output_dim). Usually for encoders/decoders.
- Parameters:
seq_len (int) – Length of the input sequences.
last_only (Optional[bool]) – Whether to return only the last step output, default is True.
net (Optional[nn.Module]) – Optional externally provided model. If None, an internal RNN-like model is constructed.
input_dim (Optional[int]) – Dimension of the input features of all steps (for RNN-like).
hidden_dim (Optional[int]) – Width of the hidden layers (for RNN-like).
output_dim (Optional[int]) – Dimension of the output features of all steps (for RNN-like).
n_layers (Optional[int]) – Number of layers (for RNN-like).
activation (Union[str, nn.Module, Callable[[], nn.Module]]) – Activation function (for RNN-like).
weight_init (Union[str, Callable[[torch.Tensor, float], None]]) – Weight initialization method (for RNN-like).
bias_init (Callable[[torch.Tensor], None]) – Bias initialization method (for RNN-like).
gain (Optional[float]) – Gain factor for weight initialization (for RNN-like).
dtype – Data type for the module (for RNN-like).
device – Device for the module (for RNN-like).
**kwargs – Additional keyword arguments passed to the internal model constructor.
- forward(x, u=None)¶
The network does the concatenation internally: x and u are both time-delayed and stacked, so applying the sequential model requires first pairing the x and u of the same step and then concatenating them.
- Parameters:
x (Tensor) – Stacked input tensor of shape (…, seq_len * x_dim).
u (Tensor | None) – (Optional) Stacked control tensor of shape (…, seq_len * u_dim); needed when serving as an encoder with inputs.
- Returns:
output – Stacked output tensor of shape (…, output_dim), which holds the output of the sequential model at the last step if last_only is True; otherwise the outputs at all steps are concatenated.
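The reshaping and per-step pairing of x and u can be sketched in NumPy. The step-major memory layout (each step occupying a contiguous block of the stacked vector) and the identity stand-in for the sequential model are assumptions for illustration.

```python
import numpy as np

batch, seq_len, x_dim, u_dim = 2, 3, 4, 1
x = np.arange(batch * seq_len * x_dim, dtype=float).reshape(batch, seq_len * x_dim)
u = -np.ones((batch, seq_len * u_dim))

# Un-stack the delay embedding: (..., seq_len * dim) -> (..., seq_len, dim).
x_seq = x.reshape(batch, seq_len, x_dim)
u_seq = u.reshape(batch, seq_len, u_dim)

# Pair x and u of the same step, then concatenate along the feature axis.
xu_seq = np.concatenate([x_seq, u_seq], axis=-1)   # (batch, seq_len, x_dim + u_dim)

# Stand-in for the sequential model: the identity, so output_dim = x_dim + u_dim.
out_seq = xu_seq

last = out_seq[:, -1, :]            # last_only=True  -> (batch, output_dim)
full = out_seq.reshape(batch, -1)   # last_only=False -> (batch, seq_len * output_dim)
```

Concatenating the flat x and u directly would place all controls after all states, so the per-step pairing would be lost; this is why the reshape has to come first.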
- class dymad.modules.SimpleRNN(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶
Bases: SequentialBase
A simple recurrent neural network module.
One layer, unidirectional, but supports arbitrary activations with a linear readout.
- class dymad.modules.StepwiseModel(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶
Bases: SequentialBase
Naive application of a network to each step of a sequence.
- class dymad.modules.VanillaRNN(seq_len, *, last_only=True, net=None, input_dim=-1, hidden_dim=-1, output_dim=-1, n_layers=2, activation=<class 'torch.nn.modules.activation.ReLU'>, weight_init=<function xavier_uniform_>, bias_init=<function zeros_>, gain=1.0, dtype=None, device=None, **kwargs)¶
Bases: SequentialBase
Vanilla RNN from PyTorch.
- dymad.modules.make_autoencoder(ae_type, input_dim, hidden_dim, latent_dim, enc_depth, dec_depth, output_dim=None, seq_len=None, **kwargs)¶
Factory function to create preset autoencoder models. Including:
[mlp_smp] Simple version: MLP-in MLP-out
[mlp_res] Simple version but with ResBlockMLP
[mlp_cat] Concatenation as encoder [x MLP(x)], then TakeFirst as decoder
[mlp_seq_rnn] RNN-in MLP-out, using a 1-layer unidirectional RNN
[mlp_seq_std] RNN-in MLP-out, using standard RNN from pytorch
[mlp_seq_smp] MLP-in MLP-out, applied stepwise to sequences
The graph version of the above, e.g., gnn_smp, gnn_seq_rnn, etc.
- Parameters:
ae_type (str) – Type of autoencoder to create.
input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of the hidden layers.
latent_dim (int) – Dimension of the latent/encoded space.
enc_depth (int) – Number of layers in the encoder.
dec_depth (int) – Number of layers in the decoder.
output_dim (int, optional) – Dimension of the output features, defaults to input_dim.
seq_len (int, optional) – Length of the input sequences (for sequence-based autoencoders).
**kwargs – Additional keyword arguments passed to the specific constructors.
- Return type:
tuple[Module,Module]
- dymad.modules.make_kernel(k_type, input_dim, output_dim=None, kopts=None, dtype=None, **kwargs)¶
Factory function to create preset kernels. Including:
[sc_rbf] Scalar: Radial basis function kernel
[sc_dm] Scalar: Diffusion Map kernel
[op_sep] Operator-valued: Separable kernel with multiple scalar kernels
- Parameters:
k_type (str) – Type of kernel to create. One of {‘sc_rbf’, ‘sc_dm’, ‘op_sep’}.
input_dim (int) – Dimension of the input features.
output_dim (int, optional) – Dimension of the output features.
kopts (List, optional) – List of scalar kernel options (for operator-valued kernels).
dtype – Data type of the kernel parameters.
**kwargs – Additional keyword arguments passed to the kernel constructors.
- Return type:
Module
- dymad.modules.make_krr(type, kernel, ridge_init=0, jitter=1e-10, dtype=None, device=None)¶
Factory function to create preset Kernel Ridge Regression (KRR) models. Including:
[share] Multi-output KRR with shared scalar kernel
[indep] Multi-output KRR with independent scalar kernels
[opval] Multi-output KRR with operator-valued kernel
[tangent] KRR for vector fields on manifolds
- Parameters:
type (str) – Type of KRR model to create. One of {‘krr_shared’, ‘krr_indep’, ‘krr_opval’, ‘krr_tangent’}.
kernel (Union[Dict, List[Dict]]) – Kernel configuration(s).
ridge_init (float, optional) – Initial value for the ridge regularization parameter.
jitter (float, optional) – Jitter added to the diagonal for numerical stability.
- Return type:
Module
- dymad.modules.make_network(nn_type, input_dim, hidden_dim, output_dim, n_layers, seq_len=None, **kwargs)¶
Factory function to create preset neural network models based on NN_MAP.
- Parameters:
nn_type (str) – Type of network to create. One of the keys in NN_MAP: {‘mlp_smp’, ‘mlp_res’, ‘mlp_cat’, ‘mlp_1st’, ‘gnn_smp’, ‘gnn_res’, ‘gnn_cat’, ‘gnn_1st’, ‘seq_std’, ‘seq_rnn’}, or ‘seq’ prefixed versions of MLP and GNN types for sequence models.
input_dim (int) – Dimension of the input features.
hidden_dim (int) – Width of the hidden layers.
output_dim (int) – Dimension of the output space.
n_layers (int) – Number of layers in the network.
seq_len (int, optional) – Length of the input sequences (for sequence-based networks).
**kwargs – Additional keyword arguments passed to the specific constructors.
- Returns:
The constructed neural network module.
- Return type:
nn.Module
- dymad.modules.scaled_cdist(X, Z, scale, p)¶
Pairwise distance ||X/scale - Z/scale||^p with broadcasting-friendly scaling.
- Parameters:
X (torch.Tensor) – (N,d)
Z (torch.Tensor) – (M,d)
scale (float or torch.Tensor) – (d,) or scalar, positive
p (float) – order of the norm
- Return type:
Tensor
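A NumPy sketch of a scaled pairwise distance, reading p as the order of the norm as stated in the parameter list. Whether the library additionally raises the result to the power p, as the formula in the summary line might suggest, is not clear from the text, so this is only one possible interpretation; the function name is hypothetical.

```python
import numpy as np

def scaled_cdist_sketch(X, Z, scale, p=2.0):
    # Divide each feature by its scale; a scalar or a (d,) array both broadcast.
    Xs = X / scale
    Zs = Z / scale
    diff = np.abs(Xs[:, None, :] - Zs[None, :, :])   # (N, M, d)
    # p-norm of each pairwise difference -> (N, M).
    return (diff ** p).sum(-1) ** (1.0 / p)

X = np.array([[0.0, 0.0], [2.0, 0.0]])
Z = np.array([[0.0, 3.0]])
D = scaled_cdist_sketch(X, Z, scale=np.array([2.0, 3.0]), p=2.0)
```

Here the scaled points are (0, 0), (1, 0) and (0, 1), so the two distances are 1 and sqrt(2).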