dymad.io

class dymad.io.BoundaryLoadTrace(plan, model_ref)

Bases: object

model_ref: str
plan: PredictionWorkflowPlan
class dymad.io.DataInterface(model_class=None, checkpoint_path=None, config_path=None, config_mod=None, device=None)

Bases: object

Interface for data transforms, possibly with learned autoencoders.

It loads the model (if available) and data, sets up the necessary transformations, and provides methods to encode, decode, and apply observables.

Cases:

  • [Priority] checkpoint_path is given: Load the data transforms and model from the checkpoint. May contain autoencoders.

  • [Secondary] config_path and/or config_mod is given: Instantiate the data transforms from the config. No model (i.e., autoencoders) in this case.
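The priority between the two cases above can be sketched as follows. This is only an illustration of the documented ordering, not the library's code: the helper name `resolve_transform_source` and the error raised when nothing is supplied are assumptions.

```python
def resolve_transform_source(checkpoint_path=None, config_path=None, config_mod=None):
    """Hypothetical helper: report which source the transforms would come from."""
    # [Priority] A checkpoint wins whenever it is given.
    if checkpoint_path is not None:
        return "checkpoint"
    # [Secondary] Otherwise fall back to the config file and/or config overrides.
    if config_path is not None or config_mod is not None:
        return "config"
    # Assumption: with no source at all there is nothing to build transforms from.
    raise ValueError("Provide checkpoint_path, config_path, or config_mod.")


both = resolve_transform_source(checkpoint_path="model.pt", config_path="cfg.yaml")
config_only = resolve_transform_source(config_mod={"key": "value"})
```

In the checkpoint case the interface may also carry autoencoders; in the config case no model is loaded.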

apply_obs(fobs)

Apply a generic observable to the raw data.

Parameters:

fobs (Callable) – Observable function. It should accept a 2D array input with each row as one step. The output should be a 1D array, whose ith entry corresponds to the ith step.

Return type:

ndarray
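As an illustration of the contract described for fobs, here is a minimal sketch of an observable that maps a 2D array (one row per step) to a 1D array (one value per step). The function name `half_squared_norm` and the toy data are assumptions made for this example only.

```python
import numpy as np


def half_squared_norm(X):
    """Hypothetical observable: half the squared Euclidean norm of each row."""
    X = np.asarray(X, dtype=float)
    # Reduce over the feature axis so that entry i corresponds to step i.
    return 0.5 * np.sum(X**2, axis=1)


# Toy raw data: 4 steps, 3 features per step.
X = np.arange(12, dtype=float).reshape(4, 3)
values = half_squared_norm(X)
```

A function with this shape contract is what would be passed as fobs.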

decode(X, rng=None)

Decode trajectory data from the observer space.

Return type:

ndarray

encode(X, rng=None)

Encode new trajectory data to the observer space.

Return type:

ndarray

get_backward_modes(ref=None, rng=None, **kwargs)
Return type:

ndarray

get_forward_modes(ref=None, rng=None, **kwargs)
Return type:

ndarray

class dymad.io.TrajectoryManager(metadata, data_key=None, device=device(type='cpu'))

Bases: object

A class to manage trajectory data loading, preprocessing, and dataloader creation.

The workflow includes:

  • Loading raw data from a binary file.

  • Preprocessing (trimming trajectories, subsetting, etc.).

  • Creating a dataset.

  • Normalizing and transforming the data using specified transformations.

  • Creating a dataloader.

The class is configured via a YAML configuration file.

Parameters:
  • metadata (dict) – Configuration dictionary.

  • data_key (str) – Dataset to read, one of ‘train’, ‘valid’, ‘test’.

  • device (torch.device) – Torch device to use.

apply_data_transformations()

Apply data transformations to the loaded trajectories and control inputs. This creates the dataset.

This method applies the transformations defined in the configuration for x, y, u, and p.

Return type:

None

create_dataloaders(*, typed=False)

Create dataloaders for the data set.

Return type:

None

create_regular_series_dataset(indices=None)

Expose the first typed data seam for regular trajectory preprocessing.

Return type:

list[RegularSeries]

data_truncation()

Truncate the loaded data according to the configuration.

Return type:

None

This includes:
  • Subsetting the number of trajectories and horizon (n_steps).

  • Populating basic metadata (dt, tf, shapes, etc.).
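The two truncation steps can be sketched with plain NumPy. This is a minimal illustration under stated assumptions: the helper `truncate`, the choice to keep the leading trajectories and samples, the fixed `dt`, and the metadata keys are all hypothetical and may differ from what the library does.

```python
import numpy as np


def truncate(xs, n_traj=None, n_steps=None):
    """Hypothetical helper: keep the first n_traj trajectories and n_steps samples."""
    kept = xs[:n_traj]  # a value of None keeps every trajectory
    return [np.asarray(x)[:n_steps] for x in kept]


# Three toy trajectories of different lengths, 3 state features each.
xs = [np.zeros((100, 3)), np.zeros((80, 3)), np.zeros((60, 3))]
short = truncate(xs, n_traj=2, n_steps=50)

dt = 0.1  # assumed uniform time step
meta = {
    "n_traj": len(short),
    "n_steps": [len(x) for x in short],
    "n_features": short[0].shape[1],
    "dt": dt,
    "tf": dt * (len(short[0]) - 1),  # final time of the first trajectory
}
```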

load_data()

Load raw data from a binary file.

Return type:

dict

The file is assumed to store (in order):

x: array-like or list of array-like, shape (n_samples, n_state_features). State data. If data contains multiple trajectories, x should be a list containing data for each trajectory. Individual trajectories may contain different numbers of samples.

t: float, numpy array of shape (n_samples,), or list of numpy arrays. If t is a float, it specifies the timestep between each sample. If array-like, it specifies the time (seconds in physical time) at which each sample was collected. In this case the values in t must be strictly increasing. In the case of multi-trajectory data, t may also be a list of arrays containing the collection times for each individual trajectory.

u: array-like or list of array-like, shape (n_samples, n_control_features), optional (default None). Control variables/inputs. If data contains multiple trajectories (i.e. if x is a list of array-like), then u should be a list containing control variable data for each trajectory. Individual trajectories may contain different numbers of samples.
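To make the layout concrete, the sketch below builds a two-trajectory data set and checks it against the rules stated above. The validator `check_layout` is a hypothetical helper written for this example; it is not part of dymad, and the on-disk file format is not covered here.

```python
import numpy as np


def check_layout(x, t, u=None):
    """Hypothetical validator for the x/t/u layout; returns samples per trajectory."""
    xs = x if isinstance(x, list) else [x]
    n_traj = len(xs)
    if isinstance(t, (int, float)):
        ts = [float(t)] * n_traj  # one shared time step
    elif isinstance(t, list):
        ts = t                    # one time array per trajectory
    else:
        ts = [t] * n_traj         # a single time array
    us = [None] * n_traj if u is None else (u if isinstance(u, list) else [u])
    if not (len(ts) == len(us) == n_traj):
        raise ValueError("x, t, and u must describe the same number of trajectories")
    lengths = []
    for xi, ti, ui in zip(xs, ts, us):
        xi = np.asarray(xi)
        if xi.ndim != 2:
            raise ValueError("each trajectory must be (n_samples, n_state_features)")
        n = xi.shape[0]
        if isinstance(ti, float):
            if ti <= 0:
                raise ValueError("a scalar t is a time step and must be positive")
        else:
            ti = np.asarray(ti)
            if ti.shape != (n,) or np.any(np.diff(ti) <= 0):
                raise ValueError("t must have shape (n_samples,) and strictly increase")
        if ui is not None and np.asarray(ui).shape[0] != n:
            raise ValueError("u must have one row per sample")
        lengths.append(n)
    return lengths


rng = np.random.default_rng(0)
x = [rng.normal(size=(50, 2)), rng.normal(size=(30, 2))]      # two trajectories
t = [np.linspace(0.0, 4.9, 50), np.linspace(0.0, 2.9, 30)]    # sample times
u = [rng.normal(size=(50, 1)), rng.normal(size=(30, 1))]      # control inputs
lengths = check_layout(x, t, u)
```

A single trajectory with a scalar time step, for example `check_layout(x[0], 0.1)`, passes the same check.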

prepare_data()

Handy function to load and truncate data in one call.

Return type:

None

process_all(*, typed=False)
Returns:

A tuple containing the dataloader, dataset, and metadata.

Return type:

tuple

process_data(*, typed=False)

Run the latter half of process_all.

Return type:

tuple[Union[DataLoader[RegularTrainerBatch], DataLoader[GraphTrainerBatch]], list[RegularSeries] | list[GraphSeries], dict]

set_data_index(index=None)

Set the data index for this TrajectoryManager.

Return type:

None

set_transforms(metadata=None, trajmgr=None)
Return type:

None

update_config(config)

Update the configuration metadata. After this step, data transformations need to be refitted.

Return type:

None

class dymad.io.TrajectoryManagerGraph(metadata, data_key='train', device=device(type='cpu'), adj=None)

Bases: TrajectoryManager

A class to manage trajectory data loading, preprocessing, and dataloader creation - graph version.

The graph data is assumed to be homogeneous, i.e., each node has the same number of features. Hence the normalization, if done, is applied globally to all nodes.

However, the number of edges, and hence any other quantity defined on edges, can vary over time.

In the raw data, the nodal state features are expected to be concatenated sequentially. For example, for N nodes with M features each, the raw data for states at a time step is

\[x = [x_1, x_2, \ldots, x_N], \quad \text{where } x_i \in \mathbb{R}^M.\]

The same applies to other data members, if present.

Parameters:
  • metadata (dict) – Configuration dictionary.

  • device (torch.device) – Torch device to use.

  • adj (torch.Tensor or np.ndarray, optional) – Adjacency matrix for GNN models. If not provided, will try to get from config.

apply_data_transformations()

Apply data transformations to the loaded trajectories and control inputs. This creates the dataset.

The raw data is expected to be [T, n_nodes * n_features], but the transformation assumes [T * n_nodes, n_features]. So extra reshaping is needed.

Return type:

None
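The reshaping between the raw layout [T, n_nodes * n_features] and the layout assumed by the transformation, [T * n_nodes, n_features], can be sketched with NumPy. The z-score normalization below is only a stand-in for whatever transformation the configuration specifies, and all variable names are assumptions for this example.

```python
import numpy as np

T, n_nodes, n_features = 5, 3, 2

# Raw layout: each row is one time step with nodal features concatenated,
# i.e. [x_1, x_2, ..., x_N] with x_i holding n_features values.
raw = np.arange(T * n_nodes * n_features, dtype=float).reshape(T, n_nodes * n_features)

# [T, n_nodes * n_features] -> [T * n_nodes, n_features]: one row per (step, node).
stacked = raw.reshape(T * n_nodes, n_features)

# Stand-in global transformation: statistics are shared by all nodes.
mean = stacked.mean(axis=0)
std = stacked.std(axis=0)
normalized = (stacked - mean) / std

# Undo the transformation and restore the raw layout.
restored = (normalized * std + mean).reshape(T, n_nodes * n_features)
```

Because the statistics are computed over the stacked rows, every node is normalized with the same per-feature mean and standard deviation, which matches the homogeneity assumption for graph data.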

create_dataloaders(*, typed=False)

For graph data, we aggregate the trajectories into batches of graphs.

Return type:

None

create_graph_series_dataset(indices=None)

Expose the typed graph-series seam for graph trajectory preprocessing.

Return type:

list[GraphSeries]

data_truncation()

Truncate the loaded data according to the configuration.

Return type:

None

This includes:
  • Subsetting the number of trajectories and horizon (n_steps).

  • Populating basic metadata (dt, tf, shapes, etc.).

load_data()

Load raw data from a binary file.

Return type:

dict

The file is assumed to store (in order):

x: array-like or list of array-like, shape (n_samples, n_state_features). State data. If data contains multiple trajectories, x should be a list containing data for each trajectory. Individual trajectories may contain different numbers of samples.

t: float, numpy array of shape (n_samples,), or list of numpy arrays. If t is a float, it specifies the timestep between each sample. If array-like, it specifies the time (seconds in physical time) at which each sample was collected. In this case the values in t must be strictly increasing. In the case of multi-trajectory data, t may also be a list of arrays containing the collection times for each individual trajectory.

u: array-like or list of array-like, shape (n_samples, n_control_features), optional (default None). Control variables/inputs. If data contains multiple trajectories (i.e. if x is a list of array-like), then u should be a list containing control variable data for each trajectory. Individual trajectories may contain different numbers of samples.

set_transforms(metadata=None, trajmgr=None)
Return type:

None

dymad.io.load_model(model_class, checkpoint_path, *, context=None, horizon=1, has_control=False, has_graph=False, return_trace=False)

Load a model from a checkpoint and optionally record the boundary plan.

dymad.io.visualize_model(mdl_class=None, checkpoint_path=None, model=None, prd_func=None, ref_data=None, depth=1, device='cpu', ifsave=False, show_all_paths=False)

Modules

checkpoint

series_adapter

Array-to-series adapters for typed regular and graph payloads.

trajectory_manager