dymad.io.trajectory_manager

Classes

TrajectoryManager(metadata[, data_key, device])

A class to manage trajectory data loading, preprocessing, and dataloader creation.

TrajectoryManagerGraph(metadata[, data_key, ...])

A class to manage trajectory data loading, preprocessing, and dataloader creation - graph version.

class dymad.io.trajectory_manager.TrajectoryManager(metadata, data_key=None, device=device(type='cpu'))

Bases: object

A class to manage trajectory data loading, preprocessing, and dataloader creation.

The workflow includes:

  • Loading raw data from a binary file.

  • Preprocessing (trimming trajectories, subsetting, etc.).

  • Creating a dataset.

  • Normalizing and transforming the data using specified transformations.

  • Creating a dataloader.

The class is configured via a YAML configuration file.

Parameters:
  • metadata (dict) – Configuration dictionary.

  • data_key (str, optional) – Dataset to read, one of ‘train’, ‘valid’, ‘test’.

  • device (torch.device) – Torch device to use.
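The five workflow steps can be illustrated without dymad at all. The sketch below is a plain-NumPy stand-in, not the library's implementation: every function name, the synthetic data, and the z-score normalization are assumptions chosen for illustration.

```python
import numpy as np

def load_raw(n_traj=4, n_samples=50, n_state=2, dt=0.1, seed=0):
    """Stand-in for step 1: generate synthetic trajectories instead of reading a file."""
    rng = np.random.default_rng(seed)
    x = [rng.normal(size=(n_samples, n_state)) for _ in range(n_traj)]
    t = [np.arange(n_samples) * dt for _ in range(n_traj)]
    return {"x": x, "t": t}

def truncate(raw, n_traj, n_steps):
    """Step 2: keep the first n_traj trajectories and the first n_steps samples."""
    return {k: [a[:n_steps] for a in v[:n_traj]] for k, v in raw.items()}

def normalize(x_list):
    """Steps 3-4: z-score using statistics pooled over all trajectories."""
    stacked = np.concatenate(x_list, axis=0)
    mean, std = stacked.mean(axis=0), stacked.std(axis=0)
    return [(x - mean) / std for x in x_list]

def batches(x_list, batch_size):
    """Step 5: yield stacked mini-batches of equal-length trajectories."""
    for i in range(0, len(x_list), batch_size):
        yield np.stack(x_list[i:i + batch_size])

raw = load_raw()
data = truncate(raw, n_traj=3, n_steps=20)
x_norm = normalize(data["x"])
batch_shapes = [b.shape for b in batches(x_norm, batch_size=2)]
```

Here three truncated trajectories are split into one batch of two and one batch of one. A real dataloader would typically also shuffle and move tensors to the configured device.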

apply_data_transformations()

Apply data transformations to the loaded trajectories and control inputs. This creates the dataset.

This method applies the transformations defined in the configuration for x, y, u, and p.

Return type:

None

create_dataloaders(*, typed=False)

Create dataloaders for the data set.

Return type:

None

create_regular_series_dataset(indices=None)

Expose the first typed data seam for regular trajectory preprocessing.

Return type:

list[RegularSeries]

data_truncation()

Truncate the loaded data according to the configuration.

Return type:

None

This includes:
  • Subsetting the number of trajectories and horizon (n_steps).

  • Populating basic metadata (dt, tf, shapes, etc.).
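As a rough illustration of the two steps listed above, the following sketch subsets a list of trajectories and derives a few summary values. The helper name `truncate_and_describe`, the `n_traj` argument, and the metadata keys are hypothetical; only `n_steps`, `dt`, and `tf` are names taken from this page, and the keys dymad actually stores may differ.

```python
import numpy as np

def truncate_and_describe(x, t, n_traj=None, n_steps=None):
    """Keep the first n_traj trajectories and n_steps samples, then summarize."""
    x = [xi[:n_steps] for xi in x[:n_traj]]
    t = [ti[:n_steps] for ti in t[:n_traj]]
    meta = {
        "n_traj": len(x),
        "n_state_features": x[0].shape[1],
        "dt": float(t[0][1] - t[0][0]),   # assumes uniform sampling
        "tf": float(t[0][-1]),            # final time of the first trajectory
    }
    return x, t, meta

# Five trajectories of 11 samples each, 3 state features.
t_full = [np.linspace(0.0, 1.0, 11) for _ in range(5)]
x_full = [np.zeros((11, 3)) for _ in range(5)]
x_cut, t_cut, meta = truncate_and_describe(x_full, t_full, n_traj=2, n_steps=6)
```

Passing `None` for either argument leaves that dimension untouched, since slicing with `None` keeps the whole sequence.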

load_data()

Load raw data from a binary file.

Return type:

dict

The file is assumed to store (in order):

x: array-like or list of array-like, shape (n_samples, n_state_features). State data. If the data contains multiple trajectories, x should be a list containing the data for each trajectory. Individual trajectories may contain different numbers of samples.

t: float, numpy array of shape (n_samples,), or list of numpy arrays. If t is a float, it specifies the timestep between samples. If array-like, it specifies the time (in seconds of physical time) at which each sample was collected; in this case the values in t must be strictly increasing. For multi-trajectory data, t may also be a list of arrays containing the collection times for each individual trajectory.

u: array-like or list of array-like, shape (n_samples, n_control_features), optional (default None). Control variables/inputs. If the data contains multiple trajectories (i.e., if x is a list of array-like), then u should be a list containing the control variable data for each trajectory. Individual trajectories may contain different numbers of samples.
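The layout described above can be mocked up with NumPy as follows. This is a sketch only: `check_layout` is a hypothetical helper, not a dymad function, and the on-disk format is not shown because this page does not specify it.

```python
import numpy as np

def check_layout(x, t, u=None):
    """Validate multi-trajectory x, t, u against the layout described above."""
    for i, xi in enumerate(x):
        n = xi.shape[0]
        ti = t if np.isscalar(t) else t[i]
        if not np.isscalar(ti):
            if ti.shape != (n,):
                raise ValueError("t must have one entry per sample")
            if not np.all(np.diff(ti) > 0):
                raise ValueError("t must be strictly increasing")
        if u is not None and u[i].shape[0] != n:
            raise ValueError("u must have one row per sample")
    return True

# Two trajectories of different lengths, 3 state features, 1 control feature.
x = [np.zeros((10, 3)), np.zeros((7, 3))]
t = [np.linspace(0.0, 0.9, 10), np.linspace(0.0, 0.6, 7)]
u = [np.ones((10, 1)), np.ones((7, 1))]

ok_arrays = check_layout(x, t, u)
ok_scalar = check_layout(x, 0.1)   # a float t is read as a fixed timestep
```

The two trajectories deliberately differ in length, which the description above allows.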

prepare_data()

Convenience method that loads and truncates the data in one call.

Return type:

None

process_all(*, typed=False)

Returns:

A tuple containing the dataloader, dataset, and metadata.

Return type:

tuple

process_data(*, typed=False)

Run the latter half of process_all.

Return type:

tuple[Union[DataLoader[RegularTrainerBatch], DataLoader[GraphTrainerBatch]], list[RegularSeries] | list[GraphSeries], dict]

set_data_index(index=None)

Set the data index for this TrajectoryManager.

Return type:

None

set_transforms(metadata=None, trajmgr=None)

Return type:

None

update_config(config)

Update the configuration metadata. After this step, data transformations need to be refitted.

Return type:

None

class dymad.io.trajectory_manager.TrajectoryManagerGraph(metadata, data_key='train', device=device(type='cpu'), adj=None)

Bases: TrajectoryManager

A class to manage trajectory data loading, preprocessing, and dataloader creation - graph version.

The graph data is assumed to be homogeneous, i.e., each node has the same number of features. Hence the normalization, if done, is applied globally to all nodes.

However, the number of edges can vary over time, and so can other quantities defined on edges.

In the raw data, the nodal state features are expected to be concatenated sequentially. For example, for N nodes with M features each, the raw data for states at a time step is

\[x = [x_1, x_2, \ldots, x_N], \quad \text{where } x_i \in \mathbb{R}^M.\]

The same applies to other data members, if present.
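A small NumPy sketch of this layout, with made-up sizes (N = 3 nodes, M = 2 features per node), shows how per-node features end up side by side in one row per time step. The variable names are illustrative only.

```python
import numpy as np

N, M, T = 3, 2, 4                       # nodes, features per node, time steps

# Per-node trajectories: node i contributes an array of shape (T, M).
per_node = [np.full((T, M), fill_value=i + 1.0) for i in range(N)]

# Concatenate node blocks along the feature axis: x = [x_1, x_2, ..., x_N].
x = np.concatenate(per_node, axis=1)    # shape (T, N * M)

# Features of node i (0-based) occupy columns i*M up to (i+1)*M.
x_2 = x[:, 1 * M:2 * M]
```

Each row of `x` therefore holds the full graph state at one time step, with the second node's features in columns 2 and 3.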

Parameters:
  • metadata (dict) – Configuration dictionary.

  • data_key (str) – Dataset to read, one of ‘train’, ‘valid’, ‘test’.

  • device (torch.device) – Torch device to use.

  • adj (torch.Tensor or np.ndarray, optional) – Adjacency matrix for GNN models. If not provided, will try to get from config.

apply_data_transformations()

Apply data transformations to the loaded trajectories and control inputs. This creates the dataset.

The raw data is expected to have shape [T, n_nodes * n_features], but the transformation assumes [T * n_nodes, n_features], so extra reshaping is needed.
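The reshaping can be sketched in NumPy as below. The z-score step stands in for whatever transformation the configuration specifies, and the variable names are illustrative. The point is only that flattening nodes into rows lets one set of statistics apply to every node.

```python
import numpy as np

T, n_nodes, n_features = 5, 3, 2
rng = np.random.default_rng(0)
raw = rng.normal(size=(T, n_nodes * n_features))       # [T, n_nodes * n_features]

# Row-major reshape turns each node's feature block into its own row.
flat = raw.reshape(T * n_nodes, n_features)            # [T * n_nodes, n_features]

# One mean/std per feature, shared by all nodes (global normalization).
mean, std = flat.mean(axis=0), flat.std(axis=0)
flat_norm = (flat - mean) / std

# Restore the original layout for downstream use.
restored = flat_norm.reshape(T, n_nodes * n_features)
```

Because the nodal features are concatenated node by node, row `t * n_nodes + i` of `flat` is exactly node `i` at time step `t`.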

Return type:

None

create_dataloaders(*, typed=False)

For graph data, we aggregate the trajectories into batches of graphs.

Return type:

None

create_graph_series_dataset(indices=None)

Expose the typed graph-series seam for graph trajectory preprocessing.

Return type:

list[GraphSeries]

data_truncation()

Truncate the loaded data according to the configuration.

Return type:

None

This includes:
  • Subsetting the number of trajectories and horizon (n_steps).

  • Populating basic metadata (dt, tf, shapes, etc.).

load_data()

Load raw data from a binary file.

Return type:

dict

The file is assumed to store (in order):

x: array-like or list of array-like, shape (n_samples, n_state_features). State data. If the data contains multiple trajectories, x should be a list containing the data for each trajectory. Individual trajectories may contain different numbers of samples.

t: float, numpy array of shape (n_samples,), or list of numpy arrays. If t is a float, it specifies the timestep between samples. If array-like, it specifies the time (in seconds of physical time) at which each sample was collected; in this case the values in t must be strictly increasing. For multi-trajectory data, t may also be a list of arrays containing the collection times for each individual trajectory.

u: array-like or list of array-like, shape (n_samples, n_control_features), optional (default None). Control variables/inputs. If the data contains multiple trajectories (i.e., if x is a list of array-like), then u should be a list containing the control variable data for each trajectory. Individual trajectories may contain different numbers of samples.

set_transforms(metadata=None, trajmgr=None)

Return type:

None