dymad.training

class dymad.training.AnalysisPhaseSpec(name, *, split='valid', evaluate_all=False, config=None)

Bases: BasePhaseSpec

evaluate_all: bool = False
split: str = 'valid'
class dymad.training.ArtifactRegistry(_artifacts=<factory>)

Bases: object

Typed intermediate artifacts shared across phases.

checkpoint_payload()
Return type:

dict[str, Any]

classmethod from_checkpoint_payload(payload)
Return type:

ArtifactRegistry

get(key, default=None)
Return type:

Any

keys()
Return type:

Iterable[str]

put(key, artifact)
Return type:

Any

require(key, expected_type=None)
Return type:

Any

class dymad.training.CVResult(params, fold_metrics, mean_metric=0.0, std_metric=0.0, checkpoint_paths=<factory>)

Bases: object

checkpoint_paths: list[str]
fold_metrics: list[float]
mean_metric: float = 0.0
params: dict[str, Any]
std_metric: float = 0.0
class dymad.training.DataPhaseSpec(name, *, operation='context', config=None)

Bases: BasePhaseSpec

operation: str = 'context'
class dymad.training.DriverBase(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: object

Base driver: loops over (parameter combos x folds) and calls the optimizer.

CV_SEARCH_HANDLERS: dict[str, str] = {'grid': '_execute_cv_search_grid', 'nelder_mead_like': '_execute_cv_search_nelder_mead_like'}
iter_folds()

Yield (fold_id, fold_config) pairs.

fold_config is a full config dict (deep copy of base_config with fold-specific overrides, e.g. split_seed).

Return type:

Iterable[tuple[int, dict[str, Any]]]
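The contract above (one deep-copied config per fold, with fold-specific overrides) can be illustrated with a minimal standalone sketch. It is not the dymad implementation: the function name iter_folds_sketch, the n_folds and base_seed parameters, and the placement of split_seed at the top level of the config are assumptions made for illustration.

```python
import copy
from typing import Any, Iterable


def iter_folds_sketch(
    base_config: dict[str, Any], n_folds: int, base_seed: int = 0
) -> Iterable[tuple[int, dict[str, Any]]]:
    """Yield (fold_id, fold_config) pairs (hypothetical helper)."""
    for fold_id in range(n_folds):
        # Deep copy so that per-fold edits never leak into the base config.
        fold_config = copy.deepcopy(base_config)
        # Assumed override: one distinct split seed per fold.
        fold_config["split_seed"] = base_seed + fold_id
        yield fold_id, fold_config


base = {"model": {"hidden": 16}, "split_seed": 0}
folds = list(iter_folds_sketch(base, n_folds=3, base_seed=100))
```

Because each fold receives its own deep copy, mutating one fold_config leaves base and the other folds untouched.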

train(continue_training=False)

Core loop over hyperparameter and fold combinations.

Return type:

tuple[int, CVResult, list[CVResult]]

Returns:

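best_index, best_result, all_results (the leading int in the return type is presumably the index of best_result within all_results)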

class dymad.training.EvaluationArtifact(metrics=<factory>, split='valid', criterion_name='total')

Bases: object

criterion_name: str = 'total'
metrics: dict[str, float]
split: str = 'valid'
class dymad.training.ExecutionServices(device, checkpoint_prefix, results_prefix, log_level='info', log_stdout=False)

Bases: object

Non-checkpointable runtime policy for one training execution path.

apply_to_config(config)
Return type:

dict[str, Any]

checkpoint_file(file_name)
Return type:

str

checkpoint_prefix: str
configure_logger(name, prefix=None)
Return type:

Logger

device: device
ensure_artifact_dirs()
Return type:

None

classmethod from_config(config, default_device=None)
Return type:

ExecutionServices

classmethod from_driver_config(base_config, config_path, default_device=None)
Return type:

ExecutionServices

log_level: str = 'info'
log_stdout: bool = False
logger_prefix(default_name)
Return type:

str

results_prefix: str
with_device(device)
Return type:

ExecutionServices

with_paths(*, checkpoint_prefix=None, results_prefix=None)
Return type:

ExecutionServices

class dymad.training.ExportArtifact(outputs=<factory>)

Bases: object

outputs: dict[str, str]
class dymad.training.ExportPhaseSpec(name, *, export_kind, config=None)

Bases: BasePhaseSpec

export_kind: str = 'best_model'
class dymad.training.LinearSolvePhaseSpec(name, *, method, params=None, kwargs=None, reset_optimizer=True, config=None)

Bases: BasePhaseSpec

kwargs: dict[str, Any]
method: str = 'full'
params: Any = None
reset_optimizer: bool = True
class dymad.training.LinearSolveReportArtifact(records=<factory>)

Bases: object

records: list[LinearSolveRecord]
class dymad.training.LinearTrainer(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: SingleSplitDriver

Simple interface for single-split single-stage training by linear regression.

class dymad.training.ModelArtifact(model, config, train_md, valid_md, dtype)

Bases: object

config: dict[str, Any]
dtype: dtype
model: Module
train_md: dict[str, Any]
valid_md: dict[str, Any]
class dymad.training.NODETrainer(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: SingleSplitDriver

Simple interface for single-split single-stage training by NODE.

class dymad.training.OneStepTrainer(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: SingleSplitDriver

Simple interface for single-split single-stage training by nonlinear one-step optimization.

class dymad.training.OptimizerPhaseSpec(name, trainer, config, *, reset_optimizer=False)

Bases: BasePhaseSpec

reset_optimizer: bool = False
trainer: str = 'NODE'
class dymad.training.OptimizerStateArtifact(optimizer, schedulers=<factory>, criteria=<factory>, criteria_weights=<factory>, criteria_names=<factory>, owner_phase='', _weak_C=None, _weak_D=None, _weak_N=None, _weak_dN=None, _linear_updater=None, _one_step_dt=None, _one_step_kwargs=<factory>)

Bases: object

criteria: list[Module]
criteria_names: list[str]
criteria_weights: list[float]
optimizer: Optimizer
owner_phase: str = ''
schedulers: list[Any]
class dymad.training.PhaseContext(train_set=None, valid_set=None, train_loader=None, valid_loader=None, train_md=None, valid_md=None)

Bases: object

Live phase context for one run.

train_loader: Optional[DataLoader[TypeAliasType]] = None
train_md: dict[str, Any] | None = None
train_set: list[TypeAliasType] | None = None
valid_loader: Optional[DataLoader[TypeAliasType]] = None
valid_md: dict[str, Any] | None = None
valid_set: list[TypeAliasType] | None = None
class dymad.training.PhasePipeline(config, model_class, device, dtype, execution_services=None)

Bases: object

Runs typed training phases in sequence.

build_phase(spec)
run(*, initial_context, initial_state, artifacts=None, run_name, checkpoint_callback=None)
Return type:

list[PhaseResult]

class dymad.training.PhaseRecord(name, kind, started_epoch, completed_epoch, metrics=<factory>, artifact_keys=<factory>)

Bases: object

artifact_keys: list[str]
completed_epoch: int
kind: str
metrics: dict[str, float]
name: str
started_epoch: int
class dymad.training.PhaseResult(name, kind, trainer_state, phase_context, artifacts, metrics=<factory>, record=None)

Bases: object

Typed phase outcome.

artifacts: ArtifactRegistry
get_metric(metric_name)
Return type:

float

kind: str
metrics: dict[str, float]
name: str
phase_context: PhaseContext
record: PhaseRecord | None = None
trainer_state: TrainerState
exception dymad.training.PhaseSpecValidationError

Bases: ValueError

Raised when a phase spec or normalized legacy config is invalid.

class dymad.training.SingleSplitDriver(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: DriverBase

Single fixed split; can still scan param_grid.

In the extreme case where

  • the schedule has only one phase, and

  • param_grid is empty or a singleton,

this reduces to “one trainer of one phase.”

iter_folds()

Yield (fold_id, fold_config) pairs.

fold_config is a full config dict (deep copy of base_config with fold-specific overrides, e.g. split_seed).

class dymad.training.StackedTrainer(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: SingleSplitDriver

Simple interface for single-split phased training.

class dymad.training.TrainerRun(config, model_class, device, dtype, run_name, checkpoint_prefix, results_prefix, execution_services=None)

Bases: object

Owns one concrete training run identity, artifacts, and typed phase pipeline.

load_run_checkpoint(path=None)
Return type:

tuple[TrainerState, ArtifactRegistry]

run(*, initial_context, initial_state=None, artifacts=None)
Return type:

list[PhaseResult]

property run_checkpoint_path: str
save_run_checkpoint(trainer_state, artifacts)
Return type:

str

class dymad.training.TrainerState(config, execution_services=None, device=None, epoch=0, best_loss=<factory>, converged=False, convergence_epoch=None, phase_cursor=0, phase_records=<factory>)

Bases: object

Checkpointable training state.

best_loss: dict[str, float]
checkpoint_payload()
Return type:

dict[str, Any]

config: dict[str, Any] | None
converged: bool = False
convergence_epoch: int | None = None
device: device | None = None
epoch: int = 0
execution_services: ExecutionServices | None = None
classmethod from_checkpoint_payload(payload, *, execution_services=None)
Return type:

TrainerState

phase_cursor: int = 0
phase_records: list[PhaseRecord]
exception dymad.training.TrainingCheckpointError

Bases: ValueError

Raised when a typed training checkpoint cannot be loaded.

class dymad.training.TrainingHistoryArtifact(hist=<factory>, crit=<factory>, epoch_times=<factory>, best_loss=<factory>, best_model_state_dict=None, convergence_epoch=None)

Bases: object

best_loss: dict[str, float]
best_model_state_dict: dict[str, Any] | None = None
convergence_epoch: int | None = None
crit: list[Any]
epoch_times: list[float]
hist: list[Any]
class dymad.training.WeakFormTrainer(config_path, model_class, config_mod=None, device=None, max_workers=1)

Bases: SingleSplitDriver

Simple interface for single-split single-stage training by Weak Form.

dymad.training.aggregate_cv_results(results)

The results may come from concurrent runs. Each result has the format

{'combo_idx', 'fold_idx', 'combo', 'metric_value', 'model_prefix'}

This function aggregates them into CVResult objects by collecting fold results for each combo_idx.
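The grouping step can be sketched as follows. This is a standalone illustration, not the library code: CVResultSketch mirrors the documented CVResult fields, and the use of the population standard deviation, the mapping of model_prefix to checkpoint_paths, and the ordering by combo_idx and fold_idx are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Any


@dataclass
class CVResultSketch:
    """Stand-in with the same fields as the documented CVResult."""
    params: dict[str, Any]
    fold_metrics: list[float]
    mean_metric: float = 0.0
    std_metric: float = 0.0
    checkpoint_paths: list[str] = field(default_factory=list)


def aggregate_sketch(results: list[dict[str, Any]]) -> list[CVResultSketch]:
    # Collect per-fold records under their parameter combination.
    grouped: dict[int, list[dict[str, Any]]] = defaultdict(list)
    for record in results:
        grouped[record["combo_idx"]].append(record)

    aggregated = []
    for combo_idx in sorted(grouped):
        # Concurrent runs may finish out of order, so sort by fold index.
        rows = sorted(grouped[combo_idx], key=lambda r: r["fold_idx"])
        metrics = [float(r["metric_value"]) for r in rows]
        aggregated.append(
            CVResultSketch(
                params=dict(rows[0]["combo"]),
                fold_metrics=metrics,
                mean_metric=mean(metrics),
                std_metric=pstdev(metrics),  # assumed: population std
                checkpoint_paths=[r["model_prefix"] for r in rows],
            )
        )
    return aggregated


raw = [
    {"combo_idx": 1, "fold_idx": 0, "combo": {"lr": 0.1}, "metric_value": 2.0, "model_prefix": "c1f0"},
    {"combo_idx": 0, "fold_idx": 1, "combo": {"lr": 0.01}, "metric_value": 3.0, "model_prefix": "c0f1"},
    {"combo_idx": 0, "fold_idx": 0, "combo": {"lr": 0.01}, "metric_value": 1.0, "model_prefix": "c0f0"},
    {"combo_idx": 1, "fold_idx": 1, "combo": {"lr": 0.1}, "metric_value": 2.0, "model_prefix": "c1f1"},
]
cv = aggregate_sketch(raw)
```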

dymad.training.iter_param_grid(param_grid)

param_grid is a dict mapping dotted keys to iterables of candidate values. Yields dicts, each mapping every dotted key to a single value.
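A minimal sketch of such an expansion, assuming the combinations are the Cartesian product of the per-key iterables (the documentation does not state the enumeration order or the behaviour for an empty grid, so both are assumptions here, as is the name iter_param_grid_sketch):

```python
from itertools import product
from typing import Any, Iterable, Iterator


def iter_param_grid_sketch(
    param_grid: dict[str, Iterable[Any]],
) -> Iterator[dict[str, Any]]:
    """Yield one {dotted_key: value} dict per grid point (hypothetical)."""
    keys = list(param_grid)
    # product() with no arguments yields a single empty tuple, so an
    # empty grid produces exactly one empty combination in this sketch.
    for values in product(*(param_grid[k] for k in keys)):
        yield dict(zip(keys, values))


grid = {"model.hidden": [8, 16], "phases.0.n_epochs": [10]}
combos = list(iter_param_grid_sketch(grid))
```

Each yielded dict keeps the dotted keys intact, so applying a combination to a nested config is a separate step.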

dymad.training.normalize_phase_specs(config)
Return type:

list[OptimizerPhaseSpec | LinearSolvePhaseSpec | DataPhaseSpec | AnalysisPhaseSpec | ExportPhaseSpec]

dymad.training.select_best_cv_result(cv_results, *, goal='minimize', tie_breakers=('std_metric', 'combo_index'), combo_indices=None)

Return the selected best CV-result index using explicit selection rules.

Return type:

int

Supported goals:
  • minimize: lower mean metric is better

  • maximize: higher mean metric is better

Supported tie breakers (applied in order):
  • std_metric: lower std is better

  • param_l1: lower numeric L1 score of tuned params is better

  • combo_index: lower candidate index is better. Uses combo_indices when provided; otherwise uses the cv_results list position.
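The rules listed above can be expressed as a lexicographic sort key. The following standalone sketch illustrates that idea and is not the dymad implementation: Candidate, param_l1_score, and select_best_sketch are hypothetical names, ties are detected by exact float equality, and the L1 score simply sums the absolute values of numeric parameters.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class Candidate:
    """Stand-in exposing the CVResult fields used for selection."""
    params: dict[str, Any]
    mean_metric: float
    std_metric: float


def param_l1_score(params: dict[str, Any]) -> float:
    # Assumed definition: sum of |value| over numeric (non-bool) params.
    return sum(
        abs(v) for v in params.values()
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    )


def select_best_sketch(
    cv_results: Sequence[Candidate],
    *,
    goal: str = "minimize",
    tie_breakers: Sequence[str] = ("std_metric", "combo_index"),
    combo_indices: Optional[Sequence[int]] = None,
) -> int:
    if goal not in ("minimize", "maximize"):
        raise ValueError(f"unsupported goal: {goal}")
    sign = 1.0 if goal == "minimize" else -1.0

    def sort_key(pos: int) -> tuple:
        result = cv_results[pos]
        key = [sign * result.mean_metric]  # primary criterion
        for rule in tie_breakers:          # applied in the given order
            if rule == "std_metric":
                key.append(result.std_metric)
            elif rule == "param_l1":
                key.append(param_l1_score(result.params))
            elif rule == "combo_index":
                key.append(combo_indices[pos] if combo_indices is not None else pos)
            else:
                raise ValueError(f"unsupported tie breaker: {rule}")
        return tuple(key)

    # Returns the list position of the winning candidate.
    return min(range(len(cv_results)), key=sort_key)


cands = [
    Candidate({"lr": 0.1}, mean_metric=0.5, std_metric=0.2),
    Candidate({"lr": 0.01}, mean_metric=0.5, std_metric=0.1),
    Candidate({"lr": 1.0}, mean_metric=0.7, std_metric=0.0),
]
best_min = select_best_sketch(cands)
best_max = select_best_sketch(cands, goal="maximize")
```

In this example the first two candidates tie on the mean, so the lower std_metric decides under minimize, while maximize picks the third candidate outright.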

dymad.training.set_by_dotted_key(d, dotted_key, value)

Set nested dict/list paths for dotted keys such as 'a.b.c' or 'phases.0.n_epochs'. Creates intermediate containers as needed.
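One way such a setter could work is sketched below. This is an illustration under stated assumptions rather than the library code: the name set_by_dotted_key_sketch is hypothetical, a path segment made only of digits is treated as a list index when the current container is a list, a missing container is created as a list if the next segment is numeric and as a dict otherwise, and lists are padded with None up to the requested index.

```python
from typing import Any


def _ensure_index(lst: list, idx: int) -> None:
    # Pad with None so that lst[idx] is addressable (assumed policy).
    while len(lst) <= idx:
        lst.append(None)


def set_by_dotted_key_sketch(d: Any, dotted_key: str, value: Any) -> None:
    parts = dotted_key.split(".")
    node = d
    for i, part in enumerate(parts[:-1]):
        # Decide which container to create if this step is missing.
        make = list if parts[i + 1].isdigit() else dict
        if isinstance(node, list):
            idx = int(part)
            _ensure_index(node, idx)
            if node[idx] is None:
                node[idx] = make()
            node = node[idx]
        else:
            if node.get(part) is None:
                node[part] = make()
            node = node[part]
    last = parts[-1]
    if isinstance(node, list):
        idx = int(last)
        _ensure_index(node, idx)
        node[idx] = value
    else:
        node[last] = value


cfg = {"phases": [{"n_epochs": 5}]}
set_by_dotted_key_sketch(cfg, "phases.0.n_epochs", 20)  # overwrite existing
set_by_dotted_key_sketch(cfg, "a.b.c", 1)               # create nested dicts
set_by_dotted_key_sketch(cfg, "phases.1.n_epochs", 3)   # extend the list
```

Combined with a parameter-grid iterator, a setter like this lets each dotted key in a combination be written into a deep copy of the base config.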

Modules

batch_adapter

Adapters between trainer batches and native typed runtimes.

driver

execution_services

helper

ls_update

phase_pipeline

phase_runtime

phases

trainer

trainer_run