Algorithm Integration Guide

Purpose

Use this guide when adding a reusable algorithm, model family, solver, analysis workflow, training phase, or agent-facing tool. It complements:

architecture.md, which defines package ownership and boundary layers
feature-placement.md, which gives the short placement table
scratch-scripting.md, which defines temporary external scratch work
example-script-pattern.md, which defines runnable script patterns

The goal is to help AI agents decide where an algorithm belongs, what interface it must expose, and what supporting schema, tests, examples, and benchmarks are expected.

Placement Decision Ladder

Start with the narrowest implementation package that can own the behavior. Add agent/* exposure only when the algorithm becomes a stable user-facing workflow or capability.

If the new functionality…	And it does not…	Put the primary implementation in…	Then expose it through…
Is a low-level solver, decomposition, derivative, quadrature, denoising primitive, or linear algebra utility	Own datasets, training phases, model construction, handles, or MCP schemas	`src/dymad/numerics/*`	Direct package exports, covered by `tests/test_assert_*` or `tests/test_contract_*` tests
Is a reusable trajectory, graph, transform, runtime, batching, or model-context data structure	Train models, register capabilities, or parse user requests	`src/dymad/core/*`	`src/dymad/core/__init__.py` only when it is a public core surface
Is a model family, model variant, component recipe, rollout behavior, or prediction path	Persist handles or validate user-mode requests	`src/dymad/models/*`	`src/dymad/agent/registry/models.py` only if user-mode training should select it
Is an optimizer, training phase, training artifact, trainer state change, or phase pipeline change	Add a stable user-facing capability by itself	`src/dymad/training/*`	`src/dymad/agent/registry/training_schema.py` and `compiler/*` only for user-mode overrides
Is checkpoint loading, dataset registration, trajectory I/O, or legacy public runtime behavior	Merely orchestrate an existing workflow	`src/dymad/io/*`	`agent/exec/*`, `agent/store/*`, and `agent/facade/*` when handles must reflect it
Is Koopman/spectral analysis library behavior	Just call a one-off script	`src/dymad/sako/*`	`agent/registry/analyses.py`, `compiler/analysis.py`, and `exec/workflow.py` for user-mode analysis
Coordinates existing package behavior into a workflow	Implement new math or model internals	`src/dymad/agent/exec/*`	MCP tools after registry/compiler/store/facade contracts exist
Introduces a stable capability key, model key, profile key, workflow key, schema, examples, or natural-language translation guidance	Execute runtime logic	`src/dymad/agent/registry/*`	`compiler/*` and MCP user tools
Validates or normalizes user-mode request fields	Execute workflow logic or persist artifacts	`src/dymad/agent/compiler/*`	Store compiled requests through facade/store if calls must be durable
Persists a new request, artifact, run, snapshot, or handle	Implement the algorithm itself	`src/dymad/agent/store/*` and `src/dymad/agent/facade/*`	`agent/exec/*` and MCP tools
Adds an MCP tool for an existing capability or workflow	Define business logic in the server	`src/dymad/agent/mcp/user_tools.py` or `demo_tools.py`	`src/dymad/agent/mcp/server.py` registration only
Is a temporary one-off exploration or benchmark	Need to become maintained repo behavior	External scratch folder	Follow scratch-scripting.md
Is a committed runnable demo, benchmark, or experiment	Need to be imported as library behavior	`scripts/` or `examples/`	Follow example-script-pattern.md

If two rows seem to apply, split the change: implementation first, agent-facing exposure second.

Interface Contracts

Numerical Algorithms

Numerical code should be small, deterministic, and independent of agent/MCP concerns.

Expected interface:

Pure functions or focused classes in src/dymad/numerics/*
Tensor/array inputs with explicit shape expectations in the docstring or type name
No object-store handles, MCP schemas, training-run records, or user-mode keys
Stable return values that can be checked exactly or within a documented tolerance

Return structures:

Prefer tensors, arrays, scalars, or small dataclasses when multiple named outputs are required.
Do not return loosely shaped dicts unless an existing numerics module already uses that pattern for the same family.
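
As a minimal sketch of this contract, the example below shows a pure function with documented shapes and a small frozen dataclass for its named outputs. The names trapezoid_integrate and TrapezoidResult are hypothetical and are not taken from src/dymad/numerics/*; NumPy is used only for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass(frozen=True)
class TrapezoidResult:
    """Hypothetical result type: named outputs instead of a loose dict."""

    integral: np.ndarray  # shape (n_features,)
    n_intervals: int


def trapezoid_integrate(t: np.ndarray, x: np.ndarray) -> TrapezoidResult:
    """Integrate x over t with the composite trapezoid rule.

    t: shape (n_steps,), strictly increasing sample times.
    x: shape (n_steps, n_features), samples taken at those times.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    if t.ndim != 1 or x.ndim != 2 or x.shape[0] != t.shape[0]:
        raise ValueError("expected t of shape (n_steps,) and x of shape (n_steps, n_features)")
    if t.shape[0] < 2 or np.any(np.diff(t) <= 0):
        raise ValueError("t must contain at least two strictly increasing times")
    dt = np.diff(t)[:, None]  # shape (n_steps - 1, 1), broadcast over features
    integral = np.sum(0.5 * dt * (x[1:] + x[:-1]), axis=0)
    return TrapezoidResult(integral=integral, n_intervals=t.shape[0] - 1)
```

Because the function is deterministic and free of handles or schemas, a test can compare the returned integral against a closed-form value within a stated tolerance.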

Core Data Structures

Core structures represent runtime data rather than algorithms tied to a single workflow.

Expected interface:

Use typed dataclasses that follow RegularSeries, GraphSeries, RegularTrainerBatch, and GraphTrainerBatch conventions.
Preserve device/dtype movement with .to(...) when the object owns tensors.
Keep graph and regular semantics explicit instead of encoding them in ad hoc dicts.
Add public exports in src/dymad/core/__init__.py only when the type is part of the package surface.

Primary consumed/returned data:

Regular trajectories: RegularSeries, RegularSeriesBatch, RegularTrainerBatch
Graph trajectories: GraphSeries, GraphSeriesBatch, GraphTrainerBatch
Runtime views: UniformRegularRuntime, RaggedRegularRuntime, UniformGraphRuntime

Models and Prediction

New model behavior belongs in src/dymad/models/* before any agent-facing registry work.

Expected interface:

Add or reuse a ModelSpec from src/dymad/models/model_spec.py.
Expose predefined variants through PredefinedModel in src/dymad/models/collections.py when the model should be selectable by configuration or registry keys.
Make construction work through build_model(model_spec, model_config, data_meta, dtype, device).
Keep rollout behavior in src/dymad/models/rollout_engine.py or prediction.py, not in MCP adapters.

Configuration schema:

Runtime construction still consumes the existing model config dict shape.
User-mode selection goes through model_key, reference_profile, and compiler-validated overrides.
If users should select the model family, add a stable key in agent/registry/models.py.
If users need different defaults, add or update a profile in agent/registry/workflows.py or the profile registry that owns the relevant config.

Training Algorithms and Phases

Training behavior belongs in src/dymad/training/*.

Expected interface:

Phase specs are dataclasses such as OptimizerPhaseSpec, LinearSolvePhaseSpec, DataPhaseSpec, AnalysisPhaseSpec, and ExportPhaseSpec.
Executable phases subclass BasePhase and implement execute(...) -> PhaseResult.
Shared state flows through TrainerState, PhaseContext, and ArtifactRegistry.
New phase outputs should be explicit artifacts, preferably dataclasses in training/phase_runtime.py, rather than unstructured side effects.
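
The shape of a phase that returns explicit artifacts could look roughly like the sketch below. Every class here is a local stand-in written for illustration: the real BasePhase, PhaseResult, and TrainerState in src/dymad/training/* may have different fields, constructor arguments, and execute(...) parameters, and SummarizeLossPhase and LossSummaryArtifact are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any


# Stand-in types for illustration only; they do not mirror the real signatures.
@dataclass
class TrainerState:
    epoch: int = 0
    history: list[float] = field(default_factory=list)


@dataclass(frozen=True)
class LossSummaryArtifact:
    """Hypothetical explicit artifact produced by a phase."""

    best_loss: float
    n_epochs: int


@dataclass
class PhaseResult:
    artifacts: dict[str, Any] = field(default_factory=dict)
    metrics: dict[str, float] = field(default_factory=dict)


class BasePhase:
    def execute(self, state: TrainerState) -> PhaseResult:
        raise NotImplementedError


class SummarizeLossPhase(BasePhase):
    """Hypothetical phase: reads shared state, returns its output as a typed artifact."""

    def execute(self, state: TrainerState) -> PhaseResult:
        if not state.history:
            raise ValueError("no loss history recorded")
        artifact = LossSummaryArtifact(
            best_loss=min(state.history),
            n_epochs=len(state.history),
        )
        # The output travels in the result, not as a side effect on the state.
        return PhaseResult(
            artifacts={"loss_summary": artifact},
            metrics={"best_loss": artifact.best_loss},
        )
```

The point of the sketch is the return path: downstream phases read a named, typed artifact from the result instead of discovering mutated attributes.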

Configuration schema:

Add runtime phase semantics in training/phases.py.
Add user-mode phase-entry schema in agent/registry/training_schema.py only when users can request it through overrides.phases.
Add compiler validation/normalization in agent/compiler/training.py when new user request fields or override paths are accepted.
Keep runtime-owned fields such as dataset paths and model implementation names protected from user overrides.
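
One way to picture the protected-path rule is a check that walks a nested overrides dict and reports any dotted path that is runtime-owned. This is a sketch under assumptions: PROTECTED_PATHS, iter_override_paths, and find_protected_overrides are hypothetical names, and the two example paths are invented for illustration, not copied from agent/compiler/training.py.

```python
from typing import Any, Iterator

# Hypothetical protected paths; the real set is owned by the compiler and registry.
PROTECTED_PATHS = frozenset({"data.path", "model.name"})


def iter_override_paths(overrides: dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield every dotted path in a nested overrides dict, including intermediate nodes."""
    for key, value in overrides.items():
        path = f"{prefix}.{key}" if prefix else key
        yield path
        if isinstance(value, dict):
            yield from iter_override_paths(value, path)


def find_protected_overrides(overrides: dict[str, Any]) -> list[str]:
    """Return the sorted protected paths that the overrides try to set."""
    return sorted(p for p in iter_override_paths(overrides) if p in PROTECTED_PATHS)
```

A compiler step would reject the request when the returned list is non-empty rather than loosening the protected set. This sketch does not catch an override that replaces a whole parent node (for example setting data to a non-dict value), which a real validator would also need to handle.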

Analysis Workflows

Reusable analysis code belongs in the implementation package first, commonly src/dymad/sako/* for spectral analysis or a focused module near the runtime it analyzes.

Expected interface:

Library analysis should expose functions/classes that consume package data structures, checkpoint paths, or model classes directly.
Agent-facing analysis workflows compile through AnalysisRequest and return AnalysisRunResult.
Persisted analysis artifacts should be written under the provided artifact root and returned as artifact paths plus a compact numeric/string summary.
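
The artifact-root convention in the last item can be sketched as follows. AnalysisRunResultSketch is a stand-in defined here for illustration, and the real AnalysisRunResult may carry different fields; run_eigenvalue_summary and the output file name are likewise hypothetical.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


# Stand-in for illustration; not the real AnalysisRunResult.
@dataclass(frozen=True)
class AnalysisRunResultSketch:
    artifact_paths: list[str]
    summary: dict[str, object] = field(default_factory=dict)


def run_eigenvalue_summary(
    eigenvalues: list[complex], artifact_root: Path
) -> AnalysisRunResultSketch:
    """Write eigenvalue moduli under artifact_root and return paths plus a compact summary."""
    artifact_root.mkdir(parents=True, exist_ok=True)
    moduli = [abs(z) for z in eigenvalues]
    out_path = artifact_root / "eigenvalue_moduli.json"
    out_path.write_text(json.dumps({"moduli": moduli}))
    return AnalysisRunResultSketch(
        artifact_paths=[str(out_path)],
        summary={"n_modes": len(moduli), "spectral_radius": max(moduli, default=0.0)},
    )
```

Bulk data stays on disk under the caller-provided root, while the returned summary holds only a few numbers or strings that are cheap to serialize.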

Configuration schema:

Add a stable workflow key and parameter schema in agent/registry/analyses.py.
Add request requirements in agent/compiler/analysis.py.
Add execution in agent/exec/workflow.py or a focused helper under agent/exec/*.
Add store/facade records only if the analysis has durable state beyond the compiled request and returned artifacts.

MCP Tools

MCP tools are boundary adapters. They should not own algorithm behavior.

Expected interface:

User-mode tools call registry/compiler/executor/facade surfaces.
Developer/demo tools may expose raw handles or compatibility behavior, but should still avoid embedding new algorithms.
server.py handles only tool registration and mode splits.

Configuration schema:

User-facing capability keys and schemas live in agent/registry/*.
Request validation lives in agent/compiler/*.
Persisted request and artifact handles live in agent/store/* and agent/facade/*.

Agent-Facing Schema Checklist

If the algorithm is exposed to user-mode agents, document or implement all of these:

Stable key: lower-case snake case, no raw import paths as user-facing identifiers.
Capability summary: concise description of what the algorithm does and the dataset kind it supports.
Dataset compatibility: regular, graph, or both.
Parameter schema: explicit types, defaults, allowed enum values, and required fields.
Override schema: allowed top-level keys and protected runtime-owned paths.
Natural-language translation guidance: only in agent/registry/*, not in MCP adapters.
Constraint notes: shape, dimension, time-domain, graph-mode, and runtime limitations that a client must know before compiling a request.
Examples: at least one minimal request/override example for each new user-facing capability.
Result contract: handles, artifact paths, metrics, summaries, and failure modes.
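
Several checklist items lend themselves to a mechanical check. The sketch below covers only the stable key, dataset compatibility, and parameter schema items; CapabilitySketch and its fields are hypothetical and do not describe the actual record types in agent/registry/*.

```python
import re
from dataclasses import dataclass, field

# Lower-case snake case: starts with a letter, single underscores between segments.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


@dataclass(frozen=True)
class CapabilitySketch:
    """Hypothetical registry record covering part of the checklist above."""

    key: str
    summary: str
    dataset_kinds: tuple[str, ...]
    parameters: dict[str, dict] = field(default_factory=dict)

    def problems(self) -> list[str]:
        """Return human-readable checklist violations; empty means none were found."""
        issues = []
        if not SNAKE_CASE.match(self.key):
            issues.append(f"key {self.key!r} is not lower-case snake case")
        unknown = set(self.dataset_kinds) - {"regular", "graph"}
        if not self.dataset_kinds or unknown:
            issues.append("dataset_kinds must be a non-empty subset of {'regular', 'graph'}")
        for name, spec in self.parameters.items():
            has_default_or_required = "default" in spec or spec.get("required", False)
            if "type" not in spec or not has_default_or_required:
                issues.append(f"parameter {name!r} needs a type and a default or required=True")
        return issues
```

For example, a key such as spectral_koopman passes the key check, while a raw import path such as dymad.sako.Spectral is reported.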

Data Structures To Consume And Return

Use the typed structures that match the layer:

Layer	Consume	Return
Numerics	`torch.Tensor`, `numpy.ndarray`, scalars, typed dataclasses	Tensors, arrays, scalars, typed dataclasses
Core transforms/runtime	`RegularSeries`, `GraphSeries`, trainer batches, runtime views	Updated typed series/batches/runtime views
Models	`ModelSpec`, model config dict, data metadata, runtime payloads	`torch.nn.Module`, rollout tensors, prediction outputs
Training phases	`TrainerState`, `PhaseContext`, `ArtifactRegistry`, dataloaders	`PhaseResult` with explicit artifacts and metrics
Agent compiler	request dataclasses and registry capabilities	compiled request dataclasses plus diagnostics
Agent executor	compiled request records and facade handles	result dataclasses such as `StartModelTrainingResult` or `AnalysisRunResult`
MCP tools	JSON-compatible request fields and handles	JSON-compatible dicts built from typed records

Avoid converting typed runtime objects into generic dicts at lower layers just because MCP eventually returns JSON. The JSON conversion belongs at the boundary.
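
A small sketch of that boundary rule: lower layers pass a typed record around, and only the tool adapter converts it to a JSON-compatible dict. RunRecordSketch and to_tool_payload are hypothetical names, and the fields are invented for illustration.

```python
import json
from dataclasses import asdict, dataclass


# Stand-in record; the real result dataclasses may carry different fields.
@dataclass(frozen=True)
class RunRecordSketch:
    run_handle: str
    status: str
    metrics: dict[str, float]


def to_tool_payload(record: RunRecordSketch) -> dict:
    """Convert a typed record to a JSON-compatible dict at the tool boundary only."""
    payload = asdict(record)
    json.dumps(payload)  # raises TypeError early if a field is not JSON-compatible
    return payload
```

Keeping the conversion in one adapter function means executor and store code can rely on attribute access and type checking until the last step.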

Mandatory Tests

Every algorithm change needs targeted tests in the category that matches the risk:

Change type	Required test category
Deterministic numerical primitive, solver, transform math, or exact reference value	`tests/test_assert_*`
Runtime interface, typed batch, model spec, transform builder, checkpoint boundary, or public API contract	`tests/test_contract_*`
Workflow path that should run without checking detailed numerical accuracy	`tests/test_workflow_*`
Agent-facing registry, compiler, executor, MCP tool, handle, or skill-staging behavior	`tests/test_agent_*`
Slow deterministic CLI, end-to-end regression, or baseline metric coverage	`tests/test_slow_*` plus `_baselines.json` when needed

Minimum expectations:

New public model families need model-spec or runtime contract coverage.
New user-mode capability keys need registry tests and compiler tests.
New analysis workflows need compiler/executor tests and at least one artifact/summary assertion.
New persisted handle types need store/facade persistence tests.
New CLI examples that become regression surfaces need slow tests with deterministic seeds and bounded runtime.

New pytest files must use exactly one of the allowed prefixes enforced by tests/conftest.py: test_assert_, test_workflow_, test_slow_, test_contract_, or test_agent_.
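
The prefix rule is easy to check before committing a new test file. The helper below is a hypothetical sketch of such a check and is not the enforcement code in tests/conftest.py, which may work differently.

```python
from pathlib import Path

# The five prefixes listed above.
ALLOWED_PREFIXES = (
    "test_assert_",
    "test_workflow_",
    "test_slow_",
    "test_contract_",
    "test_agent_",
)


def has_allowed_prefix(path: str) -> bool:
    """Return True when the Python file name starts with one of the allowed prefixes."""
    name = Path(path).name
    # str.startswith accepts a tuple and matches if any prefix applies.
    return name.endswith(".py") and name.startswith(ALLOWED_PREFIXES)
```

Since no allowed prefix is itself a prefix of another, matching any one of them is equivalent to matching exactly one.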

Examples

Add examples only when they teach a maintained workflow that tests and docs do not already make clear. For one-off exploration, use an external scratch folder first and follow scratch-scripting.md.

Expected placement:

Reusable package API examples: notebooks or docs under examples/* when they are part of the public documentation.
Runnable training, analysis, or benchmark demos: scripts/*.
User-facing runnable scripts: CLI style with parse_args(), main(), and the entrypoint guard if __name__ == "__main__": raise SystemExit(main()).
Developer-facing scripts that intentionally expose execution details may keep the existing if-block style.
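
A minimal skeleton of the user-facing CLI style might look like the following. The argument names and the printed message are placeholders chosen for illustration; example-script-pattern.md remains the authoritative description of the pattern.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse CLI arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Hypothetical demo entrypoint.")
    parser.add_argument("--case", default="small", help="Name of the demo case to run.")
    parser.add_argument("--seed", type=int, default=0, help="Random seed for reproducibility.")
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Run the demo and return a process exit code."""
    args = parse_args(argv)
    print(f"running case={args.case} seed={args.seed}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Accepting an optional argv in both functions lets a test call main(["--case", "tiny"]) directly without spawning a subprocess.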

Before adding or reshaping a script, read example-script-pattern.md.

Benchmarks

Benchmarks are expected when the algorithm changes hot-path runtime, memory behavior, solver complexity, batch conversion, prediction rollout, or graph/regular data movement.

Expected placement and shape:

Put focused benchmark scripts under scripts/benchmarks/*.
Use CLI arguments for case, device, iteration count, and warmup count.
Make CPU execution the default unless the benchmark is explicitly GPU-only.
Report stable summary statistics such as mean, median, min, and max.
Keep benchmark fixtures small enough for local iteration; use slow CI only for deterministic regression checks, not exploratory performance measurement.

Do not add benchmark results as correctness tests unless the threshold is deliberately stable across CI machines.
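
The expected benchmark shape can be sketched with the standard library alone. The workload below is a placeholder sum of squares, the --device flag is accepted but unused by that placeholder, and all names are hypothetical; a real script under scripts/benchmarks/* would time the actual hot path and add the entrypoint guard described in example-script-pattern.md.

```python
import argparse
import statistics
import time
from typing import Callable, Optional, Sequence


def time_callable(fn: Callable[[], object], iters: int, warmup: int) -> dict[str, float]:
    """Run fn warmup + iters times and summarize the timed iterations in seconds."""
    if iters < 1:
        raise ValueError("iters must be at least 1")
    for _ in range(warmup):
        fn()  # untimed warmup runs
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
    }


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Hypothetical focused benchmark.")
    parser.add_argument("--case", default="small", choices=["small", "large"])
    parser.add_argument("--device", default="cpu")  # CPU unless stated otherwise
    parser.add_argument("--iters", type=int, default=20)
    parser.add_argument("--warmup", type=int, default=3)
    return parser.parse_args(argv)


def main(argv: Optional[Sequence[str]] = None) -> int:
    args = parse_args(argv)
    size = {"small": 1_000, "large": 100_000}[args.case]
    # Placeholder workload; args.device is only reported, not used.
    stats = time_callable(lambda: sum(i * i for i in range(size)), args.iters, args.warmup)
    report = " ".join(f"{name}={value:.6f}s" for name, value in stats.items())
    print(f"case={args.case} device={args.device} {report}")
    return 0
```

Reporting the four statistics together makes outliers visible: a max far above the median usually points to warmup or scheduling noise, not to the algorithm.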

Naming Conventions

User-facing capability keys: lower-case snake case, stable, and independent of implementation import paths.
Model family keys: short lower-case names that match existing families such as kbf, km, ldm, lti, and sdm.
Model variant names: established uppercase variant style in models/collections.py, with lower-case variant keys derived by the registry.
Analysis workflow keys: lower-case snake case, such as spectral_koopman.
Handles: keep the existing prefix style (ds_*, chk_*, run_*, trainreq_*, analysisreq_*, eval_*, pred_*, specsnap_*).
Phase names: descriptive snake case for generated/default phases; preserve user labels when they improve readability.
Test files: one of the five required prefixes.
Scripts: use descriptive directory names and *_cli.py for user-facing CLI entrypoints.

What Not To Touch

Avoid these changes unless the task explicitly requires them:

Do not put algorithm, validation, or workflow logic in src/dymad/agent/mcp/server.py.
Do not add user-facing string parsing to MCP adapters; use compiler modules.
Do not add registry metadata inside executor methods.
Do not persist new handle state directly in executor methods; use store and facade modules.
Do not route numerical/model/runtime behavior through agent/* just because an agent will call it eventually.
Do not expose raw import strings as the primary user-mode interface.
Do not weaken protected runtime-owned config paths to make an override pass.
Do not change checkpoint materialization through dymad.io.checkpoint without updating architecture docs and boundary tests.
Do not rename existing capability keys, handle prefixes, or public model variants without a compatibility plan.
Do not edit generated docs under docs/_build.

Integration Checklist

Before opening a change:

Classify the algorithm with the placement decision ladder.
Implement reusable behavior in the implementation package first.
Add typed interfaces and return structures at the lowest layer that owns the behavior.
Add registry keys and schemas only if agents/users need a stable selection surface.
Add compiler validation only for new user-mode request fields or overrides.
Add executor orchestration only for multi-step workflows or persisted runs.
Add store/facade records only for durable objects that must be referenced across calls.
Add MCP tool exposure last.
Add mandatory targeted tests using the required test-file prefix.
Keep one-off explorations in scratch space; add examples and benchmarks only when the algorithm introduces a maintained user workflow or hot path.
Update architecture.md and feature-placement.md if ownership, boundaries, handles, workflows, or placement guidance changed.
For Python edits, run make check before reporting completion.