Calibration Guide#

Audience: hydrogeologists, modelers, and HydroModPy contributors who need to calibrate the parameters of a catchment-scale groundwater model. Suggested reading order: §1 (principles), §2 (workflow), §3 (TOML), §7 (reading the results). Sections §4 to §9 are reference material to consult once the loop is running.

This guide is the single source of truth for the calibration sub-system. Code: hydromodpy/calibration/.

For complementary reading, see Calibration Architecture (calibration architecture root), Design Patterns (item 7 covers the calibration adapters), The three workspace databases, and Calibration Workflow.

Overview#

A HydroModPy calibration is an ask/tell loop that iteratively adjusts a few model parameters (hydraulic conductivity K, specific yield Sy, drain conductance) so that the simulated output matches a reference series (discharge at a station, head at a piezometer). The loop is driven by an optimizer that proposes values and reacts to a scalar objective (NSE, KGE, RMSE) computed at every trial.

Compared to a plain hmp run, a calibration:

produces a trace of N evaluations (one row per trial in the DuckDB catalog), not a single simulation;
reuses the expensive setup steps (geographic, mesh, data loading) across trials through the prepare-once-evaluate-many primitive;
only writes the best runs to disk as full Zarr / Parquet simulations. Other trials remain as lightweight rows in DuckDB;
can be resumed across sessions through the content-addressed params_hash cache.

Use a calibration to fit a model to observations. Use a plain hmp run when the parameters are already known.

Specific terms used in this guide:

TrialContext: prepared runtime, reused by every trial of a session (§2, §4).
earliest_affected_step: integer that decides which pipeline steps are skipped per trial (§4).
params_hash: SHA-256 fingerprint used by the cross-session cache (§8).
promote_trial: action that turns a lightweight trial row into a complete Zarr / Parquet simulation (§2, §5).

End-to-end workflow#

Table 2 Calibration data flow#
Stage	What flows into it and what it produces
`TOML`	User config with `[workflow].mode = "calibration"`. Read by `hmp run`.
`hmp run`	Dispatches the TOML to `prepare_trials`.
`prepare_trials`	Runs pipeline steps `[0..earliest)` once. Builds the prepared `WorkflowContext` consumed by the ask/tell loop.
Ask/tell loop (`run_trial_light`)	Each trial reads the prepared context, writes one row to DuckDB (`calibration_iterations`). After the loop ends, control passes to `promote_trial`.
DuckDB (`calibration_sessions` + `calibration_iterations`)	Stores trial fingerprints and session metadata. Read by `hmp report`.
`promote_trial` (top-N)	Replays the chosen trials through the full pipeline. Writes Zarr + Parquet + `simulations` rows and back-fills `calibration_iterations.sim_id`.
Catalog (simulations + Zarr + Parquet)	Read by `hmp report` alongside the DuckDB iteration history.
`hmp report`	Reads sessions, iterations, simulation outputs, and renders the calibration HTML report with figures.

Node by node:

TOML: a regular HydroModPy project TOML extended with a [calibration] section and a top-level [workflow].mode = "calibration" marker. The rest of the file ([simulation], [flow], [data], [solver]) is exactly what you would write for a single run.
``hmp run``: the unified CLI entry point. It reads [workflow].mode = "calibration" and dispatches to hydromodpy.calibration.cli_runner.run_calibration_cli. Calibration has no separate top-level subcommand.
``prepare_trials``: runs pipeline steps [0..earliest) exactly once. earliest is computed from the dotted paths declared by [calibration.parameters.*] (see §4). The prepared WorkflowContext (geographic, mesh, loaded forcings) is held in RAM and forked by every trial.
Ask/tell loop: the optimizer proposes a parameter point, the loop forks the prepared context, injects the values, runs steps [earliest..8] in lightweight mode (no disk writes beyond the solver’s own scratch files), extracts the scalar objective from RAM, and tells the optimizer. Repeat up to max_iter.
DuckDB: every trial adds one row to calibration_iterations (sim_id stays NULL). The session metadata lives in one row of calibration_sessions finalized at the end.
``promote_trial`` (top-N): if save_runs != "none", the chosen trials are replayed through the full pipeline (steps 00..11) by hydromodpy.Project(cfg_path).simulate(**values). Each promotion creates a Zarr store, a Parquet directory, and a simulations row, and back-fills the corresponding calibration_iterations.sim_id.
``hmp report <session_id>``: post-processing CLI that reads the session and iterations from DuckDB, renders the six calibration figures, and emits a standalone HTML report at <workspace>/reports/<session_id>/report.html.

The user TOML#

The examples below come from the canonical case shipped with the repository: examples/projects/02_nancon_watershed/run_calibration_k.toml.

Overlay and workflow marker#

base_config = "project.toml"

[workflow]
mode = "calibration"

base_config: relative path to the base project TOML (project.toml here). All sections of the base are inherited and overridden by the current file. This is how a single description of the catchment is shared between simulation, calibration, and sweep overlays.
[workflow].mode = "calibration": the single switch that tells hmp run to dispatch to the ask/tell loop instead of the default single-simulation path. Dispatch logic lives in hydromodpy/cli/workflows.py:DISPATCH.

Standard simulation block#

[simulation]
name = "nancon_calibration_k"
description = "Optuna calibration of K on NSE(discharge), Sy/Ss frozen."

[simulation.time]
start_datetime = "2000-01-01"
end_datetime = "2002-12-31"
step_value = "1 month"
coverage_policy = "warn"

[[simulation.process]]
id = "flow_main"
type = "flow"
solvers = ["modflow_nwt"]

The [simulation] tree is exactly what you write for a single hmp run: a name, a time window, and one or more processes. Changing the solver does not change anything in the ``[calibration]`` section: the calibration code is solver-agnostic (no modflow, modflow6, boussinesq, or solver.backend string appears anywhere in hydromodpy/calibration/). Swap "modflow_nwt" for "modflow6" and the same TOML calibrates the other solver (subject to the metric extractor coverage noted in §6).

Frozen parameters#

[domain.depth_model]
thickness = 30.0

[flow.param.Sy.field]
value = 0.05

[flow.param.Ss.field]
value = 1e-5

[flow.sinks_sources.recharge]
first_clim = "first"

Any leaf not listed under [calibration.parameters.*] is frozen at the value written in the TOML. Here Sy and Ss are frozen at 0.05 and 1e-5; only K is calibrated.

The `[calibration]` section#

[calibration]
method = "grid"
max_iter = 40
batch_size = 1
save_runs = "best_n"
save_best_n = 5
seed = 42
objective = "nse"
variable = "discharge"
use_cache = true

Exhaustive option reference:

Option	Values	Default	Effect
`method`	`grid` / `random_search` / `optuna` / `scipy_de` / `scipy_nelder_mead`	`grid`	Sampler backing the ask/tell loop. `optuna` is installed by default; `cma_es` and the Optuna `cmaes` sampler require the calibration extra.
`max_iter`	integer >= 1	`100`	Maximum number of trial evaluations.
`save_runs`	`none` / `best_n` / `all`	`none`	How many trials to promote to full simulations after the loop.
`save_best_n`	integer >= 0	`10`	Number of top trials promoted when `save_runs = "best_n"`. Ignored when `save_runs != "best_n"`.
`objective`	`nse` / `kge` / `rmse` / `"module.path:fn"`	`nse`	Scalar metric; the string form is the Python escape hatch (§10).
`variable`	`discharge` / `head`	`head`	Observed variable to compare against. `"discharge"` reads from `[data.hydrometry]`; `"head"` from `[data.piezometry]`.
`seed`	integer or `null`	`null`	Random seed for reproducibility. `null` = non-reproducible.
`use_cache`	bool	`true`	Enable the `params_hash` cross-session cache (§8).
`batch_size`	integer >= 1	`1`	Suggestions drawn per `ask`. Reserved for future parallel trials. Leave at `1` today.
`optimizer_kwargs`	dict	`{}`	Extra kwargs forwarded to the sampler (for example `{sampler = "cmaes"}` for Optuna).

Per-parameter declarations#

[calibration.parameters.K]
bounds = [1e-6, 1e-3]
transform = "log"
prior = "log_uniform"
path = "flow.param.K.field.value"
units = "m/s"

Exhaustive option reference for each [calibration.parameters.<name>] block:

Option	Values	Required	Effect
`bounds`	`[low, high]`	yes (except some grid/fixed-space setups)	Physical bounds. For `optuna` / `scipy_de` / `grid` these are the sampling bounds.
`transform`	`identity` / `log` / `logit`	no (defaults to `identity`)	Internal sampling space. Use `log` for strictly-positive quantities spanning several orders of magnitude (K, conductances, storage).
`prior`	`uniform` / `log_uniform` / `normal`	no (defaults to `uniform`)	Prior used by Bayesian samplers (`gp_mapping`, `da_mh_gp`). Non-Bayesian methods ignore this.
`path`	dotted path	yes	Where to inject the value in `HydroModPyConfig` each trial (see §4).
`units`	free string	no	Informational label used in figures and reports.

Recipes for common parameter shapes:

# Homogeneous K
[calibration.parameters.K]
bounds = [1e-6, 1e-3]
transform = "log"
path = "flow.param.K.field.value"

# Zoned K: one block per zone, each with its own `path`
[calibration.parameters.K_granite]
bounds = [1e-6, 1e-3]
transform = "log"
path = "flow.param.K.field.values.zone_granite"

[calibration.parameters.K_schiste]
bounds = [1e-6, 1e-3]
transform = "log"
path = "flow.param.K.field.values.zone_schiste"

# Specific yield (no log: already O(1))
[calibration.parameters.Sy]
bounds = [0.02, 0.30]
transform = "identity"
path = "flow.param.Sy.field.value"

# Drain conductance
[calibration.parameters.drain_cond]
bounds = [1e-4, 1e-1]
transform = "log"
path = "flow.sinks_sources.stream_drain.conductance"

# Aquifer thickness
[calibration.parameters.thickness]
bounds = [10.0, 80.0]
transform = "identity"
path = "domain.depth_model.thickness"

Recipe cheat sheet#

Use case	Block
Quick exploratory sweep (1-2 params)	`method = "grid"`, `max_iter ~= 25`, `save_runs = "none"`
Default production calibration	`method = "grid"`, `max_iter = 100`, `save_runs = "best_n"`, `save_best_n = 5`, `seed = 42`
Multi-dim continuous (3+ params)	`method = "optuna"`, `optimizer_kwargs = {sampler = "cmaes"}`, `max_iter = 300`
Local refinement near a known optimum	`method = "scipy_nelder_mead"`, `max_iter = 80`
Full Bayesian posterior (Phase 4)	`method = "da_mh_gp"`, `max_iter = 2000`, `save_runs = "best_n"`, `save_best_n = 10`

Automatic step invalidation#

Every pipeline step declares which TOML subtrees it reads via a config_sections class variable. When a calibration mutates flow.param.K.field.value, only the steps that consume flow.* (and their downstream siblings) need to re-run. The earlier steps produce the same result on every trial and are executed exactly once inside prepare_trials.

Table 3 Pipeline steps and their role during calibration#
Step	Name	Phase	TOML subtrees consumed
00	Validate	Shared (run once)	`workspace`, `simulation`
01	Resolve	Shared (run once)	`workspace`, `simulation`
02	LoadData	Shared (run once)	`data`
03	BuildGeographic	Shared (run once)	`geographic`, `data.dem`
04	BuildMesh	Shared (run once)	`domain.supports`
05	SetupProcess	Shared (run once)	`domain.depth_model`, `flow.ic`
06	PrepareSolver	Looped per trial	`flow`, `transport`, `solver`
07	RunSolver	Looped per trial	`flow`, `transport`, `solver`
08	Extract	Looped per trial	none (in-memory extraction)
09	Derive	Promoted only	`postprocess`
10	Export	Promoted only	none (write Zarr/Parquet)
11	Display	Promoted only	`display`

Green (shared): steps 00..05. Executed once by prepare_trials. Geographic, mesh, and loaded forcings live in TrialContext.ctx and are shared by reference across every fork.
Orange (looped): steps 06..08. Re-run per trial in run_trial_light. The trial fork deep-copies the config, injects the new parameter values, and re-executes this slice only.
Blue (promoted): steps 09..11. Never run during the calibration loop. Executed only for the trials picked up by promote_trial after the loop converges.

If I calibrate X, what is replayed?#

Calibrated path	`earliest`	Steps shared (run once)	Steps re-run per trial	Rough speedup
`flow.param.K.field.value`	6	00-05	06-08	~3x
`flow.param.Sy.field.value`	6	00-05	06-08	~3x
`flow.param.K.field.values.zone_granite`	6	00-05	06-08	~3x
`flow.sinks_sources.stream_drain.conductance`	6	00-05	06-08	~3x
`domain.depth_model.thickness`	5	00-04	05-08	~2x
`domain.supports.cell_size`	4	00-03	04-08	~1.7x
`geographic.buff_area`	3	00-02	03-08	~1.3x

Matching rule: dotted longest-prefix. Given an override path like flow.param.K.field.value, the selector walks each step’s config_sections and accepts a match when the section is a dotted-prefix of the path. flow matches, flow.param.K matches, flow.param.K.field.value matches, but flow_runtime does not. The lowest index among matching steps wins; everything downstream of it is forced to re-run even if its own sections did not match (step 08 Extract, for instance, has empty config_sections, yet it re-runs because it consumes the state produced by step 07).

Implementation: hydromodpy/workflow/internals/dependencies.py:earliest_affected_step. Per-step annotations: config_sections: ClassVar[tuple[str, ...]] on each of the 12 step_<nn>_*.py modules.

Storage#

Calibration storage flow, by container:

Container	Component	Role during the calibration loop
RAM (loop only)	Aligned vector + scalar metrics	Built per trial by `run_trial_light` and discarded at end of trial. Never reaches disk.
`catalog.duckdb`	`calibration_sessions`	One row per session (start, finalize, status).
`catalog.duckdb`	`calibration_iterations` (`sim_id` NULL)	One row per trial: parameters, objective, scalar metrics, `params_hash`.
`catalog.duckdb`	`simulations`, `parameters`, `metrics`, `tags`	Promoted top-N trials only (`promote_trial`). Updates the matching `calibration_iterations.sim_id`.
Filesystem	`simulations/<basename>.zarr/`	Spatial fields written only for promoted trials.
Filesystem	`simulations/<basename>.parquet/`	Detailed timeseries, budgets, mass balance for promoted trials only.
Filesystem	`<project>/figures/<session_id>/`	PNG figures rendered post-loop via the display registry.
Filesystem	`<project>/reports/<session_id>/report.html`	HTML report rendered by `hmp report <session_id>`.

Rule of thumb: RAM inside the loop, DuckDB for the trace, Zarr / Parquet only for promoted runs.

Artefact	Lives in	Written when
Simulated vector aligned on observations	RAM only	Each `run_trial_light`: discarded at end of trial
Per-station scalar metrics (NSE, KGE, RMSE)	`calibration_iterations.metrics` (JSON column)	After each trial
Session metadata	`calibration_sessions` (1 row / session)	Start + finalize
Parameters + scalar objective	`calibration_iterations` (sim_id NULL by default)	After each trial
`params_hash`	Column in `calibration_iterations`	After each trial
Spatial fields `head(x, y, t)`	`simulations/<basename>.zarr/` or `.zarr.zip`	Only via `promote_trial`
Detailed timeseries (head, Q)	`simulations/<basename>.parquet/timeseries.parquet`	Only via `promote_trial`
Figures (PNG)	`<workspace>/projects/<name>/figures/<session_id>/`	Post-loop via Display registry
HTML report	`<workspace>/reports/<session_id>/report.html`	`hmp report <session_id>`

Disk-space cheat sheet for a representative 100-trial session on a moderately fine mesh:

`save_runs`	DuckDB	Zarr dirs	Parquet dirs	Typical total
`none`	100 iter + 1 session	0	0	~500 KB
`best_n = 5`	100 iter + 5 sim + 5 param + 5 metric + 1 session	5	5	~2.5 GB
`all`	100 iter + 100 sim + …	100	100	~50 GB

Recommendation. Use save_runs = "best_n" as your default. It keeps the full trace queryable in DuckDB while only materializing the trials you might actually inspect later. Use "all" only for short diagnostic calibrations (<= 20 trials) when you want every single run on disk.

Optimization methods#

Method	Type	Typical budget	Deterministic	Strength	When to pick it
`grid`	Enumeration	Product of `n_points`	Yes	Simple, exhaustive	1-2 params, unlimited budget
`optuna` (default TPE)	Light Bayesian	50-200	Seed-dependent	Adaptive, easy	Sensible default
`optuna` (`sampler="cmaes"`)	Evolutionary	100-500	Seed-dependent	Continuous multi-dim	3+ continuous params
`scipy_de`	Evolutionary	100-500	Seed-dependent	Robust	Alternative to Optuna CMA-ES
`scipy_nelder_mead`	Local simplex	50-100	Partially	Fast local	Refine around a known optimum
`gp_mapping` (Phase 4)	GP surrogate + EI	30-100	Seed-dependent	Few expensive evaluations	Slow solvers
`da_mh_gp` (Phase 4)	MCMC + GP surrogate	1000+	No	Full posterior	Uncertainty quantification

optimizer_kwargs hints:

grid: { "n_points": {K = 10, Sy = 5} } to set per-parameter granularity. Defaults to a uniform count across all parameters.
optuna: { "sampler": "tpe" | "cmaes" | "random", "pruner": "median" }. Default sampler is TPE and is available in the base install; cmaes requires the calibration extra and is strongly recommended when you have 3+ continuous parameters.
scipy_de: { "popsize": 15, "mutation": [0.5, 1.0], "recombination": 0.7 }, the standard scipy.optimize.differential_evolution knobs.
scipy_nelder_mead: { "xatol": 1e-4, "fatol": 1e-4, "adaptive": true }. adaptive = true is friendlier in higher dimensions.
gp_mapping: { "n_initial": 10, "acq": "ei" }: expected improvement over an RBF Gaussian Process surrogate.
da_mh_gp: { "burn_in": 200, "thin": 5, "proposal_scale": 0.3 }, Metropolis-Hastings tuned by the surrogate.

Reading the results#

Python API and DuckDB#

import hydromodpy as hmp

catalog = hmp.open("~/workspace")
sessions = catalog.calibration_sessions()
iters = catalog.calibration_iterations(session_id=sessions.iloc[0]["session_id"])
best = catalog.best(project="calibration", metric="nse")
best.plot("watertable_map", save="~/figures/")
print(iters.head())

catalog.calibration_sessions() returns a DataFrame with one row per session (method, objective, best_objective, duration_s).
catalog.calibration_iterations(session_id=...) returns the full per-trial trace (parameters, objective_value, status, sim_id, params_hash).
catalog.best(project, metric) returns a Run object for the best promoted trial across sessions matching project.

Figures via the display registry#

Six named figures ship with HydroModPy. Each implements the Figure Protocol and is registered in hydromodpy/display/figures/__init__.py.

calibration_convergence: best-so-far objective vs iteration.
calibration_trace: parallel plots of every parameter and the objective across iterations.
calibration_landscape: 2D scatter of any pair of parameters coloured by objective value.
calibration_posterior: marginal histograms per parameter.
calibration_objective_surface: interpolated NSE surface over any 2-parameter slice.
calibration_pairplot: pairwise grid of scatter plus histograms.

Render one figure at a time from the CLI (the session id is printed on stderr during hmp run):

hmp display run_calibration_k.toml --session <session_id> --figure calibration_convergence
hmp display run_calibration_k.toml --session <session_id> --figure calibration_trace
hmp display run_calibration_k.toml --session <session_id> --figure calibration_landscape
hmp display run_calibration_k.toml --session <session_id> --figure calibration_posterior
hmp display run_calibration_k.toml --session <session_id> --figure calibration_objective_surface
hmp display run_calibration_k.toml --session <session_id> --figure calibration_pairplot

Pre-rendered examples for a Dupuit / MODFLOW 6 twin benchmark live under docs/source/_static/capability_gallery/calibration/: reuse them as visual references.

The loop can also render figures automatically at the end of the session by listing them under [display] figures = [...] in the TOML.

HTML report#

For users who do not want to drop into Python:

hmp report <session_id>
xdg-open ~/workspace/reports/<session_id>/report.html

The report embeds the six figures together with a summary table (method, objective, best parameters, best_sim_id, duration). It is fully standalone: the HTML file plus its sibling PNGs can be shipped as a single folder.

Cross-session cache and reproducibility#

Every trial is fingerprinted by a params_hash: the SHA-256 of its canonical parameter JSON representation (keys sorted, floats normalized). The hash is written on every calibration_iterations row.

use_cache = true (default): before running a trial, the engine looks up params_hash in the ParamsHashCache. The cache is preloaded at session start from every completed and promoted iteration of every previous session on the same workspace. On a hit, the sim_id is reused, the solver is skipped entirely, and the cached objective value is returned.
seed = 42: seeds the sampler. Combined with a deterministic solver, this makes the whole sequence reproducible. Leave null for stochastic exploration.

Disable the cache (use_cache = false) when:

the input data changed (new forcings, new DEM, new observations);
the solver version changed;
you want to benchmark true per-trial wall time.

Common pitfalls and how to avoid them#

``path`` points to a missing field. The engine raises a CalibrationSetupError with the list of valid paths. Double-check with hmp doctor --toml run_calibration_k.toml.
Bounds too tight. The optimizer converges on a plateau and reports a false “best”. Widen bounds and re-run.
``transform = “identity”`` for K from 1e-6 to 1e-3. Uniform sampling in the physical space burns 99% of the budget on the top decade. Always use transform = "log" for multi-order quantities (K, conductances, storage coefficients).
``save_runs = “all”`` on 100 trials. You end up with ~50 GB of Zarr. Use best_n unless you really want every trial on disk.
``max_iter`` too low for a Bayesian method. Optuna TPE needs at least ~50 trials before exploiting; da_mh_gp needs a few hundred before the posterior stabilizes.
Variable ``”discharge”`` but no hydrometry in ``[data]``. The metric extractor returns NaN on every trial; the session completes in seconds with no usable output. Check that [data.hydrometry] (or [data.piezometry] for head) is populated.
MODFLOW 6 + discharge calibration. The extractor in hydromodpy.calibration.metrics currently covers MODFLOW-NWT only; MODFLOW 6 returns NaN. Scheduled as future work: use MODFLOW-NWT in the meantime, or plug in a custom extractor through §10.
Forgetting ``[workflow].mode = “calibration”``. hmp run then treats the TOML as a simulation and runs K exactly once with its default value. The [calibration] section is silently ignored.

Python API reference#

Programmatic entry point (no CLI needed):

from hydromodpy.calibration.cli_runner import run_calibration_cli

summary = run_calibration_cli(
    "run_calibration_k.toml",
    objective="mypkg.metrics:custom_nse",
    workspace="~/workspace",
    project="nancon_K",
)
print(summary["best_sim_id"], summary["best_objective"])

The positional argument is the TOML path; resolved relative to the current working directory.
objective="module.path:fn" is the Python escape hatch. The callable must match the TrialMetricFn signature (ctx, *, objective, variable) -> (primary_metric, {component: value}). Use it when the default NSE / KGE / RMSE extractor does not fit your variable (multi-station weighted, filtered seasons, etc.).
workspace overrides the workspace root resolved from the TOML (same rule as the binary resolver in WorkspaceConfig).
project is a free string written to calibration_sessions.project.

Return dict keys:

session_id, method, n_iterations, best_objective,
best_sim_id, duration_s, save_runs, promoted

Low-level primitives#

The CLI is a thin wrapper around three primitives you can call directly for custom orchestration:

from hydromodpy.simulation.execution.trial import (
    prepare_trials, run_trial_light, promote_trial,
)

trial_ctx = prepare_trials(
    "run_calibration_k.toml",
    override_paths={"K": "flow.param.K.field.value"},
)
result = run_trial_light(
    trial_ctx, {"K": 1e-4},
    objective="nse", variable="discharge",
)
if result.status == "completed":
    sim_id = promote_trial(
        "run_calibration_k.toml",
        {"K": 1e-4},
        paths={"K": "flow.param.K.field.value"},
        name="manual_k_1e-4",
    )

Useful diagnostics on a persisted trace:

from hydromodpy.calibration.diagnostics import (
    convergence_rate, parameter_correlation,
)

rate = convergence_rate(iters)
print(rate["slope"], rate["r_squared"])
corr = parameter_correlation(iters, parameters=["K", "Sy"])
print(corr)

Render a specific figure programmatically (equivalent to the hmp display call above):

from pathlib import Path
import hydromodpy as hmp

catalog = hmp.open("~/workspace")
fig = hmp.display.get("calibration_convergence").plot(
    session_id="abcd1234ef...",
    catalog=catalog,
    save_path=Path("~/figures/convergence.png").expanduser(),
)

Analytical test cases#

Two pure-Python demos ship for local experimentation (no MODFLOW install needed):

hydromodpy.calibration.cases.recession_brutsaert: hydrograph recession fit (Q(t) = Q0 * exp(-t/tau)) used as the golden for grid_search, random_search, nelder_mead, simplex, cma_es, gp_mapping, da_mh_gp.
hydromodpy.calibration.cases.groundwater_1d: 1D synthetic aquifer with analytical head profile.

They are the quickest way to check that a new optimizer adapter behaves against the known-good METHOD_ABS_TOL tolerances.