Data Managers And External Dependencies
Scope
The data-manager root layer is HydroModPy’s orchestration boundary for data loading. It answers three questions before solver work begins:
which data families are active for this run,
which manager loads each family,
which external providers or local inputs those managers depend on.
This architecture is intentionally centralized under
hydromodpy.data so the project facade and the simulation runner do
not need to duplicate activation rules or provider-specific loading
logic.
Code map
hydromodpy/data/data_managers_config.py: typed validation of [data] sections.
hydromodpy/data/planner.py and plan.py: activation inference and immutable DataLoadPlan creation.
hydromodpy/data/runtime_loader.py: runtime dispatch from activated data types to concrete managers.
hydromodpy/data/data_managers.py: lightweight loaded-data container consumed by the project facade and the structure binders.
hydromodpy/data/variables/*: provider-specific packages that own typed config, manager logic, and IO.
Recommended reading path
hydromodpy/data/README.md
hydromodpy/data/data_managers_config.py
hydromodpy/data/planner.py
hydromodpy/data/runtime_loader.py
one family package under hydromodpy/data/variables/
Root-Layer Responsibilities
The root files under hydromodpy/data/ split responsibilities as follows:
data_managers_config.py validates [data] and normalizes typed sections,
planner.py merges explicit types with inference rules,
plan.py stores the immutable DataLoadPlan,
runtime_loader.py dispatches each activated type to its concrete manager,
data_managers.py exposes the lightweight runtime container consumed by orchestration layers.
This means the project facade can stay focused on execution order while the data layer owns activation, validation, and loading dispatch.
Activation And Inference
The activation contract is deterministic:
explicit data.types always wins,
additional types can be inferred from domain or flow configuration,
the final decision is stored in DataLoadPlan with reasons per type.
Current inference rules include:
domain.supports using provider geology -> activate geology,
domain.zone_ids containing geology -> activate geology,
flow.active_bc containing stream -> activate hydrography,
flow.active_bc containing ocean -> activate oceanic.
data.inference_mode = "warn" accepts these inferences and records them.
data.inference_mode = "strict" requires explicit typed sections for the
inferred families, except for the geology defaulting path already handled by
the data layer.
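The activation contract above can be sketched as follows. This is a minimal illustration, not the actual planner: `build_plan`, the field layout of this stand-in `DataLoadPlan`, the shape of `domain.supports` (a list of dicts with a `provider` key), and the exact strict-mode check are all assumptions made for the example.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class DataLoadPlan:
    """Hypothetical stand-in: maps each activated type to its reasons."""
    reasons: Mapping[str, tuple[str, ...]]

    @property
    def types(self) -> tuple[str, ...]:
        return tuple(self.reasons)


def build_plan(config: dict, inference_mode: str = "warn") -> DataLoadPlan:
    """Merge explicit data.types with the documented inference rules."""
    data = config.get("data", {})
    domain = config.get("domain", {})
    flow = config.get("flow", {})

    # Explicit types are recorded first, so they always end up in the plan.
    explicit = list(data.get("types", []))
    reasons = {name: ["explicit data.types"] for name in explicit}

    # Collect (type, reason) pairs from the four rules listed above.
    inferred = []
    if any(s.get("provider") == "geology" for s in domain.get("supports", [])):
        inferred.append(("geology", "domain.supports uses provider geology"))
    if "geology" in domain.get("zone_ids", []):
        inferred.append(("geology", "domain.zone_ids contains geology"))
    active_bc = flow.get("active_bc", [])
    if "stream" in active_bc:
        inferred.append(("hydrography", "flow.active_bc contains stream"))
    if "ocean" in active_bc:
        inferred.append(("oceanic", "flow.active_bc contains ocean"))

    for name, why in inferred:
        # Assumed strict rule: an inferred family needs a typed [data.<name>]
        # section, except geology, which has its own defaulting path.
        needs_section = name not in explicit and name != "geology"
        if inference_mode == "strict" and needs_section and name not in data:
            raise ValueError(f"strict mode: missing [data.{name}] section ({why})")
        reasons.setdefault(name, []).append(f"inferred: {why}")

    frozen = {name: tuple(items) for name, items in reasons.items()}
    return DataLoadPlan(MappingProxyType(frozen))
```

In this sketch the plan keeps insertion order (explicit types first, inferred types after) and the read-only mapping makes accidental mutation after planning fail early.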
Runtime Dispatch
DataManagersRuntimeLoader is the concrete dispatcher used after planning.
Its current dispatch table covers:
terrain and support context: dem, geology, hydrography, oceanic,
observed stations: hydrometry, piezometry, intermittency, water_quality,
climatic or forcing-like fields: recharge, runoff, precipitation, etp, temperature, wind, humidity, radiation, soil_moisture.
Each family still owns its own typed config and manager package. The root layer only decides activation and calls the right loader.
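A dispatch table of this kind can be sketched as a plain mapping from type name to loader callable. The names `DISPATCH`, `load_activated`, and the placeholder loaders below are hypothetical; the real DataManagersRuntimeLoader calls each family's own manager package instead.

```python
from typing import Any, Callable

Loader = Callable[[dict], Any]


def _placeholder_loader(family: str) -> Loader:
    """Stand-in for a family manager; the real ones live under variables/."""
    def load(section: dict) -> dict:
        return {"family": family, "config": section}
    return load


# One entry per data type named in the dispatch surface above.
DISPATCH: dict[str, Loader] = {
    name: _placeholder_loader(name)
    for name in (
        "dem", "geology", "hydrography", "oceanic",
        "hydrometry", "piezometry", "intermittency", "water_quality",
        "recharge", "runoff", "precipitation", "etp", "temperature",
        "wind", "humidity", "radiation", "soil_moisture",
    )
}


def load_activated(types: list[str], sections: dict[str, dict]) -> dict[str, Any]:
    """Call the registered loader for each activated type, in plan order."""
    loaded: dict[str, Any] = {}
    for name in types:
        if name not in DISPATCH:
            raise ValueError(f"no manager registered for data type {name!r}")
        loaded[name] = DISPATCH[name](sections.get(name, {}))
    return loaded
```

Keeping the table flat means the root layer never needs to know provider details: adding a family is one new entry plus its own package.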
Provider Families
The current provider inventory documented in hydromodpy/data/structure.md
is summarized below.
| Provider or source family | Main HydroModPy data families | Geographic scope |
|---|---|---|
| Hub’Eau Hydrometrie | | France metropolitaine |
| Hub’Eau Piezometrie | | France metropolitaine |
| Hub’Eau ONDE | | France metropolitaine |
| Hub’Eau Qualite | | France metropolitaine |
| SIM2 EDR | | France metropolitaine |
| SHOM | | French coasts |
| IGN GeoPlateforme BD ALTI | | France metropolitaine |
| BRGM 1:1M / 1:50K | | France metropolitaine |
| Sandre WFS / BD Topage | | France metropolitaine |
| EU-Hydro | | Europe |
| OpenStreetMap / Overpass | | global |
External Runtime Constraints
The important architectural point is not only “which API is called”, but also “what must be available for that call to succeed”.
Typical constraints today are:
SIM2-backed variables require a bounding box and a project time window,
SHOM loading requires a geographic context and a resolved date range,
geology and DEM loading often rely on geographic masks or raster support,
hydrography and watershed preprocessing depend on the Whitebox backend for some derived products,
local custom sources remain first-class inputs and must not be silently overwritten by cache subsumption logic.
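One way to make such preconditions explicit is to check them before any provider call. The sketch below is illustrative only: `RunContext`, `REQUIREMENTS`, and `missing_requirements` are hypothetical names, and representing SHOM's "geographic context" as a bounding box is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RunContext:
    """Hypothetical run context; None means 'not resolved yet'."""
    bbox: Optional[tuple[float, float, float, float]] = None  # xmin, ymin, xmax, ymax
    start: Optional[date] = None
    end: Optional[date] = None


# Assumed mapping from provider to the context fields it needs.
REQUIREMENTS: dict[str, tuple[str, ...]] = {
    "sim2": ("bbox", "start", "end"),  # bounding box and project time window
    "shom": ("bbox", "start", "end"),  # geographic context and resolved date range
}


def missing_requirements(provider: str, ctx: RunContext) -> list[str]:
    """Return the context fields a provider needs that are still unset."""
    return [
        field for field in REQUIREMENTS.get(provider, ())
        if getattr(ctx, field) is None
    ]
```

A caller could then fail fast with a readable message instead of discovering the missing input inside a remote request.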
The data catalog and cache layer then add another operational constraint:
empty remote results may be cached as sentinels to avoid repeated failed API calls,
stored paths stay relative for workspace portability,
force_refresh bypasses cache reuse when a provider call must be repeated.
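The three cache behaviours above can be illustrated with a small file-backed index. This is a sketch under stated assumptions, not the HydroModPy catalog: `SimpleCatalog`, `EMPTY_SENTINEL`, the `catalog.json` file name, and the `cache/<key>.bin` layout are invented for the example, and keys are assumed to be filesystem-safe.

```python
import json
from pathlib import Path
from typing import Callable, Optional

EMPTY_SENTINEL = "__empty__"  # hypothetical marker for "provider returned nothing"


class SimpleCatalog:
    """Toy catalog: maps a key to a workspace-relative path or a sentinel."""

    def __init__(self, workspace: Path) -> None:
        self.workspace = workspace
        self.index_path = workspace / "catalog.json"
        self.index: dict[str, str] = (
            json.loads(self.index_path.read_text())
            if self.index_path.exists() else {}
        )

    def fetch(
        self,
        key: str,
        provider_call: Callable[[], Optional[bytes]],
        force_refresh: bool = False,
    ) -> Optional[Path]:
        # Reuse the cached entry unless the caller forces a new provider call.
        if not force_refresh and key in self.index:
            entry = self.index[key]
            return None if entry == EMPTY_SENTINEL else self.workspace / entry

        payload = provider_call()
        if not payload:
            # Remember the empty result so the call is not repeated.
            self.index[key] = EMPTY_SENTINEL
            result = None
        else:
            # Store a relative path so the workspace can be moved.
            relative = Path("cache") / f"{key}.bin"
            target = self.workspace / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(payload)
            self.index[key] = relative.as_posix()
            result = target

        self.index_path.write_text(json.dumps(self.index, indent=2))
        return result
```

Because only relative paths are written to the index, copying the workspace directory elsewhere keeps every cached entry resolvable.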
What To Read When Touching This Layer
Start with:
hydromodpy/data/README.md for the root orchestration contract,
hydromodpy/data/planner.py for inference rules,
hydromodpy/data/runtime_loader.py for the active dispatch surface,
hydromodpy/data/structure.md for the broader provider and cache model.
Then inspect one typed family such as:
hydromodpy/data/variables/oceanic/,
hydromodpy/data/variables/geology/,
hydromodpy/data/variables/hydrometry/,
hydromodpy/data/variables/precipitation/.
Those packages own provider-specific config, IO, and manager behavior.
Current Boundary With Future Work
The architecture roadmap still mentions future consolidation work, especially:
deeper planner simplification,
clearer convergence between some observed-station families,
integrating PyHELP as a standard data-manager family rather than a more isolated coupling path.
Until that happens, this page should be read as the current root-layer contract, not as the final long-term provider map.