Catalog patterns#
A project handle is convenient for the run loop but unnecessary when all the caller wants is to read or inspect previously persisted runs. HydroModPy ships a single catalog entry point for that purpose.
hmp.open#
hydromodpy.open(), written hmp.open() on this page, returns a
hydromodpy.results.catalog.SimulationCatalog rooted at the
given workspace. It is the read-side complement of
Project.simulate() and follows the same intent as xarray.open_dataset:
one call, a ready-to-query object.
By default create=False: the call raises FileNotFoundError when
no catalog.duckdb exists in the workspace. Pass create=True to
initialise an empty catalog.
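A caller that wants an empty catalog only when none exists yet can combine the two documented behaviours in a small wrapper. The sketch below is illustrative: open_or_create is a hypothetical helper, not part of HydroModPy, and it assumes the opener takes the workspace as its first positional argument. The opener is passed in as a parameter so the sketch runs without the library installed.

```python
from pathlib import Path


def open_or_create(workspace, opener):
    """Open the catalog in `workspace`, creating an empty one if it is missing.

    Hypothetical helper, not part of HydroModPy. `opener` is expected to
    behave like the documented hydromodpy.open(): raise FileNotFoundError
    when no catalog exists, and accept create=True to initialise one.
    Assumption: the workspace is the first positional argument.
    """
    workspace = Path(workspace)
    try:
        # Default behaviour (create=False): a missing catalog raises.
        return opener(workspace)
    except FileNotFoundError:
        # Explicit opt-in: initialise an empty catalog instead.
        return opener(workspace, create=True)
```

With the library installed, the call would presumably read open_or_create("path/to/workspace", hmp.open). Read-only scripts are usually better served by the plain default, since it prevents a mistyped path from silently producing an empty catalog.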
The returned catalog, referred to as cat below, exposes the workspace
query surface without a project-name filter: it sees every simulation
persisted in the workspace.
Query surface#
The catalog is the single entry point for queries. cat.find() is its one
filtered query method and returns a SimulationGroup; an unknown filter key
raises ValueError listing the valid filters. cat.frame
returns the full DataFrame. Federation across workspaces is handled by
hmp.index(). Inputs are reached via
hydromodpy.catalog.InputsNamespace or the hmp data CLI.
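The strict filter validation described above can be illustrated with a stand-in written in plain Python. This is a sketch of the contract only, not the library's implementation: find_runs, the filter names in VALID_FILTERS and the sample rows are all hypothetical, and the real cat.find() returns a SimulationGroup rather than a list of dicts.

```python
# Hypothetical filter names; the real set is defined by the library.
VALID_FILTERS = ("project", "model", "status")


def find_runs(rows, **filters):
    """Return the rows matching every filter, rejecting unknown filter keys."""
    unknown = sorted(set(filters) - set(VALID_FILTERS))
    if unknown:
        # Fail loudly and list the accepted keys, as the text describes.
        raise ValueError(
            f"unknown filter(s) {unknown}; valid filters: {list(VALID_FILTERS)}"
        )
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in filters.items())
    ]


# Invented sample data standing in for persisted simulations.
runs = [
    {"project": "demo", "model": "modflow", "status": "done"},
    {"project": "demo", "model": "modflow", "status": "failed"},
    {"project": "other", "model": "modflow", "status": "done"},
]

matches = find_runs(runs, project="demo", status="done")
```

Rejecting unknown keys up front means a typo such as projct="demo" raises an error instead of quietly returning an unfiltered or empty result.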
Schema discovery and selectors live on the same object:
cat.describe()/cat.tables()/cat.columns()/
cat.variables()/cat.metrics()/cat.stations(), plus
cat.latest()/cat.best()/cat.worst()/cat.rank(),
cat[ref], cat.resolve(), cat.sql(), and
cat.read() for the by-id read path.
Notebook pattern#
Catalog and reader compose naturally inside a notebook session:
hmp.read() auto-dispatches the variable name through the field
registry (Zarr), the timeseries table (DuckDB), and the geographic
features table (GeoParquet), so a single call handles the three storage
kinds.
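The dispatch idea can be sketched as a lookup across three registries. Everything in the block is an assumption made for illustration: the variable names, the registry contents and the resolve_backend helper are hypothetical, and the lookup order simply follows the order of the sentence above rather than any documented behaviour of hmp.read().

```python
# Hypothetical registries: the variable names and their grouping are
# invented for illustration and do not come from HydroModPy.
FIELDS = {"head", "recharge"}      # gridded fields (Zarr in the text above)
TIMESERIES = {"discharge"}         # tabular series (DuckDB in the text above)
FEATURES = {"stream_network"}      # vector features (GeoParquet in the text above)

BACKENDS = (
    ("zarr", FIELDS),
    ("duckdb", TIMESERIES),
    ("geoparquet", FEATURES),
)


def resolve_backend(variable):
    """Return the storage kind holding `variable`, checking each registry in turn."""
    for kind, registry in BACKENDS:
        if variable in registry:
            return kind
    raise KeyError(f"unknown variable {variable!r}")
```

A reader built this way lets the caller name only the variable; the storage kind is an internal detail resolved from the name.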
See Also#
hydromodpy.open() – workspace-level catalog.
hydromodpy.read() – read a variable from a persisted run.
hydromodpy.index() – machine-wide federation of workspaces.