hydromodpy.results.lazy_loaders

Lazy loaders for the per-simulation Parquet and Zarr stores.

Returns

A polars.LazyFrame for tabular Parquet views and a pyarrow.dataset.Dataset for per-field Zarr stores. Both are constructed without reading any row data, so a cohort scan can be planned (filters, projections) before any byte is read.

These accessors close the gap in ML-friendly access identified in reports_db/04_parquet_tabular.md §R08. The previous catalog API exposed only eager pandas.DataFrame results, which risks out-of-memory errors on large workspaces.

Functions

iter_parquet_paths(workspace)

Yield every Parquet payload under workspace/simulations/.

list_field_paths(catalog)

Return every per-simulation Zarr store path.

list_parquet_paths(catalog, view)

Return existing <view>.parquet paths across the workspace.

scan_field(catalog)

Return a pyarrow.dataset.Dataset over every Zarr store.

scan_timeseries(catalog, *[, filters, columns])

Return a polars LazyFrame over every timeseries.parquet.

scan_view(catalog, view, *[, filters, columns])

Return a polars LazyFrame over any per-simulation view.
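As a rough illustration of the two path helpers listed above, the following standard-library sketch walks a workspace and collects Parquet paths. It is not the hydromodpy implementation. The function names carry a _sketch suffix to mark them as hypothetical, they take a plain workspace path rather than a catalog object, and the simulations/<sim_id>/<view>.parquet layout and the "budget" view name are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Iterator, List


def iter_parquet_paths_sketch(workspace: Path) -> Iterator[Path]:
    """Lazily yield every *.parquet file found under workspace/simulations/."""
    sims_root = Path(workspace) / "simulations"
    if not sims_root.is_dir():
        return  # nothing to yield for an empty or missing workspace
    for path in sims_root.rglob("*.parquet"):
        yield path


def list_view_paths_sketch(workspace: Path, view: str) -> List[Path]:
    """Return the existing <view>.parquet paths, sorted for reproducibility."""
    wanted = f"{view}.parquet"
    return sorted(
        p for p in iter_parquet_paths_sketch(workspace) if p.name == wanted
    )


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    # Hypothetical layout: sim_a has two views, sim_b has one.
    layout = {
        "sim_a": ["timeseries.parquet", "budget.parquet"],
        "sim_b": ["timeseries.parquet"],
    }
    for sim_id, names in layout.items():
        sim_dir = workspace / "simulations" / sim_id
        sim_dir.mkdir(parents=True)
        for name in names:
            (sim_dir / name).touch()  # empty placeholder files are enough here

    all_paths = sorted(
        p.relative_to(workspace).as_posix()
        for p in iter_parquet_paths_sketch(workspace)
    )
    timeseries_paths = [
        p.relative_to(workspace).as_posix()
        for p in list_view_paths_sketch(workspace, "timeseries")
    ]
```

The generator form keeps the directory walk incremental, while the list form returns only paths that exist on disk, so a view missing for some simulations does not produce dangling entries.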