API Documentation

Connectors

Base

Base classes for PastaStore Connectors.

class pastastore.base.BaseConnector[source]

Base Connector class.

Class holds base logic for dealing with time series and Pastas Models. Create your own Connector to a data source by writing a class that inherits from this BaseConnector. Your class has to override each abstractmethod and property.
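The subclassing contract described above can be illustrated with Python's abc module. This is a minimal sketch with toy stand-in classes (ToyBaseConnector, IncompleteConnector and ToyDictConnector are hypothetical names, not part of pastastore). It shows that a subclass can only be instantiated once every abstract method is overridden.

```python
from abc import ABC, abstractmethod


class ToyBaseConnector(ABC):
    """Hypothetical stand-in for a connector base class (not pastastore code)."""

    @abstractmethod
    def _get_library(self, libname):
        """Return a handle to the library."""

    @abstractmethod
    def _add_item(self, libname, item, name, metadata=None):
        """Store an item in the library."""

    @abstractmethod
    def _get_item(self, libname, name):
        """Retrieve an item from the library."""

    def add_oseries(self, series, name, metadata=None):
        # Shared logic lives in the base class and calls the abstract hooks.
        self._add_item("oseries", series, name, metadata=metadata)


class IncompleteConnector(ToyBaseConnector):
    """Overrides only one hook, so it stays abstract."""

    def _get_library(self, libname):
        return {}


class ToyDictConnector(ToyBaseConnector):
    """Overrides every abstract hook and can be instantiated."""

    def __init__(self):
        self._libs = {"oseries": {}}

    def _get_library(self, libname):
        return self._libs[libname]

    def _add_item(self, libname, item, name, metadata=None):
        self._get_library(libname)[name] = item

    def _get_item(self, libname, name):
        return self._get_library(libname)[name]


try:
    IncompleteConnector()
    instantiated = True
except TypeError:
    # abc refuses to instantiate classes with abstract methods left over.
    instantiated = False

conn = ToyDictConnector()
conn.add_oseries([1.0, 2.0], "well_1")
```

Keeping the public methods in the base class and only the storage hooks in the subclass means each new data source needs just a handful of overrides.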

abstractmethod _add_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], item: DataFrame | Series | dict, name: str, metadata: dict | None = None) None[source]

Add item for both time series and pastas.Models (internal method).

Must be overridden by subclass.

Parameters:
  • libname (str) – name of library to add item to

  • item (DataFrameOrSeries | dict) – item to add

  • name (str) – name of the item

  • metadata (dict, optional) – dictionary containing metadata, by default None

Note

Metadata storage can vary by connector:
  • ArcticDB: Native metadata support via write()

  • DictConnector: Stored as tuple (metadata, item)

  • PasConnector: Separate {name}_meta.pas JSON file
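As an illustration of the DictConnector entry in the note above, the following sketch stores metadata and item together as a tuple in a plain dictionary. The helper names (add_item, get_item, get_metadata) and the list used in place of a pandas Series are assumptions made for the example, not pastastore's implementation.

```python
# A library is modelled as a plain dict: name -> (metadata, item).
library = {}


def add_item(lib, item, name, metadata=None):
    """Store the item together with its metadata as a (metadata, item) tuple."""
    lib[name] = (metadata if metadata is not None else {}, item)


def get_item(lib, name):
    """Return only the item part of the stored tuple."""
    return lib[name][1]


def get_metadata(lib, name):
    """Return only the metadata part of the stored tuple."""
    return lib[name][0]


# A list stands in for a pandas Series to keep the sketch dependency-free.
add_item(library, [0.1, 0.2, 0.3], "well_1", metadata={"x": 100.0, "y": 200.0})
add_item(library, [1.5, 1.6], "well_2")
```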

Add model name to stored list of models per oseries.

Parameters:
  • oseries_name (str) – name of oseries

  • model_names (str | list[str]) – model name or list of model names for an oseries with name oseries_name.

  • _clear_cache (bool, optional) – whether to clear the cache after adding, by default True. Set to False during bulk operations to improve performance.

_add_series(libname: Literal['oseries', 'stresses'], series: DataFrame | Series, name: str, metadata: dict | None = None, validate: bool | None = None, overwrite: bool = False) None[source]

Add series to database (internal method).

Parameters:
  • libname (str) – name of the library to add the series to

  • series (pandas.Series or pandas.DataFrame) – data to add

  • name (str) – name of the time series

  • metadata (dict, optional) – dictionary containing metadata, by default None

  • validate (bool, optional) – use pastas to validate series, default is None, which will use the USE_PASTAS_VALIDATE_SERIES value (default is True).

  • overwrite (bool, optional) – overwrite existing dataset with the same name, by default False

Raises:

ItemInLibraryException – if overwrite is False and name is already in the database
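The overwrite rule above amounts to a guard before writing. The sketch below assumes a plain dictionary as the library and defines its own ItemInLibraryException class so that it runs without pastastore installed. add_series is a hypothetical helper.

```python
class ItemInLibraryException(Exception):
    """Local stand-in so the sketch runs without pastastore installed."""


def add_series(lib, series, name, overwrite=False):
    """Add a series to a dict-based library, refusing silent overwrites."""
    if name in lib and not overwrite:
        raise ItemInLibraryException(
            f"'{name}' already in library, use overwrite=True to replace it"
        )
    lib[name] = series


lib = {}
add_series(lib, [1.0, 2.0], "head_1")

try:
    add_series(lib, [9.0], "head_1")  # same name, overwrite defaults to False
    raised = False
except ItemInLibraryException:
    raised = True

add_series(lib, [3.0, 4.0], "head_1", overwrite=True)  # explicit overwrite
```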

Add model name to stored list of models per stress.

Parameters:
  • stress_names (list[str]) – names of stresses

  • model_names (str | list[str]) – model name or list of model names for the stresses named in stress_names.

  • _clear_cache (bool, optional) – whether to clear the cache after adding, by default True. Set to False during bulk operations to improve performance.

_added_models = []

static _clear_cache(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) None[source]

Clear cached property.

_conn_type: str | None = None

_default_library_names = ['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']

abstractmethod _del_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str, force: bool = False) None[source]

Delete items (series or models) (internal method).

Must be overridden by subclass.

Parameters:
  • libname (str) – name of library to delete item from

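  • name (str) – name of item to delete

  • force (bool, optional) – if True, force delete item and do not perform check if series is used in a model, by default False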

Delete model name from stored list of models per oseries.

Parameters:
  • onam (str) – name of oseries

  • mlnam (str) – name of model

Delete model name from stored list of models per stress.

Parameters:
  • stress_names (list[str]) – list of stress names for which to remove the model link.

  • model_name (str) – Name of the model to remove from the stress links.

abstractmethod _get_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) DataFrame | Series | dict[source]

Get item (series or pastas.Models) (internal method).

Must be overridden by subclass.

Parameters:
  • libname (str) – name of library

  • name (str) – name of item

Returns:

item – item (time series or pastas.Model)

Return type:

DataFrameOrSeries | dict

abstractmethod _get_library(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'])[source]

Get library handle.

Must be overridden by subclass.

Parameters:

libname (str) – name of the library

Returns:

lib – handle to the library

Return type:

Any

abstractmethod _get_metadata(libname: Literal['oseries', 'stresses'], name: str) dict[source]

Get metadata (internal method).

Must be overridden by subclass.

Parameters:
  • libname (str) – name of the library

  • name (str) – name of the item

Returns:

metadata – dictionary containing metadata

Return type:

dict

_get_model_stress_names(ml: Model | dict) list[str][source]

Get list of stress names used in model.

Parameters:

ml (pastas.Model or dict) – model to get stress names from

Returns:

list of stress names used in model

Return type:

list[str]

_get_series(libname: str, names: list | str, progressbar: bool = True, squeeze: bool = True) DataFrame | Series[source]

Get time series (internal method).

Parameters:
  • libname (str) – name of the library

  • names (str | list[str]) – names of the time series to load

  • progressbar (bool, optional) – show progressbar, by default True

  • squeeze (bool, optional) – if True return DataFrame or Series instead of dictionary for single entry

Returns:

either returns time series as pandas.DataFrame or dictionary containing the time series.

Return type:

pandas.DataFrame or dict of pandas.DataFrames
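The effect of squeeze can be sketched as follows, assuming the multi-name result is a dictionary keyed by name. The function load_many and the loader callable are hypothetical and only mimic the behaviour described above.

```python
def load_many(names, loader, squeeze=True):
    """Load one or more items and optionally unwrap a single result."""
    if isinstance(names, str):
        names = [names]
    result = {name: loader(name) for name in names}
    # With squeeze=True a single entry is returned bare, not wrapped in a dict.
    if squeeze and len(result) == 1:
        return next(iter(result.values()))
    return result


# Toy data source standing in for a library of time series.
store = {"a": [1, 2], "b": [3, 4]}

single = load_many("a", store.__getitem__)
wrapped = load_many("a", store.__getitem__, squeeze=False)
several = load_many(["a", "b"], store.__getitem__)
```

Passing squeeze=False is useful in loops where the caller wants the same return type regardless of how many names were requested.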

Get model names per oseries and stresses time series in a dictionary.

Returns:

links – dictionary with ‘oseries’ and ‘stresses’ as keys containing dictionaries with time series names as keys and lists of model names as values.

Return type:

dict
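The structure of the returned links dictionary can be illustrated with a small reverse lookup. The model dictionaries below use a simplified, hypothetical layout (keys 'oseries' and 'stresses'). Real pastas model dictionaries are more deeply nested.

```python
from collections import defaultdict

# Simplified, hypothetical model descriptions: model name -> series it uses.
models = {
    "ml_1": {"oseries": "well_1", "stresses": ["prec", "evap"]},
    "ml_2": {"oseries": "well_1", "stresses": ["prec"]},
    "ml_3": {"oseries": "well_2", "stresses": ["prec", "river"]},
}


def build_links(models):
    """Map each time series name to the models that use it."""
    oseries_links = defaultdict(list)
    stresses_links = defaultdict(list)
    for model_name, ml in models.items():
        oseries_links[ml["oseries"]].append(model_name)
        for stress_name in ml["stresses"]:
            stresses_links[stress_name].append(model_name)
    return {"oseries": dict(oseries_links), "stresses": dict(stresses_links)}


links = build_links(models)
```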

abstractmethod _item_exists(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) bool[source]

Return True if item present in library, else False.

_iter_series(libname: Literal['oseries', 'stresses'], names: list[str] | None = None)[source]

Iterate over time series in library (internal method).

Parameters:
  • libname (str) – name of library (e.g. ‘oseries’ or ‘stresses’)

  • names (list[str] | None, optional) – list of names, by default None, which defaults to all stored series

Yields:

pandas.Series or pandas.DataFrame – time series contained in library

abstractmethod _list_symbols(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) list[str][source]

Return list of symbol names in library.

property _modelnames_cache: list[str]

List of model names.

abstractmethod _parallel(func: Callable, names: list[str], kwargs: dict | None = None, progressbar: bool | None = True, max_workers: int | None = None, chunksize: int | None = None, desc: str = '') None[source]

Parallel processing of function.

Must be overridden by subclass.

Parameters:
  • func (function) – function to apply in parallel

  • names (list) – list of names to apply function to

  • kwargs (dict) – additional keyword arguments to pass to function

  • progressbar (bool, optional) – show progressbar, by default True

  • max_workers (int, optional) – maximum number of workers, by default None

  • chunksize (int, optional) – chunksize for parallel processing, by default None

  • desc (str, optional) – description for progressbar, by default “”

_update_series(libname: Literal['oseries', 'stresses'], series: DataFrame | Series, name: str, metadata: dict | None = None, validate: bool | None = None, force: bool = False) None[source]

Update time series (internal method).

Parameters:
  • libname (str) – name of library

  • series (DataFrameOrSeries) – time series containing update values

  • name (str) – name of the time series to update

  • metadata (dict | None, optional) – optionally provide metadata dictionary which will also update the current stored metadata dictionary, by default None

  • validate (bool, optional) – use pastas to validate series, default is None, which will use the USE_PASTAS_VALIDATE_SERIES value (default is True).

  • force (bool, optional) – force update even if time series is used in a model, by default False

Add all model names to reverse lookup time series dictionaries.

Used for old PastaStore versions, where relationship between time series and models was not stored. If there are any models in the database and if the oseries_models or stresses_models libraries are empty, loop through all models to determine which time series are used in each model.

Parameters:
  • libraries (list[str], optional) – list of time series libraries to update model links for, by default None which will update both ‘oseries’ and ‘stresses’

  • modelnames (list[str] | None, optional) – list of model names to update links for, by default None

  • recompute (bool, optional) – Indicate operation is an update/recompute of existing links, by default False

  • progressbar (bool, optional) – show progressbar, by default True

_upsert_series(libname: Literal['oseries', 'stresses'], series: DataFrame | Series, name: str, metadata: dict | None = None, validate: bool | None = None, force: bool = False) None[source]

Update or insert series depending on whether it exists in store.

Parameters:
  • libname (str) – name of library

  • series (DataFrameOrSeries) – time series to update/insert

  • name (str) – name of the time series

  • metadata (dict | None, optional) – metadata dictionary, by default None

  • validate (bool, optional) – use pastas to validate series, default is None, which will use the USE_PASTAS_VALIDATE_SERIES value (default is True).

  • force (bool, optional) – force update even if time series is used in a model, by default False
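The update-or-insert decision can be sketched as below. The dictionary library and the helper upsert_series are assumptions. In particular, merging by timestamp key is one plausible meaning of "update" and is not confirmed by the text above.

```python
def upsert_series(lib, series, name):
    """Insert the series if the name is new, otherwise update the stored one."""
    if name in lib:
        # Assumption: an update overwrites overlapping timestamps and
        # appends new ones, leaving the other stored values untouched.
        lib[name].update(series)
        return "updated"
    lib[name] = dict(series)
    return "inserted"


lib = {}
# Dicts keyed by ISO date strings stand in for time-indexed pandas Series.
first = upsert_series(lib, {"2024-01-01": 1.0, "2024-01-02": 1.1}, "head_1")
second = upsert_series(lib, {"2024-01-02": 1.2, "2024-01-03": 1.3}, "head_1")
```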

_validator: Validator | None = None

add_model(ml: Model | dict, overwrite: bool = False, validate_metadata: bool = False) None[source]

Add model to the database.

Parameters:
  • ml (pastas.Model or dict) – pastas Model or dictionary to add to the database

  • overwrite (bool, optional) – if True, overwrite existing model, by default False

  • validate_metadata (bool, optional) – remove unsupported characters from metadata dictionary keys, by default False

Raises:
  • TypeError – if model is not pastas.Model or dict

  • ItemInLibraryException – if overwrite is False and model is already in the database

add_oseries(series: DataFrame | Series, name: str, metadata: dict | None = None, validate: bool | None = None, overwrite: bool = False) None[source]

Add oseries to the database.

Parameters:
  • series (pandas.Series or pandas.DataFrame) – data to add

  • name (str) – name of the time series

  • metadata (dict, optional) – dictionary containing metadata, by default None.

  • validate (bool, optional) – use pastas to validate series, default is None, which will use the USE_PASTAS_VALIDATE_SERIES value (default is True).

  • overwrite (bool, optional) – overwrite existing dataset with the same name, by default False

add_stress(series: DataFrame | Series, name: str, kind: str, metadata: dict | None = None, validate: bool | None = None, overwrite: bool = False) None[source]

Add stress to the database.

Parameters:
  • series (pandas.Series or pandas.DataFrame) – data to add, if pastas.Timeseries is passed, series_original and metadata are stored in database

  • name (str) – name of the time series

  • kind (str) – category to identify type of stress, this label is added to the metadata dictionary.

  • metadata (dict, optional) – dictionary containing metadata, by default None.

  • validate (bool, optional) – use pastas to validate series, default is None, which will use the USE_PASTAS_VALIDATE_SERIES value (default is True).

  • overwrite (bool, optional) – overwrite existing dataset with the same name, by default False
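How the kind label ends up in the metadata can be sketched as follows. The helper add_stress and the dict-based library are hypothetical. The only behaviour taken from the description above is that kind is added to the metadata dictionary.

```python
def add_stress(lib, series, name, kind, metadata=None):
    """Store a stress and record its kind in the metadata dictionary."""
    # Copy so the caller's dictionary is not modified (a choice of this sketch).
    meta = dict(metadata) if metadata is not None else {}
    meta["kind"] = kind
    lib[name] = (meta, series)


stresses = {}
add_stress(stresses, [0.0, 2.5, 1.0], "prec_station_1", kind="prec")
add_stress(stresses, [0.4, 0.6, 0.5], "evap_station_1", kind="evap",
           metadata={"x": 10.0})

# Select stress names by kind using the stored metadata.
prec_names = [n for n, (meta, _) in stresses.items() if meta["kind"] == "prec"]
```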

property conn_type: str

Get the connector type.

del_model(names: list | str, verbose: bool = True) None[source]

Delete model(s) from the database.

Alias for del_models().

Parameters:
  • names (str | list[str]) – name(s) of the model to delete

  • verbose (bool, optional) – print information about deleted models, by default True

del_models(names: list | str, verbose: bool = True) None[source]

Delete model(s) from the database.

Parameters:
  • names (str | list[str]) – name(s) of the model to delete

  • verbose (bool, optional) – print information about deleted models, by default True

del_oseries(names: list | str, remove_models: bool = False, force: bool = False, verbose: bool = True)[source]

Delete oseries from the database.

Parameters:
  • names (str | list[str]) – name(s) of the oseries to delete

  • remove_models (bool, optional) – also delete models for deleted oseries, default is False

  • force (bool, optional) – force deletion of oseries that are used in models, by default False

  • verbose (bool, optional) – print information about deleted oseries, by default True

del_stress(names: list | str, remove_models: bool = False, force: bool = False, verbose: bool = True)[source]

Delete stress from the database.

Parameters:
  • names (str | list[str]) – name(s) of the stress to delete

  • remove_models (bool, optional) – also delete models for deleted stresses, default is False

  • force (bool, optional) – force deletion of stresses that are used in models, by default False

  • verbose (bool, optional) – print information about deleted stresses, by default True

property empty: bool

Check if the database is empty.

empty_library(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], prompt: bool = True, progressbar: bool = True)[source]

Empty library of all its contents.

Parameters:
  • libname (str) – name of the library

  • prompt (bool, optional) – prompt user for input before deleting contents, by default True. Default answer is “n”, user must enter ‘y’ to delete contents

  • progressbar (bool, optional) – show progressbar, by default True
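The confirmation rule for prompt can be sketched like this. The function confirm_and_empty is hypothetical, and the ask argument is injected so that the example runs without user interaction. Accepting an upper-case "Y" is a choice of this sketch.

```python
def confirm_and_empty(lib, libname, prompt=True, ask=input):
    """Empty a dict-based library, asking for confirmation first."""
    if prompt:
        answer = ask(f"Do you want to empty '{libname}' of all its contents? [y/N] ")
        # Anything other than an explicit 'y' keeps the contents.
        if answer.strip().lower() != "y":
            return False
    lib.clear()
    return True


lib = {"well_1": [1.0], "well_2": [2.0]}

kept = confirm_and_empty(lib, "oseries", ask=lambda _: "")  # Enter means "n"
size_after_decline = len(lib)
emptied = confirm_and_empty(lib, "oseries", ask=lambda _: "y")
```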

get_metadata(libname: str, names: list | str, progressbar: bool = False, as_frame: bool = True, squeeze: bool = True) dict[str, Any] | DataFrame[source]

Read metadata from database.

Parameters:
  • libname (str) – name of the library containing the dataset

  • names (str | list[str]) – names of the datasets for which to read the metadata

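  • progressbar (bool, optional) – show progressbar, by default False

  • as_frame (bool, optional) – if True return metadata as a DataFrame, by default True

  • squeeze (bool, optional) – if True return dict instead of list of dict for single entry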

Returns:

returns metadata dictionary or DataFrame of metadata

Return type:

dict | pandas.DataFrame

get_model(names: list | str, return_dict: bool = False, progressbar: bool = False, squeeze: bool = True, update_ts_settings: bool = False) Model | list[source]

Load models from database.

Alias for get_models().

Parameters:
  • names (str | list[str]) – names of the models to load

  • return_dict (bool, optional) – return model dictionary instead of pastas.Model (much faster for obtaining parameters, for example)

  • progressbar (bool, optional) – show progressbar, by default False

  • squeeze (bool, optional) – if True return Model instead of list of Models for single entry

  • update_ts_settings (bool, optional) – update time series settings based on time series in store. overwrites stored tmin/tmax in model.

Returns:

return pastas model, or list of models if multiple names were passed

Return type:

pastas.Model or list of pastas.Model

get_model_time_series_names(modelnames: list[str] | str | None = None, dropna: bool = True, progressbar: bool = True) DataFrame | Series[source]

Get time series names contained in model.

Parameters:
  • modelnames (list[str] | str | None, optional) – list or name of models to get time series names for, by default None which will use all modelnames

  • dropna (bool, optional) – drop stresses from table if stress is not included in any model, by default True

  • progressbar (bool, optional) – show progressbar, by default True

Returns:

structure – returns DataFrame with oseries name per model, and a flag indicating whether a stress is contained within a time series model.

Return type:

pandas.DataFrame

get_models(names: list | str, return_dict: bool = False, progressbar: bool = False, squeeze: bool = True, update_ts_settings: bool = False) Model | list[source]

Load models from database.

Parameters:
  • names (str | list[str]) – names of the models to load

  • return_dict (bool, optional) – return model dictionary instead of pastas.Model (much faster for obtaining parameters, for example)

  • progressbar (bool, optional) – show progressbar, by default False

  • squeeze (bool, optional) – if True return Model instead of list of Models for single entry

  • update_ts_settings (bool, optional) – update time series settings based on time series in store. overwrites stored tmin/tmax in model.

Returns:

return pastas model, or list of models if multiple names were passed

Return type:

pastas.Model or list of pastas.Model

get_oseries(names: list | str, return_metadata: bool = False, progressbar: bool = False, squeeze: bool = True) DataFrame | Series | dict | list | None[source]

Get oseries from database.

Parameters:
  • names (str | list[str]) – names of the oseries to load

  • return_metadata (bool, optional) – return metadata as dictionary or list of dictionaries, default is False

  • progressbar (bool, optional) – show progressbar, by default False

  • squeeze (bool, optional) – if True return DataFrame or Series instead of dictionary for single entry

Returns:

  • oseries (pandas.DataFrame or dict of DataFrames) – returns time series as DataFrame or dictionary of DataFrames if multiple names were passed

  • metadata (dict | list[dict]) – metadata for each oseries, only returned if return_metadata=True

get_stress(names: list | str, return_metadata: bool = False, progressbar: bool = False, squeeze: bool = True) DataFrame | Series | dict | list | None[source]

Get stresses from database.

Alias for get_stresses().

Parameters:
  • names (str | list[str]) – names of the stresses to load

  • return_metadata (bool, optional) – return metadata as dictionary or list of dictionaries, default is False

  • progressbar (bool, optional) – show progressbar, by default False

  • squeeze (bool, optional) – if True return DataFrame or Series instead of dictionary for single entry

Returns:

  • stresses (pandas.DataFrame or dict of DataFrames) – returns time series as DataFrame or dictionary of DataFrames if multiple names were passed

  • metadata (dict | list[dict]) – metadata for each stress, only returned if return_metadata=True

get_stresses(names: list[str] | str, return_metadata: bool = False, progressbar: bool = False, squeeze: bool = True) DataFrame | Series | dict | list | None[source]

Get stresses from database.

Parameters:
  • names (str | list[str]) – names of the stresses to load

  • return_metadata (bool, optional) – return metadata as dictionary or list of dictionaries, default is False

  • progressbar (bool, optional) – show progressbar, by default False

  • squeeze (bool, optional) – if True return DataFrame or Series instead of dictionary for single entry

Returns:

  • stresses (pandas.DataFrame or dict of DataFrames) – returns time series as DataFrame or dictionary of DataFrames if multiple names were passed

  • metadata (dict | list[dict]) – metadata for each stress, only returned if return_metadata=True

iter_models(modelnames: list[str] | None = None, return_dict: bool = False)[source]

Iterate over models in library.

Parameters:
  • modelnames (list[str] | None, optional) – list of models to iterate over, by default None which uses all models

  • return_dict (bool, optional) – if True, return model as dictionary, by default False, which returns a pastas.Model.

Yields:

pastas.Model or dict – time series model

iter_oseries(names: list[str] | None = None)[source]

Iterate over oseries in library.

Parameters:

names (list[str] | None, optional) – list of oseries names, by default None, which defaults to all stored series

Yields:

pandas.Series or pandas.DataFrame – oseries contained in library

iter_stresses(names: list[str] | None = None)[source]

Iterate over stresses in library.

Parameters:

names (list[str] | None, optional) – list of stress names, by default None, which defaults to all stored series

Yields:

pandas.Series or pandas.DataFrame – stresses contained in library

property model_names

List of model names.

Property must be overridden by subclass.

property n_models

Returns the number of models in the store.

Returns:

The number of models in the store.

Return type:

int

property n_oseries

Returns the number of oseries.

Returns:

The number of oseries names.

Return type:

int

property n_stresses

Returns the number of stresses.

Returns:

The number of stresses.

Return type:

int

name: str | None = None

property oseries: DataFrame

Dataframe with overview of oseries.

property oseries_models: dict[str, list[str]]

List of model names per oseries.

Returns:

d – dictionary with oseries names as keys and list of model names as values

Return type:

dict

property oseries_names

List of oseries names.

Property must be overridden by subclass.

property oseries_with_models

List of oseries used in models.

Property must be overridden by subclass.

parse_names(names: list[str] | str | None = None, libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'] = 'oseries') list[str][source]

Parse names argument and return list of names.

Public method that exposes name parsing functionality.

Parameters:
  • names (list | str, optional) – str or list of str or None or ‘all’ (last two options retrieve all names)

  • libname (str, optional) – name of library, default is ‘oseries’

Returns:

list of names

Return type:

list
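The parsing rules above can be sketched as a small normalisation function. This parse_names is a hypothetical stand-alone version, and the all_names argument is an assumption that replaces the lookup of stored names in the connector.

```python
def parse_names(names=None, libname="oseries", all_names=None):
    """Normalise the names argument to a list of names."""
    all_names = all_names or {}
    if names is None or names == "all":
        # None and "all" both select every name stored in the library.
        return list(all_names.get(libname, []))
    if isinstance(names, str):
        return [names]
    return list(names)


# Hypothetical library contents used as the source of "all" names.
stored = {"oseries": ["well_1", "well_2"], "stresses": ["prec", "evap"]}

everything = parse_names(None, all_names=stored)
also_everything = parse_names("all", libname="stresses", all_names=stored)
one = parse_names("well_1", all_names=stored)
some = parse_names(("well_2", "well_1"), all_names=stored)
```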

property stresses: DataFrame

Dataframe with overview of stresses.

property stresses_models: dict[str, list[str]]

List of model names per stress.

Returns:

d – dictionary with stress names as keys and list of model names as values

Return type:

dict

property stresses_names

List of stress names.

Property must be overridden by subclass.

property stresses_with_models

List of stresses used in models.

Property must be overridden by subclass.

update_metadata(libname: Literal['oseries', 'stresses'], name: str, metadata: dict) None[source]

Update metadata.

Note: also retrieves and stores time series as updating only metadata is not supported for some Connectors.

Parameters:
  • libname (str) – name of library

  • name (str) – name of the item for which to update metadata

  • metadata (dict) – metadata dictionary that will be used to update the stored metadata
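The note above implies a read-modify-write cycle. A minimal sketch, assuming items are stored as (metadata, series) tuples in a dictionary. update_metadata here is a hypothetical stand-alone function.

```python
def update_metadata(lib, name, metadata):
    """Merge new metadata into the stored metadata and write the item back."""
    stored_meta, series = lib[name]  # retrieve the series along with metadata
    merged = {**stored_meta, **metadata}  # new values win over stored ones
    lib[name] = (merged, series)  # store series and metadata together again


lib = {"well_1": ({"x": 1.0, "owner": "unknown"}, [0.5, 0.6])}
update_metadata(lib, "well_1", {"owner": "province", "z": -2.0})
```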

update_oseries(series: DataFrame | Series, name: str, metadata: dict | None = None, force: bool = False) None[source]

Update oseries values.

Parameters:
  • series (DataFrameOrSeries) – time series to update stored oseries with

  • name (str) – name of the oseries to update

  • metadata (dict | None, optional) – optionally provide metadata, which will update the stored metadata dictionary, by default None

  • force (bool, optional) – force update even if time series is used in a model, by default False

update_stress(series: DataFrame | Series, name: str, metadata: dict | None = None, force: bool = False) None[source]

Update stress values.

Note: the ‘kind’ attribute of a stress cannot be updated! To update the ‘kind’ delete and add the stress again.

Parameters:
  • series (DataFrameOrSeries) – time series to update stored stress with

  • name (str) – name of the stress to update

  • metadata (dict | None, optional) – optionally provide metadata, which will update the stored metadata dictionary, by default None

  • force (bool, optional) – force update even if time series is used in a model, by default False

upsert_oseries(series: DataFrame | Series, name: str, metadata: dict | None = None, force: bool = False) None[source]

Update or insert oseries values depending on whether it exists.

Parameters:
  • series (DataFrameOrSeries) – time series to update/insert

  • name (str) – name of the oseries

  • metadata (dict | None, optional) – optionally provide metadata, which will update the stored metadata dictionary if it exists, by default None

  • force (bool, optional) – force update even if time series is used in a model, by default False

upsert_stress(series: DataFrame | Series, name: str, kind: str, metadata: dict | None = None, force: bool = False) None[source]

Update or insert stress values depending on whether it exists.

Parameters:
  • series (DataFrameOrSeries) – time series to update/insert

  • name (str) – name of the stress

  • metadata (dict | None, optional) – optionally provide metadata, which will update the stored metadata dictionary if it exists, by default None

  • kind (str) – category to identify type of stress, this label is added to the metadata dictionary.

  • force (bool, optional) – force update even if time series is used in a model, by default False

property validation_settings: dict

Return current connector settings as dictionary.

property validator: Validator

Get the Validator instance for this connector.

class pastastore.base.ConnectorUtil[source]

Mix-in class for utility methods used by BaseConnector subclasses.

This class contains internal methods for parsing names, handling metadata, and parsing model dictionaries. It is designed to be mixed into BaseConnector subclasses and assumes the presence of certain attributes and methods from BaseConnector (e.g., oseries_names, stresses_names, get_oseries, get_stresses).

Note

This class should not be instantiated directly. It is intended to be used as a mixin with BaseConnector subclasses only.

static _meta_list_to_frame(metalist: list[dict], names: list[str]) DataFrame[source]

Convert list of metadata dictionaries to DataFrame.

Parameters:
  • metalist (list) – list of metadata dictionaries

  • names (list) – list of names corresponding to data in metalist

Returns:

DataFrame containing overview of metadata

Return type:

pandas.DataFrame

_parse_model_dict(mdict: dict, update_ts_settings: bool = False) Model[source]

Parse dictionary describing pastas models (internal method).

Parameters:
  • mdict (dict) – dictionary describing pastas.Model

  • update_ts_settings (bool, optional) – update stored tmin and tmax in time series settings based on time series loaded from store.

Returns:

ml – time series analysis model

Return type:

pastas.Model

_parse_names(names: list[str] | str | None = None, libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'] = 'oseries') list[source]

Parse names kwarg, returns iterable with name(s) (internal method).

Parameters:
  • names (list | str, optional) – str or list of str or None or ‘all’ (last two options retrieve all names)

  • libname (str, optional) – name of library, default is ‘oseries’

Returns:

list of names

Return type:

list

class pastastore.base.ModelAccessor(conn)[source]

Object for managing access to stored models.

The ModelAccessor object allows dictionary-like assignment and access to models. In addition it provides some useful utilities for working with stored models in the database.

Examples

Get a model by name:

>>> model = pstore.models["my_model"]

Store a model in the database:

>>> pstore.models["my_model_v2"] = model

Get model metadata dataframe:

>>> pstore.models.metadata

Number of models:

>>> len(pstore.models)

Random model:

>>> model = pstore.models.random()

Iterate over stored models:

>>> for ml in pstore.models:
...     ml.solve()

property metadata

Dataframe with overview of models metadata.

random()[source]

Return a random model.

Returns:

A random model object from the connection.

Return type:

pastas.Model
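The dictionary-like behaviour shown in the examples can be obtained with Python's container protocol. ToyModelAccessor below is a hypothetical, simplified stand-in backed by a plain dict, not the actual ModelAccessor implementation.

```python
import random


class ToyModelAccessor:
    """Hypothetical dict-backed stand-in showing the container protocol."""

    def __init__(self, models=None):
        self._models = dict(models or {})

    def __getitem__(self, name):
        return self._models[name]  # accessor["name"]

    def __setitem__(self, name, model):
        self._models[name] = model  # accessor["name"] = model

    def __len__(self):
        return len(self._models)  # len(accessor)

    def __iter__(self):
        # Iterating yields the stored models themselves, not their names.
        return iter(self._models.values())

    def random(self):
        """Return one stored model chosen at random."""
        return self._models[random.choice(list(self._models))]


# Plain dicts stand in for pastas models.
accessor = ToyModelAccessor({"ml_1": {"name": "ml_1"}})
accessor["ml_2"] = {"name": "ml_2"}
names = [ml["name"] for ml in accessor]
picked = accessor.random()
```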

DictConnector

class pastastore.DictConnector(name: str = 'pastas_db')[source]

Bases: BaseConnector, ParallelUtil

DictConnector object that stores time series and models in dictionaries.

_add_item(libname: str, item: DataFrame | Series | dict, name: str, metadata: dict | None = None, **_) None[source]

Add item (time series or models) (internal method).

Parameters:
  • libname (str) – name of library

  • item (DataFrameOrSeries) – pandas.Series or pandas.DataFrame containing data

  • name (str) – name of the item

  • metadata (dict, optional) – dictionary containing metadata, by default None

_del_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str, force: bool = False) None[source]

Delete items (series or models) (internal method).

Parameters:
  • libname (str) – name of library to delete item from

  • name (str) – name of item to delete

  • force (bool, optional) – if True, force delete item and do not perform check if series is used in a model, by default False

_get_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) DataFrame | Series | dict[source]

Retrieve item from database (internal method).

Parameters:
  • libname (str) – name of the library

  • name (str) – name of the item

Returns:

item – time series or model dictionary, modifying the returned object will not affect the stored data, like in a real database

Return type:

DataFrameOrSeries | dict
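The copy-on-read behaviour described above can be sketched with copy.deepcopy. Whether DictConnector uses deepcopy or another copying mechanism is an assumption, and get_item is a hypothetical helper.

```python
import copy


def get_item(lib, name):
    """Return a deep copy so callers cannot mutate the stored object."""
    return copy.deepcopy(lib[name])


lib = {"ml_1": {"name": "ml_1", "parameters": {"A": 1.0}}}

fetched = get_item(lib, "ml_1")
fetched["parameters"]["A"] = 99.0  # modify the returned copy only
```

A deep copy is needed here because a shallow copy would still share the nested parameters dictionary with the stored object.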

_get_library(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'])[source]

Get reference to dictionary holding data.

Parameters:

libname (str) – name of the library

Returns:

lib – library handle

Return type:

dict

_get_metadata(libname: Literal['oseries', 'stresses'], name: str) dict[source]

Read metadata (internal method).

Parameters:
  • libname (str) – name of the library the series are in (“oseries” or “stresses”)

  • name (str) – name of item to load metadata for

Returns:

imeta – dictionary containing metadata

Return type:

dict

_item_exists(libname: str, name: str) bool[source]

Check if item exists without scanning directory.

_list_symbols(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) list[str][source]

List symbols in a library (internal method).

Parameters:

libname (str) – name of the library

Returns:

list of symbols in the library

Return type:

list

_parallel(*args, **kwargs) None[source]

Parallel implementation method.

Raises:

NotImplementedError – DictConnector uses in-memory storage that cannot be shared across processes. Use PasConnector or ArcticDBConnector for parallel operations.

PasConnector

class pastastore.PasConnector(name: str, path: str, verbose: bool = True)[source]

Bases: BaseConnector, ParallelUtil

PasConnector object that stores time series and models as JSON files on disk.

_add_item(libname: str, item: DataFrame | Series | dict, name: str, metadata: dict | None = None, **_) None[source]

Add item (time series or models) (internal method).

Parameters:
  • libname (str) – name of library

  • item (DataFrameOrSeries) – pandas.Series or pandas.DataFrame containing data

  • name (str) – name of the item

  • metadata (dict, optional) – dictionary containing metadata, by default None

_del_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str, force: bool = False) None[source]

Delete items (series or models) (internal method).

Parameters:
  • libname (str) – name of library to delete item from

  • name (str) – name of item to delete

  • force (bool, optional) – if True, force delete item and do not perform check if series is used in a model, by default False

_get_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) DataFrame | Series | dict[source]

Retrieve item (internal method).

Parameters:
  • libname (str) – name of the library

  • name (str) – name of the item

Returns:

item – time series or model dictionary

Return type:

DataFrameOrSeries | dict

_get_library(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) Path[source]

Get path to directory holding data.

Parameters:

libname (str) – name of the library

Returns:

lib – path to library

Return type:

Path

_get_metadata(libname: Literal['oseries', 'stresses'], name: str) dict[source]

Read metadata (internal method).

Parameters:
  • libname (str) – name of the library the series are in (“oseries” or “stresses”)

  • name (str) – name of item to load metadata for

Returns:

imeta – dictionary containing metadata

Return type:

dict

_initialize(verbose: bool = True) None[source]

Initialize the libraries (internal method).

_item_exists(libname: str, name: str) bool[source]

Check if item exists without scanning directory.

_list_symbols(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) list[str][source]

List symbols in a library (internal method).

Parameters:

libname (str) – name of the library

Returns:

list of symbols in the library

Return type:

list

_parallel(func: Callable, names: list[str], kwargs: dict | None = None, progressbar: bool | None = True, max_workers: int | None = None, chunksize: int | None = None, desc: str = '', initializer: Callable = None, initargs: tuple | None = None)[source]

Parallel processing of function.

Does not return results, so function must store results in database.

Warning

When progressbar=True, tasks are dispatched with submit() + as_completed(), so results are returned in completion order, not submission order. When progressbar=False, executor.map() is used and order is preserved. If your caller needs results aligned to names, sort the returned list by name after the call.

Parameters:
  • func (function) – function to apply in parallel

  • names (list) – list of names to apply function to

  • kwargs (dict, optional) – keyword arguments to pass to function

  • progressbar (bool, optional) – show progressbar, by default True

  • max_workers (int, optional) – maximum number of workers, by default None

  • chunksize (int, optional) – chunksize for parallel processing, by default None

  • desc (str, optional) – description for progressbar, by default “”

  • initializer (Callable, optional) – function to initialize each worker process, by default None

  • initargs (tuple, optional) – arguments to pass to initializer function, by default None
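
The ordering caveat in the warning above can be reproduced with the standard library alone. The sketch below is not pastastore's implementation: it uses ThreadPoolExecutor so that it runs anywhere, and work and names are hypothetical stand-ins. Keeping a future-to-name mapping is one possible way to realign results that were collected in completion order.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def work(name: str) -> str:
    """Hypothetical task: return a value derived from the name."""
    return name.upper()


names = ["well_c", "well_a", "well_b"]

with ThreadPoolExecutor(max_workers=2) as executor:
    # executor.map() yields results in submission order.
    ordered = list(executor.map(work, names))

    # submit() + as_completed() yields futures in completion order,
    # so remember which name each future belongs to.
    futures = {executor.submit(work, name): name for name in names}
    by_name = {futures[fut]: fut.result() for fut in as_completed(futures)}

# Realign the completion-ordered results with the input names.
realigned = [by_name[name] for name in names]
```

Returning (name, result) pairs from the task itself is an equally valid alternative when the caller cannot keep the futures mapping.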

_write_pstore_config_file()[source]

Write pstore configuration file to store database info.

ArcticDBConnector

class pastastore.ArcticDBConnector(name: str, uri: str, verbose: bool = True, worker_process: bool = False)[source]

Bases: BaseConnector, ParallelUtil

ArcticDBConnector object using ArcticDB to store data.

_abc_impl = <_abc._abc_data object>
_add_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], item: DataFrame | Series | dict, name: str, metadata: dict | None = None, **_) None[source]

Add item to library (time series or model) (internal method).

Parameters:
  • libname (str) – name of the library

  • item (DataFrameOrSeries | dict) – item to add, either time series or pastas.Model as dictionary

  • name (str) – name of the item

  • metadata (dict | None, optional) – dictionary containing metadata, by default None

_conn_type: str | None = 'arcticdb'
_del_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str, force: bool = False) None[source]

Delete items (series or models) (internal method).

Parameters:
  • libname (str) – name of library to delete item from

  • name (str) – name of item to delete

  • force (bool, optional) – force deletion even if series is used in models, by default False

_get_item(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) DataFrame | Series | dict[source]

Retrieve item from library (internal method).

Parameters:
  • libname (str) – name of the library

  • name (str) – name of the item

Returns:

item – time series or model dictionary

Return type:

DataFrameOrSeries | dict

_get_library(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'])[source]

Get ArcticDB library handle.

Parameters:

libname (str) – name of the library

Returns:

lib – handle to the library

Return type:

arcticdb.Library handle

_get_metadata(libname: Literal['oseries', 'stresses'], name: str) dict[source]

Retrieve metadata for an item (internal method).

Parameters:
  • libname (str) – name of the library

  • name (str) – name of the item

Returns:

dictionary containing metadata

Return type:

dict

_initialize(verbose: bool = True) None[source]

Initialize the libraries (internal method).

_item_exists(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models'], name: str) bool[source]

Check if item exists in library (internal method).

_library_name(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) str[source]

Get full library name according to ArcticDB (internal method).

_list_symbols(libname: Literal['oseries', 'stresses', 'models', 'oseries_models', 'stresses_models']) list[str][source]

List symbols in a library (internal method).

Parameters:

libname (str) – name of the library

Returns:

list of symbols in the library

Return type:

list

_parallel(func: Callable, names: list[str], kwargs: dict | None = None, progressbar: bool | None = True, max_workers: int | None = None, chunksize: int | None = None, desc: str = '', initializer: Callable | None = None, initargs: tuple | None = None)[source]

Parallel processing of function.

Warning

When progressbar=True, tasks are dispatched with submit() + as_completed(), so results are returned in completion order, not submission order. When progressbar=False, executor.map() is used and order is preserved. If your caller needs results aligned to names, sort the returned list by name after the call.

Note

ArcticDB connection objects cannot be pickled, which is required for multiprocessing. This implementation uses an initializer function that creates a new ArcticDBConnector instance in each worker process and stores it in the global conn variable. User-provided functions can access this connector via the global conn variable.

This is the standard Python multiprocessing pattern for unpicklable objects. See: https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor

For a connector that supports direct method passing (no global variable required), use PasConnector instead.

Parameters:
  • func (function) – function to apply in parallel

  • names (list) – list of names to apply function to

  • kwargs (dict, optional) – keyword arguments to pass to function

  • progressbar (bool, optional) – show progressbar, by default True

  • max_workers (int, optional) – maximum number of workers, by default None

  • chunksize (int, optional) – chunksize for parallel processing, by default None

  • desc (str, optional) – description for progressbar, by default “”

  • initializer (Callable, optional) – function to initialize each worker process, by default None

  • initargs (tuple, optional) – arguments to pass to initializer function, by default None
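
The initializer pattern described in the note above can be sketched without ArcticDB. This is an illustration under stated assumptions, not the connector's code: FakeConnection, init_worker, task and the URI string are hypothetical, and threads with thread-local storage are used so the example runs anywhere. With ProcessPoolExecutor, a module-level global such as conn plays the role of the thread-local object.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Per-worker storage; stands in for the global `conn` of a worker process.
_worker = threading.local()


class FakeConnection:
    """Hypothetical stand-in for a connection that cannot be shared."""

    def __init__(self, uri: str):
        self.uri = uri

    def read(self, name: str) -> str:
        return f"{self.uri}/{name}"


def init_worker(uri: str) -> None:
    # Runs once in every worker before it picks up tasks.
    _worker.conn = FakeConnection(uri)


def task(name: str) -> str:
    # The task uses the worker's own connection instead of receiving one.
    return _worker.conn.read(name)


names = ["oseries1", "oseries2", "oseries3"]
with ThreadPoolExecutor(
    max_workers=2, initializer=init_worker, initargs=("lmdb://demo",)
) as executor:
    results = list(executor.map(task, names))
```

Only the picklable arguments (here the URI string) cross the worker boundary; the connection object itself is created inside each worker.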

write_pstore_config_file(path: str = None) None[source]

Write pstore configuration file to store database info.

PastaStore

Module containing the PastaStore object for managing time series and models.

class pastastore.store.PastaStore(connector: BaseConnector | None = None, name: str | None = None)[source]

PastaStore object for managing pastas time series and models.

Requires a Connector object to provide the interface to the database. Different Connectors are available, e.g.:

  • PasConnector for storing all data as .pas (JSON) files on disk (recommended)

  • ArcticDBConnector for saving data on disk using arcticdb package

  • DictConnector for storing all data in dictionaries (in-memory)

Parameters:
  • connector (Connector object) – object that provides the interface to the database, e.g. ArcticDBConnector (see pastastore.connectors)

  • name (str, optional) – name of the PastaStore, by default takes the name of the Connector object

add_recharge(ml: Model, rfunc=None, recharge=None, recharge_name: str = 'recharge') None[source]

Add recharge to a pastas model.

Uses closest precipitation and evaporation time series in database. These are assumed to be labeled with kind = ‘prec’ or ‘evap’.

Parameters:
  • ml (pastas.Model) – pastas.Model object

  • rfunc (pastas.rfunc, optional) – response function to use for recharge in model, by default None which uses ps.Exponential() (for different response functions, see pastas documentation)

  • recharge (ps.RechargeModel) – recharge model to use, default is ps.rch.Linear()

  • recharge_name (str) – name of the RechargeModel

add_stressmodel(ml: ~pastas.model.Model | str, stresses: str | list[str] | dict[str, str], stressmodel=<class 'pastas.stressmodels.StressModel'>, stressmodel_name: str | None = None, rfunc=<class 'pastas.rfunc.Exponential'>, rfunc_kwargs: dict | None = None, kind: list[str] | str | None = None, **kwargs)[source]

Add a pastas StressModel from stresses time series in Pastastore.

Supports “nearest” selection. Any stress name can be replaced by “nearest [<n>] <kind>”, where <n> is optional and represents the number of nearest stresses, and <kind> represents the kind of stress to consider. <kind> can also be specified directly with the kind kwarg.

Note: the ‘nearest’ option requires the oseries name to be provided. Additionally, ‘x’ and ‘y’ metadata must be stored for oseries and stresses.

Parameters:
  • ml (pastas.Model or str) – pastas.Model object to add StressModel to, if passed as string, model is loaded from store, the stressmodel is added and then written back to the store.

  • stresses (str | list[str] | dict) –

    name(s) of the time series to use for the stressmodel, or dictionary with key(s) and value(s) as time series name(s). Options include:

    • name of stress: “prec_stn”

    • list of stress names: [“prec_stn”, “evap_stn”]

    • dict for RechargeModel: {“prec”: “prec_stn”, “evap”: “evap_stn”}

    • dict for StressModel: {“stress”: “well1”}

    • nearest, specifying kind: “nearest well”

    • nearest specifying number and kind: “nearest 2 well”

  • stressmodel (str or class) – stressmodel class to use, by default ps.StressModel

  • stressmodel_name (str, optional) – name of the stressmodel, by default None, which uses the stress name, if there is 1 stress otherwise the name of the stressmodel type. For RechargeModels, the name defaults to ‘recharge’.

  • rfunc (str or class) – response function class to use, by default ps.Exponential

  • rfunc_kwargs (dict, optional) – keyword arguments to pass to the response function, by default None

  • kind (str | list[str], optional) – specify kind of stress(es) to use, by default None, useful in combination with ‘nearest’ option for defining stresses

  • **kwargs – additional keyword arguments to pass to the stressmodel
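
The “nearest [<n>] <kind>” string form described above can be illustrated with a small parser. This is a minimal sketch, not the parsing used by pastastore: parse_nearest is a hypothetical helper, it only covers the two documented string forms, and the default of one stress when <n> is omitted is an assumption.

```python
import re

# Pattern for "nearest [<n>] <kind>"; <n> is optional.
_NEAREST = re.compile(r"^nearest(?:\s+(\d+))?\s+(\S+)$")


def parse_nearest(stress: str):
    """Return (n, kind) for a "nearest" string, or None for a plain name."""
    match = _NEAREST.match(stress.strip())
    if match is None:
        return None
    n = int(match.group(1)) if match.group(1) else 1  # assumed default: 1
    return n, match.group(2)


plain = parse_nearest("prec_stn")
one_well = parse_nearest("nearest well")
two_wells = parse_nearest("nearest 2 well")
```

A plain stress name yields None, so a caller can fall back to looking the name up directly.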

apply(libname: Literal['oseries', 'stresses', 'models'], func: Callable, names: list[str] | str | None = None, kwargs: dict | None = None, progressbar: bool = True, parallel: bool = False, max_workers: int | None = None, fancy_output: bool = True, initializer: Callable | None = None, initargs: tuple | None = None) dict | Series | DataFrame | Any[source]

Apply function to items in library.

Supported libraries are oseries, stresses, and models.

Parameters:
  • libname (str) – library name, supports “oseries”, “stresses” and “models”

  • func (callable) – function that accepts a string corresponding to the name of an item in the library as its first argument. Additional keyword arguments can be specified. The function can return any result, or update an item in the database without returning anything.

  • names (str | list[str], optional) – apply function to these names, by default None which loops over all stored items in library

  • kwargs (dict, optional) – keyword arguments to pass to func, by default None

  • progressbar (bool, optional) – show progressbar, by default True

  • parallel (bool, optional) – run apply in parallel, default is False.

  • max_workers (int, optional) – max no. of workers, only used if parallel is True

  • fancy_output (bool, optional) – if True, try returning result as pandas Series or DataFrame, by default True

  • initializer (Callable, optional) – function to initialize each worker process, only used if parallel is True

  • initargs (tuple, optional) – arguments to pass to initializer, only used if parallel is True

Returns:

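dict of results of func, with names as keys and results as values. If fancy_output is True, the results are returned as a pandas Series or DataFrame where possible.

Return type:

dict | pandas.Series | pandas.DataFrame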

Notes

Users should be aware that parallel solving is platform dependent and may not always work. The current implementation works well for Linux users. For Windows users, parallel solving does not work when called directly from Jupyter Notebooks or IPython. To use parallel solving on Windows, the following code should be used in a Python file:

from multiprocessing import freeze_support

if __name__ == "__main__":
    freeze_support()
    pstore.apply("models", some_func, parallel=True)

check_models(checklist=None, modelnames=None, style_output: bool = False)[source]

Check models against checklist.

Parameters:
  • checklist (dict, optional) –

    dictionary containing model check methods, by default None which uses the ps.checks.checks_brakenhoff_2022 checklist. This includes:

    • fit metric R² >= 0.6

    • runs test for autocorrelation

    • t95 response < half length calibration period

    • |model parameters| > 1.96 * σ (std deviation)

    • model parameters are not on bounds

  • modelnames (list[str], optional) – list of modelnames to perform checks on, by default None

  • style_output (bool, optional) – if True, return styled dataframe with pass/fail colors, by default False

Returns:

DataFrame containing pass True/False for each check for each model

Return type:

pd.DataFrame

create_model(name: str, modelname: str | None = None, add_recharge: bool = True, add_ar_noisemodel: bool = False, recharge_name: str = 'recharge') Model[source]

Create a pastas Model.

Parameters:
  • name (str) – name of the oseries to create a model for

  • modelname (str, optional) – name of the model, default is None, which uses oseries name

  • add_recharge (bool, optional) – add recharge to the model by looking for the closest precipitation and evaporation time series in the stresses library, by default True

  • add_ar_noisemodel (bool, optional) – add AR(1) noise model to the model, by default False

  • recharge_name (str) – name of the RechargeModel

Returns:

model for the oseries

Return type:

pastas.Model

Raises:
  • KeyError – if data is stored as dataframe and no column is provided

  • ValueError – if time series is empty

create_models_bulk(oseries: list[str] | str | None = None, add_recharge: bool = True, solve: bool = False, store_models: bool = True, ignore_errors: bool = False, suffix: str | None = None, progressbar: bool = True, **kwargs) tuple[dict, dict] | dict[source]

Bulk creation of pastas models.

Parameters:
  • oseries (list[str], optional) – names of oseries to create models for, by default None, which creates models for all oseries

  • add_recharge (bool, optional) – add recharge to the models based on closest precipitation and evaporation time series, by default True

  • solve (bool, optional) – solve the model, by default False

  • store_models (bool, optional) – store the models in the database, by default True. If False, the models are returned as a dictionary instead.

  • ignore_errors (bool, optional) – ignore errors while creating models, by default False

  • suffix (str, optional) – add suffix to oseries name to create model name, by default None

  • progressbar (bool, optional) – show progressbar, by default True

Returns:

  • models (dict, if store_models is False) – dictionary of models

  • errors (dict, always returned) – model names that could not be created
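
The interplay of suffix and ignore_errors can be sketched as a plain loop. This is an illustration only, not the method's implementation: build_all and fake_build are hypothetical, and appending the suffix directly to the oseries name is an assumed naming rule.

```python
def build_all(names, build, suffix=None, ignore_errors=False):
    """Build one model per name; collect failures instead of raising if asked."""
    models, errors = {}, {}
    for name in names:
        # Assumed naming rule: the suffix is appended directly to the oseries name.
        modelname = name if suffix is None else f"{name}{suffix}"
        try:
            models[modelname] = build(name)
        except Exception as exc:
            if not ignore_errors:
                raise
            errors[modelname] = str(exc)
    return models, errors


def fake_build(name):
    """Hypothetical builder that fails for an empty series."""
    if name == "empty":
        raise ValueError("time series is empty")
    return {"oseries": name}


models, errors = build_all(
    ["obs1", "empty"], fake_build, suffix="_v1", ignore_errors=True
)
```

With ignore_errors=False the first failure would propagate and stop the loop.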

property empty: bool

Check if the PastaStore is empty.

export_model_series_to_csv(names: list[str] | str | None = None, exportdir: Path | str = '.', exportmeta: bool = True)[source]

Export model time series to csv files.

Parameters:
  • names (list[str] | str | None, optional) – names of models to export, by default None, which retrieves all models from database

  • exportdir (str, optional) – directory to export csv files to, default is current directory

  • exportmeta (bool, optional) – export metadata for all time series as csv file, default is True

classmethod from_pastastore_config_file(fname, update_path: bool = True)[source]

Create a PastaStore from a pastastore config file.

Parameters:
  • fname (str) – path to the pastastore config file

  • update_path (bool, optional) – when True, use path derived from location of the config file instead of the stored path in the config file. If a PastaStore is moved, the path in the config file will probably still refer to the old location. Set to False to read the file from the path listed in the config file. In that case config files do not need to be stored within the correct directory.

Returns:

PastaStore

Return type:

PastaStore

classmethod from_zip(fname: str, conn: BaseConnector | None = None, storename: str | None = None, progressbar: bool = True, series_ext_json: bool = False)[source]

Load PastaStore from zipfile.

Parameters:
  • fname (str) – pathname of zipfile

  • conn (Connector object, optional) – connector for storing loaded data, default is None which creates a DictConnector. This Connector does not store data on disk.

  • storename (str, optional) – name of the PastaStore, by default None, which defaults to the name of the Connector.

  • progressbar (bool, optional) – show progressbar, by default True

  • series_ext_json (bool, optional) – if True, series are expected to have a .json extension, by default False, which assumes a .pas extension. Set this option to True for reading zipfiles created with pastastore versions older than 1.8.0.

Returns:

PastaStore containing data from zipfile

Return type:

pastastore.PastaStore

get_distances(oseries: list[str] | str | None = None, stresses: list[str] | str | None = None, kind: list[str] | str | None = None) DataFrame | Series[source]

Get the distances in meters between the oseries and stresses.

Parameters:
  • oseries (str | list[str]) – name(s) of the oseries

  • stresses (str | list[str]) – name(s) of the stresses

  • kind (str | list[str]) – string or list of strings representing which kind(s) of stresses to consider

Returns:

distances – Pandas DataFrame with the distances between the oseries (index) and the stresses (columns).

Return type:

pandas.DataFrame
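
A distance table of this shape can be sketched with the standard library. The example assumes straight-line (Euclidean) distances between ‘x’ and ‘y’ metadata in a projected coordinate system with units of meters; the names and coordinates are hypothetical.

```python
import math

# Hypothetical (x, y) metadata in a projected coordinate system, in meters.
oseries = {"obs1": (0.0, 0.0), "obs2": (100.0, 0.0)}
stresses = {"prec1": (3.0, 4.0), "evap1": (100.0, 50.0)}

# Rows are oseries, columns are stresses, as in the returned DataFrame.
distances = {
    o: {s: math.hypot(sx - ox, sy - oy) for s, (sx, sy) in stresses.items()}
    for o, (ox, oy) in oseries.items()
}
```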

get_extent(libname, names=None, buffer=0.0)[source]

Get extent [xmin, xmax, ymin, ymax] from library.

Parameters:
  • libname (str) – name of the library containing the time series (‘oseries’, ‘stresses’, ‘models’)

  • names (str | list[str], optional) – list of names to include for computing the extent

  • buffer (float, optional) – add this distance to the extent, by default 0.0

Returns:

extent – extent [xmin, xmax, ymin, ymax]

Return type:

list
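
The [xmin, xmax, ymin, ymax] ordering and the effect of buffer can be illustrated as follows. The helper extent_from_points is hypothetical, and applying the buffer to all four sides is an assumption about how the extent is grown.

```python
def extent_from_points(points, buffer=0.0):
    """Return [xmin, xmax, ymin, ymax] around (x, y) points, grown by buffer."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [min(xs) - buffer, max(xs) + buffer, min(ys) - buffer, max(ys) + buffer]


points = [(10.0, 5.0), (30.0, 25.0), (20.0, 15.0)]
extent = extent_from_points(points, buffer=2.0)
```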

get_nearest_oseries(names: list[str] | str | None = None, n: int = 1, maxdist: float | None = None) DataFrame | Series[source]

Get the nearest (n) oseries.

Parameters:
  • names (str | list[str]) – string or list of strings with the name(s) of the oseries

  • n (int) – number of oseries to obtain

  • maxdist (float, optional) – maximum distance to consider

Returns:

oseries – names of the nearest oseries

Return type:

pandas.DataFrame or pandas.Series

get_nearest_stresses(oseries: list[str] | str | None = None, stresses: list[str] | str | None = None, kind: list[str] | str | None = None, n: int = 1, maxdist: float | None = None) DataFrame | Series[source]

Get the nearest (n) stresses of a specific kind.

Parameters:
  • oseries (str | list[str]) – name(s) of the oseries

  • stresses (str | list[str]) – name(s) of the stresses to consider

  • kind (str | list[str], optional) – string or list of str with the name of the kind(s) of stresses to consider

  • n (int) – number of stresses to obtain

  • maxdist (float, optional) – maximum distance to consider

Returns:

stresses – names of the nearest stresses

Return type:

pandas.DataFrame or pandas.Series
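
How n and maxdist interact can be sketched with a small ranking function. The helper nearest is hypothetical, distances are assumed to be Euclidean, and treating maxdist as an inclusive limit is an assumption.

```python
import math


def nearest(origin, candidates, n=1, maxdist=None):
    """Return up to n candidate names sorted by distance to origin."""
    ox, oy = origin
    ranked = sorted(
        (math.hypot(x - ox, y - oy), name) for name, (x, y) in candidates.items()
    )
    if maxdist is not None:
        # Drop candidates farther away than maxdist (assumed inclusive).
        ranked = [(d, name) for d, name in ranked if d <= maxdist]
    return [name for _, name in ranked[:n]]


wells = {"well1": (3.0, 4.0), "well2": (30.0, 40.0), "well3": (6.0, 8.0)}
two_nearest = nearest((0.0, 0.0), wells, n=2)
within_20 = nearest((0.0, 0.0), wells, n=3, maxdist=20.0)
```

Fewer than n names come back when maxdist excludes some candidates, as in the second call.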

get_oseries_distances(names: list[str] | str | None = None) DataFrame | Series[source]

Get the distances in meters between the oseries.

Parameters:

names (str | list[str]) – names of the oseries to calculate distances between

Returns:

distances – Pandas DataFrame with the distances between the oseries

Return type:

pandas.DataFrame

get_parameters(parameters: list[str] | None = None, modelnames: list[str] | None = None, param_value: str | None = 'optimal', progressbar: bool | None = False, ignore_errors: bool | None = True) DataFrame | Series[source]

Get model parameters.

NaN-values are returned when the parameters are not present in the model or the model is not optimized.

Parameters:
  • parameters (list[str], optional) – names of the parameters, by default None which uses all parameters from each model

  • modelnames (str | list[str], optional) – name(s) of model(s), by default None in which case all models are used

  • param_value (str, optional) – which column to use from the model parameters dataframe, by default “optimal” which retrieves the optimized parameters.

  • progressbar (bool, optional) – show progressbar, default is False

  • ignore_errors (bool, optional) – ignore errors when True, i.e. when non-existent model is encountered in modelnames, by default True

Returns:

p – DataFrame containing the parameters (columns) per model (rows)

Return type:

pandas.DataFrame

get_signatures(names: list[str] | str | None = None, signatures: list[str] | None = None, libname: Literal['oseries', 'stresses'] = 'oseries', progressbar: bool = False, ignore_errors: bool = False) DataFrame | Series[source]

Get groundwater signatures.

NaN-values are returned when the signature cannot be computed.

Parameters:
  • names (str | list[str], optional) – names of the time series, by default None which uses all the time series in the library

  • signatures (list[str], optional) – list of groundwater signatures to compute, if None all groundwater signatures in ps.stats.signatures.__all__ are used, by default None

  • libname (str) – name of the library containing the time series (‘oseries’ or ‘stresses’), by default “oseries”

  • progressbar (bool, optional) – show progressbar, by default False

  • ignore_errors (bool, optional) – ignore errors when True, i.e. when non-existent timeseries is encountered in names, by default False

Returns:

signatures_df – DataFrame containing the time series (columns) and the signatures (index).

Return type:

pandas.DataFrame or pandas.Series

Note

Names is set as the first argument to allow parallelization.

get_statistics(statistics: str | list[str], modelnames: list[str] | None = None, parallel: bool = False, progressbar: bool = False, ignore_errors: bool = False, fancy_output: bool = True, **kwargs) DataFrame | Series[source]

Get model statistics.

Parameters:
  • statistics (str | list[str]) – statistic or list of statistics to calculate, e.g. [“evp”, “rsq”, “rmse”], for a full list see pastas.modelstats.Statistics.ops.

  • modelnames (list[str], optional) – modelnames to calculate statistics for, by default None, which uses all models in the store

  • progressbar (bool, optional) – show progressbar, by default False

  • ignore_errors (bool, optional) – ignore errors when True, i.e. when trying to calculate statistics for non-existent model in modelnames, default is False

  • parallel (bool, optional) – use parallel processing, by default False

  • fancy_output (bool, optional) – only read if parallel=True, if True, return as DataFrame with statistics, otherwise return list of results

  • **kwargs – any arguments that can be passed to the methods for calculating statistics

Returns:

s – the requested statistics for each model

Return type:

pandas.DataFrame

get_stressmodel(stresses: str | list[str] | dict[str, str], stressmodel=<class 'pastas.stressmodels.StressModel'>, stressmodel_name: str | None = None, rfunc=<class 'pastas.rfunc.Exponential'>, rfunc_kwargs: dict | None = None, kind: list[str] | str | None = None, oseries: str | None = None, **kwargs)[source]

Get a Pastas stressmodel from stresses time series in Pastastore.

Supports “nearest” selection. Any stress name can be replaced by “nearest [<n>] <kind>”, where <n> is optional and represents the number of nearest stresses, and <kind> represents the kind of stress to consider. <kind> can also be specified directly with the kind kwarg.

Note: the ‘nearest’ option requires the oseries name to be provided. Additionally, ‘x’ and ‘y’ metadata must be stored for oseries and stresses.

Parameters:
  • stresses (str | list[str] | dict) –

    name(s) of the time series to use for the stressmodel, or dictionary with key(s) and value(s) as time series name(s). Options include:

    • name of stress: “prec_stn”

    • list of stress names: [“prec_stn”, “evap_stn”]

    • dict for RechargeModel: {“prec”: “prec_stn”, “evap”: “evap_stn”}

    • dict for StressModel: {“stress”: “well1”}

    • nearest, specifying kind: “nearest well”

    • nearest specifying number and kind: “nearest 2 well”

  • stressmodel (str or class) – stressmodel class to use, by default ps.StressModel

  • stressmodel_name (str, optional) – name of the stressmodel, by default None, which uses the stress name, if there is 1 stress otherwise the name of the stressmodel type. For RechargeModels, the name defaults to ‘recharge’.

  • rfunc (str or class) – response function class to use, by default ps.Exponential

  • rfunc_kwargs (dict, optional) – keyword arguments to pass to the response function, by default None

  • kind (str | list[str], optional) – specify kind of stress(es) to use, by default None, useful in combination with ‘nearest’ option for defining stresses

  • oseries (str, optional) – name of the oseries to use for the stressmodel, by default None, used when ‘nearest’ option is used for defining stresses.

  • **kwargs – additional keyword arguments to pass to the stressmodel

Returns:

stressmodel – pastas StressModel that can be added to pastas Model.

Return type:

pastas.StressModel

get_tmin_tmax(libname: Literal['oseries', 'stresses', 'models'] | None = None, names: str | list[str] | None = None, progressbar: bool = False) DataFrame[source]

Get tmin and tmax for time series and/or models.

Parameters:
  • libname (str, optional) – name of the library containing the time series (‘oseries’, ‘stresses’, ‘models’, or None), by default None which returns tmin/tmax for all libraries

  • names (str | list[str], optional) – names of the time series, by default None which uses all the time series in the library

  • progressbar (bool, optional) – show progressbar, by default False

Returns:

tmintmax – DataFrame containing tmin and tmax per time series and/or model

Return type:

pandas.DataFrame

property model_names

Return list of model names.

Returns:

list of model names

Return type:

list

property models

Return the ModelAccessor object.

The ModelAccessor object allows dictionary-like assignment and access to models. In addition it provides some useful utilities for working with stored models in the database.

Examples

Get a model by name:

>>> model = pstore.models["my_model"]

Store a model in the database:

>>> pstore.models["my_model_v2"] = model

Get model metadata dataframe:

>>> pstore.models.metadata

Number of models:

>>> len(pstore.models)

Random model:

>>> model = pstore.models.random()

Iterate over stored models:

>>> for ml in pstore.models:
...     ml.solve()

Returns:

ModelAccessor object

Return type:

ModelAccessor

property n_models

Return number of models.

Returns:

number of models

Return type:

int

property n_oseries

Return number of oseries.

Returns:

number of oseries

Return type:

int

property n_stresses

Return number of stresses.

Returns:

number of stresses

Return type:

int

property oseries

Return the oseries metadata as dataframe.

Returns:

oseries metadata as dataframe

Return type:

pandas.DataFrame

property oseries_models

Return dictionary of models per oseries.

Returns:

dictionary containing list of models (values) for each oseries (keys).

Return type:

dict

property oseries_names

Return list of oseries names.

Returns:

list of oseries names

Return type:

list

property oseries_with_models

Return list of oseries for which models are contained in the database.

Returns:

list of oseries names for which models are contained in the database.

Return type:

list

search(s: list | str | None = None, libname: Literal['oseries', 'stresses', 'models'] | None = None, case_sensitive: bool = True, sort=True)[source]

Search for names of time series or models containing string s.

Parameters:
  • libname (str) – name of the library to search in

  • s (str | list) – find names containing this string, or the strings in this list

  • case_sensitive (bool, optional) – whether search should be case sensitive, by default True

  • sort (bool, optional) – sort list of names

Returns:

matches – list of names that match search result

Return type:

list
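
The substring matching described above can be sketched in a few lines. The function search_names is a hypothetical stand-in, and treating a list as "match any of the strings" is an assumption.

```python
def search_names(names, s, case_sensitive=True, sort=True):
    """Return names containing s (or any string in s, if s is a list)."""
    terms = [s] if isinstance(s, str) else list(s)
    if not case_sensitive:
        terms = [t.lower() for t in terms]
    matches = [
        name
        for name in names
        if any(t in (name if case_sensitive else name.lower()) for t in terms)
    ]
    return sorted(matches) if sort else matches


names = ["Well_B", "prec_1", "well_a"]
exact_case = search_names(names, "well")
any_case = search_names(names, "well", case_sensitive=False)
several = search_names(names, ["prec", "Well"])
```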

solve_models(modelnames: list[str] | str | None = None, report: bool = False, ignore_solve_errors: bool = False, progressbar: bool = True, parallel: bool = False, max_workers: int | None = None, **kwargs) None[source]

Solves the models in the store.

Parameters:
  • modelnames (list[str], optional) – list of model names, if None all models in the pastastore are solved.

  • report (boolean, optional) – determines if a report is printed when the model is solved, default is False

  • ignore_solve_errors (boolean, optional) – if True, errors emerging from the solve method are ignored, default is False which will raise an exception when a model cannot be optimized

  • progressbar (bool, optional) – show progressbar, default is True.

  • parallel (bool, optional) – if True, solve models in parallel using ProcessPoolExecutor

  • max_workers (int, optional) – maximum number of workers to use in parallel solving, default is None which will use the number of cores available on the machine

  • **kwargs (dictionary) – arguments are passed to the solve method.

Notes

Users should be aware that parallel solving is platform dependent and may not always work. The current implementation works well for Linux users. For Windows users, parallel solving does not work when called directly from Jupyter Notebooks or IPython. To use parallel solving on Windows, the following code should be used in a Python file:

from multiprocessing import freeze_support

if __name__ == "__main__":
    freeze_support()
    pstore.solve_models(parallel=True)

property stresses

Return the stresses metadata as dataframe.

Returns:

stresses metadata as dataframe

Return type:

pandas.DataFrame

property stresses_models

Return dictionary of models per stress.

Returns:

dictionary containing list of models (values) for each stress (keys).

Return type:

dict

property stresses_names

Return list of stresses names.

Returns:

list of stresses names

Return type:

list

property stresses_with_models

Return list of stresses for which models are contained in the database.

Returns:

list of stress names for which models are contained in the database.

Return type:

list

to_zip(fname: str | Path, overwrite=False, progressbar: bool = True)[source]

Write data to zipfile.

Parameters:
  • fname (str | Path) – name of zipfile

  • overwrite (bool, optional) – if True, overwrite existing file

  • progressbar (bool, optional) – show progressbar, by default True

within(extent: list, names: list[str] | None = None, libname: Literal['oseries', 'stresses', 'models'] = 'oseries')[source]

Get names of items within extent.

Parameters:
  • extent (list) – list with [xmin, xmax, ymin, ymax]

  • names (str | list[str], optional) – list of names to include, by default None

  • libname (str, optional) – name of library, must be one of (‘oseries’, ‘stresses’, ‘models’), by default “oseries”

Returns:

list of items within extent

Return type:

list
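
The extent filter can be sketched as a bounding-box test on item coordinates. The helper `names_within`, the plain-dict coordinates, and the inclusive boundaries are assumptions for illustration; the store reads coordinates from item metadata.

```python
def names_within(extent, locations, names=None):
    """Return names whose (x, y) lies inside [xmin, xmax, ymin, ymax]."""
    xmin, xmax, ymin, ymax = extent
    # optionally restrict the search to a subset of names
    candidates = locations if names is None else {n: locations[n] for n in names}
    return [
        name
        for name, (x, y) in candidates.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    ]


# hypothetical coordinates in a projected CRS
locations = {"obs1": (100.0, 200.0), "obs2": (150.0, 450.0), "obs3": (900.0, 210.0)}
inside = names_within([0.0, 500.0, 0.0, 300.0], locations)
```

Note the order of the extent: both x-limits come first, followed by both y-limits.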

Plots

class pastastore.plotting.plots.Plots(pstore)[source]

Plot class for Pastastore.

Allows plotting of time series and data availability.

static _data_availability(series, names=None, intervals=None, ignore=('second', 'minute', '14 days'), ax=None, cax=None, normtype='log', cmap='viridis_r', set_yticks=False, figsize=(10, 8), dropna=True, **kwargs)[source]

Plot the data-availability for a list of time series.

Parameters:
  • series (list of pandas.Series) – list of series to plot data availability for

  • names (list, optional) – specify names of series, default is None in which case names will be taken from series themselves.

  • kind (str, optional) – if library is stresses, kind can be specified to obtain only stresses of a specific kind

  • intervals (dict, optional) – A dict with frequencies as keys and number of seconds as values

  • ignore (list, optional) – A list with frequencies in intervals to ignore

  • ax (matplotlib Axes, optional) – pass axes object to plot data availability on existing figure. by default None, in which case a new figure is created

  • cax (matplotlib Axes, optional) – pass axes object to plot the colorbar on. by default None, which gives default Matplotlib behavior

  • normtype (str, optional) – Determines the type of color normalisations, default is ‘log’

  • cmap (str, optional) – A reference to a matplotlib colormap

  • set_yticks (bool, optional) – Set the names of the series as yticks

  • figsize (tuple, optional) – The size of the new figure in inches (h,v)

  • progressbar (bool) – Show progressbar

  • dropna (bool) – Do not show NaNs as available data

  • kwargs (dict, optional) – Extra arguments are passed to matplotlib.pyplot.subplots()

Returns:

ax – The axes in which the data-availability is plotted

Return type:

matplotlib Axes

_timeseries(libname, names=None, ax=None, split=False, figsize=(10, 5), progressbar=True, show_legend=True, labelfunc=None, legend_kwargs=None, **kwargs)[source]

Plot time series from pastastore (internal method).

Parameters:
  • libname (str) – name of the library to obtain time series from (oseries or stresses)

  • names (list[str], optional) – list of time series names to plot, by default None

  • ax (matplotlib.Axes, optional) – pass axes object to plot on existing axes, by default None, which creates a new figure

  • split (bool, optional) – create a separate subplot for each time series, by default False. A maximum of 20 time series is supported when split=True.

  • figsize (tuple, optional) – figure size, by default (10, 5)

  • progressbar (bool, optional) – show progressbar when loading time series from store, by default True

  • show_legend (bool, optional) – show legend, default is True.

  • labelfunc (callable, optional) – function to create custom labels, function should take name of time series as input

  • legend_kwargs (dict, optional) – additional arguments to pass to legend

Returns:

ax – axes handle

Return type:

matplotlib.Axes

Raises:

ValueError – if split=True and more than 20 time series are to be plotted.

compare_models(modelnames, ax=None, **kwargs)[source]

Compare multiple models and plot the results.

Parameters:
  • modelnames (list) – A list of model names to compare.

  • ax (matplotlib.axes.Axes, optional) – The axes on which to plot the comparison. If not provided, a new figure and axes will be created.

  • **kwargs (dict) – Additional keyword arguments to pass to the plot function.

Returns:

cm – The CompareModels object containing the comparison results.

Return type:

pastastore.CompareModels

cumulative_hist(statistic='rsq', modelnames=None, extend=False, ax=None, figsize=(6, 6), label=None, legend=True, progressbar=True)[source]

Plot a cumulative step histogram for a model statistic.

Parameters:
  • statistic (str) – name of the statistic, e.g. “evp” or “rmse”, by default “rsq”

  • modelnames (list[str], optional) – modelnames to plot statistic for, by default None, which uses all models in the store

  • extend (bool, optional) – force extend the stats Series with a dummy value to move the horizontal line outside figure bounds. If True the results are skewed a bit, especially if number of models is low.

  • ax (matplotlib.Axes, optional) – axes to plot histogram, by default None which creates an Axes

  • figsize (tuple, optional) – figure size, by default (6,6)

  • label (str, optional) – label for the legend, by default None, which shows the number of models

  • legend (bool, optional) – show legend, by default True

  • progressbar (bool, optional) – show progressbar, default is True.

Returns:

ax – The axes in which the cumulative histogram is plotted

Return type:

matplotlib Axes
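
The coordinates of a cumulative step histogram can be sketched without any plotting. This is an assumption-laden illustration: the helper `cumulative_steps` is hypothetical, and the method itself may accumulate in the opposite direction (fraction of models at or above a value).

```python
def cumulative_steps(values):
    """Return sorted values and the fraction of values <= each one."""
    ordered = sorted(values)
    n = len(ordered)
    fractions = [(i + 1) / n for i in range(n)]
    return ordered, fractions


# hypothetical rsq values for four models
rsq = [0.9, 0.5, 0.7, 0.8]
x, y = cumulative_steps(rsq)
```

Plotting `y` against `x` as a step line gives the cumulative curve; with few models each step is large, which is why extending the series with a dummy value skews the result.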

data_availability(libname, names=None, kind=None, intervals=None, ignore=('second', 'minute', '14 days'), ax=None, cax=None, normtype='log', cmap='viridis_r', set_yticks=False, figsize=(10, 8), progressbar=True, dropna=True, **kwargs)[source]

Plot the data-availability for multiple time series in pastastore.

Parameters:
  • libname (str) – name of library to get time series from (oseries or stresses)

  • names (list, optional) – specify names in a list to plot data availability for certain time series

  • kind (str, optional) – if library is stresses, kind can be specified to obtain only stresses of a specific kind

  • intervals (dict, optional) – A dict with frequencies as keys and number of seconds as values

  • ignore (list, optional) – A list with frequencies in intervals to ignore

  • ax (matplotlib Axes, optional) – pass axes object to plot data availability on existing figure. by default None, in which case a new figure is created

  • cax (matplotlib Axes, optional) – pass axes object to plot the colorbar on. by default None, which gives default Matplotlib behavior

  • normtype (str, optional) – Determines the type of color normalisations, default is ‘log’

  • cmap (str, optional) – A reference to a matplotlib colormap

  • set_yticks (bool, optional) – Set the names of the series as yticks

  • figsize (tuple, optional) – The size of the new figure in inches (h,v)

  • progressbar (bool) – Show progressbar

  • dropna (bool) – Do not show NaNs as available data

  • kwargs (dict, optional) – Extra arguments are passed to matplotlib.pyplot.subplots()

Returns:

ax – The axes in which the data-availability is plotted

Return type:

matplotlib Axes
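
One way to read the intervals and ignore parameters is as a lookup that assigns each time step to a frequency label. The sketch below assumes a "smallest interval that covers the step" rule; the helper `classify_step` and that rule are assumptions, not the documented behavior.

```python
def classify_step(seconds, intervals, ignore=()):
    """Return the label of the smallest kept interval that covers a time step."""
    kept = {k: v for k, v in intervals.items() if k not in ignore}
    # walk the intervals from shortest to longest
    for label, limit in sorted(kept.items(), key=lambda item: item[1]):
        if seconds <= limit:
            return label
    return None  # step is longer than every interval that was kept


# hypothetical intervals: frequency label -> number of seconds
intervals = {"minute": 60, "hour": 3600, "day": 86400, "14 days": 14 * 86400}
label = classify_step(7200, intervals, ignore=("minute", "14 days"))
```

Ignoring a frequency removes its bin, so steps that would have fallen into it are assigned to the next larger interval that remains.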

oseries(names=None, ax=None, split=False, figsize=(10, 5), show_legend=True, labelfunc=None, legend_kwargs=None, **kwargs)[source]

Plot oseries.

Parameters:
  • names (list[str], optional) – list of oseries names to plot, by default None, which loads all oseries from store

  • ax (matplotlib.Axes, optional) – pass axes object to plot oseries on existing figure, by default None, in which case a new figure is created

  • split (bool, optional) – create a separate subplot for each time series, by default False. A maximum of 20 time series is supported when split=True.

  • figsize (tuple, optional) – figure size, by default (10, 5)

  • show_legend (bool, optional) – show legend, default is True.

  • labelfunc (callable, optional) – function to create custom labels, function should take name of time series as input

  • legend_kwargs (dict, optional) – additional arguments to pass to legend

Returns:

ax – axes handle

Return type:

matplotlib.Axes

stresses(names=None, kind=None, ax=None, split=False, figsize=(10, 5), show_legend=True, labelfunc=None, legend_kwargs=None, **kwargs)[source]

Plot stresses.

Parameters:
  • names (list[str], optional) – list of stresses names to plot, by default None, which loads all stresses from store

  • kind (str, optional) – only plot stresses of a certain kind, by default None, which includes all stresses

  • ax (matplotlib.Axes, optional) – pass axes object to plot stresses on existing figure, by default None, in which case a new figure is created

  • split (bool, optional) – create a separate subplot for each time series, by default False. A maximum of 20 time series is supported when split=True.

  • figsize (tuple, optional) – figure size, by default (10, 5)

  • show_legend (bool, optional) – show legend, default is True.

  • labelfunc (callable, optional) – function to create custom labels, function should take name of time series as input

  • legend_kwargs (dict, optional) – additional arguments to pass to legend

Returns:

ax – axes handle

Return type:

matplotlib.Axes

Maps

class pastastore.plotting.maps.Maps(pstore)[source]

Map Class for PastaStore.

Allows plotting locations and model statistics on maps.

Usage

Example usage of the maps methods:

>>> ax = pstore.maps.oseries()  # plot oseries locations
>>> pstore.maps.add_background_map(ax)  # add background map

static _list_contextily_providers()[source]

List contextily providers.

Taken from contextily notebooks.

Returns:

providers – dictionary containing all providers. See keys for names that can be passed as map_provider arguments.

Return type:

dict

_plotmap_dataframe(*args, **kwargs)[source]

Deprecated, use dataframe method.

static add_background_map(ax, proj='epsg:28992', map_provider='OpenStreetMap.Mapnik', **kwargs)[source]

Add background map to axes using contextily.

Parameters:
  • ax (matplotlib.Axes) – axes to add background map to

  • map_provider (str, optional) – name of map provider, see contextily.providers for options. Default is ‘OpenStreetMap.Mapnik’

  • proj (pyproj.Proj or str, optional) – projection for background map, default is ‘epsg:28992’ (RD Amersfoort, a projection for the Netherlands)

  • **kwargs – additional keyword arguments passed to contextily.add_basemap

static add_labels(df, ax, adjust=False, objects=None, adjust_text_kwargs=None, **kwargs)[source]

Add labels to points on plot.

Uses dataframe index to label points.

Parameters:
  • df (pd.DataFrame) – DataFrame containing x, y - data. Index is used as label

  • ax (matplotlib.Axes) – axes object to label points on

  • adjust (bool) – automated smart label placement using adjustText

  • objects (list of matplotlib objects) – use to avoid labels overlapping markers

  • adjust_text_kwargs – keyword arguments to adjust_text function, only used if adjust=True

  • **kwargs – keyword arguments to ax.annotate or ax.text

dataframe(df, column, label=None, labels=True, adjust=False, cmap='viridis', colorbar=True, legend=False, norm=None, vmin=None, vmax=None, ax=None, figsize=(10, 8), backgroundmap=False, **kwargs)[source]

Plot dataframe on a map.

Parameters:
  • df (pd.DataFrame) – dataframe containing plotting information

  • column (str) – column with values to plot

  • label (bool, optional) – label points, by default None. Deprecated since Pastastore 1.13.0; use labels instead.

  • labels (bool, optional) – label the points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • cmap (str or colormap, optional) – (name of) the colormap, by default “viridis”

  • colorbar (bool, optional) – show colorbar, only if column is provided, by default True.

  • legend (bool, optional) – show legend, only possible if the column data type is int/int64, by default False.

  • norm (norm, optional) – normalization for colorbar, by default None

  • vmin (float, optional) – vmin for colorbar, by default None

  • vmax (float, optional) – vmax for colorbar, by default None

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • progressbar (bool, optional) – show progressbar, default is True.

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map, self.dataframe_scatter

Notes

The DataFrame df should contain columns “x” and “y” for the coordinates, and a column specified by column for the values to plot. The index of the DataFrame is used for labeling if labels is True.

static dataframe_scatter(df, x='x', y='y', label=True, column=None, colorbar=True, legend=False, ax=None, figsize=(10, 8), **kwargs)[source]

Plot dataframe.

Parameters:
  • df (pandas.DataFrame) – DataFrame containing coordinates and data to plot, with index providing names for each location.

  • x (str, optional) – name of the column with x - coordinate data, by default “x”.

  • y (str, optional) – name of the column with y - coordinate data, by default “y”.

  • column (str, optional) – name of the column containing data used for determining the color of each point, by default None (all one color).

  • label (bool, optional) – label points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • colorbar (bool, optional) – show colorbar, only if column is provided, by default True.

  • legend (bool, optional) – show legend, only possible if the column data type is int/int64, by default False.

  • progressbar (bool, optional) – show progressbar, default is True.

  • ax (matplotlib Axes) – axes handle to plot dataframe, optional, default is None which creates a new figure.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • **kwargs – dictionary containing keyword arguments for ax.scatter, by default None.

Returns:

  • ax (matplotlib.Axes) – axes object, returned if ax is None

  • sc (scatter handle) – scatter plot handle, returned if ax is not None

model(ml, label=True, metadata_source='model', offset=0.0, ax=None, figsize=(10, 10), backgroundmap=False)[source]

Plot oseries and stresses from one model on a map.

Parameters:
  • ml (str or pastas.Model) – pastas model or name of pastas model to plot on map

  • label (bool, optional, default is True) – add labels to points on map

  • metadata_source (str, optional) – one of “model” or “store”, pick whether to obtain metadata from model Timeseries or from metadata in pastastore, default is “model”

  • offset (float, optional) – add offset to current extent of model time series, useful for zooming out around models

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figsize, default is (10, 10)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

Returns:

ax – axis handle of the resulting figure

Return type:

axes object

See also

self.add_background_map

modelparam(parameter, param_value='optimal', modelnames=None, label=True, adjust=False, cmap='viridis', norm=None, vmin=None, vmax=None, figsize=(10, 8), backgroundmap=False, progressbar=True, **kwargs)[source]

Plot model parameter value on map.

Parameters:
  • parameter (str) – name of the parameter, e.g. “rech_A” or “river_a”

  • param_value (str, optional) – which parameter value to plot, by default “optimal”, other options are “initial”, “pmin”, “pmax”

  • modelnames (list of str, optional) – list of modelnames to include

  • label (bool, optional) – label points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • cmap (str or colormap, optional) – (name of) the colormap, by default “viridis”

  • norm (norm, optional) – normalization for colorbar, by default None

  • vmin (float, optional) – vmin for colorbar, by default None

  • vmax (float, optional) – vmax for colorbar, by default None

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • progressbar (bool, optional) – show progressbar, default is True

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map

models(labels=True, adjust=False, figsize=(10, 8), backgroundmap=False, **kwargs)[source]

Plot model locations on map.

Parameters:
  • labels (bool, optional) – label models, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map

modelstat(statistic, modelnames=None, label=True, adjust=False, cmap='viridis', norm=None, vmin=None, vmax=None, figsize=(10, 8), backgroundmap=False, progressbar=True, **kwargs)[source]

Plot model statistic on map.

Parameters:
  • statistic (str) – name of the statistic, e.g. “evp” or “aic”

  • modelnames (list of str, optional) – list of modelnames to include

  • label (bool, optional) – label points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • cmap (str or colormap, optional) – (name of) the colormap, by default “viridis”

  • norm (norm, optional) – normalization for colorbar, by default None

  • vmin (float, optional) – vmin for colorbar, by default None

  • vmax (float, optional) – vmax for colorbar, by default None

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • progressbar (bool, optional) – show progressbar, default is True.

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map

oseries(names=None, extent=None, labels=True, adjust=False, figsize=(10, 8), backgroundmap=False, label_kwargs=None, **kwargs)[source]

Plot oseries locations on map.

Parameters:
  • names (list, optional) – oseries names, by default None which plots all oseries locations

  • extent (list of float, optional) – plot only oseries within extent [xmin, xmax, ymin, ymax]

  • labels (bool or str, optional) – label the oseries locations, by default True. If passed as “grouped”, only the first label for each x,y-location is shown.

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • label_kwargs (dict, optional) – dictionary with keyword arguments to pass to add_labels method

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map
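
The “grouped” labels option can be pictured as keeping one label per coordinate pair. A minimal sketch, assuming locations are held in a plain dict and that "first" means first in iteration order; `first_label_per_location` is a hypothetical helper.

```python
def first_label_per_location(points):
    """Keep only the first name encountered at each (x, y) location."""
    labels = {}
    for name, (x, y) in points.items():
        # setdefault leaves an existing entry for this location untouched
        labels.setdefault((x, y), name)
    return list(labels.values())


# hypothetical well screens; two screens share one location
points = {"well1_1": (10.0, 20.0), "well1_2": (10.0, 20.0), "well2_1": (30.0, 40.0)}
shown = first_label_per_location(points)
```

This avoids stacked, unreadable labels where several oseries (for example multiple screens in one well) share the same coordinates.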

series(series, name=None, labels=True, adjust=False, cmap='viridis', colorbar=True, legend=False, norm=None, vmin=None, vmax=None, ax=None, figsize=(10, 8), backgroundmap=False, **kwargs)[source]

Plot the values of a series on a map.

Parameters:
  • series (pandas.Series) – Series with an index that (partly) matches pstore.oseries_names, and with the values to plot on the map. The locations of the oseries are used to plot the values on the map.

  • name (str, optional) – name of the series to use for labeling, by default None, which uses the name of the series itself or “value” if the series has no name.

  • labels (bool, optional) – label the points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • cmap (str or colormap, optional) – (name of) the colormap, by default “viridis”

  • colorbar (bool, optional) – show colorbar, by default True.

  • legend (bool, optional) – show legend, only possible if the Series data type is int/int64, by default False.

  • norm (norm, optional) – normalization for colorbar, by default None

  • vmin (float, optional) – vmin for colorbar, by default None

  • vmax (float, optional) – vmax for colorbar, by default None

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • **kwargs (dict, optional) – additional keyword arguments to pass to dataframe_scatter method.

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map, self.dataframe_scatter

Notes

The index of the series should match the names of the oseries in the store. Only the oseries with names matching the index of the series will be plotted.
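
The matching rule in the note above amounts to an inner join between the series index and the oseries names. The sketch below uses plain dicts in place of pandas objects; `match_values_to_locations` and the sample data are assumptions for illustration.

```python
def match_values_to_locations(values, locations):
    """Pair each value with coordinates, skipping names not in the store."""
    return {
        name: locations[name] + (value,)
        for name, value in values.items()
        if name in locations
    }


# hypothetical store locations and values; "obs9" is not in the store
locations = {"obs1": (0.0, 0.0), "obs2": (1.0, 1.0), "obs3": (2.0, 2.0)}
values = {"obs1": 1, "obs2": 2, "obs9": 3}
plotted = match_values_to_locations(values, locations)
```

Here "obs9" is dropped because it has no location, and "obs3" is dropped because it has no value.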

Example

If we have a series with some values for some of the oseries in the store, we can plot these values on the map as follows:

import pandas as pd
series = pd.Series(data=[1, 2, 3], index=["obs1", "obs2", "obs3"])
pstore.maps.series(series)

signature(signature, names=None, label=True, adjust=False, cmap='viridis', norm=None, vmin=None, vmax=None, figsize=(10, 8), backgroundmap=False, progressbar=True, **kwargs)[source]

Plot signature value on map.

Parameters:
  • signature (str) – name of the signature, e.g. “mean_annual_maximum” or “duration_curve_slope”

  • names (list of str, optional) – list of observation well names to include

  • label (bool, optional) – label points, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • cmap (str or colormap, optional) – (name of) the colormap, by default “viridis”

  • norm (norm, optional) – normalization for colorbar, by default None

  • vmin (float, optional) – vmin for colorbar, by default None

  • vmax (float, optional) – vmax for colorbar, by default None

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • progressbar (bool, optional) – show progressbar, default is True

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map

stresses(names=None, kind=None, extent=None, labels=True, adjust=False, figsize=(10, 8), backgroundmap=False, label_kwargs=None, show_legend: bool = True, **kwargs)[source]

Plot stresses locations on map.

Parameters:
  • names (list of str, optional) – list of names to plot

  • kind (str, optional) – if passed, only plot stresses of a specific kind, default is None which plots all stresses.

  • extent (list of float, optional) – plot only stresses within extent [xmin, xmax, ymin, ymax]

  • labels (bool, optional) – label the stresses, by default True

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

  • label_kwargs (dict, optional) – dictionary with keyword arguments to pass to add_labels method

  • show_legend (bool, optional) – add legend with each kind of stress and associated color, only possible if colors are not explicitly passed. Default is True.

Returns:

ax – axes object

Return type:

matplotlib.Axes

See also

self.add_background_map

Create a map linking models with their stresses.

Parameters:
  • kinds (list, optional) – kinds of stresses to plot, defaults to None, which selects all kinds.

  • model_names (list, optional) – list of model names to plot, substrings of model names are also accepted, defaults to None, which selects all models.

  • color_lines (bool, optional) – if True, connecting lines have the same colors as the stresses, defaults to False, which uses a black line.

  • alpha (float, optional) – alpha value for the connecting lines, defaults to 0.4.

  • ax (matplotlib.Axes, optional) – axes handle, if not provided a new figure is created.

  • figsize (tuple, optional) – figure size, by default (10, 8)

  • legend (bool, optional) – create a legend for all unique kinds, defaults to True.

  • labels (bool, optional) – add labels for stresses and oseries, defaults to False.

  • adjust (bool, optional) – automated smart label placement using adjustText, by default False

  • backgroundmap (bool, optional) – if True, add background map (default CRS is EPSG:28992) with default tiles by OpenStreetMap.Mapnik. Default option is False.

Returns:

ax – axis handle of the resulting figure

Return type:

axes object

See also

self.add_background_map

Yaml

Module containing YAML interface for Pastas models using PastaStore.

class pastastore.yaml_interface.PastastoreYAML(pstore)[source]

Class for reading/writing Pastas models in YAML format.

This class provides a more human-readable form of Pastas models in comparison to Pastas default .pas (JSON) files. The goal is to provide users with a simple mini-language to quickly build/test different model structures. A PastaStore is required as input, which contains existing models or time series required to build new models. This class also introduces some shortcuts to simplify building models. Shortcuts include the option to pass ‘nearest’ as the name of a stress, which will automatically select the closest stress of a particular type. Other shortcuts include certain default options when certain information is not listed in the YAML file, that will work well in many cases.
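
The ‘nearest’ shortcut can be pictured as a distance ranking over stresses of one kind. This is a sketch under stated assumptions, not the class's own lookup: `nearest_stress`, the `(x, y, kind)` tuples, and the Euclidean distance are all hypothetical choices.

```python
import math


def nearest_stress(origin, stresses, kind, n=1):
    """Return the names of the n stresses of a given kind closest to origin."""
    x0, y0 = origin
    candidates = [
        (math.hypot(x - x0, y - y0), name)
        for name, (x, y, k) in stresses.items()
        if k == kind
    ]
    # sort by distance and keep the n closest names
    return [name for _, name in sorted(candidates)[:n]]


# hypothetical stresses: name -> (x, y, kind)
stresses = {
    "prec_a": (0.0, 10.0, "prec"),
    "prec_b": (0.0, 3.0, "prec"),
    "evap_a": (1.0, 1.0, "evap"),
}
closest = nearest_stress((0.0, 0.0), stresses, kind="prec")
```

Passing `n=2` returns both precipitation stresses, ordered from closest to farthest.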

Usage

Instantiate the PastastoreYAML class:

pyaml = PastastoreYAML(pstore)

Export a Pastas model to a YAML file:

pyaml.export_model(ml)

Load a Pastas model from a YAML file:

models = pyaml.load("my_first_model.yaml")

Example YAML file using ‘nearest’:

my_first_model:  # this is the name of the model
  oseries: "oseries1"  # name of oseries stored in PastaStore
  stressmodels:
      recharge:  # recognized as RechargeModel by name
        prec: "nearest"  # use nearest stress with kind="prec"
        evap: "EV24_DEELEN"  # specific station
      river:
        stress: "nearest riv"  # nearest stress with kind="riv"
      wells:
        stress: "nearest 3"  # nearest 3 stresses with kind="well"
        stressmodel: WellModel  # provide StressModel type

construct_mldict(mlyml: dict, mlnam: str) dict[source]

Create Pastas.Model dictionary from YAML dictionary.

Parameters:
  • mlyml (dict) – YAML dictionary

  • mlnam (str) – model name

Returns:

dictionary of pastas.Model that can be read by Pastas

Return type:

dict

static export_model(ml: Model | dict, outdir: Path | str = '.', minimal_yaml: bool | None = False, use_nearest: bool | None = False)[source]

Write single pastas model to YAML file.

Parameters:
  • ml (ps.Model or dict) – pastas model instance or dictionary representing a pastas model

  • outdir (str, optional) – path to output directory, by default “.” (current directory)

  • minimal_yaml (bool, optional) – reduce yaml file to include the minimum amount of information that will still construct a model. Users are warned, using this option does not guarantee the same model will be constructed as the one that was exported! Default is False.

  • use_nearest (bool, optional) – if True, replaces time series with “nearest <kind>”, filling in kind where possible. Warning! This does not check whether the time series are actually the nearest ones! Only used when minimal_yaml=True. Default is False.

export_models(models: list[Model] | list[dict] | None = None, modelnames: list[str] | str | None = None, outdir: str | Path = '.', minimal_yaml: bool | None = False, use_nearest: bool | None = False, split: bool | None = True, filename: str = 'pastas_models.yaml')[source]

Export (stored) models to yaml file(s).

Parameters:
  • models (list of ps.Model or dict, optional) – pastas Models to write to yaml file(s), if not provided, uses modelnames to collect stored models to export.

  • modelnames (list[str], optional) – list of model names to export, by default None, which uses all stored models.

  • outdir (str, optional) – path to output directory, by default “.” (current directory)

  • minimal_yaml (bool, optional) – reduce yaml file to include the minimum amount of information that will still construct a model. Users are warned, using this option does not guarantee the same model will be constructed as the one that was exported! Default is False.

  • use_nearest (bool, optional) – if True, replaces time series with “nearest <kind>”, filling in kind where possible. Warning! This does not check whether the time series are actually the nearest ones! Only used when minimal_yaml=True. Default is False.

  • split (bool, optional) – if True, split into separate yaml files, otherwise store all in the same file. The model names are used as file names.

  • filename (str, optional) – filename for YAML file, only used if split=False

export_stored_models_per_oseries(oseries: list[str] | str | None = None, outdir: Path | str = '.', minimal_yaml: bool | None = False, use_nearest: bool | None = False)[source]

Export stored models grouped per oseries (location) to YAML file(s).

Note: The oseries names are used as file names.

Parameters:
  • oseries (list[str], optional) – list of oseries (location) names, by default None, which uses all stored oseries for which there are models.

  • outdir (str, optional) – path to output directory, by default “.” (current directory)

  • minimal_yaml (bool, optional) – reduce yaml file to include the minimum amount of information that will still construct a model. Users are warned, using this option does not guarantee the same model will be constructed as the one that was exported! Default is False.

  • use_nearest (bool, optional) – if True, replaces time series with “nearest <kind>”, filling in kind where possible. Warning! This does not check whether the time series are actually the nearest ones! Only used when minimal_yaml=True. Default is False.

load(fyaml: str) list[Model][source]

Load Pastas YAML file.

Note: currently supports RechargeModel, StressModel and WellModel.

Parameters:

fyaml (str) – YAML as str or path to file

Returns:

models – list containing pastas model(s)

Return type:

list

Raises:
  • ValueError – if insufficient information is provided to construct pastas model

  • NotImplementedError – if unsupported stressmodel is encountered

pastastore.yaml_interface.reduce_to_minimal_dict(d: dict, keys: list[str] | None = None) None[source]

Reduce pastas model dictionary to a minimal form.

This minimal form strives to keep the minimal information that still allows a model to be constructed. Users are warned, reducing a model dictionary with this function can lead to a different model than the original!

Parameters:
  • d (dict) – pastas model in dictionary form

  • keys (list, optional) – list of keys to keep, by default None, which defaults to: [“name”, “oseries”, “settings”, “tmin”, “tmax”, “noise”, “stressmodels”, “rfunc”, “stress”, “prec”, “evap”, “stressmodel”]
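
The idea of reducing a dictionary in place to a list of allowed keys can be sketched as a short recursion. The helper `keep_keys` is hypothetical and deliberately simpler than the library function; in particular, it does not treat stress model names specially, which a real model dictionary would need.

```python
def keep_keys(d, keys):
    """Remove, in place, every key not listed in keys, recursing into dicts."""
    for key in list(d):  # copy the keys because the dict is modified in the loop
        if key not in keys:
            del d[key]
        elif isinstance(d[key], dict):
            keep_keys(d[key], keys)


# hypothetical, heavily simplified model dictionary
model = {
    "name": "ml1",
    "oseries": "obs1",
    "fit": {"nfev": 12},
    "settings": {"tmin": "2000-01-01", "freq": "D"},
}
keep_keys(model, ["name", "oseries", "settings", "tmin"])
```

Because the dictionary is modified in place and nothing is returned, pass a copy if the original must be kept. Dropped settings fall back to defaults when the model is rebuilt, which is why the reduced model can differ from the original.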

pastastore.yaml_interface.replace_ts_with_name(d, nearest=False)[source]

Replace time series dict with its name in pastas model dict.

Parameters:
  • d (dict) – pastas model dictionary

  • nearest (bool, optional) – replace time series with “nearest” option. Warning, this does not check whether the time series are actually the nearest ones!

pastastore.yaml_interface.temporary_yaml_from_str(yaml: str)[source]

Temporary yaml file that is deleted after usage.
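A minimal sketch of how such a helper could work, assuming it behaves as a context manager that yields a file path. It uses only the standard library and is not the pastastore implementation; the name `temporary_file_from_str` and the example YAML content are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_file_from_str(text: str, suffix: str = ".yaml"):
    """Write text to a temporary file, yield its path, delete it afterwards."""
    fd, path = tempfile.mkstemp(suffix=suffix, text=True)
    try:
        # Write and close the file before handing the path to the caller.
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
        yield path
    finally:
        # Clean up even if the caller raised an exception.
        if os.path.exists(path):
            os.remove(path)


with temporary_file_from_str("my_model:\n  oseries: head_1\n") as path:
    with open(path, encoding="utf-8") as f:
        content = f.read()
    existed_inside = os.path.exists(path)

deleted_afterwards = not os.path.exists(path)
```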

Util

Useful utilities for pastastore.

class pastastore.util.ColoredFormatter(*args, colors: dict[str, str] | None = None, **kwargs)[source]

Colored log formatter.

Taken from https://gist.github.com/joshbode/58fac7ababc700f51e2a9ecdebe563ad

format(record) str[source]

Format the specified record as text.
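The general technique can be sketched with a small logging.Formatter subclass that wraps the formatted message in ANSI color codes. This is a simplified illustration, not the pastastore class: the name `SketchColoredFormatter`, the level-to-color mapping and the format string are assumptions.

```python
import logging

# ANSI escape codes; this level-to-color mapping is an assumption of the sketch.
ANSI_COLORS = {"WARNING": "\033[33m", "ERROR": "\033[31m"}
RESET = "\033[0m"


class SketchColoredFormatter(logging.Formatter):
    """Formatter that colors the whole line based on the record's level name."""

    def __init__(self, *args, colors=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.colors = colors if colors is not None else {}

    def format(self, record):
        text = super().format(record)
        color = self.colors.get(record.levelname)
        # Levels without a configured color are returned unchanged.
        return f"{color}{text}{RESET}" if color else text


formatter = SketchColoredFormatter("%(levelname)s:%(message)s", colors=ANSI_COLORS)

warning_record = logging.LogRecord(
    "demo", logging.WARNING, "demo.py", 1, "disk almost full", None, None
)
info_record = logging.LogRecord(
    "demo", logging.INFO, "demo.py", 2, "service started", None, None
)

colored = formatter.format(warning_record)
plain = formatter.format(info_record)
```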

exception pastastore.util.ItemInLibraryException[source]

Exception when item is already in library.

exception pastastore.util.SeriesUsedByModel[source]

Exception raised when a series is used by a model.

class pastastore.util.ZipUtils(pstore)[source]

Utility class for zip file operations.

models_to_archive(archive, names=None, progressbar=True)[source]

Write pastas.Model to zipfile (internal method).

Parameters:
  • archive (zipfile.ZipFile) – reference to an archive to write data to

  • names (str | list[str], optional) – names of the models to write to archive, by default None, which writes all models to archive

  • progressbar (bool, optional) – show progressbar, by default True

series_to_archive(archive, libname: Literal['oseries', 'stresses'], names: list[str] | str | None = None, progressbar: bool = True)[source]

Write DataFrame or Series to zipfile (internal method).

Parameters:
  • archive (zipfile.ZipFile) – reference to an archive to write data to

  • libname (str) – name of the library to write to zipfile

  • names (str | list[str], optional) – names of the time series to write to archive, by default None, which writes all time series to archive

  • progressbar (bool, optional) – show progressbar, by default True
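To illustrate the general pattern of writing named items into a zipfile.ZipFile, the sketch below stores small time series as JSON files in an in-memory archive. The one-folder-per-library layout, the JSON encoding and the example data are assumptions of this sketch; the actual archive layout used by ZipUtils may differ.

```python
import io
import json
import zipfile

# Illustrative time series, stored as plain dictionaries for this sketch.
series = {
    "head_1": {"2020-01-01": 1.2, "2020-01-02": 1.3},
    "head_2": {"2020-01-01": 0.4},
}

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
    for name, values in series.items():
        # One JSON file per series, grouped in a folder named after the library.
        archive.writestr(f"oseries/{name}.json", json.dumps(values))

# Read the archive back to check its contents.
with zipfile.ZipFile(buffer) as archive:
    members = sorted(archive.namelist())
    restored = json.loads(archive.read("oseries/head_1.json"))
```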

pastastore.util.compare_models(ml1: Model, ml2: Model, stats: list[str] | None = None, detailed_comparison: bool = False, style_output: bool = False) DataFrame | Styler[source]

Compare two Pastas models.

Parameters:
  • ml1 (pastas.Model) – first model to compare

  • ml2 (pastas.Model) – second model to compare

  • stats (list[str], optional) – if provided compare these model statistics

  • detailed_comparison (bool, optional) – if True return DataFrame containing comparison details, by default False which returns True if models are equivalent or False if they are not

  • style_output (bool, optional) – if True and detailed_comparison is True, return styled DataFrame with colored output, by default False

Returns:

True if the models are equivalent (False if they are not) when detailed_comparison=False; otherwise a DataFrame containing comparison details.

Return type:

bool or pd.DataFrame or pd.Styler

pastastore.util.copy_database(conn1, conn2, libraries: list[str] | None = None, overwrite: bool = False, progressbar: bool = True) None[source]

Copy libraries from one database to another.

Parameters:
  • conn1 (pastastore.*Connector) – source Connector linked to the database that contains the data

  • conn2 (pastastore.*Connector) – destination Connector with link to database to which you want to copy

  • libraries (list[str] | None, optional) – list of str containing names of libraries to copy, by default None, which copies all libraries: [‘oseries’, ‘stresses’, ‘models’]

  • overwrite (bool, optional) – overwrite data in destination database, by default False

  • progressbar (bool, optional) – show progressbars, by default True

Raises:

ValueError – if library name is not understood

pastastore.util.delete_arcticdb_connector(conn=None, uri: str | None = None, name: str | None = None, libraries: list[str] | None = None) None[source]

Delete libraries from arcticDB database.

Parameters:
  • conn (pastastore.ArcticDBConnector) – ArcticDBConnector object

  • uri (str, optional) – uri connection string to the database

  • name (str, optional) – name of the database

  • libraries (list[str] | None, optional) – list of library names to delete, by default None which deletes all libraries

pastastore.util.delete_dict_connector(conn, libraries: list[str] | None = None) None[source]

Delete DictConnector object.

pastastore.util.delete_pas_connector(conn, libraries: list[str] | None = None) None[source]

Delete PasConnector object.

pastastore.util.delete_pastastore(pstore, libraries: list[str] | None = None) None[source]

Delete libraries from PastaStore.

Note

This deletes the original PastaStore object. To access data that has not been deleted, it is recommended to create a new PastaStore object with the same Connector settings. This also creates new empty libraries if they were deleted.

Parameters:
  • pstore (pastastore.PastaStore) – PastaStore object to delete (from)

  • libraries (list[str] | None, optional) – list of library names to delete, by default None which deletes all libraries

Raises:

TypeError – when Connector type is not recognized

pastastore.util.frontiers_aic_select(pstore, modelnames: list[str] | None = None, oseries: list[str] | None = None, full_output: bool = False) DataFrame[source]

Select the best model structure based on the minimum AIC.

As proposed by Brakenhoff et al. 2022 [bra_2022].

Parameters:
  • pstore (pastastore.PastaStore) – reference to a PastaStore

  • modelnames (list[str], optional) – list of model names (that pass reliability criteria)

  • oseries (list[str], optional) – list of oseries (location) names for which to select models. Note that this uses all models associated with a specific location.

  • full_output (bool, optional) – if set to True, returns a DataFrame including all models per location and their AIC values

Returns:

DataFrame with selected best model per location based on the AIC, or a DataFrame containing statistics for each of the models per location

Return type:

pandas.DataFrame

References

[bra_2022]

Brakenhoff, D.A., Vonk M.A., Collenteur, R.A., van Baar, M., Bakker, M.: Application of Time Series Analysis to Estimate Drawdown From Multiple Well Fields. Front. Earth Sci., 14 June 2022 doi:10.3389/feart.2022.907609
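The selection rule itself, picking the model with the lowest AIC for each location, can be sketched with plain dictionaries. The location names, model names and AIC values below are invented for illustration; the actual function works on models in a PastaStore and returns a DataFrame.

```python
# Hypothetical AIC values, keyed by (location, model name).
aic_per_model = {
    ("well_A", "well_A_recharge"): 152.3,
    ("well_A", "well_A_recharge_river"): 140.8,
    ("well_B", "well_B_recharge"): 98.1,
    ("well_B", "well_B_recharge_wells"): 101.7,
}

# Keep, for every location, the model with the smallest AIC seen so far.
best = {}
for (location, model_name), aic in aic_per_model.items():
    if location not in best or aic < best[location][1]:
        best[location] = (model_name, aic)
```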

pastastore.util.frontiers_checks(pstore, modelnames: list[str] | None = None, oseries: list[str] | None = None, check1_rsq: bool = True, check1_threshold: float = 0.7, check2_autocor: bool = True, check2_test: str = 'runs', check2_pvalue: float = 0.05, check3_tmem: bool = True, check3_cutoff: float = 0.95, check4_gain: bool = True, check5_parambounds: bool = False, csv_dir: str | None = None, progressbar: bool = False) DataFrame[source]

Check models in a PastaStore to see if they pass reliability criteria.

The reliability criteria are taken from Brakenhoff et al. 2022 [bra_2022]. These criteria were applied in a region with recharge, river levels and pumping wells as stresses. This is by no means an exhaustive list of reliability criteria but might serve as a reasonable starting point for model diagnostic checking.

Parameters:
  • pstore (pastastore.PastaStore) – reference to a PastaStore

  • modelnames (list[str], optional) – list of model names to consider. If None, the models for ‘oseries’ are used. If both are None, all stored models are checked.

  • oseries (list[str], optional) – list of oseries to consider; the corresponding models are picked up from the PastaStore. If None, all stored models are checked.

  • check1_rsq (bool, optional) – check if model fit is above a threshold of the coefficient of determination $R^2$, by default True

  • check1_threshold (float, optional) – threshold of the $R^2$ fit statistic, by default 0.7

  • check2_autocor (bool, optional) – check if the noise of the model has autocorrelation using a statistical test, by default True

  • check2_test (str, optional) – statistical test for autocorrelation. Available options are Runs test “runs”, Stoffer-Toloi “stoffer” or “both”, by default “runs”

  • check2_pvalue (float, optional) – p-value for the statistical test to define the confidence interval, by default 0.05

  • check3_tmem (bool, optional) – check if the length of the response time is within the calibration period, by default True

  • check3_cutoff (float, optional) – the cutoff of the response time, by default 0.95

  • check4_gain (bool, optional) – check the uncertainty of the gain, by default True

  • check5_parambounds (bool, optional) – check if parameters hit parameter bounds, by default False

  • csv_dir (string, optional) – directory to store CSV file with overview of checks for every model, by default None which will not store results

  • progressbar (bool, optional) – show progressbar, by default False

Returns:

df – dataFrame with all models and whether or not they pass the reliability checks

Return type:

pandas.DataFrame

References

[bra_2022]

Brakenhoff, D.A., Vonk M.A., Collenteur, R.A., van Baar, M., Bakker, M.: Application of Time Series Analysis to Estimate Drawdown From Multiple Well Fields. Front. Earth Sci., 14 June 2022 doi:10.3389/feart.2022.907609
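As a worked example of the first criterion, the sketch below computes the coefficient of determination with the textbook formula $R^2 = 1 - SS_{res} / SS_{tot}$ and compares it to the default threshold of 0.7. The helper `r_squared`, the data, and the strict "greater than" comparison are assumptions of this sketch; the statistic computed by pastas may be defined slightly differently.

```python
def r_squared(observed, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented observations and simulated values.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]

rsq = r_squared(observed, simulated)  # SS_res = 0.10, SS_tot = 5.0
passes_check1 = rsq > 0.7  # threshold taken from check1_threshold's default
```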

pastastore.util.get_color_logger(level='INFO', logger_name=None)[source]

Get a logger with colored output.

Parameters:

level (str, optional) – The logging level to set for the logger. Default is “INFO”.

Returns:

logger – The configured logger object.

Return type:

logging.Logger

pastastore.util.metadata_from_json(fjson: str)[source]

Load metadata dictionary from JSON.

Parameters:

fjson (str) – path to file

Returns:

meta – dictionary containing metadata

Return type:

dict

pastastore.util.series_from_json(fjson: str, squeeze: bool = True)[source]

Load time series from JSON.

Parameters:
  • fjson (str) – path to file

  • squeeze (bool, optional) – squeeze time series object to obtain pandas Series

Returns:

s – DataFrame containing time series

Return type:

pd.DataFrame

pastastore.util.validate_names(s: str | None = None, d: dict | None = None, replace_space: str | None = '_', deletechars: str | None = None, **kwargs) str | dict[source]

Remove invalid characters from string or dictionary keys.

Parameters:
  • s (str, optional) – remove invalid characters from string

  • d (dict, optional) – remove invalid characters from keys from dictionary

  • replace_space (str, optional) – replace spaces by this character, by default “_”

  • deletechars (str, optional) – a string combining invalid characters, by default None

Returns:

string or dict with invalid characters removed

Return type:

str, dict
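The kind of cleaning described above can be sketched as follows. This is an illustration only: the helper names `clean_name` and `clean_keys` and the default set of characters to delete are assumptions, not the behavior of validate_names itself.

```python
# Characters treated as invalid in this sketch (an assumption).
DEFAULT_DELETECHARS = "/\\:*?\"<>|"


def clean_name(s, replace_space="_", deletechars=DEFAULT_DELETECHARS):
    """Replace spaces, then drop every character listed in `deletechars`."""
    if replace_space is not None:
        s = s.replace(" ", replace_space)
    return "".join(ch for ch in s if ch not in deletechars)


def clean_keys(d, **kwargs):
    """Return a new dictionary with cleaned keys and unchanged values."""
    return {clean_name(key, **kwargs): value for key, value in d.items()}


cleaned_name = clean_name("well 1/filter:2")
cleaned_dict = clean_keys({"obs well?": 1})
```

Note that cleaning can make two different names collide (for example "a b" and "a_b"), so the result is worth checking before it is used as a key in a library.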