Documentation

TENAX class

class pyTENAX.tenax.TENAX(return_period: list[int | float], durations: list[int], time_resolution: int, beta: float | int = 4, temp_time_hour: int = 24, alpha=0.05, n_monte_carlo=20000, tolerance=0.1, min_event_duration=30, storm_separation_time=24, left_censoring: list = [0, 1], niter_smev=100, niter_tenax=100, temp_res_monte_carlo=0.001, temp_delta=10, init_param_guess=[0.7, 0, 2, 0], min_rain: float | int = 0)[source]

Bases: object

TNX_tenax_bootstrap_uncertainty(P, T, blocks_id, Ts, temp_method='norm', method_root_scalar='brentq', minimize_method='Nelder-Mead', parallel=False, n_jobs=-1)[source]

Bootstrap uncertainty estimation for the TENAX model.

Parameters:

P (np.ndarray) – Precipitation ordinary events data.
T (np.ndarray) – Temperature ordinary events data.
blocks_id (np.ndarray) – Block identifiers (e.g., years) for each event.
Ts (np.ndarray) – Array of temperature values for the Monte Carlo integration.
temp_method (str, optional) – Distribution used for the temperature model. Defaults to "norm".
method_root_scalar (str, optional) – Root-finding method for model inversion. Defaults to "brentq".
minimize_method (str, optional) – Optimisation method passed to magnitude_model. Defaults to "Nelder-Mead".
parallel (bool, optional) – Run bootstrap iterations in parallel using ProcessPoolExecutor. Warm start is disabled when parallel=True because iterations run in separate processes and cannot share state. Defaults to False.
n_jobs (int, optional) – Number of worker processes. -1 uses all available cores. Ignored when parallel=False. Defaults to -1.

Notes

Warm start (sequential mode only): each iteration reuses the fitted parameters from the previous iteration as the initial guess for magnitude_model. Since consecutive bootstrap samples are drawn from the same dataset, their optimal parameters are typically close to each other, so the optimiser converges in fewer steps. self.init_param_guess is restored to its original value after the loop so that subsequent calls are not affected.

Returns:

F_phat_unc (np.ndarray) – Magnitude model parameters from each bootstrap sample, shape (niter, 4).
g_phat_unc (np.ndarray) – Temperature model parameters from each bootstrap sample, shape (niter, 2) for "norm" or (niter, 3) for "skewnorm".
RL_unc (np.ndarray) – Return levels from each bootstrap sample, shape (niter, len(return_period)).
n_unc (np.ndarray) – Mean number of events per block from each bootstrap sample, shape (niter,).
n_err (int) – Number of iterations where model fitting failed.
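
The resampling step behind a block bootstrap can be sketched as follows. This is a minimal illustration in plain Python, not the pyTENAX implementation: the helper name block_bootstrap_sample and the toy data are hypothetical, and it assumes that whole blocks (e.g. years) are drawn with replacement, as many as were observed.

```python
import random
from collections import defaultdict


def block_bootstrap_sample(values, blocks_id, rng):
    """Resample whole blocks with replacement and pool their events (hypothetical helper)."""
    by_block = defaultdict(list)
    for value, block in zip(values, blocks_id):
        by_block[block].append(value)
    blocks = sorted(by_block)
    # Assumption: draw as many blocks as were observed.
    drawn = [rng.choice(blocks) for _ in blocks]
    sample = [v for block in drawn for v in by_block[block]]
    return sample, drawn


rng = random.Random(0)  # fixed seed so the sketch is reproducible
P = [1.2, 3.4, 0.8, 5.1, 2.2, 4.0]            # toy ordinary events
years = [2001, 2001, 2002, 2002, 2002, 2003]  # toy block identifiers
sample, drawn = block_bootstrap_sample(P, years, rng)
n_mean = len(sample) / len(drawn)  # mean number of events per block in this sample
```

In a full bootstrap the model would be refitted on each such sample; the per-sample mean number of events corresponds to what the method reports as n_unc.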

associate_vars(dict_ordinary, data_temperature, dates_temperature, method='vectorized')[source]

Associate temperature with each ordinary event.

The associated temperature is the mean over the past temp_time_hour hours before the event. Events for which no temperature can be found are dropped.

Parameters:

dict_ordinary (dict) – Dictionary of ordinary events as returned by get_ordinary_events_values.
data_temperature (np.ndarray) – Full temperature time series.
dates_temperature (np.ndarray) – Timestamps of the full temperature dataset.
method (str, optional) –
Backend used for the temperature association. Defaults to "vectorized". One of:
- "iterrows" — pandas iterrows loop (original behaviour).
- "vectorized" — fully vectorised numpy using cumulative-sum trick; O(1) per event after one O(n) precomputation pass.

Returns:

dict_ordinary (dict) – Ordinary events with an added T column, keyed by duration. Example: {"10": pd.DataFrame(columns=['year', 'oe_time', 'ordinary', 'T'])}.
dict_dropped_oe (dict) – Ordinary events dropped because no temperature was found, keyed by duration.
n_ordinary_per_year_new (pd.DataFrame) – DataFrame with the count of ordinary events per year after dropping events without temperature.
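
The cumulative-sum trick named for the "vectorized" backend can be illustrated with a small sketch. It is written in plain Python rather than numpy and is not the library code: the helper window_means is hypothetical, timestamps are plain integers (hours), and the window is assumed to be the half-open interval [t - window, t).

```python
from bisect import bisect_left
from itertools import accumulate


def window_means(values, times, event_times, window):
    """Mean of values over [t - window, t) for each event time t; None if the window is empty."""
    prefix = [0.0] + list(accumulate(values))  # single O(n) precomputation pass
    means = []
    for t in event_times:
        lo = bisect_left(times, t - window)  # first sample inside the window
        hi = bisect_left(times, t)           # first sample at or after the event
        means.append((prefix[hi] - prefix[lo]) / (hi - lo) if hi > lo else None)
    return means


hours = list(range(10))            # hypothetical hourly timestamps 0..9
temps = [float(h) for h in hours]  # hypothetical temperatures 0.0..9.0
print(window_means(temps, hours, [5, 0], window=3))  # [3.0, None]
```

Each window sum is a difference of two prefix sums, so no window is ever re-summed. Locating the window edges with bisect costs O(log n) per event in this sketch. The None result mirrors the documented behaviour of dropping events for which no temperature can be found.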

get_ordinary_events(data: DataFrame | ndarray, dates: ndarray, name_col: str = 'value', check_gaps=True) → list[source]

Extract ordinary precipitation events from a time series.

Groups timesteps at or above self.min_rain into independent storm events separated by at least self.storm_separation_time hours. Optionally removes events too close to dataset boundaries or data gaps.

Parameters:

data (Union[pd.DataFrame, np.ndarray]) – Precipitation values.
dates (np.ndarray) – Timestamps of the precipitation data.
name_col (str, optional) – Column name to use when data is a DataFrame. Defaults to “value”.
check_gaps (bool, optional) – Remove events that fall within storm_separation_time of the dataset boundaries or internal data gaps. Defaults to True.

Returns:

List of np.ndarray, each containing the timestamps of one ordinary event (values >= self.min_rain separated by more than self.storm_separation_time hours).

Return type:

list
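
The grouping rule described above can be sketched in a few lines. This is an illustration only: split_events is a hypothetical helper, times are plain hour counts instead of timestamps, and a new event is assumed to start when the dry gap is strictly longer than the separation time (the description above leaves open whether the comparison is strict).

```python
def split_events(times, values, min_rain, separation):
    """Group wet timesteps (value >= min_rain) into events split by long dry gaps."""
    events = []
    for t, v in zip(times, values):
        if v < min_rain:
            continue  # dry timestep, never part of an event
        if events and t - events[-1][-1] <= separation:
            events[-1].append(t)  # gap short enough: same storm
        else:
            events.append([t])    # assumption: gap > separation starts a new storm
    return events


times = [0, 1, 2, 30, 31, 80]             # hypothetical hours since start
values = [1.0, 0.0, 2.0, 3.0, 0.0, 1.0]   # hypothetical precipitation depths
events = split_events(times, values, min_rain=0.1, separation=24)
print(events)  # [[0, 2], [30], [80]]
```

The sketch omits the check_gaps filtering near dataset boundaries and data gaps.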

get_ordinary_events_values(data: ndarray, dates: ndarray, arr_dates_oe: ndarray, method: str = 'vectorized') → Tuple[Dict[str, DataFrame], Dict[str, DataFrame]][source]

Extract ordinary events and annual maxima from precipitation data.

Parameters:

data (np.ndarray) – Full precipitation time series.
dates (np.ndarray) – Timestamps of the full precipitation dataset.
arr_dates_oe (np.ndarray) – End and start times of ordinary events as returned by remove_short.
method (str, optional) –
Backend used for the sliding-window maximum search. Defaults to "vectorized". One of:
- "vectorized" — pure numpy, np.convolve per event.
- "njit" — numba JIT-compiled loop, single-threaded.
- "njit_parallel" — numba JIT-compiled loop, parallelised over events. Requires numba to be installed.

Returns:

dict_ordinary (dict) – Key is duration (str), value is a pd.DataFrame with columns year, oe_time, ordinary (event depth/intensity).
dict_AMS (dict) – Key is duration (str), value is a pd.DataFrame with columns year and AMS (annual maximum value).

magnitude_model(data_oe_prec, data_oe_temp, thr, b_set=None, b_exp=False, minimize_method='Nelder-Mead')[source]

Fits the data to the magnitude model of TENAX.

Parameters:

data_oe_prec (numpy.ndarray) – Array of precipitation ordinary events data.
data_oe_temp (numpy.ndarray) – Array of temperature ordinary events data.
thr (numpy.float64) – Magnitude of precipitation threshold.
b_set (NoneType or float) – Fixed value of b. If given, the magnitude model is fitted with b held at this value.
b_exp (bool) – If True, uses the exponential rather than linear fit for b.
minimize_method (str, optional) – Optimisation method passed to scipy.optimize.minimize. Defaults to 'Nelder-Mead'. 'L-BFGS-B' has also been tested, though it can lead to some instability.

Returns:

phat (np.ndarray) – Fitted parameters [kappa_0, b, lambda_0, a].
loglik (float) – Log-likelihood of the selected model.
loglik_H1 (float or None) – Log-likelihood of the alternative hypothesis (H1). None when b_set is provided.
loglik_H0shape (float or None) – Log-likelihood of the null hypothesis (constant shape). None when b_set is provided.
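
The two log-likelihoods loglik_H1 and loglik_H0shape are the inputs a likelihood-ratio test needs. The sketch below shows the textbook form of that test for nested models differing by one free parameter (here b). How magnitude_model applies it internally is not documented in this section, so the helper name, the toy values, and the use of a 0.05 significance level are assumptions.

```python
import math


def likelihood_ratio_test(loglik_h1, loglik_h0, alpha=0.05):
    """Likelihood-ratio test for nested models with one extra parameter under H1."""
    stat = max(0.0, 2.0 * (loglik_h1 - loglik_h0))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value, p_value < alpha


# Toy log-likelihoods, not real model output.
stat, p_value, reject_h0 = likelihood_ratio_test(-1200.0, -1203.5)
```

With these toy values the statistic is 7.0 and the constant-shape hypothesis would be rejected at the 5% level.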

model_inversion(F_phat, g_phat, n, Ts, gen_P_mc=False, gen_RL=True, temp_method='norm', method_root_scalar='brentq', b_exp=False)[source]

Inversion of the TENAX model to predict return levels or plot model.

Parameters:

F_phat (numpy.ndarray) – Magnitude model parameters, F_phat = [kappa_0, b, lambda_0, a].
g_phat (numpy.ndarray) – [mu, sigma] of temperature distribution.
n (float) – Mean number of ordinary events per year.
Ts (numpy.ndarray) – Array of T values to use in the Monte Carlo.
gen_P_mc (bool, optional) – Specify whether to generate Monte Carlo values for precipitation. The default is False.
gen_RL (bool, optional) – Specify whether to generate return levels. The default is True.
temp_method (str, optional) – Type of fit used for the temperature model. The default is “norm”.
method_root_scalar (str, optional) – method used for inversion. The default is “brentq”.
b_exp (bool) – If True, uses the exponential rather than linear fit for b.

Returns:

ret_lev (np.ndarray or list) – Return levels at periods specified in self.return_period. Empty list [] if gen_RL=False.
T_mc (np.ndarray) – Monte Carlo generated temperature values, shape (n_monte_carlo, 1).
P_mc (np.ndarray or list) – Monte Carlo generated precipitation values. Empty list [] if gen_P_mc=False.

remove_incomplete_years(data_pr: DataFrame, name_col='value', nan_to_zero=True) → DataFrame[source]

Delete incomplete years in precipitation data.

An incomplete year is a year in which the share of missing observations exceeds a given threshold.

Parameters:

data_pr (pd.DataFrame) – Dataframe containing (hourly) precipitation values.
name_col (str, optional) – Column name in data_pr with precipitation values. Defaults to “value”.
nan_to_zero (bool, optional) – Set nan to zero. Defaults to True.

Returns:

Dataframe containing (hourly) precipitation values with incomplete years removed.

Return type:

pd.DataFrame

remove_short(list_ordinary: list) → Tuple[ndarray, ndarray, DataFrame][source]

Remove ordinary events that are too short.

Parameters:

list_ordinary (list) – List of ordinary events as returned by get_ordinary_events(). Each event may contain pd.Timestamp or np.datetime64 values.

Returns:

arr_vals (np.ndarray) – Boolean array (all True) of length equal to the number of kept events, one entry per event that passed the duration filter.
arr_dates (np.ndarray) – Array of (end, start) date tuples for each kept event.
n_ordinary_per_year (pd.DataFrame) – DataFrame with the count of ordinary events per year.

temperature_model(data_oe_temp, beta=0, method='norm')[source]

Fit temperature data to the TENAX temperature model.

Parameters:

data_oe_temp (np.ndarray) – Temperature data.
beta (float, optional) – Shape parameter of the generalised normal distribution. If 0, uses self.beta. Defaults to 0.
method (str, optional) – Distribution to fit. "norm" uses the generalised normal; "skewnorm" uses a skewed normal. Defaults to "norm".

Returns:

g_phat – Fitted parameters. [mu, sigma] for "norm"; [alpha, loc, scale] for "skewnorm".

Return type:

np.ndarray

Supporting utils

pyTENAX.tenax.MC_tSMEV_cdf(y, wbl_phat, n)[source]

Evaluate the Monte Carlo SMEV CDF at given values.

Parameters:

y (float or array-like) – Value(s) at which to evaluate the CDF.
wbl_phat (np.ndarray) – Weibull parameters, shape (N, 2): columns are [scale, shape].
n (float) – Power applied to the average probability.

Returns:

CDF value(s) for input y.

Return type:

np.ndarray

pyTENAX.tenax.SMEV_Mc_inversion(wbl_phat, n, target_return_periods, vguess, method_root_scalar)[source]

Invert the MC-SMEV CDF to find quantiles for target return periods.

Parameters:

wbl_phat (np.ndarray) – Weibull parameters, shape (N, 2): columns are [scale, shape].
n (float) – Power applied to the average probability.
target_return_periods (list or array-like) – Desired return periods.
vguess (np.ndarray) – Initial value grid for root-finding.
method_root_scalar (str) – Root-finding method passed to scipy.optimize.root_scalar.

Returns:

Quantiles corresponding to each target return period.

Return type:

np.ndarray
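
A minimal sketch of the two steps documented above: averaging Weibull CDFs over the rows of wbl_phat and raising the result to the power n, then inverting for a target return period. The function names are hypothetical, plain bisection stands in for scipy.optimize.root_scalar, and the search bracket is an assumption.

```python
import math


def mc_smev_cdf(y, wbl_phat, n):
    """Average the Weibull CDFs over all (scale, shape) rows, then raise to the power n."""
    mean_p = sum(1.0 - math.exp(-((y / scale) ** shape)) for scale, shape in wbl_phat)
    return (mean_p / len(wbl_phat)) ** n


def invert_cdf(wbl_phat, n, return_period, lo=1e-9, hi=1e4, tol=1e-10):
    """Bisection for the y at which the CDF equals 1 - 1/return_period."""
    target = 1.0 - 1.0 / return_period
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mc_smev_cdf(mid, wbl_phat, n) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


wbl_phat = [(10.0, 0.8), (12.0, 0.75)]  # toy rows of [scale, shape]
rl = invert_cdf(wbl_phat, n=50, return_period=100)
```

With a single row the result reduces to the closed-form Weibull quantile, which gives a convenient sanity check.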

pyTENAX.tenax.gen_norm_loglik(x: ndarray, par: list, beta: float) → float[source]

Compute the log-likelihood for the Generalized normal distribution.

Parameters:

x (np.ndarray) – Data points.
par (list) – Parameters [mu, sigma].
beta (float) – Shape parameter.

Returns:

Log-likelihood value.

Return type:

float

pyTENAX.tenax.gen_norm_pdf(x: ndarray, mu: float, sigma: float, beta: float) → ndarray[source]

Compute the Generalized normal distribution PDF.

Parameters:

x (np.ndarray) – Data points.
mu (float) – Location parameter.
sigma (float) – Scale parameter.
beta (float) – Shape parameter.

Returns:

Generalized normal distribution PDF values.

Return type:

np.ndarray
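
For reference, the generalized normal density in its common parameterization is beta / (2 sigma Gamma(1/beta)) * exp(-(|x - mu| / sigma)^beta). The sketch below implements that form in plain Python. It assumes pyTENAX uses the same parameterization, which this section does not state, and the _sketch names are hypothetical.

```python
import math


def gen_norm_pdf_sketch(x, mu, sigma, beta):
    """Generalized normal PDF in the common (mu, sigma, beta) parameterization."""
    coef = beta / (2.0 * sigma * math.gamma(1.0 / beta))
    return coef * math.exp(-((abs(x - mu) / sigma) ** beta))


def gen_norm_loglik_sketch(xs, par, beta):
    """Sum of log-densities for parameters par = [mu, sigma]."""
    mu, sigma = par
    return sum(math.log(gen_norm_pdf_sketch(x, mu, sigma, beta)) for x in xs)


# Toy temperatures and parameters, for illustration only.
loglik = gen_norm_loglik_sketch([4.0, 6.5, 9.1], [5.0, 8.0], 4.0)
```

With beta = 2 and sigma = sqrt(2) this density coincides with the standard normal, which is an easy way to check an implementation.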

pyTENAX.tenax.inverse_magnitude_model(F_phat, eT, qs, b_exp=False)[source]

Calculate percentiles from the Weibull magnitude model.

Parameters:

F_phat (numpy.ndarray) – distribution values. F_phat = [kappa_0,b,lambda_0,a].
eT (numpy.ndarray) – Temperature values from which to produce distribution.
qs (list) – list of percentiles to calculate (between 0 and 1). e.g. [0.85,0.95,0.99].
b_exp (bool) – If True, uses the exponential rather than linear fit for b.

Returns:

percentile_lines – Array of shape (len(qs), len(eT)) giving the magnitudes for each value in eT; percentile_lines[0] holds the values for qs[0].

Return type:

numpy.ndarray
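
The percentile lines can be pictured as Weibull quantiles evaluated at each temperature. The sketch below assumes the parameterization kappa(T) = kappa_0 + b*T for the shape and lambda(T) = lambda_0 * exp(a*T) for the scale. That form is an assumption made for illustration and is not taken from this docstring; the function name and parameter values are hypothetical, and the b_exp variant is not covered.

```python
import math


def weibull_quantiles_sketch(F_phat, temps, qs):
    """Rows follow qs, columns follow temps, matching the documented output layout."""
    kappa_0, b, lambda_0, a = F_phat
    lines = []
    for q in qs:
        row = []
        for t in temps:
            shape = kappa_0 + b * t             # assumed: linear in temperature
            scale = lambda_0 * math.exp(a * t)  # assumed: exponential in temperature
            row.append(scale * (-math.log(1.0 - q)) ** (1.0 / shape))
        lines.append(row)
    return lines


F_phat = [0.7, 0.0, 2.0, 0.07]  # toy parameters [kappa_0, b, lambda_0, a]
temps = [0.0, 10.0, 20.0]
lines = weibull_quantiles_sketch(F_phat, temps, [0.5, 0.99])
```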

pyTENAX.tenax.randdf(size, df, flag)[source]

Generate random numbers from a user-defined PDF or CDF.

Pythonised version of MATLAB’s randdf coded by halleyhit on Aug. 15th, 2018. Email: halleyhit@sjtu.edu.cn or halleyhit@163.com

Parameters:

size (int or tuple) – Output size. 10 → 1-D array of 10; (10, 2) → 10×2 matrix.
df (np.ndarray) – 2-row matrix: first row is function values (PDF or CDF), second row is the corresponding sampling points.
flag (str) – "pdf" or "cdf".

Returns:

Random samples drawn according to the defined distribution.

Return type:

np.ndarray
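
One common way to draw from a tabulated PDF is discrete inverse-transform sampling: accumulate the PDF values into an unnormalized CDF and look up uniform draws in it. The sketch below shows that idea in plain Python. It is not a port of randdf, which may interpolate between sampling points; the helper name and inputs are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def sample_from_pdf(size, pdf_values, points, rng):
    """Draw size samples from points with probability proportional to pdf_values."""
    cdf = list(accumulate(pdf_values))  # unnormalized cumulative weights
    total = cdf[-1]
    samples = []
    for _ in range(size):
        u = rng.random() * total
        idx = min(bisect_right(cdf, u), len(points) - 1)  # guard against rounding at the top
        samples.append(points[idx])
    return samples


rng = random.Random(1)
points = [-5.0, 0.0, 5.0]  # toy sampling points
samples = sample_from_pdf(1000, [0.1, 0.6, 0.3], points, rng)
```

Points with zero density are never returned, because bisect_right skips over flat segments of the cumulative weights.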

pyTENAX.tenax.wbl_leftcensor_loglik(theta, t0, x1, t1, thr)[source]

Compute log-likelihood for a left-censored Weibull distribution.

Shape and scale parameters depend linearly on temperature. Observations below the threshold are left-censored.

Parameters:

theta (array-like) – Parameter vector [kappa_0, b, lambda_0, a].
t0 (np.ndarray) – Temperature values for censored events (precipitation below thr).
x1 (np.ndarray) – Precipitation values at or above thr.
t1 (np.ndarray) – Temperature values corresponding to x1.
thr (float) – Left-censoring threshold.

Returns:

Log-likelihood value.

Return type:

float
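
The structure of a left-censored Weibull log-likelihood is sketched below: censored events contribute log F(thr), observed events contribute log f(x). The temperature dependence kappa(T) = kappa_0 + b*T and lambda(T) = lambda_0 * exp(a*T) is an assumption made for illustration. The function name is hypothetical, and the library function may return the negative of this value for use in a minimiser.

```python
import math


def left_censored_weibull_loglik(theta, t0, x1, t1, thr):
    """Log-likelihood with censored temperatures t0 and observed pairs (x1, t1)."""
    kappa_0, b, lambda_0, a = theta

    def params(t):
        # Assumed temperature dependence of shape and scale.
        return kappa_0 + b * t, lambda_0 * math.exp(a * t)

    loglik = 0.0
    for t in t0:  # censored: only x < thr is known
        shape, scale = params(t)
        loglik += math.log(1.0 - math.exp(-((thr / scale) ** shape)))
    for x, t in zip(x1, t1):  # observed: full Weibull log-density
        shape, scale = params(t)
        z = (x / scale) ** shape
        loglik += math.log(shape / scale) + (shape - 1.0) * math.log(x / scale) - z
    return loglik


# Toy inputs: shape 1 and scale 2 reduce the Weibull to an exponential.
value = left_censored_weibull_loglik([1.0, 0.0, 2.0, 0.0], [10.0], [3.0], [15.0], 1.0)
```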

pyTENAX.tenax.wbl_leftcensor_loglik_H0shape(theta, t0, x1, t1, thr)[source]

Compute log-likelihood for a left-censored Weibull with constant shape (H0).

Same as wbl_leftcensor_loglik but with b=0, i.e. shape does not depend on temperature. Used as the null hypothesis in the likelihood-ratio test.

Parameters:

theta (array-like) – Parameter vector [kappa_0, b, lambda_0, a] (b is ignored).
t0 (np.ndarray) – Temperature values for censored events (precipitation below thr).
x1 (np.ndarray) – Precipitation values at or above thr.
t1 (np.ndarray) – Temperature values corresponding to x1.
thr (float) – Left-censoring threshold.

Returns:

Log-likelihood value.

Return type:

float

pyTENAX.tenax.wbl_leftcensor_loglik_bset(theta, t0, x1, t1, thr, b_set)[source]

Compute log-likelihood for a left-censored Weibull with fixed b.

Same as wbl_leftcensor_loglik but b is fixed to b_set and not optimised.

Parameters:

theta (array-like) – Parameter vector [kappa_0, b, lambda_0, a] (b is overridden by b_set).
t0 (np.ndarray) – Temperature values for censored events (precipitation below thr).
x1 (np.ndarray) – Precipitation values at or above thr.
t1 (np.ndarray) – Temperature values corresponding to x1.
thr (float) – Left-censoring threshold.
b_set (float) – Fixed value of b (shape temperature-dependence parameter).

Returns:

Log-likelihood value.

Return type:

float

Plotting utils

pyTENAX.plotting.TNX_FIG_magn_model(P: ndarray, T: ndarray, F_phat: ndarray, thr: float, eT: ndarray, qs: list, obscol='r', valcol='b', xlimits: list = [-12, 30], ylimits: list = [0.1, 1000], b_exp=False) → None[source]

Plots Figure 2a of Marra et al. (2024), the observed T-P pairs and the W model percentiles.

Parameters:

P (np.ndarray) – Precipitation values.
T (np.ndarray) – Temperature values.
F_phat (np.ndarray) – Magnitude model parameters, F_phat = [kappa_0, b, lambda_0, a].
thr (float) – Precipitation threshold for left-censoring.
eT (np.ndarray) – x values to plot W model.
qs (list) – Percentiles to calculate W.
obscol (str, optional) – Color code to plot observations. Defaults to “r”.
valcol (str, optional) – Color code to plot model. Defaults to “b”.
xlimits (list, optional) – x limits of plot. Defaults to [-12, 30].
ylimits (list, optional) – y limits of plot. Defaults to [0.1, 1000].

pyTENAX.plotting.TNX_FIG_scaling(P, T, P_mc, T_mc, F_phat, niter_smev, eT, iTs, qs=[0.99], obscol='r', valcol='b', xlimits=[-15, 30], ylimits=[0.4, 1000])[source]

Plots Figure 5 of Marra et al. (2024).

Parameters:

P (numpy.ndarray) – precipitation values
T (numpy.ndarray) – temperature values
P_mc (numpy.ndarray) – Monte Carlo generated precipitation values.
T_mc (numpy.ndarray) – Monte Carlo generated temperature values.
F_phat (numpy.ndarray) – distribution values. F_phat = [kappa_0,b,lambda_0,a].
niter_smev (int) – Number of iterations used to estimate the uncertainty of the SMEV model.
eT (numpy.ndarray) – x (temperature) values to produce distribution for magnitude model.
iTs (numpy.ndarray) – x (temperature) values to produce distribution for quantile regression, binning, and TENAX.
qs (list) – percentiles to calculate W.
obscol (string, optional) – color code to plot observations. The default is ‘r’.
valcol (string, optional) – color code to plot model. The default is ‘b’.
xlimits (list, optional) – [min_x,max_x]. x limits to plot. The default is [-15,30].
ylimits (list, optional) – [min_y,max_y]. y limits to plot. The default is [0.4,1000].

Returns:

scaling_rate – scaling rate of

Return type:

float

pyTENAX.plotting.TNX_FIG_temp_model(T, g_phat, beta, eT, obscol='r', valcol='b', obslabel='Observations', vallabel='Temperature model g(T)', xlimits=[-15, 30], ylimits=[0, 0.06], method='norm')[source]

Plots the observed and modelled temperature PDFs.

Parameters:

T (numpy.ndarray) – Array of observed temperatures.
g_phat (numpy.ndarray) – [mu, sigma] of temperature distribution.
beta (float) – value of beta in generalised normal distribution.
eT (numpy.ndarray) – x (temperature) values to produce distribution.
obscol (string, optional) – color to plot observations. The default is ‘r’.
valcol (string, optional) – color to plot the temperature model. The default is ‘b’.
obslabel (string, optional) – Label for observations. The default is ‘Observations’.
vallabel (string, optional) – Label for model plot. The default is ‘Temperature model g(T)’.
xlimits (list, optional) – limits for the x axis [lower_x_limit, upper_x_limit]. The default is [-15,30].
ylimits (list, optional) – limits for the y axis [lower_y_limit, upper_y_limit]. The default is [0,0.06].

Returns:

hist (numpy.ndarray) – pdf values of observed distribution.
pdf_values (numpy.ndarray) – pdf values of fitted model.

pyTENAX.plotting.TNX_FIG_valid(AMS: DataFrame, RP: list, RL: ndarray, smev_RL: ndarray | list = [], RL_unc: ndarray | list = [], smev_RL_unc=0, TENAXcol='b', obscol_shape='g+', smev_colshape='--r', TENAXlabel='The TENAX model', obslabel='Observed annual maxima', smevlabel='The SMEV model', alpha=0.2, xlimits: list = [1, 200], ylimits: list = [0, 50]) → None[source]

Plots figure 4 of Marra et al. (2024).

Parameters:

AMS (pd.DataFrame) – Dataframe containing annual maxima.
RP (list) – Return periods to plot.
RL (np.ndarray) – Return levels calculated by TENAX.
smev_RL (Union[np.ndarray, list], optional) – Return levels calculated by SMEV. Defaults to [].
RL_unc (Union[np.ndarray, list], optional) – Uncertainty of return levels calculated by TENAX. Only relevant if smev_RL is provided. Defaults to [].
smev_RL_unc (int, optional) – Uncertainty of return levels calculated by SMEV. Only relevant if smev_RL is provided. Defaults to 0.
TENAXcol (str, optional) – Linestyle for TENAX data to use in plot. Defaults to “b”.
obscol_shape (str, optional) – Linestyle for annual maxima data to use in plot. Defaults to “g+”.
smev_colshape (str, optional) – Linestyle for SMEV data to use in plot. Defaults to “--r”.
TENAXlabel (str, optional) – Label for TENAX data to use in plot. Defaults to “The TENAX model”.
obslabel (str, optional) – Label for annual maxima observation data to use in plot. Defaults to “Observed annual maxima”.
smevlabel (str, optional) – Label for SMEV data to use in plot. Defaults to “The SMEV model”.
alpha (float, optional) – Transparency to use in plot. Defaults to 0.2.
xlimits (list, optional) – x limits of plot. Defaults to [1, 200].
ylimits (list, optional) – y limits of plot. Defaults to [0, 50].

pyTENAX.plotting.TNX_obs_scaling_rate(P, T, qs, niter)[source]

Calculate quantile regression parameters.

Parameters:

P (numpy.ndarray) – precipitation values
T (numpy.ndarray) – temperature values
qs (float) – percentile.

Returns:

qhat – [intercept, scaling rate].

Return type:

numpy.ndarray

Supporting SMEV class

The SMEV implementation in pyTENAX builds upon the concepts from the pysmev library.

class pyTENAX.smev.SMEV(return_period: list[int | float], durations: list[int], time_resolution: int, tolerance: float = 0.1, min_event_duration: int = 30, storm_separation_time: int = 24, left_censoring: list = [0, 1], min_rain: float | int = 0)[source]

Bases: object

do_smev_all(dict_ordinary: Dict[str, DataFrame], n: float) → Dict[str, dict][source]

Run SMEV parameter estimation and return level computation for all durations.

Parameters:

dict_ordinary (Dict[str, pd.DataFrame]) – Dictionary of ordinary events per duration, as returned by get_ordinary_events_values.
n (float) – Mean number of ordinary events per year.

Returns:

Keys are duration strings (e.g. "10"). Each value is a dict with keys 'SMEV_phat' (list[float] of length 2: [shape, scale]) and 'RLs' (float or np.ndarray of return levels, one per return period).

Return type:

Dict[str, dict]

estimate_smev_parameters(ordinary_events: ndarray | Series | list, data_portion: list[Tuple[int, float]]) → list[float][source]

Estimate shape and scale parameters of the Weibull distribution.

Parameters:

ordinary_events (np.ndarray or pd.Series or list) – Values of ordinary events.
data_portion (list) – Lower and upper limits of the probabilities of data to be used for the parameters estimation.

Returns:

Shape and scale parameters of the Weibull distribution.

Return type:

list[float]
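
One common way to estimate Weibull parameters from a censored portion of the data is least squares in Weibull-transformed coordinates, where log(-log(1 - p)) is linear in log(x). The sketch below illustrates that approach. Whether estimate_smev_parameters uses exactly this regression, these plotting positions, or this way of applying data_portion is not stated here, so all three are assumptions, as is the function name.

```python
import math


def fit_weibull_censored(ordinary_events, data_portion):
    """Least-squares Weibull fit using only ranks whose probability lies in data_portion."""
    x = sorted(ordinary_events)
    n = len(x)
    u, y = [], []
    for rank, value in enumerate(x, start=1):
        p = rank / (n + 1)  # assumed plotting position
        if data_portion[0] <= p <= data_portion[1]:
            u.append(math.log(value))
            y.append(math.log(-math.log(1.0 - p)))
    mu_u, mu_y = sum(u) / len(u), sum(y) / len(y)
    sxy = sum((ui - mu_u) * (yi - mu_y) for ui, yi in zip(u, y))
    sxx = sum((ui - mu_u) ** 2 for ui in u)
    shape = sxy / sxx                 # slope of the regression line
    intercept = mu_y - shape * mu_u
    scale = math.exp(-intercept / shape)
    return [shape, scale]


# Toy data lying exactly on a Weibull with shape 0.8 and scale 5.0.
n_events = 200
events = [5.0 * (-math.log(1.0 - i / (n_events + 1))) ** (1.0 / 0.8) for i in range(1, n_events + 1)]
shape, scale = fit_weibull_censored(events, [0.5, 1.0])
```

Because ranks are computed on the full sorted sample before filtering, the censored lower portion still informs the probabilities assigned to the retained events.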

get_ordinary_events(data: DataFrame | ndarray, dates: ndarray, name_col: str = 'value', check_gaps=True) → list[source]

Extract ordinary precipitation events from a time series.

Groups timesteps at or above self.min_rain into independent storm events separated by at least self.storm_separation_time hours. Optionally removes events too close to dataset boundaries or data gaps.

Parameters:

data (Union[pd.DataFrame, np.ndarray]) – Precipitation values.
dates (np.ndarray) – Timestamps of the precipitation data.
name_col (str, optional) – Column name to use when data is a DataFrame. Defaults to “value”.
check_gaps (bool, optional) – Remove events that fall within storm_separation_time of the dataset boundaries or internal data gaps. Defaults to True.

Returns:

List of np.ndarray, each containing the timestamps of one ordinary event (values >= self.min_rain separated by more than self.storm_separation_time hours).

Return type:

list

get_ordinary_events_values(data: ndarray, dates: ndarray, arr_dates_oe: ndarray, method: str = 'vectorized') → Tuple[Dict[str, DataFrame], Dict[str, DataFrame]][source]

Extract ordinary events and annual maxima from precipitation data.

Parameters:

data (np.ndarray) – Full precipitation time series.
dates (np.ndarray) – Timestamps of the full precipitation dataset.
arr_dates_oe (np.ndarray) – End and start times of ordinary events as returned by remove_short.
method (str, optional) –
Backend used for the sliding-window maximum search. Defaults to "vectorized". One of:
- "vectorized" — pure numpy, np.convolve per event.
- "njit" — numba JIT-compiled loop, single-threaded.
- "njit_parallel" — numba JIT-compiled loop, parallelised over events. Requires numba to be installed.

Notes

When using method="njit" or method="njit_parallel", the first call in a Python session triggers JIT compilation and can take several seconds. Run a warmup call before any timed or production code:

# Warmup — compile both kernels once at session start
S_SMEV.get_ordinary_events_values(
    data=df_arr, dates=df_dates, arr_dates_oe=arr_dates,
    method="njit"
)
S_SMEV.get_ordinary_events_values(
    data=df_arr, dates=df_dates, arr_dates_oe=arr_dates,
    method="njit_parallel"
)
# Subsequent calls are fast
dict_ordinary, dict_AMS = S_SMEV.get_ordinary_events_values(
    data=df_arr, dates=df_dates, arr_dates_oe=arr_dates,
    method="njit_parallel"
)

Returns:

dict_ordinary (dict) – Key is duration (str), value is a pd.DataFrame with columns year, oe_time, ordinary (event depth/intensity). Example: {"10": pd.DataFrame(columns=['year', 'oe_time', 'ordinary'])}.
dict_AMS (dict) – Key is duration (str), value is a pd.DataFrame with columns year and AMS (annual maximum value).

static get_stats(df: DataFrame) → Tuple[Series, Series, Series, Series][source]

Compute statistics of precipitation values.

Statistics are total precipitation per year, mean precipitation per year, standard deviation of precipitation per year, and count of precipitation events per year.

Parameters:

df (pd.DataFrame) – Dataframe with precipitation values.

Returns:

total_prec (pd.Series) – Total precipitation per year.
mean_prec (pd.Series) – Mean precipitation per year.
sd_prec (pd.Series) – Standard deviation of precipitation per year.
count_prec (pd.Series) – Count of precipitation events per year.

remove_incomplete_years(data_pr: DataFrame, name_col='value', nan_to_zero=True) → DataFrame[source]

Delete incomplete years in precipitation data.

An incomplete year is a year in which the share of missing observations exceeds a given threshold.

Parameters:

data_pr (pd.DataFrame) – Dataframe containing (hourly) precipitation values.
name_col (str, optional) – Column name in data_pr with precipitation values. Defaults to “value”.
nan_to_zero (bool, optional) – Set nan to zero. Defaults to True.

Returns:

Dataframe containing (hourly) precipitation values with incomplete years removed.

Return type:

pd.DataFrame

remove_short(list_ordinary: list) → Tuple[ndarray, ndarray, DataFrame][source]

Remove ordinary events that are too short.

Parameters:

list_ordinary (list) – List of ordinary events as returned by get_ordinary_events(). Each event may contain pd.Timestamp or np.datetime64 values.

Returns:

arr_vals (np.ndarray) – Boolean array (all True) of length equal to the number of kept events, one entry per event that passed the duration filter.
arr_dates (np.ndarray) – Array of (end, start) date tuples for each kept event.
n_ordinary_per_year (pd.DataFrame) – DataFrame with the count of ordinary events per year.

smev_bootstrap_uncertainty(P: ndarray, blocks_id: ndarray, niter: int, n: float)[source]

Bootstrap uncertainty of SMEV return values.

Parameters:

P (np.ndarray) – Array of precipitation data.
blocks_id (np.ndarray) – Array of block identifiers (e.g., years).
niter (int) – Number of bootstrap iterations.
n (float) – SMEV parameter n, the mean number of ordinary events per year.

Returns:

Array with bootstrapped return value uncertainty.

Return type:

np.ndarray

smev_return_values(return_period: int | float | list | ndarray, shape: float, scale: float, n: float) → float | ndarray[source]

Calculate return values (rainfall intensity) from Weibull parameters.

Parameters:

return_period (Union[int, float, list, np.ndarray]) – Return period(s) of interest. Scalar returns a float, array-like returns np.ndarray.
shape (float) – Shape parameter value.
scale (float) – Scale parameter value.
n (float) – SMEV parameter n, the mean number of ordinary events per year.

Returns:

Rainfall intensity value(s).

Return type:

Union[float, np.ndarray]
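
The return level follows from inverting the SMEV relation F_annual(x) = F_event(x)^n for a Weibull F_event. The sketch below writes out that inversion for a scalar return period. It is an illustration under that standard SMEV assumption, not the library source, and the function name and parameter values are hypothetical.

```python
import math


def smev_return_level_sketch(return_period, shape, scale, n):
    """Weibull quantile for the per-event probability implied by an annual return period."""
    p_annual = 1.0 - 1.0 / return_period  # annual non-exceedance probability
    p_event = p_annual ** (1.0 / n)       # per-event probability, from F_annual = F_event ** n
    return scale * (-math.log(1.0 - p_event)) ** (1.0 / shape)


# Toy parameters: 100-year level with 50 ordinary events per year on average.
rl_100 = smev_return_level_sketch(100, shape=0.8, scale=10.0, n=50)
rl_10 = smev_return_level_sketch(10, shape=0.8, scale=10.0, n=50)
```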

Weibull tail test

The original MATLAB code is available at https://zenodo.org/records/7234708. For now, users can refer to the file tests/test_weibull_test.py to see how to use this code.

Created on Wed Sep 10 15:53:57 2025

@author1: Yaniv yaniv.goldschmidt@unipd.it @author2: PetrVey

The test is described in: Marra F, W Amponsah, SM Papalexiou, 2023. Non-asymptotic Weibull tails explain the statistics of extreme daily precipitation. Adv. Water Resour., 173, 104388, https://doi.org/10.1016/j.advwatres.2023.104388


pyTENAX.wbl_tail_test.check_confidence_interval(annual_max_indexes, records_df, p_confidence, annual_max, censor_value, p_out_dicts_lst)[source]

Parameters:

annual_max_indexes (list) – Indexes in the record of the annual/block maxima.
records_df (pd.DataFrame) – DataFrame with all the synthetic records.
p_confidence (float) – Probability used for the test; the confidence interval is 1 - p_confidence.
annual_max (list) – Values over which the hypothesis is tested, i.e. the block maxima.
censor_value (float) – Threshold for left-censoring the record.
p_out_dicts_lst (list) – List of dicts, one per censor value tested, of the form {censor_value: p_out}.

Returns:

p_out_dicts_lst (list) – The same list as in the arguments, after appending the dict for the tested censor_value.

pyTENAX.wbl_tail_test.create_synthetic_records(seed_random: int, synthetic_records_amount: int, record_size: int, shape: float, scale: float) → DataFrame[source]

The synthetic records contain random ordinary events sampled uniformly from the Weibull distribution. These synthetic records are used as the basis for extracting the confidence interval.

Parameters:

seed_random (int) – Seed for the pseudorandom number generator, for reproducibility.
synthetic_records_amount (int) – Number of synthetic records to generate, i.e. the number of stochastic realizations.
record_size (int) – Number of ordinary events in the record.
shape (float) – Weibull shape parameter.
scale (float) – Weibull scale parameter.

Returns:

pd.DataFrame – DataFrame with all the synthetic records. Each row represents a separate synthetic record.

pyTENAX.wbl_tail_test.estimate_smev_param_without_AM(ordinary_events: ndarray | Series | list, censor_value, annual_max_indexes)[source]

Parameters:

ordinary_events (np.ndarray, pd.Series or list) – Values of ordinary events, without zeros.
censor_value (float) – Threshold for left-censoring the record.
annual_max_indexes (list) – Indexes in the record of the annual/block maxima. These must refer to the already sorted array.

Returns:

shape, scale (float) – Weibull distribution parameters.

pyTENAX.wbl_tail_test.find_optimal_threshold(p_out_dicts_lst, p_confidence)[source]

Parameters:

p_out_dicts_lst (list) – List of dicts, one per censor value tested, of the form {censor_value: p_out}.
p_confidence (float) – Probability used for the test; the confidence interval is 1 - p_confidence.

Returns:

optimal_threshold (float) – The minimal threshold from which p_out <= p_confidence for all larger thresholds. If all thresholds are rejected, 1 is returned; otherwise (1 - optimal_threshold) is the portion of the record that can be assumed to be Weibull distributed.
range_of_optimal (list) – List of thresholds where p_out < p_confidence.
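
The selection rule for the optimal threshold can be sketched as a scan from the highest threshold downwards that stops at the first rejection. This is one reading of the description above, not the library source; the function name and the toy p_out values are hypothetical.

```python
def find_optimal_threshold_sketch(p_out_dicts_lst, p_confidence):
    """Smallest threshold such that it and every larger threshold satisfy p_out <= p_confidence."""
    results = {}
    for entry in p_out_dicts_lst:  # each entry has the form {censor_value: p_out}
        results.update(entry)
    optimal = 1  # returned unchanged when even the highest threshold is rejected
    for threshold in sorted(results, reverse=True):
        if results[threshold] <= p_confidence:
            optimal = threshold
        else:
            break  # a rejected threshold ends the run of accepted ones
    return optimal


toy = [{0.0: 0.4}, {0.5: 0.3}, {0.8: 0.05}, {0.9: 0.02}]
print(find_optimal_threshold_sketch(toy, p_confidence=0.1))  # 0.8
```

Under this reading a rejection at the highest tested threshold yields 1 even if lower thresholds pass, so a result of 1 is worth checking against the individual p_out values.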

pyTENAX.wbl_tail_test.plot_curve(p_out_dicts_lst, p_confidence, optimal_threshold)[source]

Parameters:

p_out_dicts_lst (list) – List of dicts, one per censor value tested, of the form {censor_value: p_out}.
p_confidence (float) – Probability used for the test; the confidence interval is 1 - p_confidence.
optimal_threshold (float) – The optimal left-censoring threshold.

Returns:

None. Saves the figure to the monte_carlo/monte_carlo_output directory.

pyTENAX.wbl_tail_test.weibul_test_MC(ordinary_events_df: DataFrame, pr_field: str, hydro_year_field: str, seed_random: int = 42, synthetic_records_amount: int = 500, p_confidence: float = 0.1, make_plot: bool = True, censor_AM: bool = True, censor_values: ndarray | list = array([0., 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]))[source]

Return the optimal left-censoring threshold. If all thresholds are rejected, 1 is returned; otherwise (1 - optimal_threshold) is the portion of the record that can be assumed to be Weibull distributed.

Warning: If the returned optimal_threshold == 1.0, this does not necessarily mean that all thresholds failed. It may indicate that your tested threshold range extended too high (e.g., up to 0.95 or higher), where results are dominated by stochastic sampling noise of extremes. In such cases, you should check whether some lower thresholds worked. As a rule of thumb, thresholds above ~0.95 are usually unreliable, though the exact cutoff is case dependent.

Parameters:

ordinary_events_df (pd.DataFrame) – One-column pandas DataFrame of the ordinary events, without zeros.
pr_field (str) – Name of the column with the precipitation values.
hydro_year_field (str) – Name of the column with the hydrological-year / block values.
seed_random (int) – Seed for the pseudorandom number generator, for reproducibility.
synthetic_records_amount (int) – Number of synthetic records to generate, i.e. the number of stochastic realizations.
p_confidence (float) – Probability used for the test; the confidence interval is 1 - p_confidence.
make_plot (bool) – Whether to produce the plot.
censor_AM (bool) – Whether the annual maxima should be included in the ordinary events and the test.
censor_values (np.ndarray or list) – Censoring thresholds to test. By default 0 to 0.95 in steps of 0.05.

Returns:

optimal_threshold (Union[float, int]) – The optimal left-censoring threshold, or 1 if all thresholds are rejected, or 1111 if there is a problem with the Weibull parameter fit.
estimated_params (list) – Estimated Weibull parameters at the optimal threshold (None if optimal_threshold == 1).
range_of_optimal (list) – List of thresholds where p_out < p_confidence.
p_out_dicts_lst (list) – Fraction of block maxima outside of the Y = 1-p_out confidence interval.