Tutorial 03: TENAX Hindcast evaluation

This tutorial guides you through a hindcast evaluation of TENAX, in which the temperature model is scaled by changes in the mean (μ) and the standard deviation (σ) of temperature during precipitation events.

The same method can be used to project changes in extreme sub-hourly precipitation under a future, warmer climate. In that case it relies solely on climate model projections of temperature during wet days and on anticipated changes in precipitation frequency.

Import all libraries needed for running TENAX and SMEV

[1]:
from importlib.resources import files
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import chi2
# Import pyTENAX
from pyTENAX import tenax, plotting

Let's initialize the TENAX class with the following setup.

[2]:
# Initiate TENAX class with customized setup
S = tenax.TENAX(
    return_period=[
        2,
        5,
        10,
        20,
        50,
        100,
        200,
    ],
    durations=[10, 60, 180, 360, 720, 1440],  # durations in minutes; values are rainfall depths within each duration
    time_resolution=10,  # time resolution in minutes
    left_censoring=[0, 0.90],  # left-censoring thresholds (quantiles)
    alpha=0.05,  # significance level used to test the dependence of the shape parameter on T
    min_rain=0.1,  # minimum rainfall depth threshold
)

Once again, we start with the same data.

[3]:
# Load precipitation data
# Build the path to the bundled test file
file_path_input = files('pyTENAX.res').joinpath('prec_data_Aadorf.parquet')
# Load data from the parquet file
data = pd.read_parquet(file_path_input)
# Convert 'prec_time' column to datetime, if it's not already
data["prec_time"] = pd.to_datetime(data["prec_time"])
# Set 'prec_time' as the index
data.set_index("prec_time", inplace=True)
name_col = "prec_values"  # name of column containing data to extract

# load temperature data
file_path_temperature = files('pyTENAX.res').joinpath('temp_data_Aadorf.parquet')
t_data = pd.read_parquet(file_path_temperature)
# Convert 'temp_time' column to datetime if it's not already in datetime format
t_data["temp_time"] = pd.to_datetime(t_data["temp_time"])
# Set 'temp_time' as the index
t_data.set_index("temp_time", inplace=True)
temp_name_col = "temp_values"

Repeat the preprocessing to get the ordinary event values and the corresponding temperatures.

Once again, we focus only on the 10-minute rainfall depth.

[4]:
data = S.remove_incomplete_years(data, name_col)

# get data from pandas to numpy array
df_arr = np.array(data[name_col])
df_dates = np.array(data.index)
df_arr_t_data = np.array(t_data[temp_name_col])
df_dates_t_data = np.array(t_data.index)

# extract indices of ordinary events
# these are time-wise indices => returns a list of np arrays with np.timeindex
idx_ordinary = S.get_ordinary_events(data=df_arr,
                                     dates=df_dates,
                                     name_col=name_col,
                                     check_gaps=False)

# get ordinary events by removing events that are too short
# returns boolean array, dates of OE in TO, FROM format, and count of OE in each year
arr_vals, arr_dates, n_ordinary_per_year = S.remove_short(idx_ordinary)

# assign ordinary event values for the given durations; values are depths per duration, NOT intensities in mm/h
dict_ordinary, dict_AMS = S.get_ordinary_events_values(data=df_arr,
                                                       dates=df_dates,
                                                       arr_dates_oe=arr_dates)

dict_ordinary, _, n_ordinary_per_year = S.associate_vars(dict_ordinary,
                                                         df_arr_t_data,
                                                         df_dates_t_data)

# Your data (P, T and year arrays) for the 10-minute duration
P = dict_ordinary["10"]["ordinary"].to_numpy()  # ordinary event depths; replace with your own data
T = dict_ordinary["10"]["T"].to_numpy()  # associated temperatures; replace with your own data
blocks_id = dict_ordinary["10"]["year"].to_numpy()  # year of each event; replace with your own data
# Left-censoring threshold, taken as a quantile of the ordinary events
thr = dict_ordinary["10"]["ordinary"].quantile(S.left_censoring[1])
# Extract annual maxima
AMS = dict_AMS["10"]

# For plotting
# we must create range of temperature
eT = np.arange(
    np.min(T)-4, np.max(T) + 4, 1
)  # define T values to calculate distributions. +4 to go beyond graph end


Hindcast evaluation

We evaluated the ability of the TENAX model to project precipitation return levels under increased temperatures through a hindcast, by splitting the 38-year record of the climate station into two 19-year periods.
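
The split point can be checked with a few lines of arithmetic. The sketch below assumes a continuous record covering every year from 1981 to 2018 (the years named in the headings of this tutorial). It reproduces only the index calculation, not the pyTENAX preprocessing.

```python
import numpy as np

# Assumed record: every year from 1981 to 2018 (38 years)
yrs_unique = np.arange(1981, 2019)

# ceil(n / 2) is a count of years; subtract 1 to turn it into a zero-based index
midway = yrs_unique[int(np.ceil(yrs_unique.size / 2)) - 1]

# Years up to and including `midway` form the first period
first_period = yrs_unique[yrs_unique <= midway]
second_period = yrs_unique[yrs_unique > midway]

print(midway, first_period.size, second_period.size)
```

With an odd number of years, this rule assigns the extra year to the first period.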

[5]:
yrs = dict_ordinary["10"]["oe_time"].dt.year
yrs_unique = np.unique(yrs)
midway = yrs_unique[
    int(np.ceil(np.size(yrs_unique) / 2)) - 1
]  # -1 because ceil(n / 2) is a count of years, while indexing is zero-based

# DEFINE FIRST PERIOD
P1 = P[yrs <= midway]
T1 = T[yrs <= midway]
AMS1 = AMS[AMS["year"] <= midway]
n_ordinary_per_year1 = n_ordinary_per_year[n_ordinary_per_year.index <= midway]
n1 = n_ordinary_per_year1.sum() / len(n_ordinary_per_year1)

# DEFINE SECOND PERIOD
P2 = P[yrs > midway]
T2 = T[yrs > midway]
AMS2 = AMS[AMS["year"] > midway]
n_ordinary_per_year2 = n_ordinary_per_year[n_ordinary_per_year.index > midway]
n2 = n_ordinary_per_year2.sum() / len(n_ordinary_per_year2)

Comparing Temperature models in two periods (1981-1999 & 2000-2018)

[6]:
g_phat1 = S.temperature_model(T1) #returns mu and sigma
g_phat2 = S.temperature_model(T2) #returns mu and sigma

_, _ = plotting.TNX_FIG_temp_model(
    T=T1,
    g_phat=g_phat1,
    beta=4,
    eT=eT,
    obscol="b",
    valcol="b",
    obslabel=None,
    vallabel="Temperature model " + str(yrs_unique[0]) + "-" + str(midway),
)
_, _ = plotting.TNX_FIG_temp_model(
    T=T2,
    g_phat=g_phat2,
    beta=4,
    eT=eT,
    obscol="r",
    valcol="r",
    obslabel=None,
    vallabel="Temperature model " + str(midway + 1) + "-" + str(yrs_unique[-1]),
)  # model fitted to the temperatures of the second period
../_images/tutorials_Tutorial_03_12_0.png

Predicted Temperature Model: Based on Changes in μ (Mean) and σ (Standard Deviation)

Increases in the mean temperature (μ) and/or the standard deviation (σ) imply a higher probability of precipitation events occurring at higher temperatures.

  • mu_delta represents the change in mean temperature during precipitation events.

  • sigma_factor represents the scaling factor applied to the standard deviation of temperature during precipitation events.

The predicted temperature model is computed by:

  • Adding mu_delta to the original mean (μ), and

  • Multiplying the original standard deviation (σ) by sigma_factor.

Mathematical Formulation

Let:

  • μ = original mean temperature of first model (1981-1999)

  • σ = original standard deviation of first model (1981-1999)

  • μ’ = predicted mean temperature

  • σ’ = predicted standard deviation

  • Δμ = mu_delta

  • σ_factor = sigma_factor

Then:

\[\mu' = \mu + \Delta\mu\]
\[\sigma' = \sigma \times \sigma_{\text{factor}}\]
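
To see why a larger μ or σ raises the chance of events at high temperatures, the shift can be tried on a toy distribution. This is a minimal sketch, assuming the temperature model is a generalized normal distribution with shape β = 4 (the `beta=4` passed to the plotting function above) and that σ plays the role of SciPy's `scale` parameter. The parameter values and the 20 °C level are illustrative, not fitted to the Aadorf data.

```python
from scipy.stats import gennorm

# Illustrative values only (assumptions, not fitted parameters)
mu, sigma, beta = 9.5, 12.0, 4
mu_delta, sigma_factor = 0.5, 1.05

# Apply the two formulas above
mu_pred = mu + mu_delta
sigma_pred = sigma * sigma_factor

# Probability that an event occurs above an arbitrary warm temperature
t_hot = 20.0
p_before = gennorm.sf(t_hot, beta, loc=mu, scale=sigma)
p_after = gennorm.sf(t_hot, beta, loc=mu_pred, scale=sigma_pred)

print(f"P(T > {t_hot}) before: {p_before:.3f}, after: {p_after:.3f}")
```

Both changes move probability mass toward the warm tail, so the exceedance probability is larger after the shift.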
[7]:
mu_delta = np.mean(T2) - np.mean(T1)
sigma_factor = np.std(T2) / np.std(T1)
print(f"Mean temperature has changed by {mu_delta}")
print(f"Standard deviation has changed by factor of {sigma_factor}")

# Create a predicted temperature model
g_phat2_predict = [g_phat1[0] + mu_delta, g_phat1[1] * sigma_factor]

# Compare with the temperature model of the second period created before
print(f"Temperature model of second period {g_phat2}")
print(f"Predicted temperature model {g_phat2_predict}")
Mean temperature has changed by 0.48157014197678905
Standard deviation has changed by factor of 1.057753334197546
Temperature model of second period [10.04413169 12.64035242]
Predicted temperature model [10.045000131438464, 12.725393549960536]

Create TENAX Magnitude models of two periods (1981-1999 & 2000-2018)

A magnitude model \(W(x; T)\) was fitted independently for each time period. To assess the similarity of the models, a likelihood ratio test was applied.

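In the general theory of the likelihood ratio test, \(L(\theta)\) is the likelihood function, \(\Theta\) is the full parameter space, and the null hypothesis \(H_0\) restricts \(\theta\) to a subset \(\Theta_0 \subset \Theta\). Here, \(H_0\) states that a single magnitude model describes both periods, while the alternative allows a separate model for each period.

The test statistic is defined as:

\[\lambda_{LR} = -2 \ln \left( \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)} \right)\]

Under certain regularity conditions, this statistic asymptotically follows a chi-squared distribution with degrees of freedom equal to the difference in the number of free parameters between the two hypotheses, so it can be used to determine whether the models fitted for the different periods are significantly different.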
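
The mechanics of the test can be illustrated on a toy problem before applying it to the magnitude model. The sketch below is an assumption-laden stand-in: it uses synthetic normal samples and a normal model instead of the TENAX magnitude model, and compares one pooled fit against two separate fits.

```python
import numpy as np
from scipy.stats import chi2, norm

# Synthetic "periods" drawn from the same distribution (illustrative data)
rng = np.random.default_rng(0)
x1 = rng.normal(10.0, 2.0, size=200)
x2 = rng.normal(10.0, 2.0, size=200)


def normal_loglik(x):
    """Maximized log-likelihood of a normal model fitted to x."""
    # x.mean() and x.std() (ddof=0) are the maximum likelihood estimates
    return norm.logpdf(x, loc=x.mean(), scale=x.std()).sum()


# Null hypothesis: one model for both samples (2 parameters)
loglik_pooled = normal_loglik(np.concatenate([x1, x2]))
# Alternative: one model per sample (2 + 2 parameters)
loglik_split = normal_loglik(x1) + normal_loglik(x2)

lambda_lr = -2 * (loglik_pooled - loglik_split)
dof = 2  # difference in the number of free parameters
pval = chi2.sf(lambda_lr, dof)

print(f"LR statistic = {lambda_lr:.3f}, p = {pval:.3f}")
```

The statistic cannot be negative, because the separate fits can always match or exceed the likelihood of the pooled fit. A large p-value means there is no evidence that the two samples need different models.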

[8]:
# Sampling intervals for the Monte Carlo simulation, based on the original temperature T
Ts = np.arange(
    np.min(T) - S.temp_delta, np.max(T) + S.temp_delta, S.temp_res_monte_carlo
)

# Magnitude model of the original data, which contains both periods (1981-2018)
F_phat, loglik, _, _ = S.magnitude_model(P, T, thr)

# Magnitude model of the first period (1981-1999)
F_phat1, loglik1, _, _ = S.magnitude_model(P1, T1, thr)
RL1, _, _ = S.model_inversion(F_phat1, g_phat1, n1, Ts)

# Magnitude model of the second period (2000-2018)
F_phat2, loglik2, _, _ = S.magnitude_model(P2, T2, thr)
RL2, _, _ = S.model_inversion(F_phat2, g_phat2, n2, Ts)
[9]:
if F_phat[1] == 0:  # check if the b parameter (temperature dependence of the shape) is 0
    dof = 3
    alpha1 = 1  # b parameter is not significantly different from 0; 3 degrees of freedom for the LR test
else:
    dof = 4
    alpha1 = 0  # b parameter is significantly different from 0; 4 degrees of freedom for the LR test
[10]:
# check whether the magnitude model is the same in both periods
lambda_LR = -2 * (loglik - (loglik1 + loglik2))
pval = chi2.sf(lambda_LR, dof)
if pval > S.alpha:
    print(f"p={pval}. Magnitude models are not different at {S.alpha*100}% significance.")
else:
    print(f"p={pval}. Magnitude models are different at {S.alpha*100}% significance.")

p=0.6803400017045984. Magnitude models are not different at 5.0% significance.

Return levels for the second period are estimated by combining the projected change in the temperature model with the magnitude model of the first period.

We compare these return levels to the annual maxima of the second period. Note: here we do not scale n (the average number of events per year), although this can be done simply by computing the ratio of the event counts in the two periods and multiplying n1 by this factor.
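
The optional scaling of n can be sketched as follows. The yearly counts below are hypothetical and the names `n_factor` and `n2_predict` are introduced only for illustration. In the tutorial, the corresponding inputs would be `n1` and `n2` computed from the two periods.

```python
import pandas as pd

# Hypothetical numbers of ordinary events per year in each period
n_per_year1 = pd.Series([60, 55, 62, 58], index=[1996, 1997, 1998, 1999])
n_per_year2 = pd.Series([66, 61, 64, 69], index=[2000, 2001, 2002, 2003])

# Average number of events per year in each period
n1 = n_per_year1.mean()
n2 = n_per_year2.mean()

# Ratio of event frequencies between the two periods
n_factor = n2 / n1

# Scaled number of events per year for the predicted model
n2_predict = n1 * n_factor

print(f"n1 = {n1:.2f}, factor = {n_factor:.3f}, predicted n = {n2_predict:.2f}")
```

In a hindcast the factor comes from the observations, so the scaled value reproduces n2 by construction. For a future projection, the factor would instead have to come from the anticipated change in precipitation frequency.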

[11]:
RL2_predict, _, _ = S.model_inversion(F_phat1, g_phat2_predict, n1, Ts)
[12]:
# Plot the results
plotting.TNX_FIG_valid(
    AMS1,
    S.return_period,
    RL1,
    TENAXcol="b",
    obscol_shape="b+",
    TENAXlabel="The TENAX model " + str(yrs_unique[0]) + "-" + str(midway),
    obslabel="Observed annual maxima " + str(yrs_unique[0]) + "-" + str(midway),
)
plotting.TNX_FIG_valid(
    AMS2,
    S.return_period,
    RL2_predict,
    TENAXcol="r",
    obscol_shape="r+",
    TENAXlabel="The predicted TENAX model "
    + str(midway + 1)
    + "-"
    + str(yrs_unique[-1]),
    obslabel="Observed annual maxima " + str(midway + 1) + "-" + str(yrs_unique[-1]),
)
plt.xticks(S.return_period)
plt.gca().set_xticks(S.return_period)  # This sets the actual tick marks on log scale
plt.gca().get_xaxis().set_major_formatter(plt.ScalarFormatter())  # Optional: shows ticks as plain numbers
plt.legend(loc="upper center", bbox_to_anchor=(0.5, -0.2))
plt.grid(True, which='both', axis='both', linestyle='--', color='lightgray', alpha=0.7)
plt.show()
../_images/tutorials_Tutorial_03_21_0.png