torchuq.evaluate Subpackage
This section contains the Python API reference for the torchuq.evaluate subpackage, which provides code for evaluating and visualizing predictions.
torchuq.evaluate.distribution Module
- torchuq.evaluate.distribution.compute_crps(predictions, labels, reduction='mean', resolution=500)
Compute the continuous ranked probability score (CRPS).
The CRPS is a proper scoring rule that measures the quality of a distribution prediction against the observed label; lower values are better.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – array [batch_size] of labels.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins. Higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
The CRPS score. When reduction is ‘none’ the shape is [batch_size]; otherwise the shape is [].
- Return type
tensor with shape [batch_size] or []
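To make the role of resolution concrete, the following plain-Python sketch approximates the CRPS integral of (F(x) - 1[x >= label])^2 with a midpoint rule and compares it with the closed form for a Gaussian. It illustrates the quantity only and is not torchuq's implementation; the helper names (normal_cdf, crps_numeric, crps_normal_exact) and the integration range are assumptions made for this example.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a Gaussian, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def crps_numeric(cdf, label, lo, hi, resolution=500):
    """Midpoint-rule approximation of the CRPS integral on [lo, hi].

    The integrand is (F(x) - 1[x >= label])^2, evaluated at the centre
    of each of `resolution` equal-width bins.
    """
    width = (hi - lo) / resolution
    total = 0.0
    for i in range(resolution):
        x = lo + (i + 0.5) * width
        step = 1.0 if x >= label else 0.0
        total += (cdf(x) - step) ** 2 * width
    return total

def crps_normal_exact(label, mu=0.0, sigma=1.0):
    """Closed-form CRPS of a Gaussian prediction, used as a reference."""
    z = (label - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return sigma * (z * (2.0 * normal_cdf(z) - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Hypothetical single prediction N(0, 1) with label 0.3.
coarse = crps_numeric(normal_cdf, 0.3, -8.0, 8.0, resolution=50)
fine = crps_numeric(normal_cdf, 0.3, -8.0, 8.0, resolution=500)
exact = crps_normal_exact(0.3)
print(coarse, fine, exact)
```

The discretization error comes mostly from the bin that contains the label, which is why more bins generally give a closer estimate at the cost of more evaluations of the CDF.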
- torchuq.evaluate.distribution.compute_ece(predictions, labels, debiased=False)
Compute the (weighted) ECE score as in https://arxiv.org/abs/1807.00263.
Note that this function has biased gradients because of non-differentiable sorting.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – array [batch_size] of labels.
debiased (bool) – if True, the estimation bias is subtracted, so that the function returns 0 in expectation when the predictions are perfectly calibrated.
- Returns
The ECE score.
- Return type
tensor with shape []
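As a rough illustration of what a calibration error for distribution predictions measures, the sketch below evaluates each predicted CDF at its label, sorts the results, and compares them with the uniform levels that a perfectly calibrated forecaster would produce. This is one common way to summarize miscalibration; the weighting used by compute_ece and by the referenced paper may differ, and the names calibration_gap and normal_cdf are assumptions for this example.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def calibration_gap(cdf_values):
    """Mean absolute gap between the sorted values F_i(y_i) and the
    uniform levels (i + 0.5) / n expected under perfect calibration."""
    n = len(cdf_values)
    ordered = sorted(cdf_values)  # sorting is the non-differentiable step
    return sum(abs(v - (i + 0.5) / n) for i, v in enumerate(ordered)) / n

# Hypothetical batch: every prediction is N(0, 1), labels are spread out.
labels = [-1.2, -0.4, 0.1, 0.5, 1.3]
cdf_at_labels = [normal_cdf(y, 0.0, 1.0) for y in labels]
print(calibration_gap(cdf_at_labels))

# CDF values that all equal 0.5 are far from uniform.
print(calibration_gap([0.5, 0.5, 0.5, 0.5]))  # 0.25
```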
- torchuq.evaluate.distribution.compute_mean(predictions, reduction='mean', resolution=500)
Compute the mean of the distribution predictions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
The computed mean. When reduction is ‘none’ the shape is [batch_size]; otherwise the shape is [].
- Return type
tensor with shape [batch_size] or []
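The reduction options that recur throughout this reference can be pictured with a small plain-Python helper. This is a sketch of the semantics only, operating on lists rather than tensors; reduce_values is a hypothetical name, and details such as how the median of an even-sized batch is defined may differ in torchuq.

```python
import statistics

def reduce_values(values, reduction="mean"):
    """Aggregate per-example values; 'none' returns them unchanged."""
    if reduction == "none":
        return list(values)
    reducers = {
        "mean": statistics.fmean,
        "sum": sum,
        "median": statistics.median,
        "min": min,
        "max": max,
    }
    if reduction not in reducers:
        raise ValueError(f"unknown reduction: {reduction!r}")
    return reducers[reduction](values)

per_example = [1.0, 4.0, 2.0]
print(reduce_values(per_example, "none"))    # [1.0, 4.0, 2.0], one value per example
print(reduce_values(per_example, "mean"))    # a single scalar
print(reduce_values(per_example, "median"))  # 2.0
print(reduce_values(per_example, "max"))     # 4.0
```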
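- torchuq.evaluate.distribution.compute_mean_std(predictions, reduction='mean', resolution=500)
Compute both the mean and the standard deviation of the distribution predictions. The results are the same as those of compute_mean and compute_std, but computing them together in one function is more efficient.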
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
The mean and the standard deviation. When reduction is ‘none’ the shape is [batch_size], otherwise the shape is [].
- Return type
tuple of two tensors with shape [batch_size] or []
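One plausible way to obtain both moments from a single discretization is to evaluate the inverse CDF at the midpoints of equal-probability bins and reuse those values for the mean and the standard deviation. The sketch below does this for one Gaussian using the standard library; it is an assumption about how such an estimate could work, not a description of torchuq's internals, and mean_std_from_icdf is a hypothetical helper.

```python
import math
from statistics import NormalDist

def mean_std_from_icdf(icdf, resolution=500):
    """Estimate mean and standard deviation by evaluating the inverse CDF
    at the midpoints of `resolution` equal-probability bins."""
    levels = [(i + 0.5) / resolution for i in range(resolution)]
    samples = [icdf(p) for p in levels]          # shared by both moments
    mean = sum(samples) / resolution
    var = sum((s - mean) ** 2 for s in samples) / resolution
    return mean, math.sqrt(var)

prediction = NormalDist(mu=2.0, sigma=3.0)       # stand-in for one prediction
print(mean_std_from_icdf(prediction.inv_cdf, resolution=50))
print(mean_std_from_icdf(prediction.inv_cdf, resolution=500))
```

Because the extreme tails fall inside the first and last bins, a coarse grid tends to understate the spread, which is the accuracy versus memory/compute trade-off that resolution controls.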
- torchuq.evaluate.distribution.compute_std(predictions, reduction='mean', resolution=500)
Compute the standard deviation of the distribution predictions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
The standard deviation. When reduction is ‘none’ the shape is [batch_size]; otherwise the shape is [].
- Return type
tensor with shape [batch_size] or []
- torchuq.evaluate.distribution.plot_cdf(predictions, labels=None, ax=None, max_count=30, resolution=200)
Plot the CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor with shape [batch_size]) – the true labels. If None the true labels are not plotted.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of CDFs to plot.
resolution (int) – the number of points at which each CDF is evaluated. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
matplotlib.axes.Axes
- torchuq.evaluate.distribution.plot_cdf_sequence(predictions, labels=None, ax=None, max_count=20, resolution=200)
Plot the CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor with shape [batch_size]) – the true labels. If None the true labels are not plotted.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of CDFs to plot.
resolution (int) – the number of points at which each CDF is evaluated. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
matplotlib.axes.Axes
- torchuq.evaluate.distribution.plot_density_sequence(predictions, labels=None, max_count=100, ax=None, resolution=100, smooth_bw=0)
Plot the PDF of the predictions and the labels.
For aesthetics, the PDFs are reflected along the y axis to make a symmetric violin-shaped plot.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor with shape [batch_size]) – the true labels. If None the true labels are not plotted.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of PDFs to plot.
resolution (int) – the number of points to compute the density. Higher resolution leads to a more accurate plot, but also requires more computation.
smooth_bw (int) – smooth the PDF with a uniform kernel whose bandwidth is smooth_bw / resolution.
- Returns
the ax on which the plot is made.
- Return type
matplotlib.axes.Axes
- torchuq.evaluate.distribution.plot_icdf(predictions, labels=None, ax=None, max_count=30, resolution=200)
Plot the inverse CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor with shape [batch_size]) – the true labels. If None the true labels are not plotted.
ax (axes) – optional matplotlib.axes.Axes, the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of inverse CDFs to plot.
resolution (int) – the number of points at which each inverse CDF is evaluated. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
matplotlib.axes.Axes
- torchuq.evaluate.distribution.plot_reliability_diagram(predictions, labels, ax=None)
Plot the reliability diagram as in https://arxiv.org/abs/1807.00263.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor with shape [batch_size]) – the true labels.
ax (axes) – optional matplotlib.axes.Axes, the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made.
- Return type
matplotlib.axes.Axes
torchuq.evaluate.interval Module
- torchuq.evaluate.interval.compute_coverage(predictions, labels, reduction='mean')
Compute the empirical coverage. This function is not differentiable.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the coverage, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.interval.compute_length(predictions, reduction='mean')
Compute the average length of an interval prediction.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the interval length, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
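The coverage and length of interval predictions can be sketched on plain lists as follows. Each interval is a (lower, upper) pair, mirroring the [batch_size, 2] layout described above. Treating both endpoints as inclusive is an assumption of this example, and the function names are hypothetical.

```python
def interval_coverage(intervals, labels):
    """Per-example indicator: 1.0 if lower <= label <= upper, else 0.0."""
    return [1.0 if lo <= y <= hi else 0.0 for (lo, hi), y in zip(intervals, labels)]

def interval_length(intervals):
    """Per-example length, upper minus lower."""
    return [hi - lo for lo, hi in intervals]

# Hypothetical batch of four interval predictions and their labels.
intervals = [(0.0, 1.0), (2.0, 5.0), (-1.0, 0.5), (3.0, 3.5)]
labels = [0.5, 6.0, 0.0, 3.25]

covered = interval_coverage(intervals, labels)
lengths = interval_length(intervals)
print(sum(covered) / len(covered))   # 0.75, three of four labels are covered
print(sum(lengths) / len(lengths))   # 1.5, the average interval length
```

The indicator is a step function of the interval endpoints, which is why empirical coverage provides no useful gradient.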
- torchuq.evaluate.interval.plot_interval_sequence(predictions, labels=None, ax=None, max_count=100)
Plot the interval predictions as a sequence, together with the labels when they are provided.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of intervals to plot.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.interval.plot_length_cdf(predictions, ax=None, plot_median=True)
Plot the CDF of interval length.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
plot_median (bool) – if True, plot the median interval length.
- Returns
the ax on which the plot is made.
- Return type
axes
torchuq.evaluate.point Module
- torchuq.evaluate.point.compute_huber_loss(predictions, labels, reduction='mean', delta=None)
Compute the Huber loss.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
delta (float) – the delta parameter for the Huber loss. If None, it is set automatically based on the top 20% largest absolute errors.
- Returns
the Huber loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
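For reference, the textbook Huber loss with an explicit delta can be written as below. This sketch uses the common definition (quadratic within delta, linear beyond it); torchuq's scaling and its automatic choice of delta are not reproduced here, and huber_loss is a hypothetical helper.

```python
def huber_loss(prediction, label, delta):
    """Quadratic for errors up to `delta`, linear beyond it."""
    error = abs(prediction - label)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)

print(huber_loss(1.5, 1.0, delta=1.0))   # 0.125, small error: quadratic branch
print(huber_loss(4.0, 1.0, delta=1.0))   # 2.5, large error: linear branch
```

The linear branch is what makes the loss less sensitive to outliers than the L2 loss.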
- torchuq.evaluate.point.compute_l2_loss(predictions, labels, reduction='mean')
Compute the L2 loss.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the L2 loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.point.compute_pinball_loss(predictions, labels, alpha=0.5, reduction='mean')
Compute the pinball loss for the alpha-th quantile.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
alpha (float) – the quantile to compute the pinball loss for.
- Returns
the pinball loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
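The pinball loss for a single prediction can be sketched as follows, using the usual convention in which under-prediction is weighted by alpha and over-prediction by 1 - alpha. Whether torchuq applies exactly this sign convention or scaling is an assumption; pinball_loss here is a hypothetical plain-Python helper.

```python
def pinball_loss(prediction, label, alpha=0.5):
    """Pinball (quantile) loss of one point prediction for level `alpha`."""
    diff = label - prediction
    return max(alpha * diff, (alpha - 1.0) * diff)

print(pinball_loss(1.0, 3.0, alpha=0.9))  # 1.8, under-prediction is penalized heavily
print(pinball_loss(1.0, 0.0, alpha=0.9))  # about 0.1, over-prediction is penalized lightly
print(pinball_loss(1.0, 3.0, alpha=0.5))  # 1.0, half the absolute error
```

With alpha=0.5 the loss is half the absolute error, so minimizing it targets the median.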
- torchuq.evaluate.point.plot_conditional_bias(predictions, labels, ax=None, knn=None, conditioning='label')
Make the conditional bias diagram as described in [TODO: add paper reference].
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
knn (int) – the number of nearest neighbors to average over. If None, knn is set automatically.
conditioning (str) – can be ‘label’ or ‘prediction’.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.point.plot_scatter(predictions, labels, ax=None)
Plot the scatter plot between the point predictions and the labels.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made.
- Return type
axes
torchuq.evaluate.quantile Module
- torchuq.evaluate.quantile.compute_pinball_loss(predictions, labels, reduction='mean')
Compute the pinball loss, which is a proper scoring rule for quantile predictions.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the pinball loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.quantile.plot_quantile_calibration(predictions, labels, ax=None)
Plot the reliability diagram for quantiles.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made.
- Return type
axes
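The quantity behind a quantile reliability diagram can be sketched as the fraction of labels that fall at or below each predicted quantile, compared with the nominal level. In this plain-Python example the quantile levels are passed explicitly, which is an assumption made for clarity; how levels are encoded in the tensor layouts described above is not reproduced, and quantile_calibration is a hypothetical helper.

```python
def quantile_calibration(quantile_preds, levels, labels):
    """For each level, the fraction of labels at or below the predicted quantile."""
    n = len(labels)
    freqs = []
    for j in range(len(levels)):
        below = sum(1 for preds, y in zip(quantile_preds, labels) if y <= preds[j])
        freqs.append(below / n)
    return freqs

# Hypothetical batch: four examples, each with predicted 25%, 50% and 75% quantiles.
levels = [0.25, 0.5, 0.75]
quantile_preds = [
    [-1.0, 0.0, 1.0],
    [0.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
]
labels = [-1.5, 0.8, 2.5, 4.5]

# Calibrated quantiles give empirical frequencies close to the nominal levels.
for level, freq in zip(levels, quantile_calibration(quantile_preds, levels, labels)):
    print(level, freq)
```

Plotting the empirical frequencies against the nominal levels gives a curve that lies on the diagonal when the quantiles are calibrated.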
- torchuq.evaluate.quantile.plot_quantile_sequence(predictions, labels=None, ax=None, max_count=100)
Plot the quantile predictions as a sequence, together with the labels when they are provided.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of quantiles to plot.
- Returns
the ax on which the plot is made.
- Return type
axes