torchuq.evaluate Subpackage
This section contains the Python API reference for the torchuq.evaluate subpackage, containing code for evaluating and visualizing predictions.
torchuq.evaluate.distribution Module
- torchuq.evaluate.distribution.compute_crps(predictions, labels, reduction='mean', resolution=500)
Compute the CRPS score.
The continuous ranked probability score (CRPS) is a proper scoring rule that measures the quality of a distribution prediction against the observed label.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – array [batch_size] of labels.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
the CRPS score, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
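To make the role of the resolution argument concrete, the following is a minimal plain-Python sketch of a discretized CRPS, checked against the known closed form for a Gaussian prediction. It is an illustration under stated assumptions, not torchuq's implementation: the helper names, the integration range, and the midpoint rule are all hypothetical choices made for this example.

```python
import math

def gaussian_cdf(x, mu, sigma):
    """CDF of the normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def crps_numeric(cdf, label, lo, hi, resolution=500):
    """Approximate CRPS = integral of (F(x) - 1[x >= label])^2 dx over [lo, hi].

    Uses a midpoint rule with `resolution` bins (hypothetical helper).
    """
    width = (hi - lo) / resolution
    total = 0.0
    for i in range(resolution):
        x = lo + (i + 0.5) * width
        step = 1.0 if x >= label else 0.0
        total += (cdf(x) - step) ** 2 * width
    return total

def crps_gaussian_closed_form(label, mu, sigma):
    """Closed-form CRPS of a Gaussian prediction, used here as a reference."""
    z = (label - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

mu, sigma, label = 0.0, 1.0, 0.5
approx = crps_numeric(lambda x: gaussian_cdf(x, mu, sigma), label, -10.0, 10.0, resolution=2000)
exact = crps_gaussian_closed_form(label, mu, sigma)
```

Raising resolution narrows each bin, which shrinks the gap between approx and exact at the cost of more evaluations of the CDF.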
- torchuq.evaluate.distribution.compute_ece(predictions, labels, debiased=False)
Compute the (weighted) ECE score as in https://arxiv.org/abs/1807.00263.
Note that this function has a biased gradient because of the non-differentiable nature of sorting.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – array [batch_size] of labels.
debiased (bool) – if debiased=True then the finite sample bias is removed. If the labels are truly drawn from the predictions, then this function will in expectation return 0.
- Returns
the ECE score, which is a scalar array (array of shape []).
- Return type
tensor
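The idea behind a calibration error for distribution predictions can be sketched in plain Python: evaluate each predicted CDF at its label, sort the resulting values, and compare them with the uniform levels they should follow if the predictions are calibrated. This is a simplified, unweighted illustration; the function name and the exact formula are assumptions for this example and do not reproduce torchuq's weighted or debiased computation.

```python
from statistics import NormalDist

def pit_calibration_error(cdfs, labels):
    """Mean absolute gap between sorted CDF values F_i(y_i) and uniform levels.

    Hypothetical helper: under perfect calibration the sorted values
    should lie close to (i + 0.5) / n.
    """
    pit = sorted(cdf(y) for cdf, y in zip(cdfs, labels))
    n = len(pit)
    return sum(abs(p - (i + 0.5) / n) for i, p in enumerate(pit)) / n

n = 200
standard = NormalDist(mu=0.0, sigma=1.0)
# Labels placed exactly at evenly spaced quantiles of the standard normal.
labels = [standard.inv_cdf((i + 0.5) / n) for i in range(n)]

# Predictions that match the label distribution.
calibrated_error = pit_calibration_error([standard.cdf] * n, labels)

# Predictions that are too narrow (overconfident).
narrow = NormalDist(mu=0.0, sigma=0.5)
overconfident_error = pit_calibration_error([narrow.cdf] * n, labels)
```

The sort in this sketch is the same kind of operation that makes the gradient of a sorting-based score biased, as noted in the description above.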
- torchuq.evaluate.distribution.compute_mean(predictions, reduction='mean', resolution=500)
Compute the mean of the predictions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
the computed mean, array [batch_size] or array [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.distribution.compute_mean_std(predictions, reduction='mean', resolution=500)
Same as compute_mean and compute_std, but combines into one function for better efficiency.
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
the computed mean and the computed standard deviation, each an array [batch_size] or array [] depending on the reduction.
- Return type
tensor
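One way to see why the mean and standard deviation can share work is to discretize a distribution once and derive both statistics from the same points. The sketch below does this in plain Python by evaluating an inverse CDF at evenly spaced levels. It is an assumption-laden illustration: the helper name and the midpoint discretization are hypothetical and not taken from torchuq's source.

```python
import math
from statistics import NormalDist

def mean_std_from_icdf(icdf, resolution=500):
    """Estimate mean and standard deviation from an inverse CDF.

    Hypothetical helper: evaluates the inverse CDF at the midpoints of
    `resolution` equal-probability bins and reuses those points for both
    statistics.
    """
    levels = [(i + 0.5) / resolution for i in range(resolution)]
    points = [icdf(p) for p in levels]
    mean = sum(points) / resolution
    variance = sum((x - mean) ** 2 for x in points) / resolution
    return mean, math.sqrt(variance)

prediction = NormalDist(mu=1.0, sigma=2.0)
mean, std = mean_std_from_icdf(prediction.inv_cdf, resolution=500)
```

Because the extreme tails are represented by a single point per bin, the estimated standard deviation falls slightly below the true value; a higher resolution reduces that gap.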
- torchuq.evaluate.distribution.compute_std(predictions, reduction='mean', resolution=500)
Compute the standard deviation of the predictions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
resolution (int) – the number of discretization bins, where higher resolution increases estimation accuracy but also requires more memory/compute.
- Returns
the computed standard deviation, array [batch_size] or array [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.distribution.plot_cdf(predictions, labels=None, ax=None, max_count=30, resolution=200)
Plot the CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – the labels.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of CDFs to plot.
resolution (int) – the number of points to compute the density. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.distribution.plot_cdf_sequence(predictions, labels=None, ax=None, max_count=20, resolution=200)
Plot the CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – the labels, an array [batch_size]. If not provided, no labels are plotted.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of CDFs to plot.
resolution (int) – the number of points to compute the density. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.distribution.plot_density_sequence(predictions, labels=None, max_count=100, ax=None, resolution=100, smooth_bw=0)
Plot the PDF of the predictions and the labels.
For aesthetics the PDFs are reflected along y axis to make a symmetric violin shaped plot.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – the labels. If None, the labels are not plotted.
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of PDFs to plot.
resolution (int) – the number of points to compute the density. Higher resolution leads to a more accurate plot, but also requires more computation.
smooth_bw (int) – smooth the PDF with a uniform kernel whose bandwidth is smooth_bw / resolution.
- Returns
the ax on which the plot is made, an instance of matplotlib.axes.Axes.
- Return type
axes
- torchuq.evaluate.distribution.plot_icdf(predictions, labels=None, ax=None, max_count=30, resolution=200)
Plot the inverse CDF functions.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – the labels, an array [batch_size].
ax (axes) – optional matplotlib.axes.Axes, the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of inverse CDFs to plot.
resolution (int) – the number of points to compute the density. Higher resolution leads to a more accurate plot, but also requires more computation.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.distribution.plot_reliability_diagram(predictions, labels, ax=None)
Plot the reliability diagram https://arxiv.org/abs/1807.00263.
- Parameters
predictions (distribution) – a batch of distribution predictions.
labels (tensor) – the labels, array [batch_size].
ax (axes) – optional matplotlib.axes.Axes, the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made, an instance of matplotlib.axes.Axes.
- Return type
axes
torchuq.evaluate.interval Module
- torchuq.evaluate.interval.compute_coverage(predictions, labels, reduction='mean')
Compute the empirical coverage. This function is not differentiable.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the coverage, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
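Empirical coverage is the fraction of labels that fall inside their predicted interval. A minimal plain-Python sketch follows; the function name is hypothetical, and treating both endpoints as inclusive is an assumption of this example rather than documented behavior.

```python
def coverage(intervals, labels):
    """Fraction of labels y with lower <= y <= upper (endpoints assumed inclusive)."""
    hits = [1.0 if lower <= y <= upper else 0.0
            for (lower, upper), y in zip(intervals, labels)]
    return sum(hits) / len(hits)

# Four interval predictions as (lower, upper) pairs, mirroring shape [batch_size, 2].
intervals = [(0.0, 1.0), (2.0, 3.0), (-1.0, 0.5), (4.0, 6.0)]
labels = [0.5, 3.5, 0.0, 5.0]
empirical_coverage = coverage(intervals, labels)  # 3 of 4 labels are covered
```

The per-example indicator is a step function of the interval endpoints, which is why a quantity of this kind is not differentiable.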
- torchuq.evaluate.interval.compute_length(predictions, reduction='mean')
Compute the average length of an interval prediction.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the interval length, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
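The reduction argument recurs throughout this subpackage. The sketch below shows, in plain Python, how the six documented options could aggregate per-example interval lengths. The helper names are hypothetical, and the code is an illustration of the documented options rather than torchuq's implementation.

```python
import statistics

def apply_reduction(values, reduction="mean"):
    """Aggregate per-example values according to `reduction` (hypothetical helper)."""
    if reduction == "none":
        return list(values)
    reducers = {
        "mean": statistics.mean,
        "sum": sum,
        "median": statistics.median,
        "min": min,
        "max": max,
    }
    return reducers[reduction](values)

def interval_length(intervals, reduction="mean"):
    """Length (upper - lower) of each interval, then aggregated."""
    return apply_reduction([upper - lower for lower, upper in intervals], reduction)

intervals = [(0.0, 1.0), (2.0, 5.0), (1.0, 3.0)]
per_example = interval_length(intervals, reduction="none")
average = interval_length(intervals, reduction="mean")
```

With reduction set to 'none' the result keeps one value per example, matching the [batch_size] shape described above; every other option collapses the batch to a single value.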
- torchuq.evaluate.interval.plot_interval_sequence(predictions, labels=None, ax=None, max_count=100)
Plot the interval predictions and the labels.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of intervals to plot.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.interval.plot_length_cdf(predictions, ax=None, plot_median=True)
Plot the CDF of interval length.
- Parameters
predictions (tensor) – a batch of interval predictions, which is an array [batch_size, 2].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
plot_median (bool) – if True, plot the median interval length.
- Returns
the ax on which the plot is made.
- Return type
axes
torchuq.evaluate.point Module
- torchuq.evaluate.point.compute_huber_loss(predictions, labels, reduction='mean', delta=None)
Compute the Huber loss.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
delta (float) – the delta parameter for the Huber loss. If None, it is set automatically as the top 20% largest absolute error.
- Returns
the Huber loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
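For reference, the standard textbook form of the Huber loss is quadratic for small errors and linear for large ones, with delta marking the switch. The plain-Python sketch below uses that standard form; whether torchuq applies the same scaling is an assumption, and the automatic choice of delta is not reproduced here.

```python
def huber_loss(prediction, label, delta):
    """Standard Huber loss for a single prediction (hypothetical helper).

    Quadratic when |error| <= delta, linear beyond that, and continuous
    at the switch point.
    """
    error = abs(prediction - label)
    if error <= delta:
        return 0.5 * error ** 2
    return delta * (error - 0.5 * delta)

small_error_loss = huber_loss(0.5, 0.0, delta=1.0)  # quadratic branch
large_error_loss = huber_loss(3.0, 0.0, delta=1.0)  # linear branch
```

The linear branch is what makes the loss less sensitive to outliers than the L2 loss.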
- torchuq.evaluate.point.compute_l2_loss(predictions, labels, reduction='mean')
Compute the L2 loss.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the L2 loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.point.compute_pinball_loss(predictions, labels, alpha=0.5, reduction='mean')
Compute the pinball loss for the alpha-th quantile.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
alpha (float) – the quantile to compute the pinball loss for.
- Returns
the pinball loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
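The pinball loss penalizes under-prediction and over-prediction asymmetrically, with the asymmetry set by alpha. A minimal plain-Python sketch of the standard definition follows; the function name is hypothetical and the code is illustrative, not torchuq's implementation.

```python
def pinball_loss(prediction, label, alpha=0.5):
    """Standard pinball (quantile) loss for a single prediction.

    Under-prediction (label above prediction) is weighted by alpha,
    over-prediction by (1 - alpha).
    """
    diff = label - prediction
    if diff >= 0:
        return alpha * diff
    return (alpha - 1.0) * diff

under = pinball_loss(0.0, 1.0, alpha=0.9)   # prediction too low
over = pinball_loss(1.0, 0.0, alpha=0.9)    # prediction too high
median_case = pinball_loss(0.0, 2.0, alpha=0.5)
```

With alpha=0.9 an under-prediction costs nine times as much as an over-prediction of the same size, which pushes the minimizer toward the 0.9 quantile. With alpha=0.5 the loss is half the absolute error.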
- torchuq.evaluate.point.plot_conditional_bias(predictions, labels, ax=None, knn=None, conditioning='label')
Make the conditional bias diagram as described in [TODO: add paper reference].
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
knn (int) – the number of nearest neighbors to average over. If None, knn is set automatically.
conditioning (str) – can be ‘label’ or ‘prediction’.
- Returns
the ax on which the plot is made.
- Return type
axes
- torchuq.evaluate.point.plot_scatter(predictions, labels, ax=None)
Plot the scatter plot between the point predictions and the labels.
- Parameters
predictions (tensor) – a batch of point predictions.
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made.
- Return type
axes
torchuq.evaluate.quantile Module
- torchuq.evaluate.quantile.compute_pinball_loss(predictions, labels, reduction='mean')
Compute the pinball loss, which is a proper scoring rule for quantile predictions.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
reduction (str) – the method to aggregate the results across the batch. Can be ‘none’, ‘mean’, ‘sum’, ‘median’, ‘min’, or ‘max’.
- Returns
the pinball loss, an array with shape [batch_size] or shape [] depending on the reduction.
- Return type
tensor
- torchuq.evaluate.quantile.plot_quantile_calibration(predictions, labels, ax=None)
Plot the reliability diagram for quantiles.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
- Returns
the ax on which the plot is made.
- Return type
axes
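The quantity behind a quantile reliability diagram can be sketched in plain Python: for each quantile level, count the fraction of labels at or below the predicted quantile and compare it with the nominal level. In this sketch the levels are passed explicitly as a hypothetical argument, since how torchuq associates levels with the columns of the prediction array is not stated here.

```python
def quantile_calibration(quantile_predictions, labels, levels):
    """Observed frequency of labels at or below each predicted quantile.

    Hypothetical helper: `quantile_predictions` has one row per example and
    one column per entry of `levels`.
    """
    n = len(labels)
    observed = []
    for j in range(len(levels)):
        below = sum(1 for row, y in zip(quantile_predictions, labels) if y <= row[j])
        observed.append(below / n)
    return observed

levels = [0.25, 0.5, 0.75]
# The same three quantile predictions for each of four examples.
quantile_predictions = [[-0.674, 0.0, 0.674]] * 4
labels = [-1.0, -0.3, 0.3, 1.0]
observed = quantile_calibration(quantile_predictions, labels, levels)
```

Plotting observed against levels gives the reliability curve; calibrated predictions lie on the diagonal.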
- torchuq.evaluate.quantile.plot_quantile_sequence(predictions, labels=None, ax=None, max_count=100)
Plot the quantile predictions and the labels.
- Parameters
predictions (tensor) – a batch of quantile predictions, which is an array with shape [batch_size, n_quantiles] or [batch_size, 2, n_quantiles].
labels (tensor) – the labels, an array of shape [batch_size].
ax (axes) – the axes to plot the figure on. If None, automatically creates a figure with recommended size.
max_count (int) – the maximum number of quantiles to plot.
- Returns
the ax on which the plot is made.
- Return type
axes