opda.utils module
Utilities.
- opda.utils.sort_by_first(*args)
Return the arrays sorted by the first array.
- Parameters:
- *args : arrays, required
The arrays to sort.
- Returns:
- arrays
The arrays sorted by the first array. Thus, the first array will be sorted and the other arrays will have their elements permuted the same way as the elements from the first array.
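The permutation behavior described above can be illustrated with a small pure-Python sketch. The name sort_by_first_sketch is hypothetical and the code works on plain lists rather than arrays, so it only mirrors the documented behavior and is not the library's implementation.

```python
def sort_by_first_sketch(*args):
    """Reorder every sequence by the permutation that sorts the first one.

    Hypothetical stand-in that mirrors the documented behavior of
    opda.utils.sort_by_first using plain Python lists.
    """
    first = args[0]
    # Indices that would sort the first sequence.
    order = sorted(range(len(first)), key=first.__getitem__)
    # Apply the same permutation to every sequence.
    return tuple([seq[i] for i in order] for seq in args)


scores, names = sort_by_first_sketch([3, 1, 2], ["c", "a", "b"])
print(scores, names)
```

Here the first list comes back sorted and the second list is permuted the same way, so each name stays paired with its score.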
- opda.utils.dkw_epsilon(n, confidence)
Return epsilon from the Dvoretzky-Kiefer-Wolfowitz inequality.
The Dvoretzky-Kiefer-Wolfowitz inequality states that a confidence band for the CDF is given by the empirical CDF plus or minus:
\[\epsilon = \sqrt{\frac{\log \frac{2}{\alpha}}{2n}}\]
where \(1 - \alpha\) is the coverage.
- Parameters:
- n : positive int, required
The number of samples.
- confidence : float from 0 to 1 inclusive, required
The desired confidence or coverage.
- Returns:
- non-negative float
The epsilon for the Dvoretzky-Kiefer-Wolfowitz inequality.
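As a worked instance of the formula above, the following sketch evaluates epsilon directly with the standard library. The function name dkw_epsilon_sketch is hypothetical, and returning infinity when the confidence is exactly 1 is an assumption made here, not documented behavior.

```python
import math


def dkw_epsilon_sketch(n, confidence):
    """Evaluate epsilon = sqrt(log(2 / alpha) / (2 n)) with alpha = 1 - confidence.

    Hypothetical re-implementation of the formula in the docstring.
    """
    if n < 1:
        raise ValueError("n must be a positive integer.")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1 inclusive.")
    alpha = 1.0 - confidence
    if alpha == 0.0:
        # Assumption: full coverage needs an infinitely wide band.
        return math.inf
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


eps = dkw_epsilon_sketch(100, 0.95)
print(eps)
```

With 100 samples and 95% confidence, epsilon is about 0.136, so the band extends roughly 0.136 above and below the empirical CDF.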
- opda.utils.beta_equal_tailed_interval(a, b, coverage)
Return an interval containing coverage of the probability.
For the beta distribution with parameters a and b, return the equal-tailed interval that contains coverage of the probability mass.
- Parameters:
- a : finite positive float or array of floats, required
The alpha parameter for the beta distribution.
- b : finite positive float or array of floats, required
The beta parameter for the beta distribution.
- coverage : float or array of floats from 0 to 1 inclusive, required
The desired coverage for the returned intervals.
- Returns:
- pair of floats or arrays of floats from 0 to 1 inclusive
A pair of floats or arrays of floats with the shape determined by broadcasting a, b, and coverage together. The first returned value gives the lower bound and the second the upper bound for the equal-tailed intervals.
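An equal-tailed interval places (1 - coverage) / 2 of the probability mass in each tail, so its bounds are two quantiles of the beta distribution. The sketch below illustrates this with the standard library only. To avoid special functions it assumes positive integer a and b, for which the beta CDF reduces to a binomial sum. All function names are hypothetical and this is not the library's implementation.

```python
import math


def beta_cdf_int(a, b, x):
    """CDF of Beta(a, b) at x, valid only for positive integer a and b."""
    n = a + b - 1
    return sum(math.comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))


def beta_ppf_int(a, b, q, tol=1e-12):
    """Quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(a, b, mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def equal_tailed_interval_sketch(a, b, coverage):
    """Interval leaving (1 - coverage) / 2 of the mass in each tail."""
    tail = (1 - coverage) / 2
    return beta_ppf_int(a, b, tail), beta_ppf_int(a, b, 1 - tail)


lower, upper = equal_tailed_interval_sketch(2, 1, 0.5)
print(lower, upper)
```

For Beta(2, 1) the CDF is x squared, so the bounds at 50% coverage are the square roots of 0.25 and 0.75.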
- opda.utils.beta_highest_density_interval(a, b, coverage, *, atol=1e-10)
Return an interval containing coverage of the probability.
For the beta distribution with parameters a and b, return the shortest interval that contains coverage of the probability mass. Note that the highest density interval only exists if at least one of a or b is greater than 1.
- Parameters:
- a : finite positive float or array of floats, required
The alpha parameter for the beta distribution.
- b : finite positive float or array of floats, required
The beta parameter for the beta distribution.
- coverage : float or array of floats from 0 to 1 inclusive, required
The desired coverage for the returned intervals.
- atol : non-negative float, optional
The absolute tolerance to use for stopping the iteration.
- Returns:
- pair of floats or arrays of floats from 0 to 1 inclusive
A pair of floats or arrays of floats with the shape determined by broadcasting a, b, and coverage together. The first returned value gives the lower bound and the second the upper bound for the intervals.
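One way to picture the shortest interval is to slide the lower-tail mass p from 0 to 1 - coverage and keep the choice that minimizes the interval's width. The sketch below does this with a ternary search, which is a substitute technique chosen for illustration and not necessarily the iteration that atol controls in the library. It assumes positive integer a and b so that the CDF can be computed with the standard library, and all function names are hypothetical.

```python
import math


def beta_cdf_int(a, b, x):
    """CDF of Beta(a, b) at x, valid only for positive integer a and b."""
    n = a + b - 1
    return sum(math.comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))


def beta_ppf_int(a, b, q, tol=1e-12):
    """Quantile of Beta(a, b), found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf_int(a, b, mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def highest_density_interval_sketch(a, b, coverage, n_iter=60):
    """Shortest interval holding `coverage` mass, via ternary search."""

    def bounds(p):
        # Interval with mass p below it and mass `coverage` inside it.
        return beta_ppf_int(a, b, p), beta_ppf_int(a, b, min(1.0, p + coverage))

    def width(p):
        lower, upper = bounds(p)
        return upper - lower

    # Search over the lower-tail mass p in [0, 1 - coverage].
    lo, hi = 0.0, 1.0 - coverage
    for _ in range(n_iter):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if width(m1) < width(m2):
            hi = m2
        else:
            lo = m1
    return bounds((lo + hi) / 2)


lower, upper = highest_density_interval_sketch(2, 1, 0.75)
print(lower, upper)
```

For Beta(2, 1) the density increases toward 1, so the shortest interval with 75% coverage runs from about 0.5 up to 1.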
- opda.utils.beta_equal_tailed_coverage(a, b, x)
Return the coverage of the smallest interval containing x.
For the beta distribution with parameters a and b, return the coverage of the smallest equal-tailed interval containing x. See the related function: beta_equal_tailed_interval().
- Parameters:
- a : finite positive float or array of floats, required
The alpha parameter for the beta distribution.
- b : finite positive float or array of floats, required
The beta parameter for the beta distribution.
- x : float or array of floats from 0 to 1 inclusive, required
The points defining the minimal equal-tailed intervals whose coverage to return.
- Returns:
- float or array of floats from 0 to 1 inclusive
A float or array of floats with shape determined by broadcasting a, b, and x together. The values represent the coverage of the minimal equal-tailed interval containing the corresponding value from x.
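An equal-tailed interval with coverage c spans the quantiles (1 - c) / 2 and (1 + c) / 2, so it contains x exactly when c is at least |2 F(x) - 1|, where F is the beta CDF. The sketch below evaluates that expression using only the standard library. It assumes positive integer a and b, and the function names are hypothetical rather than part of the library.

```python
import math


def beta_cdf_int(a, b, x):
    """CDF of Beta(a, b) at x, valid only for positive integer a and b."""
    n = a + b - 1
    return sum(math.comb(n, j) * x**j * (1 - x)**(n - j) for j in range(a, n + 1))


def equal_tailed_coverage_sketch(a, b, x):
    """Coverage of the smallest equal-tailed interval that contains x."""
    return abs(2 * beta_cdf_int(a, b, x) - 1)


cov = equal_tailed_coverage_sketch(2, 1, 0.5)
print(cov)
```

For Beta(2, 1), F(0.5) = 0.25, so the smallest equal-tailed interval containing 0.5 has coverage 0.5. A point at the median has coverage 0.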
- opda.utils.beta_highest_density_coverage(a, b, x, *, atol=1e-10)
Return the coverage of the smallest interval containing x.
For the beta distribution with parameters a and b, return the coverage of the smallest highest density interval containing x. Note that the highest density interval only exists if at least one of a or b is greater than 1. See the related function: beta_highest_density_interval().
- Parameters:
- a : finite positive float or array of floats, required
The alpha parameter for the beta distribution.
- b : finite positive float or array of floats, required
The beta parameter for the beta distribution.
- x : float or array of floats from 0 to 1 inclusive, required
The points defining the minimal intervals whose coverage to return.
- atol : non-negative float, optional
The absolute tolerance to use for stopping the iteration.
- Returns:
- float or array of floats from 0 to 1 inclusive
A float or array of floats with shape determined by broadcasting a, b, and x together. The values represent the coverage of the minimal highest density interval containing the corresponding value from x.
- opda.utils.binomial_confidence_interval(n_successes, n_total, confidence)
Return a confidence interval for the binomial distribution.
Given n_successes out of n_total, return an equal-tailed Clopper-Pearson confidence interval with coverage confidence.
- Parameters:
- n_successes : non-negative int or array of ints, required
An int or array of ints with each entry denoting the number of successes in a sample. Must be broadcastable with n_total.
- n_total : positive int or array of ints, required
An int or array of ints with each entry denoting the total number of observations in a sample. Must be broadcastable with n_successes.
- confidence : float or array of floats from 0 to 1 inclusive, required
A float or array of floats between zero and one denoting the desired confidence for each confidence interval. Must be broadcastable with n_successes broadcasted with n_total.
- Returns:
- pair of floats or arrays of floats from 0 to 1 inclusive
A possibly scalar array of floats representing the lower confidence bounds and a possibly scalar array of floats representing the upper confidence bounds.
Notes
The Clopper-Pearson interval [1] does not account for the binomial distribution’s discreteness. This lack of correction causes Clopper-Pearson intervals to be conservative. In addition, this function implements an equal-tailed version of the Clopper-Pearson interval, which can be very conservative when the number of successes is zero or equal to the total number of observations.
References
[1] Clopper, C. and Pearson, E. S., “The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial” (1934). Biometrika. 26 (4): 404-413. doi:10.1093/biomet/26.4.404.
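The equal-tailed Clopper-Pearson bounds can be illustrated by inverting the binomial tail probabilities directly, with each tail allotted (1 - confidence) / 2. The sketch below is scalar and uses only the standard library. It solves for the bounds by bisection, which is an assumption made for this example and not necessarily how the library computes them, and all function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, tol=1e-12):
    """Find p in [lo, hi] with f(p) = 0, assuming f is decreasing in p."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson_sketch(n_successes, n_total, confidence):
    """Equal-tailed Clopper-Pearson interval for a single sample."""
    k, n = n_successes, n_total
    tail = (1 - confidence) / 2
    # Lower bound: the p at which P(X >= k) equals the tail mass.
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: tail - (1 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P(X <= k) equals the tail mass.
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - tail
    )
    return lower, upper


lower, upper = clopper_pearson_sketch(0, 10, 0.95)
print(lower, upper)
```

With zero successes the lower bound is 0 and the upper bound solves (1 - p)^n = tail. The upper tail still receives only half of 1 - confidence even though the lower tail cannot be used, which is the conservativeness described in the Notes.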
- opda.utils.normal_pdf(xs)
Evaluate the PDF of the standard normal distribution.
- Parameters:
- xs : float or array of floats, required
The points at which to evaluate the standard normal distribution’s probability density function.
- Returns:
- non-negative float or array of floats
The standard normal distribution’s probability density function evaluated at xs.
- opda.utils.normal_cdf(xs)
Evaluate the CDF of the standard normal distribution.
- Parameters:
- xs : float or array of floats, required
The points at which to evaluate the standard normal distribution’s cumulative distribution function.
- Returns:
- float or array of floats from 0 to 1 inclusive
The standard normal distribution’s cumulative distribution function evaluated at xs.
- opda.utils.normal_ppf(qs)
Evaluate the PPF of the standard normal distribution.
- Parameters:
- qs : float or array of floats from 0 to 1 inclusive, required
The points at which to evaluate the standard normal distribution’s quantile function.
- Returns:
- float or array of floats
The standard normal distribution’s quantile function evaluated at qs.
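For scalar inputs, the three quantities computed by normal_pdf, normal_cdf, and normal_ppf can be cross-checked against Python's standard library. The sketch below uses statistics.NormalDist as a stand-in and does not call opda. Note that NormalDist.inv_cdf only accepts probabilities strictly between 0 and 1, whereas the functions above document the endpoints as valid inputs.

```python
import math
from statistics import NormalDist

# Standard normal distribution: mean 0 and standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

pdf_at_zero = std_normal.pdf(0.0)    # density at the mode, 1 / sqrt(2 pi)
cdf_at_zero = std_normal.cdf(0.0)    # half of the mass lies below the mean
q975 = std_normal.inv_cdf(0.975)     # quantile used for two-sided 95% intervals

# The quantile function inverts the CDF.
round_trip = std_normal.cdf(std_normal.inv_cdf(0.3))

print(pdf_at_zero, cdf_at_zero, q975, round_trip)
```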