pypdevsbbl.extra.rngstreams.distributions module

Random number generators (RNGs) do not necessarily apply only to the interval (0, 1).

For this reason, this module also provides numerous distributions from which random streams can be obtained. It is based on the simple proof that:

\[\Pr[X \leq x] = \Pr[F^{-1}(U) \leq x] = \Pr[U \leq F(x)] = F(x)\]

we know that all we need is an inverse cumulative distribution function \(F^{-1}\) for each distribution we want to offer.

In this module, multiple inverse cumulative distribution functions are implemented in such a way that you can use any RNG in the range (0, 1) to generate the data. The module is therefore completely independent of any built-in RNG implementation.
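
As a minimal sketch of this idea (not the module's own code), the snippet below feeds a hand-written linear congruential generator into a hypothetical `inverse_exponential` function and checks the identity above empirically. The names `lcg` and `inverse_exponential` and the generator constants are assumptions made for illustration; any source of uniform values in (0, 1) should work the same way.

```python
import math

def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Hypothetical toy RNG: yields uniform values strictly inside (0, 1)."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield (state + 0.5) / m  # the 0.5 offset keeps the value away from 0 and 1

def inverse_exponential(y, rate):
    """Inverse cdf of the exponential distribution: -ln(1 - y) / rate."""
    return -math.log(1.0 - y) / rate

stream = lcg(seed=42)
samples = [inverse_exponential(next(stream), rate=2.0) for _ in range(20000)]

# Empirical check of Pr[F^{-1}(U) <= x] = F(x) at x = 0.5
x = 0.5
empirical = sum(s <= x for s in samples) / len(samples)
theoretical = 1.0 - math.exp(-2.0 * x)
```

The two values `empirical` and `theoretical` should agree up to sampling noise.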

Besides Wikipedia, I refer to the following link when it comes to the actual distributions: http://www2.isikun.edu.tr/mustafahekimoglu/simulation/Lecture5-ProbabilityReview.pdf

Also, all functions return NumPy types.

pypdevsbbl.extra.rngstreams.distributions.beta(y, a, b)[source]

Inverse of the beta cdf on the interval [0, 1] with shapes a and b.

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The first shape parameter.
  • b (numeric) – The second shape parameter.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.binomial(y, n, p)[source]

Inverse of the discrete binomial cdf with n experiments that have p chance of success.

Since the Bernoulli distribution is a special case of the binomial distribution, you can also use this function for those use cases.

\[Bernoulli(y, p) = binomial(y, 1, p)\]
Parameters:
  • y (numeric) – The y-value of the function.
  • n (numeric) – The number of experiments.
  • p (numeric) – The chance for success in an experiment.
Returns:

A numpy.float64
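
For a discrete distribution, the inverse cdf can be sketched as a search over cumulative probabilities. The following is an illustrative stand-in, not the module's implementation: `inverse_binomial` and `inverse_bernoulli` are hypothetical names, and returning the smallest k whose cumulative probability is at least y is an assumed convention.

```python
import math

def inverse_binomial(y, n, p):
    """Smallest k in 0..n whose cumulative probability is >= y."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        if cumulative >= y:
            return k
    return n  # guard against floating-point round-off near y = 1

def inverse_bernoulli(y, p):
    """Bernoulli as the special case n = 1."""
    return inverse_binomial(y, 1, p)
```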

pypdevsbbl.extra.rngstreams.distributions.cauchy(y, x0=0, scale=1)[source]

The inverse of the Cauchy cdf.

This distribution is also known as the Lorentz distribution or the Breit-Wigner distribution. It is often used as the canonical example of a pathological distribution.

Parameters:
  • y (numeric) – The y-value of the function.
  • x0 (numeric) – The location of the distribution. Defaults to 0.
  • scale (numeric) – The scale of the distribution. Defaults to 1.
Returns:

A numpy.float64
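
The Cauchy quantile function has a closed form, which the sketch below writes out with the standard library only. The name `inverse_cauchy` is hypothetical and the code is an illustration of the formula, not necessarily how the module computes it.

```python
import math

def inverse_cauchy(y, x0=0.0, scale=1.0):
    """Closed-form quantile: x0 + scale * tan(pi * (y - 1/2)), for 0 < y < 1."""
    return x0 + scale * math.tan(math.pi * (y - 0.5))
```

The median (y = 0.5) equals the location x0, and the quartiles lie one scale unit on either side of it.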

pypdevsbbl.extra.rngstreams.distributions.chiSquare(y, k)[source]

Inverse of the chi-square cdf with k degrees of freedom.

This distribution occurs most often in inferential statistics.

Technically, this is a special case of the gamma distribution with shape = k / 2 and rate = 1 / 2 (i.e. scale = 2).
Parameters:
  • y (numeric) – The y-value of the function.
  • k (numeric) – The degrees of freedom to use.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.exponential(y, l)[source]

The inverse of the exponential cdf.

This distribution is often used to predict the wait time until the first event. The mean of this distribution is 1/l.

The exponential distribution is “memoryless”. In other words: Pr[X > s+t | X > s] = Pr[X > t]

Parameters:
  • y (numeric) – The y-value of the function.
  • l (numeric) – The rate parameter (lambda), i.e. mean number of occurrences per time unit.
Returns:

A numpy.float64
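
The memoryless property stated above can be checked numerically. The sketch below uses hypothetical helper names (`inverse_exponential`, `survival`) and arbitrarily chosen values for l, s and t; it illustrates the property and is not taken from the module.

```python
import math

def inverse_exponential(y, l):
    """Closed-form quantile of the exponential distribution with rate l."""
    return -math.log(1.0 - y) / l

def survival(x, l):
    """Pr[X > x] for the exponential distribution with rate l."""
    return math.exp(-l * x)

l, s, t = 1.5, 0.7, 1.2
conditional = survival(s + t, l) / survival(s, l)  # Pr[X > s+t | X > s]
unconditional = survival(t, l)                     # Pr[X > t]
median = inverse_exponential(0.5, l)               # equals ln(2) / l
```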

pypdevsbbl.extra.rngstreams.distributions.F(y, d1, d2)[source]

Inverse of the Fisher-Snedecor cdf with d1 and d2 degrees of freedom.

Parameters:
  • y (numeric) – The y-value of the function.
  • d1 (numeric) – The first (numerator) degrees of freedom.
  • d2 (numeric) – The second (denominator) degrees of freedom.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.frechet(y, alpha, m=0, s=1)[source]

The inverse of the Fréchet cdf.

This distribution is also known as the Generalized Extreme Value Type B distribution or the inverse Weibull distribution. It is commonly used in hydrology.

Parameters:
  • y (numeric) – The y-value of the function.
  • alpha (numeric) – The shape of the distribution.
  • m (numeric) – The location of the distribution. Defaults to 0.
  • s (numeric) – The scale of the distribution. Defaults to 1.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.gamma(y, a, b)[source]

Inverse of the gamma cdf with shape a and rate b.

This distribution is used to predict the wait time until the a-th event.

This formula is based upon the shape-rate parametrization, so if you want to use the shape-scale parametrization, use the relation scale = 1/b.

Seeing as the Erlang distribution is a special case of the gamma distribution (i.e. the shape a is an integer), you can also use this function for those purposes.

This function makes use of the _gamma_cache dictionary for efficiency.

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The shape of the curve.
  • b (numeric) – The rate parameter.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.geometric(y, p)[source]

Inverse of the discrete geometric cdf with p chance for success.

\(x\) represents the number of failures.

Parameters:
  • y (numeric) – The y-value of the function.
  • p (numeric) – The chance for success in an experiment.
Returns:

A numpy.float64
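
Assuming the "number of failures" convention, where the cdf is 1 - (1 - p)^(k + 1), the inverse can be sketched in closed form. The name `inverse_geometric` and the choice to return the smallest k whose cumulative probability is at least y are assumptions for illustration.

```python
import math

def inverse_geometric(y, p):
    """Smallest number of failures k with 1 - (1 - p)**(k + 1) >= y, for 0 <= y < 1."""
    k = math.ceil(math.log(1.0 - y) / math.log(1.0 - p)) - 1
    return max(k, 0)  # y = 0 would otherwise give -1
```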

pypdevsbbl.extra.rngstreams.distributions.gumbel(y, mu=0, beta=1)[source]

The inverse of the Gumbel cdf.

This distribution is also known as the Generalized Extreme Value Type A distribution or the log-Weibull distribution. It is commonly used in hydrology.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The location of the distribution. Defaults to 0.
  • beta (numeric) – The scale of the distribution. Defaults to 1.
Returns:

A numpy.float64
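
A minimal sketch of the closed-form Gumbel quantile, assuming the (maximum) Gumbel cdf exp(-exp(-(x - mu) / beta)). The function names are hypothetical and the code only illustrates the formula.

```python
import math

def gumbel_cdf(x, mu=0.0, beta=1.0):
    return math.exp(-math.exp(-(x - mu) / beta))

def inverse_gumbel(y, mu=0.0, beta=1.0):
    """Closed-form quantile: mu - beta * ln(-ln(y)), for 0 < y < 1."""
    return mu - beta * math.log(-math.log(y))
```

Applying `gumbel_cdf` to the result of `inverse_gumbel` should return the original y up to round-off.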

pypdevsbbl.extra.rngstreams.distributions.JSB(y, a1, a2, a, b)[source]

Inverse of the Johnson Bounded cdf.

Parameters:
  • y (numeric) – The y-value of the function.
  • a1 (numeric) – The first shape parameter of the distribution.
  • a2 (numeric) – The second shape parameter of the distribution.
  • a (numeric) – The location of the distribution.
  • b (numeric) – The scale of the distribution.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.johnsonB(y, a1, a2, a, b)

Alias for JSB() for readability.

pypdevsbbl.extra.rngstreams.distributions.JSU(y, a1, a2, g, b)[source]

Inverse of the Johnson Unbounded cdf.

Parameters:
  • y (numeric) – The y-value of the function.
  • a1 (numeric) – The first shape parameter of the distribution.
  • a2 (numeric) – The second shape parameter of the distribution.
  • g (numeric) – The location of the distribution.
  • b (numeric) – The scale of the distribution.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.johnsonU(y, a1, a2, g, b)

Alias for JSU() for readability.

pypdevsbbl.extra.rngstreams.distributions.laplace(y, mu=0, b=1)[source]

Inverse of the Laplace cdf.

The distribution is used in speech recognition and hydrology.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The location of the distribution. Defaults to 0.
  • b (numeric) – The scale of the distribution. Defaults to 1.
Returns:

A numpy.float64
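
The Laplace quantile is piecewise, with one branch on each side of the median. The sketch below is an illustration with a hypothetical name (`inverse_laplace`), not the module's code.

```python
import math

def inverse_laplace(y, mu=0.0, b=1.0):
    """Piecewise closed-form quantile of the Laplace distribution, for 0 < y < 1."""
    if y < 0.5:
        return mu + b * math.log(2.0 * y)       # left of the median
    return mu - b * math.log(2.0 * (1.0 - y))   # right of the median
```

By symmetry, the quantiles for y and 1 - y lie at equal distances on either side of mu.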

pypdevsbbl.extra.rngstreams.distributions.logistic(y, mu=0, s=1)[source]

Inverse of the Logistic cdf.

Also known as the Sech-squared distribution. It appears in logistic regression and feedforward neural networks.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The location of the distribution. Defaults to 0.
  • s (numeric) – The scale of the distribution. Defaults to 1.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.loglogistic(y, a, b)[source]

Inverse of the Log-Logistic / Fisk cdf.

In economics, this distribution is known as the Fisk distribution.

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The shape of the distribution.
  • b (numeric) – The scale of the distribution.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.lognormal(y, mu, s)[source]

Inverse of the log-normal cdf with mean mu and standard deviation s.

Generates a variable whose natural logarithm is normally distributed with mean mu and standard deviation s.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The mean of the variable's natural logarithm.
  • s (numeric) – The standard deviation of the variable's natural logarithm.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.negbinomial(y, r, p)[source]

Inverse of the discrete negative binomial cdf with r failures and chance of success p in each experiment.

Parameters:
  • y (numeric) – The y-value of the function.
  • r (numeric) – The number of failures.
  • p (numeric) – The chance for success in an experiment.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.normal(y, mu=0, s=1)[source]

Inverse of the normal cdf with mean mu and standard deviation s.

The implementation of this function is based on the quantile function and the properties of the error function erf.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The mean of the distribution. Defaults to 0.
  • s (numeric) – The standard deviation of the distribution. Defaults to 1.
Returns:

A numpy.float64
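
The normal cdf can be written in terms of erf, and its inverse can then be found numerically. The sketch below swaps in a plain bisection search on that cdf, because the standard library has no inverse error function; the module itself may well use a different approach. The names `normal_cdf` and `inverse_normal` and the search bounds are assumptions.

```python
import math

def normal_cdf(x, mu=0.0, s=1.0):
    """Normal cdf expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (s * math.sqrt(2.0))))

def inverse_normal(y, mu=0.0, s=1.0):
    """Invert normal_cdf by bisection; assumes 0 < y < 1 and s > 0."""
    lo, hi = mu - 40.0 * s, mu + 40.0 * s  # the cdf is numerically 0 or 1 beyond this
    for _ in range(100):  # fixed iteration count, so the loop always terminates
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid, mu, s) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```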

pypdevsbbl.extra.rngstreams.distributions.pareto(y, alpha, xm)[source]

Inverse of the Pareto cdf.

The distribution is used to describe social, scientific, geophysical, actuarial and other phenomena. It was originally used to describe the distribution of wealth.

Colloquially, the distribution is referred to as the 80-20 rule.

Parameters:
  • y (numeric) – The y-value of the function.
  • alpha (numeric) – The shape of the distribution.
  • xm (numeric) – The scale of the distribution.
Returns:

A numpy.float64
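
A short sketch of the closed-form Pareto quantile, assuming the cdf 1 - (xm / x)^alpha for x >= xm. The function names are hypothetical and serve only to illustrate the formula.

```python
def pareto_cdf(x, alpha, xm):
    return 1.0 - (xm / x) ** alpha

def inverse_pareto(y, alpha, xm):
    """Closed-form quantile: xm / (1 - y)**(1 / alpha), for 0 <= y < 1."""
    return xm / (1.0 - y) ** (1.0 / alpha)
```

For y = 0 the result is the scale xm, which is the smallest value the distribution can take.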

pypdevsbbl.extra.rngstreams.distributions.poisson(y, l)[source]

Inverse of the discrete Poisson cdf with lambda l.

This distribution is a discrete distribution that expresses the probability that a given number of events occurs in a fixed interval of time.

The implemented algorithm here is based on the inverse transform sampling algorithm on Wikipedia: https://en.wikipedia.org/wiki/Poisson_distribution

This algorithm was chosen because it requires only a single random value for each call. Cumulative probabilities are examined in turn until one exceeds the random value.

Parameters:
  • y (numeric) – The y-value of the function.
  • l (numeric) – The average number of events that occur (i.e. lambda).
Returns:

A numpy.int
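
The sequential search described above can be sketched as follows. This follows the inverse transform algorithm as given on the referenced Wikipedia page; the name `inverse_poisson` and the extra underflow guard are assumptions, and the module's actual code may differ in detail.

```python
import math

def inverse_poisson(y, l):
    """Walk through the cumulative probabilities until one reaches y."""
    x = 0
    p = math.exp(-l)  # Pr[X = 0]
    s = p             # running cumulative probability Pr[X <= x]
    while y > s and p > 0.0:  # p > 0 guards against round-off when y is very close to 1
        x += 1
        p *= l / x    # Pr[X = x] from Pr[X = x - 1]
        s += p
    return x
```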

pypdevsbbl.extra.rngstreams.distributions.PT5(y, a, b)[source]

Inverse of the Pearson Type V cdf.

This method makes use of the gamma distribution.

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The shape of the distribution.
  • b (numeric) – The scale of the distribution.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.pearson5(y, a, b)

Alias of PT5() for readability.

pypdevsbbl.extra.rngstreams.distributions.PT6(y, a1, a2, b)[source]

Inverse of the Pearson Type VI cdf.

Parameters:
  • y (numeric) – The y-value of the function.
  • a1 (numeric) – The shape of the first distribution.
  • a2 (numeric) – The shape of the second distribution.
  • b (numeric) – The scale of the distribution.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.pearson6(y, a1, a2, b)

Alias of PT6() for readability.

pypdevsbbl.extra.rngstreams.distributions.student(y, v, step=0.001)[source]

Approximation of the inverse Student's t cdf with v degrees of freedom.

This distribution occurs most often when estimating unknown parameters of a distribution. When combined with an RNG, it often arises in multi-dimensional applications of copula dependency (dependence between variables).

This implementation uses a simple sampling method that requires discretization. For accurate results, use a small step size.

Parameters:
  • y (numeric) – The y-value of the function.
  • v (numeric) – The degrees of freedom to use.
  • step (float) – The stepsize to use when sampling.
Returns:

A numpy.float64
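
One way to picture a discretized approximation is to accumulate the area under the pdf in slices of width `step` until the target probability is reached. The sketch below does this with a midpoint rule; it is an assumed illustration and not the module's `_sample` helper, whose details are not documented here. The names `student_pdf` and `inverse_student` are hypothetical.

```python
import math

def student_pdf(x, v):
    """Density of Student's t distribution with v degrees of freedom."""
    norm = math.gamma((v + 1) / 2) / (math.sqrt(v * math.pi) * math.gamma(v / 2))
    return norm * (1 + x * x / v) ** (-(v + 1) / 2)

def inverse_student(y, v, step=0.001):
    """Approximate quantile for 0 < y < 1, accurate to roughly one step."""
    if y < 0.5:
        return -inverse_student(1.0 - y, v, step)  # symmetric around 0
    x, area = 0.0, 0.5  # cdf(0) = 0.5
    while area < y:
        area += student_pdf(x + 0.5 * step, v) * step  # midpoint rule for one slice
        x += step
    return x
```

A smaller `step` gives a finer grid and hence a more accurate result, at the cost of more iterations, especially for y close to 0 or 1.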

pypdevsbbl.extra.rngstreams.distributions.triangular(y, a, b, c)[source]

Inverse of the triangular cdf with minimum a, mode b and maximum c.

This distribution is typically used as a subjective description of a population for which there is limited sample data, especially if the relationship between variables is known, but the data is scarce.

Please be aware of the differences between b (mode) and c (maximum). Some sources use b and c the other way around (i.e. they use c as mode and b as maximum).

There are also resources that encode these values in a single number q, which can be obtained via (a, b, c w.r.t. the parameters of this function):

\[q = (b - a) / (c - a)\]
Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The minimum.
  • b (numeric) – The mode.
  • c (numeric) – The maximum.
Returns:

A numpy.float64
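
The value q from the formula above is the cdf evaluated at the mode, so it decides which branch of the piecewise quantile applies. The following sketch uses this function's parameter order (a minimum, b mode, c maximum); the name `inverse_triangular` is hypothetical and the code is an illustration only.

```python
import math

def inverse_triangular(y, a, b, c):
    """Piecewise quantile of the triangular distribution, for 0 <= y <= 1 and a < c."""
    q = (b - a) / (c - a)  # cdf value at the mode
    if y < q:
        return a + math.sqrt(y * (c - a) * (b - a))      # rising side
    return c - math.sqrt((1.0 - y) * (c - a) * (c - b))  # falling side
```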

pypdevsbbl.extra.rngstreams.distributions.uniform(y, a, b)[source]

Inverse of the uniform cdf over the range (a, b).

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The lower bound of the distribution.
  • b (numeric) – The upper bound of the distribution.
Returns:

A numpy.float64 in the range (a, b).

pypdevsbbl.extra.rngstreams.distributions.uniformInt(y, a, b)[source]

Inverse of the rounded uniform cdf over the range [a, b].

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The lower bound of the distribution.
  • b (numeric) – The upper bound of the distribution.
Returns:

A numpy.float64 in the range [a, b].

pypdevsbbl.extra.rngstreams.distributions.wald(y, mu, l, step=0.001)[source]

Approximation of the inverse Wald cdf.

Also known as the inverse Gaussian distribution, the Wald distribution is commonly used to compute the properties of Brownian motion.

Parameters:
  • y (numeric) – The y-value of the function.
  • mu (numeric) – The mean of the distribution.
  • l (numeric) – The shape of the distribution.
  • step (float) – The stepsize to use when sampling.
Returns:

A numpy.float64

pypdevsbbl.extra.rngstreams.distributions.weibull(y, a, b, v=0)[source]

Inverse of the Weibull cdf with scale a, shape b and location v.

This distribution is most commonly used to predict the lifetime of technical equipment. The location is not often used, in which case it should be set to 0 (also the default).

Parameters:
  • y (numeric) – The y-value of the function.
  • a (numeric) – The scale of the curve.
  • b (numeric) – The shape parameter.
  • v (numeric) – The location parameter. Defaults to 0 (not often used).
Returns:

A numpy.float64
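
A minimal sketch of the closed-form Weibull quantile, assuming the cdf 1 - exp(-((x - v) / a)^b) with scale a, shape b and location v as in the signature above. The function names are hypothetical.

```python
import math

def weibull_cdf(x, a, b, v=0.0):
    return 1.0 - math.exp(-(((x - v) / a) ** b))

def inverse_weibull(y, a, b, v=0.0):
    """Closed-form quantile: v + a * (-ln(1 - y))**(1 / b), for 0 <= y < 1."""
    return v + a * (-math.log(1.0 - y)) ** (1.0 / b)
```

With shape b = 1 and location v = 0 this reduces to the exponential quantile with mean a.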

pypdevsbbl.extra.rngstreams.distributions.zipf(y, s, N: int)[source]

Inverse of a discrete cdf w.r.t. Zipf’s law.

Parameters:
  • y (numeric) – The y-value of the function.
  • s (numeric) – The exponent characterizing the distribution.
  • N (int) – The number of elements (ranks).
Returns:

A numpy.float64
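
Under Zipf's law, rank k out of N has probability proportional to 1 / k^s, so the inverse cdf can be sketched as a cumulative search over the ranks. The name `inverse_zipf` and the convention of returning the smallest rank whose cumulative probability is at least y are assumptions for illustration.

```python
def inverse_zipf(y, s, N):
    """Smallest rank k in 1..N whose cumulative probability is >= y."""
    weights = [1.0 / k**s for k in range(1, N + 1)]
    total = sum(weights)  # generalized harmonic number, used for normalization
    cumulative = 0.0
    for k, w in enumerate(weights, start=1):
        cumulative += w / total
        if cumulative >= y:
            return k
    return N  # guard against floating-point round-off near y = 1
```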

pypdevsbbl.extra.rngstreams.distributions._sample(y, f, step)[source]

Sampling method.