libs/math/doc/sf_and_dist/dist_tutorial.qbk - platform/external/boost - Git at Google

 [/ def names all end in distrib to avoid clashes with names of functions]
 [def __binomial_distrib [link math_toolkit.dist.dist_ref.dists.binomial_dist Binomial Distribution]]
 [def __chi_squared_distrib [link math_toolkit.dist.dist_ref.dists.chi_squared_dist Chi Squared Distribution]]
 [def __normal_distrib [link math_toolkit.dist.dist_ref.dists.normal_dist Normal Distribution]]
 [def __F_distrib [link math_toolkit.dist.dist_ref.dists.f_dist Fisher F Distribution]]
 [def __students_t_distrib [link math_toolkit.dist.dist_ref.dists.students_t_dist Students t Distribution]]

 [def __handbook [@http://www.itl.nist.gov/div898/handbook/
 NIST/SEMATECH e-Handbook of Statistical Methods.]]

 [section:stat_tut Statistical Distributions Tutorial]
 This library is centred around statistical distributions, this tutorial
 will give you an overview of what they are, how they can be used, and
 provides a few worked examples of applying the library to statistical tests.

 [section:overview Overview of Distributions]

 [section:headers Headers and Namespaces]

 All the code in this library is inside namespace boost::math.

 In order to use a distribution /my_distribution/ you will need to include
 either the header <boost/math/distributions/my_distribution.hpp> or
 the "include all the distributions" header: <boost/math/distributions.hpp>.

 For example, to use the Students-t distribution include either
 <boost/math/distributions/students_t.hpp> or
 <boost/math/distributions.hpp>

 You also need to bring distribution names into scope,
 perhaps with a `using namespace boost::math;` declaration,

 or specific  `using` declarations like `using boost::math::normal;` (*recommended*).

 [caution Some math function names are also used in namespace std so including <random> could cause ambiguity!]

 [endsect] [/ section:headers Headers and Namespaces]

 [section:objects Distributions are Objects]

 Each kind of distribution in this library is a class type - an object.

 [link math_toolkit.policy Policies] provide fine-grained control
 of the behaviour of these classes, allowing the user to customise
 behaviour such as how errors are handled, or how the quantiles
 of discrete distribtions behave.

 [tip If you are familiar with statistics libraries using functions,
 and 'Distributions as Objects' seem alien, see
 [link math_toolkit.dist.stat_tut.weg.nag_library the comparison to
 other statistics libraries.]
 ] [/tip]

 Making distributions class types does two things:

 * It encapsulates the kind of distribution in the C++ type system;
 so, for example, Students-t distributions are always a different C++ type from
 Chi-Squared distributions.
 * The distribution objects store any parameters associated with the
 distribution: for example, the Students-t distribution has a
 ['degrees of freedom] parameter that controls the shape of the distribution.
 This ['degrees of freedom] parameter has to be provided
 to the Students-t object when it is constructed.

 Although the distribution classes in this library are templates, there
 are typedefs on type /double/ that mostly take the usual name of the
 distribution
 (except where there is a clash with a function of the same name: beta and gamma,
 in which case using the default template arguments - `RealType = double` -
 is nearly as convenient).
 Probably 95% of uses are covered by these typedefs:

    // using namespace boost::math; // Avoid potential ambiguity with names in std <random>
    // Safer to declare specific functions with using statement(s):

    using boost::math::beta_distribution;
    using boost::math::binomial_distribution;
    using boost::math::students_t;

    // Construct a students_t distribution with 4 degrees of freedom:
    students_t d1(4);

    // Construct a double-precision beta distribution
    // with parameters a = 10, b = 20
    beta_distribution<> d2(10, 20); // Note: _distribution<> suffix !

 If you need to use the distributions with a type other than `double`,
 then you can instantiate the template directly: the names of the
 templates are the same as the `double` typedef but with `_distribution`
 appended, for example: __students_t_distrib or __binomial_distrib:

    // Construct a students_t distribution, of float type,
    // with 4 degrees of freedom:
    students_t_distribution<float> d3(4);

    // Construct a binomial distribution, of long double type,
    // with probability of success 0.3
    // and 20 trials in total:
    binomial_distribution<long double> d4(20, 0.3);

 The parameters passed to the distributions can be accessed via getter member
 functions:

    d1.degrees_of_freedom();  // returns 4.0

 This is all well and good, but not very useful so far.  What we often want
 is to be able to calculate the /cumulative distribution functions/ and
 /quantiles/ etc for these distributions.

 [endsect] [/section:objects Distributions are Objects]


 [section:generic Generic operations common to all distributions are non-member functions]

 Want to calculate the PDF (Probability Density Function) of a distribution?
 No problem, just use:

    pdf(my_dist, x);  // Returns PDF (density) at point x of distribution my_dist.

 Or how about the CDF (Cumulative Distribution Function):

    cdf(my_dist, x);  // Returns CDF (integral from -infinity to point x)
                      // of distribution my_dist.

 And quantiles are just the same:

    quantile(my_dist, p);  // Returns the value of the random variable x
                           // such that cdf(my_dist, x) == p.

 If you're wondering why these aren't member functions, it's to
 make the library more easily extensible: if you want to add additional
 generic operations - let's say the /n'th moment/ - then all you have to
 do is add the appropriate non-member functions, overloaded for each
 implemented distribution type.

 [tip

 [*Random numbers that approximate Quantiles of Distributions]

 If you want random numbers that are distributed in a specific way,
 for example in a uniform, normal or triangular,
 see [@http://www.boost.org/libs/random/ Boost.Random].

 Whilst in principal there's nothing to prevent you from using the
 quantile function to convert a uniformly distributed random
 number to another distribution, in practice there are much more
 efficient algorithms available that are specific to random number generation.
 ] [/tip Random numbers that approximate Quantiles of Distributions]

 For example, the binomial distribution has two parameters:
 n (the number of trials) and p (the probability of success on any one trial).

 The `binomial_distribution` constructor therefore has two parameters:

 `binomial_distribution(RealType n, RealType p);`

 For this distribution the __random_variate is k: the number of successes observed.
 The probability density\/mass function (pdf) is therefore written as ['f(k; n, p)].

 [note

 [*Random Variates and Distribution Parameters]

 The concept of a __random_variable is closely linked to the term __random_variate:
 a random variate is a particular value (outcome) of a random variable.
 and [@http://en.wikipedia.org/wiki/Parameter distribution parameters]
 are conventionally distinguished (for example in Wikipedia and Wolfram MathWorld)
 by placing a semi-colon or vertical bar)
 /after/ the __random_variable (whose value you 'choose'),
 to separate the variate from the parameter(s) that defines the shape of the distribution.[br]
 For example, the binomial distribution probability distribution function (PDF) is written as
 ['f(k| n, p)] = Pr(K = k|n, p) = probability of observing k successes out of n trials.
 K is the __random_variable, k is the __random_variate,
 the parameters are n (trials) and p (probability).
 ] [/tip Random Variates and Distribution Parameters]

 [note  By convention, __random_variate are lower case, usually k is integral, x if real, and
 __random_variable are upper case, K if integral, X if real.  But this implementation treats
 all as floating point values `RealType`, so if you really want an integral result,
 you must round: see note on Discrete Probability Distributions below for details.]

 As noted above the non-member function `pdf` has one parameter for the distribution object,
 and a second for the random variate.  So taking our binomial distribution
 example, we would write:

 `pdf(binomial_distribution<RealType>(n, p), k);`

 The ranges of __random_variate values that are permitted and are supported can be
 tested by using two functions `range` and `support`.

 The distribution (effectively the __random_variate) is said to be 'supported'
 over a range that is
 [@http://en.wikipedia.org/wiki/Probability_distribution
  "the smallest closed set whose complement has probability zero"].
 MathWorld uses the word 'defined' for this range.
 Non-mathematicians might say it means the 'interesting' smallest range
 of random variate x that has the cdf going from zero to unity.
 Outside are uninteresting zones where the pdf is zero, and the cdf zero or unity.

 For most distributions, with probability distribution functions one might describe
 as 'well-behaved', we have decided that it is most useful for the supported range
 to *exclude* random variate values like exact zero *if the end point is discontinuous*.
 For example, the Weibull (scale 1, shape 1) distribution smoothly heads for unity
 as the random variate x declines towards zero.
 But at x = zero, the value of the pdf is suddenly exactly zero, by definition.
 If you are plotting the PDF, or otherwise calculating,
 zero is not the most useful value for the lower limit of supported, as we discovered.
 So for this, and similar distributions,
 we have decided it is most numerically useful to use
 the closest value to zero, min_value, for the limit of the supported range.
 (The `range` remains from zero, so you will still get `pdf(weibull, 0) == 0`).
 (Exponential and gamma distributions have similarly discontinuous functions).

 Mathematically, the functions may make sense with an (+ or -) infinite value,
 but except for a few special cases (in the Normal and Cauchy distributions)
 this implementation limits random variates to finite values from the `max`
 to `min` for the `RealType`.
 (See [link math_toolkit.backgrounders.implementation.handling_of_floating_point_infinity
 Handling of Floating-Point Infinity] for rationale).


 [note

 [*Discrete Probability Distributions]

 Note that the [@http://en.wikipedia.org/wiki/Discrete_probability_distribution
 discrete distributions], including the binomial, negative binomial, Poisson & Bernoulli,
 are all mathematically defined as discrete functions:
 that is to say the functions `cdf` and `pdf` are only defined for integral values
 of the random variate.

 However, because the method of calculation often uses continuous functions
 it is convenient to treat them as if they were continuous functions,
 and permit non-integral values of their parameters.

 Users wanting to enforce a strict mathematical model may use `floor`
 or `ceil` functions on the random variate prior to calling the distribution
 function.

 The quantile functions for these distributions are hard to specify
 in a manner that will satisfy everyone all of the time.  The default
 behaviour is to return an integer result, that has been rounded
 /outwards/: that is to say, lower quantiles - where the probablity
 is less than 0.5 are rounded down, while upper quantiles - where
 the probability is greater than 0.5 - are rounded up.  This behaviour
 ensures that if an X% quantile is requested, then /at least/ the requested
 coverage will be present in the central region, and /no more than/
 the requested coverage will be present in the tails.

 This behaviour can be changed so that the quantile functions are rounded
 differently, or return a real-valued result using
 [link math_toolkit.policy.pol_overview Policies].  It is strongly
 recommended that you read the tutorial
 [link math_toolkit.policy.pol_tutorial.understand_dis_quant
 Understanding Quantiles of Discrete Distributions] before
 using the quantile function on a discrete distribtion.  The
 [link math_toolkit.policy.pol_ref.discrete_quant_ref reference docs]
 describe how to change the rounding policy
 for these distributions.

 For similar reasons continuous distributions with parameters like
 "degrees of freedom"
 that might appear to be integral, are treated as real values
 (and are promoted from integer to floating-point if necessary).
 In this case however, there are a small number of situations where non-integral
 degrees of freedom do have a genuine meaning.
 ]

 [endsect] [/ section:generic Generic operations common to all distributions are non-member functions]

 [#complements]
 [section:complements Complements are supported too - and when to use them]

 Often you don't want the value of the CDF, but its complement, which is
 to say `1-p` rather than `p`.  It is tempting to calculate the CDF and subtract
 it from `1`, but if `p` is very close to `1` then cancellation error
 will cause you to lose accuracy, perhaps totally.

 [link why_complements See below ['"Why and when to use complements?"]]

 In this library, whenever you want to receive a complement, just wrap
 all the function arguments in a call to `complement(...)`, for example:

    students_t dist(5);
    cout << "CDF at t = 1 is " << cdf(dist, 1.0) << endl;
    cout << "Complement of CDF at t = 1 is " << cdf(complement(dist, 1.0)) << endl;

 But wait, now that we have a complement, we have to be able to use it as well.
 Any function that accepts a probability as an argument can also accept a complement
 by wrapping all of its arguments in a call to `complement(...)`, for example:

    students_t dist(5);

    for(double i = 10; i < 1e10; i *= 10)
    {
       // Calculate the quantile for a 1 in i chance:
       double t = quantile(complement(dist, 1/i));
       // Print it out:
       cout << "Quantile of students-t with 5 degrees of freedom\n"
               "for a 1 in " << i << " chance is " << t << endl;
    }

 [tip

 [*Critical values are just quantiles]

 Some texts talk about quantiles, or percentiles or fractiles,
 others about critical values, the basic rule is:

 ['Lower critical values] are the same as the quantile.

 ['Upper critical values] are the same as the quantile from the complement
 of the probability.

 For example, suppose we have a Bernoulli process, giving rise to a binomial
 distribution with success ratio 0.1 and 100 trials in total.  The
 ['lower critical value] for a probability of 0.05 is given by:

 `quantile(binomial(100, 0.1), 0.05)`

 and the ['upper critical value] is given by:

 `quantile(complement(binomial(100, 0.1), 0.05))`

 which return 4.82 and 14.63 respectively.
 ]

 [#why_complements]
 [tip

 [*Why bother with complements anyway?]

 It's very tempting to dispense with complements, and simply subtract
 the probability from 1 when required.  However, consider what happens when
 the probability is very close to 1: let's say the probability expressed at
 float precision is `0.999999940f`, then `1 - 0.999999940f = 5.96046448e-008`,
 but the result is actually accurate to just ['one single bit]: the only
 bit that didn't cancel out!

 Or to look at this another way: consider that we want the risk of falsely
 rejecting the null-hypothesis in the Student's t test to be 1 in 1 billion,
 for a sample size of 10,000.
 This gives a probability of 1 - 10[super -9], which is exactly 1 when
 calculated at float precision.  In this case calculating the quantile from
 the complement neatly solves the problem, so for example:

 `quantile(complement(students_t(10000), 1e-9))`

 returns the expected t-statistic `6.00336`, where as:

 `quantile(students_t(10000), 1-1e-9f)`

 raises an overflow error, since it is the same as:

 `quantile(students_t(10000), 1)`

 Which has no finite result.

 With all distributions, even for more reasonable probability
 (unless the value of p can be represented exactly in the floating-point type)
 the loss of accuracy quickly becomes significant if you simply calculate probability from 1 - p
 (because it will be mostly garbage digits for p ~ 1).

 So always avoid, for example, using a probability near to unity like 0.99999

 `quantile(my_distribution, 0.99999)`

 and instead use

 `quantile(complement(my_distribution, 0.00001))`

 since 1 - 0.99999 is not exactly equal to 0.00001 when using floating-point arithmetic.

 This assumes that the 0.00001 value is either a constant,
 or can be computed by some manner other than subtracting 0.99999 from 1.

 ] [/ tip *Why bother with complements anyway?]

 [endsect] [/ section:complements Complements are supported too - and why]

 [section:parameters Parameters can be calculated]

 Sometimes it's the parameters that define the distribution that you
 need to find.  Suppose, for example, you have conducted a Students-t test
 for equal means and the result is borderline.  Maybe your two samples
 differ from each other, or maybe they don't; based on the result
 of the test you can't be sure.  A legitimate question to ask then is
 "How many more measurements would I have to take before I would get
 an X% probability that the difference is real?"  Parameter finders
 can answer questions like this, and are necessarily different for
 each distribution.  They are implemented as static member functions
 of the distributions, for example:

    students_t::find_degrees_of_freedom(
       1.3,        // difference from true mean to detect
       0.05,       // maximum risk of falsely rejecting the null-hypothesis.
       0.1,        // maximum risk of falsely failing to reject the null-hypothesis.
       0.13);      // sample standard deviation

 Returns the number of degrees of freedom required to obtain a 95%
 probability that the observed differences in means is not down to
 chance alone.  In the case that a borderline Students-t test result
 was previously obtained, this can be used to estimate how large the sample size
 would have to become before the observed difference was considered
 significant.  It assumes, of course, that the sample mean and standard
 deviation are invariant with sample size.

 [endsect] [/ section:parameters Parameters can be calculated]

 [section:summary Summary]

 * Distributions are objects, which are constructed from whatever
 parameters the distribution may have.
 * Member functions allow you to retrieve the parameters of a distribution.
 * Generic non-member functions provide access to the properties that
 are common to all the distributions (PDF, CDF, quantile etc).
 * Complements of probabilities are calculated by wrapping the function's
 arguments in a call to `complement(...)`.
 * Functions that accept a probability can accept a complement of the
 probability as well, by wrapping the function's
 arguments in a call to `complement(...)`.
 * Static member functions allow the parameters of a distribution
 to be found from other information.

 Now that you have the basics, the next section looks at some worked examples.

 [endsect] [/section:summary Summary]
 [endsect] [/section:overview Overview]

 [section:weg Worked Examples]
 [include distributions/distribution_construction.qbk]
 [include distributions/students_t_examples.qbk]
 [include distributions/chi_squared_examples.qbk]
 [include distributions/f_dist_example.qbk]
 [include distributions/binomial_example.qbk]
 [include distributions/geometric_example.qbk]
 [include distributions/negative_binomial_example.qbk]
 [include distributions/normal_example.qbk]
 [/include distributions/inverse_gamma_example.qbk]
 [/include distributions/inverse_gaussian_example.qbk]
 [include distributions/nc_chi_squared_example.qbk]
 [include distributions/error_handling_example.qbk]
 [include distributions/find_location_and_scale.qbk]
 [include distributions/nag_library.qbk]
 [include distributions/c_sharp.qbk]
 [endsect] [/section:weg Worked Examples]

 [include background.qbk]

 [endsect] [/ section:stat_tut Statistical Distributions Tutorial]

 [/ dist_tutorial.qbk
   Copyright 2006, 2010 John Maddock and Paul A. Bristow.
   Distributed under the Boost Software License, Version 1.0.
   (See accompanying file LICENSE_1_0.txt or copy at
   http://www.boost.org/LICENSE_1_0.txt).
 ]
	[/ def names all end in distrib to avoid clashes with names of functions]
	[def __binomial_distrib [link math_toolkit.dist.dist_ref.dists.binomial_dist Binomial Distribution]]
	[def __chi_squared_distrib [link math_toolkit.dist.dist_ref.dists.chi_squared_dist Chi Squared Distribution]]
	[def __normal_distrib [link math_toolkit.dist.dist_ref.dists.normal_dist Normal Distribution]]
	[def __F_distrib [link math_toolkit.dist.dist_ref.dists.f_dist Fisher F Distribution]]
	[def __students_t_distrib [link math_toolkit.dist.dist_ref.dists.students_t_dist Students t Distribution]]

	[def __handbook [@http://www.itl.nist.gov/div898/handbook/
	NIST/SEMATECH e-Handbook of Statistical Methods.]]

	[section:stat_tut Statistical Distributions Tutorial]
	This library is centred around statistical distributions, this tutorial
	will give you an overview of what they are, how they can be used, and
	provides a few worked examples of applying the library to statistical tests.

	[section:overview Overview of Distributions]

	[section:headers Headers and Namespaces]

	All the code in this library is inside namespace boost::math.

	In order to use a distribution /my_distribution/ you will need to include
	either the header <boost/math/distributions/my_distribution.hpp> or
	the "include all the distributions" header: <boost/math/distributions.hpp>.

	For example, to use the Students-t distribution include either
	<boost/math/distributions/students_t.hpp> or
	<boost/math/distributions.hpp>

	You also need to bring distribution names into scope,
	perhaps with a `using namespace boost::math;` declaration,

	or specific `using` declarations like `using boost::math::normal;` (recommended).

	[caution Some math function names are also used in namespace std so including <random> could cause ambiguity!]

	[endsect] [/ section:headers Headers and Namespaces]

	[section:objects Distributions are Objects]

	Each kind of distribution in this library is a class type - an object.

	[link math_toolkit.policy Policies] provide fine-grained control
	of the behaviour of these classes, allowing the user to customise
	behaviour such as how errors are handled, or how the quantiles
	of discrete distribtions behave.

	[tip If you are familiar with statistics libraries using functions,
	and 'Distributions as Objects' seem alien, see
	[link math_toolkit.dist.stat_tut.weg.nag_library the comparison to
	other statistics libraries.]
	] [/tip]

	Making distributions class types does two things:

	* It encapsulates the kind of distribution in the C++ type system;
	so, for example, Students-t distributions are always a different C++ type from
	Chi-Squared distributions.
	* The distribution objects store any parameters associated with the
	distribution: for example, the Students-t distribution has a
	['degrees of freedom] parameter that controls the shape of the distribution.
	This ['degrees of freedom] parameter has to be provided
	to the Students-t object when it is constructed.

	Although the distribution classes in this library are templates, there
	are typedefs on type /double/ that mostly take the usual name of the
	distribution
	(except where there is a clash with a function of the same name: beta and gamma,
	in which case using the default template arguments - `RealType = double` -
	is nearly as convenient).
	Probably 95% of uses are covered by these typedefs:

	// using namespace boost::math; // Avoid potential ambiguity with names in std <random>
	// Safer to declare specific functions with using statement(s):

	using boost::math::beta_distribution;
	using boost::math::binomial_distribution;
	using boost::math::students_t;

	// Construct a students_t distribution with 4 degrees of freedom:
	students_t d1(4);

	// Construct a double-precision beta distribution
	// with parameters a = 10, b = 20
	beta_distribution<> d2(10, 20); // Note: _distribution<> suffix !

	If you need to use the distributions with a type other than `double`,
	then you can instantiate the template directly: the names of the
	templates are the same as the `double` typedef but with `_distribution`
	appended, for example: __students_t_distrib or __binomial_distrib:

	// Construct a students_t distribution, of float type,
	// with 4 degrees of freedom:
	students_t_distribution<float> d3(4);

	// Construct a binomial distribution, of long double type,
	// with probability of success 0.3
	// and 20 trials in total:
	binomial_distribution<long double> d4(20, 0.3);

	The parameters passed to the distributions can be accessed via getter member
	functions:

	d1.degrees_of_freedom(); // returns 4.0

	This is all well and good, but not very useful so far. What we often want
	is to be able to calculate the /cumulative distribution functions/ and
	/quantiles/ etc for these distributions.

	[endsect] [/section:objects Distributions are Objects]


	[section:generic Generic operations common to all distributions are non-member functions]

	Want to calculate the PDF (Probability Density Function) of a distribution?
	No problem, just use:

	pdf(my_dist, x); // Returns PDF (density) at point x of distribution my_dist.

	Or how about the CDF (Cumulative Distribution Function):

	cdf(my_dist, x); // Returns CDF (integral from -infinity to point x)
	// of distribution my_dist.

	And quantiles are just the same:

	quantile(my_dist, p); // Returns the value of the random variable x
	// such that cdf(my_dist, x) == p.

	If you're wondering why these aren't member functions, it's to
	make the library more easily extensible: if you want to add additional
	generic operations - let's say the /n'th moment/ - then all you have to
	do is add the appropriate non-member functions, overloaded for each
	implemented distribution type.

	[tip

	[*Random numbers that approximate Quantiles of Distributions]

	If you want random numbers that are distributed in a specific way,
	for example in a uniform, normal or triangular,
	see [@http://www.boost.org/libs/random/ Boost.Random].

	Whilst in principal there's nothing to prevent you from using the
	quantile function to convert a uniformly distributed random
	number to another distribution, in practice there are much more
	efficient algorithms available that are specific to random number generation.
	] [/tip Random numbers that approximate Quantiles of Distributions]

	For example, the binomial distribution has two parameters:
	n (the number of trials) and p (the probability of success on any one trial).

	The `binomial_distribution` constructor therefore has two parameters:

	`binomial_distribution(RealType n, RealType p);`

	For this distribution the __random_variate is k: the number of successes observed.
	The probability density\/mass function (pdf) is therefore written as ['f(k; n, p)].

	[note

	[*Random Variates and Distribution Parameters]

	The concept of a __random_variable is closely linked to the term __random_variate:
	a random variate is a particular value (outcome) of a random variable.
	and [@http://en.wikipedia.org/wiki/Parameter distribution parameters]
	are conventionally distinguished (for example in Wikipedia and Wolfram MathWorld)
	by placing a semi-colon or vertical bar)
	/after/ the __random_variable (whose value you 'choose'),
	to separate the variate from the parameter(s) that defines the shape of the distribution.[br]
	For example, the binomial distribution probability distribution function (PDF) is written as
	['f(k\| n, p)] = Pr(K = k\|n, p) = probability of observing k successes out of n trials.
	K is the __random_variable, k is the __random_variate,
	the parameters are n (trials) and p (probability).
	] [/tip Random Variates and Distribution Parameters]

	[note By convention, __random_variate are lower case, usually k is integral, x if real, and
	__random_variable are upper case, K if integral, X if real. But this implementation treats
	all as floating point values `RealType`, so if you really want an integral result,
	you must round: see note on Discrete Probability Distributions below for details.]

	As noted above the non-member function `pdf` has one parameter for the distribution object,
	and a second for the random variate. So taking our binomial distribution
	example, we would write:

	`pdf(binomial_distribution<RealType>(n, p), k);`

	The ranges of __random_variate values that are permitted and are supported can be
	tested by using two functions `range` and `support`.

	The distribution (effectively the __random_variate) is said to be 'supported'
	over a range that is
	[@http://en.wikipedia.org/wiki/Probability_distribution
	"the smallest closed set whose complement has probability zero"].
	MathWorld uses the word 'defined' for this range.
	Non-mathematicians might say it means the 'interesting' smallest range
	of random variate x that has the cdf going from zero to unity.
	Outside are uninteresting zones where the pdf is zero, and the cdf zero or unity.

	For most distributions, with probability distribution functions one might describe
	as 'well-behaved', we have decided that it is most useful for the supported range
	to exclude random variate values like exact zero if the end point is discontinuous.
	For example, the Weibull (scale 1, shape 1) distribution smoothly heads for unity
	as the random variate x declines towards zero.
	But at x = zero, the value of the pdf is suddenly exactly zero, by definition.
	If you are plotting the PDF, or otherwise calculating,
	zero is not the most useful value for the lower limit of supported, as we discovered.
	So for this, and similar distributions,
	we have decided it is most numerically useful to use
	the closest value to zero, min_value, for the limit of the supported range.
	(The `range` remains from zero, so you will still get `pdf(weibull, 0) == 0`).
	(Exponential and gamma distributions have similarly discontinuous functions).

	Mathematically, the functions may make sense with an (+ or -) infinite value,
	but except for a few special cases (in the Normal and Cauchy distributions)
	this implementation limits random variates to finite values from the `max`
	to `min` for the `RealType`.
	(See [link math_toolkit.backgrounders.implementation.handling_of_floating_point_infinity
	Handling of Floating-Point Infinity] for rationale).


	[note

	[*Discrete Probability Distributions]

	Note that the [@http://en.wikipedia.org/wiki/Discrete_probability_distribution
	discrete distributions], including the binomial, negative binomial, Poisson & Bernoulli,
	are all mathematically defined as discrete functions:
	that is to say the functions `cdf` and `pdf` are only defined for integral values
	of the random variate.

	However, because the method of calculation often uses continuous functions
	it is convenient to treat them as if they were continuous functions,
	and permit non-integral values of their parameters.

	Users wanting to enforce a strict mathematical model may use `floor`
	or `ceil` functions on the random variate prior to calling the distribution
	function.

	The quantile functions for these distributions are hard to specify
	in a manner that will satisfy everyone all of the time. The default
	behaviour is to return an integer result, that has been rounded
	/outwards/: that is to say, lower quantiles - where the probablity
	is less than 0.5 are rounded down, while upper quantiles - where
	the probability is greater than 0.5 - are rounded up. This behaviour
	ensures that if an X% quantile is requested, then /at least/ the requested
	coverage will be present in the central region, and /no more than/
	the requested coverage will be present in the tails.

	This behaviour can be changed so that the quantile functions are rounded
	differently, or return a real-valued result using
	[link math_toolkit.policy.pol_overview Policies]. It is strongly
	recommended that you read the tutorial
	[link math_toolkit.policy.pol_tutorial.understand_dis_quant
	Understanding Quantiles of Discrete Distributions] before
	using the quantile function on a discrete distribtion. The
	[link math_toolkit.policy.pol_ref.discrete_quant_ref reference docs]
	describe how to change the rounding policy
	for these distributions.

	For similar reasons continuous distributions with parameters like
	"degrees of freedom"
	that might appear to be integral, are treated as real values
	(and are promoted from integer to floating-point if necessary).
	In this case however, there are a small number of situations where non-integral
	degrees of freedom do have a genuine meaning.
	]

	[endsect] [/ section:generic Generic operations common to all distributions are non-member functions]

	[#complements]
	[section:complements Complements are supported too - and when to use them]

	Often you don't want the value of the CDF, but its complement, which is
	to say `1-p` rather than `p`. It is tempting to calculate the CDF and subtract
	it from `1`, but if `p` is very close to `1` then cancellation error
	will cause you to lose accuracy, perhaps totally.

	[link why_complements See below ['"Why and when to use complements?"]]

	In this library, whenever you want to receive a complement, just wrap
	all the function arguments in a call to `complement(...)`, for example:

	students_t dist(5);
	cout << "CDF at t = 1 is " << cdf(dist, 1.0) << endl;
	cout << "Complement of CDF at t = 1 is " << cdf(complement(dist, 1.0)) << endl;

	But wait, now that we have a complement, we have to be able to use it as well.
	Any function that accepts a probability as an argument can also accept a complement
	by wrapping all of its arguments in a call to `complement(...)`, for example:

	students_t dist(5);

	for(double i = 10; i < 1e10; i *= 10)
	{
	// Calculate the quantile for a 1 in i chance:
	double t = quantile(complement(dist, 1/i));
	// Print it out:
	cout << "Quantile of students-t with 5 degrees of freedom\n"
	"for a 1 in " << i << " chance is " << t << endl;
	}

	[tip

	[*Critical values are just quantiles]

	Some texts talk about quantiles, or percentiles or fractiles,
	others about critical values, the basic rule is:

	['Lower critical values] are the same as the quantile.

	['Upper critical values] are the same as the quantile from the complement
	of the probability.

	For example, suppose we have a Bernoulli process, giving rise to a binomial
	distribution with success ratio 0.1 and 100 trials in total. The
	['lower critical value] for a probability of 0.05 is given by:

	`quantile(binomial(100, 0.1), 0.05)`

	and the ['upper critical value] is given by:

	`quantile(complement(binomial(100, 0.1), 0.05))`

	which return 4.82 and 14.63 respectively.
	]

	[#why_complements]
	[tip

	[*Why bother with complements anyway?]

	It's very tempting to dispense with complements, and simply subtract
	the probability from 1 when required. However, consider what happens when
	the probability is very close to 1: let's say the probability expressed at
	float precision is `0.999999940f`, then `1 - 0.999999940f = 5.96046448e-008`,
	but the result is actually accurate to just ['one single bit]: the only
	bit that didn't cancel out!

	Or to look at this another way: consider that we want the risk of falsely
	rejecting the null-hypothesis in the Student's t test to be 1 in 1 billion,
	for a sample size of 10,000.
	This gives a probability of 1 - 10[super -9], which is exactly 1 when
	calculated at float precision. In this case calculating the quantile from
	the complement neatly solves the problem, so for example:

	`quantile(complement(students_t(10000), 1e-9))`

	returns the expected t-statistic `6.00336`, where as:

	`quantile(students_t(10000), 1-1e-9f)`

	raises an overflow error, since it is the same as:

	`quantile(students_t(10000), 1)`

	Which has no finite result.

	With all distributions, even for more reasonable probability
	(unless the value of p can be represented exactly in the floating-point type)
	the loss of accuracy quickly becomes significant if you simply calculate probability from 1 - p
	(because it will be mostly garbage digits for p ~ 1).

	So always avoid, for example, using a probability near to unity like 0.99999

	`quantile(my_distribution, 0.99999)`

	and instead use

	`quantile(complement(my_distribution, 0.00001))`

	since 1 - 0.99999 is not exactly equal to 0.00001 when using floating-point arithmetic.

	This assumes that the 0.00001 value is either a constant,
	or can be computed by some manner other than subtracting 0.99999 from 1.

	] [/ tip *Why bother with complements anyway?]

	[endsect] [/ section:complements Complements are supported too - and why]

	[section:parameters Parameters can be calculated]

	Sometimes it's the parameters that define the distribution that you
	need to find. Suppose, for example, you have conducted a Students-t test
	for equal means and the result is borderline. Maybe your two samples
	differ from each other, or maybe they don't; based on the result
	of the test you can't be sure. A legitimate question to ask then is
	"How many more measurements would I have to take before I would get
	an X% probability that the difference is real?" Parameter finders
	can answer questions like this, and are necessarily different for
	each distribution. They are implemented as static member functions
	of the distributions, for example:

	students_t::find_degrees_of_freedom(
	1.3, // difference from true mean to detect
	0.05, // maximum risk of falsely rejecting the null-hypothesis.
	0.1, // maximum risk of falsely failing to reject the null-hypothesis.
	0.13); // sample standard deviation

	Returns the number of degrees of freedom required to obtain a 95%
	probability that the observed differences in means is not down to
	chance alone. In the case that a borderline Students-t test result
	was previously obtained, this can be used to estimate how large the sample size
	would have to become before the observed difference was considered
	significant. It assumes, of course, that the sample mean and standard
	deviation are invariant with sample size.

	[endsect] [/ section:parameters Parameters can be calculated]

	[section:summary Summary]

	* Distributions are objects, which are constructed from whatever
	parameters the distribution may have.
	* Member functions allow you to retrieve the parameters of a distribution.
	* Generic non-member functions provide access to the properties that
	are common to all the distributions (PDF, CDF, quantile etc).
	* Complements of probabilities are calculated by wrapping the function's
	arguments in a call to `complement(...)`.
	* Functions that accept a probability can accept a complement of the
	probability as well, by wrapping the function's
	arguments in a call to `complement(...)`.
	* Static member functions allow the parameters of a distribution
	to be found from other information.

	Now that you have the basics, the next section looks at some worked examples.

	[endsect] [/section:summary Summary]
	[endsect] [/section:overview Overview]

	[section:weg Worked Examples]
	[include distributions/distribution_construction.qbk]
	[include distributions/students_t_examples.qbk]
	[include distributions/chi_squared_examples.qbk]
	[include distributions/f_dist_example.qbk]
	[include distributions/binomial_example.qbk]
	[include distributions/geometric_example.qbk]
	[include distributions/negative_binomial_example.qbk]
	[include distributions/normal_example.qbk]
	[/include distributions/inverse_gamma_example.qbk]
	[/include distributions/inverse_gaussian_example.qbk]
	[include distributions/nc_chi_squared_example.qbk]
	[include distributions/error_handling_example.qbk]
	[include distributions/find_location_and_scale.qbk]
	[include distributions/nag_library.qbk]
	[include distributions/c_sharp.qbk]
	[endsect] [/section:weg Worked Examples]

	[include background.qbk]

	[endsect] [/ section:stat_tut Statistical Distributions Tutorial]

	[/ dist_tutorial.qbk
	Copyright 2006, 2010 John Maddock and Paul A. Bristow.
	Distributed under the Boost Software License, Version 1.0.
	(See accompanying file LICENSE_1_0.txt or copy at
	http://www.boost.org/LICENSE_1_0.txt).
	]