Posts filed under #tatifius

An extract on #tatifius

The central limit theorem states that under certain (fairly common) conditions, the sum of many random variables will have an approximately normal distribution. More specifically, where $X_1, \ldots, X_n$ are independent and identically distributed random variables with the same arbitrary distribution, zero mean, and variance $\sigma^2$, and $Z$ is their mean scaled by $\sqrt{n}$,

$$Z = \sqrt{n}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right),$$

then, as $n$ increases, the probability distribution of $Z$ will tend to the normal distribution with zero mean and variance $\sigma^2$.

The theorem can be extended to variables $(X_i)$ that are not independent and/or not identically distributed if certain constraints are placed on the degree of dependence and the moments of the distributions.

Many test statistics, scores, and estimators encountered in practice contain sums of certain random variables, and even more estimators can be represented as sums of random variables through the use of influence functions. The central limit theorem implies that those statistical parameters will have asymptotically normal distributions.

The central limit theorem also implies that certain distributions can be approximated by the normal distribution, for example:

The binomial distribution $B(n, p)$ is approximately normal with mean $np$ and variance $np(1-p)$ for large $n$ and for $p$ not too close to 0 or 1.

The Poisson distribution with parameter $\lambda$ is approximately normal with mean $\lambda$ and variance $\lambda$, for large values of $\lambda$.

The chi-squared distribution $\chi^2(k)$ is approximately normal with mean $k$ and variance $2k$, for large $k$.

The Student's t-distribution $t(\nu)$ is approximately normal with mean 0 and variance 1 when $\nu$ is large.

Whether these approximations are sufficiently accurate depends on the purpose for which they are needed and on the rate of convergence to the normal distribution. Such approximations are typically less accurate in the tails of the distribution.

A general upper bound for the approximation error in the central limit theorem is given by the Berry–Esseen theorem; improvements of the approximation are given by the Edgeworth expansions.
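The convergence described by the central limit theorem can be checked numerically. The following is a minimal sketch using only the Python standard library. It assumes Uniform(-1, 1) summands (zero mean, variance 1/3), and the function names, sample sizes, and seed are illustrative choices rather than anything prescribed by the text.

```python
import math
import random
import statistics


def scaled_mean(n, rng):
    """Return Z = sqrt(n) * (mean of n i.i.d. Uniform(-1, 1) draws)."""
    total = sum(rng.uniform(-1.0, 1.0) for _ in range(n))
    return math.sqrt(n) * (total / n)


def simulate(n, replicates, seed=0):
    """Draw `replicates` independent copies of Z for a given n."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [scaled_mean(n, rng) for _ in range(replicates)]


# Assumption for this sketch: Uniform(-1, 1) has zero mean and variance 1/3.
SIGMA = math.sqrt(1.0 / 3.0)
limit = statistics.NormalDist(mu=0.0, sigma=SIGMA)

zs = simulate(n=50, replicates=20_000)

# Compare the empirical distribution of Z with the N(0, sigma^2) limit.
empirical_var = statistics.pvariance(zs)
empirical_cdf_at_half = sum(z <= 0.5 for z in zs) / len(zs)

print(f"variance of Z: {empirical_var:.4f} (limit {SIGMA**2:.4f})")
print(f"P(Z <= 0.5):   {empirical_cdf_at_half:.4f} (limit {limit.cdf(0.5):.4f})")
```

The two printed pairs should agree to within simulation noise. Replacing the uniform draws with another zero-mean distribution of finite variance (and updating `SIGMA` accordingly) gives the same kind of comparison.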
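To illustrate the binomial case and the remark about tail accuracy, the sketch below compares the exact binomial CDF with its normal approximation at the centre and in the lower tail. The parameters `n = 100`, `p = 0.3` and the helper names are hypothetical choices for illustration. The continuity correction (`k + 0.5`) is a common refinement that is assumed here and is not part of the text above.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with mean np and variance np(1 - p).

    The +0.5 is the usual continuity correction (an assumption of this sketch).
    """
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)


def relative_error(k, n, p):
    """Relative error of the normal approximation to P(X <= k)."""
    exact = binomial_cdf(k, n, p)
    return abs(normal_approx_cdf(k, n, p) - exact) / exact


n, p = 100, 0.3  # mean 30, standard deviation about 4.58

for k in (30, 20, 15):  # centre first, then further into the lower tail
    exact = binomial_cdf(k, n, p)
    approx = normal_approx_cdf(k, n, p)
    print(f"k={k:2d}  exact={exact:.3e}  normal={approx:.3e}  "
          f"relative error={relative_error(k, n, p):.1%}")
```

The absolute error stays small everywhere, but the relative error grows as `k` moves away from the mean, which is one way to see why the approximation is less trustworthy for tail probabilities.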
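For reference, the Berry–Esseen bound mentioned above can be written out for the i.i.d. setting. The notation here is an assumption added for illustration: $\rho = \mathrm{E}|X_1|^3$ is taken to be finite, $F_n$ denotes the cumulative distribution function of the standardized sum $Z/\sigma$, $\Phi$ is the standard normal cumulative distribution function, and $C$ is an absolute constant.

```latex
\sup_{x \in \mathbb{R}} \left| F_n(x) - \Phi(x) \right|
  \;\le\; \frac{C \, \rho}{\sigma^{3} \sqrt{n}},
\qquad
F_n(x) = \Pr\!\left( \frac{1}{\sigma \sqrt{n}} \sum_{i=1}^{n} X_i \le x \right).
```

Read this way, the worst-case error of the normal approximation shrinks at rate $1/\sqrt{n}$, with a constant that grows with the third absolute moment of the summands.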