Mar 2, 2026
A random variable is a variable: a symbol that takes the place of a number in an equation or inequality, together with the probabilities of the values it can take. Random variables are specified by their cumulative distribution function F^X(x), or just F(x) if X is understood, giving the probability that X\le x.
Every cdf must satisfy 0\le F\le 1, since it is a probability, and must be nondecreasing, F(x) \le F(y) if x < y, since X\le x implies X\le y.
Exercise. Show every cdf is right continuous.
Hint. This means \lim_{h\downarrow 0} F(x + h) = F(x). Use (-\infty, x] = \cap_{h>0} (-\infty, x + h].
Exercise. Show every cdf has left limits.
Hint. This means \lim_{h\downarrow 0} F(x - h) exists. Use (-\infty, x) = \cup_{h>0} (-\infty, x - h] and every bounded set in {\boldsymbol{R}} has a least upper bound.
Any function satisfying these properties, together with \lim_{x\to-\infty}F(x) = 0 and \lim_{x\to\infty}F(x) = 1, is the cdf of a random variable.
The expected value, or mean, of X is the Riemann–Stieltjes integral E[X] = \int_{\boldsymbol{R}}x\,dF(x). The Greek letter μ is typically used for this. Expected value is a measure of central tendency: a single number indicating the location of X. The median of X is a number m with F(m) = P(X \le m) = 1/2. It is a more robust measure of location: changing the values of F(x) for x > m or x < m does not change the median.
The expected value of f(X) for a function f\colon{\boldsymbol{R}}\to{\boldsymbol{R}} is E[f(X)] = \int_{\boldsymbol{R}}f(x)\,dF(x).
Exercise. If f is continuous show \int_{{\boldsymbol{R}}} f(x)\,dF(x) = f(a) when F(x) = 1(x \ge a).
Hint. All terms in the Riemann–Stieltjes sum are zero except for the one from the subinterval containing a.
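The hint can be checked numerically. Below is a minimal sketch in plain Python (no libraries; the grid size n is an arbitrary choice) approximating the Riemann–Stieltjes integral by a left-endpoint sum. With the step cdf F(x) = 1(x \ge 0), every increment F(x_{i+1}) - F(x_i) vanishes except on the subinterval containing 0, so the sum collapses to f(0):

```python
import math

def rs_integral(f, F, a, b, n=100000):
    """Approximate the Riemann-Stieltjes integral of f against F on [a, b]
    by summing f(x_i) * (F(x_{i+1}) - F(x_i)) over a uniform grid."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * h
        total += f(x) * (F(x + h) - F(x))
    return total

# Step cdf F(x) = 1(x >= 0): dF is a unit mass at 0, so the integral is f(0).
F = lambda x: 1.0 if x >= 0 else 0.0
print(rs_integral(math.cos, F, -1.0, 1.0))  # ≈ cos(0) = 1
```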
If A is a set the indicator function 1_A(x) is 1 when x\in A and 0 when x\not\in A. If A(x) is a proposition, a statement that is either true or false, involving x then 1(A(x)) is 1 when A(x) is true and 0 when A(x) is false.
✓ Exercise. If X = a with probability one show F(x) = 1(x \ge a).
This is also written as dF(x) = δ_a(x), the delta function with unit mass at a. Delta functions are set functions. Given a set A the delta “function” δ_a(A) is 1 if a\in A and 0 if a\not\in A.
A random variable X is discrete if P(X = x_j) = p_j where p_j > 0 and \sum_j p_j = 1. The data (x_j, p_j) completely specify the random variable.
A discrete random variable that can only take on values 0 or 1 is called Bernoulli. A Bernoulli random variable is specified by a single number p\in(0, 1) where P(X = 1) = p. This implies P(X = 0) = 1 - p.
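As a concrete sketch using Python's standard library, a Bernoulli variate can be generated by comparing a uniform draw to p (the seed and sample size here are arbitrary choices):

```python
import random

def bernoulli(p, rng):
    """Return 1 with probability p, else 0, from a uniform draw on [0, 1)."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)
freq = sum(bernoulli(0.3, rng) for _ in range(100000)) / 100000
print(freq)  # ≈ 0.3 = P(X = 1)
```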
A random variable X is continuously distributed if F(x) = \int_{-\infty}^x dF(u) = \int_{-\infty}^x F'(u)\,du. The function f(x) = F'(x) is the density function of the random variable.
The random variable U is uniformly distributed on the interval [0,1] if it has density function f(x) = 1_{[0,1]}(x), -\infty < x < \infty. A remarkable fact is that every random variable can be defined using this. Random variables have the same law if they have the same cumulative distribution function.
✓ Exercise. If X has cdf F then X has the same law as F^{-1}(U).
Hint. What is P(F^{-1}(U) \le x)? Use P(U \le u) = u for 0\le u\le 1.
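This exercise is the basis of inverse transform sampling. A minimal sketch for the exponential cdf F(x) = 1 - e^{-x}, whose inverse is F^{-1}(u) = -\log(1-u) (seed and sample size are arbitrary choices):

```python
import math
import random

def exp_inverse_cdf(u):
    """F^{-1}(u) for the standard exponential cdf F(x) = 1 - e^{-x}, x >= 0."""
    return -math.log(1.0 - u)

rng = random.Random(1)
xs = [exp_inverse_cdf(rng.random()) for _ in range(100000)]
mean_x = sum(xs) / len(xs)
print(mean_x)  # ≈ 1, the mean of the standard exponential
```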
There are no random random variables. The uniform distribution has enough randomness to specify any random variable.
The n-th moment of X is μ_n = E[X^n] when it exists. The moment generating function of X is m(s) = E[e^{sX}] = \sum_{n=0}^\infty μ_n s^n/n!. It is possible for two unequal random variables to have the same moments (cite?).
The n-th central moment of X is \bar{μ}_n = E[(X - E[X])^n]. The second central moment is the variance.
✓ Exercise. Show \operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2.
Hint. Expectation is an integral so it is linear, E[aX + b] = aE[X] + b.
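The identity can be verified on a small discrete distribution (the values and probabilities below are a hypothetical example):

```python
xs = [1.0, 2.0, 3.0]   # hypothetical discrete distribution
ps = [0.2, 0.5, 0.3]
mean = sum(p * x for x, p in zip(xs, ps))
var_direct   = sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))   # E[(X - E[X])^2]
var_shortcut = sum(p * x * x for x, p in zip(xs, ps)) - mean ** 2 # E[X^2] - E[X]^2
print(mean, var_direct, var_shortcut)  # mean 2.1; both variances ≈ 0.49
```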
Variance is a measure of dispersion, how far the random variable can stray from its mean. The standard deviation is the square root of the variance and has the same units as X. The Greek letter σ is typically used for the standard deviation.
✓ Exercise. If \operatorname{Var}(X) = 0 show P(X = E[X]) = 1.
✓ Exercise. If X has first and second moments then (X - μ)/σ has mean zero and variance one.
Hint. \operatorname{Var}(X + a) = \operatorname{Var}(X) and \operatorname{Var}(aX) = a^2\operatorname{Var}(X) for a\in{\boldsymbol{R}}.
Subtracting the mean and dividing by the standard deviation is called standardizing the random variable.
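A sketch of standardizing an empirical sample with Python's statistics module (the data are a hypothetical sample; pstdev is the population standard deviation):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)   # population standard deviation
z = [(x - mu) / sigma for x in data]
print(statistics.fmean(z), statistics.pstdev(z))  # 0.0 and 1.0
```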
The third moment of a standardized random variable is called skewness, a measure of how lopsided a distribution is.
✓ Exercise. If X and -X have the same distribution then the skewness of X is zero.
The fourth moment of a standardized random variable is called kurtosis, a measure of how peaked a distribution is.
The standard normal density function is f(x) = e^{-x^2/2}/\sqrt{2\pi}, -\infty < x < \infty.
Exercise. Show \int_{-\infty}^\infty e^{-\pi x^2}\,dx = 1.
✓ Exercise. Show \int_{\boldsymbol{R}}e^{-\alpha x^2}\,dx = \sqrt{\pi/\alpha}.
This shows \int_{\boldsymbol{R}}e^{-x^2/2}\,dx = \sqrt{2\pi}.
✓ Exercise. If Z is standard normal then E[e^{sZ}] = e^{s^2/2}.
Hint. E[e^{sZ}] = \int_{\boldsymbol{R}}e^{sz} e^{-z^2/2}/\sqrt{2\pi}\,dz. Use sz - z^2/2 = s^2/2 - (z - s)^2/2.
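The formula E[e^{sZ}] = e^{s^2/2} can be checked by Monte Carlo with the standard library (seed, s, and sample size are arbitrary choices):

```python
import math
import random

rng = random.Random(2)
s, n = 0.5, 200000
# Monte Carlo estimate of E[e^{sZ}] for standard normal Z
mc = sum(math.exp(s * rng.gauss(0.0, 1.0)) for _ in range(n)) / n
print(mc, math.exp(s * s / 2))  # ≈ e^{1/8} ≈ 1.1331
```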
✓ Exercise. If N is normal show E[e^N] = e^{E[N] + \operatorname{Var}(N)/2}.
Hint. Every normal can be written as N = \mu + \sigma Z where Z is standard normal.
✓ Exercise. If Z is standard normal then E[e^{sZ} f(Z)] = E[e^{sZ}] E[f(Z + s)].
Hint. E[e^{sZ} f(Z)] = \int_{\boldsymbol{R}}e^{sz} f(z) e^{-z^2/2}/\sqrt{2\pi}\,dz = e^{-s^2/2}\int_{\boldsymbol{R}}f(z) e^{-(z - s)^2/2}/\sqrt{2\pi}\,dz.
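The identity E[e^{sZ} f(Z)] = E[e^{sZ}] E[f(Z + s)] can also be checked by Monte Carlo; f(z) = z^2 below is an arbitrary test function, for which both sides equal e^{s^2/2}(1 + s^2):

```python
import math
import random

rng = random.Random(3)
s, n = 0.5, 200000
f = lambda z: z * z   # arbitrary test function
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
lhs = sum(math.exp(s * z) * f(z) for z in zs) / n             # E[e^{sZ} f(Z)]
rhs = math.exp(s * s / 2) * sum(f(z + s) for z in zs) / n     # E[e^{sZ}] E[f(Z+s)]
print(lhs, rhs)  # both ≈ e^{1/8}(1 + 1/4) ≈ 1.416
```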
✓ Exercise. If N is normal then E[e^N f(N)] = E[e^N] E[f(N + \operatorname{Var}(N))].
Exercise. Show the 2n-th moment of a standard normal is (2n)!/2^n n!.
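The formula gives the familiar values E[Z^2] = 1, E[Z^4] = 3, E[Z^6] = 15, E[Z^8] = 105; a one-line check:

```python
from math import factorial

def normal_even_moment(n):
    """E[Z^{2n}] = (2n)!/(2^n n!) for standard normal Z."""
    return factorial(2 * n) // (2 ** n * factorial(n))

print([normal_even_moment(n) for n in range(1, 5)])  # [1, 3, 15, 105]
```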
The logarithm of the moment generating function is the cumulant generating function κ(s) = \log E[e^{sX}] = \sum_{n=1}^\infty κ_n s^n/n!. The coefficients (κ_n) are the cumulants.
✓ Exercise. Show κ(0) = 0, κ'(0) = E[X], and κ''(0) = \operatorname{Var}(X).
\kappa(0) = \log E[e^0] = \log 1 = 0.
\kappa'(s) = E[Xe^{sX}]/E[e^{sX}] so \kappa'(0) = E[X].
\kappa''(s) = (E[e^{sX}] E[X^2e^{sX}])/E[e^{sX}]^2 - E[Xe^{sX}]E[Xe^{sX}]/E[e^{sX}]^2 so \kappa''(0) = E[X^2] - E[X]^2 = \operatorname{Var}(X).
✓ Exercise. If Z is standard normal show \kappa_n = 0 for n > 2.
The mean of a random variable places a constraint on its upper tail.
Lemma. (Markov) For any nonnegative random variable X and any λ > 0, P(X \ge λ) \le E[X]/λ.
Proof. E[X] \ge E[X1(X\ge λ)] \ge \lambda E[1(X\ge\lambda)] = λ P(X\ge λ).
✓ Exercise. (Chebyshev) Show P(|X - E[X]| > ε) \le \operatorname{Var}(X)/ε^2.
Hint. Replace X by (X - E[X])^2 and λ by ε^2.
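Chebyshev's bound can be illustrated with a standard exponential, which has mean 1 and variance 1, so the exact tail P(|X - 1| > 2) = P(X > 3) = e^{-3} sits well under the bound 1/4 (seed and sample size are arbitrary choices):

```python
import math
import random

rng = random.Random(4)
n, eps = 100000, 2.0
xs = [rng.expovariate(1.0) for _ in range(n)]   # exponential: mean 1, variance 1
tail = sum(1 for x in xs if abs(x - 1.0) > eps) / n
bound = 1.0 / eps ** 2   # Var(X)/eps^2 = 1/4
print(tail, bound)  # tail ≈ e^{-3} ≈ 0.05, under the Chebyshev bound 0.25
```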
Let (X_j) be independent, identically distributed random variables with mean μ and standard deviation σ. Define the average S_n = (1/n)\sum_{j=1}^n X_j.
Exercise. Show E[S_n] = μ and \operatorname{Var}(S_n) = σ^2/n.
E[S_n] = E[(1/n)\sum_{j=1}^n X_j] = nE[X]/n = \mu.
\operatorname{Var}(S_n) = \operatorname{Var}((1/n)\sum_{j=1}^n X_j) = \operatorname{Var}(\sum_{j=1}^n X_j)/n^2 = n\operatorname{Var}(X)/n^2 = \sigma^2/n.
Exercise. (Weak law of large numbers) Show for any ε > 0 that \lim_{n\to\infty} P(|S_n - μ| > ε) = 0.
Hint. By Chebyshev's inequality, P(|S_n - μ| > ε) \le \operatorname{Var}(S_n)/ε^2 = σ^2/(nε^2).
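The weak law is visible in simulation. A sketch with uniform [0,1] draws (mean 1/2, variance 1/12); the tail probability of the average shrinks as n grows (seed, ε, and trial counts are arbitrary choices):

```python
import random

rng = random.Random(5)
mu, eps, trials = 0.5, 0.05, 2000

def tail_prob(n):
    """Estimate P(|S_n - mu| > eps), where S_n averages n uniform [0,1] draws."""
    hits = 0
    for _ in range(trials):
        s_n = sum(rng.random() for _ in range(n)) / n
        if abs(s_n - mu) > eps:
            hits += 1
    return hits / trials

probs = [tail_prob(n) for n in (10, 100, 1000)]
print(probs)  # decreasing toward 0 as n grows
```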