Mar 2, 2026
A random variable is a variable: a symbol that takes the place of a number in an equation or inequality, together with the probabilities of the values it can take. Random variables are specified by their cumulative distribution function F^X(x), or just F(x) if X is understood, giving the probability that X\le x.
Every cdf must satisfy 0\le F\le 1, since it is a probability, and must be nondecreasing, F(x) \le F(y) if x < y, since X\le x implies X\le y.
Exercise. Show every cdf is right continuous.
Hint. This means \lim_{h\downarrow 0} F(x + h) = F(x). Use (-\infty, x] = \cap_{h>0} (-\infty, x + h].
Exercise. Show every cdf has left limits.
Hint. This means \lim_{h\downarrow 0} F(x - h) exists. Use (-\infty, x) = \cup_{h>0} (-\infty, x - h] and every bounded set in {\boldsymbol{R}} has a least upper bound.
Any function satisfying these properties, together with \lim_{x\to-\infty}F(x) = 0 and \lim_{x\to\infty}F(x) = 1, is the cdf of a random variable.
The expected value, or mean, of X is the Riemann–Stieltjes integral E[X] = \int_{\boldsymbol{R}}x\,dF(x). The Greek letter μ is typically used for this. Expected value is a measure of central tendency: a single number indicating the location of X. The median of X is a number m with F(m) = P(X \le m) = 1/2. It is a more robust measure of location: changing the values of F(x) for x > m or x < m does not change the median.
The expected value of f(X) for a function f\colon{\boldsymbol{R}}\to{\boldsymbol{R}} is E[f(X)] = \int_{\boldsymbol{R}}f(x)\,dF(x).
Exercise. If f is continuous show \int_{{\boldsymbol{R}}} f(x)\,dF(x) = f(a) when F(x) = 1(x \ge a).
Hint. All terms in the Riemann–Stieltjes sum are zero except for the one from the subinterval containing a.
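The hint can be checked numerically. Below is a minimal sketch in plain Python (no libraries; the grid size n is an arbitrary choice) approximating the Riemann–Stieltjes integral by a left-endpoint sum. With the step cdf F(x) = 1(x \ge 0), every increment F(x_{i+1}) - F(x_i) vanishes except on the subinterval containing 0, so the sum collapses to f(0):

```python
import math

def rs_integral(f, F, a, b, n=100000):
    """Approximate the Riemann-Stieltjes integral of f against F on [a, b]
    by summing f(x_i) * (F(x_{i+1}) - F(x_i)) over a uniform grid."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + i * h
        total += f(x) * (F(x + h) - F(x))
    return total

# Step cdf F(x) = 1(x >= 0): dF is a unit mass at 0, so the integral is f(0).
F = lambda x: 1.0 if x >= 0 else 0.0
print(rs_integral(math.cos, F, -1.0, 1.0))  # ≈ cos(0) = 1
```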
If A is a set the indicator function 1_A(x) is 1 when x\in A and 0 when x\not\in A. If A(x) is a proposition, a statement that is either true or false, involving x then 1(A(x)) is 1 when A(x) is true and 0 when A(x) is false.
✓ Exercise. If X = a with probability one show F(x) = 1(x \ge a).
This is also written as dF(x) = δ_a(x), the delta function with unit mass at a. Delta functions are set functions. Given a set A the delta “function” δ_a(A) is 1 if a\in A and 0 if a\not\in A.
A random variable X is discrete if P(X = x_j) = p_j where p_j > 0 and \sum_j p_j = 1. The data (x_j, p_j) completely specify the random variable.
A discrete random variable that can only take on values 0 or 1 is called Bernoulli. A Bernoulli random variable is specified by a single number p\in(0, 1) where P(X = 1) = p. This implies P(X = 0) = 1 - p.
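As a concrete sketch using Python's standard library, a Bernoulli variate can be generated by comparing a uniform draw to p (the seed and sample size here are arbitrary choices):

```python
import random

def bernoulli(p, rng):
    """Return 1 with probability p, else 0, from a uniform draw on [0, 1)."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)
freq = sum(bernoulli(0.3, rng) for _ in range(100000)) / 100000
print(freq)  # ≈ 0.3 = P(X = 1)
```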
A random variable X is continuously distributed if F(x) = \int_{-\infty}^x dF(u) = \int_{-\infty}^x F'(u)\,du. The function f(x) = F'(x) is the density function of the random variable.
The random variable U is uniformly distributed on the interval [0,1] if it has density function f(x) = 1_{[0,1]}(x), -\infty < x < \infty. A remarkable fact is that every random variable can be defined using this. Random variables have the same law if they have the same cumulative distribution function.
✓ Exercise. If X has cdf F then X has the same law as F^{-1}(U).
Hint. What is P(F^{-1}(U) \le x)? Use P(U \le u) = u for 0\le u\le 1.
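This exercise is the basis of inverse transform sampling. A minimal sketch for the exponential cdf F(x) = 1 - e^{-x}, whose inverse is F^{-1}(u) = -\log(1-u) (seed and sample size are arbitrary choices):

```python
import math
import random

def exp_inverse_cdf(u):
    """F^{-1}(u) for the standard exponential cdf F(x) = 1 - e^{-x}, x >= 0."""
    return -math.log(1.0 - u)

rng = random.Random(1)
xs = [exp_inverse_cdf(rng.random()) for _ in range(100000)]
mean_x = sum(xs) / len(xs)
print(mean_x)  # ≈ 1, the mean of the standard exponential
```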
There are no random random variables. The uniform distribution has enough randomness to specify any random variable.
The n-th moment of X is μ_n = E[X^n] when it exists. The moment generating function of X is m(s) = E[e^{sX}] = \sum_{n=0}^\infty μ_n s^n/n!. It is possible for two unequal random variables to have the same moments (cite?).
The n-th central moment of X is \bar{μ}_n = E[(X - E[X])^n]. The second central moment is the variance.
✓ Exercise. Show \operatorname{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2.
Hint. Expectation is an integral so it is linear, E[aX + b] = aE[X] + b.
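The identity can be verified on a small discrete distribution (the values and probabilities below are a hypothetical example):

```python
xs = [1.0, 2.0, 3.0]   # hypothetical discrete distribution
ps = [0.2, 0.5, 0.3]
mean = sum(p * x for x, p in zip(xs, ps))
var_direct   = sum(p * (x - mean) ** 2 for x, p in zip(xs, ps))   # E[(X - E[X])^2]
var_shortcut = sum(p * x * x for x, p in zip(xs, ps)) - mean ** 2 # E[X^2] - E[X]^2
print(mean, var_direct, var_shortcut)  # mean 2.1; both variances ≈ 0.49
```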
Variance is a measure of dispersion, how far the random variable can stray from its mean. The standard deviation is the square root of the variance and has the same units as X. The Greek letter σ is typically used for the standard deviation.
✓ Exercise. If \operatorname{Var}(X) = 0 show P(X = E[X]) = 1.
✓ Exercise. If X has first and second moments then (X - μ)/σ has mean zero and variance one.
Hint. \operatorname{Var}(X + a) = \operatorname{Var}(X) and \operatorname{Var}(aX) = a^2\operatorname{Var}(X) for a\in{\boldsymbol{R}}.
Subtracting the mean and dividing by the standard deviation is called standardizing the random variable.
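A sketch of standardizing an empirical sample with Python's statistics module (the data are a hypothetical sample; pstdev is the population standard deviation):

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical sample
mu = statistics.fmean(data)
sigma = statistics.pstdev(data)   # population standard deviation
z = [(x - mu) / sigma for x in data]
print(statistics.fmean(z), statistics.pstdev(z))  # 0.0 and 1.0
```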
The third moment of a standardized random variable is called skewness, a measure of how lopsided a distribution is.
✓ Exercise. If X and -X have the same distribution then the skewness of X is zero.
The fourth moment of a standardized random variable is called kurtosis, a measure of how peaked a distribution is.
The standard normal density function is f(x) = e^{-x^2/2}/\sqrt{2\pi}, -\infty < x < \infty.
Exercise. Show \int_{-\infty}^\infty e^{-\pi x^2}\,dx = 1.
✓ Exercise. Show \int_{\boldsymbol{R}}e^{-\alpha x^2}\,dx = \sqrt{\pi/\alpha}.
This shows \int_{\boldsymbol{R}}e^{-x^2/2}\,dx = \sqrt{2\pi}.
✓ Exercise. If Z is standard normal then E[e^{sZ}] = e^{s^2/2}.
Hint. E[e^{sZ}] = \int_{\boldsymbol{R}}e^{sz} e^{-z^2/2}/\sqrt{2\pi}\,dz. Use sz - z^2/2 = s^2/2 - (z - s)^2/2.
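The formula E[e^{sZ}] = e^{s^2/2} can be checked by Monte Carlo with the standard library (seed, s, and sample size are arbitrary choices):

```python
import math
import random

rng = random.Random(2)
s, n = 0.5, 200000
# Monte Carlo estimate of E[e^{sZ}] for standard normal Z
mc = sum(math.exp(s * rng.gauss(0.0, 1.0)) for _ in range(n)) / n
print(mc, math.exp(s * s / 2))  # ≈ e^{1/8} ≈ 1.1331
```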
✓ Exercise. If N is normal show E[e^N] = e^{E[N] + \operatorname{Var}(N)/2}.
Hint. Every normal can be written as N = \mu + \sigma Z where Z is standard normal.
✓ Exercise. If Z is standard normal then E[e^{sZ} f(Z)] = E[e^{sZ}] E[f(Z + s)].
Hint. E[e^{sZ} f(Z)] = \int_{\boldsymbol{R}}e^{sz} f(z) e^{-z^2/2}/\sqrt{2\pi}\,dz = e^{-s^2/2}\int_{\boldsymbol{R}}f(z) e^{-(z - s)^2/2}/\sqrt{2\pi}\,dz.
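The identity E[e^{sZ} f(Z)] = E[e^{sZ}] E[f(Z + s)] can also be checked by Monte Carlo; f(z) = z^2 below is an arbitrary test function, for which both sides equal e^{s^2/2}(1 + s^2):

```python
import math
import random

rng = random.Random(3)
s, n = 0.5, 200000
f = lambda z: z * z   # arbitrary test function
zs = [rng.gauss(0.0, 1.0) for _ in range(n)]
lhs = sum(math.exp(s * z) * f(z) for z in zs) / n             # E[e^{sZ} f(Z)]
rhs = math.exp(s * s / 2) * sum(f(z + s) for z in zs) / n     # E[e^{sZ}] E[f(Z+s)]
print(lhs, rhs)  # both ≈ e^{1/8}(1 + 1/4) ≈ 1.416
```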
✓ Exercise. If N is normal then E[e^N f(N)] = E[e^N] E[f(N + \operatorname{Var}(N))].
Exercise. Show the 2n-th moment of a standard normal is (2n)!/2^n n!.
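The formula gives the familiar values E[Z^2] = 1, E[Z^4] = 3, E[Z^6] = 15, E[Z^8] = 105; a one-line check:

```python
from math import factorial

def normal_even_moment(n):
    """E[Z^{2n}] = (2n)!/(2^n n!) for standard normal Z."""
    return factorial(2 * n) // (2 ** n * factorial(n))

print([normal_even_moment(n) for n in range(1, 5)])  # [1, 3, 15, 105]
```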
The logarithm of the moment generating function is the cumulant generating function κ(s) = \log E[e^{sX}] = \sum_{n=1}^\infty κ_n s^n/n!. The coefficients (κ_n) are the cumulants.
✓ Exercise. Show κ(0) = 0, κ'(0) = E[X], and κ''(0) = \operatorname{Var}(X).
\kappa(0) = \log E[e^0] = \log 1 = 0.
\kappa'(s) = E[Xe^{sX}]/E[e^{sX}] so \kappa'(0) = E[X].
\kappa''(s) = (E[e^{sX}] E[X^2e^{sX}])/E[e^{sX}]^2 - E[Xe^{sX}]E[Xe^{sX}]/E[e^{sX}]^2 so \kappa''(0) = E[X^2] - E[X]^2 = \operatorname{Var}(X).
✓ Exercise. If Z is standard normal show \kappa_n = 0 for n > 2.
The mean of a random variable places a constraint on its upper tail.
Lemma. (Markov) For any nonnegative random variable X and any λ > 0, P(X \ge λ) \le E[X]/λ.
Proof. E[X] \ge E[X1(X\ge λ)] \ge \lambda E[1(X\ge\lambda)] = λ P(X\ge λ).
✓ Exercise. (Chebyshev) Show P(|X - E[X]| > ε) \le \operatorname{Var}(X)/ε^2.
Hint. Replace X by (X - E[X])^2 and λ by ε^2.
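Chebyshev's bound can be illustrated with a standard exponential, which has mean 1 and variance 1, so the exact tail P(|X - 1| > 2) = P(X > 3) = e^{-3} sits well under the bound 1/4 (seed and sample size are arbitrary choices):

```python
import math
import random

rng = random.Random(4)
n, eps = 100000, 2.0
xs = [rng.expovariate(1.0) for _ in range(n)]   # exponential: mean 1, variance 1
tail = sum(1 for x in xs if abs(x - 1.0) > eps) / n
bound = 1.0 / eps ** 2   # Var(X)/eps^2 = 1/4
print(tail, bound)  # tail ≈ e^{-3} ≈ 0.05, under the Chebyshev bound 0.25
```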
Let (X_j) be independent, identically distributed random variables with mean μ and standard deviation σ. Define the average S_n = (1/n)\sum_{j=1}^n X_j.
Exercise. Show E[S_n] = μ and \operatorname{Var}(S_n) = σ^2/n.
E[S_n] = E[(1/n)\sum_{j=1}^n X_j] = nE[X]/n = \mu.
\operatorname{Var}(S_n) = \operatorname{Var}((1/n)\sum_{j=1}^n X_j) = \operatorname{Var}(\sum_{j=1}^n X_j)/n^2 = n\operatorname{Var}(X)/n^2 = \sigma^2/n.
Exercise. (Weak law of large numbers) Show for any ε > 0 that \lim_{n\to\infty} P(|S_n - μ| > ε) = 0.
Hint. By Chebyshev's inequality, P(|S_n - μ| > ε) \le \operatorname{Var}(S_n)/ε^2 = σ^2/(nε^2).
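The weak law is visible in simulation. A sketch with uniform [0,1] draws (mean 1/2, variance 1/12); the tail probability of the average shrinks as n grows (seed, ε, and trial counts are arbitrary choices):

```python
import random

rng = random.Random(5)
mu, eps, trials = 0.5, 0.05, 2000

def tail_prob(n):
    """Estimate P(|S_n - mu| > eps), where S_n averages n uniform [0,1] draws."""
    hits = 0
    for _ in range(trials):
        s_n = sum(rng.random() for _ in range(n)) / n
        if abs(s_n - mu) > eps:
            hits += 1
    return hits / trials

probs = [tail_prob(n) for n in (10, 100, 1000)]
print(probs)  # decreasing toward 0 as n grows
```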