Written by Oleksandr Gavenko (AKA gavenkoa), compiled on 2023-03-19 from rev c18d218b854e.

Continuous random variables

Probability density function

Probability density function (PDF) for a continuous random variable X is a function fX(x) such that:

P(a ≤ X ≤ b) = ∫a, b fX(x) dx
fX(x) ≥ 0
∫−∞, +∞ fX(x) dx = 1

The function fX(x) maps values x from the sample space to real numbers.

For a continuous random variable:

P(X = a) = 0
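
Note

These properties are easy to check for a concrete density. A sketch in Maxima, with a hypothetical PDF fX(x) = 2·x on [0, 1] (zero elsewhere):

f(x) := 2*x;
integrate(f(x), x, 0, 1);   /* total probability: expected 1 */
integrate(f(x), x, a, a);   /* P(X = a): expected 0 */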

Expectation

The expectation of a continuous random variable is:

μ = E[X] = ∫−∞, +∞ x·fX(x) dx

Properties:

E[X + Y] = E[X] + E[Y]
E[a·X] = a·E[X]
E[a·X + b] = a·E[X] + b
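
Note

A sanity check of linearity in Maxima, reusing the hypothetical PDF f(x) = 2·x on [0, 1], for which E[X] = 2/3:

f(x) := 2*x;
EX: integrate(x*f(x), x, 0, 1);   /* E[X] = 2/3 */
/* E[a*X + b] - (a*E[X] + b): expected 0 */
ratsimp(integrate((a*x + b)*f(x), x, 0, 1) - (a*EX + b));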

Variance

The variance of a continuous random variable is:

var[X] = ∫−∞, +∞ (x − μ)²·fX(x) dx

Properties:

var[a·X + b] = a²·var[X]
var[X] = E[X²] − E²[X]
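
Note

Both properties can be checked in Maxima with the same hypothetical PDF f(x) = 2·x on [0, 1]:

f(x) := 2*x;
EX: integrate(x*f(x), x, 0, 1);      /* E[X] = 2/3 */
EX2: integrate(x^2*f(x), x, 0, 1);   /* E[X^2] = 1/2 */
V: EX2 - EX^2;                       /* var[X] = E[X^2] - E^2[X] = 1/18 */
/* var[a*X + b] from the definition minus a^2*var[X]: expected 0 */
ratsimp(integrate((a*x + b - (a*EX + b))^2*f(x), x, 0, 1) - a^2*V);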

Standard deviation

The standard deviation of a continuous random variable is:

σX = sqrt(var[X])

Cumulative distribution functions

The cumulative distribution function (CDF) of a random variable X is:

FX(x) = P(X ≤ x) = ∫−∞, x fX(t) dt

So:

P(a ≤ X ≤ b) = FX(b) − FX(a) = ∫a, b fX(x) dx
FX(−∞) = 0
FX(+∞) = 1

and FX(a) ≤ FX(b) for a ≤ b.

Relation between CDF and PDF:

(d FX(t) ⁄ dt)(x) = fX(x)
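
Note

For example, in Maxima the exponential PDF (defined below) integrates to the CDF 1 − exp(−λ·x), and differentiation recovers the PDF:

assume(lambda > 0, x > 0)$
f(t) := lambda*exp(-lambda*t);
F: integrate(f(t), t, 0, x);   /* CDF: expected 1 - exp(-lambda*x) */
diff(F, x);                    /* back to the PDF: lambda*exp(-lambda*x) */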

Conditional probability

The conditional probability for a continuous random variable is:

P(X ∈ B|A) = ∫B fX|A(x) dx = ∫A∩B fX(x) dx ⁄ P(A)

The conditional expectation of a continuous random variable is:

E[X|A] = ∫ x·fX|A(x) dx

Properties:

E[g(X)|A] = ∫ g(x)·fX|A(x) dx

Independence

Random variables X, Y are independent if:

fX, Y(x, y) = fX(x)·fY(y)

Continuous uniform random variable

A continuous uniform random variable unif(a, b) has PDF fX(x) that is non-zero only on [a, b], where fX(x) = 1 ⁄ (b − a).

E[unif(a, b)] = (b + a) ⁄ 2
var[unif(a, b)] = (b − a)² ⁄ 12
σ = (b − a) ⁄ sqrt(12)

Proofs:

E[unif(a, b)] = ∫a, b x·1 ⁄ (b − a) dx = x² ⁄ 2 ⁄ (b − a)|a, b = (b² − a²) ⁄ (b − a) ⁄ 2 = (b + a) ⁄ 2
E[unif²(a, b)] = ∫a, b x²·1 ⁄ (b − a) dx = x³ ⁄ 3 ⁄ (b − a)|a, b = (b³ − a³) ⁄ (b − a) ⁄ 3 = (b² + b·a + a²) ⁄ 3
var[unif(a, b)] = E[unif²(a, b)] − E²[unif(a, b)] = (b² + b·a + a²) ⁄ 3 − (b + a)² ⁄ 4 = (b − a)² ⁄ 12

Note

In Maxima:

(%i4) factor((b^2+b*a+a^2)/3 - (a+b)^2/4);
            2
     (b - a)
     --------
        12

Exponential random variables

An exponential random variable with parameter λ has PDF:

fX(x) = λ·exp( − λ·x)

for x ≥ 0, and zero otherwise.

Properties:

E[exp(λ)] = 1 ⁄ λ
var[exp(λ)] = 1 ⁄ λ²

Proof:

∫−∞, +∞ fX(x) dx = ∫0, +∞ λ·exp(−λ·x) dx = −exp(−λ·x)|0, +∞ = 1
E[exp(λ)] = ∫0, +∞ x·λ·exp(−λ·x) dx = 1 ⁄ λ
E[exp²(λ)] = ∫0, +∞ x²·λ·exp(−λ·x) dx = 2 ⁄ λ²
var[exp(λ)] = E[exp²(λ)] − E²[exp(λ)] = 2 ⁄ λ² − 1 ⁄ λ² = 1 ⁄ λ²

Note

From Maxima:

(%i15) assume(lambda>0);
(%o15)                           [lambda > 0]

(%i16) integrate(lambda*%e^(-lambda*x),x,0,inf);
(%o16)                                 1

(%i17) integrate(x*lambda*%e^(-lambda*x),x,0,inf);
                                      1
(%o17)                              ------
                                    lambda

(%i18) integrate(x^2*lambda*%e^(-lambda*x),x,0,inf);
                                       2
(%o18)                              -------
                                          2
                                    lambda

Normal random variables

A normal random variable with parameters μ and σ² (σ > 0) is defined by the PDF:

norm(μ, σ²) = 1 ⁄ sqrt(2·π) ⁄ σ·exp( − (x − μ)² ⁄ σ² ⁄ 2)

Properties:

E[norm(μ, σ²)] = μ
var[norm(μ, σ²)] = σ²
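
Note

A Maxima sketch of these properties (assume(sigma > 0) is needed for the Gaussian integrals to evaluate):

assume(sigma > 0)$
f(x) := exp(-(x - mu)^2/(2*sigma^2))/(sqrt(2*%pi)*sigma);
integrate(f(x), x, -inf, inf);             /* total probability: expected 1 */
integrate(x*f(x), x, -inf, inf);           /* expectation: expected mu */
integrate((x - mu)^2*f(x), x, -inf, inf);  /* variance: expected sigma^2 */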

Sum of two normal r.v.

If Z = X + Y and X and Y are independent normal r.v. then:

norm(μz, σz²) = norm(μx + μy, σx² + σy²)

Proof:

norm(μz, σz²) = ∫x fX(x)·fY(z − x) dx
 = ∫x 1 ⁄ sqrt(2·π) ⁄ σx·exp(−(x − μx)² ⁄ σx² ⁄ 2)·1 ⁄ sqrt(2·π) ⁄ σy·exp(−(z − x − μy)² ⁄ σy² ⁄ 2) dx
 = 1 ⁄ sqrt(2·π·(σx² + σy²))·exp(−(z − μx − μy)² ⁄ (σx² + σy²) ⁄ 2)
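
Note

The convolution integral can be checked in Maxima for the special case μx = μy = 0, σx = σy = 1; the expected result is the norm(0, 2) PDF exp(−z² ⁄ 4) ⁄ (2·sqrt(π)):

integrate(exp(-x^2/2)/sqrt(2*%pi)*exp(-(z - x)^2/2)/sqrt(2*%pi), x, -inf, inf);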

Linear function of distribution

If Y = a·X + b then fY(y) = 1 ⁄ |a|·fX((y − b) ⁄ a).

Proof, for a > 0:

FY(Y ≤ y) = P(a·X + b ≤ y) = P(X ≤ (y − b) ⁄ a) = FX((y − b) ⁄ a)

so:

fY(y) = d ⁄ dy FY(y) = d ⁄ dy FX((y − b) ⁄ a) = 1 ⁄ a·fX((y − b) ⁄ a)

For a < 0:

FY(Y ≤ y) = P(a·X + b ≤ y) = P(X ≥ (y − b) ⁄ a) = 1 − FX((y − b) ⁄ a)
fY(y) = d ⁄ dy FY(y) = −1 ⁄ a·fX((y − b) ⁄ a)

Combining both expressions for a ≠ 0 gives the result.

If X is a uniform r.v. with parameters c, d then a·X + b is also a uniform r.v. with parameters a·c + b, a·d + b.

If X is an exponential r.v. with parameter λ then a·X is also an exponential r.v. with parameter λ ⁄ a for a > 0.

If X is a normal r.v. with parameters μ, σ² then a·X + b is also a normal r.v. with parameters a·μ + b, (a·σ)².

Proofs:

When X ~ exp(λ) and Y = a·X then:

fY(y) = 1 ⁄ a·fX(y ⁄ a) = λ ⁄ a·exp(−λ·y ⁄ a), which is the PDF of exp(λ ⁄ a)

When X ~ norm(μ, σ²) and Y = a·X + b then:

fY(y) = 1 ⁄ a·fX((y − b) ⁄ a) = 1 ⁄ a·1 ⁄ sqrt(2·π) ⁄ σ·exp(−((y − b) ⁄ a − μ)² ⁄ σ² ⁄ 2)
 = 1 ⁄ sqrt(2·π) ⁄ (a·σ)·exp(−(y − (a·μ + b))² ⁄ (a·σ)² ⁄ 2), which is norm(a·μ + b, (a·σ)²)
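
Note

A quick Maxima check of the exponential case: the claimed PDF of Y = a·X integrates to 1 and has mean a ⁄ λ, as exp(λ ⁄ a) should:

assume(lambda > 0, a > 0)$
g(y) := lambda/a*exp(-lambda*y/a);   /* claimed PDF of Y = a*X */
integrate(g(y), y, 0, inf);          /* total probability: expected 1 */
integrate(y*g(y), y, 0, inf);        /* mean: expected a/lambda */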

Monotonic function of distribution

Let Y = g(X) where g is a monotonic function on the range [a, b]. Then there is an inverse function h(Y) = X on the range [g(a), g(b)] (if g is increasing) or on the range [g(b), g(a)] (if g is decreasing). In that case:

fY(y) = fX(h(y))·|(d h(t) ⁄ dt)(y)|

Proof. Let g be a monotonically increasing function (then h is also increasing and the absolute value can be dropped). Thus:

FY(Y ≤ y) = FX(g(X) ≤ y) = FX(X ≤ h(y)) = FX(h(y))

and so:

fY(y) = (d FY(t) ⁄ dt)(y) = (d FX(h(t)) ⁄ dt)(y) = fX(h(y))·(d h(t) ⁄ dt)(y)
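
Note

A worked example (hypothetical): for X ~ unif(0, 1) and Y = exp(X) we have h(y) = log(y) and (d h(t) ⁄ dt)(y) = 1 ⁄ y, so fY(y) = 1 ⁄ y on [1, e]. In Maxima:

fY(y) := 1/y;
integrate(fY(y), y, 1, %e);    /* total probability: expected 1 */
integrate(y*fY(y), y, 1, %e);  /* E[Y]: expected %e - 1 */
integrate(exp(x), x, 0, 1);    /* E[exp(X)] directly: also %e - 1 */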

Convolution formula

If Z = X + Y and X and Y are independent r.v. then:

fZ(z) = ∫x fX(x)·fY(z − x) dx

Proof:

Consider Z conditioned on the event X = x. Because of independence of X and Y:

fZ|X(z|X = x) = fX + Y|X(z|X = x) = fx + Y(z) = fY(z − x)

The joint PDF of X and Z is:

fX, Z(x, z) = fX(x)·fZ|X(z|X = x) = fX(x)·fY(z − x)

Integrating over x we get:

fZ(z) = ∫x fX, Z(x, z) dx = ∫x fX(x)·fY(z − x) dx
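
Note

For example, for independent X, Y ~ exp(λ) the integrand is non-zero only for 0 ≤ x ≤ z, and Maxima gives the Erlang PDF λ²·z·exp(−λ·z):

assume(lambda > 0, z > 0)$
integrate(lambda*exp(-lambda*x)*lambda*exp(-lambda*(z - x)), x, 0, z);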

Covariance

The covariance of two r.v. is:

cov(X, Y) = E[(X − E[X])·(Y − E[Y])]

Properties:

cov(X, Y) = E[X·Y] − E[X]·E[Y]
cov(X, X) = var(X)
cov(a·X + b, Y) = a·cov(X, Y)
cov(X, Y + Z) = cov(X, Y) + cov(X, Z)
var(X + Y) = var(X) + var(Y) + 2·cov(X, Y)

Covariance of two independent r.v. is zero.

Proofs:

cov(X, Y) = E[(X − E[X])·(Y − E[Y])] = E[X·Y − X·E[Y] − E[X]·Y + E[X]·E[Y]]
 = E[X·Y] − E[X·E[Y]] − E[E[X]·Y] + E[E[X]·E[Y]]
 = E[X·Y] − E[X]·E[Y] − E[X]·E[Y] + E[X]·E[Y] = E[X·Y] − E[X]·E[Y]
cov(a·X + b, Y) = E[(a·X + b − E[a·X + b])·(Y − E[Y])]
 = E[(a·X + b − (a·E[X] + b))·(Y − E[Y])] = E[(a·X − a·E[X])·(Y − E[Y])]
 = a·E[(X − E[X])·(Y − E[Y])] = a·cov(X, Y)
cov(X, Y + Z) = E[(X − E[X])·(Y + Z − E[Y + Z])] = E[(X − E[X])·(Y − E[Y] + Z − E[Z])]
 = E[(X − E[X])·(Y − E[Y]) + (X − E[X])·(Z − E[Z])]
 = E[(X − E[X])·(Y − E[Y])] + E[(X − E[X])·(Z − E[Z])] = cov(X, Y) + cov(X, Z)
var(X) + var(Y) + 2·cov(X, Y) = E[X²] − (E[X])² + E[Y²] − (E[Y])² + 2·E[X·Y] − 2·E[X]·E[Y]
 = E[X² − X·E[X] + Y² − Y·E[Y] + 2·X·Y − X·E[Y] − Y·E[X]]
 = E[(X + Y)² − (X·E[X] + Y·E[Y] + X·E[Y] + Y·E[X])]
 = E[(X + Y)²] − E[(X + Y)·(E[X] + E[Y])] = E[(X + Y)²] − E[X + Y]·E[X + Y] = var(X + Y)

For independent r.v. X and Y:

cov(X, Y) = E[(X − E[X])·(Y − E[Y])] = E[(X − E[X])]·E[(Y − E[Y])] = 0
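
Note

The identities can also be verified numerically. A Maxima sketch with a hypothetical joint PMF p[x, y] on {0, 1} × {0, 1}:

p[0,0]: 1/8$ p[0,1]: 1/4$
p[1,0]: 1/2$ p[1,1]: 1/8$
EX: sum(sum(x*p[x,y], y, 0, 1), x, 0, 1);            /* 5/8 */
EY: sum(sum(y*p[x,y], y, 0, 1), x, 0, 1);            /* 3/8 */
EXY: sum(sum(x*y*p[x,y], y, 0, 1), x, 0, 1);         /* 1/8 */
cXY: EXY - EX*EY;                                    /* cov(X, Y) = -7/64 */
vX: sum(sum(x^2*p[x,y], y, 0, 1), x, 0, 1) - EX^2;   /* var(X) = 15/64 */
vY: sum(sum(y^2*p[x,y], y, 0, 1), x, 0, 1) - EY^2;   /* var(Y) = 15/64 */
/* var(X+Y) from the definition minus (var(X) + var(Y) + 2*cov(X, Y)): expected 0 */
sum(sum((x + y)^2*p[x,y], y, 0, 1), x, 0, 1) - (EX + EY)^2 - (vX + vY + 2*cXY);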

Correlation coefficient

Dimensionless version of covariance:

ρ(X, Y) = E[(X − E[X]) ⁄ σX·(Y − E[Y]) ⁄ σY] = cov(X, Y) ⁄ (σX·σY)

It is defined only for cases when σX ≠ 0 and σY ≠ 0.

By the Cauchy–Schwarz inequality −1 ≤ ρ(X, Y) ≤ +1, and ρ(X, X) = 1.

For independent r.v. ρ(X, Y) = 0.

If |ρ(X, Y)| = 1 then X and Y are linearly dependent: Y = a·X + b with a > 0 when ρ = 1 and a < 0 when ρ = −1.

Properties:

ρ(a·X + b, Y) = sign(a)·ρ(X, Y)

Conditional expectation

E[X|Y] = ∫X x·fX|Y(x|Y) dx

Law of total expectation

E[X] = E[E[X|Y]]

Proof:

E[E[X|Y]] = ∫Y fY(y)·∫X x·fX|Y(x|y) dx·dy
 = ∫Y∫X x·fY(y)·fX|Y(x|y) dx·dy = ∫Y∫X x·fX, Y(x, y) dx·dy
 = ∫X x·∫Y fX, Y(x, y) dy·dx = ∫X x·fX(x) dx = E[X]
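
Note

The same law holds for discrete r.v. with sums in place of integrals; a quick Maxima check on the hypothetical joint PMF used above:

p[0,0]: 1/8$ p[0,1]: 1/4$
p[1,0]: 1/2$ p[1,1]: 1/8$
pY0: p[0,0] + p[1,0]$  pY1: p[0,1] + p[1,1]$   /* marginals of Y: 5/8, 3/8 */
m0: p[1,0]/pY0$  m1: p[1,1]/pY1$               /* E[X|Y=y]: 4/5, 1/3 */
m0*pY0 + m1*pY1;                               /* E[E[X|Y]]: expected 5/8 */
sum(sum(x*p[x,y], y, 0, 1), x, 0, 1);          /* E[X]: also 5/8 */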

Iterated expectations with nested conditioning sets

E[X|A] = E[E[X|B]|A]

Conditional variance

var(X|Y = y) = E[(X − E[X|Y = y])²|Y = y]

Law of total variance

var(X) = E[var(X|Y)] + var(E[X|Y]) = EY[var(X|Y)] + varY(E[X|Y])

Proof:

var(X) = E[X²] − (E[X])² = E[E[X²|Y]] − (E[E[X|Y]])²
 = E[var(X|Y) + (E[X|Y])²] − (E[E[X|Y]])²
 = E[var(X|Y)] + E[(E[X|Y])²] − (E[E[X|Y]])² = E[var(X|Y)] + var(E[X|Y])
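
Note

A numeric sketch of the law for a hypothetical mixture: Y ∈ {0, 1} equally likely, X|Y=0 ~ unif(0, 1), X|Y=1 ~ unif(0, 2), so the PDF of X is 3/4 on [0, 1] and 1/4 on [1, 2]:

m0: 1/2$ v0: 1/12$   /* E and var of unif(0, 1) */
m1: 1$   v1: 1/3$    /* E and var of unif(0, 2) */
(v0 + v1)/2 + (m0^2 + m1^2)/2 - ((m0 + m1)/2)^2;              /* E[var(X|Y)] + var(E[X|Y]) = 13/48 */
EX: integrate(x*3/4, x, 0, 1) + integrate(x/4, x, 1, 2);      /* E[X] = 3/4 */
EX2: integrate(x^2*3/4, x, 0, 1) + integrate(x^2/4, x, 1, 2); /* E[X^2] = 5/6 */
EX2 - EX^2;                                                   /* var(X) directly: expected 13/48 */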

Law of total covariance

cov(X, Y) = E[cov(X, Y|Z)] + cov(E[X|Z], E[Y|Z])

Sum of normally distributed random variables

For independent X ~ norm(μX, σX²) and Y ~ norm(μY, σY²) the random variable X + Y also has a normal distribution with parameters:

norm(μX + μY, σX² + σY²)

https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables

Sum of random number of i.i.r.v.

Let Y = X1 + ... + XN be a sum of a random number N of r.v., where the Xi are i.i.r.v. independent of N. Thus:

E[Y] = E[N]·E[X]
var[Y] = E[N]·var(X) + var(N)·(E[X])²

Proofs:

E[Y] = E[Σi = 1..N Xi] = E[E[Σi = 1..n Xi|N = n]] = E[Σi = 1..n E[Xi|N = n]]
 = E[Σi = 1..n E[X]] = E[N·E[X]] = E[N]·E[X]
var[Y] = E[var(Y|N)] + var(E[Y|N]) = E[var(Σi = 1..n Xi|N = n)] + var(E[Σi = 1..n Xi|N = n])
 = E[N·var(X)] + var(N·E[X]) = E[N]·var(X) + var(N)·(E[X])²
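
Note

An exact check of both formulas for a hypothetical case: N is 1 or 2 with probability 1/2 each (E[N] = 3/2, var(N) = 1/4) and Xi ~ unif(0, 1):

EN: 3/2$ vN: 1/4$    /* E[N], var(N) */
EX: 1/2$ vX: 1/12$   /* E[X], var(X) */
EN*EX;               /* formula: E[Y] = 3/4 */
EN*vX + vN*EX^2;     /* formula: var[Y] = 3/16 */
/* Direct check: E[Y^2] = (E[X^2] + E[(X1 + X2)^2])/2 for independent X1, X2 */
EX2: 1/3$
EY2: (EX2 + (2*EX2 + 2*EX^2))/2;   /* 3/4 */
EY2 - (EN*EX)^2;                   /* var[Y] directly: expected 3/16 */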