11. Statistics Preliminaries#

11.1. The Normal Distribution#

Fig. 11.1 The figure shows the density function of a normally distributed random variable with mean $\mu$ and standard deviation $\sigma.$

We say that a real-valued random variable (RV) $X$ is normally distributed with mean $\mu$ and standard deviation $\sigma$ if its probability density function (PDF) is:

\[\begin{equation*} f(x) = \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}} \end{equation*}\]

and we usually write $X \sim \Normal(\mu, \sigma^{2}).$ The parameters $\mu$ and $\sigma$ are related to the first and second moments of $X.$

Property 11.1 (Moments of the Normal Distribution)

The parameter $\mu$ is the mean or expectation of $X$ while $\sigma$ denotes its standard deviation. The variance of $X$ is given by $\sigma^{2}.$

Proof

Let $X = \mu + \sigma Z$ where $Z \sim \Normal(0, 1)$. Start by defining the auxiliary function $f(z) = e^{-\frac{1}{2} z^{2}},$ which implies that $f^{\prime}(z) = -z e^{-\frac{1}{2} z^{2}}$ and $f^{\prime \prime}(z) = z^{2} e^{-\frac{1}{2} z^{2}} - e^{-\frac{1}{2} z^{2}}.$ We can then write:

\[\begin{align*} z e^{-\frac{1}{2} z^{2}} & = -f^{\prime}(z) \\ z^{2} e^{-\frac{1}{2} z^{2}} & = f^{\prime \prime}(z) + f(z) \end{align*}\]

Also, note that:

\[\begin{equation*} \int_{-\infty}^{\infty} f(z) \, dz = \sqrt{2 \pi}. \end{equation*}\]

Then,

\[\begin{align*} \ev(Z) & = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi}} z e^{-\frac{1}{2} z^{2}} \, dz \\ & = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} \left( -f^{\prime}(z) \right) \, dz \\ & = \frac{1}{\sqrt{2 \pi}} \left( \left. -f(z) \vphantom{-e^{-\frac{1}{2} z^{2}}} \right|_{-\infty}^{\infty} \right) \\ & = 0, \\ \ev(Z^{2}) & = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi}} z^{2} e^{-\frac{1}{2} z^{2}} \, dz \\ & = \frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} \left( f^{\prime \prime}(z) + f(z) \right) \, dz \\ & = \frac{1}{\sqrt{2 \pi}} \left( \left. f^{\prime}(z) \vphantom{-z e^{-\frac{1}{2} z^{2}}} \right|_{-\infty}^{\infty} + \int_{-\infty}^{\infty} f(z) \, dz \right) \\ & = \frac{1}{\sqrt{2 \pi}} (0 + \sqrt{2 \pi}) \\ & = 1, \\ \var(Z) & = \ev(Z^{2}) - \ev(Z)^{2} \\ & = 1. \end{align*}\]

We can now compute $\ev(X) = \mu + \sigma \ev(Z) = \mu$ and $\var(X) = \sigma^{2} \var(Z) = \sigma^{2}$.
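As a numerical sanity check of Property 11.1, the sketch below integrates the density with a simple midpoint rule. The values $\mu = 10$ and $\sigma = 25$, the helper names `normal_pdf` and `integrate`, and the truncation of the range at ten standard deviations are assumptions made only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal RV with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def integrate(g, a, b, n=20_000):
    """Midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 10.0, 25.0                      # illustrative values
a, b = mu - 10 * sigma, mu + 10 * sigma     # mass beyond 10 sigma is negligible

total = integrate(lambda x: normal_pdf(x, mu, sigma), a, b)              # total probability
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), a, b)           # first moment
var = integrate(lambda x: (x - mean) ** 2 * normal_pdf(x, mu, sigma), a, b)  # variance

print(total, mean, var)
```

The three printed numbers should be close to 1, $\mu = 10$, and $\sigma^{2} = 625$.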

As with any real-valued random variable $X,$ in order to compute the probability that $X \leq x$ we need to integrate the density function from $-\infty$ to $x \colon$

\[\begin{equation*} \prob(X \leq x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(u - \mu)^{2}}{2 \sigma^{2}}} \, du. \end{equation*}\]

The function $F(x) = \prob(X \leq x)$ is called the cumulative distribution function of $X$. The Leibniz integral rule implies that $F^{\prime}(x) = f(x).$
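The relation $F^{\prime}(x) = f(x)$ can be checked numerically with a central finite difference. This is only a sketch: it uses `NormalDist` from Python's standard library rather than `scipy`, and the parameters, the evaluation point `x`, and the step `h` are arbitrary choices.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=25)   # illustrative parameters

x, h = 3.0, 1e-4
slope = (X.cdf(x + h) - X.cdf(x - h)) / (2 * h)   # central difference for F'(x)

print(slope, X.pdf(x))   # the two numbers should agree to many digits
```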

11.1.1. The Standard Normal Distribution#

Fig. 11.2 The blue shaded area represents $\cdf(z).$

An important case of normally distributed random variables is when $\mu = 0$ and $\sigma = 1$. In this case we say that $Z \sim \Normal(0, 1)$ has the standard normal distribution and its cumulative distribution function is usually denoted by the capital Greek letter $\Phi$ (phi), and is defined by the integral:

\[\begin{equation*} \cdf(z) = \prob(Z \leq z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2 \pi}} e^{-\frac{x^{2}}{2}} \, dx. \end{equation*}\]

Since this integral has no closed-form solution, the probability must be obtained from a table or computed numerically. For example, in Python we can compute $\cdf(-0.4)$ by typing the following:

from scipy.stats import norm
norm.cdf(-0.4)

0.3445782583896758

If you prefer to use Excel, you need to type in a cell =norm.s.dist(-0.4,TRUE), which yields the same answer.

11.1.2. Left-Tail Probability#

Knowing how to compute or approximate $\cdf(z)$ allows us to compute $\prob(X \leq x)$ when $X \sim \Normal(\mu, \sigma^{2})$ since $Z = \frac{X - \mu}{\sigma} \sim \Normal(0, 1) \colon$

\[\begin{align*} \prob(X \leq x) & = \prob\left( \frac{X - \mu}{\sigma} \leq \frac{x - \mu}{\sigma} \right) \\ & = \prob\left( Z \leq \frac{x - \mu}{\sigma} \right) \\ & = \cdf\left(\frac{x - \mu}{\sigma}\right) \end{align*}\]

where $Z = \dfrac{X - \mu}{\sigma} \sim \Normal(0, 1)$ is called a Z-score.

Example 11.1

Suppose that $X \sim \Normal(\mu, \sigma^{2})$ with $\mu = 10$ and $\sigma = 25.$ What is the probability that $X \leq 0$?

\[\begin{align*} \prob(X \leq 0) & = \prob\left( Z \leq \tfrac{0 - 10}{25} \right) \\ & = \cdf(-0.40) \\ & = 0.3446. \end{align*}\]
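The computation in Example 11.1 can be reproduced in two equivalent ways: by standardizing first, or by passing $\mu$ and $\sigma$ directly to the distribution object. The sketch assumes the standard-library class `statistics.NormalDist`; `scipy.stats.norm` would serve equally well.

```python
from statistics import NormalDist

mu, sigma = 10, 25
z = (0 - mu) / sigma                       # Z-score of x = 0
p_standardized = NormalDist().cdf(z)       # Phi(-0.40)
p_direct = NormalDist(mu, sigma).cdf(0)    # same probability without standardizing

print(round(p_standardized, 4), round(p_direct, 4))
```

Both values round to 0.3446, the figure obtained in the example.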

11.1.3. Right-Tail Probability#

Fig. 11.3 The right-tail probability is the probability of the whole distribution, which is one, minus the left-tail probability.

For a random variable $X,$ the right-tail probability is defined as $\prob(X > x).$ Since $\prob(X \leq x) + \prob(X > x) = 1,$ we have that:

\[\begin{equation*} \prob(X > x) = 1 - \prob(X \leq x). \end{equation*}\]

Example 11.2

Suppose that $X \sim \Normal(\mu, \sigma^{2})$ with $\mu = 10$ and $\sigma = 25$. What is the probability that $X > 12$?

\[\begin{align*} \prob(X \leq 12) & = \prob\left( Z \leq \tfrac{12 - 10}{25} \right) \\ & = \cdf(0.08) \\ & = 0.5319. \end{align*}\]

Therefore, $\prob(X > 12) = 1 - 0.5319 = 0.4681.$
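A short sketch of Example 11.2 in Python, again assuming `statistics.NormalDist` as a stand-in for a table lookup:

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=25)

left_tail = X.cdf(12)          # P(X <= 12) = Phi(0.08)
right_tail = 1 - left_tail     # P(X > 12)

print(round(left_tail, 4), round(right_tail, 4))
```

The rounded values are 0.5319 and 0.4681, as in the example.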

11.1.4. Interval Probability#

Fig. 11.4 If you subtract the area to the left of $x_{1}$ from the area to the left of $x_{2}$, you obtain the probability that $x_{1} < X \leq x_{2}.$

The probability that a random variable $X$ falls within an interval $(x_{1}, x_{2}]$ is given by $\prob(x_{1} < X \leq x_{2}) = \prob(X \leq x_{2}) - \prob(X \leq x_{1}).$

Example 11.3

Suppose that $X \sim \Normal(\mu, \sigma^{2})$ with $\mu = 10$ and $\sigma = 25$. What is the probability that $2 < X \leq 14$?

\[\begin{align*} \prob(X \leq 14) & = \prob\left( Z \leq \tfrac{14 - 10}{25} \right) \\ & = \cdf(0.16) \\ & = 0.5636, \\ \prob(X \leq 2) & = \prob\left( Z \leq \tfrac{2 - 10}{25} \right) \\ & = \cdf(-0.32) \\ & = 0.3745. \end{align*}\]

Therefore, $\prob(2 < X \leq 14) = 0.5636 - 0.3745 = 0.1891$.
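The interval probability of Example 11.3 is a difference of two CDF values, which the following sketch wraps in a small helper. The function name `interval_prob` is hypothetical, and `statistics.NormalDist` is assumed in place of `scipy`.

```python
from statistics import NormalDist

def interval_prob(dist, x1, x2):
    """P(x1 < X <= x2) as a difference of two CDF values."""
    return dist.cdf(x2) - dist.cdf(x1)

X = NormalDist(mu=10, sigma=25)
p = interval_prob(X, 2, 14)

print(round(p, 4))
```

The result rounds to 0.1891, matching the example.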

11.1.5. Percentiles#

Fig. 11.5 The right-tail percentile is the value $z_{\alpha}$ that gives an area to the right equal to $\alpha$.

For a standard normal variable $Z$, a right-tail percentile is the value $z_{\alpha}$ above which we obtain a certain probability $\alpha.$ Mathematically, this means finding $z_{\alpha}$ such that:

\[ \prob(Z > z_{\alpha}) = \alpha \Leftrightarrow \prob(Z \leq z_{\alpha}) = 1 - \alpha. \]

This implies that $\cdf(z_{\alpha}) = 1 - \alpha$, or $z_{\alpha} = \cdf^{-1}(1 - \alpha)$, where $\cdf^{-1}(\cdot)$ denotes the inverse function of $\cdf(\cdot)$. Again, there is no closed-form expression for this function and we need a computer to obtain the values. For example, say that $\alpha = 0.025$. In Python we could compute $z_{\alpha} = \cdf^{-1}(0.975)$ by using the function ppf included in scipy.stats.norm as follows:

from scipy.stats import norm
norm.ppf(0.975)

1.959963984540054

In Excel the function =norm.s.inv(0.975) provides the same result.

The following table shows common values for $z_{\alpha}$:

$\boldsymbol{\alpha}$	$\boldsymbol{z_{\alpha}}$
0.050	1.64
0.025	1.96
0.010	2.33
0.005	2.58
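The table can be regenerated with any inverse-CDF routine. The sketch below assumes the standard-library method `NormalDist.inv_cdf`, which plays the same role as `norm.ppf`; the dictionary name `table` is a hypothetical choice.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal

# Right-tail percentile z_alpha = Phi^{-1}(1 - alpha) for each alpha in the table.
table = {alpha: Z.inv_cdf(1 - alpha) for alpha in (0.050, 0.025, 0.010, 0.005)}

for alpha, z_alpha in table.items():
    print(f"{alpha:.3f}  {z_alpha:.2f}")
```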

Fig. 11.6 The areas on each side are both equal to $\alpha/2.$

A $(1 - \alpha)$ two-sided confidence interval (CI) is bounded by left and right percentiles chosen so that the probability in each tail is $\alpha/2$. For a standard normal variable $Z$, the symmetry of its PDF implies:

\[\begin{equation*} \prob(Z \leq -z_{\alpha/2}) = \prob(Z > z_{\alpha/2}) = \alpha/2 \end{equation*}\]

Example 11.4

Since $z_{2.5\%} = 1.96$, the 95% confidence interval of $Z$ is $[-1.96, 1.96]$. This means that if we randomly sample this variable 100,000 times, approximately 95,000 observations will fall inside this interval.
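The sampling statement in Example 11.4 can be illustrated with a small simulation. This is a sketch: the seed is arbitrary, and the exact count changes with the seed and the random number generator, so only the approximate proportion is meaningful.

```python
import random

rng = random.Random(0)     # fixed seed so the run is repeatable
n = 100_000

# Count the standard normal draws that land inside [-1.96, 1.96].
inside = sum(1 for _ in range(n) if -1.96 <= rng.gauss(0, 1) <= 1.96)

print(inside, inside / n)  # the proportion should be close to 0.95
```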

If $X \sim \Normal(\mu, \sigma^{2})$, its confidence interval is determined by $\xi$ and $\zeta$ such that:

\[\begin{align*} & \prob(X \leq \xi) = \alpha / 2 \\ & \hspace{0.3in} \Rightarrow \prob(Z \leq \tfrac{\xi - \mu}{\sigma}) = \alpha/2, \\ & \prob(X > \zeta) = \alpha / 2 \\ & \hspace{0.3in} \Rightarrow \prob(Z > \tfrac{\zeta - \mu}{\sigma}) = \alpha/2, \end{align*}\]

which implies that $-z_{\alpha/2} = \tfrac{\xi - \mu}{\sigma}$ and $z_{\alpha/2} = \tfrac{\zeta - \mu}{\sigma}$. The $(1 - \alpha)$ confidence interval for $X$ is then $[\mu - z_{\alpha/2}\sigma, \mu + z_{\alpha/2}\sigma]$.

Example 11.5

Suppose that $X \sim \Normal(\mu, \sigma^{2})$ with $\mu = 10$ and $\sigma = 25$. Since $z_{2.5\%} = 1.96$, the 95% confidence interval of $X$ is:

\[\begin{equation*} [10-1.96(25), 10+1.96(25)] = [-39, 59]. \end{equation*}\]
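A minimal sketch of the interval formula, applied to the numbers of Example 11.5. The helper `normal_ci` is a hypothetical name, and `statistics.NormalDist` is assumed for the percentile.

```python
from statistics import NormalDist

def normal_ci(mu, sigma, level):
    """Two-sided interval [mu - z*sigma, mu + z*sigma] with z = z_{alpha/2}."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mu - z * sigma, mu + z * sigma

lower, upper = normal_ci(10, 25, 0.95)

print(round(lower, 2), round(upper, 2))
```

The bounds agree with $[-39, 59]$ up to the rounding of $z_{2.5\%}$ to 1.96.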

11.2. The Lognormal Distribution#

If $X \sim \Normal(\mu, \sigma^{2})$, then $Y = e^{X}$ is said to be lognormally distributed with the same parameters. The PDF of a lognormally distributed random variable $Y$ can be obtained from the PDF of $X$.

Fig. 11.7 The figure shows the difference between a normal and a lognormal PDF with the same parameters.

Property 11.2 (Lognormal Density)

If $Y$ is lognormally distributed with parameters $\mu$ and $\sigma^{2}$, the PDF of $Y$ is given by:

\[\begin{equation*} f(y) = \frac{1}{y \sqrt{2 \pi \sigma^{2}}} e^{-\frac{(\ln(y) - \mu)^{2}}{2 \sigma^{2}}}. \end{equation*}\]

Proof

Let $Y = e^{X}$ where $X = \mu + \sigma Z$ and $Z \sim \Normal(0, 1)$. Then,

\[\begin{align*} \prob(Y \leq y) & = \prob(X \leq \ln(y)) \\ & = \int_{-\infty}^{\ln(y)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}} \, dx. \end{align*}\]

Let’s define $z = e^{x}$. This implies that $x = \ln(z)$, which in turn implies that $dx = (1 / z) dz$. The limits of integration $x \to -\infty$ and $x = \ln(y)$ become $z \to 0$ and $z = y$. Therefore,

\[\begin{equation*} \prob(Y \leq y) = \int_{0}^{y} \frac{1}{z \sqrt{2 \pi \sigma^{2}}} e^{-\frac{(\ln(z) - \mu)^{2}}{2 \sigma^{2}}} \, dz. \end{equation*}\]

Thus, the integrand of the previous expression is the probability density function of $Y$.
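One way to check the lognormal density is to integrate it numerically and compare the result with $\cdf\left(\tfrac{\ln(y) - \mu}{\sigma}\right)$. The sketch below does this with a midpoint rule; the parameter values, the upper limit `y_max`, and the helper name `lognormal_pdf` are assumptions for illustration.

```python
import math
from statistics import NormalDist

def lognormal_pdf(y, mu, sigma):
    """Lognormal density with parameters mu and sigma, defined for y > 0."""
    scale = y * math.sqrt(2 * math.pi * sigma ** 2)
    return math.exp(-(math.log(y) - mu) ** 2 / (2 * sigma ** 2)) / scale

mu, sigma, y_max = 4.0, 1.5, 100.0   # illustrative values

# Midpoint rule on (0, y_max]; every midpoint is strictly positive, so log(y) is defined.
n = 100_000
h = y_max / n
cdf_numeric = h * sum(lognormal_pdf((i + 0.5) * h, mu, sigma) for i in range(n))

# Same probability through the normal CDF of ln(Y).
cdf_exact = NormalDist().cdf((math.log(y_max) - mu) / sigma)

print(cdf_numeric, cdf_exact)
```

The two printed numbers should agree closely, which confirms that the density integrates to $\prob(Y \leq y)$ when the lower limit is zero.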

Unlike the normal density, the lognormal density function is not symmetric around its mean. Normally distributed variables can take values in $(-\infty, \infty)$, whereas lognormally distributed variables are always positive.

11.2.1. Computing Probabilities#

We can use the fact that the logarithm of a lognormal random variable is normally distributed to compute cumulative probabilities.

Example 11.6

Let $Y = e^{X}$ with $X = 4 + 1.5 Z$ where $Z \sim \Normal(0, 1)$. What is the probability that $Y \leq 100$?

\[\begin{align*} \prob(Y \leq 100) & = \prob(e^{X} \leq 100) \\ & = \prob(X \leq \ln(100)) \\ & = \prob\left(Z \leq \tfrac{\ln(100) - 4}{1.5}\right) \\ & = \cdf(0.4034) \\ & = 0.6567 \end{align*}\]
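The steps of Example 11.6 translate directly into code: take the logarithm of the threshold, standardize, and evaluate the standard normal CDF. The sketch assumes `statistics.NormalDist` for $\cdf$.

```python
import math
from statistics import NormalDist

mu, sigma = 4.0, 1.5
z = (math.log(100) - mu) / sigma   # standardize ln(100)
p = NormalDist().cdf(z)            # P(Y <= 100)

print(round(z, 4), round(p, 4))
```

The rounded output is 0.4034 and 0.6567, as in the example.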

11.2.2. Confidence Interval#

Let $Y = e^{\mu + \sigma Z}$ where $Z \sim \Normal(0, 1)$. We have that:

\[\begin{align*} & -z_{\alpha/2} < Z \leq z_{\alpha/2} \\ & \hspace{0.4in} \Rightarrow \mu - \sigma z_{\alpha/2} < \mu + \sigma Z \leq \mu + \sigma z_{\alpha/2} \\ & \hspace{0.4in} \Rightarrow e^{\mu - \sigma z_{\alpha/2}} < e^{\mu + \sigma Z} \leq e^{\mu + \sigma z_{\alpha/2}} \end{align*}\]

The $(1 - \alpha)$ confidence interval for $Y$ is $[e^{\mu - \sigma z_{\alpha/2}}, e^{\mu + \sigma z_{\alpha/2}}]$.

Example 11.7

Let $Y = e^{4 + 1.5 Z}$ where $Z \sim \Normal(0, 1)$. The 95% confidence interval for $Y$ is:

\[ [e^{4 - 1.96(1.5)}, e^{4 + 1.96(1.5)}] = [2.89, 1032.71]. \]
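The interval of Example 11.7 can be reproduced by exponentiating the bounds of the normal interval. The sketch below assumes `statistics.NormalDist` for the percentile and uses the unrounded value of $z_{\alpha/2}$.

```python
import math
from statistics import NormalDist

mu, sigma, level = 4.0, 1.5, 0.95
z = NormalDist().inv_cdf(1 - (1 - level) / 2)   # z_{alpha/2}, about 1.96

lower = math.exp(mu - sigma * z)
upper = math.exp(mu + sigma * z)

print(round(lower, 2), round(upper, 2))
```

Because the exponential magnifies small differences, the upper bound is sensitive to rounding: with $z$ rounded to 1.96 it would be about 1032.77, whereas the unrounded percentile gives the 1032.71 reported in the example.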

11.2.3. Moments#

Property 11.3 (Moments of a Lognormal Distribution)

Let $Y = e^{\mu + \sigma Z}$ where $Z \sim \Normal(0, 1)$. We have that:

\[\begin{align*} \ev(Y) & = e^{\mu + 0.5 \sigma^{2}} \\ \var(Y) & = e^{2\mu + \sigma^{2}} (e^{\sigma^{2}} - 1) \\ \stdev(Y) & = \ev(Y) \sqrt{e^{\sigma^{2}} - 1} \end{align*}\]

Proof

\[\begin{align*} \ev(Y) & = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}} e^{x} \, dx \\ & = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}} + x} \, dx \\ & = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - (\mu + \sigma^{2}))^{2}}{2\sigma^{2}} + (\mu + 0.5 \sigma^{2})} \, dx \\ & = e^{\mu + 0.5 \sigma^{2}} \underbrace{\int_{-\infty}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - (\mu + \sigma^{2}))^{2}}{2\sigma^{2}}} \, dx}_{= 1} \\ & = e^{\mu + 0.5 \sigma^{2}} \end{align*}\]

Using the fact that $\alpha X \sim \Normal(\alpha \mu, (\alpha \sigma)^{2})$, it is also possible to compute the expectation of powers of lognormally distributed variables:

\[\begin{equation*} \ev(Y^{\alpha}) = \ev(e^{\alpha X}) = e^{\alpha \mu + 0.5 (\alpha \sigma)^{2}}. \end{equation*}\]

This is useful to compute the variance and standard deviation of $Y$:

\[\begin{align*} \var(Y) & = \ev(Y^{2}) - \left(\ev(Y)\right)^{2} \\ & = e^{2\mu + 2 \sigma^{2}} - e^{2\mu + \sigma^{2}} \\ & = e^{2\mu + \sigma^{2}} (e^{\sigma^{2}} - 1) \\ \stdev(Y) & = \sqrt{\var(Y)} \\ & = \ev(Y) \sqrt{e^{\sigma^{2}} - 1} \end{align*}\]

Example 11.8

Let $Y = e^{4 + 1.5 Z}$ where $Z \sim \Normal(0, 1)$. The expectation and standard deviation of $Y$ are:

\[\begin{align*} \ev(Y) & = e^{4 + 0.5(1.5^{2})} = 168.17 \\ \stdev(Y) & = 168.17 \sqrt{e^{1.5^{2}} - 1} = 489.95 \end{align*}\]
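The numbers in Example 11.8 follow from the power formula $\ev(Y^{\alpha}) = e^{\alpha \mu + 0.5 (\alpha \sigma)^{2}}$. A minimal sketch, where the function name `lognormal_power_mean` is hypothetical:

```python
import math

def lognormal_power_mean(alpha, mu, sigma):
    """E(Y^alpha) = exp(alpha*mu + 0.5*(alpha*sigma)^2) for Y = exp(mu + sigma*Z)."""
    return math.exp(alpha * mu + 0.5 * (alpha * sigma) ** 2)

mu, sigma = 4.0, 1.5
mean = lognormal_power_mean(1, mu, sigma)            # E(Y)
second_moment = lognormal_power_mean(2, mu, sigma)   # E(Y^2)
stdev = math.sqrt(second_moment - mean ** 2)         # sqrt(Var(Y))

print(round(mean, 2), round(stdev, 2))
```

The output rounds to 168.17 and 489.95, and `stdev` coincides with $\ev(Y) \sqrt{e^{\sigma^{2}} - 1}$.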

11.2.4. Partial Expectations#

When pricing a call option, the payoff is positive if the option is in-the-money and zero otherwise. We usually use an indicator function to quantify this behavior:

\[\begin{equation*} \1{Y > K} = \begin{cases} 0 & \text{if $Y \leq K$} \\ 1 & \text{if $Y > K$} \end{cases} \end{equation*}\]

Property 11.4 (Partial Expectations)

Let $Y = e^{X}$ where $X \sim \Normal(\mu, \sigma^{2})$. Then we have that:

\[\begin{align*} \ev\left(Y \1{Y > K}\right) & = e^{\mu + \frac{1}{2}\sigma^{2}} \cdf\left(\frac{\mu + \sigma^{2} - \ln(K)}{\sigma}\right) \\ \ev\left(K \1{Y > K}\right) & = K \cdf\left(\frac{\mu - \ln(K)}{\sigma}\right) \end{align*}\]

Proof

The first expectation can be computed with the change of variable $y = -x$, followed by completing the square in the exponent:

\[\begin{align*} \ev\left(Y \1{Y > K}\right) & = \int_{\ln(K)}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}} e^{x} \, dx \\ & = \int_{-\infty}^{-\ln(K)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(y + \mu)^{2}}{2 \sigma^{2}}} e^{-y} \, dy \\ & = \int_{-\infty}^{-\ln(K)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(y + \mu)^{2}}{2 \sigma^{2}} - y} \, dy \\ & = \int_{-\infty}^{-\ln(K)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(y + (\mu + \sigma^{2}))^{2}}{2\sigma^{2}} + (\mu + 0.5 \sigma^{2})} \, dy \\ & = e^{\mu + 0.5 \sigma^{2}} \int_{-\infty}^{-\ln(K)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(y + (\mu + \sigma^{2}))^{2}}{2\sigma^{2}}} \, dy \\ & = e^{\mu + 0.5 \sigma^{2}} \cdf\left(\tfrac{\mu + \sigma^{2} - \ln(K)}{\sigma}\right) \end{align*}\]

The second expectation follows from the same change of variable $y = -x$:

\[\begin{align*} \ev\left(K \1{Y > K}\right) & = K \int_{\ln(K)}^{\infty} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(x - \mu)^{2}}{2 \sigma^{2}}} \, dx \\ & = K \int_{-\infty}^{-\ln(K)} \frac{1}{\sqrt{2 \pi \sigma^{2}}} e^{-\frac{(y + \mu)^{2}}{2 \sigma^{2}}} \, dy \\ & = K \cdf\left(\tfrac{\mu - \ln(K)}{\sigma}\right) \end{align*}\]
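The closed forms of Property 11.4 can be cross-checked against a direct numerical integration. In the sketch below the parameter values, the helper name `partial_expectations`, and the truncation of the upper limit are illustrative assumptions. Since $\max(Y - K, 0) = (Y - K) \1{Y > K}$, the difference of the two partial expectations is the expected (undiscounted) call payoff under the assumed distribution.

```python
import math
from statistics import NormalDist

def partial_expectations(mu, sigma, K):
    """Closed forms of E(Y 1{Y>K}) and E(K 1{Y>K}) for Y = exp(X), X ~ N(mu, sigma^2)."""
    Phi = NormalDist().cdf
    e_y = math.exp(mu + 0.5 * sigma ** 2) * Phi((mu + sigma ** 2 - math.log(K)) / sigma)
    e_k = K * Phi((mu - math.log(K)) / sigma)
    return e_y, e_k

mu, sigma, K = 4.0, 1.5, 100.0   # illustrative values
e_y, e_k = partial_expectations(mu, sigma, K)

# Midpoint rule for the integral of e^x f(x) over [ln K, b]; mass beyond b is negligible.
X = NormalDist(mu, sigma)
a, b, n = math.log(K), mu + sigma ** 2 + 10 * sigma, 100_000
h = (b - a) / n
e_y_numeric = h * sum(math.exp(x) * X.pdf(x) for x in (a + (i + 0.5) * h for i in range(n)))

print(e_y, e_y_numeric)   # closed form vs. numerical integral
print(e_y - e_k)          # E[max(Y - K, 0)]
```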

11.3. Practice Problems#

Exercise 11.1

Suppose that $X$ is a normally distributed random variable with mean $\mu=12$ and standard deviation $\sigma=20$.

What is the probability that $X \leq 0$?
What is the probability that $X \leq -4$?
What is the probability that $X > 8$?
What is the probability that $4 < X \leq 10$?

Solution to Exercise 11.1

$\prob(X \leq 0) = \cdf(\frac{0-12}{20})$
$\hphantom{\prob(X \leq 0)} = \cdf(-0.60)$
$\hphantom{\prob(X \leq 0)} = 0.2743$.
$\prob(X \leq -4) = \cdf(\frac{-4-12}{20})$
$\hphantom{\prob(X \leq -4)} = \cdf(-0.80)$
$\hphantom{\prob(X \leq -4)} = 0.2119$.
$\prob(X > 8) = 1 - \prob(X \leq 8)$
$\hphantom{\prob(X > 8)} = 1 - \cdf(\frac{8-12}{20})$
$\hphantom{\prob(X > 8)} = 1 - \cdf(-0.20)$
$\hphantom{\prob(X > 8)} = 0.5793$.
$\prob(4 < X \leq 10) = \prob(X \leq 10) - \prob(X \leq 4)$
$\hphantom{\prob(4 < X \leq 10)} = \cdf(\frac{10-12}{20}) - \cdf(\frac{4-12}{20})$
$\hphantom{\prob(4 < X \leq 10)} = \cdf(-0.10) - \cdf(-0.40)$
$\hphantom{\prob(4 < X \leq 10)} = 0.1156$.

Exercise 11.2

Suppose that $X$ is a normally distributed random variable with mean $\mu=10$ and standard deviation $\sigma=20$. Compute the

90%,
95%, and
99%

confidence interval for $X$.

Solution to Exercise 11.2

The $(1-\alpha)$ confidence interval (CI) for $X$ is given by $[\mu - z_{\alpha/2} \sigma, \mu + z_{\alpha/2} \sigma]$ where $z_{\alpha/2} = \cdf^{-1}(1-\alpha/2)$. For example, if you want to compute the $z$-level corresponding to the 90% confidence interval, then $\alpha = 0.10$ and $\alpha/2 = 0.05$, so to compute $z_{0.05}$ you need to type in Excel =norm.s.inv(0.95).

$z_{0.05} = \cdf^{-1}(0.95) = 1.64$ so the 90% CI for $X$ is $[-22.90, 42.90]$.
$z_{0.025} = \cdf^{-1}(0.975) = 1.96$ so the 95% CI for $X$ is $[-29.20, 49.20]$.
$z_{0.005} = \cdf^{-1}(0.995) = 2.58$ so the 99% CI for $X$ is $[-41.52, 61.52]$.

Exercise 11.3

Suppose that $X=\ln(Y)$ is a normally distributed random variable with mean $\mu=3.9$ and standard deviation $\sigma=15$.

What is the probability that $Y \leq 6$?
What is the probability that $Y > 4$?
What is the probability that $3 < Y \leq 12$?
What is the probability that $Y \leq 0$?

Solution to Exercise 11.3

$\prob(Y \leq 6) = \prob(X \leq \ln(6))$
$\hphantom{\prob(Y \leq 6)} = \cdf(\frac{\ln(6)-3.9}{15})$
$\hphantom{\prob(Y \leq 6)} = \cdf(-0.1405)$
$\hphantom{\prob(Y \leq 6)} = 0.4441$
$\prob(Y > 4) = 1 - \prob(Y \leq 4)$
$\hphantom{\prob(Y > 4)} = 1 - \prob(X \leq \ln(4))$
$\hphantom{\prob(Y > 4)} = 1 - \cdf(\frac{\ln(4)-3.9}{15})$
$\hphantom{\prob(Y > 4)} = 1 - \cdf(-0.1676)$
$\hphantom{\prob(Y > 4)} = 0.5665$
$\prob(3 < Y \leq 12) = \prob(Y \leq 12) - \prob(Y \leq 3)$
$\hphantom{\prob(3 < Y \leq 12)} = \cdf(\frac{\ln(12)-3.9}{15}) - \cdf(\frac{\ln(3)-3.9}{15})$
$\hphantom{\prob(3 < Y \leq 12)} = \cdf(-0.0943) - \cdf(-0.1868)$
$\hphantom{\prob(3 < Y \leq 12)} = 0.4624 - 0.4259$
$\hphantom{\prob(3 < Y \leq 12)} = 0.0365$
$\prob(Y \leq 0) = \prob(X \leq -\infty) = 0$

Exercise 11.4

Suppose that $X=\ln(Y)$ is a normally distributed random variable with mean $\mu=2.7$ and standard deviation $\sigma=1$. Compute the

90%,
95%, and
99%

confidence interval for $X$ and report the corresponding values for $Y$.

Solution to Exercise 11.4

The $(1 - \alpha)$ confidence interval (CI) for $X$ is given by $[\mu - z_{\alpha/2} \sigma, \mu + z_{\alpha/2} \sigma]$. Remember that to compute $z_{\alpha/2}$ we use in Excel =norm.s.inv(1-alpha/2). The corresponding interval for $Y$ is then $[e^{\mu - z_{\alpha/2} \sigma}, e^{\mu + z_{\alpha/2} \sigma}]$.

$z_{0.05} = 1.64$ so the 90% CI for $X$ is $[1.06, 4.34]$, and the corresponding values for $Y$ are $[2.87, 77.08]$.
$z_{0.025} = 1.96$ so the 95% CI for $X$ is $[0.74, 4.66]$, and the corresponding values for $Y$ are $[2.10, 105.63]$.
$z_{0.005} = 2.58$ so the 99% CI for $X$ is $[0.12, 5.28]$, and the corresponding values for $Y$ are $[1.13, 195.55]$.

Exercise 11.5

Let $Y = e^{\mu + \sigma Z}$ where $\mu = 1$, $\sigma = 2$ and $Z \sim \Normal(0, 1)$. Compute:

$\ev(Y)$
$\stdev(Y) = \sqrt{\ev(Y^{2}) - \ev(Y)^{2}}$
$\ev(Y^{0.3})$
$\ev(Y^{-1})$

Solution to Exercise 11.5

In some of the questions we use the fact that if $X \sim \Normal(\mu, \sigma^{2})$, then $\alpha X \sim \Normal(\alpha\mu, \alpha^{2}\sigma^{2})$, which implies that $\ev(Y^{\alpha}) = \ev(e^{\alpha X}) = e^{\alpha\mu+\frac{1}{2}\alpha^{2}\sigma^{2}}$.

$\ev(Y) = e^{1+\frac{1}{2}2^{2}} = 20.09$
$\ev(Y^{2}) = e^{(2)(1)+\frac{1}{2}(2)^{2}2^{2}}$
$\hphantom{\ev(Y^{2})} = 22026.47$,
$\left(\ev(Y)\right)^{2} = (20.09)^{2}$
$\hphantom{\left(\ev(Y)\right)^{2}} = 403.43$,
$\stdev(Y) = \sqrt{22026.47-403.43}$
$\hphantom{\stdev(Y)} = 147.05$.
$\ev(Y^{0.3}) = e^{(0.3)(1)+\frac{1}{2}(0.3)^{2}2^{2}} = 1.62$
$\ev(Y^{-1}) = e^{(-1)(1)+\frac{1}{2}(-1)^{2}2^{2}} = 2.72$
