Summaries of distributions: Quantiles and expected values
2024-11-14
Definition 1 (Wasserman (2004, p. 25) Definition 2.16) Let \(X\) be a random variable with cdf \(F\). The inverse cdf or quantile function is defined by \[F^{-1}\left(q\right)=\inf\{x: F(x)>q\}\] for \(q\in [0,1]\).
You may see a slightly different definition in Evans and Rosenthal Definition 2.10.1. They have \[F^{-1}\left(q\right)=\min\{x: F(x)\geq q\}\] for \(q\in (0,1)\).
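The two definitions agree except at values of \(q\) that the cdf hits exactly. A minimal sketch in Python (a fair coin, with NumPy assumed) showing where they differ:

```python
import numpy as np

# fair coin: X = 0 or 1, each with probability 1/2
xs = np.array([0, 1])
F = np.array([0.5, 1.0])   # cdf evaluated at the support points

def quantile_wasserman(q):
    """inf{x : F(x) > q} (Definition 1)."""
    return xs[np.argmax(F > q)]

def quantile_evans_rosenthal(q):
    """min{x : F(x) >= q} (Evans and Rosenthal Definition 2.10.1)."""
    return xs[np.argmax(F >= q)]

# the two definitions agree except where F(x) equals q exactly
print(quantile_wasserman(0.3), quantile_evans_rosenthal(0.3))   # 0 0
print(quantile_wasserman(0.5), quantile_evans_rosenthal(0.5))   # 1 0
```

At \(q=0.5\), \(F(0)=0.5\) exactly, so \(\min\{x: F(x)\geq 0.5\}=0\) while \(\inf\{x: F(x)>0.5\}=1\).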
Quantile functions are used in everyday life:
A company wants to decide whether it should enter a market, but other firms may also enter. It has the following internal projections:
Let \(X\) be the number of other entrants in the market. This number is uncertain: \[\mathbb{P}\left(X=x\right)=\begin{cases}0.1 & \mathsf{if}\ x=1\\ 0.25 & \mathsf{if}\ x=2\\ 0.3 & \mathsf{if}\ x=3\\ 0.25 & \mathsf{if}\ x=4\\ 0.1 & \mathsf{if}\ x=5 \end{cases}\]
What would be the quantile function for the profit of the company? Derive it. Produce a simulation.
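One numerical route for the exercise: since the profit schedule is not given above, the sketch below assumes a hypothetical profit function \(\pi(x)=10-2x\) that decreases with the number of entrants; swap in the actual projections when deriving the answer.

```python
import numpy as np

rng = np.random.default_rng(12345)
entrants = np.array([1, 2, 3, 4, 5])
probs = np.array([0.10, 0.25, 0.30, 0.25, 0.10])

def profit(x):
    # hypothetical schedule: profit falls as more competitors enter
    return 10 - 2 * x

# simulate the number of entrants, then pass through the profit function
draws = rng.choice(entrants, size=100_000, p=probs)
profits = profit(draws)

# empirical quantile function of profit
for q in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(q, np.quantile(profits, q))
```

Because the profit function here is decreasing, low quantiles of profit correspond to high quantiles of the number of entrants.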
Definition 2 (Wasserman (2004, p. 47) Definition 3.1) The expected value of a discrete random variable \(X\) is defined to be \[\mathbb{E}\left(X\right)=\sum_x xf_X(x)\] assuming that the sum is well-defined.
There are also multiple acceptable notations: \(\mathbb{E}\left(X\right)\), \(\mathbb{E}X\), \(\mu\), \(\mu_X\).
The requirement of a well-defined expected value is not an issue if \(X\) takes on a finite number of values.
But it can be an issue if \(X\) takes on an infinite but countable number of values.
Illustrate using some exercises.
The expected value may not exist. Let \(k=\pm 1, \pm 2, \ldots\). Consider \[\mathbb{P}\left(X=k\right)=\frac{3}{\pi^2 k^2}.\] These probabilities sum to 1, but \(\sum_k |k|\,\mathbb{P}\left(X=k\right)=\dfrac{6}{\pi^2}\sum_{k=1}^{\infty}\dfrac{1}{k}\) diverges, so the sum defining \(\mathbb{E}\left(X\right)\) is not well-defined.
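A quick numerical sketch: the probabilities sum to 1, yet the partial sums of \(|k|\,\mathbb{P}(X=k)\) grow without bound (like \(\log N\)).

```python
import math

def p(k):
    # p(k) = 3 / (pi^2 k^2) for k = ±1, ±2, ...
    return 3 / (math.pi**2 * k**2)

# the probabilities sum to (approximately) 1
total = 2 * sum(p(k) for k in range(1, 200_000))
print(round(total, 4))

# but the partial sums of |k| p(k) grow like log N: E(X) is undefined
for N in (10, 1_000, 100_000):
    partial = 2 * sum(abs(k) * p(k) for k in range(1, N + 1))
    print(N, round(partial, 3))
```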
Refer to other toy examples from Examples 3.1.10 and 3.1.11 of Evans and Rosenthal.
Another less artificial example is from the St. Petersburg paradox. Refer to Examples 3.1.12 and 3.1.13 of Evans and Rosenthal.
You already saw this in the context of tossing a coin. Let \(X=1\) if the toss produces heads and \(X=0\) if the toss produces tails. Then \[\mathbb{E}\left(X\right)=\mathbb{P}\left(X=1\right).\]
Set \(\mathbb{P}\left(X=1\right)=0.4\). Toss such a coin independently many times. Look at the relative frequency.
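A minimal simulation of this coin (Python with NumPy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
tosses = (rng.random(1_000_000) < 0.4).astype(int)  # 1 = heads

# the relative frequency of heads settles near E(X) = P(X = 1) = 0.4
for n in (10, 100, 10_000, 1_000_000):
    print(n, tosses[:n].mean())
```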
Now, roll a fair die. Let \(X\) be the outcome of one roll. Here, \(\mathbb{E}\left(X\right)=\dfrac{1+2+\cdots+6}{6}=3.5\).
Instead of looking at a relative frequency, consider the sample mean of the outcomes from independently rolling a fair die as you have more and more rolls.
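The analogous simulation for the die, tracking the running sample mean:

```python
import numpy as np

rng = np.random.default_rng(1)
rolls = rng.integers(1, 7, size=1_000_000)   # fair die: 1..6

# the running sample mean settles near E(X) = 3.5
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)
for n in (10, 100, 10_000, 1_000_000):
    print(n, running_mean[n - 1])
```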
Notice the long-run behavior of the sample mean in the next slide.
Pay attention to the weighting scheme.
Notice that in both cases, the expected value does NOT have to be one of the possible outcomes of \(X\).
Refer to the distribution of \(X\) in Wasserman Chapter 2 Exercise 2.
Wasserman (2004, p. 48) Theorem 3.6 saves us time in computing the expected value of a function of a random variable \(X\).
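Assuming the theorem referred to is the rule of the lazy statistician, \(\mathbb{E}\left(g(X)\right)=\sum_x g(x)f_X(x)\), we can compute, say, \(\mathbb{E}\left(X^2\right)\) for a fair die directly from the pmf of \(X\), without first deriving the pmf of \(X^2\):

```python
from fractions import Fraction

# fair die pmf: f(x) = 1/6 for x = 1..6
f = {x: Fraction(1, 6) for x in range(1, 7)}

# E(g(X)) = sum_x g(x) f(x), here with g(x) = x^2
E_X2 = sum(x**2 * p for x, p in f.items())
print(E_X2)  # 91/6
```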
Some very useful properties of expected values:
Definition 3 (Wasserman (2004, p. 47)) The \(k\)th moment of \(X\) is defined to be \(\mathbb{E}\left(X^k\right)\) assuming that \(\mathbb{E}\left(|X|^k\right)<\infty\).
Definition 4 (Wasserman (2004, p. 51)) Let \(X\) be a random variable with mean \(\mu\). The variance of \(X\), denoted by \(\sigma^2\), \(\sigma^2_X\), \(\mathbb{V}\left(X\right)\), \(\mathbb{V}X\), or \(\mathsf{Var}\left(X\right)\) is defined \[\sigma^2=\mathbb{E}\left(X-\mu\right)^2,\] assuming that this expectation exists. The standard deviation is \(\mathsf{sd}(X)=\sqrt{\mathbb{V}\left(X\right)}\) and is also denoted by \(\sigma\) or \(\sigma_X\).
Illustrate using examples.
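For instance, the variance of one fair-die roll, computed exactly with Python's `fractions`:

```python
from fractions import Fraction

f = {x: Fraction(1, 6) for x in range(1, 7)}  # fair die pmf

mu = sum(x * p for x, p in f.items())              # mean
var = sum((x - mu)**2 * p for x, p in f.items())   # variance
sd = float(var) ** 0.5                             # standard deviation
print(mu, var, round(sd, 4))  # 7/2 35/12 1.7078
```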
Wasserman (2004, p. 51) Theorem 3.15 points to two important properties of the variance:
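Assuming the two properties are the usual ones, \(\mathbb{V}\left(X\right)=\mathbb{E}\left(X^2\right)-\mu^2\) and \(\mathbb{V}\left(aX+b\right)=a^2\,\mathbb{V}\left(X\right)\), a quick sample-based check:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.integers(1, 7, size=200_000).astype(float)

# V(X) = E(X^2) - mu^2 (the sample analogues satisfy this identity)
print(np.var(x), np.mean(x**2) - np.mean(x)**2)

# V(aX + b) = a^2 V(X): shifting does not change spread, scaling does
a, b = 3.0, -5.0
print(np.var(a * x + b), a**2 * np.var(x))
```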
Theorem 1 (Wasserman (2004, p. 64) Chebyshev’s inequality) Let \(\mu=\mathbb{E}\left(X\right)\) and \(\sigma^2=\mathsf{Var}\left(X\right)\). Then, for all \(t>0\), \[\mathbb{P}\left(|X-\mu|\geq t\right) \leq \frac{\sigma^2}{t^2}.\]
If we let \(Z=\dfrac{X-\mu}{\sigma}\), then \[\mathbb{P}\left(|Z|\geq k\right)\leq \frac{1}{k^2}.\]
The transformation \(Z=\dfrac{X-\mu}{\sigma}\) is called standardization; \(Z\) is the standardized version of \(X\).
Therefore, we know a lot about the tail behavior of a standardized random variable! In particular, \[\mathbb{P}\left(|Z|\geq 2\right)\leq \frac{1}{4}, \qquad \mathbb{P}\left(|Z|\geq 3\right)\leq \frac{1}{9}.\]
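A sketch checking these bounds against a skewed distribution, here a standard exponential (mean 1, standard deviation 1):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.exponential(scale=1.0, size=1_000_000)  # mean 1, sd 1

z = (x - 1.0) / 1.0   # standardize using the known mean and sd
for k in (2, 3):
    tail = np.mean(np.abs(z) >= k)
    print(k, tail, "Chebyshev bound:", 1 / k**2)
```

The empirical tails (roughly \(e^{-3}\approx 0.05\) and \(e^{-4}\approx 0.018\)) sit well below the bounds: Chebyshev is conservative, but it holds for any distribution with finite variance.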