Lecture 5a

Connections across Lectures 1 to 4

Andrew Pua

2024-10-07

Plan for these slides

  1. How does everything you have seen so far mesh together?
  2. Introducing the idea of random variable as a way to link the sample space to data
  3. Ways to describe random variables
  4. Your first special distribution
  5. Connections to hypothesis tests you have learned so far
  • Examples along the way include:

    • Measurements obtained from experiments
    • Measurements obtained from observational data
    • Statistics used for summaries
    • Statistics used to evaluate claims

Concept of a random variable

  • Two students are selected at random (with replacement) from a college in which 60% of the students are male. Let \(X\) be the number of males in the sample.

  • Define \(M\) to be a label for male, and \(F\) for female.

  • Here we have \[\Omega=\{MM, MF, FM, FF\}.\]

  • Here \(X\left(MM\right)=2\). What values does \(X\) assign to the other elements of \(\Omega\)?

  • Here \(X\) is what is called a random variable. It connects each outcome in the sample space to a real number.

Definition 1 (Wasserman (2004, p. 19) Definition 2.1) A random variable is a mapping \(X:\Omega\to \mathbb{R}\) that assigns a real number \(X(\omega)\) to each outcome \(\omega\).

  • In effect, we have “hidden” the sample space in order to directly work with a suitable random variable of interest.
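To make the mapping concrete, here is a minimal R sketch of \(X\) for the two-student example (the object names are ours, not from the lecture):

``` r
# Sample space for two students drawn with replacement
omega <- c("MM", "MF", "FM", "FF")

# X maps each outcome to a real number: the count of M's in the string
X <- sapply(strsplit(omega, ""), function(s) sum(s == "M"))
names(X) <- omega
X
#> MM MF FM FF
#>  2  1  1  0
```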

Concept of a distribution of a random variable

  • A list of the values of \(X\), together with their corresponding probabilities, is called the distribution of \(X\).

  • In effect, we “distribute” a total probability of 1 across the possible values that a random variable \(X\) can take.

    • \(\mathbb{P}\left(X\in A\right)=\mathbb{P}\left(\{\omega\in\Omega: X(\omega)\in A\}\right)\) for a subset \(A\) of \(\mathbb{R}\)
    • \(\mathbb{P}\left(X=x\right)=\mathbb{P}\left(\{\omega\in\Omega: X(\omega)=x\}\right)\) for a particular value \(x\) of \(X\)
  • For example, \[\mathbb{P}\left(X=2\right)= \mathbb{P}\left(\{MM\}\right)=0.6^2.\]

  • Now finish constructing the distribution of \(X\), where \(X\) is the number of males in the sample.

  • Given the distribution of \(X\), you can also find \(\mathbb{P}\left(X\leq 1\right)\). You can also generalize to \(\mathbb{P}\left(X\leq x\right)\) for any \(x\in\mathbb{R}\).
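As a check on your construction, here is a short R sketch that builds the full distribution and computes \(\mathbb{P}\left(X\leq 1\right)\), assuming independent draws with \(\mathbb{P}(M)=0.6\):

``` r
# Outcome probabilities multiply because the draws are independent
p_outcome <- c(MM = 0.6 * 0.6, MF = 0.6 * 0.4, FM = 0.4 * 0.6, FF = 0.4 * 0.4)
X <- c(MM = 2, MF = 1, FM = 1, FF = 0)

# "Distribute" probability 1 across the values of X
dist_X <- tapply(p_outcome, X, sum)
dist_X
#>    0    1    2
#> 0.16 0.48 0.36

# P(X <= 1) sums the probabilities of the values at most 1
sum(dist_X[c("0", "1")])
#> [1] 0.64
```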

Cumulative distribution functions

Definition 2 (Wasserman (2004, p. 20) Definition 2.2) The cumulative distribution function or cdf is the function \(F_X:\mathbb{R}\to [0,1]\) defined by \[F_X\left(x\right) = \mathbb{P}\left(X\leq x\right).\]
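For the two-student example, the cdf simply accumulates the distribution; a minimal sketch in R:

``` r
# F_X(x) = P(X <= x): cumulative sums of the pmf at the jump points
x_vals <- c(0, 1, 2)
pmf <- c(0.16, 0.48, 0.36)
data.frame(x = x_vals, F_X = cumsum(pmf))
#>   x  F_X
#> 1 0 0.16
#> 2 1 0.64
#> 3 2 1.00

# Between jumps F_X is flat; it is 0 below 0 and 1 above 2,
# so F_X is a right-continuous step function on all of R.
```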

Connections to previous lectures

A fair coin is tossed independently ten times.

  • Find the distribution of \(X\), defined as the number of heads in those ten tosses.
  • Change the coin from a fair one to a biased one for which the probability of heads is 0.25.
  • Generalize: let the probability of heads be \(p\) and consider \(n\) tosses instead of 10.
| Outcome \(\omega\) | Probability | \(X(\omega)\) |
|--------------------|-------------|---------------|
| TTTTTTTTTT | \((0.5)^{10}\) | 0 |
| TTTTTTTTTH | \((0.5)^9 (0.5)\) | 1 |
| TTTTTTTTHT | \((0.5)^9 (0.5)\) | 1 |
| \(\vdots\) | \(\vdots\) | \(\vdots\) |
| HTTHHTHHHT | \((0.5)^4 (0.5)^6\) | 6 |
| \(\vdots\) | \(\vdots\) | \(\vdots\) |
| HHHHHHHHHH | \((0.5)^{10}\) | 10 |
  • Eventually, you will obtain the distribution of \(X\), defined as the number of heads:

\[\mathbb{P}\left(X=k\right)=\binom{10}{k} (0.5)^k (0.5)^{10-k}, \ k=0,1,\ldots, 10\]
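A quick check of this pmf in R, computed directly from the formula:

``` r
# P(X = k) = choose(10, k) * (0.5)^k * (0.5)^(10 - k) for k = 0, ..., 10
k <- 0:10
pmf <- choose(10, k) * 0.5^k * 0.5^(10 - k)

# Matches R's built-in binomial pmf, and the probabilities sum to 1
all.equal(pmf, dbinom(k, size = 10, prob = 0.5))
#> [1] TRUE
sum(pmf)
#> [1] 1
```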

  • What happens when you consider the other modifications?
  • Congratulations! You have formed a special or named distribution called a binomial distribution.
  • The binomial distribution arises from experiments in which each trial results in a “success” or a “failure”, but not both.
  • You have to define what “success” means and ensure that “failure” is the complement.
  • When \(n=1\), we have a Bernoulli distribution.
  • Fix \(n\) as a positive integer and \(p\in (0,1)\). If there are \(n\) independent Bernoulli trials, each with the same probability of success \(p\), then the total number of successes \(X\) has a binomial distribution with parameters \(n\) and \(p\). In short, \(X\sim \mathrm{Bin}\left(n,p\right)\).
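The modifications above fit the same template; a sketch using R's binomial functions (the seed is arbitrary):

``` r
# Ten tosses with P(heads) = 0.25: X ~ Bin(10, 0.25)
dbinom(0:10, size = 10, prob = 0.25)

# For general n and p, dbinom(k, n, p) = choose(n, k) * p^k * (1 - p)^(n - k).
# Simulated relative frequencies approximate the pmf:
set.seed(1)
draws <- rbinom(10000, size = 10, prob = 0.25)
table(draws) / 10000
```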

Connections to previous lectures

  • 10 independent tosses of a coin

  • Parallel to the dogs example: 10 independent trials, each recording whether Harley chose the correct cup

  • Before we collect data:

    • We have 10 random variables \(X_1, X_2,\ldots, X_{10}\).
    • Here \(X_i\) takes on only two possible values, 0 and 1.
    • Assume that \(\mathbb{P}\left(X_i=1\right)=\theta\) for all \(i\) and that the trials are independent.
  • We wanted to test the null hypothesis that \(\theta=0.5\) against the alternative that \(\theta>0.5\).

    • To evaluate which hypothesis is supported by the data, we computed a \(p\)-value, which depends on the distribution of the sample proportion.
    • The sample proportion is the number of times Harley chose the correct cup divided by 10: \[ \overline{X}_{10} =\frac{X_1+X_2+\ldots+X_{10}}{10}.\]
  • We used simulation to compute a \(p\)-value which is given by \[\mathbb{P}_{\theta=0.5}\left(\overline{X}_{10} \geq 0.9 \right)\] or \[\mathbb{P}_{\theta=0.5}\left(10\overline{X}_{10} \geq 9 \right)=\mathbb{P}_{\theta=0.5}\left(X_1+X_2+\ldots+X_{10} \geq 9 \right).\]

  • What distribution springs to mind?

  • Can you now compute (by hand and using R) the exact \(p\)-value for Harley’s case?
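For reference, a sketch of both computations, assuming \(X_1+\cdots+X_{10}\sim \mathrm{Bin}(10, 0.5)\) under the null:

``` r
# Exact p-value: P(X >= 9) when X ~ Bin(10, 0.5)
sum(dbinom(9:10, size = 10, prob = 0.5))
#> [1] 0.01074219

# Equivalently through the cdf: 1 - P(X <= 8)
pbinom(8, size = 10, prob = 0.5, lower.tail = FALSE)
#> [1] 0.01074219

# By hand: (choose(10, 9) + choose(10, 10)) / 2^10 = 11 / 1024

# Simulation approximation, as in the earlier lecture (seed arbitrary)
set.seed(1)
mean(rbinom(100000, size = 10, prob = 0.5) >= 9)
```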