Lecture 2

Digging a bit deeper

Andrew Pua

2024-09-11

Plan for these slides

  1. Dig a bit deeper into testing a claim
  2. Introduce simulations via R
  3. Learn how to test a “simple” claim, a topic usually taught at the end of a statistics course

A typical scientific investigation

Our focus is Harley

Setup for the experiment

Summary of the data from the experiment

  1. What does the original dataset look like?
  2. Focus on Harley’s result for the cue “Bow look dog”. How do we judge the finding?

Judging findings requires models

  1. Think about possible explanations for the finding.
  2. Think about the circumstances that led to the finding.
  3. What do the authors mean when they say “the ability of domestic dogs to use human body cues”?

A simple model to start judging findings

  1. The starting point is that what we observe in the data has arisen from a random process, or what others call a data generating process.
  2. This is a philosophical advance and a different way of thinking. Random is a technical term here, and it does not mean the same thing as haphazard.

A simple model

  1. Tossing a coin
  2. Two possible outcomes (heads or tails)
  3. Which outcome shows up is not known in advance
  4. With enough observations of the act of coin tossing under similar conditions, a particular type of long-run behavior emerges.

Learn bits of R

  1. Go to Google Colab https://colab.research.google.com.
  2. Create a new notebook.

Change runtime type

Default is Python 3. Change to R. You’re done with setup.

Learn bits of R

  1. Congratulations, you can now use R without having to install anything on your computer.
  2. This is a quick, short-term solution, but it is enough for our purposes.
  3. You may want to change the names of your notebooks. Your notebooks are stored in your Google Drive under Colab Notebooks.

Learn bits of R

# one toss of a fair coin: a single draw from a Binomial(1, 0.5), with 1 = heads and 0 = tails
rbinom(1, 1, 0.5)
[1] 0
rbinom(10, 1, 0.5)
 [1] 0 0 1 1 0 0 1 0 1 1
# store ten tosses in an object named x
x <- rbinom(10, 1, 0.5)
x
 [1] 0 0 0 1 0 0 1 1 1 0
x <- rbinom(10, 1, 0.5)
x
 [1] 0 1 1 0 0 0 1 1 1 0

Learn bits of R

plot(x)

Learn bits of R

c(1, 3, 8)
[1] 1 3 8
# running (cumulative) sums of a vector
cumsum(c(1, 3, 8))
[1]  1  4 12
# dividing a vector by a single number works element by element
c(1, 3, 8)/3
[1] 0.3333333 1.0000000 2.6666667
1:20
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
# dividing two vectors of the same length is also element by element
c(1, 3, 8)/(1:3)
[1] 1.000000 1.500000 2.666667
# repeat the value 1 five times
rep(1, 5)
[1] 1 1 1 1 1

Learn bits of R

x
 [1] 0 1 1 0 0 0 1 1 1 0
length(x)
[1] 10
# running proportion of 1s (heads) after each toss
z <- cumsum(x)/(1:length(x))
plot(z)

Making nicer pictures

plot(1:length(z), z, 
     type = "l", xlab = "number of coin flips", 
     ylab = "relative frequency", ylim = c(0, 1),
     cex.lab=1.5, cex.axis=1.5)
# dotted red reference line at the long-run value 0.5
lines(1:length(z), rep(0.5, length(z)), lty = 3, col = 2, lwd = 3) 

Run this code a few more times

x <- rbinom(10, 1, 0.5)
z <- cumsum(x)/(1:length(x))
plot(1:length(z), z, 
     type = "l", xlab = "number of coin flips", ylab = "relative frequency", ylim = c(0, 1),
     cex.lab=1.5, cex.axis=1.5)
lines(1:length(z), rep(0.5, length(z)), 
      lty = 3, col = 2, lwd = 3) 
  1. What are the emerging patterns?
  2. What stays the same? What changes?
  3. In terms of R: What are the crucial parts? Which are the bells and whistles?

Emerging patterns from another run

Experiment further

  1. Focus on x <- rbinom(10, 1, 0.5).
  2. Change 10 to 1000, while holding everything else constant. Run the previous code a few times. Emerging patterns?
  3. Change 0.5 to 0.05, while holding everything else constant. Run the previous code a few times. Emerging patterns?
  4. Change 10 to 1000 and 0.5 to 0.05, while holding everything else constant. Run the previous code a few times. Emerging patterns?
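For instance, the fourth variation amounts to editing a single line and then rerunning the rest of the code unchanged:

# 1000 tosses of a coin with success probability 0.05
x <- rbinom(1000, 1, 0.05)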

Variation is everywhere…

  1. … even for artificially generated or simulated data.
  2. The line z <- cumsum(x)/(1:length(x)) is really calculating a sequence of relative frequencies (see the check after this list).
  3. In our context, these relative frequencies are called sample proportions.
  4. The number on the vertical axis corresponding to the red line is a fixed constant of potential interest.
  5. In our context, this fixed constant is called a population proportion or a probability of “success”.
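Item 2 can be checked directly in R. A minimal check, assuming x and z from the earlier code are still in the workspace:

# the k-th entry of z should equal the average of the first k entries of x
all.equal(z, sapply(1:length(x), function(k) mean(x[1:k])))
[1] TRUE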

Sample proportions are averages!

  1. Let \(n\) be the number of trials or observations.
  2. For \(t=1,\ldots, n\), let \(X_t\) be the \(t\)th observation of a characteristic measured by \(X\).
  3. In our case, \(X\) takes on two values: 0 or 1.
  4. So, we have \(X_1, X_2, \ldots, X_n\).
  5. What does \(X_1+X_2+\ldots+X_n\) represent?
  6. After dividing the sum by \(n\), what do we obtain?
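Writing this out (using \(\hat{p}_n\) for the sample proportion and \(\bar{X}_n\) for the sample average, notation not used elsewhere in these slides): each \(X_t\) is either 0 or 1, so the sum counts the number of 1s, and

\[
\hat{p}_n = \frac{X_1 + X_2 + \cdots + X_n}{n} = \bar{X}_n.
\]

The sample proportion is nothing but the sample average of the \(X_t\).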

Congratulations!

  1. You now have some first-hand experience with a tool that statisticians use: It is called Monte Carlo or fake-data simulation.
  2. You also get to have a feel for the connection between data (even though artificially generated) and a fixed constant of potential interest. This is a version of what is called the law of large numbers (LLN).
  3. You have also learned a bit of R in the process.
  4. A model has been present whether you noticed it or not.

Dogs and coins

  1. Pretend that “the dog chooses the correct cup” is like “getting Heads in a coin toss”.

  2. Pretend we are in the long run.

    • What does it mean when “the dog cannot understand human cues”?
    • What does it mean when “the dog can understand human cues”?
    • As a result, there are two claims to adjudicate.

Dogs and coins

  1. Once again, pretend we are in the long run. How many out of the 1 million times would you expect the dog to choose the correct cup if “the dog cannot understand human cues”?
  2. So, if Harley cannot understand human cues, how many out of the 10 times would you expect him to choose the correct cup?
  3. Think of what we actually observed. How many times did Harley choose the correct cup?
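For reference, under the coin-toss analogy a dog that cannot understand human cues picks the correct cup with long-run proportion 0.5, so one would expect roughly \(0.5 \times 1{,}000{,}000 = 500{,}000\) correct choices out of a million trials, and \(0.5 \times 10 = 5\) correct choices out of ten.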

Learn even more R

# Number of times to repeat the process of tossing a fair coin 10 times 
nsim <- 2
# Repeat for nsim times "tossing of a fair coin 10 times"
a <- replicate(nsim, rbinom(10, 1, 0.5))
# a is a 10 x nsim matrix: each column is one repetition of 10 tosses
a
      [,1] [,2]
 [1,]    0    1
 [2,]    0    1
 [3,]    0    0
 [4,]    0    1
 [5,]    1    1
 [6,]    1    1
 [7,]    0    0
 [8,]    1    1
 [9,]    0    0
[10,]    0    1
# Calculate the relative frequency or sample proportion for every repetition
props <- colMeans(a)
props
[1] 0.3 0.7
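As a quick check of what colMeans is doing, the first entry of props is just the average of the first column of a:

# same as props[1]
mean(a[, 1])
[1] 0.3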

Which claim is supported by real data?

# Number of times to repeat the process of tossing a fair coin 10 times 
nsim <- 10^4
# Repeat for nsim times "tossing of a fair coin 10 times"
a <- replicate(nsim, rbinom(10, 1, 0.5))
# Calculate the relative frequency for every repetition
props <- colMeans(a)
length(props)
[1] 10000
head(props)
[1] 0.4 0.8 0.7 0.3 0.4 0.8
# Summarize in tabular format
table(props)
props
   0  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9    1 
   8   95  476 1145 2053 2404 2100 1150  446  109   14 

Which claim is supported by real data?

# Visualize using a histogram
hist(props, freq = FALSE, cex.lab=1.5, cex.axis=1.5, cex.main = 1.5)

How can you decide?

  1. A very popular approach (with dissenting opinions) is to report the following statistic:

If you take the position that Harley cannot understand human gestures, how often can you hypothetically observe something at least as extreme as what you actually observed in real data?
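One way to compute this frequency from the simulation above, assuming the observed result was that Harley chose the correct cup 9 times out of 10 (so sample proportions of 0.9 or higher count as at least as extreme as what was observed):

# fraction of the 10^4 hypothetical repetitions with 9 or more correct out of 10
mean(props >= 0.9)

From the table above, this works out to \((109 + 14)/10000 = 0.0123\).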

How can you decide?

  1. In our case, it is given by 0.0123.

    • But it is ridiculous to report it at this level of precision.
    • 0.01 is good enough for our setting. How did I know that this is good enough?
  2. What decision will you make?

Congratulations again!

  1. You have been exposed to an argument supported by foundations and principles with some mathematical basis.
  2. This style of argument is called hypothesis testing.
  3. You will learn more as we progress through the course.
  4. There are other statistical ways of thinking depending on the goal. For now, the goal was to adjudicate between two claims about the fixed constant of interest.

Taking stock

  1. What was the simple model of chance that was used?
  2. Why were we using the word “pretend”?
  3. What are the steps of the argument? Why could it make sense?
  4. Where was the math?
  5. How do we know that the computer is doing what we think it should do?

Some technical terms

model, random, data generating process, long run, Monte Carlo simulation, fake-data simulation, law of large numbers, population, probability, relative frequency, expect, histogram, density, summary, hypothesis testing

Informal                                        | In general | In our example
Fixed constant of interest                      | Parameter  | Population proportion
Data summary (can be hypothetical/fake or real) | Statistic  | Sample proportion, relative frequency