CourseKata Chapter 8

PRE, F, Cohen’s d

Mansour Abdoli, PhD

Overview / Goals

By the end of today, you can:

  • fit and interpret multi‑group (3+ groups) models using lm()
  • compare group models using PRE, df, MS, and the F ratio
  • describe and compute effect sizes (mean differences, PRE, Cohen’s d)
  • use shuffle/simulation to ask whether an effect could plausibly be due to randomness

Review: Group Models

  • Data:
    • Outcome: \(Y\) (a numerical variable)
    • Explanatory: a categorical variable
  • Models:
    • Empty model: one mean for everyone
    • Group model: separate mean for each group.

Example
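A minimal sketch of the two models in R. The data are simulated stand-ins (an assumption, not the course data set), and the names `thumb`, `height`, and `dat` are hypothetical.

```r
# Simulated stand-in data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "medium", "tall"), each = 20),
                 levels = c("short", "medium", "tall"))
thumb <- c(rnorm(20, mean = 57, sd = 8),
           rnorm(20, mean = 60, sd = 8),
           rnorm(20, mean = 64, sd = 8))
dat <- data.frame(thumb, height)

empty_model <- lm(thumb ~ 1, data = dat)       # one mean for everyone
group_model <- lm(thumb ~ height, data = dat)  # separate mean for each group

coef(empty_model)  # the grand mean
coef(group_model)  # baseline (short) mean, then two differences from it

# Group means, for comparison with the coefficients above
group_means <- sapply(split(dat$thumb, dat$height), mean)
group_means
```

The empty model has a single parameter; the group model has one parameter per level of the explanatory variable.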

Comparing Models

  • Model Measures:

           Gender   Interest
    SSE     10546      11567
    PRE    0.1123   0.026324
    F      19.609     2.0818
  • Which one is better?
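A table of this kind can be sketched in base R as follows. The helper `model_measures()` and the simulated variables are hypothetical, so the numbers will not match the table above.

```r
# Hypothetical helper: SSE, PRE, and F for a fitted group model
model_measures <- function(fit) {
  y   <- model.response(model.frame(fit))
  sst <- sum((y - mean(y))^2)        # error of the empty model
  sse <- sum(resid(fit)^2)           # error left by the group model
  df_model <- length(coef(fit)) - 1  # parameters beyond the empty model
  df_error <- df.residual(fit)
  c(SSE = sse,
    PRE = (sst - sse) / sst,
    F   = ((sst - sse) / df_model) / (sse / df_error))
}

# Simulated stand-in data (assumption): gender matters, interest does not
set.seed(8)
n <- 90
gender   <- factor(sample(c("female", "male"), n, replace = TRUE))
interest <- factor(sample(c("low", "medium", "high"), n, replace = TRUE))
thumb    <- 58 + 6 * (gender == "male") + rnorm(n, sd = 8)
dat <- data.frame(thumb, gender, interest)

measures <- cbind(
  Gender   = model_measures(lm(thumb ~ gender, data = dat)),
  Interest = model_measures(lm(thumb ~ interest, data = dat))
)
round(measures, 4)
```

Lower SSE and higher PRE and F all point to the same model here, because both models are compared against the same empty model.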

Effect of Increasing Levels

Example: 2-Level Explanatory

  • \(Y_i = b_0 + b_1 \cdot X_{1i} + e_i\)
    • \(b_0 = \text{mean}(\text{Thumb}_\text{short})\)
    • \(b_1 = \text{mean}(\text{Thumb}_\text{tall})-\text{mean}(\text{Thumb}_\text{short})\)
    • \(X_{1i} = 1 \text{ if Height}==\text{tall}, \text{ and } 0 \text{ otherwise}\)
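The correspondence between the coefficients and the group means can be checked with a short sketch. The data are simulated (an assumption) and the variable names are hypothetical.

```r
# Simulated two-level data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "tall"), each = 25), levels = c("short", "tall"))
thumb  <- c(rnorm(25, mean = 58, sd = 8), rnorm(25, mean = 63, sd = 8))
dat <- data.frame(thumb, height)

fit <- lm(thumb ~ height, data = dat)
b <- coef(fit)  # names: "(Intercept)" and "heighttall"

group_means <- sapply(split(dat$thumb, dat$height), mean)

b[["(Intercept)"]]                              # b0: mean of the short group
b[["heighttall"]]                               # b1: tall mean minus short mean
group_means[["tall"]] - group_means[["short"]]  # same value as b1

# X_1i is the 0/1 column that lm() builds for the "tall" level
x1 <- model.matrix(fit)[, "heighttall"]
table(x1, dat$height)
```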

Example: 3-Level Explanatory

  • \(Y_i = b_0 + b_1 \cdot X_{1i} + b_2 \cdot X_{2i} + e_i\)
    • \(b_0 = \text{mean}(\text{Thumb}_\text{Short})\)
    • \(b_1 = \text{mean}(\text{Thumb}_\text{Med.})-\text{mean}(\text{Thumb}_\text{Short})\)
      • \(X_{1i} = 1 \text{ if Height}==\text{Med.}, \text{ and } 0 \text{ otherwise}\)
    • \(b_2 = \text{mean}(\text{Thumb}_\text{Tall})-\text{mean}(\text{Thumb}_\text{Short})\)
      • \(X_{2i} = 1 \text{ if Height}==\text{Tall}, \text{ and } 0 \text{ otherwise}\)
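With three levels, `lm()` builds two 0/1 indicator columns, one for each non-baseline level. A sketch with simulated data (an assumption; the level names are hypothetical):

```r
# Simulated three-level data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "medium", "tall"), each = 20),
                 levels = c("short", "medium", "tall"))  # first level = baseline
thumb <- c(rnorm(20, mean = 57, sd = 8),
           rnorm(20, mean = 60, sd = 8),
           rnorm(20, mean = 64, sd = 8))
dat <- data.frame(thumb, height)

fit <- lm(thumb ~ height, data = dat)
b <- coef(fit)  # names: "(Intercept)", "heightmedium", "heighttall"
group_means <- sapply(split(dat$thumb, dat$height), mean)

b[["heightmedium"]]  # b1: medium mean minus short mean
b[["heighttall"]]    # b2: tall mean minus short mean

# One distinct row of the design matrix per group:
# short = (1, 0, 0), medium = (1, 1, 0), tall = (1, 0, 1)
X <- model.matrix(fit)
unique(X)
```

Both differences are measured from the same baseline group (short), not from each other.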

Comparing Model Measures

  • Notice anything?

Level Size and Model Measures

SSE (SSR):

  • How much of the total error the model leaves unexplained.

PRE = (SST - SSE)/SST = 1 - SSE/SST

  • Proportion of Explained variability
  • More levels mean a more complex model, and a more complex model tends to explain more variability
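A worked check using the Gender and Interest values from the Model Measures table, assuming both models were fit to the same outcome and sample (so they share one SST). Recover SST from the Gender column, then recompute PRE for Interest:

\[SST = \frac{SSE_{\text{Gender}}}{1 - PRE_{\text{Gender}}} = \frac{10546}{1 - 0.1123} \approx 11880\]

\[PRE_{\text{Interest}} = 1 - \frac{SSE_{\text{Interest}}}{SST} \approx 1 - \frac{11567}{11880} \approx 0.026\]

This agrees with the tabled value up to rounding of the SSE entries.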

F = MSM/MSE

  • Ratio of average explained variability to average unexplained variability
    • Less sensitive to a larger number of levels

The F Ratio

Why we need F

PRE alone doesn’t “penalize” complexity.

  • Adding parameters almost always reduces SSE.
    • Is the reduction big enough to be worth the extra parameters?

F ratio \[F = \frac{MS_{\text{Model}}}{MS_{\text{Error}}}\]

  • Large \(F\) means: the error reduction per added parameter is large relative to the remaining error per degree of freedom.
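The F ratio can be computed by hand from the two models and compared with what `anova()` reports. A sketch on simulated data (an assumption); all object names are hypothetical.

```r
# Simulated three-level data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "medium", "tall"), each = 20),
                 levels = c("short", "medium", "tall"))
thumb <- c(rnorm(20, mean = 57, sd = 8),
           rnorm(20, mean = 60, sd = 8),
           rnorm(20, mean = 64, sd = 8))
dat <- data.frame(thumb, height)

empty_model <- lm(thumb ~ 1, data = dat)
group_model <- lm(thumb ~ height, data = dat)

sst <- sum(resid(empty_model)^2)  # total error (empty model)
sse <- sum(resid(group_model)^2)  # error left by the group model

df_model <- length(coef(group_model)) - length(coef(empty_model))  # 2 extra parameters
df_error <- df.residual(group_model)                               # 60 - 3 = 57

ms_model <- (sst - sse) / df_model  # error reduced per added parameter
ms_error <- sse / df_error          # remaining error per degree of freedom
f_ratio  <- ms_model / ms_error

# Same F written in terms of PRE
pre <- (sst - sse) / sst
f_from_pre <- (pre / df_model) / ((1 - pre) / df_error)

anova(empty_model, group_model)  # reports the same F
```

Writing F in terms of PRE shows the penalty directly: the same PRE spread over more parameters gives a smaller F.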

Effect Size in Group Models

Why effect size?

  • With large \(n\), tiny effects can look convincing.
  • Effect size says: How big is the difference?

Common group-model effect sizes:

  • Mean differences (model parameters)
  • PRE (variance explained relative to empty model)
  • Cohen’s \(d\) (standardized mean difference)

Cohen’s d (two groups)

  • Standardized Difference: \[d = \frac{\bar Y_1 - \bar Y_2}{s_{pooled}}\]
    • \(s_{pooled}^2 = MSE = \frac{SSE}{df_E}\)
    • In general: \(s_{pooled}^2 = \frac{df_1\cdot s_1^2+df_2\cdot s_2^2}{df_1+df_2}\)
  • Verify the calculation.
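One way to verify the calculation is to compute \(d\) by both routes and confirm they agree. This is a sketch on simulated data (an assumption), with unequal group sizes so the pooling weights matter.

```r
# Simulated two-group data with unequal group sizes (assumption)
set.seed(8)
height <- factor(rep(c("short", "tall"), times = c(22, 28)),
                 levels = c("short", "tall"))
thumb <- c(rnorm(22, mean = 58, sd = 8), rnorm(28, mean = 63, sd = 8))
dat <- data.frame(thumb, height)

fit <- lm(thumb ~ height, data = dat)

# Route 1: pooled variance taken as MSE = SSE / df_E from the group model
mse     <- sum(resid(fit)^2) / df.residual(fit)
d_model <- coef(fit)[["heighttall"]] / sqrt(mse)

# Route 2: pooled variance from the two group variances
tall  <- dat$thumb[dat$height == "tall"]
short <- dat$thumb[dat$height == "short"]
df1 <- length(tall) - 1
df2 <- length(short) - 1
s_pooled <- sqrt((df1 * var(tall) + df2 * var(short)) / (df1 + df2))
d_groups <- (mean(tall) - mean(short)) / s_pooled

c(d_model = d_model, d_groups = d_groups)  # the two values agree
```

They agree because, in a two-group model, SSE is the sum of the within-group sums of squares and \(df_E = df_1 + df_2\).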

Effect size ≠ causality

  • Large difference between groups:
    • Practical Effect
    • Not a proof of causal effect


  • Group models show association
  • Causal claims need stronger design / evidence

Thinking about DGP

Observed Sample to Population Inference

  • The observed effect could be:
    • a real systematic difference,
    • or random variation (noise)
  • A Test Approach
    • Assume random variation (the true effect = 0)
      • Null Hypothesis (\(H_0\))
    • Find Null Distribution of effect
      • Sampling Distribution under the Null
    • Measure how often the null distribution produces an effect at least as large as the observed one

Shuffle-based Simulation (Conceptual Inference)

Shuffle logic

  • Keep the same \(Y\) values
  • Randomly reassign them to groups
  • Refit the model
  • Record the effect size (e.g., \(b_1\) or PRE)
  • Repeat many times to get a shuffle distribution
    • A.k.a. Null Distribution

Example: Two-group Model
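The shuffle steps can be sketched in base R for a two-group model. The data are simulated (an assumption), and the helper `b1()` is hypothetical.

```r
# Simulated two-group data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "tall"), each = 25), levels = c("short", "tall"))
thumb  <- c(rnorm(25, mean = 58, sd = 8), rnorm(25, mean = 63, sd = 8))
dat <- data.frame(thumb, height)

# Hypothetical helper: fit the group model and return b1
b1 <- function(d) coef(lm(thumb ~ height, data = d))[["heighttall"]]

observed_b1 <- b1(dat)

# Keep the Y values, shuffle the group labels, refit, record b1
shuffled_b1 <- replicate(1000, {
  shuffled <- dat
  shuffled$height <- sample(shuffled$height)  # breaks any link between Y and group
  b1(shuffled)
})

# Proportion of shuffles with an effect at least as extreme as the observed one
p_tail <- mean(abs(shuffled_b1) >= abs(observed_b1))
p_tail
```

A small proportion means shuffled data rarely produce an effect as large as the one observed.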

Visualize the shuffle distribution
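A possible way to draw the plot with base graphics, again on simulated data (an assumption). The block rebuilds its own shuffle distribution so it can be run on its own; the helper `b1()` is hypothetical.

```r
# Simulated two-group data (assumption: not the course data set)
set.seed(8)
height <- factor(rep(c("short", "tall"), each = 25), levels = c("short", "tall"))
thumb  <- c(rnorm(25, mean = 58, sd = 8), rnorm(25, mean = 63, sd = 8))

# Hypothetical helper: second coefficient = difference between the two groups
b1 <- function(y, g) coef(lm(y ~ g))[[2]]

observed_b1 <- b1(thumb, height)
shuffled_b1 <- replicate(500, b1(thumb, sample(height)))

# Histogram of shuffled effects, with the observed effect marked
hist(shuffled_b1, breaks = 30,
     xlim = range(c(shuffled_b1, observed_b1)),
     main = "Shuffle distribution of b1",
     xlab = "b1 after shuffling group labels")
abline(v = observed_b1, col = "red", lwd = 2)
```

Setting `xlim` from both the shuffled and observed values keeps the vertical line visible even when the observed effect falls outside the range of the shuffled ones.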

Interpreting the shuffle plot

  • The histogram:
    • effects expected by chance
      (if the DGP had no group effect)
  • The vertical line is the observed effect
  • If the observed effect is far out in the tails:
    • The “no-effect” DGP is an unlikely explanation for the data

Wrap-up

What Chapter 8 emphasizes

  • Multi-group models:
    baseline mean and group changes
  • Model comparison:
    fit (PRE) + complexity (df)
  • F ratio:
    a complexity-aware comparison tool
  • Effect sizes:
    Statistical vs Practical Difference
  • Shuffle simulation:
    Could the effect be due to Noise?