CourseKata Chapter 7

Modeling by Group Means

Mansour Abdoli, PhD

Session Goals

By the end of today, you should be able to:

  • Explain what a group model does
  • Fit a group model with lm(y ~ group)
  • Interpret the intercept and the group coefficient
  • Explain how predictions change from the empty model

Review: The Empty Model

  • DATA = Mean + Error
  • Assumptions:
    • One mean describes everyone
    • Variation around that mean is treated as chance (error)
  • Formula: \[\hat{y}_i = \bar{y}\]

  • Application:
    Everyone gets the same prediction: the overall mean.

Example: Thumb Length

Model:
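A minimal sketch of fitting the empty model in R. It assumes a small made-up data frame (the hypothetical `thumb_data`, with `Thumb` in mm) rather than the course data set, so the numbers are for illustration only.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

# Empty model: no explanatory variable, only an intercept
# (Thumb ~ 1 is an equivalent way to write this formula)
empty_model <- lm(Thumb ~ NULL, data = thumb_data)

coef(empty_model)            # the single estimate is the overall mean
mean(thumb_data$Thumb)       # same value, computed directly
head(predict(empty_model))   # every observation gets that same prediction
```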

Empty-Model Residual

Residual = Observed − Predicted
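The residual formula can be sketched directly in R. The thumb lengths below are made-up values (assumed to be in mm), and the variable names are hypothetical.

```r
# Made-up thumb lengths in mm (illustration only)
thumb <- c(56, 60, 58, 62, 64, 66, 65, 69)

predicted <- mean(thumb)         # empty-model prediction: the overall mean
residual  <- thumb - predicted   # Residual = Observed - Predicted

residual        # negative: below the mean; positive: above the mean
sum(residual)   # residuals around the mean balance out to zero
```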

Modeling with Group Means

Grouping by Gender

  • Questions:
    • Does thumb length differ by gender?
    • Does one overall mean seem reasonable?

Model by Group (Gender)

  • In words: \[\text{Thumb} = \text{Gender} + \text{Error}\]

  • In application:

    • Each gender gets its own group mean as its prediction.
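Written with symbols, the same idea might look like the following. The names \(\bar{y}_{\text{Female}}\) and \(\bar{y}_{\text{Male}}\) for the two group means are notation assumed here, not taken from the slides.

\[\hat{y}_i = \begin{cases} \bar{y}_{\text{Female}} & \text{if person } i \text{ is female} \\ \bar{y}_{\text{Male}} & \text{if person } i \text{ is male} \end{cases}\]

Compare this with the empty model, where \(\hat{y}_i = \bar{y}\) for every person.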

Fit the Group Model
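A minimal sketch of fitting the group model with `lm()`. It assumes a small made-up data frame (the hypothetical `thumb_data`) in place of the course data set.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

# Group model: Gender is the explanatory variable
group_model <- lm(Thumb ~ Gender, data = thumb_data)
group_model

# Each person's prediction is the mean of their own group
data.frame(
  Gender     = thumb_data$Gender,
  prediction = predict(group_model)
)
```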

Interpreting Coefficients

  • Here, Female is the reference group:
    • Intercept = mean thumb length for females
    • GenderMale = difference in group means (Male − Female)
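One way to check this reading of the coefficients is to compare them with the group means computed directly. The sketch below uses made-up data (the hypothetical `thumb_data`); R picks Female as the reference here because it is the first factor level alphabetically.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

group_model <- lm(Thumb ~ Gender, data = thumb_data)
group_means <- tapply(thumb_data$Thumb, thumb_data$Gender, mean)

coef(group_model)["(Intercept)"]              # compare with the Female mean
group_means["Female"]

coef(group_model)["GenderMale"]               # compare with Male minus Female
group_means["Male"] - group_means["Female"]
```

The predicted mean for males is the intercept plus the GenderMale coefficient.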

Visualizing Predictions

  • Interpretation:
    • Black points: observed data
    • Red/Blue lines: predicted group means
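A plot of this kind could be sketched in base R as follows. The data are made up (the hypothetical `thumb_data`), and assigning red to Female and blue to Male is an assumption; the original figure may have been drawn with different tools or colors.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

# Predicted value for each group is its mean
group_means <- as.numeric(tapply(thumb_data$Thumb, thumb_data$Gender, mean))

# Black points: observed data (x = 1 for Female, x = 2 for Male)
x_pos <- as.numeric(thumb_data$Gender)
plot(x_pos, thumb_data$Thumb, pch = 16, col = "black",
     xaxt = "n", xlim = c(0.5, 2.5),
     xlab = "Gender", ylab = "Thumb length (mm)")
axis(1, at = 1:2, labels = levels(thumb_data$Gender))

# Colored horizontal lines: predicted group means
segments(x0 = c(0.7, 1.7), y0 = group_means,
         x1 = c(1.3, 2.3), y1 = group_means,
         col = c("red", "blue"), lwd = 2)
```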

Compare to Empty Model
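A sketch that puts the two sets of predictions side by side, assuming the same kind of made-up data frame (the hypothetical `thumb_data`) as a stand-in for the course data.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

empty_model <- lm(Thumb ~ NULL, data = thumb_data)    # one overall mean
group_model <- lm(Thumb ~ Gender, data = thumb_data)  # one mean per gender

comparison <- data.frame(
  Gender           = thumb_data$Gender,
  Thumb            = thumb_data$Thumb,
  empty_prediction = predict(empty_model),
  group_prediction = predict(group_model)
)
comparison
```

The empty-model column repeats a single value, while the group-model column switches value when the gender changes.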

Residual Comparison (Conceptual)

  • Discussion:

    • Which model has smaller residual spread?
    • What does that imply?
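One way to make "residual spread" concrete is to compare the sum of squared residuals from each model. This is a sketch on made-up data (the hypothetical `thumb_data`); the sum of squares is one possible measure of spread, chosen here for illustration.

```r
# Made-up illustration data (not the course data set)
thumb_data <- data.frame(
  Thumb  = c(56, 60, 58, 62, 64, 66, 65, 69),
  Gender = factor(rep(c("Female", "Male"), each = 4))
)

empty_model <- lm(Thumb ~ NULL, data = thumb_data)
group_model <- lm(Thumb ~ Gender, data = thumb_data)

# Sum of squared residuals: smaller means predictions sit closer to the data
ss_empty <- sum(resid(empty_model)^2)
ss_group <- sum(resid(group_model)^2)

c(empty = ss_empty, group = ss_group)
ss_empty - ss_group   # how much error the group model removes
```

With these made-up values the sums of squares are 132 for the empty model and 34 for the group model, so the group model's residuals are less spread out.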

Key Insight

  • The group model:
    • Allows different means
    • Reduces prediction error when the group means differ
    • Uses information from the explanatory variable
  • But…
    • Is the reduction large enough to matter?