CourseKata Chapter 10

Logic of Inference

Mansour Abdoli, PhD

Overview / Goals

  • Modeling Data vs DGP
  • Sampling Distribution
  • Null Distribution of \(b_1: (\mu_2-\mu_1)\)
  • Unlikely (unexpected) Events
  • \(\alpha\) vs. p-value
  • Testing \(b_1: \beta_1\)

Inference vs Fit

Modeling Data vs DGP

Modeling Data (Fit)

  • Observed:
    • DATA = Model + Error
    • More complex, Better Fit
      • Fit: PRE
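
A minimal sketch of how PRE (proportional reduction in error) compares a more complex model against the empty model. The data values and the names `y`, `x`, and `sum_of_squares` are made up for illustration; they are not from the course materials.

```python
from statistics import mean

# Hypothetical data: a quantitative response y and a two-level group label x.
y = [4.0, 5.0, 6.0, 7.0, 9.0, 11.0]
x = ["a", "a", "a", "b", "b", "b"]

def sum_of_squares(values, predictions):
    """Sum of squared errors between observed values and model predictions."""
    return sum((v - p) ** 2 for v, p in zip(values, predictions))

# Empty model: every observation is predicted by the grand mean.
grand_mean = mean(y)
ss_total = sum_of_squares(y, [grand_mean] * len(y))

# Two-group model: each observation is predicted by its own group mean.
group_means = {g: mean(v for v, lab in zip(y, x) if lab == g) for g in set(x)}
ss_error = sum_of_squares(y, [group_means[lab] for lab in x])

# PRE: proportion of the empty model's error removed by the more complex model.
pre = (ss_total - ss_error) / ss_total
print(round(pre, 3))
```

PRE describes fit to the observed data only; it does not by itself say whether the same reduction would appear in the DGP.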

Modeling DGP (Inference)

  • Unknown:
    • Hypothesis (Claim)
    • Null Hypothesis
      • Chance Model
  • How well does a model estimate the DGP?
    • Not sure, but is the fit more than chance? => Test of Hypothesis
    • useful for prediction => Cross-validation

Models

  • Quantitative Response: \(Y = Model + Error\)
    • Categorical Explanatory \(\Rightarrow b_0=\bar y_1\), \(b_1=\bar y_2-\bar y_1\)
    • Quantitative Explanatory \(\Rightarrow b_0=\hat y_{x=0}\), \(b_1=\text{slope}\)
  • Models have assumptions:
    • We focus on the parameter(s): \(\beta_0\) or \(\beta_1\)
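
As a rough illustration of the two cases above, the same least-squares formulas give group means for a 0/1 coded categorical predictor and an intercept and slope for a quantitative predictor. The helper `least_squares` and all data values are assumptions made for this sketch.

```python
from statistics import mean

def least_squares(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Categorical explanatory variable, coded 0 (group 1) and 1 (group 2).
group = [0, 0, 0, 1, 1, 1]
y_cat = [4.0, 5.0, 6.0, 7.0, 9.0, 11.0]
b0_cat, b1_cat = least_squares(group, y_cat)
# b0_cat is the mean of group 1; b1_cat is (mean of group 2) - (mean of group 1).

# Quantitative explanatory variable.
x_quant = [1, 2, 3, 4, 5]
y_quant = [3.0, 5.0, 7.0, 9.0, 11.0]
b0_quant, b1_quant = least_squares(x_quant, y_quant)
# b0_quant is the predicted y at x = 0; b1_quant is the slope.

print(b0_cat, b1_cat, b0_quant, b1_quant)
```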

Distributions

  • Population Distribution (DGP)
    • Empty Model: \(\beta_0 + \varepsilon\)
    • Fixed shape, center and spread
  • Sample Distribution
    • Data: \(\bar y + e\)
    • Everything may change per sample
  • Distribution of a Sample Statistic (Sampling Distribution)
    • Sample Mean: \(\bar y = \beta_0 + \text{sampling error}\)
    • Spread shrinks as sample size grows.
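
The three distributions can be told apart by simulation. The sketch below assumes a hypothetical DGP (empty model with \(\beta_0 = 50\) and normal errors with standard deviation 10; these values are invented) and builds the sampling distribution of the mean for two sample sizes.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical DGP: Y = beta_0 + epsilon (values chosen only for illustration).
BETA_0, SIGMA = 50.0, 10.0

def draw_sample(n):
    """One sample of size n from the assumed DGP."""
    return [random.gauss(BETA_0, SIGMA) for _ in range(n)]

def sampling_distribution_of_mean(n, reps=2000):
    """Means of many independent samples of size n."""
    return [mean(draw_sample(n)) for _ in range(reps)]

means_n10 = sampling_distribution_of_mean(10)
means_n100 = sampling_distribution_of_mean(100)

# Both sets of means center near beta_0, but the larger samples vary less.
print(round(mean(means_n10), 2), round(stdev(means_n10), 2))
print(round(mean(means_n100), 2), round(stdev(means_n100), 2))
```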

Quantitative ~ Categorical

Null Distribution of \(b_1: (\mu_2-\mu_1)\)

  • Null: Empty Model
  • \(Y\) is independent of \(X\)
  • Simulate Null Distribution of \(b_1\)
    • Shuffle \(Y\)
    • Compute \(b_1=\bar y_2-\bar y_1\)
    • Repeat many times
  • Visualize with a histogram
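
The shuffle steps above can be sketched as follows. The data, the group coding (1 and 2), and the helper name `b1` are hypothetical; a text histogram stands in for a plotted one.

```python
import random
from collections import Counter
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible

# Hypothetical data: six responses in group 1, six in group 2.
y = [12, 15, 11, 14, 13, 16, 17, 19, 15, 18, 20, 16]
group = [1] * 6 + [2] * 6

def b1(y_values, groups):
    """Difference in group means: mean of group 2 minus mean of group 1."""
    g1 = [v for v, g in zip(y_values, groups) if g == 1]
    g2 = [v for v, g in zip(y_values, groups) if g == 2]
    return mean(g2) - mean(g1)

observed_b1 = b1(y, group)

# Shuffling Y breaks any link between Y and X, which mimics the empty model.
null_b1 = []
for _ in range(5000):
    shuffled_y = random.sample(y, len(y))  # a shuffled copy of y
    null_b1.append(b1(shuffled_y, group))

# Crude text histogram: one '#' per 50 shuffles, binned to the nearest integer.
counts = Counter(round(b) for b in null_b1)
for value in sorted(counts):
    print(f"{value:>3} {'#' * (counts[value] // 50)}")
print("observed b1:", observed_b1)
```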

Null Distribution Example

Unlikely (unexpected) Events

  • Observed statistic far from expected value
    • Distance is not universal
      • Use standardized value
      • Use Probability
  • If tail probability is low,
    • observed statistic is unlikely
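
A small sketch of the two ways to express "far": a standardized value and a tail probability. The null distribution here is a stand-in drawn from a normal distribution (in practice it would come from shuffling), and the observed value 4.0 is invented.

```python
import random
from statistics import mean, stdev

random.seed(3)  # fixed seed so the sketch is reproducible

# Stand-in null distribution of b1 (hypothetical, centered at 0).
null_b1 = [random.gauss(0.0, 1.6) for _ in range(5000)]
observed_b1 = 4.0  # hypothetical observed statistic

# Standardized value: distance from the expected value in standard-deviation units.
z = (observed_b1 - mean(null_b1)) / stdev(null_b1)

# Tail probability: share of null values at least as large as the observed one.
right_tail = sum(b >= observed_b1 for b in null_b1) / len(null_b1)

print(round(z, 2), right_tail)
```

A raw distance of 4.0 means little on its own; the standardized value and the tail proportion put it on a scale that can be compared across studies.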

Example

\(\alpha\) vs. \(p\)-value

How far is far?

  • How low must a probability be to count as unusual?
    • Depends! But typically 5%.
    • We call this Significance Level (\(\alpha\))
  • Far:
    • \(p\)-value\(< \alpha\)
    • Observed statistic in \(\alpha\) portion of tail(s)

Testing \(b_1: (\mu_2-\mu_1)\)

  • Null Hypothesis: Empty Model DGP
    • Reject this if the observed statistic is unusually far.
  • In what direction?
    • The Alternative:
      • \(\beta_1>0\): right tail
      • \(\beta_1<0\): left tail
      • \(\beta_1\ne0\): both tails
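
One way to see how the alternative picks the tail is a small helper that counts null values in the relevant tail(s). The function `p_value`, its `alternative` labels, and the evenly spaced toy null distribution are all assumptions of this sketch; the two-sided branch additionally assumes the null distribution is centered at 0.

```python
def p_value(null_values, observed, alternative):
    """Proportion of null values at least as extreme as the observed statistic."""
    n = len(null_values)
    if alternative == "greater":      # H_a: beta_1 > 0, right tail
        return sum(v >= observed for v in null_values) / n
    if alternative == "less":         # H_a: beta_1 < 0, left tail
        return sum(v <= observed for v in null_values) / n
    if alternative == "two-sided":    # H_a: beta_1 != 0, both tails
        return sum(abs(v) >= abs(observed) for v in null_values) / n
    raise ValueError(f"unknown alternative: {alternative}")

# Toy null distribution: the integers -50..50 (artificial, for easy counting).
null_values = list(range(-50, 51))
observed = 48
alpha = 0.05

for alternative in ("greater", "less", "two-sided"):
    p = p_value(null_values, observed, alternative)
    decision = "reject" if p < alpha else "do not reject"
    print(f"{alternative:>9}: p = {p:.4f} -> {decision} the null")
```

With this toy input the right-tail test rejects at \(\alpha = 0.05\) while the two-tail test does not, because the two-tail p-value counts both extremes.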

Example: One-Tail Tests

\(H_a: \beta_1>0\)

\(H_a: \beta_1<0\)

Example: Two-Tail Tests

\(H_a: \beta_1\ne 0\)

\(F\) p-value

\(F\) and \(t\) Distributions

\(F\) statistic in ANOVA:

  • Follows \(F(df_{num}, df_{denom})\)
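
For a two-group model, the \(F\) statistic equals the square of the pooled-variance \(t\) statistic, which is one way to see how the two distributions relate. The sketch below checks this on invented data; the p-value itself would need an \(F\) distribution function from a statistics library, so it is left out here.

```python
from statistics import mean

# Hypothetical data for two groups.
y1 = [12, 15, 11, 14, 13, 16]
y2 = [17, 19, 15, 18, 20, 16]
all_y = y1 + y2

# Sums of squares: total (empty model) and error (two-group model).
grand_mean = mean(all_y)
ss_total = sum((v - grand_mean) ** 2 for v in all_y)
ss_error = (sum((v - mean(y1)) ** 2 for v in y1)
            + sum((v - mean(y2)) ** 2 for v in y2))
ss_model = ss_total - ss_error

# Degrees of freedom: one extra parameter; n - 2 left over for error.
df_num = 1
df_denom = len(all_y) - 2

f_stat = (ss_model / df_num) / (ss_error / df_denom)

# Pooled-variance t statistic for the same difference in means.
pooled_var = ss_error / df_denom
t_stat = (mean(y2) - mean(y1)) / (pooled_var * (1 / len(y1) + 1 / len(y2))) ** 0.5

print(round(f_stat, 3), round(t_stat ** 2, 3))
```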

\(F\) p-value

Quantitative ~ Quantitative

Testing \(b_1: \beta_1\)