CourseKata
Simulation-Based Inference Summary
How to use this page
This page summarizes the simulation steps for confidence intervals and hypothesis tests in common settings.
Two big ideas appear throughout:
- A confidence interval uses a bootstrap distribution centered near the observed statistic.
- A hypothesis test uses a null distribution centered at the null value.
Tip
For a confidence interval:
- Compute the observed sample statistic.
- Resample with replacement from the sample.
- Recompute the statistic for each bootstrap sample.
- Repeat many times to build a bootstrap distribution.
- Use the middle 95% of the bootstrap distribution as the confidence interval, or estimate a margin of error from the bootstrap distribution and add and subtract it from the observed statistic.
For a hypothesis test:
- State the null and alternative hypotheses.
- Compute the observed sample statistic.
- Simulate data under the null model.
- Recompute the statistic for each simulated sample.
- Repeat many times to build a null distribution.
- Find the proportion of simulated statistics at least as extreme as the observed statistic, or identify rejection region(s).
- Make a decision and state a conclusion in context.
One proportion
Parameter and statistic
- Parameter: \(p\)
- Statistic: \(\hat{p}\)
- Sample size: \(n\)
Confidence interval
- Record the observed sample proportion \(\hat{p}\).
- Represent the sample of size \(n\) as a set of 0s and 1s.
- Generate one bootstrap sample proportion:
  - Resample (sample with replacement) from the observed sample, using the same sample size \(n\).
  - Compute the proportion of 1s in the resample.
- Repeat many times to build a bootstrap distribution of \(\hat{p}\).
- Use the middle 95% of the simulated proportions, or compute a margin of error \(ME\) from the bootstrap distribution.
- Build and interpret the interval as plausible values for the population proportion.
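The confidence-interval steps above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the data (32 successes out of 80), the function name `bootstrap_proportion_interval`, the number of repetitions, and the seed are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_proportion_interval(sample, reps=5000, seed=1):
    """Return (lower, upper) for a 95% percentile bootstrap interval.

    `sample` is a list of 0s and 1s; `reps` and `seed` are illustrative defaults.
    """
    rng = random.Random(seed)
    n = len(sample)
    boot_props = []
    for _ in range(reps):
        resample = rng.choices(sample, k=n)   # sample with replacement, same size n
        boot_props.append(sum(resample) / n)  # proportion of 1s in the resample
    # Cut points at 2.5%, 5%, ..., 97.5%; the first and last bound the middle 95%.
    cuts = statistics.quantiles(boot_props, n=40)
    return cuts[0], cuts[-1]

# Hypothetical data: 32 successes out of n = 80.
sample = [1] * 32 + [0] * 48
p_hat = sum(sample) / len(sample)
lower, upper = bootstrap_proportion_interval(sample)
print(f"p-hat = {p_hat:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```

The interval endpoints vary slightly with the seed and the number of repetitions, which is expected for a simulation-based method.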
Hypothesis test
- State \(H_0: p = p_0\) and the alternative.
- Compute the observed sample proportion \(\hat{p}_{obs}\).
- Simulate a sample proportion \(\hat{p}\) using a population where the probability of success is \(p_0\):
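  - Draw \(n\) outcomes independently, each a success (1) with probability \(p_0\); this is equivalent to sampling with replacement from that population.
  - Compute one simulated sample proportion \(\hat{p}\).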
- Repeat many times to build the null distribution of \(\hat{p}\).
- Find the proportion of simulated values at least as extreme as \(\hat{p}_{obs}\) relative to \(p_0\), or identify rejection region(s).
- Make a decision and state your conclusion in context.
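A possible sketch of the hypothesis-test steps above, assuming a two-sided alternative. The null value \(p_0 = 0.5\), the observed result (32 successes out of 80), and the name `simulate_null_proportions` are hypothetical choices for illustration.

```python
import random

def simulate_null_proportions(p0, n, reps=5000, seed=1):
    """Simulate `reps` sample proportions of size n when P(success) = p0."""
    rng = random.Random(seed)
    null_props = []
    for _ in range(reps):
        successes = sum(rng.random() < p0 for _ in range(n))  # n independent draws
        null_props.append(successes / n)
    return null_props

# Hypothetical setting: H0: p = 0.5, observed 32 successes out of n = 80.
p0, n = 0.5, 80
p_hat_obs = 32 / 80
null_props = simulate_null_proportions(p0, n)

# Two-sided p-value: simulated proportions at least as far from p0 as the
# observed one. The small tolerance guards against floating-point ties.
obs_distance = abs(p_hat_obs - p0)
tol = 1e-12
p_value = sum(abs(p - p0) >= obs_distance - tol for p in null_props) / len(null_props)
print(f"observed p-hat = {p_hat_obs:.3f}, simulated p-value = {p_value:.3f}")
```

For a one-sided alternative, the comparison would count only simulated proportions in the direction stated by the alternative.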
Note
- The confidence interval is centered near the observed \(\hat{p}\).
- The test distribution is centered at the null value \(p_0\).
- Larger sample sizes produce less variability in \(\hat{p}\).
Two proportions
Parameter and statistic
- Parameters: \(p_1\) and \(p_2\); the parameter of interest is the difference \(p_1 - p_2\)
- Statistics: \(\hat{p}_1\) and \(\hat{p}_2\); the statistic of interest is the difference \(\hat{p}_1 - \hat{p}_2\)
- Sample sizes: \(n_1\) and \(n_2\)
Confidence interval
- Compute the observed difference in sample proportions, \(\hat{p}_1 - \hat{p}_2\).
- Represent each sample as 0s and 1s.
- Generate one bootstrap difference, \(\hat{p}_1 - \hat{p}_2\):
  - Resample (sample with replacement) within group 1, using size \(n_1\), and compute \(\hat{p}_1\).
  - Resample (sample with replacement) within group 2, using size \(n_2\), and compute \(\hat{p}_2\).
  - Compute \(\hat{p}_1 - \hat{p}_2\).
- Repeat many times to build a bootstrap distribution of \(\hat{p}_1 - \hat{p}_2\).
- Use the middle 95% or compute a margin of error.
- Interpret the interval as plausible values for the difference in population proportions.
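As a rough sketch of the steps above, the following resamples within each group separately and takes the middle 95% of the simulated differences. The two groups (45 of 100 and 30 of 100 successes) and the helper name `proportion` are made up for the example.

```python
import random
import statistics

rng = random.Random(2)  # fixed seed so the illustration is reproducible

# Hypothetical samples coded as 0s and 1s.
group1 = [1] * 45 + [0] * 55   # 45 successes out of n1 = 100
group2 = [1] * 30 + [0] * 70   # 30 successes out of n2 = 100

def proportion(values):
    return sum(values) / len(values)

obs_diff = proportion(group1) - proportion(group2)

boot_diffs = []
for _ in range(5000):
    resample1 = rng.choices(group1, k=len(group1))  # resample within group 1
    resample2 = rng.choices(group2, k=len(group2))  # resample within group 2
    boot_diffs.append(proportion(resample1) - proportion(resample2))

# 39 cut points at 2.5% steps; first and last bound the middle 95%.
cuts = statistics.quantiles(boot_diffs, n=40)
lower, upper = cuts[0], cuts[-1]
print(f"observed difference = {obs_diff:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```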
Hypothesis test
- State \(H_0: p_1 - p_2 = 0\) and the alternative.
- Compute the observed difference, \((\hat{p}_1 - \hat{p}_2)_{obs}\).
- Build the null distribution:
  - Assume both groups come from the same population under the null.
  - Pool the outcomes, then randomly assign or resample outcomes into two groups of sizes \(n_1\) and \(n_2\).
  - Compute the simulated difference in proportions.
- Repeat many times to build the null distribution of \(\hat{p}_1 - \hat{p}_2\).
- Find the proportion of simulated differences at least as extreme as the observed difference, or identify rejection region(s).
- Make a decision and state your conclusion in context.
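One way to sketch the null simulation above is to pool the outcomes, shuffle them, and split them back into groups of the original sizes (random assignment without replacement). The data and the two-sided alternative are assumptions for illustration.

```python
import random

rng = random.Random(3)  # fixed seed for a reproducible illustration

# Hypothetical samples coded as 0s and 1s.
group1 = [1] * 45 + [0] * 55
group2 = [1] * 30 + [0] * 70
n1 = len(group1)

def proportion(values):
    return sum(values) / len(values)

obs_diff = proportion(group1) - proportion(group2)

pooled = group1 + group2  # under H0 the group labels carry no information
null_diffs = []
for _ in range(5000):
    rng.shuffle(pooled)                        # random reassignment of outcomes
    sim1, sim2 = pooled[:n1], pooled[n1:]      # groups of the original sizes
    null_diffs.append(proportion(sim1) - proportion(sim2))

# Two-sided p-value; the tolerance guards against floating-point ties.
tol = 1e-12
p_value = sum(abs(d) >= abs(obs_diff) - tol for d in null_diffs) / len(null_diffs)
print(f"observed difference = {obs_diff:.3f}, simulated p-value = {p_value:.4f}")
```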
Note
- For confidence intervals, resample within each group.
- For tests, simulate under the null: shuffle the response values and reassign them to the explanatory groups.
- Null: no association; that is, the two groups are treated as coming from the same population.
One mean
Parameter and statistic
- Parameter: \(\mu\)
- Statistic: \(\bar{x}\)
- Sample size: \(n\)
Confidence interval
- Record the sample mean \(\bar{x}\).
- Generate one bootstrap sample mean:
  - Resample with replacement from the sample, using the same size \(n\).
  - Compute the sample mean.
- Repeat many times to build a bootstrap distribution of \(\bar{x}\).
- Use the middle 95% or compute a margin of error.
- Interpret the interval as plausible values for the population mean.
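The sketch below illustrates both options named above for a single mean: the middle 95% of the bootstrap means, and an interval built from a margin of error. The data are hypothetical, and taking the margin of error as 2 times the standard deviation of the bootstrap distribution is an assumed rule of thumb rather than something this page specifies.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed for a reproducible illustration

# Hypothetical sample of n = 12 measurements.
sample = [12.1, 9.8, 11.4, 10.6, 13.2, 8.9, 10.1, 11.9, 9.5, 12.7, 10.8, 11.1]
x_bar = statistics.fmean(sample)

boot_means = []
for _ in range(5000):
    resample = rng.choices(sample, k=len(sample))  # with replacement, same size n
    boot_means.append(statistics.fmean(resample))

# Option 1: middle 95% of the bootstrap distribution.
cuts = statistics.quantiles(boot_means, n=40)
percentile_interval = (cuts[0], cuts[-1])

# Option 2 (assumed convention): ME = 2 * SD of the bootstrap distribution.
se_boot = statistics.stdev(boot_means)
me = 2 * se_boot
me_interval = (x_bar - me, x_bar + me)

print(f"x-bar = {x_bar:.3f}")
print(f"percentile interval = ({percentile_interval[0]:.3f}, {percentile_interval[1]:.3f})")
print(f"x-bar +/- ME        = ({me_interval[0]:.3f}, {me_interval[1]:.3f})")
```

The two intervals are usually close when the bootstrap distribution is roughly symmetric and bell-shaped.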
Hypothesis test
- State \(H_0: \mu = \mu_0\) and the alternative.
- Compute the observed mean \(\bar{x}_{obs}\).
- Build the null distribution:
  - Shift the sample so its mean equals \(\mu_0\), or simulate from a model centered at \(\mu_0\).
  - Draw a sample of size \(n\), with replacement, from the shifted sample (or from the model).
  - Compute the simulated mean.
- Repeat many times to build the null distribution of \(\bar{x}\).
- Find the proportion of simulated means at least as extreme as \(\bar{x}_{obs}\) relative to \(\mu_0\), or identify rejection region(s).
- Make a decision and state your conclusion in context.
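A minimal sketch of the shifting approach described above, assuming a two-sided alternative. The sample and the null value \(\mu_0 = 10\) are hypothetical.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed for a reproducible illustration

# Hypothetical sample and null value.
sample = [12.1, 9.8, 11.4, 10.6, 13.2, 8.9, 10.1, 11.9, 9.5, 12.7, 10.8, 11.1]
mu0 = 10.0
n = len(sample)
x_bar_obs = statistics.fmean(sample)

# Shift every value by the same amount so the shifted sample has mean mu0.
shifted = [x - x_bar_obs + mu0 for x in sample]

null_means = []
for _ in range(5000):
    resample = rng.choices(shifted, k=n)   # draw n values with replacement
    null_means.append(statistics.fmean(resample))

# Two-sided p-value: simulated means at least as far from mu0 as the observed mean.
obs_distance = abs(x_bar_obs - mu0)
p_value = sum(abs(m - mu0) >= obs_distance for m in null_means) / len(null_means)
print(f"observed mean = {x_bar_obs:.3f}, simulated p-value = {p_value:.4f}")
```

Shifting keeps the spread and shape of the sample while moving its center to the null value.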
Note
- For a one-mean test, the null model must be centered at \(\mu_0\).
- The standard error gets smaller as \(n\) increases.
Two means
Parameter and statistic
- Parameters: \(\mu_1\) and \(\mu_2\)
- Statistic: \(\bar{x}_1 - \bar{x}_2\)
- Sample sizes: \(n_1\) and \(n_2\)
Confidence interval
- Compute the observed difference in sample means, \(\bar{x}_1 - \bar{x}_2\).
- Generate one bootstrap difference:
  - Resample with replacement within group 1 and compute \(\bar{x}_1\).
  - Resample with replacement within group 2 and compute \(\bar{x}_2\).
  - Compute \(\bar{x}_1 - \bar{x}_2\).
- Repeat many times to build a bootstrap distribution of \(\bar{x}_1 - \bar{x}_2\).
- Use the middle 95% or compute a margin of error.
- Interpret the interval as plausible values for the difference in population means.
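The steps above might look like this in code. The measurements and the helper name `resampled_mean` are invented for the example; each group is resampled on its own, keeping its original size.

```python
import random
import statistics

rng = random.Random(6)  # fixed seed for a reproducible illustration

# Hypothetical measurements for two independent groups.
group1 = [23.1, 25.4, 22.8, 27.0, 24.6, 26.2, 25.1, 23.9]
group2 = [21.0, 22.5, 20.4, 23.3, 21.8, 22.9, 20.7, 21.6, 22.2, 19.9]

def resampled_mean(group):
    """Mean of one bootstrap resample drawn within a single group."""
    return statistics.fmean(rng.choices(group, k=len(group)))

obs_diff = statistics.fmean(group1) - statistics.fmean(group2)
boot_diffs = [resampled_mean(group1) - resampled_mean(group2) for _ in range(5000)]

# First and last of the 39 cut points bound the middle 95%.
cuts = statistics.quantiles(boot_diffs, n=40)
lower, upper = cuts[0], cuts[-1]
print(f"observed difference = {obs_diff:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```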
Hypothesis test
- State \(H_0: \mu_1 - \mu_2 = 0\) and the alternative.
- Compute the observed difference, \((\bar{x}_1 - \bar{x}_2)_{obs}\).
- Build the null distribution:
  - Assume there is no group effect under the null.
  - Pool the observed values and randomly assign them into two groups of sizes \(n_1\) and \(n_2\), or shuffle the group labels.
  - Compute the simulated difference in means.
- Repeat many times to build the null distribution of \(\bar{x}_1 - \bar{x}_2\).
- Find the proportion of simulated differences at least as extreme as the observed difference, or identify rejection region(s).
- Make a decision and state your conclusion in context.
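The following sketch uses the label-shuffling version of the null simulation above: the observed values stay in place while the group labels are permuted. The data, the label names `"g1"` and `"g2"`, and the two-sided alternative are assumptions for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for a reproducible illustration

# Hypothetical measurements with their group labels.
group1 = [23.1, 25.4, 22.8, 27.0, 24.6, 26.2, 25.1, 23.9]
group2 = [21.0, 22.5, 20.4, 23.3, 21.8, 22.9, 20.7, 21.6, 22.2, 19.9]
values = group1 + group2
labels = ["g1"] * len(group1) + ["g2"] * len(group2)

def diff_in_means(values, labels):
    g1 = [v for v, lab in zip(values, labels) if lab == "g1"]
    g2 = [v for v, lab in zip(values, labels) if lab == "g2"]
    return statistics.fmean(g1) - statistics.fmean(g2)

obs_diff = diff_in_means(values, labels)

shuffled = labels[:]  # copy, so the original labels are left untouched
null_diffs = []
for _ in range(5000):
    rng.shuffle(shuffled)  # shuffling labels keeps group sizes n1 and n2 fixed
    null_diffs.append(diff_in_means(values, shuffled))

tol = 1e-12  # guards against floating-point ties
p_value = sum(abs(d) >= abs(obs_diff) - tol for d in null_diffs) / len(null_diffs)
print(f"observed difference = {obs_diff:.3f}, simulated p-value = {p_value:.4f}")
```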
Note
- For confidence intervals, resample within groups.
- For tests, shuffle or reassign group labels under the null.
- Null: No association; observations are from the same population.
Regression slope
Parameter and statistic
- Parameter: \(\beta_1\)
- Statistic: \(b_1\)
- Sample size: \(n\)
Confidence interval
- Fit the regression line and record the sample slope \(b_1\).
- Generate one bootstrap slope:
  - Resample cases with replacement from the original data.
  - Refit the regression line.
  - Record the slope.
- Repeat many times to build a bootstrap distribution of \(b_1\).
- Use the middle 95% or compute a margin of error.
- Interpret the interval as plausible values for the population slope \(\beta_1\).
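A sketch of the case-resampling bootstrap above. The \((x, y)\) pairs are hypothetical, and the least-squares slope is computed by hand in a helper named `slope` (an illustrative name) so the example needs only the standard library.

```python
import random
import statistics

rng = random.Random(8)  # fixed seed for a reproducible illustration

def slope(pairs):
    """Least-squares slope b1 = Sxy / Sxx for a list of (x, y) cases."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    return sxy / sxx

# Hypothetical data: x = 1..10 with a roughly linear response.
ys = [2.3, 2.9, 4.1, 4.8, 6.2, 6.1, 7.9, 8.4, 9.6, 10.1]
data = list(zip(range(1, 11), ys))
b1_obs = slope(data)

boot_slopes = []
while len(boot_slopes) < 5000:
    resample = rng.choices(data, k=len(data))  # resample whole (x, y) cases
    if len({x for x, _ in resample}) < 2:
        continue  # slope is undefined if every resampled x is identical
    boot_slopes.append(slope(resample))

cuts = statistics.quantiles(boot_slopes, n=40)
lower, upper = cuts[0], cuts[-1]
print(f"b1 = {b1_obs:.3f}, 95% interval = ({lower:.3f}, {upper:.3f})")
```

Resampling whole cases keeps each \(y\) attached to its own \(x\), so the relationship in the data is preserved in every resample.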
Hypothesis test
- State \(H_0: \beta_1 = 0\) and the alternative.
- Fit the regression and record the observed slope \(b_{1,obs}\).
- Build the null distribution:
  - Keep the explanatory variable values fixed.
  - Shuffle the response values, or shuffle the pairing between \(x\) and \(y\), to break any linear relationship.
  - Refit the regression and record the simulated slope.
- Repeat many times to build the null distribution of \(b_1\).
- Find the proportion of simulated slopes at least as extreme as the observed slope, or identify rejection region(s).
- Make a decision and state your conclusion in context.
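The null simulation above could be sketched as follows: the \(x\) values stay fixed and the \(y\) values are shuffled before each refit. The data are hypothetical, a two-sided alternative is assumed, and `slope` is an illustrative helper rather than a library function.

```python
import random
import statistics

rng = random.Random(9)  # fixed seed for a reproducible illustration

def slope(xs, ys):
    """Least-squares slope b1 = Sxy / Sxx."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data.
xs = list(range(1, 11))
ys = [2.3, 2.9, 4.1, 4.8, 6.2, 6.1, 7.9, 8.4, 9.6, 10.1]
b1_obs = slope(xs, ys)

shuffled_ys = ys[:]  # copy, so the observed pairing is left untouched
null_slopes = []
for _ in range(5000):
    rng.shuffle(shuffled_ys)                 # breaks any link between x and y
    null_slopes.append(slope(xs, shuffled_ys))

tol = 1e-12  # guards against floating-point ties
p_value = sum(abs(b) >= abs(b1_obs) - tol for b in null_slopes) / len(null_slopes)
print(f"observed slope = {b1_obs:.3f}, simulated p-value = {p_value:.4f}")
```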
Note
- For confidence intervals, resample whole cases.
- For tests, shuffle the response values and pair them back with the original predictor values.
- Null: No association between \(y\) and \(x\).
Multiple means
Parameter and statistic
- Parameters: \(\mu_1, \mu_2, \dots, \mu_k\)
- Statistic: usually \(F\), or another measure of between-group vs within-group variation
- Sample sizes: \(n_1, n_2, \dots, n_k\)
Confidence interval
A single overall confidence interval is not the focus for multiple means.
Instead, common follow-up goals are:
- confidence intervals for specific pairwise differences
- confidence intervals for group means
If needed, construct pairwise bootstrap intervals by resampling within each group and computing the difference in means for the chosen pair.
Hypothesis test
- State \(H_0: \mu_1 = \mu_2 = \cdots = \mu_k\) and the alternative that at least one mean differs.
- Compute the observed test statistic, often \(F\).
- Build the null distribution:
  - Assume all groups come from the same population under the null.
  - Pool the observed values.
  - Randomly assign the values into groups of the original sizes.
  - Compute the test statistic for the simulated grouping.
- Repeat many times to build the null distribution of the statistic.
- Find the proportion of simulated statistics at least as extreme as the observed statistic, or identify a rejection region.
- Make a decision and state your conclusion in context.
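A sketch of the overall test above using \(F\) as the statistic. The three groups are hypothetical, and `f_statistic` is an illustrative helper that computes the usual ratio of between-group to within-group mean squares. Because large \(F\) values are the extreme ones, the p-value counts only the upper tail.

```python
import random
import statistics

rng = random.Random(10)  # fixed seed for a reproducible illustration

def f_statistic(groups):
    """F = (SS_between / (k - 1)) / (SS_within / (n - k))."""
    all_values = [v for g in groups for v in g]
    n, k = len(all_values), len(groups)
    grand_mean = statistics.fmean(all_values)
    means = [statistics.fmean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data for k = 3 groups.
groups = [
    [5.1, 6.0, 5.5, 6.3, 5.8],
    [6.9, 7.4, 6.5, 7.1, 7.8, 6.7],
    [5.9, 6.4, 6.1, 5.6, 6.6],
]
f_obs = f_statistic(groups)

pooled = [v for g in groups for v in g]
sizes = [len(g) for g in groups]
null_fs = []
for _ in range(5000):
    rng.shuffle(pooled)  # random reassignment of values to groups
    sim_groups, start = [], 0
    for size in sizes:   # split into groups of the original sizes
        sim_groups.append(pooled[start:start + size])
        start += size
    null_fs.append(f_statistic(sim_groups))

p_value = sum(f >= f_obs for f in null_fs) / len(null_fs)
print(f"observed F = {f_obs:.2f}, simulated p-value = {p_value:.4f}")
```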
Note
- The overall test asks whether there is evidence that any group mean differs.
- Running one overall test, rather than many separate pairwise tests, avoids inflating the Type I error rate.
- If the overall test is significant, follow-up comparisons are usually needed.
- An adjusted \(\alpha\) is used for the post-hoc pairwise comparisons.
Test of independence (two categorical variables)
Parameter and statistic
- Parameters: population joint distribution or population cell proportions
- Statistic: often \(\chi^2\)
- Sample size: \(n\)
Confidence interval
A single confidence interval is not usually the focus for a test of independence.
Instead, we usually ask whether two categorical variables are associated.
Useful follow-up summaries may include:
- confidence intervals for selected conditional proportions
- confidence intervals for differences in proportions within specific categories
Hypothesis test
- State \(H_0\): the two categorical variables are independent, and the alternative that they are associated.
- Compute the observed test statistic, often \(\chi^2\).
- Build the null distribution:
  - Keep one variable fixed.
  - Shuffle the labels of the other variable to break any association while preserving the marginal structure.
  - Recompute the test statistic for the shuffled data.
- Repeat many times to build the null distribution of the statistic.
- Find the proportion of simulated statistics at least as extreme as the observed statistic, or identify a rejection region.
- Make a decision and state your conclusion in context.
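The shuffling procedure above can be sketched as follows, with \(\chi^2\) as the statistic. The two variables (a hypothetical `region` and `level`) and their counts are invented for the example, and `chi_square` is an illustrative helper. Shuffling one variable leaves both sets of marginal counts unchanged.

```python
import random
from collections import Counter

rng = random.Random(11)  # fixed seed for a reproducible illustration

def chi_square(xs, ys):
    """Chi-square statistic: sum of (observed - expected)^2 / expected over cells."""
    n = len(xs)
    row_totals, col_totals = Counter(xs), Counter(ys)
    cell_counts = Counter(zip(xs, ys))  # missing cells count as 0
    stat = 0.0
    for r, row_total in row_totals.items():
        for c, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (cell_counts[(r, c)] - expected) ** 2 / expected
    return stat

# Hypothetical data: 60 urban and 60 rural respondents, three response levels.
region = ["urban"] * 60 + ["rural"] * 60
level = (["low"] * 10 + ["mid"] * 20 + ["high"] * 30      # urban counts
         + ["low"] * 25 + ["mid"] * 20 + ["high"] * 15)   # rural counts
chi_obs = chi_square(region, level)

shuffled_level = level[:]  # copy; `region` is the variable kept fixed
null_stats = []
for _ in range(5000):
    rng.shuffle(shuffled_level)  # breaks any association, keeps the margins
    null_stats.append(chi_square(region, shuffled_level))

tol = 1e-9  # guards against floating-point ties
p_value = sum(s >= chi_obs - tol for s in null_stats) / len(null_stats)
print(f"observed chi-square = {chi_obs:.2f}, simulated p-value = {p_value:.4f}")
```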
Note
- A significant result suggests an association, not causation.
- Running one overall test of independence, rather than many separate comparisons, avoids inflating the Type I error rate.
- If the overall test is significant, follow-up comparisons are usually needed.
- An adjusted \(\alpha\) is used for the post-hoc pairwise comparisons.
- After the test, examine conditional proportions or residuals to see where the association appears.