Statistical Inference
Notes & simulations
Interactive simulations for statistics, econometrics, and causal inference. The idea is to see how things like sampling distributions, p-values, and OLS actually work, not just read about them.
Start Here
Probability & Uncertainty
- Foundations — Distributions, Sampling & Confidence Intervals
- Variance, SD & Standard Error — Population vs Sample, SD vs SE & Why n − 1
- The Sampling Distribution — Data vs Sampling Distribution & When to Assume Normality
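As a quick taste of this section, here is a minimal numpy sketch of the two ideas above: why the sample variance divides by n − 1, and why the standard error is the population SD divided by √n. The population SD of 2 and sample size of 25 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
pop_sd, n = 2.0, 25

# Many samples of size n from the same population.
samples = rng.normal(0.0, pop_sd, size=(20_000, n))

# Why n - 1: dividing by n systematically underestimates the variance.
biased = samples.var(axis=1, ddof=0).mean()    # tends toward sigma^2 * (n-1)/n
unbiased = samples.var(axis=1, ddof=1).mean()  # Bessel's correction removes the bias

# SD vs SE: the spread of the sample MEANS is the standard error,
# and it matches sigma / sqrt(n), not sigma itself.
empirical_se = samples.mean(axis=1).std(ddof=1)
theoretical_se = pop_sd / np.sqrt(n)
print(biased, unbiased, empirical_se, theoretical_se)
```

With these settings, `biased` lands near 3.84 (σ² scaled by 24/25) while `unbiased` lands near 4, and the spread of the sample means sits near 2/√25 = 0.4.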
The Central Limit Theorem
- CLT Simulator — Interactive CLT Simulator
- LLN vs CLT — Why Averages Stabilize Before They Become Normal
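A minimal sketch of the CLT in action: draws from an Exponential(1) are heavily right-skewed, but averages of them lose that skew as n grows (at rate 2/√n for this distribution). The sample sizes and the simple moment-based skewness estimator are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def skewness(x):
    # Standardized third moment of a sample.
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

# The sampling distribution of the mean sheds the parent's skew as n grows.
skews = {}
for n in (1, 5, 50):
    means = rng.exponential(1.0, size=(20_000, n)).mean(axis=1)
    skews[n] = skewness(means)
print(skews)
```

Each step up in n roughly halves-then-some the skewness, which is the CLT doing its work well before the histogram looks perfectly normal.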
Inference
- p-values & Confidence Intervals — What They Actually Mean
- Test Statistics — Z, t, F, Chi-Squared, Wald, LR & Score
- The Bootstrap — Resampling-Based Inference
- Power, Alpha, Beta & MDE — Hypothesis Testing & Experiment Design
- Monte Carlo Experiments — How We Understand Estimators
- Multiple Testing — False Discoveries & the Replication Crisis
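To give a flavor of the bootstrap entry above, here is a sketch of a percentile bootstrap CI for a statistic with no tidy textbook standard error, the median. The lognormal sample and resample counts are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.lognormal(0.0, 1.0, size=200)   # skewed sample, true median = 1

# Percentile bootstrap: resample the data with replacement, recompute
# the statistic each time, and read the CI off the resulting distribution.
boot_medians = np.array([
    np.median(rng.choice(data, size=data.size, replace=True))
    for _ in range(5_000)
])
lo, hi = np.percentile(boot_medians, [2.5, 97.5])
print(f"95% bootstrap CI for the median: [{lo:.2f}, {hi:.2f}]")
```

The same recipe works for almost any statistic, which is what makes resampling-based inference so useful.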
Regression
- Regression & the CEF — OLS and the Conditional Expectation Function
- Residuals & Controls — Diagnostics, Partialling Out & Omitted Variable Bias
- Frisch-Waugh-Lovell — Partialling Out & OVB
- Omitted Variable Bias — The OVB Formula, Sign-of-Bias Table & When Controls Eliminate Confounding
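A compact sketch of the Frisch-Waugh-Lovell idea from this section: the multiple-regression coefficient on x1 equals the slope from regressing residualized y on residualized x1, where both are first purged of the controls. The data-generating coefficients (2 and 3) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000
x2 = rng.normal(size=n)                      # the control
x1 = 0.5 * x2 + rng.normal(size=n)           # regressor of interest, correlated with x2
y = 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Full multiple regression.
X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# FWL: residualize y and x1 on the controls, then run the simple
# regression of residual on residual. The slope matches beta[1] exactly.
Z = np.column_stack([np.ones(n), x2])
def residualize(v):
    return v - Z @ np.linalg.lstsq(Z, v, rcond=None)[0]

ry, rx = residualize(y), residualize(x1)
b_fwl = (rx @ ry) / (rx @ rx)
print(beta[1], b_fwl)
```

Dropping the residualization step here reproduces omitted variable bias: the simple slope of y on x1 absorbs part of x2's effect.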
Algebra of Regression
- The Algebra Behind OLS — Matrix Notation, the VCV Matrix & Where Standard Errors Come From
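The matrix algebra above can be sketched in a few lines: the OLS estimator \((X'X)^{-1}X'y\), the residual-based estimate of \(\sigma^2\), and the classical VCV matrix whose diagonal square roots are the standard errors. The true coefficients (1 and 2) and error SD are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma = 500, 1.5
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, sigma, size=n)

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                   # (X'X)^{-1} X'y

residuals = y - X @ beta_hat
s2 = residuals @ residuals / (n - X.shape[1])  # sigma^2 estimate, df-corrected
vcv = s2 * XtX_inv                             # classical variance-covariance matrix
se = np.sqrt(np.diag(vcv))
print(beta_hat, se)
```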
Estimation
- Method of Moments — Match Sample Moments to Population Moments
- Maximum Likelihood — Find the Parameters That Make the Data Most Probable
- Limited Dependent Variables — Logit, Probit, Marginal Effects & the Tobit Model
- Generalized Method of Moments — When You Have More Moment Conditions Than Parameters
- Bayesian Estimation — MAP, Posterior Mean & the Link to Regularization
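A small maximum-likelihood sketch in the spirit of this section: for exponential data the log-likelihood is \(n\log\lambda - \lambda\sum x_i\), maximized at \(\hat\lambda = 1/\bar{x}\). The grid search below is an illustrative brute-force check of that closed form; the true rate of 0.5 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.exponential(scale=2.0, size=1_000)   # true rate lambda = 0.5

# Exponential log-likelihood: n*log(lam) - lam * sum(x).
# Scan a grid and compare the maximizer with the analytic MLE, 1/mean(x).
lams = np.linspace(0.01, 2.0, 2_000)
loglik = data.size * np.log(lams) - lams * data.sum()
lam_grid = lams[np.argmax(loglik)]
lam_closed = 1.0 / data.mean()
print(lam_grid, lam_closed)
```

The grid maximizer and the closed-form MLE agree to within the grid spacing, which is the basic sanity check behind "find the parameters that make the data most probable."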
Standard Errors & Diagnostics
- Heteroskedasticity — Constant vs Non-Constant Variance & Robust SEs
- Clustered SEs — When Observations Aren’t Independent
- The Delta Method — Standard Errors for Nonlinear Transformations of Estimates
- Measurement Error — Why Noisy Regressors Bias You Toward Zero
- Bias-Variance Tradeoff — Underfitting, Overfitting & MSE Decomposition
- Model Selection — AIC, BIC, Cross-Validation & Choosing the Right Complexity
- When Inference Breaks Down — Why Variation Is the Fuel of Inference
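As a taste of the heteroskedasticity entry above, here is a sketch comparing classical standard errors with an HC1 "sandwich" estimator on data whose error spread grows with x. The functional form of the heteroskedasticity is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2_000
x = rng.uniform(0.0, 2.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.2 + x ** 2)  # error spread grows with x

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
u = y - X @ beta
k = X.shape[1]

# Classical SEs assume a single error variance; the HC1 sandwich uses each
# observation's squared residual, so it stays valid under heteroskedasticity.
se_classical = np.sqrt(np.diag((u @ u) / (n - k) * XtX_inv))
meat = X.T @ (X * (u ** 2)[:, None])
se_hc1 = np.sqrt(np.diag(n / (n - k) * XtX_inv @ meat @ XtX_inv))
print(se_classical[1], se_hc1[1])
```

The slope estimate itself is unbiased either way; what changes is how honestly its uncertainty is reported.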
Causal Thinking
- From Correlation to Causation — Why Correlation Isn’t Enough & When It Becomes Causal
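The correlation-vs-causation point above can be made concrete with a tiny confounding simulation: x has no causal effect on y, yet the naive regression finds a strong one, and conditioning on the common cause z makes it vanish. All coefficients here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000
z = rng.normal(size=n)                # common cause
x = z + rng.normal(size=n)            # x has NO causal effect on y...
y = 2.0 * z + rng.normal(size=n)      # ...they merely share the confounder z

# Naive slope of y on x looks like a strong "effect".
naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Conditioning on z recovers the true effect of zero.
X = np.column_stack([np.ones(n), x, z])
controlled = np.linalg.lstsq(X, y, rcond=None)[0][1]
print(naive, controlled)
```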
Bayesian Thinking
- Bayesian Updating — A Gentle Introduction
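A gentle numeric sketch of conjugate updating: a Beta(a, b) prior on a coin's heads probability combined with binomial data gives a Beta(a + heads, b + tails) posterior in closed form. The uniform prior and 7-of-10 data are illustrative.

```python
# Beta-binomial updating: prior Beta(a, b), observe heads/tails,
# posterior is Beta(a + heads, b + tails).
a, b = 1.0, 1.0          # uniform prior
heads, tails = 7, 3

a_post, b_post = a + heads, b + tails
posterior_mean = a_post / (a_post + b_post)           # shrinks toward the prior
map_estimate = (a_post - 1) / (a_post + b_post - 2)   # posterior mode
print(posterior_mean, map_estimate)
```

With a uniform prior the MAP estimate (0.7) coincides with the MLE, while the posterior mean (8/12 ≈ 0.667) is pulled slightly toward the prior's center, a first glimpse of the regularization link mentioned under Estimation.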
Statistical Foundations of AI
- Training as Maximum Likelihood — Cross-Entropy, SGD & What Training Optimizes
- Regularization as Bayesian Inference — Weight Decay, Dropout & the RLHF Penalty as Priors
- Prediction vs Causation in Foundation Models — \(P(Y \mid X)\) vs \(P(Y \mid do(X))\) & Identification vs Training
- Experimental Design for AI Systems — A/B Testing, Power & Multiple Testing for Model Evaluation
- Calibration & Uncertainty Quantification — When to Trust Model Confidence
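In the spirit of the experimental-design entry above, here is a sketch of power analysis by simulation for an A/B test on conversion rates, using a pooled two-proportion z-test. The baseline rate, lift, and arm size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(8)
p_a, p_b, n = 0.10, 0.12, 2_000     # baseline, treatment, users per arm
sims = 2_000
z_crit = 1.96                        # two-sided 5% critical value

# Simulate the experiment many times and count how often it rejects.
rejections = 0
for _ in range(sims):
    a = rng.binomial(n, p_a) / n
    b = rng.binomial(n, p_b) / n
    pool = (a + b) / 2
    se = np.sqrt(2 * pool * (1 - pool) / n)
    if abs(b - a) / se > z_crit:
        rejections += 1
power = rejections / sims
print(f"simulated power: {power:.2f}")
```

With these numbers the power comes out near 0.5, a useful warning: a 2-point lift at this sample size is a coin flip to detect, before any multiple-testing correction makes things harder.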