Bayesian vs Frequentist
Same data, different questions
Bayesian and frequentist statistics look at the same data but ask different questions:
| | Frequentist | Bayesian |
|---|---|---|
| Parameters are… | Fixed but unknown | Random variables with distributions |
| Probability means… | Long-run frequency | Degree of belief |
| Result | Point estimate + confidence interval | Full posterior distribution |
| “There’s a 95% chance…” | …that this procedure captures the true value | …that the true value is in this interval |
The frequentist says: “If I repeated this experiment forever, 95% of my CIs would contain the true value.” The Bayesian says: “Given what I’ve seen, I’m 95% sure the true value is in this range.”
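The frequentist statement can be checked by brute force. The sketch below is a minimal illustration: it assumes normal data with a known σ (the values of `true_mu`, `sigma`, `n`, `n_reps`, and the seed are arbitrary choices) and counts how often the 95% interval covers the true mean over many repeated experiments.

```r
# Assumed setup for illustration: normal data with known sigma.
set.seed(42)          # arbitrary seed, for reproducibility only
true_mu <- 1
sigma   <- 2
n       <- 10
n_reps  <- 10000

se <- sigma / sqrt(n)
z  <- qnorm(0.975)    # about 1.96

# Each replicate: draw a fresh sample, build the CI, check whether it covers.
covers <- replicate(n_reps, {
  y_bar <- mean(rnorm(n, mean = true_mu, sd = sigma))
  (y_bar - z * se <= true_mu) && (true_mu <= y_bar + z * se)
})

coverage <- mean(covers)
print(coverage)       # should land close to 0.95
```

Any single interval either contains μ or it does not; the 95% describes the procedure, not one particular interval.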
Most people actually think like Bayesians (“what’s the probability the parameter is between A and B?”) but compute like frequentists (p-values, CIs).
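The Bayesian question has a direct computational answer once a posterior is available. As a sketch, assume a normal posterior for μ; the values of `post_mu`, `post_sd`, `A`, and `B` below are made-up numbers for illustration, not results from any data set.

```r
# Hypothetical normal posterior for mu (illustrative numbers only).
post_mu <- 0.8
post_sd <- 0.5

# Range of interest.
A <- 0
B <- 1.5

# P(A < mu < B | data): posterior mass between A and B.
prob_in_range <- pnorm(B, post_mu, post_sd) - pnorm(A, post_mu, post_sd)
print(round(prob_in_range, 3))
```

A p-value or confidence interval does not answer this question; it describes how the procedure behaves over repeated samples instead.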
Side-by-side comparison
The simulation below draws one sample from a normal distribution (known σ = 2) and shows both the frequentist confidence interval and the Bayesian credible interval for μ; the right panel repeats the experiment 50 times. Watch how the two intervals differ, especially with small samples and informative priors.
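Both intervals in the app have closed forms. The app treats σ as known and places a normal prior on μ. The notation below is introduced here for illustration: $m_0$ and $s_0$ stand for the prior mean and SD (`prior_mu`, `prior_sd` in the code), $\bar{y}$ for the sample mean (`y_bar`), and $m_n$, $s_n$ for the posterior mean and SD.

```latex
\begin{aligned}
\text{Frequentist 95\% CI:}\quad & \bar{y} \pm 1.96\,\frac{\sigma}{\sqrt{n}} \\[4pt]
\text{Prior:}\quad & \mu \sim \mathcal{N}(m_0,\; s_0^2) \\[4pt]
\text{Posterior precision:}\quad & \frac{1}{s_n^2} = \frac{1}{s_0^2} + \frac{n}{\sigma^2} \\[4pt]
\text{Posterior mean:}\quad & m_n = s_n^2\left(\frac{m_0}{s_0^2} + \frac{n\,\bar{y}}{\sigma^2}\right) \\[4pt]
\text{Bayesian 95\% CrI:}\quad & m_n \pm 1.96\, s_n
\end{aligned}
```

The posterior mean is a precision-weighted average of the prior mean and the sample mean. Because the posterior precision adds a positive prior term to $n/\sigma^2$, $s_n$ is always smaller than $\sigma/\sqrt{n}$, so in this model the credible interval is never wider than the confidence interval.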
```{shinylive-r}
#| standalone: true
#| viewerHeight: 620
library(shiny)

ui <- fluidPage(
  tags$head(tags$style(HTML("
    .stats-box {
      background: #f0f4f8; border-radius: 6px; padding: 14px;
      margin-top: 12px; font-size: 14px; line-height: 1.9;
    }
    .stats-box b { color: #2c3e50; }
  "))),
  sidebarLayout(
    sidebarPanel(
      width = 3,
      sliderInput("true_mu", HTML("True μ:"),
                  min = -3, max = 3, value = 1, step = 0.5),
      sliderInput("n", "Sample size:",
                  min = 2, max = 200, value = 10, step = 1),
      sliderInput("prior_mu", "Bayesian prior mean:",
                  min = -3, max = 3, value = 0, step = 0.5),
      sliderInput("prior_sd", "Prior SD:",
                  min = 0.5, max = 10, value = 2, step = 0.5),
      actionButton("go", "New experiment", class = "btn-primary", width = "100%"),
      uiOutput("results")
    ),
    mainPanel(
      width = 9,
      fluidRow(
        column(6, plotOutput("interval_plot", height = "420px")),
        column(6, plotOutput("repeat_plot", height = "420px"))
      )
    )
  )
)

server <- function(input, output, session) {
  dat <- reactive({
    input$go
    true_mu <- input$true_mu
    n <- input$n
    prior_mu <- input$prior_mu
    prior_sd <- input$prior_sd
    sigma <- 2
    y <- rnorm(n, mean = true_mu, sd = sigma)
    y_bar <- mean(y)
    se <- sigma / sqrt(n)

    # Frequentist 95% CI
    freq_lo <- y_bar - 1.96 * se
    freq_hi <- y_bar + 1.96 * se

    # Bayesian posterior
    prior_prec <- 1 / prior_sd^2
    data_prec <- n / sigma^2
    post_prec <- prior_prec + data_prec
    post_sd <- 1 / sqrt(post_prec)
    post_mu <- (prior_prec * prior_mu + data_prec * y_bar) / post_prec
    bayes_lo <- qnorm(0.025, post_mu, post_sd)
    bayes_hi <- qnorm(0.975, post_mu, post_sd)

    # Repeated experiments for right panel
    k <- 50
    reps <- t(replicate(k, {
      yy <- rnorm(n, mean = true_mu, sd = sigma)
      yy_bar <- mean(yy)
      f_lo <- yy_bar - 1.96 * se
      f_hi <- yy_bar + 1.96 * se
      d_prec <- n / sigma^2
      p_prec <- prior_prec + d_prec
      p_sd <- 1 / sqrt(p_prec)
      p_mu <- (prior_prec * prior_mu + d_prec * yy_bar) / p_prec
      b_lo <- qnorm(0.025, p_mu, p_sd)
      b_hi <- qnorm(0.975, p_mu, p_sd)
      c(yy_bar, f_lo, f_hi, p_mu, b_lo, b_hi)
    }))

    list(true_mu = true_mu, y_bar = y_bar,
         freq_lo = freq_lo, freq_hi = freq_hi,
         post_mu = post_mu, post_sd = post_sd,
         bayes_lo = bayes_lo, bayes_hi = bayes_hi,
         reps = reps, prior_mu = prior_mu)
  })

  output$interval_plot <- renderPlot({
    d <- dat()
    par(mar = c(4.5, 8, 3, 1))
    xlim <- range(c(d$freq_lo, d$freq_hi, d$bayes_lo, d$bayes_hi, d$true_mu)) +
      c(-0.5, 0.5)
    plot(NULL, xlim = xlim, ylim = c(0.5, 2.5),
         yaxt = "n", ylab = "", xlab = expression(mu),
         main = "This Experiment")
    axis(2, at = 1:2, labels = c("Frequentist\n95% CI", "Bayesian\n95% CrI"),
         las = 1, cex.axis = 0.85)
    # Frequentist
    segments(d$freq_lo, 1, d$freq_hi, 1, lwd = 4, col = "#e74c3c")
    points(d$y_bar, 1, pch = 19, cex = 1.5, col = "#e74c3c")
    # Bayesian
    segments(d$bayes_lo, 2, d$bayes_hi, 2, lwd = 4, col = "#3498db")
    points(d$post_mu, 2, pch = 19, cex = 1.5, col = "#3498db")
    # True value
    abline(v = d$true_mu, lty = 2, lwd = 2, col = "#2c3e50")
    text(d$true_mu, 2.4, expression("True " * mu), cex = 0.9, col = "#2c3e50")
  })

  output$repeat_plot <- renderPlot({
    d <- dat()
    k <- nrow(d$reps)
    par(mar = c(4.5, 4, 3, 1))
    freq_covers <- d$reps[, 2] <= d$true_mu & d$reps[, 3] >= d$true_mu
    bayes_covers <- d$reps[, 5] <= d$true_mu & d$reps[, 6] >= d$true_mu
    xlim <- range(d$reps[, 2:6], d$true_mu) + c(-0.5, 0.5)
    plot(NULL, xlim = xlim, ylim = c(1, k),
         xlab = expression(mu), ylab = "Experiment #",
         main = paste0(k, " repeated experiments"))
    for (i in seq_len(k)) {
      # Frequentist (left-shifted slightly)
      clr_f <- if (freq_covers[i]) "#e74c3c" else "#e74c3c40"
      segments(d$reps[i, 2], i - 0.15, d$reps[i, 3], i - 0.15,
               lwd = 1.5, col = clr_f)
      # Bayesian (right-shifted slightly)
      clr_b <- if (bayes_covers[i]) "#3498db" else "#3498db40"
      segments(d$reps[i, 5], i + 0.15, d$reps[i, 6], i + 0.15,
               lwd = 1.5, col = clr_b)
    }
    abline(v = d$true_mu, lty = 2, lwd = 2, col = "#2c3e50")
    legend("topright", bty = "n", cex = 0.8,
           legend = c(
             paste0("Freq CI (", sum(freq_covers), "/", k, " cover)"),
             paste0("Bayes CrI (", sum(bayes_covers), "/", k, " cover)")
           ),
           col = c("#e74c3c", "#3498db"), lwd = 3)
  })

  output$results <- renderUI({
    d <- dat()
    tags$div(class = "stats-box",
      HTML(paste0(
        "<b>Frequentist:</b><br>",
        "Estimate: ", round(d$y_bar, 3), "<br>",
        "95% CI: [", round(d$freq_lo, 3), ", ", round(d$freq_hi, 3), "]<br>",
        "<hr style='margin:8px 0'>",
        "<b>Bayesian:</b><br>",
        "Posterior mean: ", round(d$post_mu, 3), "<br>",
        "95% CrI: [", round(d$bayes_lo, 3), ", ", round(d$bayes_hi, 3), "]<br>",
        "<hr style='margin:8px 0'>",
        "<small>CrI is narrower because the prior adds information.</small>"
      ))
    )
  })
}

shinyApp(ui, server)
```
Things to try
- n = 5, prior centered at 0, true mu = 1: the Bayesian CrI is narrower but pulled toward 0 (shrinkage). The frequentist CI is wider but centered on the data.
- n = 200: both intervals are nearly identical. With lots of data, the prior washes out and the two intervals agree numerically, even though their interpretations still differ.
- Set a wrong prior (prior mean = -3, true mu = 2, n = 5): the Bayesian interval gets pulled toward -3. A bad prior hurts with small samples. Slide n up — the data corrects it.
- Right panel: the frequentist CI is designed so that about 95% of the red intervals cover the truth across repetitions (with only 50 repetitions, expect the count to fluctuate around 47 or 48). The Bayesian CrI coverage depends on how good the prior is.
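The effects in these bullets can also be checked numerically, and the coverage of the credible interval can be computed exactly rather than simulated. The sketch below assumes the same model as the app (normal data, known σ = 2, normal prior). The helper names `normal_posterior` and `cri_coverage` are hypothetical, introduced only for this example, and the sample mean of 1 is a made-up value. The coverage function relies on the sample mean being distributed N(μ, σ²/n) and solves for the range of sample means whose credible interval contains μ.

```r
# Conjugate normal update with known sigma (same formulas as the app).
normal_posterior <- function(y_bar, n, sigma, prior_mu, prior_sd) {
  prior_prec <- 1 / prior_sd^2
  data_prec  <- n / sigma^2
  post_prec  <- prior_prec + data_prec
  list(
    mean   = (prior_prec * prior_mu + data_prec * y_bar) / post_prec,
    sd     = 1 / sqrt(post_prec),
    weight = data_prec / post_prec  # weight of the data in the posterior mean
  )
}

# Exact frequentist coverage of the 95% credible interval at a given true mu.
cri_coverage <- function(true_mu, n, sigma, prior_mu, prior_sd) {
  z <- qnorm(0.975)
  # Posterior sd and weight do not depend on y_bar, so 0 is a dummy value.
  post  <- normal_posterior(0, n, sigma, prior_mu, prior_sd)
  w     <- post$weight
  shift <- (1 - w) * prior_mu
  # The CrI covers true_mu iff |w * y_bar + shift - true_mu| <= z * post$sd.
  lo <- (true_mu - shift - z * post$sd) / w
  hi <- (true_mu - shift + z * post$sd) / w
  se <- sigma / sqrt(n)
  pnorm(hi, true_mu, se) - pnorm(lo, true_mu, se)
}

# Shrinkage: suppose the sample mean came out at 1 (hypothetical value).
small <- normal_posterior(y_bar = 1, n = 5,   sigma = 2, prior_mu = 0, prior_sd = 2)
large <- normal_posterior(y_bar = 1, n = 200, sigma = 2, prior_mu = 0, prior_sd = 2)
print(c(small$mean, large$mean))

# Coverage with a badly placed prior versus a well-placed one.
cov_bad   <- cri_coverage(true_mu = 2, n = 5,   sigma = 2, prior_mu = -3, prior_sd = 2)
cov_good  <- cri_coverage(true_mu = 2, n = 5,   sigma = 2, prior_mu = 2,  prior_sd = 2)
cov_bad_n <- cri_coverage(true_mu = 2, n = 200, sigma = 2, prior_mu = -3, prior_sd = 2)
print(round(c(cov_bad, cov_good, cov_bad_n), 3))
```

Under these assumptions the badly placed prior drops coverage well below 95% at n = 5, a prior centered on the truth pushes it above 95%, and at n = 200 the prior barely matters.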
The bottom line
| Use frequentist when… | Use Bayesian when… |
|---|---|
| You want procedure guarantees (coverage) | You want direct probability statements |
| You have no prior information | You have real prior knowledge |
| Regulatory/peer review expects it | Small samples, need to borrow strength |
| Simple problems | Complex hierarchical models |
In practice, most applied researchers use frequentist methods but interpret them like Bayesians. Understanding both helps you know what your numbers actually mean.