Monte Carlo Experiments: How We Understand Estimators

What is a Monte Carlo experiment?

Every claim in this course — unbiasedness, coverage, power — was verified the same way: by simulation. A Monte Carlo experiment is simple:

  1. Specify a data-generating process (DGP): decide the true parameter, the distribution, and the sample size.
  2. Draw a sample from the DGP and compute your estimator.
  3. Repeat many times (1,000–10,000 simulations).
  4. Summarize: look at the distribution of your estimates. Is the estimator centered on the truth (unbiased)? How spread out is it (variance)? How often does the CI cover the truth?

This is how statisticians check their own work. Theory says OLS is unbiased? Prove it: simulate 5,000 datasets and see whether the average estimate matches the true value (up to simulation noise). Theory says the CI has 95% coverage? Simulate it.

The Oracle View. Monte Carlo is pure Oracle territory. We specify the entire data-generating process — true parameter, distribution, sample size — then generate thousands of datasets to study how estimators behave. In practice, you get one dataset from an unknown DGP. Monte Carlo is how statisticians verify their tools work before you use them on real data.
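Before the interactive versions below, here is the four-step recipe as a minimal base R sketch. The Normal(5, 2) DGP, sample size, and simulation count are illustrative choices, not special ones:

```r
set.seed(1)
mu <- 5; sigma <- 2          # step 1: specify the DGP (we know the true parameter)
n <- 30; sims <- 5000

ests  <- numeric(sims)
cover <- logical(sims)
for (s in 1:sims) {
  x <- rnorm(n, mu, sigma)   # step 2: draw a sample from the DGP
  ests[s] <- mean(x)         #         and compute the estimator
  ci <- mean(x) + qt(c(0.025, 0.975), n - 1) * sd(x) / sqrt(n)  # 95% t-interval
  cover[s] <- ci[1] <= mu && mu <= ci[2]
}                            # step 3: repeat many times

mean(ests) - mu              # step 4: bias, should be near 0 (the mean is unbiased)
mean(cover)                  #         CI coverage, should be near 0.95
```

The same skeleton, with the estimator and DGP swapped out, is all the apps below do.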

Simulation 1: Compare estimators

Pick an estimator, pick a DGP, run \(N\) simulations. Compare bias, variance, and MSE (mean squared error = bias² + variance) across estimators.
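The identity MSE = bias² + variance can also be checked numerically. A sketch, using the sample median as an estimator of the Exponential(0.5) mean; the median actually targets the population median (ln 2 / 0.5 ≈ 1.39, not 2), so it is visibly biased here, which makes the decomposition interesting:

```r
set.seed(2)
mu <- 2                      # true mean of Exponential(rate = 0.5)
n <- 25; sims <- 4000
est <- replicate(sims, median(rexp(n, rate = 0.5)))

bias <- mean(est) - mu       # negative: the median aims at ~1.39, not at 2
mse_direct <- mean((est - mu)^2)
mse_decomp <- bias^2 + var(est)

c(direct = mse_direct, decomposed = mse_decomp)
# the two agree up to the sims/(sims - 1) factor hidden in var()
```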

#| standalone: true
#| viewerHeight: 1050

library(shiny)

ui <- fluidPage(
  tags$head(tags$style(HTML("
    .stats-box {
      background: #f0f4f8; border-radius: 6px; padding: 14px;
      margin-top: 12px; font-size: 14px; line-height: 1.9;
    }
    .stats-box b { color: #2c3e50; }
  "))),

  sidebarLayout(
    sidebarPanel(
      width = 4,

      selectInput("dgp", "DGP:",
                  choices = c("Normal(5, 2)",
                              "Exponential(0.5)",
                              "Contaminated normal",
                              "Uniform(0, 10)")),

      sliderInput("n", "Sample size:",
                  min = 5, max = 200, value = 20, step = 5),

      sliderInput("sims", "Simulations:",
                  min = 500, max = 5000, value = 2000, step = 500),

      checkboxGroupInput("estimators", "Estimators:",
                         choices = c("Mean", "Median",
                                     "Trimmed mean",
                                     "Midrange"),
                         selected = c("Mean", "Median",
                                      "Trimmed mean")),

      actionButton("go", "Run Monte Carlo",
                   class = "btn-primary", width = "100%"),

      uiOutput("results")
    ),

    mainPanel(
      width = 8,
      plotOutput("dgp_plot", height = "250px"),
      plotOutput("hist_plot", height = "400px"),
      plotOutput("mse_plot", height = "300px")
    )
  )
)

server <- function(input, output, session) {

  draw_dgp <- function(n, dgp) {
    switch(dgp,
      "Normal(5, 2)"          = rnorm(n, mean = 5, sd = 2),
      "Exponential(0.5)" = rexp(n, rate = 0.5),
      "Contaminated normal"   = {
        k <- rbinom(n, 1, 0.1)
        (1 - k) * rnorm(n, 5, 1) + k * rnorm(n, 5, 10)
      },
      "Uniform(0, 10)"        = runif(n, 0, 10)
    )
  }

  true_mu <- function(dgp) {
    switch(dgp,
      "Normal(5, 2)"          = 5,
      "Exponential(0.5)" = 2,
      "Contaminated normal"   = 5,
      "Uniform(0, 10)"        = 5
    )
  }

  compute_est <- function(x, est) {
    switch(est,
      "Mean"              = mean(x),
      "Median"            = median(x),
      "Trimmed mean" = mean(x, trim = 0.1),
      "Midrange"          = (min(x) + max(x)) / 2
    )
  }

  # Re-run only when the button is pressed; ignoreNULL = FALSE runs it once at startup
  dat <- eventReactive(input$go, ignoreNULL = FALSE, {
    n    <- input$n
    sims <- input$sims
    dgp  <- input$dgp
    ests <- input$estimators
    mu   <- true_mu(dgp)

    if (length(ests) == 0) ests <- "Mean"

    results <- list()
    for (est in ests) {
      vals <- replicate(sims, compute_est(draw_dgp(n, dgp), est))
      bias <- mean(vals) - mu
      variance <- var(vals)
      mse <- bias^2 + variance
      results[[est]] <- list(vals = vals, bias = bias,
                             variance = variance, mse = mse)
    }

    list(results = results, mu = mu, n = n, sims = sims,
         dgp = dgp, ests = ests)
  })

  output$dgp_plot <- renderPlot({
    d <- dat()
    par(mar = c(4, 5, 3, 2))

    big <- draw_dgp(10000, d$dgp)
    hist(big, breaks = 60, probability = TRUE,
         col = "#e0e0e0", border = "#b0b0b0",
         main = paste("DGP:", d$dgp),
         xlab = "", ylab = "Density")
    abline(v = d$mu, lty = 2, lwd = 2, col = "#e74c3c")
  })

  output$hist_plot <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))

    cols <- c("#3498db", "#e74c3c", "#27ae60", "#9b59b6")
    all_vals <- unlist(lapply(d$results, function(r) r$vals))
    xlim <- quantile(all_vals, c(0.005, 0.995))

    first <- TRUE
    for (i in seq_along(d$ests)) {
      vals <- d$results[[d$ests[i]]]$vals
      if (first) {
        hist(vals, breaks = 50, probability = TRUE,
             col = adjustcolor(cols[i], 0.3),
             border = adjustcolor(cols[i], 0.6),
             main = paste0("Sampling distributions (",
                          d$sims, " simulations, n = ", d$n, ")"),
             xlab = "Estimate", ylab = "Density",
             xlim = xlim)
        first <- FALSE
      } else {
        hist(vals, breaks = 50, probability = TRUE,
             col = adjustcolor(cols[i], 0.2),
             border = adjustcolor(cols[i], 0.5),
             add = TRUE)
      }
    }

    abline(v = d$mu, lty = 2, lwd = 2.5, col = "#2c3e50")

    legend("topright", bty = "n", cex = 0.8,
           legend = c(d$ests, paste0("True \u03bc = ", d$mu)),
           col = c(cols[seq_along(d$ests)], "#2c3e50"),
           lwd = c(rep(8, length(d$ests)), 2.5),
           lty = c(rep(1, length(d$ests)), 2))
  })

  output$mse_plot <- renderPlot({
    d <- dat()
    par(mar = c(5, 10, 4, 2))

    mses <- sapply(d$results, function(r) r$mse)
    biases2 <- sapply(d$results, function(r) r$bias^2)
    vars <- sapply(d$results, function(r) r$variance)

    # Stacked bar: bias^2 + variance = MSE
    mat <- rbind(biases2, vars)

    barplot(mat, names.arg = d$ests, col = c("#e74c3c80", "#3498db80"),
                  horiz = TRUE, las = 1,
                  main = "MSE decomposition",
                  xlab = "MSE = Bias\u00b2 + Variance",
                  border = NA, cex.names = 0.75)

    legend("bottomright", bty = "n", cex = 0.8,
           legend = c("Bias\u00b2", "Variance"),
           fill = c("#e74c3c80", "#3498db80"), border = NA)
  })

  output$results <- renderUI({
    d <- dat()
    lines <- paste0(sapply(d$ests, function(est) {
      r <- d$results[[est]]
      paste0(
        "<b>", est, ":</b><br>",
        "&nbsp; Bias: ", round(r$bias, 4), "<br>",
        "&nbsp; Var: ", round(r$variance, 4), "<br>",
        "&nbsp; MSE: ", round(r$mse, 4), "<br>"
      )
    }), collapse = "<hr style='margin:6px 0'>")

    tags$div(class = "stats-box", HTML(lines))
  })
}

shinyApp(ui, server)

Simulation 2: Consistency — estimates tighten with n

A consistent estimator converges to the true parameter as \(n \to \infty\). Watch how the sampling distribution of each estimator gets narrower and more centered as you increase the sample size.
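The same shrinkage can be seen without the app. A sketch (the Exponential(0.5) DGP and the specific n values are illustrative): the standard deviation of the sample mean behaves like σ/√n, so quadrupling n should roughly halve the spread.

```r
set.seed(3)
sims <- 2000
ns <- c(10, 40, 160)
sds <- sapply(ns, function(n) sd(replicate(sims, mean(rexp(n, rate = 0.5)))))

# Exponential(0.5) has sigma = 2, so theory predicts SD = 2 / sqrt(n)
data.frame(n = ns, sd_of_mean = round(sds, 3), theory = round(2 / sqrt(ns), 3))
```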

#| standalone: true
#| viewerHeight: 650

library(shiny)

ui <- fluidPage(
  tags$head(tags$style(HTML("
    .stats-box {
      background: #f0f4f8; border-radius: 6px; padding: 14px;
      margin-top: 12px; font-size: 14px; line-height: 1.9;
    }
    .stats-box b { color: #2c3e50; }
  "))),

  sidebarLayout(
    sidebarPanel(
      width = 4,

      selectInput("dgp2", "DGP:",
                  choices = c("Normal(5, 2)",
                              "Exponential(0.5)",
                              "Contaminated normal")),

      selectInput("est2", "Estimator:",
                  choices = c("Mean", "Median", "Trimmed mean")),

      sliderInput("sims2", "Simulations per n:",
                  min = 500, max = 3000, value = 1000, step = 250),

      actionButton("go2", "Run",
                   class = "btn-primary", width = "100%"),

      uiOutput("results2")
    ),

    mainPanel(
      width = 8,
      plotOutput("consistency_plot", height = "530px")
    )
  )
)

server <- function(input, output, session) {

  draw_dgp <- function(n, dgp) {
    switch(dgp,
      "Normal(5, 2)"          = rnorm(n, mean = 5, sd = 2),
      "Exponential(0.5)" = rexp(n, rate = 0.5),
      "Contaminated normal"   = {
        k <- rbinom(n, 1, 0.1)
        (1 - k) * rnorm(n, 5, 1) + k * rnorm(n, 5, 10)
      }
    )
  }

  true_mu <- function(dgp) {
    switch(dgp,
      "Normal(5, 2)"          = 5,
      "Exponential(0.5)" = 2,
      "Contaminated normal"   = 5
    )
  }

  compute_est <- function(x, est) {
    switch(est,
      "Mean"             = mean(x),
      "Median"           = median(x),
      "Trimmed mean" = mean(x, trim = 0.1)
    )
  }

  # Re-run only when the button is pressed; ignoreNULL = FALSE runs it once at startup
  dat <- eventReactive(input$go2, ignoreNULL = FALSE, {
    dgp  <- input$dgp2
    est  <- input$est2
    sims <- input$sims2
    mu   <- true_mu(dgp)

    ns <- c(5, 10, 25, 50, 100, 200)
    all_results <- list()
    for (n in ns) {
      all_results[[as.character(n)]] <- replicate(
        sims, compute_est(draw_dgp(n, dgp), est)
      )
    }

    list(all_results = all_results, ns = ns, mu = mu,
         dgp = dgp, est = est, sims = sims)
  })

  output$consistency_plot <- renderPlot({
    d <- dat()
    par(mar = c(5, 6, 4, 2))

    colors <- colorRampPalette(c("#e74c3c", "#3498db"))(length(d$ns))

    all_vals <- unlist(d$all_results)
    xlim <- quantile(all_vals, c(0.01, 0.99))

    plot(NULL, xlim = xlim,
         ylim = c(0.5, length(d$ns) * 1.5 + 0.5),
         yaxt = "n", ylab = "",
         xlab = paste0("Estimate (", d$est, ")"),
         main = paste0("Consistency: ", d$est,
                       " under ", d$dgp))
    positions <- seq_along(d$ns) * 1.5
    axis(2, at = positions, labels = paste0("n=", d$ns),
         las = 1, cex.axis = 0.9)

    for (i in seq_along(d$ns)) {
      vals <- d$all_results[[i]]
      dens <- density(vals)
      dens$y <- dens$y / max(dens$y) * 0.65
      polygon(dens$x, dens$y + positions[i],
              col = adjustcolor(colors[i], 0.4),
              border = colors[i], lwd = 1.5)
    }

    abline(v = d$mu, lty = 2, lwd = 2.5, col = "#2c3e50")
    text(d$mu, max(positions) + 0.8,
         paste0("True \u03bc = ", d$mu), cex = 0.9)
  })

  output$results2 <- renderUI({
    d <- dat()
    lines <- sapply(seq_along(d$ns), function(i) {
      vals <- d$all_results[[i]]
      paste0("n=", d$ns[i], ": SD = ", round(sd(vals), 4))
    })

    tags$div(class = "stats-box",
      HTML(paste0(
        "<b>Spread shrinks with n:</b><br>",
        paste(lines, collapse = "<br>")
      ))
    )
  })
}

shinyApp(ui, server)

Things to try

  • Normal DGP, Mean vs Median: the mean has smaller variance (it’s the efficient estimator for normal data). The median is less efficient but still unbiased.
  • Contaminated normal, Mean vs Median: now the median wins. The outliers from the contamination inflate the variance of the mean, but barely affect the median. This is why robust estimators exist.
  • 10% trimmed mean: a compromise: it drops the smallest 10% and the largest 10% of observations before averaging (that is what trim = 0.1 means). It handles outliers better than the mean, with less variance than the median. Often the best of both worlds.
  • Midrange: the average of the min and max. Under Uniform(0, 10) it is extremely efficient (its variance shrinks like 1/n² rather than 1/n), but it is terrible for heavy-tailed or skewed data, where a single extreme observation drags it around. The MSE plot shows exactly why.
  • Consistency (Sim 2): all three estimators converge to \(\mu\) as \(n\) grows. The ridgeline plot gets tighter and more centered with each step.
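The contaminated-normal bullet above can be checked head-to-head in a few lines. A sketch using the same 10% contamination scheme as the apps (constants illustrative):

```r
set.seed(4)
rcontam <- function(n) {                 # 90% N(5, 1), 10% N(5, 10)
  k <- rbinom(n, 1, 0.1)
  (1 - k) * rnorm(n, 5, 1) + k * rnorm(n, 5, 10)
}
sims <- 3000; n <- 30; mu <- 5
mse <- function(f) mean((replicate(sims, f(rcontam(n))) - mu)^2)

res <- c(mean = mse(mean), median = mse(median))
res                                      # the median's MSE is much smaller here
```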

The bottom line

  • Monte Carlo is the universal verification tool in statistics. If you claim an estimator has some property (unbiased, consistent, 95% coverage), you can check it by simulation.
  • Bias = does the estimator aim at the right target? Variance = how spread out are the estimates? MSE = bias² + variance = overall accuracy.
  • Sometimes a biased estimator beats an unbiased one if it has much lower variance. This is the bias-variance tradeoff in miniature.
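The last bullet can be made concrete with a shrinkage example. A sketch; the 0.7 shrinkage factor is an arbitrary illustrative choice, not an optimal one:

```r
set.seed(5)
mu <- 1; sigma <- 2; n <- 10; sims <- 5000
xbar   <- replicate(sims, mean(rnorm(n, mu, sigma)))
shrunk <- 0.7 * xbar                     # biased toward 0, but lower variance

res <- c(mse_unbiased = mean((xbar - mu)^2),    # theory: sigma^2 / n = 0.4
         mse_shrunk   = mean((shrunk - mu)^2))  # theory: (0.3 * mu)^2 + 0.49 * sigma^2 / n
res
```

Here the shrunken estimator's extra bias² (0.09) is more than paid for by its reduced variance, so its MSE comes out lower.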

Did you know?

  • The Monte Carlo method was developed by Stanislaw Ulam and John von Neumann at Los Alamos in the late 1940s. Ulam was recovering from brain surgery and playing solitaire when he realized that it was easier to estimate the probability of winning by playing many random games than by computing it analytically. He and von Neumann applied this idea to neutron-diffusion simulations in nuclear physics.
  • The name “Monte Carlo” was suggested by Nicholas Metropolis, a colleague, as a reference to the Monte Carlo Casino in Monaco, a nod to the role of randomness. The earliest applications were classified weapons work, but the method itself was described openly by Metropolis and Ulam in a 1949 paper.
  • Today, Monte Carlo methods are everywhere: pricing financial derivatives, training AI models, simulating protein folding, predicting elections, and rendering 3D graphics in movies. The basic idea — replace hard math with random simulation — is one of the most powerful tricks in computational science.