Frisch-Waugh-Lovell Theorem

The Frisch-Waugh-Lovell (FWL) theorem states that the coefficient on \(X_1\) in \(Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon\) is identical to the slope from regressing the residualized \(Y\) on the residualized \(X_1\), after partialling \(X_2\) out of both.

Drag the sliders to see it hold for any data-generating process (DGP).

The Oracle View. In these simulations we set the true \(\beta_1\), \(\beta_2\), and the correlation between \(X_1\) and \(X_2\), so we can verify that FWL gives the same answer as the full regression. In practice you don’t know the true coefficients, but FWL holds mechanically regardless, which is why it’s useful for understanding what “controlling for” actually does.
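If you’d rather check the theorem outside the app, here is a minimal standalone sketch (plain R; the seed, coefficients, and correlation are arbitrary illustrative choices):

set.seed(42)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)             # x1 correlated with x2
y  <- 1.5 * x1 - 1.0 * x2 + rnorm(n)  # true beta1 = 1.5

full <- lm(y ~ x1 + x2)    # full regression
ey   <- resid(lm(y  ~ x2)) # step 1: partial x2 out of y
ex   <- resid(lm(x1 ~ x2)) # step 2: partial x2 out of x1
fwl  <- lm(ey ~ ex)        # step 3: residuals on residuals

coef(full)["x1"] - coef(fwl)["ex"]  # zero up to floating point

The agreement is exact in every sample, not just asymptotically: FWL is an algebraic identity of least squares, which is why it holds on every draw in the app.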

#| standalone: true
#| viewerHeight: 900

library(shiny)

ui <- fluidPage(
  tags$head(tags$style(HTML("
    .eq-box {
      background: #f0f4f8; border-radius: 6px; padding: 14px;
      margin-bottom: 14px; font-size: 14px; line-height: 1.9;
    }
    .eq-box b { color: #2c3e50; }
    .match  { color: #27ae60; font-weight: bold; }
    .coef   { color: #e74c3c; font-weight: bold; }
  "))),

  sidebarLayout(
    sidebarPanel(
      width = 4,

      sliderInput("n", "Sample size:",
                  min = 50, max = 500, value = 200, step = 50),

      sliderInput("b1", HTML("True &beta;<sub>1</sub>:"),
                  min = -3, max = 3, value = 1.5, step = 0.1),

      sliderInput("b2", HTML("True &beta;<sub>2</sub>:"),
                  min = -3, max = 3, value = -1, step = 0.1),

      sliderInput("rho", HTML("Corr(X<sub>1</sub>, X<sub>2</sub>):"),
                  min = -0.9, max = 0.9, value = 0.6, step = 0.1),

      sliderInput("sigma", HTML("Error SD (&sigma;):"),
                  min = 0.5, max = 5, value = 1, step = 0.5),

      actionButton("resim", "New draw", class = "btn-primary", width = "100%"),

      uiOutput("results_box")
    ),

    mainPanel(
      width = 8,
      plotOutput("plot_full",    height = "350px"),
      fluidRow(
        column(6, plotOutput("plot_partial", height = "350px")),
        column(6, plotOutput("plot_fwl",     height = "350px"))
      ),
      uiOutput("step_text")
    )
  )
)

server <- function(input, output, session) {

  dat <- reactive({
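    # referencing the button creates a reactive dependency, so clicking
    # "New draw" re-runs the simulation even when no slider has changed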
    input$resim
    n     <- input$n
    b1    <- input$b1
    b2    <- input$b2
    rho   <- input$rho
    sigma <- input$sigma

    # Generate correlated X1, X2
    z1 <- rnorm(n)
    z2 <- rnorm(n)
    x1 <- z1
    x2 <- rho * z1 + sqrt(1 - rho^2) * z2
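    # z1, z2 are iid N(0,1), so x1 and x2 are standard normal with Corr(x1, x2) = rho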

    eps <- rnorm(n, sd = sigma)
    y   <- b1 * x1 + b2 * x2 + eps

    # Full OLS
    full_fit <- lm(y ~ x1 + x2)
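    # lm() adds an intercept here and in each residualising regression below;
    # FWL requires the constant to be partialled out along with x2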

    # FWL steps
    ey <- resid(lm(y  ~ x2))   # residualise Y on X2
    ex <- resid(lm(x1 ~ x2))   # residualise X1 on X2
    fwl_fit <- lm(ey ~ ex)
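    # by FWL, coef(fwl_fit)["ex"] equals coef(full_fit)["x1"] up to floating point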

    list(x1 = x1, x2 = x2, y = y,
         ey = ey, ex = ex,
         full_fit = full_fit, fwl_fit = fwl_fit,
         b1 = b1, b2 = b2)
  })

  # --- Plot 1: Y vs X1 (naive scatter) ---
  output$plot_full <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$x1, d$y, pch = 16, col = "#3498db80", cex = 0.8,
         xlab = expression(X[1]), ylab = "Y",
         main = expression("Y vs " * X[1] * " (raw)"))
    naive_fit <- lm(d$y ~ d$x1)   # fit once, reuse for line and legend
    abline(naive_fit, col = "#e74c3c", lwd = 2.5)
    naive_b <- round(coef(naive_fit)[2], 4)
    legend("topleft", bty = "n", cex = 0.9,
           legend = paste("Naive slope =", naive_b))
  })

  # --- Plot 2: residualised X1 (partial out X2) ---
  output$plot_partial <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$x1, d$ex, pch = 16, col = "#9b59b680", cex = 0.8,
         xlab = expression(X[1]),
         ylab = expression(e[X[1]]),
         main = expression("Residualise " * X[1] * " on " * X[2]))
    abline(h = 0, lty = 2, col = "gray50")
    abline(lm(d$ex ~ d$x1), col = "#8e44ad", lwd = 2)
    legend("topleft", bty = "n", cex = 0.85,
           legend = expression("Variation in " * X[1] * " independent of " * X[2]))
  })

  # --- Plot 3: FWL regression ---
  output$plot_fwl <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$ex, d$ey, pch = 16, col = "#2ecc7180", cex = 0.8,
         xlab = expression(e[X[1]]),
         ylab = expression(e[Y]),
         main = "FWL: Residual Y vs Residual X1")
    abline(d$fwl_fit, col = "#e74c3c", lwd = 2.5)
    fwl_b <- round(coef(d$fwl_fit)[2], 4)
    legend("topleft", bty = "n", cex = 0.9,
           legend = paste("FWL slope =", fwl_b))
  })

  # --- Results comparison ---
  output$results_box <- renderUI({
    d <- dat()
    full_b1 <- round(coef(d$full_fit)["x1"], 4)
    fwl_b1  <- round(coef(d$fwl_fit)[2], 4)
    naive_b <- round(coef(lm(d$y ~ d$x1))[2], 4)

    tags$div(class = "eq-box", style = "margin-top: 16px;",
      HTML(paste0(
        "<b>True &beta;<sub>1</sub>:</b> ", d$b1, "<br>",
        "<b>Full OLS &beta;<sub>1</sub>:</b> <span class='coef'>", full_b1, "</span><br>",
        "<b>FWL &beta;<sub>1</sub>:</b> <span class='coef'>", fwl_b1, "</span><br>",
        "<span class='match'>&#10003; They match!</span><br><br>",
        "<b>Naive slope:</b> ", naive_b, "<br>",
        "<small>(biased by omitting X<sub>2</sub>)</small>"
      ))
    )
  })

  # --- Step explanation ---
  output$step_text <- renderUI({
    tags$div(class = "eq-box", style = "margin-top: 8px;",
      HTML(paste0(
        "<b>Steps:</b> ",
        "(1) Regress Y on X<sub>2</sub> &rarr; residuals <i>e<sub>Y</sub></i> &nbsp;|&nbsp; ",
        "(2) Regress X<sub>1</sub> on X<sub>2</sub> &rarr; residuals <i>e<sub>X₁</sub></i> &nbsp;|&nbsp; ",
        "(3) Regress <i>e<sub>Y</sub></i> on <i>e<sub>X₁</sub></i> &rarr; ",
        "slope = &beta;<sub>1</sub> from full regression"
      ))
    )
  })
}

shinyApp(ui, server)
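Why is the naive slope biased? In this DGP both regressors have unit variance, so the slope from regressing \(Y\) on \(X_1\) alone converges to \(\beta_1 + \rho\,\beta_2\) rather than \(\beta_1\); the omitted-variable bias is \(\rho\,\beta_2\). Set \(\rho = 0\) in the app and the naive, full OLS, and FWL estimates all land near the true \(\beta_1\).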

Did you know?

  • Ragnar Frisch and Jan Tinbergen shared the very first Nobel Memorial Prize in Economic Sciences in 1969. Frisch coined the term “econometrics” and is usually credited with “microeconomics” and “macroeconomics” as well. The FWL theorem appeared in Frisch & Waugh (1933).
  • Michael Lovell extended the result in 1963, showing it applies to any partitioned regression, not just the two-variable case; that’s why it’s FWL, not just FW. (The general matrix form is given after this list.)
  • FWL is the theoretical foundation behind “partialling out” and “controlling for” variables. Every time you add a control to a regression, you’re implicitly doing the residualization that FWL describes.
  • In machine learning, the same idea appears as “residualization” in double/debiased ML (Chernozhukov et al., 2018) — one of the most important recent developments in causal ML.
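For reference, the general partitioned form behind Lovell’s extension: write \(Y = X_1\beta_1 + X_2\beta_2 + \varepsilon\), where \(X_1\) and \(X_2\) are now blocks of regressors, and let \(M_2 = I - X_2(X_2'X_2)^{-1}X_2'\) be the annihilator matrix of \(X_2\). The OLS estimate of the \(X_1\) block is

\[ \hat{\beta}_1 = (X_1' M_2 X_1)^{-1} X_1' M_2 Y, \]

which is exactly the regression of the residualized \(M_2 Y\) on the residualized \(M_2 X_1\), since \(M_2\) is symmetric and idempotent.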