Frisch-Waugh-Lovell Theorem

The Frisch-Waugh-Lovell (FWL) theorem states that the coefficient on \(X_1\) in \(Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon\) is identical to the slope from regressing the residualized \(Y\) on the residualized \(X_1\), after partialling \(X_2\) out of both.

Drag the sliders to see it hold for any data-generating process (DGP).

The Oracle View. In these simulations we set the true \(\beta_1\), \(\beta_2\), and the correlation between \(X_1\) and \(X_2\), so we can verify that FWL gives the same answer as the full regression. In practice you don’t know the true coefficients, but FWL holds mechanically regardless, which is why it’s useful for understanding what “controlling for” actually does.
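If you’d rather check the theorem outside the app, here is a minimal standalone sketch (plain R; the seed, coefficients, and correlation are arbitrary illustrative choices):

set.seed(42)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)             # x1 correlated with x2
y  <- 1.5 * x1 - 1.0 * x2 + rnorm(n)  # true beta1 = 1.5

full <- lm(y ~ x1 + x2)    # full regression
ey   <- resid(lm(y  ~ x2)) # step 1: partial x2 out of y
ex   <- resid(lm(x1 ~ x2)) # step 2: partial x2 out of x1
fwl  <- lm(ey ~ ex)        # step 3: residuals on residuals

coef(full)["x1"] - coef(fwl)["ex"]  # zero up to floating point

The agreement is exact in every sample, not just asymptotically: FWL is an algebraic identity of least squares, which is why it holds on every draw in the app.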

#| standalone: true
#| viewerHeight: 900

library(shiny)

ui <- fluidPage(
  tags$head(tags$style(HTML("
    .eq-box {
      background: #f0f4f8; border-radius: 6px; padding: 14px;
      margin-bottom: 14px; font-size: 14px; line-height: 1.9;
    }
    .eq-box b { color: #2c3e50; }
    .match  { color: #27ae60; font-weight: bold; }
    .coef   { color: #e74c3c; font-weight: bold; }
  "))),

  sidebarLayout(
    sidebarPanel(
      width = 4,

      sliderInput("n", "Sample size:",
                  min = 50, max = 500, value = 200, step = 50),

      sliderInput("b1", HTML("True &beta;<sub>1</sub>:"),
                  min = -3, max = 3, value = 1.5, step = 0.1),

      sliderInput("b2", HTML("True &beta;<sub>2</sub>:"),
                  min = -3, max = 3, value = -1, step = 0.1),

      sliderInput("rho", HTML("Corr(X<sub>1</sub>, X<sub>2</sub>):"),
                  min = -0.9, max = 0.9, value = 0.6, step = 0.1),

      sliderInput("sigma", HTML("Error SD (&sigma;):"),
                  min = 0.5, max = 5, value = 1, step = 0.5),

      actionButton("resim", "New draw", class = "btn-primary", width = "100%"),

      uiOutput("results_box")
    ),

    mainPanel(
      width = 8,
      plotOutput("plot_full",    height = "350px"),
      fluidRow(
        column(6, plotOutput("plot_partial", height = "350px")),
        column(6, plotOutput("plot_fwl",     height = "350px"))
      ),
      uiOutput("step_text")
    )
  )
)

server <- function(input, output, session) {

  dat <- reactive({
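    # referencing the button creates a reactive dependency, so clicking
    # "New draw" re-runs the simulation even when no slider has changed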
    input$resim
    n     <- input$n
    b1    <- input$b1
    b2    <- input$b2
    rho   <- input$rho
    sigma <- input$sigma

    # Generate correlated X1, X2
    z1 <- rnorm(n)
    z2 <- rnorm(n)
    x1 <- z1
    x2 <- rho * z1 + sqrt(1 - rho^2) * z2
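    # z1, z2 are iid N(0,1), so x1 and x2 are standard normal with Corr(x1, x2) = rho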

    eps <- rnorm(n, sd = sigma)
    y   <- b1 * x1 + b2 * x2 + eps

    # Full OLS
    full_fit <- lm(y ~ x1 + x2)
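    # lm() adds an intercept here and in each residualising regression below;
    # FWL requires the constant to be partialled out along with x2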

    # FWL steps
    ey <- resid(lm(y  ~ x2))   # residualise Y on X2
    ex <- resid(lm(x1 ~ x2))   # residualise X1 on X2
    fwl_fit <- lm(ey ~ ex)
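    # by FWL, coef(fwl_fit)["ex"] equals coef(full_fit)["x1"] up to floating point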

    list(x1 = x1, x2 = x2, y = y,
         ey = ey, ex = ex,
         full_fit = full_fit, fwl_fit = fwl_fit,
         b1 = b1, b2 = b2)
  })

  # --- Plot 1: Y vs X1 (naive scatter) ---
  output$plot_full <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$x1, d$y, pch = 16, col = "#3498db80", cex = 0.8,
         xlab = expression(X[1]), ylab = "Y",
         main = expression("Y vs " * X[1] * " (raw)"))
    naive_fit <- lm(d$y ~ d$x1)   # fit once, reuse for line and legend
    abline(naive_fit, col = "#e74c3c", lwd = 2.5)
    naive_b <- round(coef(naive_fit)[2], 4)
    legend("topleft", bty = "n", cex = 0.9,
           legend = paste("Naive slope =", naive_b))
  })

  # --- Plot 2: residualised X1 (partial out X2) ---
  output$plot_partial <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$x1, d$ex, pch = 16, col = "#9b59b680", cex = 0.8,
         xlab = expression(X[1]),
         ylab = expression(e[X[1]]),
         main = expression("Residualise " * X[1] * " on " * X[2]))
    abline(h = 0, lty = 2, col = "gray50")
    abline(lm(d$ex ~ d$x1), col = "#8e44ad", lwd = 2)
    legend("topleft", bty = "n", cex = 0.85,
           legend = expression("Variation in " * X[1] * " independent of " * X[2]))
  })

  # --- Plot 3: FWL regression ---
  output$plot_fwl <- renderPlot({
    d <- dat()
    par(mar = c(5, 5, 4, 2))
    plot(d$ex, d$ey, pch = 16, col = "#2ecc7180", cex = 0.8,
         xlab = expression(e[X[1]]),
         ylab = expression(e[Y]),
         main = "FWL: Residual Y vs Residual X1")
    abline(d$fwl_fit, col = "#e74c3c", lwd = 2.5)
    fwl_b <- round(coef(d$fwl_fit)[2], 4)
    legend("topleft", bty = "n", cex = 0.9,
           legend = paste("FWL slope =", fwl_b))
  })

  # --- Results comparison ---
  output$results_box <- renderUI({
    d <- dat()
    full_b1 <- round(coef(d$full_fit)["x1"], 4)
    fwl_b1  <- round(coef(d$fwl_fit)[2], 4)
    naive_b <- round(coef(lm(d$y ~ d$x1))[2], 4)

    tags$div(class = "eq-box", style = "margin-top: 16px;",
      HTML(paste0(
        "<b>True &beta;<sub>1</sub>:</b> ", d$b1, "<br>",
        "<b>Full OLS &beta;<sub>1</sub>:</b> <span class='coef'>", full_b1, "</span><br>",
        "<b>FWL &beta;<sub>1</sub>:</b> <span class='coef'>", fwl_b1, "</span><br>",
        "<span class='match'>&#10003; They match!</span><br><br>",
        "<b>Naive slope:</b> ", naive_b, "<br>",
        "<small>(biased by omitting X<sub>2</sub>)</small>"
      ))
    )
  })

  # --- Step explanation ---
  output$step_text <- renderUI({
    tags$div(class = "eq-box", style = "margin-top: 8px;",
      HTML(paste0(
        "<b>Steps:</b> ",
        "(1) Regress Y on X<sub>2</sub> &rarr; residuals <i>e<sub>Y</sub></i> &nbsp;|&nbsp; ",
        "(2) Regress X<sub>1</sub> on X<sub>2</sub> &rarr; residuals <i>e<sub>X₁</sub></i> &nbsp;|&nbsp; ",
        "(3) Regress <i>e<sub>Y</sub></i> on <i>e<sub>X₁</sub></i> &rarr; ",
        "slope = &beta;<sub>1</sub> from full regression"
      ))
    )
  })
}

shinyApp(ui, server)
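Why is the naive slope biased? In this DGP both regressors have unit variance, so the slope from regressing \(Y\) on \(X_1\) alone converges to \(\beta_1 + \rho\,\beta_2\) rather than \(\beta_1\); the omitted-variable bias is \(\rho\,\beta_2\). Set \(\rho = 0\) in the app and the naive, full OLS, and FWL estimates all land near the true \(\beta_1\).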

Did you know?

  • Ragnar Frisch and Jan Tinbergen shared the very first Nobel Memorial Prize in Economic Sciences in 1969. Frisch coined the term “econometrics” and is usually credited with “microeconomics” and “macroeconomics” as well. The FWL theorem appeared in Frisch & Waugh (1933).
  • Michael Lovell extended the result in 1963, showing it applies to any partitioned regression, not just the two-variable case; that’s why it’s FWL, not just FW. (The general matrix form is given after this list.)
  • FWL is the theoretical foundation behind “partialling out” and “controlling for” variables. Every time you add a control to a regression, you’re implicitly doing the residualization that FWL describes.
  • In machine learning, the same idea appears as “residualization” in double/debiased ML (Chernozhukov et al., 2018) — one of the most important recent developments in causal ML.
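For reference, the general partitioned form behind Lovell’s extension: write \(Y = X_1\beta_1 + X_2\beta_2 + \varepsilon\), where \(X_1\) and \(X_2\) are now blocks of regressors, and let \(M_2 = I - X_2(X_2'X_2)^{-1}X_2'\) be the annihilator matrix of \(X_2\). The OLS estimate of the \(X_1\) block is

\[ \hat{\beta}_1 = (X_1' M_2 X_1)^{-1} X_1' M_2 Y, \]

which is exactly the regression of the residualized \(M_2 Y\) on the residualized \(M_2 X_1\), since \(M_2\) is symmetric and idempotent.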