# Entropy Balancing
## The problem with IPW
IPW relies on correctly specifying the propensity score model. If the model is wrong, the weights are wrong and the estimate is biased. Even when the model is right, propensity scores near 0 or 1 produce extreme weights and noisy estimates.
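To see why that matters numerically: the IPW weight for a treated unit is \(1/\hat{p}\), so a propensity score of 0.001 hands a single unit the influence of a thousand observations. A toy illustration (not from the app below):

```r
# The IPW weight 1 / ps explodes as the propensity score approaches 0
ps <- c(0.5, 0.1, 0.01, 0.001)
1 / ps  # 2 10 100 1000: one unlucky unit can dominate the whole estimate
```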
## Entropy balancing: a different approach
Entropy balancing (Hainmueller, 2012) skips the propensity score entirely. Instead, it directly finds weights for the control group that make the covariate distributions exactly match the treated group on specified moments (mean, variance, skewness).
The weights are chosen to be as close to uniform as possible (maximum entropy) subject to the balance constraints. This guarantees:
- Exact balance on the moments you specify
- Well-behaved weights: as close to uniform as the constraints allow, avoiding the extremes IPW can produce
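The "as uniform as possible, subject to exact balance" problem has a clean dual form: the control weights are log-linear in the moment functions, \(w_i \propto \exp(\lambda' c(x_i))\), and \(\lambda\) is chosen so the constraints hold. A minimal sketch in base R (illustrative only: the covariate draw and the targets `c(0.5, 1.3)` are made up, and real applications can use Hainmueller's `ebal` package):

```r
# Entropy balancing via its dual: control weights are w_i proportional to
# exp(lambda' c(x_i)); minimizing the (convex) dual objective makes the
# weighted control moments hit the treated-group targets exactly.
set.seed(1)
x_ctrl  <- rnorm(300)          # hypothetical control covariate
targets <- c(0.5, 1.3)         # made-up treated-group E[X] and E[X^2]
C <- cbind(x_ctrl, x_ctrl^2)   # moment functions c(x) = (x, x^2)

dual <- function(lambda) {
  log(mean(exp(C %*% lambda))) - sum(lambda * targets)
}
lambda_hat <- optim(c(0, 0), dual, method = "BFGS")$par

w <- as.vector(exp(C %*% lambda_hat))
w <- w / sum(w)

# Balance check: both weighted control moments match the targets
c(sum(w * x_ctrl), sum(w * x_ctrl^2))
```

At the optimum the gradient of the dual is exactly the moment imbalance, which is why the constraints hold to solver tolerance.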
## IPW vs Entropy Balancing

| | IPW | Entropy Balancing |
|---|---|---|
| Requires a propensity score model | Yes | No |
| Balance is… | Approximate (check after) | Exact (by construction) |
| Extreme weights? | Can be severe | Controlled |
| Sensitive to misspecification? | Yes | Less so |
## Assumptions
- Selection on observables: conditional on the covariates you balance on, treatment is independent of potential outcomes. Same as IPW — if you’re missing a confounder, balancing the observed ones doesn’t help.
- Overlap: treated and control groups share common support in covariate space. You can’t reweight controls to look like treated units if no controls exist in that part of the distribution.
- Correct moments: you need to balance the right moments. If the outcome depends on \(X^2\) but you only balance on \(E[X]\), you’ll miss the confounding.
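The last point is worth a quick simulation (hypothetical setup: selection and the outcome both depend on \(X^2\), and the true treatment effect is zero; `eb_weights` is an illustrative dual-based solver, not the `ebal` package). Balancing \(E[X]\) alone leaves the bias essentially untouched; adding \(E[X^2]\) removes it:

```r
# Hypothetical simulation: treatment selects on x^2 and the outcome depends
# on x^2; the true treatment effect is zero.
set.seed(42)
n <- 10000
x <- rnorm(n)
treat <- rbinom(n, 1, pnorm(0.8 * (x^2 - 1)))
y <- 2 * x^2 + 0 * treat + rnorm(n)

# Illustrative entropy-balancing solver (dual form, log-sum-exp for stability)
eb_weights <- function(C, targets) {
  dual <- function(l) {
    z <- as.vector(C %*% l); m <- max(z)
    log(mean(exp(z - m))) + m - sum(l * targets)
  }
  l_hat <- optim(rep(0, ncol(C)), dual, method = "BFGS")$par
  z <- as.vector(C %*% l_hat)
  w <- exp(z - max(z))
  w / sum(w)
}

xt <- x[treat == 1]; xc <- x[treat == 0]
w_mean <- eb_weights(cbind(xc),       mean(xt))                 # E[X] only
w_both <- eb_weights(cbind(xc, xc^2), c(mean(xt), mean(xt^2)))  # add E[X^2]

est_mean_only <- mean(y[treat == 1]) - sum(w_mean * y[treat == 0])
est_both      <- mean(y[treat == 1]) - sum(w_both * y[treat == 0])
est_mean_only  # still badly biased: x^2 remains unbalanced
est_both       # close to the true effect of 0
```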
#| standalone: true
#| viewerHeight: 650
library(shiny)
ui <- fluidPage(
tags$head(tags$style(HTML("
.stats-box {
background: #f0f4f8; border-radius: 6px; padding: 14px;
margin-top: 12px; font-size: 14px; line-height: 1.9;
}
.stats-box b { color: #2c3e50; }
.good { color: #27ae60; font-weight: bold; }
.bad { color: #e74c3c; font-weight: bold; }
"))),
sidebarLayout(
sidebarPanel(
width = 3,
sliderInput("n", "Sample size:",
min = 200, max = 2000, value = 500, step = 100),
sliderInput("ate", "True ATE:",
min = 0, max = 5, value = 2, step = 0.5),
sliderInput("confounding", "Confounding strength:",
min = 0, max = 3, value = 1.5, step = 0.25),
actionButton("go", "New draw", class = "btn-primary", width = "100%"),
uiOutput("results")
),
mainPanel(
width = 9,
fluidRow(
column(6, plotOutput("balance_plot", height = "380px")),
column(6, plotOutput("weight_plot", height = "380px"))
)
)
)
)
server <- function(input, output, session) {
# Simple entropy balancing: find weights for control group
# that match treated group mean of X
ebal <- function(x_ctrl, target_mean, max_iter = 200, tol = 1e-6) {
n <- length(x_ctrl)
lambda <- 0
for (i in seq_len(max_iter)) {
w <- exp(lambda * x_ctrl)
w <- w / sum(w) * n
current <- weighted.mean(x_ctrl, w)
grad <- current - target_mean
if (abs(grad) < tol) break
lambda <- lambda - 0.5 * grad
}
w / sum(w) * n
}
dat <- reactive({
input$go
n <- input$n
ate <- input$ate
conf <- input$confounding
x <- rnorm(n)
p_true <- pnorm(conf * x)
treat <- rbinom(n, 1, p_true)
y <- 1 + 2 * x + ate * treat + rnorm(n)
# Naive
naive <- mean(y[treat == 1]) - mean(y[treat == 0])
# IPW
ps <- fitted(glm(treat ~ x, family = binomial))
w_ipw <- ifelse(treat == 1, 1 / ps, 1 / (1 - ps))
ipw_est <- weighted.mean(y[treat == 1], w_ipw[treat == 1]) -
weighted.mean(y[treat == 0], w_ipw[treat == 0])
# Entropy balancing (balance control to match treated mean of X)
x_ctrl <- x[treat == 0]
x_treat_mean <- mean(x[treat == 1])
w_eb <- ebal(x_ctrl, x_treat_mean)
eb_est <- mean(y[treat == 1]) - weighted.mean(y[treat == 0], w_eb)
# Balance diagnostics
ctrl_mean_raw <- mean(x[treat == 0])
ctrl_mean_eb <- weighted.mean(x[treat == 0], w_eb)
treat_mean_x <- x_treat_mean
list(x = x, treat = treat, y = y,
w_ipw = w_ipw, w_eb = w_eb,
naive = naive, ipw_est = ipw_est, eb_est = eb_est,
ate = ate,
ctrl_mean_raw = ctrl_mean_raw,
ctrl_mean_eb = ctrl_mean_eb,
treat_mean_x = treat_mean_x)
})
output$balance_plot <- renderPlot({
d <- dat()
par(mar = c(4.5, 1, 3, 1))
means <- c(d$treat_mean_x, d$ctrl_mean_raw, d$ctrl_mean_eb)
cols <- c("#3498db", "#e74c3c", "#27ae60")
labels <- c("Treated", "Control\n(unweighted)", "Control\n(EB weighted)")
bp <- barplot(means, col = cols, border = NA,
names.arg = labels, cex.names = 0.85,
main = "Mean of X: Balance Check",
ylab = "", ylim = extendrange(c(0, means), f = 0.25))
# place value labels just past the end of each bar (above if positive, below if negative)
text(bp, means, round(means, 3), cex = 0.9, pos = ifelse(means >= 0, 3, 1))
})
output$weight_plot <- renderPlot({
d <- dat()
par(mar = c(4.5, 4.5, 3, 1))
ctrl_idx <- which(d$treat == 0)
w_ipw_ctrl <- d$w_ipw[ctrl_idx]
w_eb_ctrl <- d$w_eb
xlim <- c(0, max(c(w_ipw_ctrl, w_eb_ctrl)) * 1.1)
d_ipw <- density(w_ipw_ctrl, from = 0)
d_eb <- density(w_eb_ctrl, from = 0)
ylim <- c(0, max(d_ipw$y, d_eb$y) * 1.2)
plot(d_ipw, col = "#e74c3c", lwd = 2.5,
main = "Weight Distributions (Control Units)",
xlab = "Weight", ylab = "Density",
xlim = xlim, ylim = ylim)
lines(d_eb, col = "#27ae60", lwd = 2.5)
legend("topright", bty = "n", cex = 0.85,
legend = c("IPW weights", "Entropy balancing weights"),
col = c("#e74c3c", "#27ae60"), lwd = 2.5)
})
output$results <- renderUI({
d <- dat()
tags$div(class = "stats-box",
HTML(paste0(
"<b>True ATE:</b> ", d$ate, "<br>",
"<b>Naive:</b> <span class='bad'>", round(d$naive, 3), "</span>",
" (bias: ", round(d$naive - d$ate, 3), ")<br>",
"<b>IPW:</b> ", round(d$ipw_est, 3),
" (bias: ", round(d$ipw_est - d$ate, 3), ")<br>",
"<b>Entropy Bal:</b> <span class='good'>", round(d$eb_est, 3), "</span>",
" (bias: ", round(d$eb_est - d$ate, 3), ")"
))
)
})
}
shinyApp(ui, server)
## Things to try
- Confounding = 1.5: look at the right plot. IPW weights have a long tail (some control units get huge weight). Entropy balancing weights are much smoother.
- Confounding = 3: IPW weights become extreme. EB stays stable.
- Left plot: the green bar (EB-weighted control mean) exactly matches the blue bar (treated mean). That’s the guarantee — exact balance by construction.
- Compare the bias numbers in the sidebar: EB is typically closer to the true ATE, especially under strong confounding.
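Outside the app, the "check balance" step shown in the left plot is usually reported as a standardized mean difference (SMD). A base-R sketch with simulated data (the one-covariate setup and all names are illustrative):

```r
# Standardized mean difference before vs. after entropy-balancing weights
set.seed(7)
n <- 2000
x <- rnorm(n)
treat <- rbinom(n, 1, pnorm(1.5 * x))   # strong confounding
xt <- x[treat == 1]; xc <- x[treat == 0]

# One-moment entropy balancing: w ~ exp(lambda * x), lambda from the 1-D dual
dual  <- function(l) log(mean(exp(l * xc))) - l * mean(xt)
l_hat <- optimize(dual, c(-10, 10))$minimum
w     <- exp(l_hat * xc); w <- w / sum(w)

smd <- function(ctrl_mean) (mean(xt) - ctrl_mean) / sd(x)
smd(mean(xc))     # raw SMD: large imbalance
smd(sum(w * xc))  # weighted SMD: ~0, balance by construction
```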
## In Stata
```stata
* Install entropy balancing (once)
* ssc install ebalance

* Generate weights that balance x1, x2, x3 between treated and control
ebalance treatment x1 x2 x3, gen(eb_weight)

* Use the weights in your outcome regression
reg outcome treatment x1 x2 x3 [pw=eb_weight]
```

`ebalance` finds the smoothest set of control-group weights that exactly matches the treated group on the moments you specify (means by default; add `targets(2)` to also match variances). No propensity score model needed.