QALYs & Cost-Effectiveness
Measuring health outcomes
Not all life-years are equal. A year of life in perfect health is not the same as a year of life with severe chronic pain, limited mobility, or cognitive impairment. The quality-adjusted life year (QALY) accounts for this by weighting each year by a quality score:
\[\text{QALYs} = \sum_{t=1}^{T} q_t\]
where \(q_t\) is the health-related quality of life in year \(t\), scaled from 0 (dead) to 1 (perfect health). A treatment that gives you 10 years at quality 0.8 produces \(10 \times 0.8 = 8\) QALYs. A treatment that gives 6 years at quality 1.0 produces 6 QALYs.
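The sum above can be sketched in a few lines of R. The helper name `qalys` and the declining-quality profile are illustrative assumptions; the first two calls reproduce the two examples from the text.

```r
# Hypothetical helper: total QALYs from a vector of yearly quality weights.
qalys <- function(q) {
  stopifnot(is.numeric(q), all(q >= 0 & q <= 1))  # weights must lie in [0, 1]
  sum(q)
}

ten_years <- qalys(rep(0.8, 10))            # 10 years at quality 0.8
six_years <- qalys(rep(1.0, 6))             # 6 years in perfect health
declining <- qalys(c(0.9, 0.8, 0.7, 0.5))   # made-up profile that worsens each year
```

Passing a vector rather than a single weight keeps the sketch faithful to the formula, where quality may change from year to year.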
Cost-effectiveness analysis
When comparing two treatments, we compute the Incremental Cost-Effectiveness Ratio (ICER):
\[\text{ICER} = \frac{C_{\text{new}} - C_{\text{old}}}{\text{QALYs}_{\text{new}} - \text{QALYs}_{\text{old}}} = \frac{\Delta C}{\Delta Q}\]
The ICER is the cost per additional QALY gained. When the new treatment is both more effective and more costly, it is considered cost-effective if the ICER is below society’s willingness-to-pay (WTP) threshold.
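As a minimal sketch, the ratio can be computed directly. The function name `icer` and the dollar threshold are assumptions for illustration; the treatment inputs match the default slider values of the app later in this section and are not real trial data.

```r
# Hypothetical helper: ICER in dollars per QALY gained.
icer <- function(cost_new, cost_old, qalys_new, qalys_old) {
  delta_c <- cost_new - cost_old
  delta_q <- qalys_new - qalys_old
  if (delta_q == 0) return(NA_real_)  # undefined when both treatments yield equal QALYs
  delta_c / delta_q
}

# Old: $30,000 for 10 years at quality 0.6 (6 QALYs).
# New: $80,000 for 14 years at quality 0.75 (10.5 QALYs).
ratio <- icer(80000, 30000, 14 * 0.75, 10 * 0.6)
wtp <- 100000          # assumed threshold in dollars per QALY
ratio                  # about 11,111 dollars per QALY
ratio <= wtp           # TRUE
```

The comparison against `wtp` is only meaningful here because both the cost difference and the QALY difference are positive.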
Common WTP thresholds:
| Country/Standard | Threshold |
|---|---|
| US (older standard) | $50,000/QALY |
| US (recent estimates) | $100,000–$150,000/QALY |
| UK (NICE) | £20,000–£30,000/QALY (roughly $25,000–$40,000) |
| WHO (per-capita GDP rule) | 1–3x GDP per capita |
The WTP threshold is a social judgment — it reflects how much society is willing to pay for a marginal year of healthy life.
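To see how much the choice of threshold matters, the sketch below checks one ICER against several of the thresholds in the table. The ICER value and the GDP per capita figure are hypothetical numbers chosen for illustration, not estimates for any real treatment or country.

```r
icer_value <- 90000        # hypothetical ICER in dollars per QALY
gdp_per_capita <- 40000    # hypothetical GDP per capita in dollars

# Thresholds in dollars per QALY, taken from the table above.
thresholds <- c(
  us_older       = 50000,
  us_recent_low  = 100000,
  us_recent_high = 150000,
  who_1x_gdp     = 1 * gdp_per_capita,
  who_3x_gdp     = 3 * gdp_per_capita
)

# TRUE where the treatment would count as cost-effective.
meets <- icer_value <= thresholds
meets
```

With these assumed numbers the same treatment passes three of the five thresholds and fails the other two, so the verdict depends on which standard is applied.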
The cost-effectiveness plane
#| standalone: true
#| viewerHeight: 680
library(shiny)
ui <- fluidPage(
tags$head(tags$style(HTML("
.stats-box {
background: #f0f4f8; border-radius: 6px; padding: 14px;
margin-top: 12px; font-size: 14px; line-height: 1.9;
}
.stats-box b { color: #2c3e50; }
.good { color: #27ae60; font-weight: bold; }
.bad { color: #e74c3c; font-weight: bold; }
.info-box {
background: #ebf5fb; border-radius: 6px; padding: 12px;
margin-top: 10px; font-size: 13px;
}
"))),
sidebarLayout(
sidebarPanel(
width = 3,
tags$h4("Treatment A (standard)"),
sliderInput("costA", "Cost ($1000s):", min = 5, max = 200, value = 30, step = 5),
sliderInput("yearsA", "Life-years:", min = 1, max = 30, value = 10, step = 1),
sliderInput("qualA", "Quality weight:", min = 0.1, max = 1, value = 0.6, step = 0.05),
tags$hr(),
tags$h4("Treatment B (new)"),
sliderInput("costB", "Cost ($1000s):", min = 5, max = 500, value = 80, step = 5),
sliderInput("yearsB", "Life-years:", min = 1, max = 30, value = 14, step = 1),
sliderInput("qualB", "Quality weight:", min = 0.1, max = 1, value = 0.75, step = 0.05),
tags$hr(),
sliderInput("wtp", "WTP threshold ($1000/QALY):",
min = 10, max = 200, value = 100, step = 10),
tags$hr(),
uiOutput("info_box")
),
mainPanel(
width = 9,
plotOutput("ce_plot", height = "520px")
)
)
)
server <- function(input, output, session) {
vals <- reactive({
cA <- input$costA
cB <- input$costB
yA <- input$yearsA
yB <- input$yearsB
qA <- input$qualA
qB <- input$qualB
wtp <- input$wtp
qalyA <- yA * qA
qalyB <- yB * qB
delta_c <- cB - cA
delta_q <- qalyB - qalyA
icer <- if (abs(delta_q) > 0.01) delta_c / delta_q else Inf
# Decision
if (delta_q > 0 && delta_c <= 0) {
decision <- "DOMINANT"
dec_class <- "good"
} else if (delta_q <= 0 && delta_c > 0) {
decision <- "DOMINATED"
dec_class <- "bad"
} else if (delta_q > 0 && delta_c > 0) {
if (icer <= wtp) {
decision <- "COST-EFFECTIVE"
dec_class <- "good"
} else {
decision <- "NOT COST-EFFECTIVE"
dec_class <- "bad"
}
} else if (delta_q <= 0 && delta_c <= 0) {
if (abs(delta_q) < 0.01) {
decision <- "COST-SAVING (SAME EFFECT)"
dec_class <- "good"
} else {
decision <- "TRADE-OFF (fewer QALYs, lower cost)"
dec_class <- "bad"
}
} else {
decision <- "UNCERTAIN"
dec_class <- "bad"
}
list(cA = cA, cB = cB, qalyA = qalyA, qalyB = qalyB,
delta_c = delta_c, delta_q = delta_q, icer = icer,
wtp = wtp, decision = decision, dec_class = dec_class,
yA = yA, yB = yB, qA = qA, qB = qB)
})
output$ce_plot <- renderPlot({
v <- vals()
par(mfrow = c(1, 2), mar = c(5, 5, 3, 2))
# Panel 1: QALY comparison (one bar per treatment)
qaly_vals <- c(v$qalyA, v$qalyB)
bp <- barplot(qaly_vals,
names.arg = c("Treatment A", "Treatment B"),
col = c("#3498db", "#e74c3c"),
main = "QALYs Comparison",
ylab = "QALYs",
ylim = c(0, max(qaly_vals) * 1.3),
cex.lab = 1.1, cex.names = 1.1)
# Show breakdown
text(bp, c(v$qalyA, v$qalyB) + max(v$qalyA, v$qalyB) * 0.05,
paste0(round(c(v$qalyA, v$qalyB), 1), " QALYs"),
col = c("#3498db", "#e74c3c"), font = 2, cex = 0.95)
text(bp, c(v$qalyA, v$qalyB) * 0.5,
paste0(c(v$yA, v$yB), " yrs x ",
c(v$qA, v$qB), " quality"),
col = "white", cex = 0.8)
# Panel 2: Cost-effectiveness plane
x_range <- max(abs(v$delta_q), 2) * 1.5
y_range <- max(abs(v$delta_c), 20) * 1.5
plot(0, 0, type = "n",
xlim = c(-x_range, x_range),
ylim = c(-y_range, y_range),
xlab = expression(Delta * " QALYs (B - A)"),
ylab = expression(Delta * " Cost ($1000s, B - A)"),
main = "Cost-Effectiveness Plane",
cex.lab = 1.1)
# Quadrant shading
rect(-x_range, 0, 0, y_range,
col = adjustcolor("#e74c3c", 0.08), border = NA) # NW: dominated
rect(0, -y_range, x_range, 0,
col = adjustcolor("#27ae60", 0.08), border = NA) # SE: dominant
rect(0, 0, x_range, y_range,
col = adjustcolor("#f39c12", 0.08), border = NA) # NE: tradeoff
rect(-x_range, -y_range, 0, 0,
col = adjustcolor("#f39c12", 0.08), border = NA) # SW: tradeoff
# Quadrant labels
text(-x_range * 0.6, y_range * 0.85, "DOMINATED\n(worse & costlier)",
col = "#e74c3c", cex = 0.8, font = 2)
text(x_range * 0.6, -y_range * 0.85, "DOMINANT\n(better & cheaper)",
col = "#27ae60", cex = 0.8, font = 2)
text(x_range * 0.6, y_range * 0.85, "More effective\nbut costlier",
col = "#e67e22", cex = 0.75)
text(-x_range * 0.6, -y_range * 0.85, "Less effective\nbut cheaper",
col = "#e67e22", cex = 0.75)
abline(h = 0, v = 0, col = "#bdc3c7", lwd = 1.5)
# WTP threshold line: slope = WTP, in $1000s per QALY
abline(a = 0, b = v$wtp, lty = 2, lwd = 2, col = "#9b59b6")
# Place the label on the line at a point that stays inside the plot region
x_lab <- min(x_range * 0.7, y_range * 0.6 / v$wtp)
text(x_lab, v$wtp * x_lab,
paste0("WTP = $", v$wtp, "K/QALY"),
col = "#9b59b6", cex = 0.8, pos = 4)
# Plot the point
pt_col <- ifelse(v$dec_class == "good", "#27ae60", "#e74c3c")
points(v$delta_q, v$delta_c, pch = 19, cex = 3, col = pt_col)
points(v$delta_q, v$delta_c, pch = 1, cex = 3, col = "black", lwd = 1.5)
# ICER line from origin
if (is.finite(v$icer) && abs(v$delta_q) > 0.01) {
segments(0, 0, v$delta_q, v$delta_c,
lty = 3, lwd = 2, col = pt_col)
}
})
output$info_box <- renderUI({
v <- vals()
icer_txt <- if (is.finite(v$icer) && abs(v$delta_q) > 0.01) {
paste0("$", format(round(v$icer), big.mark = ","), "K / QALY")
} else if (abs(v$delta_q) <= 0.01) {
"Undefined (same QALYs)"
} else {
"Undefined"
}
tags$div(class = "stats-box",
HTML(paste0(
"<b>Treatment A:</b> ", round(v$qalyA, 1), " QALYs, $",
v$cA, "K<br>",
"<b>Treatment B:</b> ", round(v$qalyB, 1), " QALYs, $",
v$cB, "K<br>",
"<hr style='margin:8px 0'>",
"<b>Δ Cost:</b> $", round(v$delta_c), "K<br>",
"<b>Δ QALYs:</b> ", round(v$delta_q, 2), "<br>",
"<b>ICER:</b> ", icer_txt, "<br>",
"<hr style='margin:8px 0'>",
"<b>Decision:</b> <span class='", v$dec_class, "'>",
v$decision, "</span>"
))
)
})
}
shinyApp(ui, server)
Things to try
- Set Treatment B to have more QALYs and lower cost: the point lands in the SE quadrant (dominant) — adopt B regardless of WTP threshold.
- Set B to have more QALYs but much higher cost: the point moves to the NE quadrant. Whether it is cost-effective depends on whether the ICER is below the WTP line.
- Lower the WTP threshold: the WTP line becomes flatter, and treatments that were cost-effective may now fall above it and no longer qualify. This is why the threshold matters enormously for policy.
- Set B to have fewer QALYs and higher cost: dominated — never adopt B.
- Try equal QALYs but different costs: the ICER is undefined because you are dividing by zero. If B costs less, it is strictly preferred.
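The scenarios in this list can also be checked outside the app. The function below is a simplified sketch of the quadrant logic, not the app's exact code; the name `classify` and its labels are assumptions. Costs are in $1000s and the threshold is in $1000s per QALY, as in the sliders.

```r
# Hypothetical helper: classify treatment B relative to A on the CE plane.
classify <- function(delta_c, delta_q, wtp) {
  if (delta_q > 0 && delta_c <= 0) return("dominant")    # better and not costlier
  if (delta_q < 0 && delta_c >= 0) return("dominated")   # worse and not cheaper
  if (delta_q == 0) {                                    # ICER undefined
    if (delta_c < 0) return("cost-saving, same effect")
    if (delta_c > 0) return("dominated")
    return("identical")
  }
  if (delta_q > 0) {                                     # NE quadrant: compare ICER to WTP
    return(if (delta_c / delta_q <= wtp) "cost-effective" else "not cost-effective")
  }
  "trade-off (fewer QALYs, lower cost)"                  # SW quadrant
}

classify(-10, 2, wtp = 100)    # more QALYs, lower cost
classify(50, 4.5, wtp = 100)   # more QALYs, higher cost, generous threshold
classify(50, 4.5, wtp = 10)    # same point, stricter threshold
classify(20, -1, wtp = 100)    # fewer QALYs, higher cost
classify(-5, 0, wtp = 100)     # equal QALYs, lower cost
```

Handling the equal-QALY case before dividing avoids the division by zero mentioned in the last bullet.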
Controversies
QALYs are the standard metric in health technology assessment, but they are not without criticism:
The equal-value assumption. A QALY is a QALY — one QALY gained by a 20-year-old is worth the same as one gained by an 80-year-old. Many people find this counterintuitive. Some argue for age-weighting or fair-innings approaches.
Disability rights critique. QALYs assume that life with a disability is worth less than life without one. A year at quality 0.5 is “worth” only half a year in perfect health. Disability advocates argue this devalues the lives of people with disabilities and could lead to denial of care for disabled patients.
Whose preferences? Quality weights are typically elicited from the general public (who imagine what it would be like to have a condition) rather than from patients (who actually live with it). The two differ: people living with a condition often rate their quality of life higher than the general public imagines it to be, a phenomenon usually attributed to hedonic adaptation.
The threshold problem. Any cost-effectiveness threshold is ultimately arbitrary. Why $100K/QALY and not $50K or $200K? The threshold implicitly puts a price on human life, which many find uncomfortable despite its necessity for resource allocation.
Did you know?
NICE (the UK’s National Institute for Health and Care Excellence) is the world’s most prominent user of cost-effectiveness analysis. It typically approves treatments with ICERs below £20,000–£30,000 per QALY. Treatments above £30,000 per QALY need to demonstrate additional benefits to be approved. This system has been both praised for its rigor and criticized for denying access to expensive drugs.
The $50,000/QALY threshold widely cited in the US traces back to a rough estimate of the cost-effectiveness of dialysis for end-stage renal disease — a treatment Congress chose to cover universally in 1972. It was never intended as a formal threshold, but it became one by convention. Recent evidence suggests the US implicitly values a QALY at $100,000–$150,000 based on actual coverage decisions.
QALY calculations can produce uncomfortable comparisons. A drug that extends 100 cancer patients’ lives by 1 year each (at quality 0.7) produces 70 QALYs. A hip replacement that improves quality of life by 0.3 for 20 years produces 6 QALYs per patient, so 12 such patients gain 72 QALYs in total. The QALY framework treats these two outcomes as roughly equivalent, which may conflict with moral intuitions about the urgency of life extension versus quality improvement.
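The arithmetic behind this comparison is short enough to spell out. The variable names are illustrative; the numbers are the ones given in the paragraph above.

```r
# Drug: 100 patients, each gaining 1 extra year at quality 0.7.
drug_qalys <- 100 * 1 * 0.7

# Hip replacement: quality improves by 0.3 for 20 years, for 12 patients.
hip_per_patient <- 0.3 * 20
hip_qalys <- 12 * hip_per_patient

c(drug = drug_qalys, hip = hip_qalys)   # 70 and 72 QALYs
```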