§6.1
Regression as Effect Isolation
When several business variables move at once, a simple two-variable correlation is rarely the answer to a strategic question. Multiple regression is the workhorse that lets managers ask a sharper question: how does the outcome respond to a single lever, holding other observable factors constant? This article explains what that "holding constant" actually does — it is residualization, not real-world control — and why understanding the mechanics changes how you read every regression coefficient you will ever see.
We will work through the regression model, walk through the Frisch–Waugh–Lovell view of what it computes, and end with a data case that climbs a "regression ladder" on real scanner data — adding controls one at a time and watching the price coefficient settle as confounders are stripped out.
The Executive Question: What Else Was Changing?
A naive correlation says weeks with heavier email-coupon volume see higher revenue. Three things were probably also true of those weeks: they were holiday weeks, a competitor was on the air, and the recipients were the loyal customers most likely to buy anyway. Each one contaminates the coupon-revenue correlation in a predictable direction.
| Confounder | Correlation with coupon volume | Effect on revenue | Direction of naive bias |
|---|---|---|---|
| Holiday season | Positive (more sends in peak weeks) | Positive (holiday demand) | Inflates apparent coupon lift |
| Competitor ad blitz | Positive (defensive couponing) | Negative (share losses) | Suppresses apparent lift |
| Loyal customer targeting | Positive (loyalists on the list) | Positive (they buy anyway) | Inflates apparent lift |
The decision-relevant question is not "are coupons correlated with revenue?" but "what does the data look like when we hold those three factors constant?" Multiple regression is the tool that gives a precise answer.
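To make the direction of the bias concrete, here is a minimal simulation sketch of the holiday row in the table. All numbers are invented for illustration (the sample size, the assumed true coupon effect of 0.5, and the strength of the holiday effect), and `ols` is a hypothetical helper, not part of any library.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000  # hypothetical number of store-weeks

# Confounder: roughly 20% of weeks are holiday weeks.
holiday = rng.binomial(1, 0.2, size=n).astype(float)
# Coupon volume is heavier in holiday weeks (confounder -> treatment).
coupons = 1.0 + 2.0 * holiday + rng.normal(0.0, 1.0, size=n)
# Assumed true coupon effect; holidays also raise revenue on their own.
TRUE_EFFECT = 0.5
revenue = 10.0 + TRUE_EFFECT * coupons + 3.0 * holiday + rng.normal(0.0, 1.0, size=n)


def ols(y, *columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


naive_slope = ols(revenue, coupons)[1]              # ignores the holiday confounder
adjusted_slope = ols(revenue, coupons, holiday)[1]  # holds holiday constant
print(f"naive: {naive_slope:.2f}  adjusted: {adjusted_slope:.2f}  true: {TRUE_EFFECT}")
```

In this setup the naive slope overstates the coupon effect because holiday weeks have both more coupons and more revenue, while the adjusted slope lands near the assumed true value.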
Multiple Regression: Effect Isolation by Math
The standard linear model is
Multiple linear regression
$$Y_i = \beta_0 + \beta D_i + \gamma_1 X_{1i} + \dots + \gamma_k X_{ki} + \varepsilon_i$$
where $Y$ is the outcome, $D$ is the lever of interest (the treatment), and $X_1, \dots, X_k$ are the controls.
The coefficient $\beta$ has a very specific managerial interpretation:
$\beta$ is the expected change in the outcome $Y$ for a one-unit increase in $D$, holding the other included variables fixed.
That last clause does not mean we held anything constant in the real world. It means the regression mathematically removed the part of $D$'s variation that the controls could explain, removed the part of $Y$'s variation that the controls could explain, and looked at what was left. That "what was left" view is the Frisch–Waugh–Lovell (FWL) theorem.
The Frisch–Waugh–Lovell View
FWL says a multiple-regression coefficient is identical to the result of a much simpler three-step procedure:
- Residualize the treatment. Regress $D$ on the controls $X$. Keep the residual $\tilde D$: the variation in the treatment that the controls cannot explain.
- Residualize the outcome. Regress $Y$ on the controls $X$. Keep the residual $\tilde Y$: the variation in the outcome that the controls cannot explain.
- Slope of residuals on residuals. The simple slope of $\tilde Y$ on $\tilde D$ is exactly the multiple-regression coefficient on $D$, $\hat\beta$.
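Regression as residualization (Frisch–Waugh)
$$\hat\beta = \frac{\sum_i \tilde D_i \, \tilde Y_i}{\sum_i \tilde D_i^{2}}$$
Here $\tilde D_i$ and $\tilde Y_i$ are the residuals from regressing $D$ and $Y$ on the controls $X$. In words: the regression coefficient on $D$ after adjusting for controls $X$ equals the slope from a simple regression of $Y$'s residuals on $D$'s residuals.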
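The equivalence can be checked numerically. The sketch below uses simulated data (three made-up controls and an assumed true coefficient of -2.0 on the treatment) and a hypothetical helper `fit`; it compares the coefficient from the full regression with the residual-on-residual slope.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

X = rng.normal(size=(n, 3))                                # three hypothetical controls
D = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n)    # treatment correlated with controls
Y = 1.0 - 2.0 * D + X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)


def fit(y, Z):
    """Regress y on an intercept plus Z; return (coefficients, residuals)."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    coef, *_ = np.linalg.lstsq(Z1, y, rcond=None)
    return coef, y - Z1 @ coef


# Route 1: one multiple regression of Y on D and the controls.
full_coef, _ = fit(Y, np.column_stack([D, X]))
beta_full = full_coef[1]  # index 0 is the intercept, index 1 is D

# Route 2: residualize D and Y on the controls, then take the simple slope.
_, D_tilde = fit(D, X)
_, Y_tilde = fit(Y, X)
beta_fwl = (D_tilde @ Y_tilde) / (D_tilde @ D_tilde)

print(f"full regression: {beta_full:.6f}")
print(f"FWL residuals:   {beta_fwl:.6f}")
```

Because both residual series have mean zero (each first-stage regression includes an intercept), the slope through the origin used in route 2 is the same as the simple regression slope.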
This is enormously clarifying. A regression coefficient is never about all of the variation in $D$. It is about the slice of $D$ that moves independently of whatever controls you included. The implications follow directly:
- Controls steal variation, by design. Adding a control absorbs the part of $D$ that moves with that control. If your treatment varies mostly with one of your controls, very little independent variation is left, and the coefficient becomes noisy.
- A "bad control" (a post-treatment variable) is a thief, not a friend. If a control $X$ is itself a consequence of the treatment, residualizing on $X$ removes part of the very causal pathway you are trying to measure.
- The model has nothing to say about regions of the treatment with no data. The residualized scatter only covers the range of $\tilde D$ you actually observed. Predictions outside it are extrapolation, not estimation.
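The "bad control" point is easy to see in a small simulation. In the sketch below, every quantity is assumed for illustration: the treatment $D$ affects the outcome directly (effect 1.0) and also through a post-treatment variable `M` (effect 0.5 times 2.0), so the total effect is 2.0. The helper `slope_on_first` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5_000

D = rng.normal(size=n)                       # treatment, unconfounded in this toy setup
M = 0.5 * D + rng.normal(size=n)             # post-treatment variable caused by D
Y = 1.0 * D + 2.0 * M + rng.normal(size=n)   # total effect of D: 1.0 + 2.0 * 0.5 = 2.0


def slope_on_first(y, *columns):
    """Coefficient on the first listed column in a regression with an intercept."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


total_effect = slope_on_first(Y, D)          # recovers roughly the total effect
with_bad_control = slope_on_first(Y, D, M)   # M absorbs the indirect pathway

print(f"without M: {total_effect:.2f}")
print(f"with M:    {with_bad_control:.2f}")
```

Controlling for `M` leaves only the direct effect, which understates what the lever does in total. Whether that is a mistake depends on the question, but for "what happens if we pull this lever" it usually is.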
Reading a Coefficient Like a Manager
A defensible report of a regression coefficient always names four things — the outcome, the lever, the scale, and the controls:
"Holding store and week fixed effects, competitor pricing, and seasonal dummies constant, a 1% increase in price is associated with a 2.23% decrease in unit volume."
Notice how much that sentence concedes. It does not claim causation in general — it claims a conditional comparison. The clause "holding X constant" is doing real work, and naming exactly which X's are held constant is what separates a serious estimate from a marketing slide.
Data Case: The Progresso Soup Regression Ladder
A "regression ladder" runs the same outcome–treatment regression with progressively more controls, and watches the coefficient evolve. It is the most useful single artifact in an observational pricing study, because the shifts are what tell you which confounders mattered.
For Progresso soup, we regress log(unit volume) on log(Progresso price) across roughly 88,000 store-month observations, building up the control set in five steps:
- Raw correlation — bivariate log-log regression, no controls.
- + Seasonality — month dummies absorbing winter demand shocks.
- + Competitor price — log price of Campbell's, the nearest substitute.
- + Region — census-region dummies absorbing stable regional preferences.
- + Store fixed effects — one intercept per store, absorbing every stable store-level difference. (We will cover this design formally in the next article.)
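The mechanics of the five rungs can be sketched on synthetic data. Nothing below uses the scanner data: the panel size, the assumed true elasticity of -2.0, the unobserved `store_quality` confounder, and the helpers `dummies` and `price_coef` are all made up so that the ladder has a known target to converge to.

```python
import numpy as np

rng = np.random.default_rng(42)
N_STORES, N_MONTHS = 200, 24   # hypothetical panel size
TRUE_ELASTICITY = -2.0         # assumed, so the ladder has a known target

store = np.repeat(np.arange(N_STORES), N_MONTHS)
month = np.tile(np.arange(N_MONTHS) % 12, N_STORES)  # month of year, 0 = January
region = store % 4                                   # four made-up regions
n = store.size

store_quality = rng.normal(0.0, 0.3, N_STORES)[store]  # unobserved, stable per store
region_taste = np.array([0.0, 0.2, -0.1, 0.3])[region]
winter = np.isin(month, [10, 11, 0, 1]).astype(float)

log_comp = 0.5 + 0.1 * winter + rng.normal(0.0, 0.1, n)
log_price = (0.6 + 0.5 * store_quality + 0.1 * region_taste
             + 0.05 * winter + 0.3 * log_comp + rng.normal(0.0, 0.1, n))
log_qty = (5.0 + TRUE_ELASTICITY * log_price + 0.8 * log_comp + 0.4 * winter
           + region_taste + 1.0 * store_quality + rng.normal(0.0, 0.2, n))


def dummies(codes):
    """One-hot columns for integer codes, dropping the first level as baseline."""
    return np.eye(codes.max() + 1)[codes][:, 1:]


def price_coef(controls):
    """Coefficient on log_price from OLS of log_qty on price plus the controls."""
    X = np.column_stack([np.ones(n), log_price, *controls])
    coef, *_ = np.linalg.lstsq(X, log_qty, rcond=None)
    return coef[1]


month_d, region_d, store_d = dummies(month), dummies(region), dummies(store)
rungs = [
    ("raw", []),
    ("+ seasonality", [month_d]),
    ("+ competitor price", [month_d, log_comp]),
    ("+ region", [month_d, log_comp, region_d]),
    # Store dummies subsume region dummies, so region is dropped on this rung.
    ("+ store FE", [month_d, log_comp, store_d]),
]
ladder = [(label, price_coef(controls)) for label, controls in rungs]
for label, b in ladder:
    print(f"{label:<20s} {b:6.2f}")
```

In this toy setup the stable store-level confounder is built to be the dominant source of bias, so the largest shift appears on the last rung. With real data the pattern is an empirical finding, not something the procedure guarantees.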
Figure: The elasticity estimate changes as the comparison gets cleaner. 88,409 store-months across 2,042 stores; the coefficient shown is on log(Progresso price).
Two patterns are worth naming:
- The biggest single move comes from adding store fixed effects. Whatever stable differences across stores were driving the naive bias — neighborhood income, local competition density, store size — they were the largest source of confounding. Census-region dummies absorbed some, but not most, of that variation.
- The estimate stabilizes. Across the last rungs the coefficient shifts only modestly relative to the noise in those specifications, so it is hard to argue for further controls. Stability across the last rungs is the visual signature of a credible identification, given the controls in hand.
The preferred estimate, an elasticity of roughly $-2.2$, says that within a given store, a 10% price increase is associated with roughly a 22% volume decrease, after stripping out seasonal demand, competitor pricing, and stable store-level differences. That is the number you would use for pricing, and the regression ladder is what earns it the right to be used.
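One arithmetic note on reading a log-log coefficient: "10% up in price, 22% down in volume" is the usual linear shorthand, which is accurate for small changes and drifts for larger ones. The sketch below assumes an elasticity of -2.2 (the value implied by the 10%-to-22% statement) and compares the shorthand with the exact change implied by a constant-elasticity model.

```python
import math

elasticity = -2.2      # assumed value, implied by "10% price increase, ~22% volume decrease"
price_increase = 0.10  # a 10% price increase

# Linear shorthand: percentage change in volume = elasticity * percentage change in price.
approx_change = elasticity * price_increase

# Exact change under constant elasticity: volume scales by (1 + price change) ** elasticity.
exact_change = math.exp(elasticity * math.log(1.0 + price_increase)) - 1.0

print(f"linear shorthand: {approx_change:.1%}")
print(f"exact (log-log):  {exact_change:.1%}")
```

For a 10% move the two readings differ by a few percentage points, which is worth keeping in mind when the coefficient is translated into a volume forecast.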