§4.4
Statistical Charts Before Statistics
Statistical charts should arrive before statistical formulas. A histogram can explain why a log transform is useful. A scatterplot can explain why a slope matters. A coefficient plot can explain why an estimate needs an interval. If students can read those visuals, the later regression equation feels like a compact notation for something they already understand.
The executive question: what does the statistical chart help us see before the model?
The soup case gives three visual bridges to later pricing work. First, store-month volume is highly skewed, so raw sales and log sales tell different visual stories. Second, log price and log volume have a downward relationship, which previews elasticity. Third, the slope changes when we adjust for month, which previews why seasonality matters.
The slope read from these charts is not yet the final price elasticity. It is an intuition-building estimate. The later pricing chapters will ask whether the variation behind the estimate is credible enough for a causal pricing decision.
Figure 1 starts with distribution shape. Raw Progresso volume has a long right tail: many ordinary store-months and a few very large ones. The raw histogram uses fixed-width bins clipped near the 99th percentile so the shape is visible rather than dominated by the largest stores. The log transform then compresses that tail and makes typical store-months easier to compare.
Figure 1 (histogram panel): Raw volume is a long-tail distribution. Most store-months are modest; a few stores move enormous volume. The count axis is shared with the log view so the shape change is the only difference.
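The clipping and log steps described for the histogram can be sketched in a few lines. This is a minimal illustration, not the chapter's plotting code: the data are synthetic lognormal draws standing in for store-month volume, and the names `volume`, `cap`, and `skewness`, along with the choice of 30 bins, are assumptions made for the example.

```python
import numpy as np

# Synthetic stand-in for store-month volume (NOT the soup data):
# lognormal draws give many modest values and a long right tail.
rng = np.random.default_rng(0)
volume = rng.lognormal(mean=6.0, sigma=1.0, size=5_000)

# Raw view: fixed-width bins, clipped near the 99th percentile so the
# largest store-months do not set the axis range.
cap = np.percentile(volume, 99)
raw_counts, raw_edges = np.histogram(volume[volume <= cap], bins=30, range=(0.0, cap))

# Log view: the same observations after a log transform, no clipping needed.
log_volume = np.log(volume)
log_counts, log_edges = np.histogram(log_volume, bins=30)

def skewness(x):
    """Sample skewness: mean of cubed z-scores (0 for a symmetric shape)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

raw_skew = skewness(volume)
log_skew = skewness(log_volume)
print(f"raw skewness: {raw_skew:.2f}, log skewness: {log_skew:.2f}")
```

On data of this shape the raw skewness is large and positive while the log skewness sits near zero, which is the numeric counterpart of the tail compression the histogram shows.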
Figure 1 (coefficient panel): Slopes change once seasonality enters the picture. A forest plot: each estimate is a center, a 95% interval, and a comparison to zero. The dashed line is the no-relationship baseline; later pricing chapters handle identification.
Figure 1 also introduces coefficient intervals. The point is not to teach regression mechanics yet. The point is to show that a statistical estimate is a visual object: a center, an interval, a comparison group, and a statement of what the estimate cannot yet claim.
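For readers who want to see where a center and an interval come from, here is a hedged sketch of one forest-plot row. It assumes a simple one-predictor least-squares fit and a normal-approximation 95% interval (center plus or minus 1.96 standard errors); the helper `slope_with_interval` and the synthetic data are hypothetical and not taken from the soup case.

```python
import numpy as np

def slope_with_interval(x, y, z=1.96):
    """Least-squares slope of y on x with a normal-approximation 95% interval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_c = x - x.mean()
    sxx = np.sum(x_c ** 2)
    slope = np.sum(x_c * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    sigma2 = np.sum(resid ** 2) / (n - 2)   # residual variance
    se = np.sqrt(sigma2 / sxx)              # standard error of the slope
    return float(slope), float(slope - z * se), float(slope + z * se)

# Synthetic data with an assumed slope of -1.5 (illustration only).
rng = np.random.default_rng(1)
log_price = rng.normal(0.5, 0.2, size=400)
log_volume = 6.0 - 1.5 * log_price + rng.normal(0.0, 0.5, size=400)

center, low, high = slope_with_interval(log_price, log_volume)
excludes_zero = not (low <= 0.0 <= high)    # the "comparison to zero"
print(f"center={center:.2f}, interval=({low:.2f}, {high:.2f}), excludes zero: {excludes_zero}")
```

The three returned numbers are exactly what a forest plot draws: a dot at the center and a bar from the lower to the upper bound, read against the dashed line at zero. An interval that excludes zero says the association is unlikely to be noise; it does not say the association is causal.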
Seasonality as the bridge to countercyclical pricing
The most intuitive scatterplot is log(Progresso volume) against log(Progresso price). In a log-log chart, the slope has elasticity intuition: a one percent price difference is associated with roughly a slope-sized percent difference in volume. A slope of -2, for example, would mean about two percent less volume for each one percent of higher price. That is why the pricing chapters use log-log regression.
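The elasticity reading of a log-log slope can be checked with plain arithmetic. The sketch below assumes the relationship log(volume) = a + b * log(price) holds exactly and uses a hypothetical slope of b = -2.0; the function name `volume_change_pct` is made up for the example.

```python
def volume_change_pct(elasticity, price_change_pct):
    """Exact % change in volume implied by log(volume) = a + elasticity * log(price)."""
    ratio = (1.0 + price_change_pct / 100.0) ** elasticity
    return (ratio - 1.0) * 100.0

b = -2.0  # hypothetical log-log slope, not an estimate from the soup data

small = volume_change_pct(b, 1.0)    # 1.01 ** -2 = 0.98030..., about -1.97
large = volume_change_pct(b, 20.0)   # 1.20 ** -2 = 0.69444..., about -30.56

print(round(small, 2))  # -1.97
print(round(large, 2))  # -30.56
```

For a one percent price difference the implied volume difference is very close to the slope itself. For a twenty percent difference it is about -30.6 percent rather than -40 percent, so "slope equals percent change" is a small-change approximation.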
But the soup case also shows the trap. Winter months are high-demand months. Non-winter months are lower-demand months. If Progresso pricing changes across that same seasonal cycle, the scatterplot mixes price behavior with seasonal demand.
Figure 2 separates the scatter by winter and non-winter. Winter is defined for this case as October through February, the broader soup-demand season.
Figure 2: Log price–volume slope by season. A downward log-log slope previews price elasticity. The fitted line and slope are descriptive, not yet causal.
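The seasonal trap can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not the soup data: winter is October through February as defined above, winter raises demand, winter prices are assumed to be slightly lower (a direction chosen only for the example), and the true within-season slope is set to -2.0. The helpers `ols_slope` and `demean_by` are hypothetical names.

```python
import numpy as np

WINTER_MONTHS = [10, 11, 12, 1, 2]  # October through February

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    x_c = x - x.mean()
    return float(np.sum(x_c * (y - y.mean())) / np.sum(x_c ** 2))

def demean_by(group, x):
    """Subtract each group's mean, which is equivalent to adding group dummies."""
    out = np.asarray(x, dtype=float).copy()
    for g in np.unique(group):
        mask = group == g
        out[mask] -= out[mask].mean()
    return out

rng = np.random.default_rng(2)
n = 2_400
month = rng.integers(1, 13, size=n)          # months 1..12
winter = np.isin(month, WINTER_MONTHS)

# Assumed data-generating process (illustration only):
# winter shifts demand up and price down; within-season slope is -2.0.
log_price = 0.5 - 0.1 * winter + rng.normal(0.0, 0.1, size=n)
log_volume = 6.0 + 0.5 * winter - 2.0 * log_price + rng.normal(0.0, 0.3, size=n)

pooled = ols_slope(log_price, log_volume)
winter_slope = ols_slope(log_price[winter], log_volume[winter])
other_slope = ols_slope(log_price[~winter], log_volume[~winter])
adjusted = ols_slope(demean_by(month, log_price), demean_by(month, log_volume))

print(f"pooled={pooled:.2f}, winter={winter_slope:.2f}, "
      f"non-winter={other_slope:.2f}, month-adjusted={adjusted:.2f}")
```

Under these assumptions the pooled slope is noticeably steeper than either within-season slope, because low winter prices coincide with high winter demand. Splitting by season or adjusting for month recovers a slope near the assumed -2.0. With the opposite pricing pattern the pooled slope would be biased toward zero instead, which is why the direction of the bias has to be checked against the data rather than assumed.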
Figure 2 is useful precisely because it is not enough. It gives students the visual intuition for elasticity while leaving room for the identification question: what price variation is independent of demand shocks, promotions, inventory, store mix, and seasonality?
Concept check
Three questions spanning what a narrow interval does and does not certify, reading distribution shape and the log transform, and overlapping-interval reasoning.
1. A dashboard shows last quarter's regional sales gap with very tight error bars, so a VP concludes the gap is a reliable basis to shift the ad budget. What is the strongest objection?
2. Store-month soup volume is sharply right-skewed, and the chapter takes the log before plotting price against volume. What is the real reason to use logs here?
3. Two months' mean-share intervals overlap slightly; a teammate says "overlap means no real difference, so ignore it." The second month's interval is much wider than the first. How should you reason?