§4.4 Statistical Charts Before Statistics

Statistical charts should arrive before statistical formulas. A histogram can explain why a log transform is useful. A scatterplot can explain why a slope matters. A coefficient plot can explain why an estimate needs an interval. If students can read those visuals, the later regression equation feels like a compact notation for something they already understand.

The executive question: what does the statistical chart help us see before the model?

The soup case gives three visual bridges to later pricing work. First, store-month volume is highly skewed, so raw sales and log sales tell different visual stories. Second, log price and log volume have a downward relationship, which previews elasticity. Third, the slope changes when we adjust for month, which previews why seasonality matters.

-2.46 is the national month-adjusted log-log slope: a 1% higher Progresso price is associated with about 2.46% lower volume in this descriptive preview.

That number is not yet the final price elasticity. It is an intuition-building estimate. The later pricing chapters will ask whether the variation behind the estimate is credible enough for a causal pricing decision.
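One way to read "month-adjusted" is to subtract each month's average from both log price and log volume, then fit the slope on what is left. The sketch below does this on made-up numbers. The data, the `TRUE_SLOPE` value, the helper names, and the assumption that prices run lower in winter are all illustrative and are not taken from the soup case.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean from its members (here, group = month)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

# Hypothetical data: winter lifts demand and (by assumption) lowers price.
TRUE_SLOPE = -2.0
months, log_price, log_volume = [], [], []
for month in range(1, 13):
    winter = month in (10, 11, 12, 1, 2)
    season_shift = 0.3 if winter else 0.0
    base_price = 0.7 if winter else 0.8
    for deviation in (-0.1, 0.0, 0.1):
        lp = base_price + deviation
        months.append(month)
        log_price.append(lp)
        log_volume.append(5.0 + season_shift + TRUE_SLOPE * lp)

pooled = ols_slope(log_price, log_volume)
adjusted = ols_slope(demean_by_group(log_price, months),
                     demean_by_group(log_volume, months))
print(f"pooled slope: {pooled:.2f}, month-adjusted slope: {adjusted:.2f}")
```

In this constructed example the pooled slope is steeper than the within-month slope, because low winter prices coincide with high winter demand. With real data the gap could go in either direction.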

Figure 1 starts with distribution shape. Raw Progresso volume has a long right tail: many ordinary store-months and a few very large ones. The raw histogram uses fixed-width bins clipped near the 99th percentile so the shape is visible rather than dominated by the largest stores. The log transform then compresses that tail and makes typical store-months easier to compare.
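The binning choice described above can be sketched in a few lines. This is a minimal illustration, assuming lognormal draws as a stand-in for store-month volume; the sample size, bin count, and function names are hypothetical and not the chapter's actual plotting code.

```python
import math
import random

def percentile(values, q):
    """Percentile with linear interpolation; q is between 0 and 100."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac

def fixed_width_histogram(values, n_bins, lower, upper):
    """Count values in n_bins equal-width bins on [lower, upper].

    Values outside the range are left out, which is how the raw view
    keeps the largest store-months from stretching the axis.
    """
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lower <= v <= upper:
            idx = min(int((v - lower) / width), n_bins - 1)
            counts[idx] += 1
    return counts

random.seed(0)
# Hypothetical store-month volumes with a long right tail.
volume = [random.lognormvariate(6.0, 1.0) for _ in range(5000)]

# Raw view: fixed-width bins from zero up to the 99th percentile.
cap = percentile(volume, 99)
raw_counts = fixed_width_histogram(volume, 20, 0.0, cap)

# Log view: same bin count over the full range of log volume.
log_volume = [math.log(v) for v in volume]
log_counts = fixed_width_histogram(log_volume, 20, min(log_volume), max(log_volume))

# A shared count axis means both panels use the same upper limit.
shared_count_max = max(max(raw_counts), max(log_counts))
```

Plotting `raw_counts` and `log_counts` as bars against the same `shared_count_max` would give the two-panel comparison: the raw counts pile up in the first few bins, while the log counts peak near the middle.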

Figure 1, histogram panel: Raw volume is a long-tail distribution. Most store-months are modest; a few stores move enormous volume. The count axis is shared with the log view so the shape change is the only difference.

Figure 1, coefficient panel: Slopes change once seasonality enters the picture. A forest plot: each estimate is a center, a 95% interval, and a comparison to zero. The dashed line is the no-relationship baseline; later pricing chapters handle identification.

Figure 1. Fixed-width histograms show the long-tail soup volume problem; log transformation makes typical store-months easier to compare before coefficient intervals enter the pricing story.

Figure 1 also introduces coefficient intervals. The point is not to teach regression mechanics yet. The point is to show that a statistical estimate is a visual object: a center, an interval, a comparison group, and a statement of what the estimate does not yet establish.
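For readers who want to see where a center and an interval come from, here is a minimal sketch for a one-predictor fit. It assumes a normal-approximation multiplier of 1.96 for the 95% interval, and the data points are hypothetical; the chapter's own estimates may be computed differently.

```python
import math

def slope_with_interval(x, y, z=1.96):
    """Return (slope, low, high) for a one-predictor least-squares fit.

    z=1.96 is the normal-approximation multiplier for a 95% interval;
    small samples would normally use a t multiplier instead.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance divides by n - 2 because two parameters were fitted.
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope, slope - z * std_err, slope + z * std_err

# Hypothetical log price and log volume values, for illustration only.
log_price = [i / 10 for i in range(20)]
log_volume = [5.0 - 2.0 * p + (0.1 if i % 2 == 0 else -0.1)
              for i, p in enumerate(log_price)]

slope, low, high = slope_with_interval(log_price, log_volume)
# Comparison to zero: does the interval sit entirely on one side?
excludes_zero = low > 0 or high < 0
print(f"slope={slope:.2f}, interval=({low:.2f}, {high:.2f}), excludes zero: {excludes_zero}")
```

The three returned numbers map directly onto a forest-plot row: the dot, and the two ends of the bar. Whether the bar crosses the dashed zero line is the `excludes_zero` check.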

Seasonality as the bridge to countercyclical pricing

The most intuitive scatterplot is log(Progresso volume) against log(Progresso price). In a log-log chart, the slope reads as an elasticity: a one percent difference in price is associated with a difference in volume of roughly the slope, in percent. A slope of -2.46, for example, means about 2.46% lower volume for a 1% higher price. That is why the pricing chapters use log-log regression.
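The "one percent" reading is an approximation that works well for small price differences. The short sketch below computes the exact implied volume difference from a log-log slope, using the -2.46 figure quoted earlier in this section; the function name is hypothetical.

```python
def implied_volume_change(slope, price_change_pct):
    """Percent volume difference implied by a log-log slope.

    Uses the exact relationship: volume ratio = (price ratio) ** slope.
    """
    price_ratio = 1 + price_change_pct / 100.0
    return (price_ratio ** slope - 1) * 100.0

slope = -2.46  # month-adjusted slope quoted in this section
small = implied_volume_change(slope, 1.0)   # 1% higher price
large = implied_volume_change(slope, 10.0)  # 10% higher price
print(f"1% price difference: {small:.2f}% volume")
print(f"10% price difference: {large:.2f}% volume")
```

For a 1% price difference the exact figure is very close to the slope itself. For a 10% difference it is noticeably smaller in magnitude than ten times the slope, which is why the elasticity reading is stated per one percent.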

But the soup case also shows the trap. Winter months are high-demand months. Non-winter months are lower-demand months. If Progresso pricing changes across that same seasonal cycle, the scatterplot mixes price behavior with seasonal demand.

Figure 2 separates the scatter by winter and non-winter. Winter is defined for this case as October through February, the broader soup-demand season.
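Because this winter definition wraps around the calendar year, a simple range check on the month number does not work. A minimal sketch of the split, assuming each row is a `(month, log_price, log_volume)` tuple with hypothetical values:

```python
WINTER_MONTHS = {10, 11, 12, 1, 2}  # October through February

def is_winter(month):
    """True for the soup-demand season as defined in this case."""
    return month in WINTER_MONTHS

def split_by_season(rows):
    """Group (month, log_price, log_volume) rows into the two scatter panels."""
    groups = {"winter": [], "non-winter": []}
    for month, log_price, log_volume in rows:
        key = "winter" if is_winter(month) else "non-winter"
        groups[key].append((log_price, log_volume))
    return groups

# Hypothetical rows, one per calendar month, for illustration only.
rows = [(m, 0.70 + 0.01 * m, 5.0 - 0.02 * m) for m in range(1, 13)]
panels = split_by_season(rows)
print({season: len(points) for season, points in panels.items()})
```

Each list in `panels` can then be plotted and fitted separately, which is the comparison Figure 2 draws.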

Figure 2 panel: Log price–volume slope by season. A downward log-log slope previews price elasticity. The fitted line and slope are descriptive, not yet causal.

Figure 2. Winter and non-winter scatterplots both slope downward, but the comparison is still descriptive because seasonality and pricing strategy move together.

Figure 2 is useful precisely because it is not enough. It gives students the visual intuition for elasticity while leaving room for the identification question: what price variation is independent of demand shocks, promotions, inventory, store mix, and seasonality?

Concept check

Three questions spanning what a narrow interval does and does not certify, reading distribution shape and the log transform, and overlapping-interval reasoning.

  1. A dashboard shows last quarter's regional sales gap with very tight error bars, so a VP concludes the gap is a reliable basis to shift the ad budget. What is the strongest objection?
  2. Store-month soup volume is sharply right-skewed, and the chapter takes the log before plotting price against volume. What is the real reason to use logs here?
  3. Two months' mean-share intervals overlap slightly, and a teammate says "overlap means no real difference, so ignore it." The second month's interval is much wider than the first. How should you reason?