Part I · Chapter 1

Reading Data as Business Evidence

Before you ask what model to run, ask what one row means, what shape the table is in, and what kind of column you are looking at.

This chapter focuses on the three reading errors that sit behind almost every analytical mistake a manager hears about: double-counted revenue, joins that explode, a 4.2-star average that misleads, and a quarterly ranking that rewards a fading store. Working through a single week, then eight weeks, at Bean & Basket Coffee, it shows how the grain of a row, the shape of a table, and the measurement type of a column quietly set the ceiling on what any analysis can honestly claim. The takeaway is a posture rather than a formula: read the panel before you rank, keep the finest grain you can sustain, and write the one-page variable dictionary that catches type confusion at design time instead of slide-review time.


Topics covered

dataset grain and unit of observation
mean-of-a-mean weighting error
duplicate explosion across mismatched join grains
cross-section vs. time-series vs. panel
geo-spatial and network data shapes
snapshot mistaken for trend
storage type vs. measurement type
ordered categoricals and the top-2-box share
the variable dictionary as cheap insurance


In this chapter