Part IV · Chapter 9
Predictive Task Design
Get the task contract and the features right, and the algorithm almost picks itself; get them wrong, and no model can save you.
This chapter opens Part IV by writing the prediction problem down honestly — the step where most production failures are born or avoided. It traces the ladder from manager intuition to hand-coded rule to statistical score to machine-learned model, then pins the supervised task to four decisions — target, features, unit, and label timing — condensed into a one-sentence Task Contract. From there it builds the generalization toolkit (random, time-based, and group splits, plus cross-validation) alongside a gallery of leakage traps, and closes on feature engineering, where a manager's domain knowledge actually enters the model. The Bean & Basket churn model runs throughout as a reminder that the human leverage has migrated from picking algorithms to defining the task.
Topics covered
In this chapter
- 9.1 From Business Rules to Algorithms: Frames the ladder from manager intuition to machine-learned model and tests when a repeated, label-rich, actionable decision is worth algorithmizing.
- 9.2 The Supervised Learning Setup: Fixes the supervised vocabulary (target, features, unit, label timing) into a one-sentence Task Contract that decides whether a project ships.
- 9.3 Train/Test Splits, Generalization, and Leakage: Explains train/test splits, cross-validation, and the leakage traps that let future information sneak into the past and inflate offline metrics.
- 9.4 Feature Engineering: Turns warehouse columns into a leakage-safe feature catalog (RFM, engagement, encodings, and interactions), where managerial domain knowledge enters the model.