§17.1

The Data Product View

I. What happened?
II. Where & for whom?
III. What caused it?
III. How much does X matter?
IV. What is likely next?
V. What does the text/image say?
VI. How do we operate this?

Part VI — running the analytics organization as a learning system.

Analytics work generates artefacts whether or not anyone is paying attention. The pricing analysis becomes a slide deck that lives on someone's laptop. The churn model becomes a notebook in a private repository. The customer voice dashboard becomes a Tableau workbook with three authors and no owner. None of this is unusable. None of it is durable. Six months later, the analyst has rotated, the workbook references stale tables, and the next team rebuilds the same artefact from scratch. The first job of Part VI is to stop that cycle.

The data product view is the operating posture that gets us there. Every artefact — every card, every memo, every studio, every dashboard, every case pack — has a name, an owner, a version, a contract, and a refresh cadence. The artefacts compound; they don't decay. This article lays out the framing, walks through the artefact catalog the book has produced, and ends with the case-pack architecture the appended exercises follow.


The Executive Question

What should remain reusable after an analysis is complete, and how do we treat each artefact as a product rather than a one-off deliverable?

The honest version: the firm has been producing artefacts the whole time. The shift is to manage them — to give them names, owners, contracts, and update schedules — so the next analysis starts from a higher rung.


Five Properties of a Data Product

Anything that survives the work has these five properties. If a missing property cannot be filled in, the artefact is not yet a product.

  1. Name. Stable, human-readable, used consistently across the firm. "The Q2 retention scoring model" beats "the_model_v17_fixed_FINAL_use_this.ipynb."
  2. Owner. A named role, not a person. People rotate; roles persist. "Customer Analytics retention lead" is durable; "Priya's notebook" is not.
  3. Version. Semantic versioning for cards and memos; date-stamped versions for datasets and dashboards. The version is part of the artefact's identity, not a sidecar metadata field.
  4. Contract. What the artefact promises to do, and what it doesn't. The Model Card from §10.5 is one example; the AI Workflow Card from §16.4 is another. Contracts make compatibility explicit.
  5. Refresh cadence. When it gets updated, who triggers the update, and what re-validates the artefact after a refresh.

A team that has built all five properties around an artefact has, in effect, built a product. The artefact will outlast the person who wrote it.
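The five-property test can be sketched as a small record plus a completeness check. This is a minimal illustration, not a prescribed schema: the class name `DataProduct`, its field names, and the "monthly refresh" cadence value are assumptions introduced here for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DataProduct:
    """One artefact record; None marks a property nobody has filled in yet."""
    name: Optional[str] = None             # stable, human-readable name
    owner_role: Optional[str] = None       # a role, not a person
    version: Optional[str] = None          # e.g. "2.1.0" or "2026-07-08"
    contract: Optional[str] = None         # what it promises, and what it doesn't
    refresh_cadence: Optional[str] = None  # when, who triggers, what re-validates


def missing_properties(artefact: DataProduct) -> list[str]:
    """Return the names of the five properties that are still empty."""
    return [f.name for f in fields(artefact) if not getattr(artefact, f.name)]


def is_product(artefact: DataProduct) -> bool:
    """An artefact counts as a product only when all five properties are set."""
    return not missing_properties(artefact)


notebook = DataProduct(name="the_model_v17_fixed_FINAL_use_this.ipynb")
scoring_model = DataProduct(
    name="Q2 retention scoring model",
    owner_role="Customer Analytics retention lead",
    version="2.1.0",
    contract="Model Card",
    refresh_cadence="monthly refresh, triggered by the owner role",  # hypothetical
)

print(missing_properties(notebook))
print(is_product(scoring_model))
```

The notebook has a name and nothing else, so the check reports four gaps; the scoring model fills all five and qualifies.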


The Artefact Catalog

Across the book, the methods produced a deliberate set of artefacts. The catalog is the index — every artefact, where it was introduced, and the role that should own it.

The artefact catalog — what the book builds, where, and who owns it

One-page Cards & Memos

Artefact                            Where              Owner role
Decision Question Card              §9.1               Project sponsor + analyst
Identification Memo                 §11.2              Analyst + reviewer
Predictive Task Contract            §14.2              Modelling team
Model Card                          §15.5              Modelling team + governance
AI Workflow Card                    §22.1              AI workflow owner
Decision Memo                       §24.1              Sponsor + analyst

Studios (capstones)

Artefact                            Where              Owner role
Data Language Studio                §4.1               Data team
Visual Decision Brief Studio        §8.2               Analyst + executive
Pricing & Promotion Studio          §13.4              Pricing + revenue management
Customer Intelligence Studio        §17.4              Customer analytics
Customer Voice Intelligence Studio  §22.2              Customer insights + ops
Final Integrative Case              §25.1              Executive owner

Dashboards & Monitoring

Artefact                            Where              Owner role
KPI / dashboard storyboard          §8.1               Analytics + operator
Threshold–profit curve              §15.2              Modelling + finance
Model monitoring dashboard          §17.3              ML operations
AI evaluation dashboard             §22.1              AI governance
Portfolio monitoring view           §24.2              Analytics leadership

Case packs (appended)

Artefact                            Where              Owner role
Soup, Milk, Zillow                  Part III appendix  Course / self-study
BAV, Airbnb                         Part IV appendix   Course / self-study
Yelp, Goose Island, Earnings, Jobs  Part V appendix    Course / self-study

Every artefact has a home in the book, a named owner role, and an update cadence in §24.2.

Figure 1. The full artefact catalog the book produces. Four groups — cards & memos, studios, dashboards & monitoring, case packs — each with a representative section reference and a named owner role. Every artefact should be locatable in this index.

The catalog is not exhaustive. A working analytics organization will produce additional artefacts the book did not name — feature catalogs, eval suites, prompt libraries, golden sets, on-call rotations. The pattern stays the same: name, owner, version, contract, cadence.


The Case-Pack Architecture

The standalone data cases live outside the chapter prose, for the same reasons that motivate the data product view: the cases compound across uses, and they need to be versioned and owned.

Every case pack ships with a stable directory structure:

case-packs/<case-id>/
├── case_brief.md             # one-page description: business question, dataset, methods
├── data_dictionary.json      # every column, every type, every measurement choice
├── datasets/
│   ├── raw/                  # original source files (read-only)
│   └── processed/            # analysis-ready tables (regenerable)
├── charts/                   # static SVG/PNG figures the case ships with
├── tables/                   # JSON or CSV summary tables
├── insights/                 # one-paragraph findings, with the chart they belong to
├── exercises/                # student-facing prompts and questions
├── solutions/                # full worked solutions (instructor-only or graded)
└── teaching_notes.md         # what the case is good for, where it tends to confuse, common student errors

Three properties of the structure are worth flagging:

  • raw/ is read-only. Source data is never modified. Everything in processed/ is regenerable from raw/ plus the scripts. This is the simplest reproducibility guarantee a case pack can make.
  • solutions/ is gated. For classroom cases, solutions live behind access control. For self-study cases, they ship alongside.
  • teaching_notes.md carries the institutional memory. The Goose Island Twitter case's teaching notes record that students reliably get confused by the difference between sentiment and emotion; the notes flag that confusion and suggest the §14.4 framing as the cure. The notes are the part that survives the instructor.
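One way to keep the directory structure stable is to check it mechanically. The sketch below lists whatever a case pack is missing relative to the layout shown above. The function names `missing_entries` and `scaffold` and the `demo-case` identifier are hypothetical, and the demo runs against a temporary directory rather than a real case pack.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Paths taken from the layout above, relative to case-packs/<case-id>/.
REQUIRED_FILES = ["case_brief.md", "data_dictionary.json", "teaching_notes.md"]
REQUIRED_DIRS = [
    "datasets/raw",
    "datasets/processed",
    "charts",
    "tables",
    "insights",
    "exercises",
    "solutions",
]


def missing_entries(case_dir: Path) -> list[str]:
    """Return every required file or directory absent from case_dir."""
    missing = [f for f in REQUIRED_FILES if not (case_dir / f).is_file()]
    missing += [d + "/" for d in REQUIRED_DIRS if not (case_dir / d).is_dir()]
    return missing


def scaffold(case_dir: Path) -> None:
    """Create an empty skeleton with the required files and directories."""
    for d in REQUIRED_DIRS:
        (case_dir / d).mkdir(parents=True, exist_ok=True)
    for f in REQUIRED_FILES:
        (case_dir / f).touch()


with tempfile.TemporaryDirectory() as tmp:
    case = Path(tmp) / "case-packs" / "demo-case"  # hypothetical case id
    case.mkdir(parents=True)
    before = missing_entries(case)  # an empty directory lacks everything
    scaffold(case)
    after = missing_entries(case)   # the skeleton satisfies the layout

print(len(before), after)
```

A check like this only covers presence. It says nothing about whether processed/ is actually regenerable from raw/, which still has to be verified by re-running the pipeline.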

What "Shippable" Means

A shippable artefact passes three tests:

  1. A new team member can read it cold. No tribal knowledge required. The artefact's name, contract, and one-page card explain it.
  2. An auditor can trace every claim to a source. For a memo, every chart is reproducible from a query or notebook; every model number traces to a model card and a fitted artefact; every quote in a §22 AI workflow traces to a retrieved chunk.
  3. An operator knows what to do when it breaks. A named owner, an escalation path, and a known recovery procedure (rollback, refresh, retrain, deprecate).

A research artefact passes the first test. A production artefact passes all three. The book has been training for the third condition all along.
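The distinction between research and production artefacts can be written down as a simple classification over the three tests. This is a sketch only: the `ShippabilityReview` record, the `classify` function, and the "not shippable" label for artefacts that fail the first test are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class ShippabilityReview:
    readable_cold: bool     # test 1: a new team member can read it cold
    claims_traceable: bool  # test 2: an auditor can trace every claim to a source
    recovery_known: bool    # test 3: an operator knows what to do when it breaks


def classify(review: ShippabilityReview) -> str:
    """Map the three test results to the stage described in the text."""
    if review.readable_cold and review.claims_traceable and review.recovery_known:
        return "production"
    if review.readable_cold:
        return "research"
    return "not shippable"  # assumed label; the text does not name this case


memo = ShippabilityReview(readable_cold=True, claims_traceable=True, recovery_known=False)
print(classify(memo))
```

A memo that reads clearly and traces to its sources but has no recovery procedure still classifies as research here, because production requires all three.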


Versioning, Ownership, and Stewardship

Three operational notes that affect how the catalog is actually run:

Versioning. Cards and memos version semantically: v1.2.0 is a clarification; v2.0.0 is a contract change. Datasets and dashboards version by date: 2026-07-08 plus a changelog entry. The version is part of the artefact's name (BB-Churn-Model-Card_v2.1.0), not a hidden Git tag.
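Because the version lives in the name, it can be read back out of the name. The sketch below parses the semantic version from a card name and flags a contract change as a major-version bump. The `_v<major>.<minor>.<patch>` suffix follows the example above; the `<stem>_<date>` pattern for dated artefacts and the `BB-Churn-Dashboard` name are assumptions made for illustration.

```python
from __future__ import annotations

import re
from datetime import date

# Matches names such as "BB-Churn-Model-Card_v2.1.0".
SEMVER_NAME = re.compile(r"^(?P<stem>.+)_v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)$")


def parse_card_name(name: str) -> tuple[str, tuple[int, int, int]]:
    """Split a versioned card name into its stem and (major, minor, patch)."""
    match = SEMVER_NAME.match(name)
    if match is None:
        raise ValueError(f"no semantic version in artefact name: {name!r}")
    version = (int(match["major"]), int(match["minor"]), int(match["patch"]))
    return match["stem"], version


def is_contract_change(old_name: str, new_name: str) -> bool:
    """A contract change shows up as a higher major version."""
    return parse_card_name(new_name)[1][0] > parse_card_name(old_name)[1][0]


def dated_name(stem: str, on: date) -> str:
    """Build a date-stamped name for a dataset or dashboard (assumed pattern)."""
    return f"{stem}_{on.isoformat()}"


print(parse_card_name("BB-Churn-Model-Card_v2.1.0"))
print(is_contract_change("BB-Churn-Model-Card_v1.2.0", "BB-Churn-Model-Card_v2.0.0"))
print(dated_name("BB-Churn-Dashboard", date(2026, 7, 8)))  # hypothetical dashboard
```

Names without a version raise an error rather than defaulting silently, which keeps unversioned artefacts out of the catalog.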

Ownership rotation. Roles are durable; people are not. Every artefact card should list the role that owns it ("Customer Analytics retention lead"), with the current incumbent in a separate, easily-updated field. When the incumbent rotates, the artefact transfers cleanly.

Deprecation as a first-class operation. Artefacts that are no longer in use should be deprecated explicitly, with the reason and the replacement (if any) named. An undeprecated zombie artefact is a permanent footnote on the catalog; an explicitly deprecated artefact has been retired with dignity.

These three practices — versioning, role-based ownership, explicit deprecation — separate an analytics organization that learns from one that re-litigates last year's findings every quarter.
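Role-based ownership and explicit deprecation can both be expressed as operations on a catalog entry. The following is a minimal sketch under stated assumptions: the `CatalogEntry` class, its field names, the "incoming retention lead" placeholder, and the `v3.0.0` replacement card are hypothetical and not part of the book's catalog.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntry:
    name: str
    owner_role: str                           # durable: survives rotation
    incumbent: str                            # current person; easy to update
    deprecated: bool = False
    deprecation_reason: Optional[str] = None
    replaced_by: Optional[str] = None         # replacement artefact, if any

    def rotate(self, new_incumbent: str) -> None:
        """Hand the artefact to a new person without touching the owning role."""
        self.incumbent = new_incumbent

    def deprecate(self, reason: str, replaced_by: Optional[str] = None) -> None:
        """Retire the artefact explicitly; a stated reason is mandatory."""
        if not reason.strip():
            raise ValueError("deprecation needs a stated reason")
        self.deprecated = True
        self.deprecation_reason = reason
        self.replaced_by = replaced_by


card = CatalogEntry(
    name="BB-Churn-Model-Card_v2.1.0",
    owner_role="Customer Analytics retention lead",
    incumbent="Priya Sharma",
)
card.rotate("incoming retention lead")  # placeholder for the next incumbent
card.deprecate(
    "superseded by a contract change",
    replaced_by="BB-Churn-Model-Card_v3.0.0",  # hypothetical successor
)
print(card)
```

Keeping the incumbent in its own field means a rotation is a one-line update, and refusing an empty reason keeps deprecations from becoming silent deletions.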



Concept check

Three short questions on the data product view.

  1. An analyst leaves the firm. Three months later, no one can re-run her retention analysis. The structural fault is most likely:
  2. Three slightly different versions of "the churn model" sit on different laptops. The cleanest fix is:
  3. An artefact's owner field reads "Priya Sharma." Priya rotated to a new team last quarter. The structural fault is: