§17.4
Final Integrative Case: The Bean & Basket Expansion
The book has spent twenty-five chapters teaching methods one at a time. The methods are useful one at a time. The real test, and the reason for the book, is whether a manager can hold them all in mind at once when an actual strategic decision arrives — the kind that has a structured-data ingredient and a visual ingredient and a causal ingredient and a predictive ingredient and a customer-voice ingredient and an operating ingredient. This article is that test.
We pose one strategic decision and walk down the ladder. By the end, every Part of the book will have contributed; a single memo will pull the contributions together; and the reader will have seen what it looks like to deploy the full evidence stack against one question.
The Executive Question
Bean & Basket has the option to open in roughly fifty new mid-sized US cities over the next two years. Which markets, in what sequence, and on what conditions?
The question is genuinely hard. Every Part of the book contributes a piece, and no Part can answer it alone.
Where We Are on the Ladder
This is a multi-rung question: it does not sit on one rung but touches every one.
The decision ladder
The walk-through that follows takes the rungs in order. Each section names the Part of the book it pulls from, the artefact it produces, and the question it can and cannot answer.
Part I — Describing the Candidate Markets
The first question is mechanical: what do we know about each candidate market?
The team assembles a candidate-market table at the city level. Each row is a metropolitan statistical area (MSA). Columns include demographics (population, median income, age distribution), commercial geography (retail density, competitor store counts, foot-traffic estimates), and existing Bean & Basket presence (current customer base from the loyalty program, online sales penetration).
The Part I work is to:
- Define the unit of analysis (MSA, not store; the store-level decision comes later).
- Build the data dictionary that documents every column's source, definition, and trust level (§2.4, §2.5).
- Run the Data Language Studio (§2.6) on the candidate-market table to produce a clean metric brief.
What this produces: a clean, named, versioned candidate-market dataset with documented provenance. About 80 candidate MSAs survive a basic filter (population threshold, no current store).
What this doesn't produce: any opinion about which markets to enter. Description is not yet decision.
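The basic filter above can be sketched in a few lines. This is a minimal illustration, not the team's actual pipeline: the `CandidateMarket` fields, the `MIN_POPULATION` value, and the three rows are all assumptions made up for the example, since the text specifies only "a population threshold" and "no current store".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateMarket:
    msa: str                 # metropolitan statistical area (the unit of analysis)
    population: int
    median_income: int
    has_current_store: bool


# Hypothetical threshold; the case does not state the actual cutoff.
MIN_POPULATION = 250_000


def basic_filter(markets, min_population=MIN_POPULATION):
    """Keep MSAs at or above the population threshold that have no current store."""
    return [
        m for m in markets
        if m.population >= min_population and not m.has_current_store
    ]


# Made-up rows for illustration only.
markets = [
    CandidateMarket("Alpha MSA", 410_000, 61_000, False),
    CandidateMarket("Beta MSA", 180_000, 58_000, False),   # below the threshold
    CandidateMarket("Gamma MSA", 530_000, 72_000, True),   # already has a store
]
survivors = basic_filter(markets)
print([m.msa for m in survivors])
```

Keeping the threshold as a named parameter means the data dictionary can record it alongside each column's source and definition.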
Part II — Seeing the Pattern
Before any model, the team builds a small set of visual evidence views (§5–§8):
- A map of candidate MSAs sized by population and colored by an early opportunity score.
- Small multiples of historical Bean & Basket-style demographic signatures across existing stores, so the candidate markets can be visually compared to existing successful regions.
- An interval-plot view of existing store revenue by market type, with confidence intervals — what range of first-year revenue is plausible based on history?
- A dashboard storyboard (§4.5) that walks an executive through: here is the candidate pool → here is where we already work → here is the overlap → here is where the white space is.
The visual decision brief (§4.6) is the first managerial artefact: an executive-facing summary of where in the candidate pool the strongest visual cases live, before any modelling.
What this produces: a shared visual map, an executive briefing with five charts that drive the next conversation, and a short list of markets with the most plausible-looking visual case.
What this doesn't produce: causal claims. The visual cases are associations.
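The interval-plot view rests on a simple calculation: a mean and an interval around it for each market type. The sketch below uses a normal approximation, which is an assumption on our part (with only a handful of stores per type, a t-based interval would be wider), and the revenue figures are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def revenue_interval(revenues, level=0.95):
    """Normal-approximation confidence interval for the mean of `revenues`.

    Returns (low, centre, high).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(revenues)
    half_width = z * stdev(revenues) / sqrt(len(revenues))
    return centre - half_width, centre, centre + half_width


# Made-up first-year revenue per existing store ($ thousands), by market type.
history = {
    "college town": [820, 910, 760, 880, 845],
    "office hub": [690, 1_050, 540, 930],
}
intervals = {kind: revenue_interval(values) for kind, values in history.items()}
for kind, (low, centre, high) in intervals.items():
    print(f"{kind}: {centre:.0f} [{low:.0f}, {high:.0f}]")
```

A wide interval for a market type is itself a finding: it tells the executive that history says little about that type.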
Part III — What We've Learned From Past Expansion
Bean & Basket has expanded before. The Part III question: what does our history tell us about which past expansions actually worked, and what generalizes to the candidate markets?
A few moves:
- Difference-in-differences (§7.1) on past city launches: for each historical expansion, compare the change in the entered market's local coffee/foodservice metrics with the change in a comparison group of similar non-entered cities. Did the entry move regional indicators, or was the entered market already on that trajectory?
- Heterogeneous treatment effects (§7.3) — break down past expansion outcomes by market type (college towns vs. office hubs vs. dense urban infills vs. suburban). Which types delivered the strongest first-year revenue per square foot?
- Synthetic control (§7.2) on a flagship past expansion — what would have happened in that market without our entry, and how confident is the team in that counterfactual?
- An identification memo (§6.2) documenting the assumptions: parallel trends between entered and comparison markets, a donor pool that tracks the flagship market before entry, no contemporaneous shocks, and the threats to each along with how the team treats them.
What this produces: a credible read on which kinds of markets the firm has historically succeeded in, and a written record of the identification logic.
What this doesn't produce: a recommendation for any specific candidate. Past expansion outcomes inform the prior; they don't pick the target.
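The arithmetic behind a difference-in-differences estimate is short enough to show. The sketch below is the simplest two-group, two-period version, with invented index values; the function and variable names are ours, and a real analysis would add standard errors and a check of pre-entry trends.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences.

    (change in the entered markets) minus (change in the comparison markets).
    """
    change_treated = mean(treated_post) - mean(treated_pre)
    change_control = mean(control_post) - mean(control_pre)
    return change_treated - change_control


# Made-up local foodservice index values, before and after a past launch.
entered_pre, entered_post = [100, 102, 101], [112, 114, 116]
comparison_pre, comparison_post = [98, 99, 100], [103, 104, 105]

effect = did_estimate(entered_pre, entered_post, comparison_pre, comparison_post)
print(effect)
```

The estimate reads as an effect of entry only if the comparison markets show what would have happened to the entered markets without it, which is the parallel-trends assumption the identification memo has to defend.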
Part IV — Ranking the Candidates
With the Part III prior in hand, the team builds a predictive ranking of the 80 candidate markets.
The Part IV machinery:
- A supervised model trained on historical store-level first-year revenue, with features from demographics, geography, competitor density, and Bean & Basket's online customer penetration in the area (§14, §10.3 for numeric prediction).
- A classification companion that scores each candidate market on the probability of clearing a profitability threshold within 18 months (§10.1, §10.2 for the threshold-profit logic).
- Segmentation of the candidate pool (§11.1) to group similar markets together — large coastal college towns, mid-sized Sun Belt office hubs, dense Midwest urban infills, etc. The segments will inform the rollout sequencing, not just the ranking.
- A perceptual map of candidate markets along value-vs-premium and convenience-vs-experience dimensions (§11.2), so the team can see which white spaces the firm is best positioned to occupy.
- A model card (§10.5) documenting the predictive model's known failure modes — particularly the markets where the historical training data is thinnest.
What this produces: a ranked list of the 80 candidates by expected first-year revenue and by probability of clearing the profitability threshold, grouped by market type, with model-card-documented uncertainty.
What this doesn't produce: an account of what the local market's customers are actually saying about coffee, brand, or service. Predictive scores are about patterns; the texture of each market is in its language.
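Once the models have scored each candidate, producing the grouped ranking is mechanical. The sketch below assumes the scores already exist; `ScoredMarket`, its fields, the 0.6 cutoff, and the three rows are hypothetical stand-ins, not outputs of any model described in the book.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredMarket:
    msa: str
    segment: str
    expected_revenue: float   # hypothetical regression output, $ thousands
    p_clear: float            # hypothetical probability of clearing the threshold


def rank_within_segments(scored):
    """Group markets by segment, then sort each group by expected revenue."""
    groups = defaultdict(list)
    for market in scored:
        groups[market.segment].append(market)
    return {
        segment: sorted(members, key=lambda m: m.expected_revenue, reverse=True)
        for segment, members in groups.items()
    }


def likely_to_clear(scored, min_probability=0.6):
    """Names of markets whose clearing probability meets an assumed cutoff."""
    return [m.msa for m in scored if m.p_clear >= min_probability]


# Made-up scores for illustration only.
scored = [
    ScoredMarket("Alpha MSA", "college town", 860.0, 0.72),
    ScoredMarket("Beta MSA", "office hub", 910.0, 0.41),
    ScoredMarket("Gamma MSA", "college town", 790.0, 0.66),
]
ranking = rank_within_segments(scored)
shortlist = likely_to_clear(scored)
print(ranking, shortlist)
```

In this toy data the market with the highest expected revenue misses the probability cutoff, which is why the text keeps the two scores side by side instead of collapsing them into one.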
Part V — Reading the Local Customer Voice
A candidate market that looks promising on demographics and competitor density may still be a bad fit if the local coffee culture is hostile, the regional press is unfriendly, or the existing competitors enjoy strong loyalty narratives. Part V is how the team reads that texture.
For each top-20 candidate market, the team runs:
- Sentiment and topic modelling (§13.4, §13.5) on local Yelp / Google reviews of existing coffee shops. What do customers in this market value? What complaints recur?
- Embedding-based clustering (§14.3) of local review text to surface the dominant themes by market.
- GPT-as-measurement (§14.4) on the local review corpus, scoring constructs like "brand-as-third-place importance", "willingness to pay for premium", "loyalty to local independents", "convenience over experience."
- Multimodal social-media monitoring (§15.4) — image and text posts about coffee in each market. Are there visual conventions (the matcha latte aesthetic, the third-wave craft narrative) the firm should accommodate?
- A RAG (§15.1) assistant built on the firm's own past market-entry post-mortems, so the team can ask "what did we learn about markets that looked like Greenville?" and get cited answers from the corporate knowledge base.
What this produces: a per-market texture profile — what coffee culture is local, what the team should expect to compete with, what brand voice is likely to land.
What this doesn't produce: a green light. The voice analysis adds texture but does not replace the predictive ranking; it raises or lowers the team's confidence in a given market's rank.
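However the per-review construct scores are produced, turning them into a per-market texture profile is a plain aggregation. The sketch below assumes the scoring step has already run and returned one score per review per construct on a 1 to 5 scale; that scale, the record layout, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def texture_profile(records):
    """Average construct scores per market.

    `records` is an iterable of (market, construct, score) tuples.
    Returns {market: {construct: mean score}}.
    """
    grouped = defaultdict(list)
    for market, construct, score in records:
        grouped[(market, construct)].append(score)

    profile = defaultdict(dict)
    for (market, construct), values in grouped.items():
        profile[market][construct] = mean(values)
    return dict(profile)


# Made-up per-review scores (1 = construct absent, 5 = construct dominant).
records = [
    ("Alpha MSA", "loyalty to local independents", 4),
    ("Alpha MSA", "loyalty to local independents", 5),
    ("Alpha MSA", "willingness to pay for premium", 2),
    ("Beta MSA", "loyalty to local independents", 2),
]
profiles = texture_profile(records)
print(profiles)
```

Averages hide disagreement among reviewers, so a fuller profile would report the number of reviews and the spread next to each mean.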
Part VI — Operating the Rollout
The methods of Parts I–V give the firm a recommendation. Part VI gives it an operating model to roll that recommendation out.
The Part VI work:
- A rollout plan that opens in two waves and defers the rest: 8 markets in the first 6 months across the highest-confidence segment; 12 markets in months 7–18 across the next-highest-confidence segment; the remaining 30 deferred pending learning from waves 1–2.
- An AI workflow card (§16.4) for the per-market voice-monitoring system that runs during the rollout — sentiment, topic, construct, and emerging-issue tracking in each newly entered market.
- A portfolio monitoring view (§17.3) that tracks every new store's revenue, churn, customer voice, and operational metrics against the per-market predictions from Part IV.
- A named on-call rotation for the early stores, with rollback criteria — what would cause us to pause the rollout, what would cause us to close a store, what would cause us to accelerate.
- A decision retrospective cadence: review wave-1 outcomes at month 6, formally update the Part IV model and the Part V voice constructs based on what we learned, before wave-2 markets commit.
What this produces: a rollout that learns as it executes, not one that commits all 50 markets up front based on a static prediction.
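Rollback criteria are easiest to honor when they are written down as explicit thresholds before the first store opens. The sketch below shows one possible shape for the month-6 gate: compare wave-1 revenue with the Part IV predictions and return pause, continue, or accelerate. The 0.8 and 1.1 cutoffs and the revenue figures are hypothetical; the case does not state the actual criteria.

```python
def gate_decision(actual, predicted, pause_below=0.8, accelerate_above=1.1):
    """Compare realized wave-1 revenue with predicted revenue.

    `actual` and `predicted` are per-market revenue lists in the same order.
    The ratio cutoffs are illustrative assumptions, not the firm's criteria.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must cover the same markets")
    ratio = sum(actual) / sum(predicted)
    if ratio < pause_below:
        return "pause"
    if ratio > accelerate_above:
        return "accelerate"
    return "continue"


# Made-up revenue to date ($ thousands) for two wave-1 markets.
print(gate_decision(actual=[70, 75], predicted=[100, 100]))
print(gate_decision(actual=[100, 105], predicted=[100, 100]))
print(gate_decision(actual=[120, 115], predicted=[100, 100]))
```

A real gate would weigh more than revenue (churn, customer voice, operations), but the principle is the same: the thresholds are fixed before the data arrive.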
The Final Memo
The final memo uses the eleven sections of the §17.2 template, each filled from the Parts above. A real memo would fit on one page; what follows is the structure.
The decision memo template — one page, eleven sections
| Section | What it contains |
|---|---|
| Decision | The recommendation in one sentence. Name the action, the unit, the horizon. |
| Context | Why this decision now. The pressure or opportunity that prompted it. |
| Evidence — descriptive | What the data already shows (Parts I–II). One chart, one number. |
| Evidence — causal | What we know about cause and effect (Part III). The identification claim. |
| Evidence — predictive | What we expect to happen (Part IV). The model and its uncertainty. |
| Evidence — AI / unstructured | What customer or document text adds (Part V). Constructs and grounded answers. |
| Counterfactual | What happens if we do nothing. Always named. |
| Uncertainty | Honest assessment of what could be wrong. Sensitivity to assumptions. |
| Recommendation | The action. The threshold. The named owner. |
| Next test | What we will measure to know whether the action worked. |
| Open questions | What this memo did not answer; what the next memo should. |
Lead with the recommendation. Show one chart, not three. Name the counterfactual, the threshold, and the owner.
A walk-through of how each section gets filled in for the expansion:
- Decision. "Approve a phased entry into 8 wave-1 markets (drawn from the dense-coastal-college-town segment, top of the Part IV ranking, with passing Part V voice profiles) over the next 6 months; commit only $X of the $5X total expansion budget; defer commitments on the remaining 42 markets pending wave-1 learning."
- Context. Why now. Saturation of existing footprint; competitor activity; available capital; corporate growth targets.
- Descriptive evidence. Candidate-market dataset and Visual Decision Brief from Parts I–II.
- Causal evidence. Past-expansion DiD and synthetic-control results from Part III, with the identification memo attached.
- Predictive evidence. Part IV ranking, threshold-clearing probabilities with their uncertainty, segmentation, and model card.
- AI / unstructured evidence. Per-market voice profiles, construct scores, RAG-pulled lessons from past post-mortems.
- Counterfactual. What happens if we delay all 50 markets a year. What happens if we commit all 50 markets now without phasing. Both alternatives modelled.
- Uncertainty. Sensitivity to the historical training-data thinness; sensitivity to per-market competitor behaviour; sensitivity to capital cost assumptions.
- Recommendation. Action, threshold, owner. Phase 1 launches Q1; reassessment gate at month 6.
- Next test. Wave-1 markets' first-year revenue vs. predicted; voice-monitoring KPIs in entered markets; competitor response signals.
- Open questions. Are there market types in the candidate pool that our model is systematically blind to? Should we run an experimental third wave in a high-uncertainty market type to learn its dynamics?
The memo would fit on a single page in production. The team would attach the supporting Parts I–V artefacts as appendices for any reviewer who wants to drill down.
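A team that drafts many memos may want a quick completeness check before one goes to a reviewer. The sketch below lists the eleven sections as keys and reports which are still empty; the key names and the draft are our own illustrative choices, not part of the §17.2 template.

```python
# The eleven memo sections, written as dictionary keys (names are illustrative).
MEMO_SECTIONS = (
    "decision",
    "context",
    "evidence_descriptive",
    "evidence_causal",
    "evidence_predictive",
    "evidence_ai_unstructured",
    "counterfactual",
    "uncertainty",
    "recommendation",
    "next_test",
    "open_questions",
)


def missing_sections(memo):
    """Return the sections that are absent or blank in a draft memo dict."""
    return [name for name in MEMO_SECTIONS if not memo.get(name, "").strip()]


# A deliberately incomplete draft.
draft = {
    "decision": "Approve a phased entry into 8 wave-1 markets over 6 months.",
    "counterfactual": "",
}
print(missing_sections(draft))
```

The check catches only blank sections; whether the counterfactual is named honestly remains a matter for the reviewer.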
Where the Framework Is Hardest
A few honest places where the book's framework strains under a real decision like this one:
- Past data is thin in the relevant range. Bean & Basket has done expansion before, but never at this scale and never into all the segments of the candidate pool. The Part IV model's confidence intervals are largest precisely where the decisions are biggest.
- Counterfactuals are fuzzy at the strategic level. "What if we delay a year" is hard to model with anything like the rigour of a campaign holdout. The team has to be explicit about the gap.
- AI workflows are an addition, not a substitute. The Part V voice analysis adds texture; it doesn't override Part IV's ranking when the two disagree. Knowing how to weight the two is judgment, not algorithm.
- The operating system is mostly unbuilt. The §17.3 portfolio monitoring described above assumes a level of infrastructure most firms haven't built. The book describes the target; the firm has to fund the path.
These limits aren't failures of the framework. They are honest descriptions of where structured analysis ends and managerial judgment begins. The book's strongest claim is that the frame — the artefact family, the decision ladder, the evidence stack — keeps the judgment honest.
The Artefacts That Survive the Decision
A final inventory. By the time the wave-1 decision is signed, the team has shipped:
The artefact family — five one-page documents that survive the work
Each artefact extends the discipline of the one above. The card you write at §9.1 grows into the memo you sign at §24.1.
Plus, from the Part VI work:
- A complete artefact catalog entry for the expansion decision portfolio.
- A portfolio monitoring view that runs throughout the rollout.
- A decision retrospective scheduled for month 6.
- A model card and workflow card for every system in production.
- The supporting Studios (Visual Brief, Pricing & Promotion, Customer Intelligence, Customer Voice) refreshed against the new markets as they go live.
This is what a complete pass through the evidence stack looks like when the question is hard enough to need it.
Coda — The Discipline Outlives the Methods
A closing observation worth holding onto. Every specific method in the book will look slightly different in five years. The model architectures will improve. The LLM landscape will move. The visualization tools will get better. The vendors will rotate.
The frame won't. Name the decision. Name the counterfactual. Name the threshold. Show the evidence in the language that fits the question. Ship the artefact. Monitor the portfolio. Retrospect on the cadence. Learn.
That is what the book has been training. The methods are the language of the moment. The discipline is what the manager keeps.
The case packs that accompany this article — the Bean & Basket expansion datasets, the worked partial solutions, the teaching notes — give the reader the chance to run this case themselves. The framework is now the reader's to use.