§0.0

Foreword: How to Read This Book

Modern businesses do not suffer from a lack of data. Data is generated almost every time work happens: a customer searches, buys, returns, complains, subscribes, cancels, scans a badge, opens an app, talks to support, receives a recommendation, or asks an AI assistant to summarize a document. The challenge is not recognizing that this data exists; it is understanding what kind of business trace it leaves, where it resides, what decisions it can support, and what makes those decisions defensible.

This book evolved from teaching notes for the Data-Driven Decision Making (D3M) course at NYU Stern. It stems from a simple observation: although managers are taught more analytics every decade, business decisions do not necessarily improve. Methods get more sophisticated, dashboards multiply, and AI announcements pile up. Yet firms still run campaigns that only appear to work, ship prediction models divorced from action, mistake polished visualizations for evidence, and automate workflows without mechanisms to detect failure.

The core wager of this book is that the missing skill isn't another technical method. It is a practical operating discipline — one that connects the business question to the data trace, the data trace to the evidence, and the evidence to a decision someone can actually own.

What This Book Is

This is a textbook for MBA-level managers and analysts who want to be fluent in modern data work. The goal is not necessarily to build every model from scratch, but to commission, evaluate, deploy, and govern data work intelligently.

By the end of this book, you should be able to:

Identify where business data is generated in ordinary human and organizational activity.
Distinguish between the systems that store data for transactions, analytics, search, AI, and governance.
Route a business question to the appropriate evidence language: dashboard, diagnostic view, causal design, prediction model, recommendation system, or AI workflow.
Interpret the standard visuals and artifacts that make each method actionable.
Anticipate and recognize the standard failures of each workflow before they result in executive mistakes.
Translate analysis into durable artifacts: metric cards, task contracts, model cards, AI workflow cards, dashboards, and decision memos.

While we cover the latest algorithms and AI models, this is not a textbook for aspiring data scientists seeking to derive equations from scratch. Mathematics appears only when it clarifies business meaning. Code and empirical exercises live in the accompanying case packs; the chapter prose remains strictly focused on managerial intuition and practical application.

The Practical Stance

How to Read It

There are three ways to approach this material.

Cover to cover. The arc is deliberate. Part I teaches data language. Part II teaches visual evidence. Part III teaches causal decisions. Part IV teaches prediction and segmentation. Part V teaches unstructured data, embeddings, and AI workflows. Part VI turns the pieces into a cohesive operating system.

Question-first. If you arrive with a specific business question, route it first. What happened? starts in Parts I and II. Did we cause it? starts in Part III. What will happen next? starts in Part IV. What are customers saying? starts in Part V. How do we run this repeatedly? starts in Part VI.

Method reference. Each chapter can be read on its own. Use the table of contents as an index when you need to look up a specific method, but always keep asking what decision that method is supposed to improve.

A Note on Cases

This book leans on one continuous fictional company, Bean & Basket Coffee, to illustrate core concepts, alongside a large set of standalone data cases. The standalone cases provide a testing ground for methods where real-world datasets teach the lesson better than our house brand.

Part II utilizes a market-concentration and ad-spend case, as well as the Progresso soup-scanner panel, which subsequently anchors the pricing and causal chapters. Part III's causal work builds on Progresso alongside the Milk and Zillow Colorado quasi-experiments. Part IV brings in the RentHop listings and New York Lottery ZIP psychographics cases. Finally, Part V analyzes real text corpora — Trump-tweet authorship, Goose Island acquisition sentiment, and Amazon review topics — along with the BAV brand-survey data that powers the Fast-Food Perceptual Map studio.

A Note on the AI Chapters

The AI chapters are built around structural patterns that will outlast any single model release: embeddings, retrieval, structured outputs, evaluations, human review, tool use, monitoring, and governance. The technical details of model selection will inevitably change. The managerial discipline should not.

The right question is not "which model is best?" in the abstract. The right question is: what workflow is this model part of, what evidence does it produce, how will we evaluate it, and who is responsible when it fails?

What You Will Have at the End

By the time you finish Part VI, you should hold five things you did not have at the start:

A map of the modern business data system, tracing from source activity to storage, to evidence, to action.
A vocabulary for naming the required evidence language for any business question.
A toolkit of standard analytical methods and the visuals that interpret them.
A portfolio of artifacts that turn loose analysis into processes the firm can audit, reproduce, refresh, and govern.
A stance: that methods serve decisions, evidence precedes equations, and monitoring closes the loop.

Code & Data Repository

All data used in the case studies, along with the corresponding code in R, Python, and SQL, is available in the book's GitHub repository.

Vishal Singh
Stern School of Business, NYU