§12.1

From Segments to Targeting: Ad Platforms and Lookalikes

A clustered segmentation lives in an analyst's notebook. A targetable audience lives on an ad platform. The bridge between the two is the topic of this chapter, and it is where most of the algorithmic ideas of the previous chapters become operational. Targeting is segmentation that has been wired into an action: a campaign, a budget, a creative, an auction.

We work through the bridge in four parts. First, the taxonomy of targeting that every major ad platform exposes — location, demographic, interest, behavior, custom, lookalike. Second, the under-the-hood explanation of what a lookalike audience really is (nearest-neighbour scoring at platform scale). Third, the reach-vs-similarity dial that every campaign has to set. Fourth, the funnel that retargeting is designed to nudge.

The chapter draws on the digital-marketing case material in case/part4/Facebook_AD_Platform.Rmd. We deliberately keep the discussion platform-agnostic. The names and surface details rotate; the underlying ideas don't.

The Executive Question

Now that we know who looks like our best customers, how do we put that knowledge in front of media buying — and how broadly should we cast the net?

The second clause is the hard one. A perfectly targeted audience that contains 800 people will not move revenue. A broad audience that contains everyone is just untargeted advertising. The dial between the two is the central decision.

The Targeting Taxonomy

Every major ad platform exposes a similar set of targeting families. The vocabulary varies; the structure does not.

Ad-platform targeting — six families layered to find a customer

Location

· country
· city / zip
· radius
· residents vs. visitors

Demographic

· age
· gender
· education
· job title
· life events

Interest

· hobbies
· entertainment
· shopping
· sports
· cuisine

Behavioral

· past purchases
· device usage
· travel
· site visits

Custom Audiences

· site visitors
· email list
· app users
· CRM upload

Lookalike

· seed audience
· similarity threshold
· reach setting

Layering families is the standard targeting strategy. Layer too many and the audience disappears; too few and the audience is everyone.

Figure 1. Six families of targeting that combine to define an audience. The first four are inferences platforms make about users; the last two are bridges from the firm's own data (custom audiences) and from the algorithm's similarity model (lookalike audiences).

Two practical notes on how the families combine:

Layering tightens. Each additional filter (women, in Brooklyn, who like espresso, who shop online) shrinks the audience multiplicatively. Two or three filters are usually enough; five is usually too many.
The families have different freshness profiles. Demographic data updates rarely; behavioural data updates by the hour. A campaign that depends heavily on behavioural targeting needs the platform to keep its signals fresh — which is a quiet dependency on the platform itself.
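
The multiplicative shrinkage is easy to check with back-of-the-envelope arithmetic. The sketch below assumes the filters are statistically independent, which real filters usually are not, and uses an invented base population and invented pass rates. The function name and every number are illustrative, not platform figures.

```python
from math import prod

def layered_audience_size(base_population: int, pass_rates: list[float]) -> int:
    """Estimate audience size after stacking filters.

    Assumes the filters are independent, so their pass rates multiply.
    """
    return round(base_population * prod(pass_rates))

# Hypothetical numbers, for illustration only.
base = 10_000_000
filters = {
    "women": 0.50,
    "in Brooklyn": 0.01,
    "like espresso": 0.10,
    "shop online": 0.40,
}

# Add one filter at a time and watch the audience shrink.
rates = []
for name, rate in filters.items():
    rates.append(rate)
    print(f"+ {name:<14} -> {layered_audience_size(base, rates):>10,}")
```

With these made-up rates, four filters take ten million users down to about two thousand, which is why a fifth filter so often leaves nothing to target.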

The two families that map most directly to ideas from Part IV are custom audiences and lookalike audiences.

Custom Audiences: Bring Your Own Segment

A custom audience is what you get when the firm hands the platform a list of its own customers (or website visitors, app users, CRM records) and asks it to show ads only to those people. It is the operational form of the segmentations from §11.1.

The most common variants:

Customer list upload. Hashed email addresses or phone numbers from the firm's CRM, matched on the platform's side.
Website visitors. Built from pixel-based tracking on the firm's site, often segmented by which pages they visited.
App users. Built from SDK-based tracking inside the firm's app.
Engagement audiences. People who have interacted with the firm's content on the platform itself.
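
For the customer list upload, the hashing step can be sketched as follows. This is a minimal illustration that assumes the platform expects SHA-256 digests of trimmed, lowercased email addresses. Normalization rules and accepted hash formats differ by platform, so the helper names and the rules here are assumptions to check against the platform's own documentation.

```python
import hashlib

def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase, so equal addresses hash equally."""
    return email.strip().lower()

def hash_email(email: str) -> str:
    """Return the SHA-256 hex digest of the normalized address."""
    normalized = normalize_email(email)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical CRM export with inconsistent formatting.
crm_emails = [" Alice@Example.com", "alice@example.com ", "bob@example.com"]

# De-duplicate after hashing: the two Alice rows collapse into one record.
hashed = sorted({hash_email(e) for e in crm_emails})
print(len(hashed))  # 2
```

Normalizing before hashing matters because a hash is all-or-nothing: a stray space or capital letter produces a completely different digest, and the record silently fails to match.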

Custom audiences are the right tool for retention and retargeting. They are not a growth tool, because they cap out at the size of the firm's existing relationships.

Lookalike Audiences: Nearest Neighbours at Platform Scale

A lookalike audience asks the platform to find users who resemble a seed audience the firm uploaded. The seed is usually a custom audience of high-value customers — converters, repeat buyers, premium tier, whatever the firm wants more of.

Stripped of platform-specific language, the recipe is exactly what Chapter 11 prepared us to expect:

The platform represents every user as a vector of attributes — demographics, interests, behaviours.
It computes an aggregate representation of the seed audience (e.g., the centroid, or a learned classifier).
It scores every user on the platform for similarity to that representation.
It returns the top-X% of users by similarity as the lookalike audience.

In other words: lookalikes are k-nearest-neighbour (or learned-classifier) inference applied at platform scale, against features the firm cannot see. That last point is the crucial one. The advertiser has no access to the features used to compute similarity, no ability to audit them, and limited recourse if the model is unfair, has drifted, or is simply wrong.
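
The four-step recipe can be sketched in a few lines. This toy version assumes a centroid representation of the seed and cosine similarity as the score. Real platforms use features the advertiser never sees and very likely a learned model, so the three-dimensional vectors, user ids, and function names below are purely illustrative.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def lookalike(users, seed_ids, top_fraction):
    """Rank non-seed users by similarity to the seed centroid; keep the top fraction."""
    seed_center = centroid([users[uid] for uid in seed_ids])
    scored = [
        (cosine(vec, seed_center), uid)
        for uid, vec in users.items()
        if uid not in seed_ids
    ]
    scored.sort(reverse=True)
    keep = max(1, math.ceil(top_fraction * len(scored)))
    return [uid for _, uid in scored[:keep]]

# Hypothetical users described by three made-up attribute scores.
users = {
    "u1": [1.0, 0.0, 1.0],
    "u2": [0.9, 0.1, 1.0],
    "u3": [0.0, 1.0, 0.0],
    "u4": [1.0, 0.1, 0.8],
    "u5": [0.1, 0.9, 0.2],
    "u6": [0.5, 0.5, 0.5],
}
seed = {"u1", "u2"}

print(lookalike(users, seed, top_fraction=0.25))  # ['u4']
print(lookalike(users, seed, top_fraction=0.50))  # ['u4', 'u6']
```

Widening `top_fraction` from 0.25 to 0.50 admits `u6`, a noticeably weaker match than `u4`. That is the reach-versus-similarity trade-off in miniature.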

The Reach–Similarity Dial

Lookalike audiences come with a percentage knob: how broadly should the platform cast its net?

Reach vs. similarity — the lookalike dial

Figure 2. The reach–similarity curve for a lookalike audience. A 1% audience is closely matched to the seed but small; a 10% audience reaches an order of magnitude more people, at much lower average similarity. The right setting depends on what you want the campaign to do.

A useful way to think about the trade-off:

A 1% audience is the right default for conversion campaigns — when each impression has a high cost and the firm wants the best chance of a paid action. Closely matched users convert better; the small audience size means the campaign has to be paired with sufficient creative variation or run for long enough to learn.
A 5–10% audience is the right default for awareness or top-of-funnel campaigns — when the goal is to expand reach into adjacent segments, and a lower conversion rate is acceptable in exchange for a much larger audience.
A blended approach — running the 1% as a conversion campaign and the 5–10% as an awareness campaign in parallel — is the standard way to use the dial in production.

The shape of the curve is a property of the platform and the seed. A high-quality, large, distinctive seed pushes the curve up — there is more "lookalike signal" to spend. A small or generic seed flattens the curve quickly.
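
To make the dial concrete, the sketch below ranks a population by similarity score and reports how average similarity falls as the reach setting widens. The scores come from an invented decay formula chosen only to give a few close matches and a long tail. An actual curve depends on the platform's hidden model and on the seed.

```python
def dial(scores, fractions):
    """For each reach fraction, return (fraction, audience size, mean similarity)."""
    ranked = sorted(scores, reverse=True)
    rows = []
    for fraction in fractions:
        size = max(1, round(fraction * len(ranked)))
        audience = ranked[:size]
        rows.append((fraction, size, sum(audience) / size))
    return rows

# Hypothetical similarity scores for 1,000 users: a few close matches, a long tail.
scores = [1.0 / (1 + rank / 20) for rank in range(1000)]

for fraction, size, mean_similarity in dial(scores, [0.01, 0.05, 0.10]):
    print(f"{fraction:>4.0%}  size={size:>4}  mean similarity={mean_similarity:.2f}")
```

Each step along the dial buys more people at a lower average match. A flatter score distribution, which is what a small or generic seed would tend to produce, makes that decline start sooner.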

Retargeting and the Funnel

Custom audiences become especially powerful when paired with the funnel customers move through. Retargeting is the practice of showing different ads to users at different stages.

A simplified ad funnel — retargeting layers nudge each stage

Retargeting custom audiences re-engage users who reached a particular stage but didn't convert.

Figure 3. A simplified ad funnel. Retargeting layers each have a job: re-engage users who clicked but didn't visit; bring back visitors who didn't add to cart; rescue carts that didn't convert. The right creative differs at each step.

The natural retargeting strategy is to define a custom audience per stage and serve a campaign whose creative is tuned to the next step. A cart-abandonment campaign can show the specific product left behind, with a time-bound incentive. A site-visit retargeting campaign can show a creative about the value proposition, not about closing.

A common failure mode is treating retargeting as a single audience with a single creative. It is more useful to think of it as a small portfolio of mini-campaigns, each addressed to a different stage of the funnel.
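
The portfolio view can be sketched as a grouping step: bucket each non-converting user by the furthest stage reached, then attach a creative to each bucket. The stage names, event labels, and creative descriptions below are hypothetical. In practice the events would come from the pixel or SDK tracking described earlier.

```python
# Funnel stages in order, shallowest first. Names are illustrative.
STAGES = ["click", "visit", "add_to_cart", "purchase"]

# Hypothetical creative for users who stalled at each stage.
CREATIVE_FOR_STALLED_AT = {
    "click": "reminder ad linking back to the site",
    "visit": "value-proposition ad",
    "add_to_cart": "product left in the cart, with a time-bound incentive",
}

def furthest_stage(events):
    """Return the deepest funnel stage present in a user's events, or None."""
    reached = [stage for stage in STAGES if stage in events]
    return reached[-1] if reached else None

def build_retargeting_audiences(user_events):
    """Group non-converters by the furthest stage they reached."""
    audiences = {stage: [] for stage in CREATIVE_FOR_STALLED_AT}
    for user_id, events in user_events.items():
        stage = furthest_stage(events)
        if stage in audiences:  # skips purchasers and users with no events
            audiences[stage].append(user_id)
    return audiences

# Hypothetical event log: user id -> set of observed events.
user_events = {
    "a": {"click"},
    "b": {"click", "visit"},
    "c": {"click", "visit", "add_to_cart"},
    "d": {"click", "visit", "add_to_cart", "purchase"},
    "e": set(),
}

for stage, members in build_retargeting_audiences(user_events).items():
    print(f"{stage:<12} {members}  ->  {CREATIVE_FOR_STALLED_AT[stage]}")
```

Purchasers drop out of every retargeting audience, and each remaining user lands in exactly one bucket, so no one is served two competing creatives.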

AI-Powered Optimization

Modern ad platforms have introduced AI-assisted layers — variously called Advantage+, Performance Max, Smart Bidding, depending on the platform — that automatically:

Generate creative variations and serve the best performers.
Choose placements (feed, story, search, display) on the fly.
Expand the targeting beyond the advertiser's explicit selections when it predicts performance will improve.
Bid in real time against an auction.

Two managerial implications follow:

The targeting decision becomes a constraint, not a recipe. The advertiser sets the seed (audiences), the budget (constraint), and the objective (conversion, sale, click). The platform fills in the operational details.
The evaluation language changes. With auto-creative and auto-placement, the experiment is no longer "does ad A beat ad B?" It is "does my budget plus my seed plus the platform's automation beat what I had before?" The unit of evaluation is the campaign-quarter, not the creative pair.

This is the place where digital advertising and the machine-learning lifecycle from §9.1 meet. The platform is running a continuous-learning loop on the advertiser's behalf. The advertiser's job is to define the loop's objective and to monitor whether the loop is doing what was asked.