§3.3

Case Study: Market Concentration Metrics

Imagine the analyst is sitting inside the FDA's Office of Prescription Drug Promotion. The office's mission is not to run merger review. It is to make sure prescription drug promotion is truthful, balanced, and accurately communicated. But before reviewing a message environment, the analyst has to ask a prior quantitative question: whose voices dominate the advertising market? If a small number of owners account for most public-facing spend in a category, surveillance, sampling, and compliance priorities should look different than they would in a fragmented field.

The executive question: how concentrated is market voice across industries?

This case uses the same parquet file behind the Industry Ad Spend Explorer, but the metric question is different. The studio asks how advertising budgets moved through the Covid shock. Here the question is cross-sectional: which source-defined markets have concentrated advertising voice, and how sensitive is that answer to the definition of "market" and "firm"?

The economic intuition comes from industrial organization. Market power is not observed directly in a descriptive ad-spend table. Instead, the analyst chooses a market boundary, chooses a firm boundary, converts dollars into shares, and summarizes the share distribution. That is why concentration metrics are excellent teaching examples for a visualization chapter: the chart is only as honest as the metric definition underneath it.

The case borrows reference thresholds from the DOJ/FTC 2023 Merger Guidelines, which describe HHI above 1,000 as concentrated and HHI above 1,800 as highly concentrated. Those lines are used here as visual benchmarks, not as legal conclusions. The same guidelines also emphasize that market shares and concentration are most probative when calculated in a properly identified market; a broad ad-spend industry is a useful screen, not a final relevant market.

The metric contract

The source table has two hierarchies that matter. On the firm side it includes PARENT, SUBSIDIARY, ADVERTISER, BRAND, and PRODUCT. On the market side it includes Industry_Group, INDUSTRY, MAJOR, CATEGORY, SUBCATEGORY, and MICROCATEGORY. The headline firm unit in this case is an owner proxy: use PARENT when it is known, and use ADVERTISER when PARENT is the pooled PARENT UNKNOWN bucket. That rule avoids treating many unrelated unknown-parent advertisers as one giant firm.
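The owner-proxy rule can be sketched in a few lines. This is a minimal illustration rather than the code behind the case: the helper name `owner_proxy` and the sample rows are hypothetical, and the sketch assumes the pooled bucket is stored as the literal string `PARENT UNKNOWN` in the PARENT column.

```python
UNKNOWN_PARENT = "PARENT UNKNOWN"  # assumed literal label of the pooled bucket


def owner_proxy(row):
    """Return PARENT when it is known, otherwise fall back to ADVERTISER."""
    parent = row["PARENT"]
    return row["ADVERTISER"] if parent == UNKNOWN_PARENT else parent


# Hypothetical rows for illustration only (not taken from the source data).
rows = [
    {"PARENT": "Acme Holdings", "ADVERTISER": "Acme Soap", "dollar_spent": 500.0},
    {"PARENT": "Acme Holdings", "ADVERTISER": "Acme Polish", "dollar_spent": 300.0},
    {"PARENT": "PARENT UNKNOWN", "ADVERTISER": "Corner Shop A", "dollar_spent": 120.0},
    {"PARENT": "PARENT UNKNOWN", "ADVERTISER": "Corner Shop B", "dollar_spent": 80.0},
]

spend_by_owner = {}
for row in rows:
    if row["dollar_spent"] > 0:  # the case keeps positive spend only
        owner = owner_proxy(row)
        spend_by_owner[owner] = spend_by_owner.get(owner, 0.0) + row["dollar_spent"]

print(spend_by_owner)
```

The two unknown-parent advertisers stay separate entities, while the two Acme advertisers roll up to one owner. Without the fallback, the unknown bucket would appear as a single 200-dollar firm.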

The default market proxy is INDUSTRY, not the broader Industry_Group. That gives 61 source-defined industries, versus 25 broad groups, 292 major buckets, 928 categories, and 1,695 subcategories. The share variable is total positive dollar_spent from January 2018 through December 2022. The data contains 2.98 million positive spend rows over 60 months and about $369B in measured spend. The metrics describe share of advertising voice, not share of sales, prescriptions, patients, or units.

| Card | Value | Note |
| --- | --- | --- |
| Study window | 61 INDUSTRY markets | 25 broad groups; $369.0B in positive 2018-2022 ad spend |
| Owner proxy entities | 237,936 | PARENT when known, otherwise ADVERTISER |
| HHI bands | 14 high / 17 moderate | DOJ/FTC-style thresholds used as visual reference lines |
| Top large industry | HHI 4,992 | Household Soaps, Cleansers & Polishes; 89.2% CR4; $4.58B spend |
| Unknown-parent spend | 1.2% | Split to advertiser instead of treated as one firm |
| Median entity spend | $4.0k | 99th percentile: $8.42M |

Figure 1. The measurement contract matters before the chart does: source-defined industry is the default market proxy, owner proxy is the firm proxy, and ad dollars are the share basis.
Table 1. Concentration metrics are compact summaries of the same share vector. CR1 and CR4 focus on the leaders; HHI uses the whole distribution.

| Metric | Formula | Reads as | Watch out for |
| --- | --- | --- | --- |
| CR1 | largest entity share | How much of the market proxy is controlled by the single largest voice. | Can miss the difference between one dominant firm plus a fringe and four roughly equal leaders. |
| CR4 | sum of the four largest shares | How much of the market proxy is controlled by the leading set of firms. | Ignores how unequal the top four are and ignores the shape of the remaining tail. |
| HHI | 10,000 x sum of squared shares | A full-distribution concentration index that gives more weight to large firms. | Looks precise even when the market definition or ownership hierarchy is only a proxy. |
| Effective entities | 10,000 / HHI | The number of equal-sized firms that would produce the same HHI. | Useful intuition, but not an actual count of competitors. |

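Concentration ratio

$$CR_k = \sum_{i=1}^{k} s_i$$

Herfindahl-Hirschman Index

$$HHI = 10{,}000 \times \sum_{i=1}^{N} s_i^2$$

Here s_i is entity i's share of market spend expressed as a fraction, with entities ordered from largest to smallest, and N is the number of entities in the market.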

The HHI formula squares shares before summing them. That is the reason a market with one 70% firm looks very different from a market with four roughly equal leaders, even if both have visible top-four concentration. The effective-entities transformation, 10,000 / HHI, gives a useful intuition: household soaps, cleansers, and polishes has an HHI of about 4,992, equivalent to roughly 2.0 equal-sized firms.
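A small worked example makes the squaring effect concrete. The sketch below is illustrative only: the function name `concentration_metrics` and the two four-firm markets are hypothetical, chosen so that both have the same CR4 but very different HHI. It also shows where the effective-entities reading comes from: N equal firms each hold a share of 1/N, so HHI = 10,000 x N x (1/N)^2 = 10,000 / N.

```python
def concentration_metrics(spend, k=4):
    """Return CR1, CRk, HHI (0-10,000 scale), and effective entities."""
    total = sum(spend)
    # Shares as fractions, ordered from largest to smallest.
    shares = sorted((x / total for x in spend), reverse=True)
    hhi = 10_000 * sum(s * s for s in shares)
    return {
        "CR1": shares[0],
        f"CR{k}": sum(shares[:k]),
        "HHI": hhi,
        "effective_entities": 10_000 / hhi,
    }


# Hypothetical markets: one dominant firm versus four equal leaders.
dominant = concentration_metrics([70, 10, 10, 10])
balanced = concentration_metrics([25, 25, 25, 25])

print(round(dominant["HHI"]), round(dominant["effective_entities"], 1))  # 5200 1.9
print(round(balanced["HHI"]), round(balanced["effective_entities"], 1))  # 2500 4.0
```

Both markets have a CR4 of 100%, yet the dominant-leader market has roughly twice the HHI. That gap is what CR4 alone cannot show.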

Market definition changes the answer

The broadest market definition is too coarse for the headline. Industry_Group pools many distinct product and service markets, so it makes the economy look mostly fragmented: only 2 of 25 broad groups are highly concentrated, and only 11.3% of spend lies in markets with HHI above 1,000. At INDUSTRY, the median HHI rises to about 1,018 and 36.0% of spend lies in concentrated markets. At MAJOR, CATEGORY, and SUBCATEGORY, concentration becomes the normal pattern because the buckets approach product categories rather than industries.
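The pooling effect can be reproduced on toy data. The sketch below is a hypothetical illustration, not the case's code: the rows are synthetic, the `owner` field stands in for the owner proxy, and the helper names `hhi_by_market` and `spend_share_above` are assumptions. Four industries with three equal owners each are all concentrated on their own, but pooling them into one broad group yields twelve equal owners and a low HHI.

```python
from collections import defaultdict


def hhi_by_market(rows, level):
    """Map each market at `level` to (HHI, total spend) using owner shares."""
    spend = defaultdict(lambda: defaultdict(float))
    for row in rows:
        spend[row[level]][row["owner"]] += row["dollar_spent"]
    result = {}
    for market, owners in spend.items():
        total = sum(owners.values())
        hhi = 10_000 * sum((x / total) ** 2 for x in owners.values())
        result[market] = (hhi, total)
    return result


def spend_share_above(markets, threshold=1_000):
    """Share of all spend that sits in markets with HHI above `threshold`."""
    total = sum(t for _, t in markets.values())
    flagged = sum(t for h, t in markets.values() if h > threshold)
    return flagged / total


# Synthetic rows: 4 industries x 3 equal owners, all inside one broad group.
rows = [
    {"Industry_Group": "group_0", "INDUSTRY": f"industry_{i}",
     "owner": f"owner_{i}_{j}", "dollar_spent": 100.0}
    for i in range(4) for j in range(3)
]

for level in ("Industry_Group", "INDUSTRY"):
    markets = hhi_by_market(rows, level)
    print(level, {m: round(h) for m, (h, _) in markets.items()},
          spend_share_above(markets))
```

At the group level the single market has an HHI near 833 and no spend sits in a concentrated market; at the industry level every market has an HHI near 3,333 and all spend does. The rows never change, only the market boundary.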

Market definition changes the empirical result

The same owner-proxy spend shares are recomputed at progressively narrower source hierarchy levels. Bars show the share of markets in each HHI band; dots mark the median HHI.

Legend: Highly concentrated; Moderately concentrated; Unconcentrated.
Figure 2. Concentration increases as the source market definition narrows; this is a market-definition result, not a charting detail.

This is why the rest of the case uses INDUSTRY as the default. It is narrow enough to avoid the worst pooling of the 25 broad groups, but still broad enough to support a cross-industry comparison without making thousands of tiny categories carry the main story. The narrower levels become sensitivity checks and targeted drilldowns, especially for regulated categories such as prescription drugs.

Concentration is rare, but top-four voice is common

Figure 3 ranks substantial INDUSTRY markets by owner-proxy HHI. A spend floor of $500M is used only for display so tiny source buckets do not dominate the rank chart. The most concentrated large markets include household soaps, cleansers, and polishes; discount department and variety stores; cigarettes and tobacco; department stores; and men's toiletries and skin care. Several are not monopoly-like by CR1, but they have extremely high CR4. Discount department and variety stores is the clearest example: Walmart's leader share is 35.3%, while the top four owners account for 99.2% of measured voice.

HHI by substantial source-defined industry

Owner-proxy ad spend shares, 2018-2022 pooled; shown for INDUSTRY markets with at least $500.0M spend. Each row is a lollipop: stem length and dot color encode HHI.

Legend: Highly concentrated; Moderately concentrated; Unconcentrated.
Figure 3. HHI separates the concentrated top of the ad-spend distribution from the long tail of source-defined industries.
Table 2. The highest-HHI substantial INDUSTRY markets show different structures: one dominant owner in household soaps, a nearly complete top-four structure in discount stores, and high top-four shares in tobacco and department stores.

| Industry | Leader | CR1 | CR4 | HHI | Effective entities | Band |
| --- | --- | --- | --- | --- | --- | --- |
| Household Soaps, Cleansers & Polishes | Procter & Gamble Co | 69.5% | 89.2% | 4,992 | 2.0 | Highly concentrated |
| Discount Department & Variety Stores | Walmart Inc | 35.3% | 99.2% | 3,090 | 3.2 | Highly concentrated |
| Cigarettes, Tobacco & Accessories | British American Tobacco Plc | 38.5% | 90.8% | 2,802 | 3.6 | Highly concentrated |
| Department Stores | Macys Inc | 34.7% | 92.2% | 2,745 | 3.6 | Highly concentrated |
| Toiletries, Hygienic Gds & Skin Care-Men | Procter & Gamble Co | 45.7% | 82.6% | 2,655 | 3.8 | Highly concentrated |
| Misc Merchandise | NOT ITEMIZED-MISC MERCHANDISE | 47.8% | 58.7% | 2,359 | 4.2 | Highly concentrated |
| Beer & Wine | Constellation Brands Inc | 32.5% | 84.8% | 2,274 | 4.4 | Highly concentrated |
| Gasoline, Lubricants (Trans) & Fuels | Exxon Mobil Corp | 40.8% | 72.3% | 2,098 | 4.8 | Highly concentrated |

Figure 4 shows why CR1, CR4, and HHI should be read together. Household soaps is a dominant-leader case: CR1 is 69.5% and CR4 is 89.2%. Discount department and variety stores is a top-four case: the leader share is lower, but CR4 is 99.2%. Medicines and proprietary remedies is the opposite: it has very large spend, but broad ownership and a low HHI at the INDUSTRY level.

CR1 and CR4 ask different concentration questions

Substantial INDUSTRY markets only; bubble size is total ad spend. CR1 is leader share and CR4 is combined top-four share.

Legend: Highly concentrated; Moderately concentrated; Unconcentrated.
Figure 4. CR1 and CR4 reveal leader dominance and top-four dominance, while HHI captures whether the rest of the market is also concentrated.

The FDA-style surveillance implication is practical rather than legal. If a regulated claims team wants broad coverage of a concentrated prescription category, a small set of owners may cover a large share of public-facing spend. If the same team wants broad coverage of a fragmented category, owner-level sampling is a long-tail problem. For prescription drugs, even the INDUSTRY level can still be too coarse; the same logic should be repeated by condition, product class, channel, and claim type.
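The coverage logic can be made concrete with a short sketch. It asks how many of the largest owners are needed to reach a target share of measured spend. The function name `owners_needed` and the 80% target are hypothetical choices; the shares are the top-four owner shares this case reports for two INDUSTRY markets, so the function returns None when those four listed shares never reach the target.

```python
def owners_needed(shares_pct, target_pct):
    """Smallest number of largest owners whose combined share reaches target_pct.

    Returns None when the listed shares never reach the target.
    """
    running = 0.0
    for count, share in enumerate(sorted(shares_pct, reverse=True), start=1):
        running += share
        if running >= target_pct:
            return count
    return None


# Top-four owner shares (percent) reported in this case.
household_soaps = [69.5, 7.8, 7.4, 4.5]
misc_services = [3.9, 2.8, 2.6, 2.0]

print(owners_needed(household_soaps, 80.0))  # 3
print(owners_needed(misc_services, 80.0))    # None
```

In the concentrated market three owners cover more than 80% of measured spend. In the fragmented market the top four cover about 11%, so the same target requires a full owner list and a long-tail sampling plan.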

The FDA-relevant drilldown is narrower than industry

The pharmaceuticals example shows why market definition cannot be delegated to a field name. Holding Industry_Group = Pharmaceuticals, the INDUSTRY level is not concentrated: medicines and proprietary remedies has HHI around 664, and pharmaceutical houses has HHI around 717. But the prescription subcategories are a different story. Diabetes/endocrine medications, arthritis medications, respiratory disorder medications, hematology/immunology medications, and heart disorder medications all cross the high-concentration HHI benchmark in this ad-spend proxy.

Pharma concentration appears after narrowing the market

Industry_Group is fixed to Pharmaceuticals; rows compare the source INDUSTRY level with high-spend prescription subcategories.

Legend: Highly concentrated; Moderately concentrated; Unconcentrated.
Figure 5. Broad pharma is low-HHI, while several high-spend prescription subcategories are highly concentrated.
Table 3. High-spend prescription subcategories illustrate the surveillance value of a narrower market definition.

| Prescription subcategory | Leader | Spend | CR4 | HHI | Band |
| --- | --- | --- | --- | --- | --- |
| Prescription Dermatological Medications | AbbVie Inc | $4.01B | 75.0% | 1,860 | Highly concentrated |
| Prescription Diabetes/Endocrine Medications | Novo Nordisk AS | $3.13B | 91.4% | 2,933 | Highly concentrated |
| Prescription Cancer Therapy Medications | Merck & Co Inc | $2.44B | 80.2% | 1,893 | Highly concentrated |
| Prescription Medications, Multi Condition | Amgen Inc | $2.32B | 59.4% | 1,221 | Moderately concentrated |
| Prescription Arthritis Medications | AbbVie Inc | $2.05B | 95.6% | 4,103 | Highly concentrated |
| Pharmaceutical Houses (Cat) | Glaxosmithkline Plc | $1.96B | 45.1% | 717 | Unconcentrated |
| Prescription Pain/Central Nervous System Medications | AbbVie Inc | $1.89B | 69.6% | 1,550 | Moderately concentrated |
| Prescription Respiratory Disorder Medications | Glaxosmithkline Plc | $1.70B | 99.0% | 3,226 | Highly concentrated |
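The band labels in Table 3 follow from a simple threshold rule. The sketch below is a minimal version, assuming the 1,000 and 1,800 reference lines used in this case and strict "above" comparisons at the boundaries (the boundary handling is an assumption). The function name `hhi_band` is hypothetical; the HHI values are copied from Table 3.

```python
def hhi_band(hhi, moderate=1_000, high=1_800):
    """Label an HHI value using the case's two visual reference lines."""
    if hhi > high:
        return "Highly concentrated"
    if hhi > moderate:
        return "Moderately concentrated"
    return "Unconcentrated"


# HHI values as reported in Table 3.
table3_hhi = {
    "Prescription Dermatological Medications": 1860,
    "Prescription Diabetes/Endocrine Medications": 2933,
    "Prescription Cancer Therapy Medications": 1893,
    "Prescription Medications, Multi Condition": 1221,
    "Prescription Arthritis Medications": 4103,
    "Pharmaceutical Houses (Cat)": 717,
    "Prescription Pain/Central Nervous System Medications": 1550,
    "Prescription Respiratory Disorder Medications": 3226,
}

for name, hhi in table3_hhi.items():
    print(f"{name}: {hhi_band(hhi)}")
```

Two of the highly concentrated rows, dermatological (1,860) and cancer therapy (1,893), sit only slightly above the 1,800 line, so their labels are more sensitive to the market and firm definitions than the arthritis or respiratory rows.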

The firm boundary is not a detail

The largest methodological risk is entity definition. The same underlying rows can be summarized by parent owner, advertiser, or brand. Those definitions are not interchangeable. Parent-level aggregation is closest to a firm-level IO question because it groups multiple brands and advertiser labels under common ownership. Advertiser and brand definitions are useful for marketing execution, but they split corporate families and usually make concentration look lower.

The unit of analysis can change the apparent market structure

Same rows, same INDUSTRY denominator, three entity definitions. Owner proxy aggregates corporate families; advertiser and brand split them.

Legend: Owner proxy; Advertiser; Brand.
Figure 6. Ownership aggregation can materially change the HHI. Household soaps looks highly concentrated at owner level, but much less concentrated at advertiser or brand level.

The biggest example is household soaps, cleansers, and polishes: owner-proxy HHI is about 4,992, while advertiser-level HHI is about 1,008 and brand-level HHI is about 353. That is not a contradiction. It is the measurement hierarchy doing exactly what it is supposed to do. If the question is corporate market power or coordinated control of advertising strategy, split brands are too granular. If the question is creative execution or consumer-facing brand clutter, brand-level concentration may be the right object.
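The same mechanism shows up on a toy market. The sketch below is hypothetical: the rows are synthetic, all parents are assumed known so PARENT serves as the owner proxy, and the helper name `hhi_by_entity` is an assumption. The rows and the denominator stay fixed; only the grouping field changes.

```python
from collections import defaultdict


def hhi_by_entity(rows, entity_field):
    """HHI of one market when spend is grouped by `entity_field`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[entity_field]] += row["dollar_spent"]
    market_total = sum(totals.values())
    return 10_000 * sum((x / market_total) ** 2 for x in totals.values())


# Synthetic single-industry rows: one corporate family with four brands, one rival.
rows = [
    {"PARENT": "Parent P", "ADVERTISER": "P Home Care", "BRAND": "brand_1", "dollar_spent": 30.0},
    {"PARENT": "Parent P", "ADVERTISER": "P Home Care", "BRAND": "brand_2", "dollar_spent": 20.0},
    {"PARENT": "Parent P", "ADVERTISER": "P Laundry", "BRAND": "brand_3", "dollar_spent": 20.0},
    {"PARENT": "Parent P", "ADVERTISER": "P Laundry", "BRAND": "brand_4", "dollar_spent": 10.0},
    {"PARENT": "Parent Q", "ADVERTISER": "Q Cleaners", "BRAND": "brand_5", "dollar_spent": 20.0},
]

for field in ("PARENT", "ADVERTISER", "BRAND"):
    print(field, round(hhi_by_entity(rows, field)))
```

Grouping by parent gives an HHI of about 6,800, by advertiser about 3,800, and by brand about 2,200. Splitting a corporate family can only lower or preserve HHI, because a squared sum of parts is never larger than the square of the whole.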

Thresholds should be shown as sensitivity, not hidden as cleaning

The practical problem with this dataset is the micro-entity tail. The median owner-by-INDUSTRY entity spends only about $4k over the full five-year window, while the 99th percentile spends about $8.4M. It is reasonable to ask whether tiny positive-spend entities should be removed before computing concentration. But the choice is not free: dropping the tail and renormalizing the retained entities mechanically raises HHI.

Entity spend is extremely right-skewed

Owner-by-INDUSTRY total 2018-2022 spend across 265,487 entity rows. Note the log axis: the median entity spends about $4.0k, while the 99th percentile spends about $8.42M.

The long upper tail is why a minimum-spend threshold is tempting and why it must be reported as a sensitivity check: dropping the bottom percentiles and renormalizing the survivors mechanically raises HHI.

Figure 6b. Entity spend is extremely right-skewed: a long upper tail sits above a dense floor of micro-entities, which is exactly the shape that makes a minimum-spend cutoff tempting.

The main metric therefore keeps all positive entity spend. Figure 7 treats thresholds as a sensitivity check. The $1M cutoff barely changes household soaps or medicines and proprietary remedies because each still retains 99.6% of spend. It changes miscellaneous services and amusements more because the cutoff retains only 87.0% of spend. In that setting, the higher HHI is partly a definition artifact.
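The mechanical effect of a cutoff is easy to reproduce. The sketch below is illustrative only: the entity spends are synthetic, the function name `threshold_sensitivity` is hypothetical, and the cutoff is treated as "at or above", which is an assumption about how the $10k+ style labels are defined. Retained spend is reported beside each recomputed HHI.

```python
def threshold_sensitivity(entity_spend, cutoffs):
    """Recompute HHI on entities at or above each cutoff, with retained spend."""
    total = sum(entity_spend)
    results = []
    for cutoff in cutoffs:
        kept = [x for x in entity_spend if x >= cutoff]
        if not kept:
            continue  # nothing survives this cutoff
        kept_total = sum(kept)
        # Shares are renormalized over the retained entities only.
        hhi = 10_000 * sum((x / kept_total) ** 2 for x in kept)
        results.append({
            "cutoff": cutoff,
            "hhi": hhi,
            "retained_spend": kept_total / total,
            "retained_entities": len(kept),
        })
    return results


# Synthetic long-tail market: 10 mid-sized entities plus 1,000 micro-entities.
long_tail = [100] * 10 + [1] * 1000

for row in threshold_sensitivity(long_tail, cutoffs=[0, 10]):
    print(row)
```

Keeping everything gives an HHI of about 252. Dropping the micro-entities raises it to 1,000, but only 50% of spend is retained, so the higher figure describes a different, smaller market rather than a cleaner version of the same one.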

Minimum-spend thresholds are a sensitivity check, not the main definition

HHI is recomputed after retaining only entities above each total-spend cutoff. The same cutoff lands differently on owners, advertisers, and brands.

Entity level: Owner proxy - leader Procter & Gamble Co

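| Min spend | HHI | Band | CR1 | CR4 | Eff. entities | Retained spend |
| --- | --- | --- | --- | --- | --- | --- |
| All positive | 4,992 | Highly | 69.5% | 89.2% | 2.0 | 100.0% |
| $10k+ | 4,993 | Highly | 69.5% | 89.2% | 2.0 | 100.0% |
| $100k+ | 4,998 | Highly | 69.6% | 89.3% | 2.0 | 99.9% |
| $1M+ | 5,030 | Highly | 69.8% | 89.6% | 2.0 | 99.6% |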

Parenthetical labels at the $1M cutoff show retained spend. A threshold that keeps little spend is a stress test, not a replacement denominator. Each entity level recomputes HHI, CR1, CR4, and effective entities from the same source rows.

Figure 7. Minimum-spend thresholds are useful only when the retained-spend share is visible beside the recomputed concentration metric.
Table 4. At the $1M owner-spend cutoff, some industries still retain nearly all spend, while long-tail industries lose enough denominator that the renormalized HHI becomes a stress test.

| Industry | Retained spend | Retained entities | HHI | Band |
| --- | --- | --- | --- | --- |
| Business & Technology NEC | 93.3% | 105 | 1,762 | Moderately concentrated |
| Government, Politics & Organizations | 95.0% | 390 | 1,625 | Moderately concentrated |
| Household Soaps, Cleansers & Polishes | 99.6% | 33 | 5,030 | Highly concentrated |
| Medicines & Proprietary Remedies | 99.6% | 304 | 398 | Unconcentrated |
| Misc Services & Amusements | 87.0% | 979 | 104 | Unconcentrated |

The rule for practice is simple: use all positive spend as the default denominator, then report threshold sensitivity with retained spend. If the HHI ranking is stable and retained spend is high, the conclusion is robust to de minimis noise. If the cutoff drops a large share of spend, treat the recomputed metric as an upper-bound stress test, not a replacement for the main estimate.

Concentration moved through the shock, but not in one direction

The 2018 to 2022 comparison is descriptive, not causal. Still, it shows a useful pattern: concentration can rise even as category spend falls. Household soaps became much more concentrated from 2018 to 2022 while measured spend fell. Department stores and miscellaneous merchandise also became more concentrated. Gasoline, lubricants, and fuels moved the other way: HHI fell sharply as spend fell, so decline alone does not imply consolidation.
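A toy example shows how spend and concentration can move in opposite directions. The rows below are synthetic and the helper name `yearly_hhi` is hypothetical; the sketch assumes each row already carries an owner-proxy label and a calendar year.

```python
from collections import defaultdict


def yearly_hhi(rows, year):
    """Return (HHI, total spend) for one market in one calendar year."""
    totals = defaultdict(float)
    for row in rows:
        if row["year"] == year and row["dollar_spent"] > 0:
            totals[row["owner"]] += row["dollar_spent"]
    spend = sum(totals.values())
    hhi = 10_000 * sum((x / spend) ** 2 for x in totals.values())
    return hhi, spend


# Synthetic market: every owner cuts spend, but the smaller owners cut more.
rows = [
    {"year": 2018, "owner": "A", "dollar_spent": 50.0},
    {"year": 2018, "owner": "B", "dollar_spent": 30.0},
    {"year": 2018, "owner": "C", "dollar_spent": 20.0},
    {"year": 2022, "owner": "A", "dollar_spent": 40.0},
    {"year": 2022, "owner": "B", "dollar_spent": 10.0},
    {"year": 2022, "owner": "C", "dollar_spent": 10.0},
]

hhi_2018, spend_2018 = yearly_hhi(rows, 2018)
hhi_2022, spend_2022 = yearly_hhi(rows, 2022)
print(round(hhi_2018), round(hhi_2022), round(hhi_2022 - hhi_2018))
print(spend_2022 / spend_2018 - 1)
```

Spend falls by 40% while HHI rises from about 3,800 to about 5,000, because the leader's share grows even though its dollars shrink. The reverse pattern is equally possible if the leader cuts the most, which is why the endpoint comparison stays descriptive.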

Concentration changed unevenly from 2018 to 2022

Largest increases and decreases in owner-proxy HHI among INDUSTRY markets with at least $500.0M in either endpoint year.

Figure 8. HHI change from 2018 to 2022 varied by industry; contraction and concentration did not move mechanically together.

For surveillance, this kind of chart is a triage tool. A category with rising concentration may deserve a different sampling plan, especially if the leader changes, if spend shifts toward television or digital, or if the industry is already in a regulated claims environment. But the chart does not say why concentration changed. It could reflect entry, exit, media reallocation, product launches, missing ownership detail, or demand shocks.

Household Soaps, Cleansers & Polishes (HHI 4,992; CR4 89.2%): Procter & Gamble Co 69.5%; Reckitt Benckiser Plc 7.8%; Church & Dwight Co Inc 7.4%; Henkel Kgaa 4.5%.

Discount Department & Variety Stores (HHI 3,090; CR4 99.2%): Walmart Inc 35.3%; Amazon.com Inc 34.8%; Target Corp 24.7%; TJX Cos Inc 4.3%.

Medicines & Proprietary Remedies (HHI 395; CR4 27.7%): AbbVie Inc 12.0%; Pfizer Inc 6.9%; Amgen Inc 4.5%; Eli Lilly & Co 4.3%.

Misc Services & Amusements (HHI 79; CR4 11.3%): St Jude Childrens Research Hospit... 3.9%; Ebay Inc 2.8%; NortonLifeLock Inc 2.6%; National Football League Prop Inc 2.0%.
Figure 9. The top four owners make the ad-spend structure concrete; the same CR4 can come from a dominant leader or a more balanced set of large firms.
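The top-four shares also give a lower bound on HHI, because every entity outside the top four adds a non-negative squared share. The sketch below uses the shares and HHI values reported in Figure 9; the function name `hhi_from_pct` is hypothetical, and because the published shares are rounded to one decimal the sums are approximate.

```python
def hhi_from_pct(shares_pct):
    """Sum of squared percentage shares (same scale as 10,000 x squared fractions)."""
    return sum(s * s for s in shares_pct)


# (top-four shares in percent, reported HHI) as shown in Figure 9.
figure9 = {
    "Household Soaps, Cleansers & Polishes": ([69.5, 7.8, 7.4, 4.5], 4992),
    "Discount Department & Variety Stores": ([35.3, 34.8, 24.7, 4.3], 3090),
    "Medicines & Proprietary Remedies": ([12.0, 6.9, 4.5, 4.3], 395),
    "Misc Services & Amusements": ([3.9, 2.8, 2.6, 2.0], 79),
}

top4_hhi = {}
for name, (shares, reported) in figure9.items():
    top4_hhi[name] = hhi_from_pct(shares)
    # Fraction of the reported HHI that the four leaders alone account for.
    print(f"{name}: top-four HHI {top4_hhi[name]:.1f} of {reported} "
          f"({top4_hhi[name] / reported:.0%})")
```

In the two concentrated markets the four leaders account for nearly all of the reported HHI. In the two fragmented markets they account for roughly half, so the tail matters and CR4 understates how much of the index comes from smaller owners.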

What the FDA perspective adds

FDA does not need this analysis to decide whether an ad is misleading. That requires claim-level evidence, labeling, context, and applicable rules. The Office of Prescription Drug Promotion focuses on whether prescription drug information is communicated truthfully and with balance. FDA's direct-to-consumer TV and radio rule guidance also emphasizes that major risk statements must be presented in a clear, conspicuous, and neutral manner.

Concentration metrics add a different layer: they help an analyst decide where the public message environment is structurally dominated. In a concentrated voice market, a few sponsors can shape a large share of consumer exposure. In a fragmented market, risk comes less from one owner's dominance and more from coverage, sampling, and heterogeneous claims.

The pharmaceutical finding is therefore two-sided. At the broad group and INDUSTRY levels, pharmaceutical ad spend is not highly concentrated by HHI. At prescription subcategory level, several high-spend markets are highly concentrated. That does not settle a regulatory question, but it tells the analyst where a claim-level review plan should become more targeted: therapeutic area, condition, prescription status, DTC versus professional promotion, media group, and possibly drug class. The visualization lesson is that the first chart should expose that next question rather than bury it.

Method notes and robustness

The calculations use the following choices:

  1. Keep all rows with positive dollar_spent from January 2018 through December 2022.
  2. Define the headline firm unit as PARENT unless the parent is PARENT UNKNOWN, in which case use ADVERTISER.
  3. Aggregate to owner-by-INDUSTRY total spend over 2018-2022.
  4. Convert each owner's spend into an industry share.
  5. Compute CR1, CR4, CR8, HHI, and effective entities from that share vector.
  6. Recompute market-definition sensitivity across Industry_Group, INDUSTRY, MAJOR, CATEGORY, and SUBCATEGORY.
  7. Recompute minimum-spend sensitivity at owner-spend cutoffs of $10k, $100k, and $1M, always reporting retained spend.
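Steps 1 through 5 can be sketched as one small function. This is a minimal illustration under stated assumptions, not the code used for the case: the function name `industry_metrics` and the sample rows are hypothetical, the rows are assumed to fall inside the study window already, and the pooled bucket is assumed to be stored as the literal string `PARENT UNKNOWN`.

```python
from collections import defaultdict

UNKNOWN_PARENT = "PARENT UNKNOWN"  # assumed literal label of the pooled bucket


def industry_metrics(rows):
    """Positive spend -> owner proxy -> owner-by-INDUSTRY totals -> share metrics."""
    spend = defaultdict(lambda: defaultdict(float))
    for row in rows:
        if row["dollar_spent"] <= 0:  # step 1: keep positive spend only
            continue
        # Step 2: PARENT unless it is the pooled unknown bucket.
        owner = row["ADVERTISER"] if row["PARENT"] == UNKNOWN_PARENT else row["PARENT"]
        spend[row["INDUSTRY"]][owner] += row["dollar_spent"]  # step 3

    metrics = {}
    for industry, owners in spend.items():
        total = sum(owners.values())
        # Step 4: shares as fractions, largest first.
        shares = sorted((x / total for x in owners.values()), reverse=True)
        hhi = 10_000 * sum(s * s for s in shares)  # step 5
        metrics[industry] = {
            "CR1": shares[0],
            "CR4": sum(shares[:4]),
            "CR8": sum(shares[:8]),
            "HHI": hhi,
            "effective_entities": 10_000 / hhi,
        }
    return metrics


# Hypothetical rows for one industry, including a non-positive row that is dropped.
rows = [
    {"INDUSTRY": "I1", "PARENT": "P1", "ADVERTISER": "P1 East", "dollar_spent": 60.0},
    {"INDUSTRY": "I1", "PARENT": "P1", "ADVERTISER": "P1 West", "dollar_spent": 20.0},
    {"INDUSTRY": "I1", "PARENT": "PARENT UNKNOWN", "ADVERTISER": "U1", "dollar_spent": 10.0},
    {"INDUSTRY": "I1", "PARENT": "PARENT UNKNOWN", "ADVERTISER": "U2", "dollar_spent": 10.0},
    {"INDUSTRY": "I1", "PARENT": "P2", "ADVERTISER": "P2 Main", "dollar_spent": -5.0},
]

print(industry_metrics(rows)["I1"])
```

Steps 6 and 7 reuse the same function shape: swap the INDUSTRY key for another hierarchy level, or filter owner totals by a cutoff before computing shares while recording the retained spend.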

The robustness checks support the main descriptive pattern. The highest-concentration substantial industries remain high under minimum-spend thresholds because the retained spend share is near complete. Long-tail industries remain low-HHI even after aggressive cutoffs, but those cutoffs drop more denominator and should be read as stress tests. Market-definition sensitivity is larger than threshold sensitivity: moving from broad groups to narrower source categories changes the substantive answer more than dropping de minimis spend does.