Part V · Chapter 14

Applied Text, Embeddings, and Measured Constructs

From counting words, to placing meaning in coordinates, to measuring the constructs a manager actually cares about.

This chapter moves from counting words to measuring meaning. Two real corpora open it: @realdonaldtrump tweets, where a transparent Naive Bayes model fingerprints Android-versus-iPhone source from tone, hashtags, mentions, and timing; and Goose Island acquisition chatter, where a lexicon shows that an event spike is mostly news links and anti-corporate vocabulary rather than collapsing sentiment. From there, embeddings turn documents into vectors in a learned coordinate system, powering semantic search, clustering, brand maps, and drift detection. The payoff is GPT-as-measurement, where a language model scores named constructs a manager actually cares about (intent to return, evasiveness, a sense of betrayal) directly rather than through a sentiment proxy.

Topics covered

Naive Bayes source classification
authorship fingerprinting from metadata
transparent sentiment lexicons
pre / event / post event-study framing
embeddings as a coordinate system for meaning
cosine similarity and nearest-neighbour retrieval
semantic search and vector databases
survey-vs-text brand-map triangulation
GPT construct measurement and debiasing

In this chapter