Most people picture data brokers as a single shady company hoarding a giant spreadsheet. The reality is messier and more interesting. The modern data broker is a stitching machine: it takes thousands of small, individually harmless data points and joins them into a single profile that’s worth real money to advertisers, lenders, recruiters, and (increasingly) attackers.
Understanding how the stitch works is the first step to making it harder. Here’s the actual pipeline.
The raw inputs
Brokers pull from four sources:
Public records: voter rolls, property assessor databases, court filings, corporate registrations, marriage and divorce records.
Commercial transactions: loyalty programmes, credit-card spend categorisations sold by issuers, retail return histories.
Behavioural signals: ad-tech clickstream data, mobile app SDKs that quietly broadcast location and device IDs.
Breach datasets: emails, passwords, addresses, and phone numbers from the hundreds of breaches that hit the open market every year.
Any one of these is forgettable on its own. The value is in the join.
The identity graph
The broker’s core asset is what’s called an identity graph. Each row is a person. Each column is an identifier they could be reached through: full name, date of birth, primary email, secondary email, mobile phone, current address, three previous addresses, MAID (mobile advertising ID), cookie IDs, IP histories, employer, vehicle, household members.
The graph is built by probabilistic matching. If a record from a fitness-app SDK shares a device ID with a credit-card transaction record, and the timestamps suggest the same user, the broker assigns a confidence score and joins the rows. Hundreds of these joins per profile, refreshed daily, produce a graph that knows you better than your phone does.
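To make the matching step concrete, here’s a minimal Python sketch. It isn’t any broker’s actual code: the Record fields, the feed names, the one-hour window, the 0.7 threshold, and the scoring formula are all illustrative assumptions. It scores a pair of records on a shared device ID and on how close their timestamps are, then merges every pair above the threshold into one profile.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Record:
    source: str        # hypothetical feed name, e.g. "fitness_sdk" or "card_txn"
    record_id: str
    device_id: str
    seen_at: datetime


def match_confidence(a: Record, b: Record,
                     window: timedelta = timedelta(hours=1)) -> float:
    """Toy score in [0, 1]: a shared device ID, boosted by timestamp proximity."""
    if a.device_id != b.device_id:
        return 0.0
    gap = abs(a.seen_at - b.seen_at)
    if gap >= window:
        return 0.5  # same device, but the timing adds no evidence
    return 0.5 + 0.5 * (1 - gap / window)


def link_records(records: list[Record], threshold: float = 0.7) -> list[set[str]]:
    """Merge records whose pairwise score clears the (assumed) threshold."""
    parent = {r.record_id: r.record_id for r in records}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(records):
        for b in records[i + 1:]:
            if match_confidence(a, b) >= threshold:
                parent[find(a.record_id)] = find(b.record_id)

    profiles: dict[str, set[str]] = {}
    for r in records:
        profiles.setdefault(find(r.record_id), set()).add(r.record_id)
    return list(profiles.values())
```

A real pipeline would presumably weigh many more signals than a single device ID, but the shape is the same: score the pair, then merge what clears the bar.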
Why breach data is the multiplier
Breach data is the join key that makes everything else easier. An email address from a 2019 breach lets the broker tie a 2024 ad-tech cookie to a 2026 mortgage record because they all share the same email at some point in their history. The broker doesn’t have to guess: the breach gives them a stable identifier that ties a decade of activity together.
This is also why the same data brokers that quietly buy ad-tech feeds are the ones who quietly buy breach datasets. They’re not buying for the passwords. They’re buying for the join keys.
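Here’s a rough Python sketch of the email acting as a join key. Everything in it is made up for illustration: the dataset names, the row fields, and the simple lower-case-and-trim normalisation are assumptions, not a description of how any particular broker works. It groups rows from unrelated datasets under the address they share and keeps only the addresses that link more than one dataset.

```python
from collections import defaultdict


def normalise_email(email: str) -> str:
    """Lower-case and trim so trivially different spellings compare equal."""
    return email.strip().lower()


def join_on_email(datasets: dict[str, list[dict]]) -> dict[str, list[tuple[str, dict]]]:
    """Group rows from every dataset under the email address they share."""
    by_email = defaultdict(list)
    for name, rows in datasets.items():
        for row in rows:
            by_email[normalise_email(row["email"])].append((name, row))
    # Only an address seen in more than one dataset links anything.
    return {
        email: hits
        for email, hits in by_email.items()
        if len({name for name, _ in hits}) > 1
    }


# Entirely fictional sample rows.
datasets = {
    "breach_dump": [{"email": "Pat@Example.com", "phone": "555-0100"}],
    "adtech_feed": [{"email": "pat@example.com", "cookie_id": "ck-123"}],
    "mortgage_records": [{"email": " pat@example.com", "address": "1 Sample St"}],
    "loyalty_feed": [{"email": "lee@example.com", "card": "L-9"}],
}

linked = join_on_email(datasets)
for email, hits in linked.items():
    print(email, [name for name, _ in hits])
```

The address that appears only once links nothing. That’s the whole point of the join key: its value comes from repetition across datasets.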
Four steps that meaningfully break the stitch
Use a different email per service. Not just a “spam” Gmail and a “real” Gmail, but a unique address per signup, ideally via a forwarding service like SimpleLogin, Fastmail aliases, or Apple’s Hide My Email. The email is the broker’s primary join key. If your Spotify, Pinterest, and DoorDash signups all use different addresses, the graph loses its easiest way to tie them together.
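A forwarding service generates these aliases for you. If you run a catch-all domain of your own instead, one way to get the same effect is to derive each address from a secret. The sketch below is an assumption-laden illustration, not how any of the services above work: alias_for, the 12-character local part, and the example.com domain are all hypothetical.

```python
import hashlib
import hmac


def alias_for(service: str, secret: bytes, domain: str = "example.com") -> str:
    """Derive a stable, hard-to-guess address for one service.

    Assumes `domain` is a catch-all domain you control (hypothetical here).
    """
    key = service.strip().lower().encode("utf-8")
    digest = hmac.new(secret, key, hashlib.sha256).hexdigest()
    # Without the secret, two aliases can't be traced to the same owner
    # from the local part alone.
    return f"{digest[:12]}@{domain}"


secret = b"replace-with-a-long-random-secret"  # placeholder, keep the real one private
for service in ("spotify", "pinterest", "doordash"):
    print(service, alias_for(service, secret))
```

The same service name always yields the same address, so you can regenerate an alias without storing a list, while different services share nothing the graph can join on. A shared custom domain is still a weak link if few people use it, which is one reason a forwarding service’s common domain is the better default.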
Reset your mobile advertising ID monthly. On Android, Settings → Privacy → Ads → Delete advertising ID lets you nuke the ID, and the broker’s behavioural data has to start over each time. Doing this on the first of every month takes thirty seconds and meaningfully degrades the cross-app behavioural fingerprint. On iOS 14 and later there is no longer a reset button; instead, switch off Settings → Privacy & Security → Tracking → Allow Apps to Request to Track, so that apps asking to track are refused automatically and receive a zeroed-out identifier.
Opt out of the brokers that publish opt-out portals. Spokeo, BeenVerified, Whitepages, Intelius, and the rest publish removal forms because state laws now require them to. Services like DeleteMe, Privacy Bee, and Optery automate the submissions. Run a fresh sweep at least once a year, because brokers re-add records constantly.
Stop installing apps that ship SDKs you don’t trust. The bottom 30% of the app store by reputation (flashlight apps, weather apps with twenty permissions, free game launchers) are largely vehicles for selling SDK data. Auditing what’s installed takes about ten minutes per phone.
The threat-actor angle
The same identity graphs feed phishing operators, romance-scam outfits, and SIM-swap crews. A graph that knows your bank, your physical address, and the name of your spouse is a phishing template that almost writes itself. Breaking the stitch isn’t a privacy purist’s exercise; it directly raises the cost of targeted attacks against you.
