If you’ve been following the details of the California Consumer Privacy Act (CCPA), you probably know that the California State Legislature amended the definition of “personal information” to add the word “reasonably” in front of the word “capable” in the phrase “information that identifies, relates to, describes, [and] is capable of being associated with…a particular consumer or household” (see AdExchanger coverage).
One small word, one giant sigh of relief for the ad tech ecosystem.
Although the industry has been thwarted elsewhere in its lobbying efforts, this looks like a win: it leaves the door open a crack for anonymous tracking, while some on the privacy side surely see a dangerous concession on clarity. More to the point, the Legislature’s recent bill (AB-1355) now specifically exempts deidentified or aggregate consumer data from the definition of personal information. This amounts to much the same thing, since one might define “deidentified” as information that has been deliberately processed so that it is not reasonably capable of being associated with a particular individual or household. How does CCPA define it? You probably guessed: it doesn’t. Nor does it define “reasonable” in this context. I guess that will be up to the attorneys and judges.
Deidentified data has been at the heart of privacy debates for decades. At the risk of oversimplifying a complex topic, the obvious question is what “reasonable” means: how difficult must reidentification be before data counts as deidentified rather than merely disguised? It’s clear that the richer a data set, the more easily it can be reidentified using statistical methods. But more significant in marketing is the common situation in which an advertiser or publisher is suddenly able to connect a deidentified record with personally identifiable information provided by a customer who completed an order form, signed up for a newsletter or requested an electronic receipt. In these cases, most marketing platforms are not just reasonably capable of associating anonymous data with a particular individual; they can do it trivially.
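To make the mechanics concrete, here’s a minimal sketch (with hypothetical data and field names) of that sudden association: “anonymous” events keyed by a cookie ID become personal the moment a single form submission ties the cookie to an email address.

```python
# Hypothetical clickstream, keyed only by a cookie ID -- "deidentified" on its face.
events = [
    {"cookie_id": "c-123", "page": "/health/diabetes"},
    {"cookie_id": "c-123", "page": "/politics/election"},
]

# The visitor later completes an order form; the platform now holds this link:
identity = {"cookie_id": "c-123", "email": "reader@example.com"}

# A trivial join reidentifies every prior "anonymous" event.
reidentified = [
    {**event, "email": identity["email"]}
    for event in events
    if event["cookie_id"] == identity["cookie_id"]
]
```

One form fill, and the entire history attached to that cookie arguably becomes personal information under the amended definition.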
What does this mean? For one thing, it supports the position of many privacy advocates that deidentification is basically irrelevant to the problem of pervasive surveillance on the web. Consider Timothy Libert’s recent op-ed in The New York Times, in which he excoriates news organizations (including the Times) for spying on users, in particular for collecting sensitive information related to things like health interests and political views. Nowhere in the article does the question of deidentification even come up; it’s presented as a given that cookies deployed on news sites to track activities and interests are naturally associated with the individuals behind those activities, and so all of the collected data should be considered personal under CCPA and a clear threat to privacy. Is that assumption reasonable?
If I’m logged into nytimes.com and the company knows my email address because it’s in my profile, then the Times can easily associate me with my behavior: personal data. But there are many ways the Times can use that data to enhance its advertising revenue without disclosing any personal data to any external party—as long as the behavioral data and the identifiable data remain separate. In other words, it can be deidentified. The Times can target me with political ads that match my bias—as long as there’s no way an advertiser or other third party can associate me personally with that aggregate category data. It can also encrypt my email address and match it in a secure environment with similarly encrypted (i.e., deidentified) data from other entities that know my email—as long as it doesn’t disclose anything about my behavior, including the fact that I sometimes visit the Times web site.
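For illustration, the second pattern can be sketched as follows, assuming both parties apply the same keyed hash (HMAC-SHA-256) over normalized email addresses; the shared key and helper name here are hypothetical:

```python
import hashlib
import hmac

# Hypothetical key agreed between the parties inside the secure environment.
SHARED_KEY = b"secure-environment-key"

def pseudonymize(email: str, key: bytes = SHARED_KEY) -> str:
    """Normalize and keyed-hash an email so two parties can match
    records without ever exchanging the address itself."""
    normalized = email.strip().lower()
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()

# Each party transforms its own list; only the opaque tokens are compared.
publisher_token = pseudonymize("Reader@Example.com")
partner_token = pseudonymize(" reader@example.com ")
assert publisher_token == partner_token  # same person, matched blind
```

The match succeeds or fails without either side learning anything beyond overlap; of course, the whole scheme stays “deidentified” only as long as neither party leaks behavior alongside the tokens.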
With this nuance in mind, industry advocates decamp to Sacramento and Brussels and anywhere else privacy laws are being drafted or revised to explain deidentification and its economic importance to brands and publishers (in particular independent news organizations, whose content holds little attraction for contextual targeting). The challenge is convincing lawmakers and a skeptical public to trust a system that’s clearly full of holes. Perhaps the Times will play by the rules, but how can such rules be enforced when it’s obviously easy to break them and hard to catch someone doing so?
That’s where we need some more innovation. Here are a few ideas floating around:
- Accelerate convergence between Customer Data Platforms (CDPs) and Consent and Preference Management Platforms (CPMPs). (Some resources for Gartner subscribers: Market Guide for Customer Data Platforms for Marketing, Market Guide for Consent and Preference Management for Marketers.) CDPs (and DMPs for that matter) need to support specific, transparent rules for privacy compliance that explicitly prevent the association of data that could result in personal data leakage that violates privacy policies or laws.
- The industry needs open standards to codify specific privacy rules and metadata categories and make them interoperable across marketing platforms. Adobe’s Data Usage Labeling and Enforcement (DULE) framework is one step in this direction, allowing data to be categorized according to usage policies. Salesforce’s Individual Object offers similar governance capabilities for Salesforce platforms. And organizations such as the IAB Tech Lab and the ISO are developing related privacy standards, but industry-wide deidentification protocols need more attention if marketers are to stay ahead of compliance.
- Finally, it would be nice if the ISPs would wake up and enter the discussion. They’ve all made large investments in content and advertising technology, but they’ve been conspicuously silent about a problem they’re in a unique position to solve by supplying users with network-based services that control the exposure of their identities without requiring them to deal with obscure consent flow dialogs on every web site or device. Instead of lobbying for more freedom, they might reasonably take the opportunity to mediate privacy with their own network deidentification solutions and standards bodies.
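The labeling idea in the second bullet can be sketched as a simple policy gate. The label names and the export rule below are illustrative inventions, not the actual vocabulary of DULE or any shipping framework:

```python
from dataclasses import dataclass, field

# Illustrative labels only; real frameworks (e.g., Adobe's DULE) define
# their own vocabularies and enforcement points.
IDENTITY = "identity"   # directly identifies a person (email, name)
BEHAVIOR = "behavior"   # browsing or interest data

@dataclass
class Record:
    fields: dict = field(default_factory=dict)
    labels: set = field(default_factory=set)

def merge(a: Record, b: Record) -> Record:
    """Joining records unions their labels, so restrictions travel with the data."""
    return Record({**a.fields, **b.fields}, a.labels | b.labels)

def can_export(record: Record, destination: str) -> bool:
    """Refuse any export that would hand a third party identity and behavior
    together -- the association the amended CCPA definition worries about."""
    if destination == "third_party" and {IDENTITY, BEHAVIOR} <= record.labels:
        return False
    return True
```

The point of the sketch is that the prohibition is machine-checkable: either half of the data can leave on its own, but a record that has ever been joined across the identity/behavior line is blocked automatically, rather than by policy documents and good intentions.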