If I were to give you the number 100001… Is that relevant for privacy?
Well, as every analyst’s answer seems to start nowadays: ‘it depends’. On its own, it might as well be binary code and mean nothing, unless you read it as an indication of my age (and even that was almost a decade ago). But if I give you that same set of digits in the CONTEXT of a conversation about how many euros I make each month, working for Gartner…
Obviously that is wishful thinking, don’t worry. However, if you’re looking at a million records in one database and you pick out this particular one with the unique attribute, that is often pseudonymous information at best. You may not know for sure whom the information is about, but you can deduce with reasonable certainty that it concerns a single individual among the other records.
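To make the ‘single individual amidst the other records’ point concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the records, the field name `monthly_salary_eur` and the helper `singled_out` are hypothetical, and the values are made up. It simply counts how often each value occurs; a value that occurs exactly once singles out one record, even though no name is attached.

```python
from collections import Counter

# Hypothetical records: no names, only a monthly salary in euros (made-up values).
records = [
    {"id": 1, "monthly_salary_eur": 4200},
    {"id": 2, "monthly_salary_eur": 4200},
    {"id": 3, "monthly_salary_eur": 5100},
    {"id": 4, "monthly_salary_eur": 100001},
    {"id": 5, "monthly_salary_eur": 5100},
]


def singled_out(rows, attribute):
    """Return the rows whose value for `attribute` occurs exactly once."""
    counts = Counter(row[attribute] for row in rows)
    return [row for row in rows if counts[row[attribute]] == 1]


unique_rows = singled_out(records, "monthly_salary_eur")
print(unique_rows)
```

In this toy data only the record with 100001 stands alone. A real assessment would look at combinations of attributes rather than one column, but the principle is the same.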
A 2019 study demonstrated how easy reidentification has become, essentially on the basis of metadata. Metadata matters, more than we seem to acknowledge in generic privacy programs. This has a LOT of implications, but let me point out at least the obvious ‘step 1’: when you deploy (personal) data discovery tooling that relies only on regular expression matching or fixed combination recognition tests, you’ll find names, SSNs, addresses, etc. Sure. Basically, what has been understood under the PII umbrella term as a set of fixed identifiers. But you won’t find all that matters, and you may actually overlook incredibly large datasets of ‘personal data’ that inherently carry privacy risk. That risk then remains untreated, which adds to your business risk.
So instead, consider going the extra mile. See if there’s reason to invest in more detailed discovery options, increasingly AI-based, that understand context and semantic relations, so that ‘personal data’ gets treated correctly, instead of mere ‘PII’.
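The blind spot of pattern-only discovery can be sketched in a few lines. This is not how any particular discovery product works; the two patterns, the `regex_scan` helper and the sample rows are hypothetical and deliberately simplistic. The point is only that a scanner looking for fixed identifiers flags the first row and stays silent on the second, although the second is the one from the salary example.

```python
import re

# Illustrative patterns only; real discovery tools ship far larger rule sets.
PATTERNS = {
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def regex_scan(text):
    """Return the sorted names of all patterns that match somewhere in `text`."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


# Made-up sample rows: one with classic PII, one with only contextual data.
rows = [
    "contact: jane.doe@example.com, ssn 123-45-6789",
    "employer=Gartner; monthly_salary_eur=100001",
]

hits = [regex_scan(row) for row in rows]
print(hits)
```

The second row yields no hits because nothing in it looks like a fixed identifier. Recognizing it as personal data requires knowing what the fields mean and how unusual the value is, which is exactly the context that pattern matching does not see.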