A Brief, Incredible History of Marketing Data Matching

By Martin Kihn | May 16, 2016 | 1 Comment


Now if that’s not an enticing blog title calculated to maximize customer engagement … well, then it isn’t. But it’s an important topic, friends. As you’ll remember, not long ago we were talking about Acxiom LiveRamp and its (now demystified) data matching processes called, respectively, Onboarding and Customer Link.

As we pointed out, Customer Link is a version of Acxiom’s ten-year-old AbiliTec, which itself is a natural evolution of the marketing service provider’s list and data services, which themselves were a level-up on its direct mail targeting data … and so on. And if you think digital marketers — much less ad tech-sperts — invented customer identity matching and household-level personalized messaging, you just don’t know your history.

Direct mailers did things with household information that are still impossible in the digital world.

Customer Data Integration

Acxiom started as a mailing list processor, and it added list data enhancement in the ’80s and data warehousing once the warehouse appeared. Its Personicx clusters business, which still markets groups of people attached to rather colorful life stage labels (I’m in the “Fun & Games” cluster, which will be news to people who know me), was introduced in this millennium.

Around the time you and I were partying like it was 1999 — because it was, in fact, 1999 — both Acxiom and Experian introduced their own versions of a master customer database for all consumers and businesses. Acxiom called its version AbiliTec; Experian, TruVue. They were called “matching systems” at the time. So what did they do?

Recall that in the year 1999, there was indeed software to manage lists (primarily mailing lists). This software did merging and purging, removing duplicates and trying to match different versions of the same information. As any direct marketer will tell you, customer data is so inconsistent it’s almost funny. (Or it would be, if it weren’t so tragic.) For example, marketers have long needed some way to figure out that “Marty Kihn” is the same person as “Martin Kihn” and “Mr. M. T. Kihn” and even “MT. Kinn” — and that’s the easy part. I’ve had four addresses, three phone numbers and two job titles in the past 12 months (don’t ask). I’m an Acxiomatic nightmare; and that’s just me.
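To make the name problem concrete, here is a minimal sketch of the kind of rule-of-thumb matching a merge/purge tool might attempt. It is an illustration only, not how any vendor actually does it: the nickname table, the title list, the "short token means initials" rule and the 0.7 similarity threshold are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical lookup tables; a real list processor would need far larger ones.
NICKNAMES = {"marty": "martin", "bob": "robert", "liz": "elizabeth"}
TITLES = {"mr", "mrs", "ms", "dr"}


def normalize(name):
    """Lowercase, drop punctuation and titles, expand known nicknames."""
    tokens = re.findall(r"[a-z]+", name.lower())
    tokens = [t for t in tokens if t not in TITLES]
    return [NICKNAMES.get(t, t) for t in tokens]


def first_names_agree(fa, fb):
    """Exact match, or an initials-style token sharing the first letter."""
    if fa == fb:
        return True
    # Assumption: tokens of one or two letters ("m", "mt") are initials.
    if len(fa) <= 2 or len(fb) <= 2:
        return fa[0] == fb[0]
    return False


def same_person(a, b, threshold=0.7):
    """Guess whether two name strings refer to the same person."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return False
    # Compare surnames fuzzily so that "Kinn" can still match "Kihn".
    surname_score = SequenceMatcher(None, ta[-1], tb[-1]).ratio()
    return first_names_agree(ta[0], tb[0]) and surname_score >= threshold


print(same_person("Marty Kihn", "Martin Kihn"))     # True
print(same_person("Mr. M. T. Kihn", "Martin Kihn")) # True
print(same_person("MT. Kinn", "Marty Kihn"))        # True
print(same_person("Mary Kahn", "Martin Kihn"))      # False
```

Every rule here is a guess that will be wrong for somebody, and that fragility is the heart of the problem.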

Now creating all the fuzzy logic around matching versions of names and nicknames and so on is a very ad hoc way of getting at what the marketer really wants: to match this particular piece of information (name, address, email) to a person. And you have to match all the data sets you have each time to find information about a particular person — and, you don’t even know if it’s an actual person!

The goal of AbiliTec and TruVue was to create a single source of “truth” — a master database that contained only actual, identified individuals and households (and businesses for the B2B set). Each person got a code. Each person got a constantly updated set of attributes from public records and other reliable sources, like marriages and address changes and car registrations. And each person had all the common typos and nicknames and sneaky emails that show up in their wanderings mapped back to their real self.

There really isn’t an incredible variety of data that will work to identify an individual, if you think about it. Marketing service providers (MSPs) tend to call them “match parameters” or “match fields,” and they include as much as you know of name, address, email, zip code or IP address. In fact, the most common combinations used to identify individuals are:

  • Name + Zip
  • Name + IP address
  • Email
  • Email + Zip

Of course, none of these is foolproof. In particular, IP addresses can be reassigned and are usually not unique to individuals anyway (they work better for households, whose members often share the same IP). But recall that the MSPs themselves have a master database of people who are known to be real and often can meet the marketer more than half-way in narrowing down the field of suspects.

This single identifier is a service in itself. Establishing the uniqueness and existence of an individual from his or her various digital identifiers is a valuable enterprise. This is the core of AbiliTec and TruVue: identity resolution.
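As a rough sketch of identity resolution using the match-field combinations listed earlier, the code below builds candidate keys from whatever fields a record has and looks them up in a master index. The `Record` class, the `P-0001` person code, the sample index entries and the order in which keys are tried are all hypothetical; the real systems are proprietary and certainly more elaborate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Whatever match fields the marketer happens to know (all optional)."""
    name: Optional[str] = None
    zip_code: Optional[str] = None
    ip: Optional[str] = None
    email: Optional[str] = None


def match_keys(rec):
    """Yield candidate keys, most specific first (the order is an assumption)."""
    name = rec.name.strip().lower() if rec.name else None
    email = rec.email.strip().lower() if rec.email else None
    if email and rec.zip_code:
        yield ("email+zip", (email, rec.zip_code))
    if email:
        yield ("email", (email,))
    if name and rec.zip_code:
        yield ("name+zip", (name, rec.zip_code))
    if name and rec.ip:
        yield ("name+ip", (name, rec.ip))


# Hypothetical master index: every known key points at one person code.
MASTER_INDEX = {
    ("email", ("marty@example.com",)): "P-0001",
    ("name+zip", ("martin kihn", "10001")): "P-0001",
}


def resolve(rec, index=MASTER_INDEX):
    """Return the master person code for a record, or None if nothing matches."""
    for key in match_keys(rec):
        person_id = index.get(key)
        if person_id is not None:
            return person_id
    return None


print(resolve(Record(email=" Marty@Example.com ")))             # P-0001
print(resolve(Record(name="Martin Kihn", zip_code="10001")))    # P-0001
print(resolve(Record(name="Jane Doe", ip="203.0.113.7")))       # None
```

The point of the design is that two very different-looking inputs land on the same code, which is all the marketer really wanted.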

There is also the separate but equal service of layering on actual information about the identifier. That is, attaching attributes to a person or business. In this way, the marketing service providers (MSPs) like Acxiom and Experian can build out a picture of all of us, updated as best they can, that itself can be segmented — “Fun & Games,” anyone? — and supplemented. This record isn’t totally secret; you can see some of it right here.

Most people will have basic demographic data assigned to them: things like age, education, marital status, income range. Credit and bank information is often on hand. There is information about preferred airlines and hotels, and data about your cars and lease status is solid. (Some of the fields are flagged as “likely” or “very likely” to indicate that they are statistical guesses rather than known facts.) And there is a list of activities that could be of interest to outside parties, such as charitable giving, political impulses, and zest for gambling.

Now, there are some of you in the back row very patiently asking yourselves what all this has to do with digital marketing, ad tech, or anything, really?

And I’ll tell you. The process I’ve described above is exactly what is being done by marketing data onboarding and customer matching companies — just with some additional fields. For example, a mobile device ID is a unique identifier for as long as a person owns the device. If the MSP or the marketer is able to link the MSP’s master ID with the device ID, then the device ID becomes a key to all the data in the big database.
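Here is a small sketch of that last idea: once a device ID is linked to a master ID, it unlocks whatever attributes hang off that master ID. The person code, the device ID string, the attribute names and the values are invented for illustration; only the “likely” style of confidence flag comes from the description above.

```python
# Hypothetical attribute store keyed by the provider's master person ID.
# Each attribute carries a flag saying whether it is known or a statistical guess.
ATTRIBUTES = {
    "P-0001": {
        "marital_status": ("married", "known"),
        "cluster": ("Fun & Games", "likely"),
    }
}

# Device ID -> master person ID, filled in as links are discovered.
DEVICE_LINKS = {}


def link_device(device_id, person_id):
    """Record that a mobile device ID belongs to a known master ID."""
    DEVICE_LINKS[device_id] = person_id


def attributes_for_device(device_id):
    """Follow device ID -> master ID -> attributes; empty if there is no link."""
    person_id = DEVICE_LINKS.get(device_id)
    if person_id is None:
        return {}
    return ATTRIBUTES.get(person_id, {})


link_device("example-device-id-123", "P-0001")
print(attributes_for_device("example-device-id-123")["cluster"])  # ('Fun & Games', 'likely')
print(attributes_for_device("unknown-device"))                    # {}
```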

And outside the world of this big database, if a marketer wants to join together point of sale data and loyalty data and web analytics and email system and social ads, they really need to tie each separate data source to a single person … which is precisely the problem the original 1980s “merge and purge” list solution vendors were trying to solve.

Connecting the Bots

There aren’t really bots involved here, but I liked that subtitle.

Moving from the somewhat analogue-plus-email scenario into recent years, let’s look at data matching circa 2016. You can still do what I outlined above, of course, and many marketers do. There are also a screaming million other ways to lose your customers in the digital sphere. Forget names and zips; we all have IDs like you wouldn’t believe — thousands of them, each unique to us, emerging and colliding and expiring in real time. What’s a marketer to do?

It turns out the matching process involved in onboarding is not existentially different from the matching process described above. You have two files of data and you want to merge them. Say you have a web analytics “file” that shows what people do on your fancy Bernese mountain dog e-commerce site, and you have another retail “file” that shows what discerning Bernophiles buy in your Bernese mountain dog apparel retail store.

You want to see if there is any relationship between what people do on your site and what they buy in the store. How? It’s kind of obvious: you need a common field — an identifier — that will let you merge the files. If you’re lucky, you have a lot of site visitors who signed up for your “Berner Bucks” loyalty program and gave you their email. And of course, you ask for their email in the store. So there’s your field.
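In the lucky case, the merge is just a join on the shared field. The sketch below assumes two tiny in-memory “files” with made-up rows and field names, and normalizes the email before joining, since the same address rarely arrives formatted the same way twice.

```python
from collections import defaultdict

# Made-up sample rows standing in for the web analytics and retail "files".
web_visits = [
    {"email": "Pat@Example.com", "page": "/hoodies"},
    {"email": "sam@example.com", "page": "/leashes"},
]
store_sales = [
    {"email": "pat@example.com ", "item": "hoodie"},
]


def norm(email):
    """Trim whitespace and lowercase so trivially different emails match."""
    return email.strip().lower()


def join_on_email(visits, sales):
    """Attach in-store purchases to each site visit that shares an email."""
    sales_by_email = defaultdict(list)
    for sale in sales:
        sales_by_email[norm(sale["email"])].append(sale["item"])

    joined = []
    for visit in visits:
        key = norm(visit["email"])
        joined.append({
            "email": key,
            "page": visit["page"],
            "items": sales_by_email.get(key, []),
        })
    return joined


for row in join_on_email(web_visits, store_sales):
    print(row)
# {'email': 'pat@example.com', 'page': '/hoodies', 'items': ['hoodie']}
# {'email': 'sam@example.com', 'page': '/leashes', 'items': []}
```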

But what if you don’t have that magical matching email, or any magical (identical) field to merge the files? What then?

You can use an onboarding service. LiveRamp’s marketing director Justin Schuster wrote a groovy post on how the match process works. I paraphrase it here.

Remember that the goal of onboarding is to link otherwise separate pools of information by providing a bridge or match or key. This bridge takes many different forms and is harder than it looks. Of course, the specifics depend on what you’re matching where. Schuster outlines three basic types of matches:

  1. User Account Integration: The most straightforward. A personal identifier in your files — usually an email or name + email — is matched to the same identifier in the database of large media companies which are partners with the onboarder. These companies — think Yahoo or CNN, for example — have a lot of registered users. You can match your people to their people, tell them which of their people are your people, and serve ads to them on the media in question.
  2. Mobile ID Integration: Your files are matched to the links in the big database (see above), e.g., AbiliTec. Separately, many of these same links have been matched to a mobile device ID such as Apple’s IDFA or Google’s AdID. How? Of course, from various “partners” who provide mobile IDs attached to login data: our old friendly email or name + email or IP address. (These usually come from mobile apps that require registration or sign-in to use.)
  3. Cookie Integration: The most ephemeral of marketing identifiers, cookies come and go with blinding speed. Unlike mobile device IDs, which are unique and good as long as the person keeps her phone, the cookie is the oiliest animal of all: it is good only for a single browser (if it accepts cookies) and can only be deciphered by the person who put it there (or her partners). How is a marketer to match a cookie to a cookie to a person?

The answer is simple, silly: partnerships. The onboarder puts a tag on your site. When a person visits, the tag fires and drops the onboarder’s own cookie (call it OnboarderCookieX) in that browser. The tag also gives you a way to tell the onboarder who that browser belongs to (if you know). So you have linked OnboarderTagZ to CustomerA. Now the onboarder turns to its partners. In the same way you provided a way to tie OnboarderTagZ and OnboarderCookieX to your own browser cookie (and ultimately customerID), other partners have done the same. So the onboarder is likely to have a library of other cookies that have been linked back to its own cookie in a veritable bakery of cookie madness.
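The bookkeeping behind that bakery can be sketched as two lookup tables. This is a toy model built on assumptions, not LiveRamp’s implementation: the `SyncTable` class, its method names and the partner name `PartnerAdNetwork` are all hypothetical.

```python
class SyncTable:
    """Toy model of the links an onboarder might keep between cookies."""

    def __init__(self):
        # Onboarder's own cookie -> the marketer's customer ID (when known).
        self.customer_by_onboarder_cookie = {}
        # (partner name, partner's cookie) -> onboarder's own cookie.
        self.onboarder_cookie_by_partner = {}

    def tag_fired(self, onboarder_cookie, customer_id=None):
        """The tag on the marketer's site fired; the marketer may know who it is."""
        if customer_id is not None:
            self.customer_by_onboarder_cookie[onboarder_cookie] = customer_id

    def partner_synced(self, partner, partner_cookie, onboarder_cookie):
        """A partner told the onboarder which of its cookies maps to ours."""
        self.onboarder_cookie_by_partner[(partner, partner_cookie)] = onboarder_cookie

    def customer_for(self, partner, partner_cookie):
        """Hop from partner cookie to onboarder cookie to customer ID."""
        onboarder_cookie = self.onboarder_cookie_by_partner.get((partner, partner_cookie))
        return self.customer_by_onboarder_cookie.get(onboarder_cookie)


table = SyncTable()
table.tag_fired("OnboarderCookieX", customer_id="CustomerA")
table.partner_synced("PartnerAdNetwork", "partner-cookie-42", "OnboarderCookieX")

print(table.customer_for("PartnerAdNetwork", "partner-cookie-42"))  # CustomerA
print(table.customer_for("PartnerAdNetwork", "never-seen"))         # None
```

Every hop is only as good as the cookie underneath it, so links like these go stale as fast as the cookies do.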

This process is called “cookie synching” and now that I think about it, it actually deserves a blog post all its own. And of course, I owe you the walk-through of an onboarder who is not LiveRamp. You have been warned.

Peace @martykihn

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.

1 Comment

  • Glenn Humble says:

    Thanks for this piece Marty. I think this is a process that marketers know is table stakes in playing in the omni-channel world and very few actually understand how it works. Looking forward to your follow up from another provider.