
Top 10 Amazing Secrets of DMPs (Part 2)

By Martin Kihn | January 14, 2016 | 8 Comments


Welcome back. We are talking about the mighty data management platform – aka DMP. Last time, we discussed at some length the Top #1 – #5 Secrets of DMPs. Today, we enjoy the climactic second act.

First, I list the Secrets (for the post-literate #millennials) and then provide some real detail behind each one of them (for aspiring #ninjas). (Again, if you’d like to start somewhere, I’d recommend checking out Part 1 first.)

Top 10 Amazing Secrets of DMPs

  1. DMPs store data in two different ways
  2. They collect data like everyone else – with tags, APIs and uploads
  3. They are in the business of labeling (and relabeling) people
  4. They use outside partners to help map data to users
  5. Their “user profile” is not supposed to be a complete customer profile
  6. What they call an “audience” is just a segment
  7. They were designed to build targets for advertising
  8. They were also designed to personalize websites
  9. They make many decisions based on predefined rules
  10. [Revealed below]


6. What DMPs call an “audience” is just a segment

Talk to a DMPster for longer than a half-minute and you’re duty bound to hear this word: audience. It never goes out of style. I’ve said before that it betrays the secret yearning ad tech types have to be in a different business, namely, show business. (Look, we have an audience, baby, just like Rachel Zoe did!) At any rate, the audience is important to the DMP, but what is it?

I regret to say an audience is just a combination of attributes. These attributes are defined and users who have them are pulled into a group. This group – obviously, people who have the attributes you defined – is called an audience. That’s about it.

Now, an audience is a valuable piece of data in its own right. Think about all the trouble the DMP (and the client and its partners) have gone to in order to build up this lyrical userID-centered data store with so many beautiful attributes. And now we are cherry-picking actual individual userIDs to belong to a select tier of glory bestowed with the shibbolethic cognomen audience. It is bound to have meaning.

And it does. Most of what a DMP does – most of what a digital marketer does, frankly – does not happen at the user level. That is, it’s not only a set of single out-of-context or simple trigger-based rules. It first requires a DMP to place a userID into an audience, which itself – the audience – guides the decisions about what to do with that userID.

Say what? Try this:

Audiences are used for three main purposes. By no coincidence, these happen to be the three main purposes of a DMP itself:

  1. Targeting advertising
  2. Personalizing websites (and other client channels)
  3. Sharing with other systems (called “syndication”)

So what is syndication? Syndication is used in the programmatic world the same way newspapers use it: it means to create something in one place and send it out into the world for broader distribution. It’s a useful term. (So is audience, although I kidded it a bit upstairs.) So what is syndicated? Well, no surprise: audiences.

The way this works is the DMP has a prebuilt connection with another system that will be able to decode the audience. The DMP selects audiences for syndication, sends them to the partners, sends a few more pieces of information if needed, and there you have it.

The most common example involves DMPs and demand-side advertising platforms (DSPs), of course. When a DMP sends an audience to a DSP for targeting it is syndicating that audience. Often, the DMP will send the same audience to multiple DSPs and ad servers, publishers and other platforms, but the purpose is the same: it is telling the other platform who to target with advertising.

Like individual users, audiences have unique IDs and are often given names and descriptions. They also have sizes (number of unique userIDs in the file) and costs associated with them, if they use third-party data that isn’t free.
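Here is a minimal Python sketch of what a syndicated audience might look like as a record. The field names, the `data_cost_cpm` field and the `syndicate` helper are all made up for illustration; no vendor's actual format is implied.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Audience:
    """Hypothetical audience record; every field name here is an assumption."""
    audience_id: str
    name: str
    description: str
    user_ids: set = field(default_factory=set)
    # None when the audience uses only free (e.g. first-party) data.
    data_cost_cpm: Optional[float] = None

    @property
    def size(self) -> int:
        # Size is the number of unique userIDs in the audience.
        return len(self.user_ids)

    def to_payload(self) -> str:
        # Serialize the pieces a partner platform would need to decode the audience.
        return json.dumps({
            "audience_id": self.audience_id,
            "name": self.name,
            "description": self.description,
            "size": self.size,
            "data_cost_cpm": self.data_cost_cpm,
            "user_ids": sorted(self.user_ids),
        })


def syndicate(audience: Audience, partners: list) -> dict:
    # The same payload goes to every partner; a real DMP would push it over
    # a prebuilt connection, here we only collect it per partner name.
    payload = audience.to_payload()
    return {partner: payload for partner in partners}
```

Calling `syndicate(audience, ["dsp_a", "dsp_b"])` returns one identical payload per partner, which is the whole point: create the audience once, distribute it widely.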


7. They were designed to build targets for advertising

Now we get to the DMP’s money shot. When they first appeared, about ten years ago, they tried to sell themselves to ad agencies for use as a way to set up targets and measure the impact of online advertising campaigns. (It was such a good idea many agencies tried to grow their own.)

How does this work? Media campaigns are based on “audiences” (see above). These are sets of attributes. For media purposes, attributes can be first-party or third-party. First-party attributes are things that the client knows or can collect about people (like, “visited my site” or “loyal customer”). They are easily associated with the DMP’s userID, as we’ve seen. Third-party attributes are pieces of data that are purchased from data vendors. For example, you or I could buy a list of people – more accurately, a list of anonymous browser cookies – that have visited luxury auto sites recently from the Oracle Data Cloud.

Campaigns are set up by creating an audience and adding first-party and third-party attributes to it. Other parameters are set, including particular websites to exclude (blacklist), agency name, budget and goal, and how to define success. Goals for media usually refer to a specific budget tactic, usually CPM (cost per thousand impressions) or CPA (cost per acquisition, aka results).
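For the arithmetically inclined, the two goal metrics reduce to simple division. A small Python sketch follows; the function names and the sample numbers are mine, not from any platform.

```python
def cpm(spend: float, impressions: int) -> float:
    """Cost per thousand impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return spend * 1000 / impressions


def cpa(spend: float, acquisitions: int) -> float:
    """Cost per acquisition (spend divided by the number of results)."""
    if acquisitions <= 0:
        raise ValueError("acquisitions must be positive")
    return spend / acquisitions
```

For a made-up campaign that spends $5,000 on 1,000,000 impressions and yields 50 acquisitions, `cpm(5000, 1_000_000)` is 5.0 and `cpa(5000, 50)` is 100.0.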

But what is “success”? Success for a campaign is usually measured in terms of how many targeted people actually did something the client wanted them to do. Examples here include: click on the ad (uncool these days, but still used), land on the site (better), engage with the site (even better), buy something (best). Goals are tracked the same way everything else is tracked in a DMP: pixel tags, APIs and offline data uploads. They are then tied back to the campaign via the userID.

Lookalikes and extensions: Of course, just about every advertiser has a simple goal: Find me people who look like people who love me. This is where the DMP’s popular ability to do “lookalike modeling” jumps up and down screaming. Lookalike works by creating a new audience and filling it with people who share attributes similar to those found in the audience they’re supposed to look like.

Say what? In practice, you’re looking to find people you don’t know – so, right away, you’re dealing with third-party data. Also in practice, you’re defining the model audience based on their doing something you want: e.g., you want to improve ad targeting, so you look at attributes shared by people who are responding (landing, buying, etc.) and find people like that. Obviously, you don’t know a lot about most people, but you may know enough to do lookalikes.

For example: you have a list of userIDs that you have linked to your CRM database. You have also acquired some information about these people from third parties. So once you’ve built up your model audience, the DMP will look to see what you know about these people – what their attributes are. These attributes are then bumped up against the “internet population” (often), which is simply a model of online adults. By comparing these two groups – model audience vs. internet population – you can see which attributes define your audience and which do not by seeing where they overindex (or underindex) vs. the broader population. Then you create another audience using the overindexing attributes, purchase targets from third-party vendors, and serve them ads.
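The indexing step in that example can be sketched in a few lines of Python. This assumes the usual media convention that an index of 100 means "same as the population"; the function names, the threshold of 120 and the sample data are illustrative assumptions.

```python
def attribute_index(audience_profiles: dict, population_rates: dict) -> dict:
    """Index = 100 * (share of model audience with attribute) / (share of population).

    audience_profiles maps userID -> set of attributes.
    population_rates maps attribute -> share of the internet population (0..1).
    """
    n = len(audience_profiles)
    indexes = {}
    for attr, pop_rate in population_rates.items():
        if n == 0 or pop_rate <= 0:
            continue  # cannot index against an empty audience or a zero base rate
        have_it = sum(1 for attrs in audience_profiles.values() if attr in attrs)
        indexes[attr] = 100 * (have_it / n) / pop_rate
    return indexes


def overindexing(indexes: dict, threshold: float = 120) -> list:
    # Attributes at or above the (assumed) threshold define the lookalike audience.
    return sorted(attr for attr, idx in indexes.items() if idx >= threshold)


model_audience = {
    "u1": {"luxury_auto", "pet_owner"},
    "u2": {"luxury_auto"},
    "u3": {"luxury_auto", "pet_owner", "renter"},
    "u4": set(),
}
internet_population = {"luxury_auto": 0.25, "pet_owner": 0.5, "renter": 0.5}
```

With this toy data, `luxury_auto` indexes at 300 (75% of the model audience vs. 25% of the population), `pet_owner` at 100 and `renter` at 50, so only `luxury_auto` would make it into the lookalike audience.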

This example is a simple one, of course. But you get the idea. Although the modeling can be quite complex and the scale fiendish, the number of attributes on which the lookalike swings is often pretty small – say, ten or fewer. Sometimes there’s just one or two. But they’re built on the DMP’s ability to cultivate a set of attributes linked to a userID from first- and third-party sources, and to use these attributes to build audiences.

A word about programmatic advertising: For advertising, DMPs are usually paired up with one or more demand-side platforms (DSPs), which may or may not be offered by the same vendor. For example, MediaMath and Rocket Fuel combine DMP and DSP functions, while Oracle does not. A DMP will need a DSP connected if it wants to play in the world of programmatic advertising – in particular, if the client wants to do real-time bidding (RTB) on the ad exchanges.

The role of a DSP is to make ultra-fast calculations about whether – and how much – to bid on a particular impression. It is a special skill we won’t get into now. Except to say that this is where the fast-thinking part of the DMP is crucial. The DMP’s role in RTB is to provide stat-stat-stat the information the DSP needs to know how to bid or pass. The basic mojo is: the ad seller provides information about the person to the exchange/DSP which can include descriptive information (e.g., age, gender, interests, location) which the buyer might be looking for … or it can include JSON objects naming what segments (from what data providers) the user might belong to (e.g., “in-market for BMW from Nielsen/eXelate”), or more detailed identifiers.
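To give a feel for the bid-or-pass decision, here is a toy Python sketch. The JSON shape, the `decide_bid` function and the bid logic (base bid times number of matching segments, capped) are all assumptions made for illustration; real exchanges follow their own specifications and real DSPs use far more elaborate pricing.

```python
import json


def decide_bid(bid_request_json: str, target_segments: set,
               base_bid_cpm: float, max_bid_cpm: float):
    """Return a CPM bid, or None to pass on the impression."""
    request = json.loads(bid_request_json)
    # Segments the seller says this user belongs to, keyed by (provider, name).
    user_segments = {
        (seg["provider"], seg["name"]) for seg in request.get("segments", [])
    }
    matches = user_segments & target_segments
    if not matches:
        return None  # nothing the buyer is looking for: pass
    # Arbitrary illustrative rule: bid more when more target segments match.
    return min(base_bid_cpm * len(matches), max_bid_cpm)


sample_request = json.dumps({
    "user": {"age": "25-34", "gender": "F", "location": "Chicago"},
    "segments": [
        {"provider": "Nielsen/eXelate", "name": "in-market for BMW"},
    ],
})
```

A buyer targeting `("Nielsen/eXelate", "in-market for BMW")` with a base bid of 2.0 would bid 2.0 on `sample_request`; a buyer targeting anything else would pass.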

But that’s all on that, for now.


8. They were also designed to personalize websites

There are of course many ways to “personalize” a site and app and call center cue. Everyone and their dance partner does personalization, including landing page optimizers, testing platforms, content management systems, recommendations engines and so on. DMPs don’t do content management, but they have almost from the gate been used to make decisions about what to show (or say) to people.

How? In fact, the same way they help clients decide what ads to show:

  1. Pull a userID
  2. Look up the attributes
  3. Figure out what audience(s) they’re in (their labels)
  4. Apply some rules to decide what to show (or say) to that userID
  5. Tell somebody

Usually, the client will set up a rule that says: “If the userID belongs in Audience Undead X, then show them the Zombie Creative Z” and so on. Rules can be set up for many audiences, of course, and multiple creatives can be shown to a single audience. (This second point means the DMP can indeed be used for A/B testing and landing page optimization BTW.)
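The five steps and the rule above fit in one short Python sketch. Everything named here (`decide_creative`, the dictionaries, the default creative) is hypothetical, and "tell somebody" is reduced to returning the decision to the caller.

```python
def audiences_for(attributes: set, audience_rules: dict) -> list:
    # Step 3: an audience is any rule the user's attributes satisfy.
    return [name for name, rule in audience_rules.items() if rule(attributes)]


def decide_creative(user_id: str, profiles: dict, audience_rules: dict,
                    creative_rules: dict, default: str = "default_creative") -> dict:
    # Steps 1 and 2: pull the userID and look up its attributes.
    attributes = profiles.get(user_id, set())
    memberships = audiences_for(attributes, audience_rules)
    # Step 4: the first audience with a creative rule wins (an assumed tie-break).
    for audience in memberships:
        if audience in creative_rules:
            return {"user_id": user_id, "audience": audience,
                    "creative": creative_rules[audience]}
    # Step 5: tell somebody, here by handing the decision back to the caller.
    return {"user_id": user_id, "audience": None, "creative": default}


profiles = {"u1": {"zombie shuffle", "zombie glare"}}
audience_rules = {"Audience Undead X": lambda attrs: "zombie shuffle" in attrs}
creative_rules = {"Audience Undead X": "Zombie Creative Z"}
```

`decide_creative("u1", profiles, audience_rules, creative_rules)` picks "Zombie Creative Z"; an unknown userID falls through to the default creative.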

As we’ll see in a minute, the client sometimes wants the DMP itself to suggest what messages to show to what audiences. The way the DMP does this will come as no surprise to those of you intrigued enough by this topic to get all the way down here in this sequel blog post. Basically, the DMP has to have observed how a large set of userIDs reacted to a potential set of messages. It then calculates which audiences – fundamentally, which attributes – are correlated with the “good” reactions. These attributes are then assembled into a new audience and rules created to show the winning messages to responders and their lookalikes.
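A stripped-down version of "observe reactions, then pick the winner" might look like the Python below. It simply compares response rates per audience and message; the function name, the `min_impressions` guard and the data shape are assumptions, and a real DMP would work at the attribute level with far more statistical care.

```python
from collections import defaultdict


def best_message_per_audience(observations, min_impressions: int = 1) -> dict:
    """observations: iterable of (audience, message, responded) tuples.

    Returns audience -> (winning message, its response rate).
    """
    shown = defaultdict(int)
    good = defaultdict(int)
    for audience, message, responded in observations:
        shown[(audience, message)] += 1
        if responded:
            good[(audience, message)] += 1

    best = {}
    for (audience, message), n in shown.items():
        if n < min_impressions:
            continue  # too few observations to trust the rate
        rate = good[(audience, message)] / n
        if audience not in best or rate > best[audience][1]:
            best[audience] = (message, rate)
    return best
```

The resulting mapping is exactly the raw material for rules of the form "show the winning message to this audience".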

Since they’re just sending decisions anyway, DMPs can send them to places other than the client’s website. In fact, anywhere that wants (and can use) them. Mobile SMS messages are a possibility. A CRM or email system can be configured to send an email based on a DMP’s alert. Call center agents or robo-dialers can be sent alerts too. These types of actions require the DMP to communicate with an outside system via API. Based on the connection, the DMP can send a lot of useful information to another system – in fact, anything that’s in the DMP. The outside system can then be set up to do what it will with this information.

No doubt you’re with me on this.


9. They make many decisions based on predefined rules

DMPs are big on rules, or what the enlightened call logic. In a way, an audience is just a set of rules that map a bunch of attributes to one another (“IF userID has ‘zombie shuffle’ AND ‘zombie glare’ BUT NOT ‘rosy complexion’ THEN put in Audience ZZ”).
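That zombie rule translates almost word for word into Python. The `AudienceRule` class and `build_audience` helper are illustrative names of my own; the point is only that an audience is a rule applied to attribute sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudienceRule:
    """Hypothetical rule: all `required` attributes AND none of the `excluded`."""
    name: str
    required: frozenset
    excluded: frozenset = frozenset()

    def matches(self, attributes) -> bool:
        attrs = set(attributes)
        return self.required <= attrs and not (self.excluded & attrs)


def build_audience(profiles: dict, rule: AudienceRule) -> set:
    # Pull every userID whose attributes satisfy the rule into the audience.
    return {uid for uid, attrs in profiles.items() if rule.matches(attrs)}


audience_zz = AudienceRule(
    name="Audience ZZ",
    required=frozenset({"zombie shuffle", "zombie glare"}),
    excluded=frozenset({"rosy complexion"}),
)
```

A user with the shuffle and the glare lands in Audience ZZ; add a rosy complexion and they are out.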

Another simple set of rules frequently used by DMPers are triggers. Any piece of info that is passed back to a DMP can trigger an action. This means any tag fired on the client’s website, or passed through the URL string, for example, can be set up to cause the DMP to trigger something. Like what? A common example would be to “serve a particular offer” or “send a message to the email system to send an email.”

Triggering a specific action that the user notices is the most dramatic role for a trigger, but there are less obnoxious roles. For example, triggers can be used to update an attribute (e.g., “visited_zombie_page = Y”). They can also be used simply to keep up with the frequency of particular actions or experiences by counting. These counters are often used to make sure a particular user (or a group of users) does not see or experience something too often. In other words, for frequency capping.
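Both of those quieter roles, attribute updates and frequency capping, are easy to sketch in Python. The `TriggerEngine` class, its method names and the event names are invented for illustration and assume an in-memory store rather than whatever a real DMP uses.

```python
from collections import defaultdict


class TriggerEngine:
    """Toy trigger handler: updates attributes and enforces a frequency cap."""

    def __init__(self, attribute_triggers: dict, frequency_cap: int):
        # attribute_triggers maps an event name -> (attribute, value) to set.
        self.attribute_triggers = attribute_triggers
        self.frequency_cap = frequency_cap
        self.attributes = defaultdict(dict)   # userID -> {attribute: value}
        self.exposures = defaultdict(int)     # (userID, offer) -> times served

    def on_event(self, user_id: str, event: str) -> None:
        # A fired tag updates an attribute if a trigger is configured for it.
        if event in self.attribute_triggers:
            attribute, value = self.attribute_triggers[event]
            self.attributes[user_id][attribute] = value

    def should_serve(self, user_id: str, offer_id: str) -> bool:
        # Count exposures and refuse once the cap is reached.
        key = (user_id, offer_id)
        if self.exposures[key] >= self.frequency_cap:
            return False
        self.exposures[key] += 1
        return True


engine = TriggerEngine(
    attribute_triggers={"zombie_page_tag": ("visited_zombie_page", "Y")},
    frequency_cap=2,
)
```

With a cap of 2, the third call to `engine.should_serve("u1", "offer_1")` returns False, while other users and other offers are unaffected.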

Of course, triggers do not have to be simple Pavlovian reactions: userID does this, DMP says “do that.” They can be combined in complex ways and used in media targeting, including RTB. But the important difference between triggers and other things we’ve been talking about here – like attributes and audiences – is this:

A trigger is information about an action, not a user

Of course, a user is the thing doing the action, but you get my point. A trigger says “User X did B” or “User Y did C at 10pm on Sunday” – but not “User X is a zombie.” Another difference is that triggers and rules can’t really be “syndicated” outside the DMP’s own ecosystem.

Today’s Sermon: I’ll repeat here that I think the key to really grokking the DMP is to keep in mind its bipolar nature: it has two speeds (fast and slow). This is not so odd for a Thinking Machine. In fact, our own human brains work the same way (assuming you are human and not undead). We have a fast brain for rapid decisioning based on partial data and preprocessed rules, and we have a slower brain that we use when we have a moment to ponder. (The best book on this topic is Daniel Kahneman’s Thinking, Fast and Slow.)

Similarly, the DMP stores a lot of attributes against a userID in a profile for deep thoughts (slow) . . . and it has a quick-twitch set of lookups and “user states” against that same userID to guide zippy decisions (fast).


10. DMPs are better at counting than analytics

This statement always gets me into trouble, so I’ll cover my caboose by disclaiming that I don’t mean DMPs are naïve or simple-minded. All I’m claiming is that they are not data mining, B.I., or general purpose analytics tools. They do not compete with Tableau, QlikSense, or even Google Analytics. The analytics and reporting DMPs do is purpose-built to do those things a DMP user needs to do with her DMP.

Such as?

Well, the types of analytics DMPs do can be divided into three categories:

  1. Tools to help set up media targets
  2. Visualizations and reports
  3. Models that automate decisions

The first two are what we might call descriptive or basic analytics. #3 is more fancy and can be a real point of difference among providers.

Media Targets: As we’ve seen, DMPs can be used to show the attributes of userIDs that do things you’re interested in having people do (like buy). This is done in a way that media buyers have come to know and love: by indexing a set of targets (here, an audience) vs. an available pool of media or people. The DMP user will typically pull reports for – say – “campaign responders” (as an audience) and look at a set of attributes. Usually: age, gender, kids (household stuff), geography, financials, devices, and first-party data.

DMPs have charts and graphs that help the buyer see how these responders compare to other (targetable) populations, such as the US internet average or various third-party segments that can be purchased. In this way, clients can see which attributes describe converters and which second- (i.e., partner) or third-party data or publishers might reach them.

Visualizations and Reports: Visualizations are used to help targeting, as we saw. They are also used in reporting. A common report offered by media players like DMPs is called the “Funnel R/F” and it means “Funnel Reach/Frequency” – which is shorthand for a report that attempts to quantify the impact on the purchase funnel of a campaign in terms of its reach (how many people saw it) and its frequency (how often). The “funnel” here is defined the traditional way: awareness -> consideration -> conversion. Tactically, a client just labels particular tags or user actions as belonging to a particular stage in the funnel. In this way, the DMP can display the message, reach and frequency in terms of what they did to the funnel. For example: Campaign Z drove Audience Zomb1 into the conversion phase more successfully than Campaign ZZZ.

You can see how this feature allows us to see trends over time, which messages worked better, which frequencies are better, and how various channels work or don’t work together, like search and display.
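A bare-bones Funnel R/F calculation could look like this in Python. The input shapes and the `funnel_reach_frequency` name are assumptions, and the sketch makes no attempt at attribution: it only counts exposed users who were also seen at each funnel stage.

```python
from collections import defaultdict


def funnel_reach_frequency(impressions, stage_events) -> dict:
    """impressions: iterable of (campaign, userID), one tuple per ad exposure.
    stage_events: iterable of (userID, stage) from tags labeled with a funnel stage.
    """
    exposures = defaultdict(lambda: defaultdict(int))
    for campaign, user_id in impressions:
        exposures[campaign][user_id] += 1

    stage_users = defaultdict(set)
    for user_id, stage in stage_events:
        stage_users[stage].add(user_id)

    report = {}
    for campaign, counts in exposures.items():
        reach = len(counts)                         # unique users who saw it
        frequency = sum(counts.values()) / reach    # average exposures per user
        exposed = set(counts)
        report[campaign] = {
            "reach": reach,
            "frequency": frequency,
            # Exposed users observed at each stage of the funnel.
            "stage_reach": {
                stage: len(users & exposed) for stage, users in stage_users.items()
            },
        }
    return report
```

Comparing `stage_reach["conversion"]` across campaigns is the toy equivalent of saying Campaign Z drove its audience into the conversion phase more successfully than Campaign ZZZ.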

Automated Models: This is where the magic of machine learning happens. We’ve already seen an example, in the case of identifying lookalikes, and when a DMP user leans on the DMP itself to suggest messages to serve to a particular audience. We could expand the list of examples, but the basic idea is that the DMP can – within certain very well-defined boundaries – do what machine learners do: figure out which attributes (features) predict good things (success) and determine how likely userZ is to belong to the group of people who do those good things.
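As a stand-in for whatever a given vendor actually runs, here is a tiny Bernoulli naive Bayes classifier with add-one smoothing in plain Python. It is my choice of illustration, not a description of any DMP's model; it learns which attributes go with responders and scores how likely a new user is to belong to that group.

```python
import math


def train_naive_bayes(examples, attributes) -> dict:
    """examples: list of (set of attributes, responded) pairs."""
    counts = {True: 0, False: 0}
    attr_counts = {label: {a: 0 for a in attributes} for label in (True, False)}
    for attrs, responded in examples:
        counts[responded] += 1
        for a in attrs:
            if a in attr_counts[responded]:
                attr_counts[responded][a] += 1

    total = counts[True] + counts[False]
    model = {"attributes": list(attributes), "prior": {}, "likelihood": {}}
    for label in (True, False):
        # Add-one (Laplace) smoothing keeps every probability strictly in (0, 1).
        model["prior"][label] = (counts[label] + 1) / (total + 2)
        model["likelihood"][label] = {
            a: (attr_counts[label][a] + 1) / (counts[label] + 2) for a in attributes
        }
    return model


def response_probability(model: dict, attrs: set) -> float:
    """Estimated probability that a user with these attributes responds."""
    log_scores = {}
    for label in (True, False):
        score = math.log(model["prior"][label])
        for a in model["attributes"]:
            p = model["likelihood"][label][a]
            score += math.log(p if a in attrs else 1 - p)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_true = math.exp(log_scores[True] - top)
    exp_false = math.exp(log_scores[False] - top)
    return exp_true / (exp_true + exp_false)
```

Train it on three responders who all have "zombie glare" and three non-responders who do not, and a new user with the glare scores about 0.8 while one without it scores about 0.2.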

There is so much more here it’s not funny, but I have overstayed my welcome in the blogosphere. We’ll get there some other time.

For now, let me leave you with this thought. It is related to machine learning. The goal of machine learning in many cases is to label something – e.g., “given set of attributes ZZZ – userX is likely to respond/belongs in audienceQ/is worthless.” And that’s a pretty good description of the DMP itself:

The purpose of the DMP is to label people

There you have it. Peace.

Favor: If you’ve read down to here and got something out of this 6,300-word DMP series, please tweet me @martykihn. Just tweet the simple hashtag #macaroni. (This is a pretty obscure reference to a Miranda July movie called Me and You and Everyone We Know.) Thx.

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.



  • JK says:

    The links to the first part of this two article set are broken.

  • Christian says:

    Hi Martin,

    Great series of articles, really enjoyed reading them!
    In your opinion, how good of a job are (the best) DMPs doing in cross-device mapping?
    What are the crucial factors in getting a high rate of cross-device-mapped inventory?
    I know that e.g. Tapad’s device graph is class leading when it comes to combining probabilistic and deterministic methods.
    But how do the best DMPs do in comparison?

    Thank you 🙂


    • Martin Kihn says:

      Thanks Christian – appreciate the note. Cross-device is a complicated topic and in fact the topic of a recent panel I moderated at AdExchanger Industry Preview 2016. Oracle/BlueKai’s Omar Tawakol was on the panel and it seems they’ve made some impressive strides in this area. In fact, many DMPs use Tapad, Drawbridge and other specialty data co’s behind the scenes (among other sources), and the best source of cross-device data turns out to be the client’s own login data. So the quality of the map will vary.

  • igarelj says:

    Very thorough and educational. Thank you!

  • Frand says:

    Amazeballs. Love your writing style, super informative.

  • CC says:

    Interesting series of articles and very informative. But I’m wondering who the big players are in this game? And if mid-to-large-sized organizations are outsourcing this service or looking to bring new technology in-house instead? Obviously there are sensitivities/suspicions around the use of ‘big data’ and how/who we share this data with – are companies managing DMPs themselves or leveraging pre-built solutions by providers?

    • Martin Kihn says:

      Sorry – just saw this 🙂 Good question – there are a number of established DMPs (some of which are part of “marketing clouds”) that our clients use with success. These are cloud solutions that are well maintained. We list some of the leaders in our “Magic Quadrant for Digital Marketing Hubs.” Independents include Krux and Lotame, combined DMP/DSPs include MediaMath and DataXu, and cloud versions incl. Adobe and Oracle.