Blog post

The Elephant in the Room – Master Data and Application Data

By Andrew White | January 14, 2014 | 8 Comments

SOA Strategy, SOA, Information Stewardship, Information Management, Information Leadership, Information Innovation Yield Curve, Information Governance, Information Architecture, Enterprise Information Management, Defining Master Data, Data Warehouse, Data Stewardship, Data Governance, Chief Data Officer, Application Architecture

Alternative title for this blog: The Difference between Master Data and Application Data is Huge – and Missing!

Many users, and even some analysts, confuse master data with the other data used by a business application to support the intended work, analysis or business process.  Both types of data are used and needed by business applications to instantiate the task.  However, this distinction, relatively new in the industry, is lacking in a great many organizations.  It is becoming increasingly important as business systems get more complex.  With new business models emerging, new mobile apps, new cloud offerings, stuff is just getting more complex!  Worse, some vendors don’t even want end users to “get this” distinction since those very same vendors are not (yet?) ready to help end users sufficiently.

Business applications, including operational line-of-business applications (ERP is a good example) as well as downstream analytical applications, or the new class of operational line-of-business analytical applications, all use data.  Some of that data persists in some form in other applications.  Sometimes that data persists in all the major applications.  Some data persists in just one application.  And some other data persists in several applications.  We only formalized “master data” six or seven years ago in order to express the problem that data shared in this way is inconsistent across those common uses.

So the difference between master data and application specific data is this:

  • Application specific data is used ONLY by that application.  This data does NOT appear anywhere else in the application landscape.
  • Master Data is common across many applications.  It is not specific to any one business application.
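
As a rough illustration only, the two definitions above can be read as a simple rule over an inventory of which applications use which data. The sketch below assumes such an inventory exists; the application names, the data element names and the `min_apps` threshold are all invented for the example and are not part of any product or standard.

```python
from collections import defaultdict

# Hypothetical inventory of (application, data element) pairs.
# Every name here is invented for illustration.
USAGE = [
    ("ERP", "customer"),
    ("CRM", "customer"),
    ("Billing", "customer"),
    ("ERP", "product"),
    ("CRM", "product"),
    ("ERP", "mrp_planning_run"),
    ("CRM", "sales_call_note"),
]


def classify(usage, min_apps=2):
    """Label each data element by how many distinct applications use it.

    min_apps is an assumed threshold: the post says "many" applications,
    and 2 is simply the smallest possible reading of that word.
    """
    apps_by_element = defaultdict(set)
    for app, element in usage:
        apps_by_element[element].add(app)

    return {
        element: "master" if len(apps) >= min_apps else "application-specific"
        for element, apps in apps_by_element.items()
    }


if __name__ == "__main__":
    for element, label in sorted(classify(USAGE).items()):
        print(f"{element}: {label}")
```

The rule itself is trivial; the hard part in practice is building and maintaining the inventory. Re-running the same rule against an updated inventory would show an element moving from one label to the other as its usage changes.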

For the great majority of our own IT lives this distinction was not that important. 

“You want a new app, sir?  Just tell me what data you need – and I’ll get it or create it for you.  I will even supply you one application at a time.”

This has led to silo mania.  You know this to be true – most of you live this experience every day.

Some MDM vendors will help end users “get this” distinction and can help them build sustainable, operational information governance programs that take this distinction into account.  If you DON’T take this distinction into account, the risk is that your new MDM hub will literally, over time, develop into yet another bloated ERP-like data model!  We will have failed and gone full circle.

Other vendors don’t like this distinction since they only sell tools to help manage data in an application – probably their application.  Better yet, they (the vendors) will even sell you this “application data stewardship” capability as if it were designed for MDM.  A solution designed to manage the data maintenance in a specific application will differ in functionality from a solution designed to govern and steward information across any and all applications.  The former is not an MDM-like design architecture; the latter is exactly that.

If we don’t make this distinction, the result is clear:

  • Data silos proliferate
  • Data integration tends to focus on copying data and moving it, even transforming it, but rarely helping in governing it for reuse
  • Application, integration and storage (IT centric) costs are higher than they should be
  • IT-based business agility is less than it could be

Bottom line: Investments in information fail to deliver the expected benefits and as a result, information driven business outcomes suffer.

This distinction in the data is so central to effective application, information and SOA strategy that it’s hard to convey.  If DNA did not have a common map from which to copy itself, errors would overcome the purity of the living form.  If printing presses did not offer a way to standardize the printed word, the core messages in each Bible would not persist from one copy to the next.

But even today, in 2014, we see too many inquiries from end users where this distinction is not understood.  If we don’t collectively get this point across, and embed it in our IM strategies and architectures, MDM really won’t succeed as we want, and need, it to.  EIM overall will grind to a halt….  More to follow in Research in 2014.

The Gartner Blog Network provides an opportunity for Gartner analysts to test ideas and move research forward. Because the content posted by Gartner analysts on this site does not undergo our standard editorial review, all comments or opinions expressed hereunder are those of the individual contributors and do not represent the views of Gartner, Inc. or its management.


  • @Andrew, very interesting clarification – and I agree, much needed.

    When you say “Master Data is common across many applications”: is that always the case? Wouldn’t there be cases of Master Data that is derived from (or used by) only one application? Yet be critical enough for the organization that it warrants the type of governance and care that an MDM program can bring?

    As a side note, I have to say, in the days we live in, managing to talk about elephants without talking about big data is quite an accomplishment! 😉

  • Andrew White says:

    Hi Yves, Happy New Year to you – and thanks for the post!

    You ask (I paraphrase, I hope correctly), “wouldn’t there be the case that master data is used by one application?” I will take this opportunity to suggest that this cannot be the case. But let me be clear. Master data, by definition, should be widely re-used. That implies many applications. Each application may thus use a copy of the master data, and add (i.e. author?) data within its own control. The combined set of data, master data and application specific data, is used by that single application to do what it does.

    So your question (if I understand it) suggests that an application may create its own data, and not have that data shared with any other, and so is this data not master data? I suggest that this is not master data: it is not widely reused. I would propose that we call this data ‘application specific data’ so that we know the difference between it and master data. I propose this since the governance and stewardship effort and work will (or should) differ for each.

    Do I understand the question correctly? If so, does my answer help explain my thinking? Does it make sense?

    I was driving my two oldest sons to school this morning. We are listening to the radio recording of Douglas Adams’s The Hitchhiker’s Guide to the Galaxy. We just got to the part where Deep Thought provides the answer to Life, the Universe and Everything. As my oldest son got out of the car, he turned to me and said, “I will answer every question in every class today with, ‘42’, and see where that gets me!”

  • Hi Andrew – Happy New Year to you too! And thanks for the detailed response. Glad you did not just answer 42.

    Yes, you understood my question right – with one caveat. I wasn’t suggesting that the data *should not* be shared, but that there may be cases where no other system is “interested” in this data. So it’s not that the application does “data retention” but that there is no “buyer” out there.

    Let’s say that we are talking about data that would be master data should there be other applications “interested” in that data. How would you then treat the need for governance, stewardship, consistency, quality (and more) of application-specific data? Is this purely a data quality problem? So many of the same techniques, processes and challenges seem to apply…

    Maybe what I am getting at is that there are, within application data, different classes. Could there be such a thing as “application-specific master data” (as opposed to such things as transactional data or logging data)?

  • Andrew White says:

    Yves, thanks for following up. Your follow-up makes me realize I had (at least) missed the temporal impact. After having this dialog with a number of end users, a subset of those users have gone on to ask, “what happens over time, when requirements change?” So your question makes more sense to me when I think this:

    a) for a time, this data in question is only used by one application. Thus it is identified, tagged and classified as application specific data for the time being. It will attract the applicable governance and stewardship rules and policies and controls.
    b) over time, new requirements may emerge, and what was once used by one application is now used or wanted by many more applications.

    As such, its categorization changes and it is now master data. It warrants a different approach to stewardship and governance.

    This has been my standard response for a few years now. These two points actually lead to some other interesting ideas that vendors have, for the most part, not really addressed in the market:

    1) The response suggests 2 tiers – master data and other stuff. Clearly that is not the case. But how many tiers should there be? Is there an ideal design style? Is there a model that works for groups of companies, or industry, or application landscape? You do mention several different kinds of data also.
    2) The response also suggests that even given a), the centralized governance function needs to know about what is in b) since how else would we know that we had candidate data for re-classification to master data? This then expresses how important metadata and its management is to MDM and how MDM relates to the environment.
    3) Wouldn’t it be nice if users could see in one place a living view of all this data, in both business sense (high level of granularity) as well as an IT sense (lower level of granularity)? And if I wanted to “flip a switch” to make this change – awesome!

    Anyway, I hope we have gotten to the bottom of your question Yves. Let me know if not. I feel a blog on metadata management coming…. 🙂

    Thanks again,

  • Fx Nicolas says:

    Excellent post Andrew. I agree with you on the fact that “real” master data is shared (and often scattered) across applications.


    We see more and more customers willing to create additional information attached to the real master data. This additional information is authored within the MDM Hub, making it a real “application data” provider.

    I even saw customers willing to create master (and even non-master 😉 ) data ex-nihilo directly in the hub, because they had a long-standing business requirement for managing this data, but no existing application to put it in. So the MDM became a good alternative to a RAD Tool.

    Although replacing a RAD solution is not the core purpose of an MDM solution, it may be one of its practical uses, with all the risks that you mentioned.

    I’m curious to hear your (and Yves’) thoughts about this.

  • Andrew, thanks, this is now very clear. I trust where that leaves us is:

    (i) master data is, by its very essence, central and shared

    (ii) there are several classes of application-specific data, and some of these classes require governance and stewardship practices that are close in nature to the governance and stewardship of master data

    (iii) some communication, integration, metadata sharing is required between the teams that govern master data and the teams that govern application data, to ensure that change happens when it needs to

    Can’t wait to read further on this last topic – and I think (ii) would be interesting to explore.

  • Hi FX, I have also seen people who use MDM tools to develop applications to interact with data, but quite frankly this is marginal (most of the ones I encountered are actually doing it with our open source MDM). As you know there is a lot more to MDM than the technology, and data screens built with an MDM tool do not magically promote this data to master data.

  • Hmm says:

    Isn’t this what “4th Normal Form” was supposed to cure?