by Andrew White | May 23, 2013 | Submit a Comment
An end-user posed a question recently as follows: how does one manage the integrity of master data changes across different environments – e.g. development, testing and production? This user has a home-grown master data mapping service (as opposed to full MDM), and as its footprint has grown, the home-grown maintenance framework around it (based on database synchronization scripts etc.) is starting to creak a little. Some fairly fundamental questions occurred to the user in this context:
- Should we really treat master data entities like we do software changes – i.e. push to development and then promote through to production?
- Should we define an entity life-cycle – e.g. ‘pending’, ‘active’, ‘terminated’ – and explicitly associate this with the entity instances themselves?
- How do we best manage the ‘merge’ challenge? Multiple changes need to ripple through to support testing – but how do we ensure when we’re finally ready to ‘promote’ an entity instance to ‘active’ status in production, our testing has been representative?
The first thing I have to say is that this is a very common set of questions, once an end-user has reached the point of seeking to understand what we call the information lifecycle building block of MDM – in fact that is the 6th building block (the others being Vision, Strategy, Metrics, and so on). An understanding of how master data flows across the organization leads to the realization (an ‘ah ha’ moment, if you will) that master data has state, and that state changes.
Such complexity however differs by data domain and industry. For example, distribution segments may manage complex product data in catalogs (of all kinds) to help with sourcing, selling, supply chain etc. Much data needs to be “held pending approval”. Eventually that data may be “approved for use”, or maybe “approved for enrichment before use” and so on. Such state needs to be explicitly governed – not necessarily at the data level but through rule/policy setting, automation, and monitoring processes.
Equally, customer data has a lifecycle – as does student data. I presented such cycles (visually) at our recent MDM Summit in Texas: Suspect, Prospect, Customer, Reference, and Retired for customer – and also Suspect, Applicant, Student, Graduate, Post Graduate, Alumni and so on for student.
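To make the idea concrete, here is a minimal sketch of an explicit entity lifecycle, using the customer states above. The transition rules themselves are my own assumption for illustration – the point is that declaring states and allowed transitions as data lets governance review and change them without touching application code:

```python
# Hypothetical sketch: the customer lifecycle states come from the post
# above; the allowed transitions are invented for illustration.
CUSTOMER_LIFECYCLE = {
    "suspect":   {"prospect", "retired"},
    "prospect":  {"customer", "retired"},
    "customer":  {"reference", "retired"},
    "reference": {"retired"},
    "retired":   set(),
}

def transition(record: dict, new_state: str, lifecycle=CUSTOMER_LIFECYCLE) -> dict:
    """Return a copy of the record in the new state, or raise if the
    transition is not permitted by the declared lifecycle."""
    current = record["state"]
    if new_state not in lifecycle.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_state}")
    return {**record, "state": new_state}

rec = {"id": "C-1001", "name": "Acme Corp", "state": "suspect"}
rec = transition(rec, "prospect")    # allowed
# transition(rec, "reference")       # would raise: prospect -> reference
```

The same pattern handles the promote-to-production question: a record only reaches ‘active’ in production through an explicitly permitted transition, rather than an ad hoc database script.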
Finally, the merge question comes up very often in customer and other party-type data domains. One reason is that organizations get bought and sold, so one entity comes to “report” into another – its transaction history might be merged in a data warehouse for analytics, and the master data might be merged too. Or it could be that candidate records for a common master data object need to be “merged” (i.e. de-duplicated). In many cases the lineage history (when, who merged, before and after images) needs to be kept – sometimes for legal reasons.
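A hypothetical sketch of such a merge, keeping the lineage record (who, when, before and after images) alongside the surviving record – the field-survivorship rule here (survivor’s values win on conflict) is an assumption for illustration, not a recommendation:

```python
from datetime import datetime, timezone

def merge_records(survivor: dict, duplicate: dict, merged_by: str) -> tuple:
    """Merge a duplicate candidate record into a survivor, returning the
    merged record plus a lineage entry with before/after images."""
    before = {"survivor": dict(survivor), "duplicate": dict(duplicate)}
    merged = dict(survivor)
    for key, value in duplicate.items():
        merged.setdefault(key, value)  # on conflict the survivor's value wins
    lineage = {
        "merged_by": merged_by,
        "merged_at": datetime.now(timezone.utc).isoformat(),
        "before": before,            # images kept for audit/legal reasons
        "after": dict(merged),
    }
    return merged, lineage

merged, lineage = merge_records(
    {"id": "A", "name": "Acme"},
    {"id": "B", "name": "Acme Corp", "phone": "555-0100"},
    merged_by="steward.jones",
)
```

In practice the lineage entries would be persisted separately from the master records, so an un-merge (or a legal discovery request) can replay exactly what happened.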
So the answers to the questions are most definitely, yes, yes, and carefully.
Category: Information Governance Information Policy Master Data Master Data Lifecycle Master Data Management MDM Tags:
by Andrew White | May 22, 2013 | 2 Comments
There is a lot of hype related to “information as an asset”. One driver relates to how organizations are seeking to create a wholly new revenue stream through the monetization of their data. Though not really a new idea, it has new impetus due primarily to the growing hype related to big data. If there is more data, isn’t there more opportunity to make money on data and the insight that can be derived from it?
The print edition of the Wall Street Journal today (May 22 2013) had two articles with conflicting views. One story talks about the pending success of selling data. The other tells a less-than-successful story about selling data. Perhaps there are some valuable pointers here.
The Bright Side of Big Data
In “Phone Firms Sell Data on Customers”, the large phone companies were reportedly selling, or preparing to sell, aggregated information about how, where and when we access information from our phone/mobile devices. The point being that retailers and brand owners (and maybe others) are interested in understanding what web sites we might access when our physical location is related to their business. For example, if I am at a hardware or consumer electronics store, and 80% of visitors are known to visit a competitor site while on site (presumably to compare prices and features), that information could be used to target marketing dollars more effectively (even if the where and how is not totally clear, at this time).
There are clear privacy issues here – though the report emphasized that the data in question is not meant to be granular (identifying you or me) but aggregated. However, the report does suggest that data being sold/shared might identify the source down to a few miles – so there are bound to be ways to triangulate on specific identities.
Certainly this opportunity is big, and hot. There is a growing amount of this type of data, and the ability to discover patterns and insight in it is what big data is all about. But it is not always roses…
Big Data is Old News
A few pages after the “phone firms sell data” story was another: “Auto Dealers’ Suit Stalls Shopping of Carfax’s Owner”. The bottom line of this article was that R.L. Polk & Co., itself up for sale, has not built a highly profitable growth business. Polk provides car companies and consumers with (big) data on autos and their history, including accident and damage data, as well as other data you can see via Carfax. The point is that this is an early big data story – even before big data was the story. Yet the profit for Polk was perhaps insufficient to build a sustainable business. It might be viable – but the fact that it is up for sale, and not acquiring other companies, is an interesting point.
Does this mean big data might be bad business? It could be that the opportunity for auto data has passed; the auto industry – though big – is not exactly a hive of growth. But I think the point is clear. The data itself may or may not assure success; having an idea for the kind of value that can be derived from it is probably far more important than the data itself. Perhaps consumer data related to the growth in mobile has more opportunity today than consumer data related to vastly depreciating assets.
Category: Big Data Dark Data Infonomics Information as an Asset Tags:
by Andrew White | May 21, 2013 | Submit a Comment
In the print edition (May 11-17th, 2013) of the Economist, the Buttonwood article, “Age shall weary them” (page 78), raised a major question for many of us: where will productivity come from, given that the West’s working population is getting older (and probably, all other things being equal, less productive)? The article draws on findings from two references – one being “Older Workers and the Adoption of New Technologies”, by Jenny Meyer, ZEW discussion paper, 2008. This short article reminded me of a paper I co-wrote in 2003.
I wrote that note with my friend and colleague, Marc Halpern, under the simple title “Not All Enterprises Need Intelligent Item Numbers”. It was an effective collaboration across information standards, governance, and business applications.
It so happens that I have worked in organizations (engineering firms, at that) where the part numbering schemes were meaningful. That is, the actual codes, and the sequence of codes, meant something business-relevant to the user. In other words, the first two digits described the market the product sold into (defense, non-defense), the next 3 codes described the product category, the next 3 the type of product, and so on – even down to size and color.
I was very young when I joined this firm – and I had to learn how to “read” the numbering scheme in order to communicate with everyone else at the firm. It was in effect a localized language. This language even extended to sister companies and smaller suppliers from whom we procured materials. It was also an efficient shorthand – so it was used efficiently.
This research asked the question: why is this even needed today? The “today” was 2003, and even at that point new business applications were being designed with fast look-up tables, glossaries and so on. In fact new technology was easily surpassing the need for “intelligent item numbers” – but we still received (and do today!) questions from users about the need, or lack thereof, for intelligent item numbers.
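As an illustrative sketch, an “intelligent” item number is essentially a packed record. The segment layout below follows the example described above (2 digits for market, 3 for category, 3 for type, the rest for size, color, etc.); the actual code values are invented:

```python
# Hypothetical parser for an "intelligent" item number. The segment
# layout follows the example in the post; all code values are invented.
MARKETS = {"01": "defense", "02": "non-defense"}

def parse_item_number(item_number: str) -> dict:
    """Unpack the meaningful segments of an intelligent item number."""
    return {
        "market": MARKETS.get(item_number[0:2], "unknown"),
        "category_code": item_number[2:5],
        "type_code": item_number[5:8],
        "suffix": item_number[8:],  # size, color, and other attributes
    }

parts = parse_item_number("01123456-RED-L")
```

A modern application would instead store these as searchable attribute fields and let the item number be a meaningless surrogate key – which is exactly why the look-up tables and glossaries mentioned above made the encoding redundant.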
The bottom line finding, between the lines of the published research note, was this:
- If the average age of your employees (using the relevant business systems) was over 35, it was most likely you would suffer extreme resistance to change, and so you would perpetuate whatever numbering scheme you had. And older firms had intelligent numbering schemes.
- If the average age of your employees was under 35, or even nearer 25, chances are you would more easily convert to random-numbering-based systems (and hence leverage the new business application designs).
It is not that intelligent item numbering schemes are less productive – that is not the point. The point is that it is harder to predict what should work well for your organization; productivity is therefore best served by the response appropriate to your workforce.
It’s funny how old ideas go round and round. Something my good friend Bill Blitch told me.
Category: Information Governance Intelligent Item Numbers Intelligent Numbering Scheme Productivity Tags:
by Andrew White | May 16, 2013 | 3 Comments
Overall, Sapphire 2013 for me was a “coming out” party for SAP Hana. She is a debutante, now seeking dance partners. The event did not introduce any new step change in technology or in SAP’s future. Overall the event seemed to be a call to action for the market to get into SAP Hana.
As an old Supply Chain Management business user, I can appreciate the value and promise of what SAP Hana could bring to the market. There were several examples of innovations coming to market. In a past life I tried to bring to market an innovative solution to a problem that required a calculation that had not previously been adopted in the market. It failed; turns out some organizations don’t actually want to know the true cost of some activities. And I remember the days of Fast MRP. Many factories cannot change their schedules minute by minute due to physical constraints. So while I was generally excited, the real world has some very real constraints – as well as political challenges – that will slow adoption of SAP Hana. For industries that have information as their product (insurance, banking, financial services) and those other industries that have elements of information-rich processes (parts of healthcare), SAP Hana has some great promise.
Here are some other observations:
- There was continued focus on expanding the ecosystem of partners and opportunities for SAP Hana
- Cloud and SAP Hana – this was, for me, a bit of a non-event. It was hyped quite a bit, but it seemed the main benefits related to lower TCO for IT. There was some talk of improved access to innovation for the business, but this seemed to me to be part of the SAP Hana message, not the cloud message. I may have missed something, but for me SAP Hana is the main message here – not cloud.
- Cloud, SAP Hana and business transformation. What SAP did NOT do was explore or talk about what SAP Hana and cloud, coupled with Ariba’s business network, could do. And I mean with innovation. Just connecting a network is not different; designing and developing new processes (multienterprise apps) that replace processes (apps) behind the firewall – now that is disruptive. But this was not the message. I guess that innovation will remain with smaller firms that SAP has yet to acquire – if they ever will. Why would you eat your own children?
- Zero latency between OLTP and OLAP – clearly a pending, near future for us. Very exciting. It could eradicate the need for any kind of data warehouse. It could help unify analytics with business processes – thus killing off BI as we know it, and making “process is king” dominant over the BI world. Of course I am getting ahead of myself. Big data will drive demand for more data warehousing – even if the DW is not in SAP Hana… so perhaps it is the on-premises, on-disk data warehouse that will lose its primacy.
- SAP Information Steward, SAP MDG, and information governance. There was some information on a new release of SAP Information Steward (in ramp-up) that attempts to put a financial face on the business impact of poor/bad data. It could be very interesting, and innovative. We wait to see more.
- User experience is a top priority. Well, this has happened before; I forget the catchy name of the last effort. However, SAP Fiori does look promising. The demo was not that useful, but the idea of a unified platform for UI development on HTML5, even using Chrome, sounds promising. We just don’t want too many enthusiastic (or citizen) developers going crazy. We need smart artists to get involved.
Finally, my colleague Nigel Rayner asked Hasso and Vishal, as part of the Executive Q&A, “When and how will SAP support real-time analytics across its various business applications (some built, some acquired) and analytic data warehouses?” The answer given was “today”. There was a little give and take, along the lines of “So does that mean there is a logical data model?”, which attracted a “yes”.
This topic is a kind of holy grail conversation. In fact only just the other week I was party to one of those massive email chains at work where analysts chime in to discuss how process, analytics and data are fighting for ownership and hegemony over each other. We should have explored the issue with Hasso and Vishal. The answer was not really targeted at the question being asked. Of course, any vendor can build an integration for a range of given applications. We have been doing that for years. But:
a) How and when will SAP provide the tools and capability to support operational data governance and stewardship across heterogeneous applications and warehouses, even if they all exist in SAP Hana – and more importantly, when there is a hybrid model? This is required to assure the integrity of any real time analytics platform or solution.
b) How will current customers that have invested in current technology migrate (and pay for it, willingly)?
Neither question is easy to answer. In the first case, no vendor has yet solved this. There are numerous attempts going on in the industry. Master data management as a discipline is part of this dialog, as are semantic discovery and modeling, business glossary, the logical data warehouse and logical operational data store, and data quality. The fact is even SAP would struggle to demonstrate this. SAP is, in this regard, like many other vendors: well aware of the issue and its complexity, but it has built a successful business without needing to solve this holy grail.
The second question adds the dimension of revenue to the same topic. I meet with SAP customers each week that tell me, “We are on a 3 (or 5) year program to consolidate x ERP systems (many are not SAP) to one (of a few) SAP ERP systems.” There is virtually no appetite to invest significantly in any game-changing technologies. Investments in SAP Hana will therefore likely be opportunistic at best. The point is that the current investment has to yield some value. Now business executives will hope that their returns will not get eaten up by aggressive competitors that were late to the ECC on-premises argument, and who jump early onto SAP Hana. SAP wins both times of course – so it’s all revenue to them :-)
All in all this event was well worth the time investment. Good exposure to customers; access to some executives; and even a couple of detailed demos. But what comes next? How can Oracle come up with the SAP Hana killer?
Category: SAP SAP Sapphire 2013 SAP Sapphire Now Tags:
by Andrew White | May 10, 2013 | Comments Off
It’s Friday, it’s 5 o’clock, so it must be “Crackerjack”! Well, ignoring the very British TV series from the 70s, I was packing my laptop away for the day when this email popped into my inbox from TechTarget, with one phrase listed: “data stewardship”. Well, that is a red rag to this bull.
I instantly stopped logging off, and clicked on the link. This took me to a “definition” of data stewardship. The definition started like this:
Data stewardship is the management and oversight of corporate data by designated personnel who typically don’t “own” the data but are responsible for tasks such as developing common data definitions and identifying data quality issues.
The fuller definition is here.
There were a few items I had issues with – and though it’s Friday, I just had to comment.
I liked the following:
- Data stewards document agreed-upon data definitions and formats and ensure that business users adhere to specified standards.
But I was less keen on this:
- Data stewards can come from either the IT department or business units. They often act as liaisons between IT and the business side, functioning both as “data coordinators” who track the movement of data inside an organization and “data correctors” who understand and enforce internal rules on how data can be used.
In our work with clients, we stress how “data stewardship” has to come from the business. For us, data stewardship is about “policy enforcement”, but this requires a business acumen most likely NOT found in IT – nor is this work the responsibility of IT. If IT ends up doing this work, it is very hard for the business to justify “staying involved” with “policy setting” (what we call the role of governance). Too often, history is littered with IT-led efforts to steward data.
Now, to be fair, in actual engagements with users, I tend to be more precise. I tend to offer up:
- Business data/information stewards
- IT data/information stewards.
The business data stewards are the real stewards, who for “13 minutes a week” do the work of monitoring and enforcing information governance policies. The collaboration TechTarget rightly talks about with IT roles (architects and more) is the interface to the IT data steward – an IT role that supports the business steward. But I use the term “steward” in relation to IT very carefully. We always stress how stewardship is from the business, by the business, for the business.
So this blog is not meant to be inflammatory or argumentative – it is just looking for clarity. Hopefully some comments can provide some color.
Category: Information Governance Information Lifecycle Management (ILM) Information Policy Master Data Management MDM Tags:
by Andrew White | May 10, 2013 | 1 Comment
I wrote “The Perils and Pitfalls of Managing Master Data “inside” ERP Systems” on April 24th. Unfortunately I did not get to see the comment, or get to reply, until after our self-enforced time limit for comments expired. So I apologize for taking so long to get back to this topic.
Here is my response to Sowmindra.
Thanks for leaving a comment.
You ask a good question. We tend to call the set of questions related to “where the master data resides” as “implementation styles” type questions. That is, the degree to which the data is instantiated. We have seen a wide array of requirements that favor a centralized approach (push out), a registry approach (point to), as well as a consolidated approach (collect and harmonize). All three styles have, to varying degrees, worked for operational/transactional MDM situations. There are also some industry specific “favorites” such as registry for patient data in healthcare. So there does not seem to be one answer to your question. We distilled the more notable characteristics that end user organizations consider in this discussion in a note we updated in 2011: The Important Characteristics of the MDM Implementation Style – Update. My colleague, Lyn Robison, also tackled the same question and drilled down on the decision drivers in more detail with A Comparison of Master Data Management Implementation Styles in October 2012.
IBM’s book, Enterprise Master Data Management (2008), is a pretty good resource for some of this dialog, but even it has a few gaps, and the models in the book do not quite align with ours (which makes using the rest of the research harder than it needs to be). Most other so-called MDM books really don’t cover all the bases, mostly because the authors have limited exposure to, or experience with, only one type of master data (i.e. customer) or one industry (i.e. banking). One needs a wide array of experiences to spot the real patterns in this topic.
Again, sorry for the lateness.
Category: Implementation Style Master Data Management MDM Tags:
by Andrew White | May 10, 2013 | Comments Off
As you may know, we are into the data gathering phase in support of our research into the market segments we call MDM of Customer Data and MDM of Product Data. We hope to publish around September this year.
Please note – vendors – the deadline for submission of all material for this phase is June 3rd. This includes the surveys and reference information.
Briefings are being scheduled – now – for June and July. You have all the details on how to get these set up. Please don’t wait for the week of your preferred time slot to set it up – that will be too late.
Our colleague, Dan Sommer, will be contacting you in the next few weeks to validate our software revenue estimates for 2012. Please watch out for his email and get back to him as soon as you can. We will then follow up as we look to relate the MDM software spend to specific domains (where applicable), as well as for specific products (when there are several, offered by vendors, that end users recognize).
If you have any questions, please send me an email. You know where I am – I cannot hide :-)
Category: Magic Quadrant Master Data Management MDM MDM of Customer Data MDM of Product Data Tags:
by Andrew White | May 2, 2013 | 1 Comment
Print edition, Wall Street Journal, May 2nd 2013: Poor Prognosis for Privacy. Fascinating article that can be summed up with one quote:
“The reality is, our ability to exchange electronic information is already well beyond our ability to control it,” says John Leipold, CEO of Valley Hope Technology in Norton, Kan., which makes electronic record systems for behavioral-health providers.
The article explores this issue – there is a fast-growing body of information concerning our medical history and situation. Much of the value of this information increases when it is linked to other aspects of our medical history – both at the personal level (what are the specific interactions and connections) and at the regional and population level (how are drugs performing in certain groups; how are diseases addressed across the country). In fact one key point of this “linked data” opportunity is that the potential benefit is endless. Some users have thought through what can be done; some users are just “blue-skying” new questions.
This idea of getting more value from linking data is somewhat analogous to the hype around big data. Data growth continues apace; the relationships between data grow equally fast (even faster). This makes the opportunity space very, very big, and the ability to “spot” patterns becomes very powerful. Bingo – big data.
But the other side of the debate in the article concerns privacy. How can we protect our own medical data, when we want to preserve privacy? The challenge, with respect to the quote above, is that the technology – the ability to share, link, and analyze data – is physically further along than our ability to agree, set, and execute policy to limit aspects of that opportunity. There are rules, policies and bodies focused on setting and upholding them (think HIPAA), but technology and the innovation it brings always speed ahead of (government) policy. Isn’t it always that way?
So the title of this blog – from practice, to policy, to pandemonium – was playing on the idea that stuff happens: innovation is rife and we cannot hope to prevent it. Policy of course – public and private – will forever seek to influence where and how innovation works. And finally, the mess created is obvious.
A newer angle on this conversation of setting and enforcing policy on this growing mountain of data is being looked at by my colleague, Frank Buytendijk. He is looking at the topic of “ethics” in data. Is there a morality that can be understood in terms of data, and how it is used, or might be used? Is the data itself “responsible”, or is the technology, or the user? Is this a fair question to ask? What is the implication on policy and design if this is a good question and we find there is a moral element that we have been missing? Fascinating stuff.
Category: EHR Governance Healthcare Information Governance Tags:
by Andrew White | April 30, 2013 | Comments Off
Ten days ago I took aim at a vendor webinar that, I thought, poorly represented MDM by suggesting it did not include an element of information governance; as if MDM were a technology only. See Nice Webinar on MDM and Data Governance by EDWWS…but some issues. This has never been our view – so I had to call it out.
Today I received a newsletter from InfoTrellis. The note is filled with lots of good anecdotes and links to sites with stories about information and its uses and abuses – good for big data stuff too.
But the main article, called, “What You May Be Missing By Not Monitoring Your MDM Hub” grabbed my attention. I saw the following:
After the completion and successful testing of the MDM implementation project, companies sit back and enjoy the benefits of their MDM hub – and more often than not don’t even think about looking under the hood. It never occurs to them that they could be trying to gain insights into what’s happening inside that MDM hub by asking questions like:
- How is the data quality changing?
- What are the primary activities (in processing time) inside the MDM hub?
- How are service levels changing?
However, organizations change, people change, requirements change – impacting what is happening inside the MDM Hub. Such changes can open up significant opportunities for an organization – but without doing any sort of investigation that opportunity is typically not recognized.
As with my previous blog, this does not add up. What on earth was the “MDM hub” trying to do if it was not supporting an ongoing, intrinsic, development of insight as to the health of the information driving (or inhibiting) business outcomes? MDM requires a set of analytics focused on business outcomes, business process improvements, workflows, as well as data quality. If the metrics are not established, and part of “how we do things (like MDM) around here”, then this is not an MDM program at all – it’s a traditional data integration hub.
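As a minimal, hypothetical sketch of what monitoring those hub questions over time might look like (the metric names, values, and the trend rule are all my own invention for illustration):

```python
from statistics import mean

# Invented weekly snapshots of MDM hub health: a data quality score
# (0-100), average match/merge processing time (ms), and service level
# attainment (%). A real hub would emit these from its own logs.
snapshots = [
    {"week": 1, "quality": 94.0, "processing_ms": 120, "sla_pct": 99.5},
    {"week": 2, "quality": 92.5, "processing_ms": 135, "sla_pct": 99.1},
    {"week": 3, "quality": 90.8, "processing_ms": 160, "sla_pct": 98.4},
]

def quality_trend(history: list) -> float:
    """Average week-over-week change in the data quality score."""
    deltas = [b["quality"] - a["quality"] for a, b in zip(history, history[1:])]
    return mean(deltas)

trend = quality_trend(snapshots)
if trend < 0:
    print(f"data quality declining by {abs(trend):.2f} points/week - investigate")
```

The point is not the arithmetic – it is that without even this much instrumentation, the changes inside the hub described above go unnoticed until they surface as a business problem.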
I can forgive the vendor for positioning – we know they need to do that (I did it when I was a vendor) – but I just had to call out this error in positioning. It’s another observation that MDM is so pervasive (as a topic) that its understanding, its hype, is firmly entrenched within the “trough of disillusionment” – a predictable stage of the Gartner Hype Cycle. Users need to be very wary of vendors with different and disparate definitions and positions on what MDM is meant to be. Caveat emptor…of course!
Category: Gartner Hype Cycle InfoTrellis MDM Technology Hype Cycle Tags:
by Andrew White | April 30, 2013 | Comments Off
I read with interest an article in today’s print edition of the Wall Street Journal: Virginia Records Rule Upheld by High Court. The Supreme Court unanimously upheld a Virginia law Monday that limits out-of-state residents’ access to public records. It dismissed arguments that the public and the press hold broad rights of access to government information across state lines.
This is fascinating. What is all the hoopla about open data, then? Surely government agencies all over are opening up access, via the web (no state lines there), to data about what they do, in the interests of freedom of information and good government? And surely Article IV of the Constitution applies, which provides that the “citizens of each state shall be entitled to all privileges and immunities of citizens in the several states”?
Um, no, sorry – that only covers “fundamental” privileges, and access to information is not one of those. So how does this square with the principle of open data and open government? I assume there must be a body somewhere that determines what kind of data is worthy of being free and open, and what kind is not. Logically some data in the “open” would be, or could be, bad – perhaps that relating to some aspects of child welfare, criminal justice, and so on. So I am sure there are arguments for masking personal identity and so on. But it does seem interesting to watch such a Leviathan as the US federal and state governments wrestle with the same topic (what information should be free) over and over – each time coming up with a different response. No wonder organizations in the private sector struggle with the same issues.
Category: Information Policy Open Data Politics Tags: