I took an inquiry yesterday that is representative of a sizable share of the inquiries I take in a normal year. It goes like this:
“We have many data sources (business applications) and also several data warehouses, that each tries to aggregate data from the sources to support our BI strategy. We need help to govern the data in these warehouses since the data is always wrong, incomplete or erroneous. Please help.”
Firstly, I don’t actually cover BI. We have a whole team of analysts that cover “BI” and analytics. But this call came to me – since the magical phrase “information governance” was in the question. This magical phrase is both Pandora’s box and the keys to the kingdom. When I say information governance, I don’t refer to the policy determining physical storage strategy (that’s the poorly named Information Lifecycle Management topic); I refer to the business value and use of information. We call that Enterprise Information Management (which encompasses ILM, by the way).
So the idea I wanted to explore in this blog is this: I don’t think that there is real, business-led information governance of the information in a data warehouse supporting BI. I don’t think it exists. My idea is born from personal experience, and then reinforced through years of work as a vendor and then as an analyst covering business applications and then information management.
Business users, for the most part, have little interest in what a data warehouse is. It is so far removed from what they do, and care about, day to day. You might be lucky – a supply chain planner probably does care – since they are often good IT users. But in general, most business users believe that a DW is what IT uses to build the foundation that drives the reporting systems. As such, business users tend to assume:
- The data warehouse and the data in it is IT’s realm
- Any issues with the data in the warehouse are for IT to solve
- If IT comes knocking on my door to “fix” stuff, I’ll help them when I can but it’s a low priority compared to my normal job
And so the extent of information governance in a data warehouse tends to focus on an exception, kicked out by an ETL or script, and IT chasing after business users who are “too busy to call IT back” to help solve the problem. In a nutshell, the work of information governance and stewardship is rarely, if ever, “operationalized” in the business process; it remains an IT effort. It does not become “how we do things around here”.
This is where my experience with Master Data Management comes in. MDM really only came about because no amount of additional investment in ERP, BI, data warehousing or even data quality tools solved the real cause of the problem. And with MDM the focus changes. Instead of being:
- Not related to what business does every day
- Not directly tied to a business outcome
- Not being work for which business users are measured
MDM is explicitly focused on:
- Specifically when, where and how business users create, use and consume business-relevant information
- Tied specifically to a business outcome
- Embedded in day-to-day work, with metrics to back it up
This work of information governance, as originally sought by IT, can actually be made real. Information governance thus cannot be made real in the data warehouse environment, but it can be in the operational business application environment.
So I am back to my idea: I don’t think that there is real, business-led information governance of the information in a data warehouse supporting BI.
I think I overstepped the mark though. I think information governance CAN be sustained in/on a data warehouse, by business people, but only in extreme and exceptional circumstances. For example, a corporate mandate that this will be done in order to meet regulatory requirements. But this was not the form of IG I was thinking of when I formulated the idea. So my idea does need to be tweaked. Maybe I will settle on: For the most part, and absent corporate mandates, the likelihood that business users will willingly adjust their behavior to adopt operational information governance responsibility for data in a warehouse, to support BI and analytics, is woefully low. By contrast, if and when that same request is made of the operational data, which is in the hands of the business user every hour of every day, there is a far greater chance that IG can be established.
What do you think? Agree? Disagree? Is there any better way of putting this idea together?