It so happens that this week I received an invite to a webinar hosted by Informatica. The title of the webinar invite is the title of this blog. I loved this title, so much so that I had to blog about it!
First off, staleness is not actually the important point. The narrative for the webinar focused on the timeliness of data, which is certainly important. But is there not room for having “the right data, a little late” versus “the wrong data, sooner”? To be fair, there is a trade-off here. Much has been written about the timeliness of information, and about the impact on outcomes of speedy action (on bad data) versus a slower response (with more, and better, data). My key point is that the timeliness factor is not the big idea here. For me, the newer, more interesting angles that came to mind when I read this webinar title were:
a) It’s not so much about the information’s staleness; it is more about trust in it.
b) It’s not so much about what happens with this data, but what happens afterward.
I read another article this week (Data Quality is Not Fitness for Use) on information’s fitness for purpose, and how “data quality” as a definition might not be enough. I tend to agree with the article’s point of view. However, I have started to see a pattern, driven by the huge growth in interest in things like “big data”, dark data, social data (some would call that “big data”), and so on. The problem is well known, even if the dimensions along which we define the complexity and scale are getting bigger.
All too often we talk of “data quality”, and this helps, but I don’t think it is enough. I agree that some “fitness” factor might come in handy. But one relatively new twist on information is emerging: trust. Trust becomes important, perhaps more so than quality, in relation to the new, emerging demands coming from “big data” and the like. It goes like this:
You are a brand manager involved in the current product launch for a new line extension. The marketing is going gangbusters and you are paying third-party providers to bring you analysis of how the product is being received in the marketplace. Various social streams and pools are being mined, and all manner of “alerts” and feedback is being sent your way. Dutifully you have talked with IT, and they are throwing their best and brightest data quality efforts at the incoming data to cleanse it, validate it, and confirm that it is “good”. But you see the data and you simply don’t trust it. It would seem that feedback from the data streams does not align. So you decide to mark some streams as “untrustworthy” to see if you can identify which information flow is dubious, and whether the rest holds up. After some what-if analysis, you identify which information stream is in fact “untrustworthy”, even though its data passes all the data quality rules. Your response to the product launch now proceeds on a more trustworthy information base.
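To make the what-if analysis concrete, here is a toy sketch (entirely invented for illustration; the stream names and sentiment scores are not from any real system): hold each stream out in turn, and the hold-out that most improves agreement among the remaining streams is the suspect.

```python
from statistics import mean, pstdev

# Hypothetical daily sentiment scores (-1..1) from four paid feeds.
streams = {
    "feed_a": [0.62, 0.58, 0.65, 0.60],
    "feed_b": [0.55, 0.61, 0.59, 0.63],
    "feed_c": [-0.40, -0.35, -0.45, -0.38],  # the dubious one
    "feed_d": [0.60, 0.57, 0.64, 0.59],
}

def agreement(subset):
    """Lower spread across the streams' average scores = better agreement."""
    return pstdev([mean(scores) for scores in subset.values()])

def least_trustworthy(streams):
    """Mark each stream 'untrustworthy' in turn; the hold-out whose
    removal most improves agreement among the rest is the suspect."""
    return min(
        streams,
        key=lambda held_out: agreement(
            {name: s for name, s in streams.items() if name != held_out}
        ),
    )

print(least_trustworthy(streams))  # feed_c
```

Note that every stream here would pass simple per-stream quality rules (well-formed, in range, timely); only the cross-stream comparison flags the problem, which is exactly the gap between “quality” and “trust”.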
The problem is that most IT systems (and business processes, for that matter) do not allow for “trust” to be measured, used, or shared. It is a concept today, for the most part. I suspect that hot areas like data quality, Master Data Management, Information Governance, and the like, will increasingly model “trust” especially as it relates to data from outside the firewall. Some might argue that we need first to determine trust for the data we are supposed to control…
My mistake, not yours
My second point is akin to “once burned, twice shy”. If we slowly build up our trust model, we will start to build a knowledge base that rates sources, systems, even users, and how their data is used, both positively and negatively, in support of what we do. Think of the brand manager in the example above, who has now tagged an information source as “untrustworthy”. For many users this is not possible today, so mistakes happen, decisions are taken on dodgy data, and performance and business outcomes are not as expected. So what now?
What if we could tag that data or source as “bad” and, even though it remains in the overall system, other users would be alerted to its negative impact in the recent past? What if the next user could adjust their risk and exposure practices based on previous degrees of trust (tagged on information, systems, sources, and users)? If this were possible, the next business decision or action might be more “assured”, since risk management would be adjusted to take into account the experience and guidance of others.
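What might such a shared trust record look like? A minimal sketch, with invented names and an invented scoring rule (share of positive tags), purely to illustrate the idea of tags that persist for the next consumer:

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class TrustRegistry:
    """Maps each source to a history of (user, trusted?, note) tags."""
    tags: dict = field(default_factory=lambda: defaultdict(list))

    def tag(self, source, user, trusted, note=""):
        """Record one user's experience with a source."""
        self.tags[source].append((user, trusted, note))

    def trust_score(self, source):
        """Share of positive tags; None if the source has never been tagged."""
        history = self.tags[source]
        if not history:
            return None
        return sum(1 for _, trusted, _ in history if trusted) / len(history)

registry = TrustRegistry()
registry.tag("feed_c", "brand_mgr", False, "disagreed with every other feed")
registry.tag("feed_c", "analyst_2", False, "late and inconsistent")
registry.tag("feed_a", "analyst_2", True)

print(registry.trust_score("feed_c"))  # 0.0
print(registry.trust_score("feed_a"))  # 1.0
```

The key design point is that the tags outlive the decision that produced them: the next user who queries `feed_c` sees its history before relying on it, rather than rediscovering the problem from scratch.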
This would, over the long term, lower the number of erroneous actions and increase the overall reputation of the information, and even of those who tag the information in the first place. Interesting idea, don’t you think?