I was reading ‘The Signal and the Noise’ (Nate Silver, 2012) over the holidays and realized I hadn’t really grasped the importance of a note I recently published called ‘Big data governance – from truth to trust’. This note should be re-titled ‘Big data value – from truth to trust’. Nate Silver explores Big Data several times in his excellent book, and in several places he calls out the somewhat obvious point that with more data there will be more variability in that bigger pool of data. Specifically, with significant growth in data, new theories (and assumptions) of causation (versus correlation) will emerge. This growth will occur, perhaps, at such a prodigious rate that our testing won’t be able to keep up, and so our ability to improve our understanding (i.e. make better predictions) will fall. That is a great caution to keep in mind when we consider the high level of hype associated with big data.
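To put rough numbers on how quickly candidate relationships can outrun our ability to test them, here is a minimal sketch of my own (it is not taken from Silver's book). It assumes, purely for illustration, that none of the variables is truly related to any other and that every pair is tested for correlation at a 5% significance level; the function name `expected_false_positives` and its parameters are hypothetical.

```python
def expected_false_positives(num_variables: int, alpha: float = 0.05) -> float:
    """Expected count of spurious 'significant' pairwise correlations.

    Illustrative assumptions: no variable is truly related to any other,
    and every distinct pair is tested at significance level alpha.
    """
    # Number of distinct pairs among the variables: n * (n - 1) / 2
    num_pairs = num_variables * (num_variables - 1) // 2
    # Each test of a true null flags a false positive with probability alpha,
    # so the expected number of false positives is alpha times the test count.
    return alpha * num_pairs


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, expected_false_positives(n))
```

Under those assumptions, 10 variables yield about 2 spurious correlations, 100 yield about 250, and 1,000 yield roughly 25,000: the data grows tenfold at each step, while the pool of misleading "findings" grows about a hundredfold.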
In other words, with Big Data comes Big False Positives. Thus, as Silver states, “…[T]he number of meaningful relationships in the data – those that speak to causality rather than correlation and testify to how the world really works – is orders of magnitude smaller.” This is the exact cause (no pun intended) of the shift in emphasis our research note (and blog, From MDM to Big Data – from Truth to Trust) calls out, from a focus on absolute truths toward an understanding, and thus exploitation, of the degrees of trust in that meaning, or correlation. If truth is no longer possible, trust is at least plausible. Go Big Trust!