My colleagues at Gartner are always keen to remind me that structured data represents only about 5 to 10 per cent of an organization’s information; the rest is made up of “unstructured” data. But a quick call with my colleague, Mark Beyer, will show that there is no such thing as unstructured data. For any such data to be meaningful, it has to be given structure. For that meaning to generate value, much of it has to be made machine-readable, even if that machine is a desktop in a single office. As such, most “unstructured” data ends up as “somewhat structured” so that it can be read and interpreted by a computer or computer program. So what is all the fuss about?
The May 2011 issue of Health Data Management magazine ran an article that rang true for me, far beyond healthcare. The article was titled, “To Scan or Not to Scan – Do meaningful use requirements for ‘structured data’ spell the end of document management systems?” There is a “yes” and a “no” answer to this. Both are right, but over different time scales.
Short term, perhaps over the next 2 to 10 years, all the data that is created today will have to be described (think: metadata) in a way that makes its discovery, use and re-use workable. This is the essence of the article. It suggests that, in the case of healthcare, to qualify for federal funds (meaningful use), healthcare organizations need to show how they are supporting the adoption of Electronic Health Records. As part of this, data objects (unstructured data) such as scans need to be made meaningful so that those same computer systems can use that data, instead of a separate “document management” system. Thinking about what Mr. Beyer says, this makes a lot of sense. Over time there must be increasing pressure to make “dumb data” meaningful by giving it structure. In some cases the pursuit of profit will drive this continuous evolution and in others, like healthcare, it will be regulation. As such there will be ever more structured data and much less unstructured data to worry about. For this reason MDM, which has come to represent one of the most strategic efforts in information governance for reuse, will become ever more important. That newly described data needs to be governed for re-use, and this is good news for MDM. Of course, this is even better news for metadata management. For me though, metadata management has failed to attract the same aura that MDM has, and this is a shame. Metadata is so important, but its whole value is derived from flexibility, and this has inhibited the kind of focus that is common with MDM. Either way, metadata management and MDM are set to grow in popularity, period.
On the “no” side of this argument is another kind of innovation. For every terabyte of “unstructured-but-soon-to-be-structured” data we create, yet more stuff will be created in a format or form that is not yet even known beyond the handful of people that created it. This means that ever more types of data will emerge. Think of the evolution of music, its electronic formats, and the battle over rights that dictate methods of access. So even though I do think that document management systems will, over the short to medium term, become less important, they won’t go away. They will adapt to other forms of data that will emerge. And of course, there is a long time gap between the “no” position of this argument and the “yes” position. So don’t dump your content management systems yet. Just make them more intelligent (like MDM) :)
Bottom line – structure is required to make information meaningful. The more widely a piece of information can be used, the greater the need for governance. This does NOT mean “more” governance – it might be one person. MDM, the more recent attempt at starting an information governance program, will only get more important – to more organizations. Check out Master Content Management in our upcoming Hype Cycle for Master Data Management, 2011.