Gartner Blog Network

SearchDataManagement Asks: “Is it too soon for unstructured data governance?” I say, “It’s long overdue”.

by Andrew White  |  September 25, 2014  |  3 Comments

I had no choice – as soon as I read the headline I had to click through to the article.  I had received an email alert from SearchDataManagement for an article called, “Is it too soon for unstructured data governance?”  The focus was big data, since much of it tends to be unstructured.  What is meant by “unstructured”, though, is actually that there are many different data sets, from different sources, and each may have no actual structure at all (unlikely) or its own structure (taxonomy, conformed dimensions, schema, metadata data model, semantic model etc).  And so the resulting “lack of governance” actually means “no single lens or compliance information model with which all sources have to comply for entry” (perhaps into the data lake).  In other words, no information governance, no barrier to entry.

I actually commented on this problem recently in relation to the growing popularity of data lakes.  See Making Sense of the Information in your Data Lake – adding structure.

I feel the question is a bit of a ruse.  How can you hope to repeat or build on knowledge and insight gleaned from any analysis of your data if you don’t preserve some form of structure?  The answer should be self-evident.  What I think is different these days is the degree and form of information governance that should be overlaid on our data lakes and stores.  The classic Enterprise Data Warehouse (EDW) had too much; every piece of data had to comply 100% to gain entry into the warehouse.  At the other extreme, a data lake itself has no barrier to entry – no conformed dimension requirements.  What we need these days is something in between those two extremes, and I have recently heard that same point from end-users who have worked with data lakes.  So no, it’s not too soon for unstructured data governance.   It is long, long overdue.
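
To make the "something in between" idea concrete, here is a minimal Python sketch of three entry policies. Everything in it is an assumption made for illustration: the policy names, the `Dataset` type, the `admit` function, the list of conformed dimensions and the minimum metadata fields are all hypothetical and do not come from any particular product or from the SearchDataManagement article.

```python
from dataclasses import dataclass, field

# Hypothetical governance settings, chosen only for illustration.
CONFORMED_DIMENSIONS = {"customer", "product", "date"}
MINIMUM_METADATA = {"source", "owner"}


@dataclass
class Dataset:
    name: str
    metadata: dict = field(default_factory=dict)   # descriptive context
    dimensions: set = field(default_factory=set)   # conformed dimensions already mapped


def admit(dataset, policy):
    """Return (admitted, blocking_gaps, advisory_gaps) for a hypothetical policy."""
    missing_meta = sorted(MINIMUM_METADATA - set(dataset.metadata))
    missing_dims = sorted(CONFORMED_DIMENSIONS - dataset.dimensions)

    if policy == "warehouse":
        # Classic EDW extreme: every gap blocks entry.
        blocking = missing_meta + missing_dims
        return (not blocking, blocking, [])
    if policy == "lake":
        # Data lake extreme: no barrier to entry, nothing is even recorded.
        return (True, [], [])
    if policy == "middle":
        # In between: context is required, unmapped dimensions are only noted.
        return (not missing_meta, missing_meta, missing_dims)
    raise ValueError(f"unknown policy: {policy}")


clickstream = Dataset(
    name="clickstream",
    metadata={"source": "web logs", "owner": "marketing"},
    dimensions={"date"},
)

for policy in ("warehouse", "lake", "middle"):
    print(policy, admit(clickstream, policy))
```

Under these assumptions the same data set is rejected by the warehouse policy, waved through by the lake policy with no record of its gaps, and admitted by the middle policy with its unmapped dimensions noted for later work.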

Category: dark-data  data-and-analytics-strategies  data-lake  information-governance  information-innovation  unstructured-data  

Andrew White
Research VP
8 years at Gartner
22 years IT industry

Andrew White is a Distinguished Analyst and VP. His roles include Chief of Research and Content Lead for Data and Analytics. His main research focus is data and analytics strategy, platforms, and governance.

Thoughts on SearchDataManagement Asks: “Is it too soon for unstructured data governance?” I say, “It’s long overdue”.

  1. John Evans says:

    Agreed. Part of, if not all of, the governance requirement here is to add context to the data in your data lake (or whatever you want to call it!). Without that context, people will continue to struggle to analyze and take action against an increasing flow of incoming data. We blogged about this recently too and you can read our perspective at

Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.