Craig Roth

A member of the Gartner Blog Network

Managing Vice President: Communication, Collaboration, and Content
4 years at Gartner
25 years IT industry

Craig Roth is a vice president and service director for Gartner Research, in Burton Group's Collaboration and Content Strategies service. Mr. Roth covers a wide range of knowledge and Web-related topics at the intersection of collaboration, content…

Big Content

by Craig Roth  |  October 17, 2012  |  1 Comment

The interest in Big Data at our recent Catalyst conference shows that enterprises have recognized the need for a new approach to exploiting massive and rapidly changing data streams.  When will that same interest coalesce for Big Content? 

Big Content is a term that highlights the less-structured subset of Big Data.  Big Content isn’t new or different from Big Data; rather, it focuses attention on uses of Big Data for unstructured information, for the kind of folks who think the Library of Congress is filled with “content,” not “data.”

After all, Big Data has much to offer to folks who are turned off by the word “data,” and they may pay more attention to its potential value if a subset of its techniques is thought of as Big Content.  Just as Big Data uses Apache Hadoop (with MapReduce) to go beyond traditional BI, Big Content combines technologies to go beyond traditional search.  These technologies include text analytics, sentiment analysis, video analysis, semantic web technologies, and attention management.

The Big Data story is now well known.  Whether you’re analyzing real-time point-of-sale information from grocery stores, monitoring traffic sensors on every corner downtown, or tracking temperature and flow speed at myriad points in the ocean, numerical data is flowing in faster than online transaction processing (OLTP) systems can handle it.  This is where Big Data comes in, and it has re-energized the kinds of people who have long used BI and OLAP.

But what about the audience that cares about less structured information?  Unstructured content such as social media postings, audio, and video is growing at a fast clip.  And in practice, structured versus unstructured is not a binary choice.  Numbers in a database are clearly structured and a freeform Word document is unstructured, but there are many shades of gray in between, such as web logs, XML-based content (with varying levels of specificity in its schema), text (or documents) in database fields, and structured Word documents.  Blends of structured data and unstructured content can yield interesting hybrid analytics use cases.

Out of this mass of content, enterprises increasingly want answers to questions such as:

  • What are my customers saying about my product in social media?  Are the reactions generally favorable or not?
  • How often have epidemiological studies shown a certain protein to be an inhibitor?
  • How many articles about the deficit mention healthcare entitlements?
  • How can I notice important trends in my field of expertise that are beyond a Google search?

Full-text search is not the answer.  There may be too much noise in the search results to make them useful.  The results may be wanted as semantic linkages or sentiment ratings rather than as a list of links.  And the text to be searched may not be accessible to a public search engine like Google, nor sit entirely within the firewall where an enterprise search engine can reach it.
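
To make the contrast concrete, here is a toy Python sketch of the first question in the list above. It is an illustration only, not a description of any product: the sample posts, the two word lists, and the function names are all made up for this example, and real sentiment analysis is far more sophisticated than counting words. The point is the shape of the result: a list of hits versus an aggregate rating.

```python
import re

# Hypothetical sample posts; in practice these would come from a social media feed.
POSTS = [
    "Love the new phone, the battery life is great",
    "The phone screen cracked in a week, terrible build",
    "Great camera on this phone but the price is awful",
    "Just had lunch, nothing to report",
]

# Tiny illustrative word lists (an assumption for this sketch, not a real lexicon).
POSITIVE = {"love", "great", "good", "excellent"}
NEGATIVE = {"terrible", "awful", "bad", "cracked"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def full_text_search(posts, term):
    """Traditional search: return every post that contains the term."""
    return [p for p in posts if term in tokenize(p)]


def sentiment_score(text):
    """Count positive words minus negative words in one post."""
    words = tokenize(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sentiment_summary(posts, term):
    """Aggregate answer: how many matching posts lean each way."""
    summary = {"favorable": 0, "unfavorable": 0, "neutral": 0}
    for post in full_text_search(posts, term):
        score = sentiment_score(post)
        if score > 0:
            summary["favorable"] += 1
        elif score < 0:
            summary["unfavorable"] += 1
        else:
            summary["neutral"] += 1
    return summary


hits = full_text_search(POSTS, "phone")       # a list someone still has to read
summary = sentiment_summary(POSTS, "phone")   # a rating someone can act on
print(len(hits), "matching posts")
print(summary)
```

Search hands back the matching posts and leaves the reading to you; the summary answers "are the reactions generally favorable or not?" directly, which is the kind of output the questions above are asking for.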

For industries that care more about what people are saying than about what meters are measuring, Big Content will become a big deal.

Category: Content creation

1 response so far ↓

  • 1 Ian Howells   October 22, 2012 at 2:44 pm

    Great post Craig.

    We spent the last 20 years in the content management industry at companies like Documentum and Alfresco and founded SambaCloud on many of the beliefs in your article around “Big Content”.

    We believe the explosion of content is driving a shift in the way we consume and collaborate on content. To get their jobs done, business users need to intelligently connect information from more and more sources, both internal and external, static and dynamic. This is driving a fundamental shift where content finds you, as opposed to you finding content, as you discuss above with linkages.

    We have created a LinkedIn Group to discuss related topics on Big Content at http://linkd.in/QOiI6H

    We also have a blog dedicated to this subject at
    http://www.sambacloud.com/blog

    We agree Big Content will become a big deal.

    Ian Howells