Gartner Blog Network

Data Not Included – The Era of the Data Collector

by John Rizzuto  |  March 27, 2013  |  1 Comment

It’s all because of connectivity, don’t ya know? The “Internet of Things” is a simple concept – anything can be connected to the Internet. Anything. An embedded electronic gizmo, smaller than a fingernail, and boom, there it is – in your browser or app – that “thing”, transmitting all sorts of information. Trivial things, such as a reminder to water flowers; and critical things, such as a jet engine signaling it’s statistically likely to fail on its next flight. All this information, all this new type of data and the sophisticated analysis to make sense of it, is real. But it will take many years to realize the revolutionary impact big data and big analytics will have on us. Importantly, however, we have passed the point of no return – the big data and big analytics craze, in all its hype, evangelical praise, and emphatic disdain, is secular and irreversible. Welcome to the “Era of Transformation”.

Up to now, technology was primarily about efficiency: driving costs out of the system through automation, increased speed and the replacement of physical channels with digital ones. The Era of Transformation is something else. It’s about effectiveness. The Era of Transformation is likely to go on for a decade or more. It will transform our organizations – and lifestyles – in ways that cannot be imagined. How is it, and why is it, that technology makes us more effective only now? Today, software and systems have the ability to take millions upon millions of seemingly unrelated data points (and, perhaps more importantly, the ability to gather them), run a myriad of algorithms against them and discover relationships – cause and effect – that answer the questions of what happened and why and, ultimately, what will happen next and what to do about it. It is an intractable problem, if not an impossible one, for the human mind.

The prevailing distributed or client-server model was about delivering applications to users; the cloud model is a bit more about bringing data and applications together. Enterprise applications, primarily creators of data, will be accompanied by a tsunami of new enterprise applications that consume data. Inevitably, this will break the current methods of distributing and leveraging information. In the current enterprise application model, the RDBMSs and the teams that support them are the “center” of the data universe. Analyzing data? Contact the database admin, work with her to create an interface, and she will provide a copy of the data you need – and off you go. Each connection is point to point, one data source to one data consumer, and it is either hand-coded or engineered with third-party ETL tools. This is data integration.

When reflecting on these data silos, my colleague Ted Friedman expressed it well during his keynote at Gartner’s 2013 BI Summit: “First, we have to stop thinking about data as a byproduct.” And this is the real change that big data brings to the way we design, deploy and use applications, how we treat the data they create and how they get the data they require. There are countless analytical applications emerging to capitalize on data – applications for everything from digital marketing to studying diseases, among a slew of others. All these applications have one thing in common – data not included. The line of business will covet these applications; they will need to move fast and painlessly. Application users will demand a simple way to get the data sets they need, analyze them, preserve their findings, get new data sets, analyze those, and so on. However, much of the data these analytical applications will need comes from outside the organization. For example, in the U.S. alone, there are nearly 100 federal agencies with statistical programs, each publishing data that is accessible via the Internet. I look at this problem and think déjà vu. Years ago, point-to-point connections from application to application, and their inherent brittleness, made the application integration model break. Point-to-point data integration faces the same strain as data sources and data consumers multiply.
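To put rough numbers on why point-to-point connections become brittle, here is a small Python sketch. It assumes the worst case, in which every data consumer needs its own interface to every data source, and compares that with routing everything through one shared intermediary. The function names and the source and consumer counts are illustrative assumptions, not figures from Gartner research.

```python
def point_to_point_links(sources: int, consumers: int) -> int:
    """Interfaces to build and maintain when every consumer
    connects directly to every source (assumed worst case)."""
    return sources * consumers


def hub_links(sources: int, consumers: int) -> int:
    """Interfaces needed when each source and each consumer
    connects exactly once to a shared intermediary."""
    return sources + consumers


# Illustrative scenarios only: (number of sources, number of consumers).
scenarios = [(5, 5), (20, 10), (100, 25)]

for sources, consumers in scenarios:
    direct = point_to_point_links(sources, consumers)
    shared = hub_links(sources, consumers)
    print(f"{sources} sources, {consumers} consumers: "
          f"{direct} point-to-point interfaces vs {shared} via a hub")
```

Under these assumptions, 100 sources and 25 consumers mean 2,500 hand-built interfaces in the point-to-point model versus 125 through a shared intermediary. Every one of those interfaces is something that can break when a source changes its format.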

These days, the title “data analyst” is an oft-used term that rivals “big data”. The data analyst is the glory gal. She takes the reams of data at her disposal, uses her BI and analytics tools, and comes up with answers – and questions – that she would otherwise never have been able to find or know to ask. She’s the resident hero. I submit that another role, the data aggregator, will rise in importance. The data aggregator does the strenuous lifting to prepare the data so it’s ready for the data analyst. The data aggregator will gather data from a set of these practically infinite data sources, collect them, format them, assure their quality, and then make each one seamlessly available to many data consumers. The process has to be repeatable, scalable, and done rapidly and often. It likely needs to be self-service for the business user. As a data steward, the aggregator will be required to provide internal, transactional, long-lived, short-term and real-time data. The tool he will need to realize this vision does not exist, but it will. And when it does, it will transform the ETL market such that it will be unrecognizable.
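The aggregator's workflow described above (gather, format, assure quality, then serve many consumers) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the `DataAggregator` class, its method names, the `agency_stats` source and the sample records are all hypothetical, and the quality check is reduced to "required fields are present and non-empty". It is not a description of any existing product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


@dataclass
class DataAggregator:
    """Hypothetical hub: gather once, clean once, serve many consumers."""
    required_fields: List[str]
    sources: Dict[str, Callable[[], Iterable[Record]]] = field(default_factory=dict)
    catalog: Dict[str, List[Record]] = field(default_factory=dict)

    def register_source(self, name: str, fetch: Callable[[], Iterable[Record]]) -> None:
        """Add a data source; fetch is any callable that yields records."""
        self.sources[name] = fetch

    def _format(self, record: Record) -> Record:
        # Normalize field names so every source looks the same downstream.
        return {str(key).strip().lower(): value for key, value in record.items()}

    def _passes_quality(self, record: Record) -> bool:
        # Simplified quality rule: required fields must be present and non-empty.
        return all(record.get(name) not in (None, "") for name in self.required_fields)

    def refresh(self) -> None:
        """Re-gather every registered source; safe to run repeatedly."""
        for name, fetch in self.sources.items():
            formatted = [self._format(record) for record in fetch()]
            self.catalog[name] = [r for r in formatted if self._passes_quality(r)]

    def get(self, name: str) -> List[Record]:
        """Self-service read; returns copies so consumers cannot alter shared data."""
        return [dict(record) for record in self.catalog[name]]


def sample_agency_feed() -> List[Record]:
    # Made-up records standing in for an external statistical feed.
    return [
        {" Region ": "Northeast", "Value": 12.5},
        {"region": "", "value": 3.0},  # dropped by the quality check: empty region
        {"REGION": "South", "VALUE": 8.1},
    ]


aggregator = DataAggregator(required_fields=["region", "value"])
aggregator.register_source("agency_stats", sample_agency_feed)
aggregator.refresh()

# Two different consumers read the same prepared data set.
marketing_view = aggregator.get("agency_stats")
research_view = aggregator.get("agency_stats")
print(len(marketing_view), "clean records available to each consumer")
```

The design choice worth noting is that the source is fetched and cleaned once per refresh, while any number of consumers can read the result. That is the opposite of the one-source-to-one-consumer interface, where each new consumer means new integration work.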

I recommend that our Gartner Invest clients read the following documents: Top 10 Technology Trends Impacting Information Infrastructure, 2013; Hadoop Is Not a Data Integration Solution; Data Integration Enables Information Capabilities for the 21st Century; Emerging Role of the Data Scientist and the Art of Data Science; and The Future of Data Management for Analytics Is the Logical Data Warehouse. These are only a few titles from our library on data and analytics. Be sure to get on the inquiry calendar of any member of our Information Management Team, including Gartner Invest regulars: Merv Adrian (big data, DBMS), Mark Beyer (big data, data warehousing), Roxane Edjlali (DBMS, data management), Donald Feinberg (DBMS, data warehousing) and Ted Friedman (data integration and data quality).

You will need a Gartner login to access the documents mentioned.

Tags: analytics  big-data  data-integration  data-quality  data-scientist  data-warehousing  

John Rizzuto
Research VP
6 years at Gartner
10 years IT Industry

John Rizzuto enables investors and business strategists to take a holistic view of the software industry and its participants by leveraging the qualitative insights of the Gartner platform and linking them to quantitative measures of business performance. In previous roles, he evaluated software companies' strategy, market position, and financial and business models as a financial analyst.

Thoughts on Data Not Included – The Era of the Data Collector

  1. Rob Karel says:

    John – Spot on, agree that the focus of our technology landscape is shifting and am very curious to see if and when the data stewardship role does get recognized by senior management as a strategic investment, not a tactical resource drain.

    Your data governance angle inspired me to start a follow-up discussion to dive deeper into that aspect (link below). Would love your feedback on what you’re hearing from your clients.

    Best, Rob


Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.