At last week’s Data & Analytics Summit, I had 37 discussions with a range of end-user and vendor attendees. As the chart below shows, the overwhelming topic for end users was the data lake. Across 16 data lake discussions, there was little overlap in how different attendees understood the concept. Many attendees simply expressed confusion: what is this data lake thing, and how does it work? Who uses it? What happens to the data?
One consistent theme across several discussions was that the motivation for a data lake is to provide access to more data, faster, but for uncertain uses. The risk is that business users are simply telling their IT counterparts “we need more data” without a plan for what they’ll do with it. Worse, they may not understand the newly available data, its relative quality or how it should be used. My colleague Valerie Logan has done some excellent work on this front in her data literacy special report.
A second block of conversations, centered on data strategy, varied widely. Master data, data quality and cloud migration were top of mind for some attendees, while others wanted to map out their investments for the next three years.
Questions around streaming were much more foundational, focused on maturity and on evolving business processes to take advantage of real-time insights and new ways of working. It is clear that when it comes to streaming and real time, companies focus only on technology acquisition rather than on what the downstream impact will be. Previous surveys indicated that only 15% of big data projects make it to production, and those use cases were largely for data at rest. I believe the share of streaming projects getting to production will be far lower, owing to a lack of organizational maturity.
Next week we’re all off to London, where conversations will likely center on blockchain and GDPR, along with more data lake discussions.