It’s still early days for Apache Spark, but you’d be forgiven for thinking otherwise based on the corporate sponsorship at Spark Summit. For only the second conference devoted to such a young technology, the list of notable sponsors is impressive: IBM, SAP, Amazon Web Services, SanDisk and Red Hat. SAP also announced Spark integration with HANA, its flagship DBMS appliance. Other companies, like MapR and DataStax, also announced (or reinforced) partnerships with Databricks, the Spark commercializer.
Given the relative immaturity of this open source project, why are these companies – particularly the large vendors – rushing to support Spark? I think there are a few things happening here.
First, after building out integration with MapReduce, integrating with Spark was easy. SAP’s integration with Spark uses Smart Data Access, the same method used for MapReduce integration. I imagine it’s only a matter of time before similar integration occurs with Teradata’s QueryGrid or IBM’s BigSQL, among others. After all, this looks a lot like external tables, something the DBMS vendors have been doing for at least a decade.
The ease of integration explains only part of the sudden interest in Spark. More important is the need to not be left out of the next iteration in data processing. While Hadoop is an important component of any data management discussion today, it had a long road to credibility. Many vendors simply took a “wait and see” approach to Hadoop, and they waited too long. I don’t think they will make the same mistake with Spark. Customers are less resistant to open source options, and large vendors need to get behind every project with momentum to compete with startups.
It’s too early to pick winners and losers. The incumbent vendors are upping their game, while much of the messaging coming from the Hadoop distribution vendors is confusing. However this shakes out, it should make a great show for the rest of 2014.