Gartner Blog Network


Who Asked for an Open Data Platform?

by Nick Heudecker  |  February 18, 2015  |  4 Comments

This is a joint blog post between Nick Heudecker and Merv Adrian.

It’s Strata week here in San Jose, and with that comes a flood of new announcements on products, partners and funding. Today’s big announcement came in the form of the Open Data Platform (ODP). A number of companies have signed on, but in short, it’s got some Hadoopers, some service providers and systems integrators, as well as some analytics apps vendors.

There are a variety of arguments both for and against the ODP. Statler and Waldorf tried to encapsulate them here. They are surprisingly knowledgeable.

[Image: Statler and Waldorf. Source: Wikipedia]

Statler: The ODP’s primary objective is to create a standard set of components for Hadoop, thus ensuring enterprise customers aren’t locked into a specific vendor. Applications are portable, and end user organizations know what they’re getting if they pick the ODP standard. This is for the enterprise end users of Hadoop.

Waldorf: “Standard”? For Hadoop? Excuse me, don’t we have one already? It’s called Apache Hadoop. Distributors add their own pieces, but there is a core that everybody agrees to. And APIs that tie it all together. But nobody has to pay to get in. You qualify with useful code, voted on by your peers. This is clearly for vendors, by vendors. These guys have Platinum members, Gold members – what’s that money for?

Statler: Don’t kid yourself – Apache has Platinum sponsors too. And the timing of this announcement isn’t coincidental. Being a charitable organization doesn’t mean being altruistic – it’s a tax status. That said, it’s worth talking about how ODP relates to the Apache Software Foundation (ASF). As I see it, ODP doesn’t compete with the ASF. The ASF provides a governance model around open source software development, while ODP hopes to provide a vendor-led packaging model. Currently, the Hadoop vendors are fighting proxy battles in the ASF using committers. That destroys the spirit and undermines the purpose of the ASF. There have been allusions to a “fauxpen” process dominated by a few players packing the committees. Shifting these discussions to the ODP is the right move.

Waldorf: And yet one of those dominant players is a charter member here. And “minority players” seem to have been able to get things like Drill and Phoenix in – because their code was good enough to get voted through the Incubator.

This is not about discussion – it’s about innovation. Apache’s job is to drive more innovation – let a thousand flowers bloom. If someone wants a better/richer engine than MapReduce, somebody creates Tez. If HBase isn’t secure enough, somebody else creates Accumulo. This new group pushes a least common denominator; it’s a frozen snapshot for its members to support until they deem a new one is ready.

Statler: And you think that’s a bad thing? Every vendor’s Hadoop distribution is constantly changing. One month Spark is wholly unsupported, while the next month some components are supported, others are beta, while still others are shipped but unsupported. Repeat this for every component in the ever-expanding Hadoop stack. ODP offers stability end users can invest in. That stability offers a catalyst for mainstream Hadoop adoption.

Waldorf: Sorry, I’m still not buying that. Pivotal used to, but now they don’t have to invest in the pieces that aren’t “common” anymore – not that they were doing so. Teradata doesn’t have a distribution and didn’t contribute that much either. IBM has a distro, and they contributed some, but mostly their “special sauce.” Hortonworks picks up the lead and offers support for at least one of the members – suddenly their role is very different. Maybe that is what the fees are for…

Statler: Maybe. To get back to the question posed in the title – who asked for ODP? Viewed alongside Pivotal’s other announcements around open sourcing HAWQ, Greenplum, and other pieces – and partnering with Hortonworks – yes, it looks like ODP positions Hortonworks as the Hadoop arms dealer for the other players. Basing an open data platform on a single vendor’s packaging casts some doubt on “open.”

Waldorf: Exactly. And it’s not just who wants it – who needs it? Aren’t the vendors already free to add their own pieces now? In fact they have to, to differentiate themselves. So are they saying the previous compatibility wasn’t compatible enough? Or are they creating a club they get to be the leading members in? Maybe this is Pivotal’s way to reduce its investment in a failing effort to build a proprietary way to capture a slice of this trend. Declare victory and retreat. And a way for Hortonworks to get included in many of their sales, and pick up some revenue for themselves.

Statler: In the long run, ODP’s effectiveness at defining a certified core set of Hadoop components is an open question. But the long run doesn’t mean much in Silicon Valley.

Waldorf: Or, you might say, in Redmond. It reminds me of ODBC. Microsoft’s term used to be “embrace and extend.” We’ll use your innovations. You keep on doing that. We’ll make some special pieces around the edge and monetize them atop that base. And you guys spend your time and effort on what the rest of us share. I notice they’re not in this announcement.

This simply institutionalizes a dichotomy in favor of a few favored players. Who wants it? As Cloudera suggests, the paying members, and it’s not clear who else. It’s ironic that Hortonworks is one of the founders of an organization that wants to add an anchor slowing innovation in the open source free-for-all it has been the flag-bearer for.

Gartner note: our recent webinar asked how many attendees considered vendor lock-in a barrier to investment in Hadoop. It came in dead last, with around 1% selecting it. More on that in a future post.

Nick Heudecker
Research Director
4 years at Gartner
18 years IT Industry

Nick Heudecker is an Analyst in Gartner's Research and Advisory Data Management group.


Thoughts on Who Asked for an Open Data Platform?


  1. Hari Sekhon says:

    Proprietary Hadoop vendors can’t compete with Hortonworks in the long run – it’s an unwinnable war if Hortonworks stays the course – simply because of 100% Apache open source standardization. HDP is already the standard open source Hadoop platform that customers have been asking for and adopting en masse, ODP is just a way for big vendors to save face and not admit they couldn’t compete with Hortonworks. I’ve actually advised a couple of these vendors to stop wasting everyone’s time and just partner with Hortonworks several months ago – I’m actually a bit surprised they listened already – either that or it’s a massive coincidence!

    Being on the customer side of the fence these days the sentiment seems to be that this announcement doesn’t actually change much for them, just run Hortonworks or Cloudera and get on with your life.

    Best Regards,

    Hari Sekhon
    Hadoop / Big Data Architect & Consultant (ex-Cloudera)
    http://www.linkedin.com/in/harisekhon

    ps. Btw there are a couple typos in this article guys (Hortonwrks => Hortonworks, Gereenplum => Greenplum)

  2. David White says:

    So there’s the rub: large corporations who don’t want to maintain their own, sizeable, Open Source support staff, with the deep technical skills needed to self-support their production services, will want a stable, supported stack that they can rely on for both:
    * growth as they build out new projects, and
    * support so that issues get fixed in a timely manner, preferably to some kind of contracted service levels.

    The “true” open source approach means you’re having to run constantly to keep up either with constant change and maintenance against an underlying chase for the next best thing, or to keep the interests of the best staff.

    That means there’s a huge gap, and monetise-able opportunity, for distribution companies with a strong Committer base and support capability to offer SLA-based, contracted support and the necessary design and build-out services.

    So I don’t believe there is ever a single winner in the race: just different market positions between leaders and laggards, and customers choose which approach they prefer, and which partner.

    So does that also mean that the slightly academic/religious argument about Open vs Proprietary will be a continual one, without necessarily an obvious answer?

    #IWork4Dell

    • Nick Heudecker says:

      Thanks for your comments David. Your point about continually playing catch-up if running “true” open source is well put, but that’s why there are vendors offering support. ODP doesn’t change that.



Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.