Gartner Blog Network

Amazon Redshift Disrupts DW Economics – But Nothing Comes Without Costs

by Merv Adrian  |  December 8, 2012  |  7 Comments

At its first re:Invent conference in late November, Amazon announced Redshift, a new managed service for data warehousing. Amazon also offered details and customer examples that made AWS’ steady inroads toward enterprise, mainstream application acceptance very visible.

Redshift is made available via MPP nodes of 2TB (XL) or 16TB (8XL), running the Paraccel PADB high-performance columnar, compressed DBMS, scaling to 100 8XL nodes, or 1.6PB of compressed data. XL nodes have 2 virtual cores and 15GB of memory, while 8XL nodes have 16 virtual cores and 120GB of memory and operate on 10 Gigabit Ethernet.
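As a quick check on those scaling figures, the arithmetic can be sketched in Python. The node sizes are the ones quoted above; the helper name and the decimal conversion (1 PB = 1000 TB) are illustrative assumptions, not anything Amazon published.

```python
# Node capacities as quoted in this post (TB of compressed data per node).
NODE_CAPACITY_TB = {"XL": 2, "8XL": 16}

def cluster_capacity_tb(node_type: str, node_count: int) -> int:
    """Total capacity in TB for a cluster of identical nodes (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return NODE_CAPACITY_TB[node_type] * node_count

# The stated maximum: 100 nodes of the 8XL type.
max_tb = cluster_capacity_tb("8XL", 100)
print(f"{max_tb} TB = {max_tb / 1000} PB")  # 1600 TB = 1.6 PB
```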

Reserved pricing (the more likely scenario, involving a commitment of 1 year or 3 years) is set at “under $1000 per TB per year” for a 3 year commitment, combining upfront and hourly charges. Continuous, automated backup for up to 100% of the provisioned storage is free. Amazon does not charge for data transfer into or out of the data clusters. Network connections, of course, are not free – see Doug Henschen’s InformationWeek story for details.
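To put the headline price in perspective, here is a rough sketch that treats the quoted “under $1000 per TB per year” as a ceiling for a 3 year commitment. It assumes, for illustration only, that the same ceiling applies at every cluster size, and it ignores the network connection charges mentioned above; the function name is hypothetical.

```python
# Quoted ceiling in USD: "under $1000 per TB per year" on a 3 year commitment.
PRICE_CEILING_PER_TB_YEAR = 1000

def annual_cost_ceiling_usd(capacity_tb: float) -> float:
    """Upper bound on yearly cost, assuming the quoted ceiling holds at any size."""
    return capacity_tb * PRICE_CEILING_PER_TB_YEAR

# One XL node, one 8XL node, and the 100-node 8XL maximum.
for tb in (2, 16, 1600):
    print(f"{tb} TB: under ${annual_cost_ceiling_usd(tb):,.0f} per year")
```

On these assumptions, even the largest configuration described here would come in under $1.6 million per year.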

This is a dramatic thrust in pricing, but it does not come without giving up some things. For example, Amazon has not licensed Paraccel’s high-speed data import utilities; it is far more focused at this point on enabling movement between its own Elastic MapReduce, DynamoDB and S3 storage and Redshift. Thus the early focus, and likely early adoption, is Amazon’s customers’ data already in the cloud. Movement from existing data warehouses will come later. Today, that would require exporting data into S3 and then moving it into a (designed) Redshift data warehouse using Amazon’s data movement utilities, which were not shown in detail. Design doesn’t disappear, and it’s not free. As my colleague Mark Beyer said in an email discussion:

Data warehouse and analytics expertise is harder to come by than many believe. With Amazon Redshift providing services to initiate and operate the data warehouse in lieu of Paraccel’s management interface and tools, it is left up to the Redshift implementer to “provide the data warehouse chops.” While I’m sure that any good Cloud application jockey knows their stuff, any data warehouse veteran on the planet knows that letting the apps guys write analytics is like asking your doctor to be the striker on your football team (what we call Soccer here). It is entirely likely that an entire cottage industry of “expert implementers for analytics in the Cloud” will appear on the near horizon.

It’s also not clear how much database (not deployment and operating) control will be made available. Paraccel offers plenty of knobs and buttons. Tweaking performance by configuring memory, pinning tables there, looking at how data is packed inside the “slices” – it does not appear any of that will be exposed in the Redshift version. Nor is it obvious yet how to build ongoing updates for a Redshift data warehouse.

Another missing “feature” is the support model one gets from a software firm like Paraccel – the level and nature of support in an Amazon environment today is quite different. Still, this is a work in progress. It was evident at re:Invent that Amazon is building up and enhancing its enterprise-facing team, and I had an interesting conversation with them about how the engagement model for an enterprise that has had several individuals “unofficially” contracting for projects on their own transitions to a corporate model. They have seen this play out a number of times now, and it’s becoming a better understood play for them.

One final comment about the vendors’ relationship: it is not as close as I suspect Paraccel would have hoped. After a multi-million dollar investment and over a year of joint work, it was surprising not to see Paraccel’s CEO on stage for the announcement, or even a synchronized press release. This reflects the relative arm’s-length nature of this arrangement. In my subsequent conversations with them, it became clear that Amazon expects their offering to diverge from Paraccel’s over time as they add their own pieces around the part they have licensed for use in Redshift. And there was no publicized joint marketing or sales initiative.

It remains to be seen if the whole elasticity value proposition (scale up, scale down) proves as relevant to data marts and data warehouses as it does to the apps that Amazon is more accustomed to hosting, or how quickly enterprises will move their data to a public cloud. Warehouses don’t scale down. But analytic platforms used for experimenting will, and this may create a great opportunity for Amazon. Gartner clients can see our position on other dimensions of this announcement in a First Take Mark Beyer and I just published.

Category: amazon  hadoop  mapreduce  big-data  data-warehouse  data-warehouse-appliance  dbms  

Tags: amazon  mapreduce  big-data-2  data-warehouse  dynamodb  paraccel  redshift  

Merv Adrian
Research VP
5 years with Gartner
38 years in IT industry

Merv Adrian is an analyst following database and adjacent technologies as extreme data transforms assumptions about what to persist as well as when, where and how. He also watches the way the software/hardware boundary…

Thoughts on Amazon Redshift Disrupts DW Economics – But Nothing Comes Without Costs

  1. […] Merv Adrian and Doug Henschen both reported more details about Amazon Redshift than I intend to; see also the comments on Doug’s article. I did talk with Rick Glick of ParAccel a bit about the project, and he noted: […]

  2. Shameless. I feel nothing but discomfort.

Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.