Entries Categorized as 'Big Data'
by Merv Adrian | March 9, 2013 | 22 Comments
I don’t often do a pure opinion piece, but I feel compelled to weigh in on a question I’ve been asked several times since EMC released its Pivotal HD recently. The question is whether it is somehow inappropriate, even “evil,” for EMC to enter the market without having “enough” committers to open source Apache projects. [...]
Category: Apache Big Data Cassandra EMC Hadoop Lucene MapR open source Tags: Apache, big data, Cassandra, EMC, Hadoop, MapR, OSS
by Merv Adrian | March 8, 2013 | 1 Comment
The first three posts in this series talked about performance, projects and platforms as key themes in what is beginning to feel like a watershed year for Hadoop. All three are reflected in the surprising emergence of a number of new players on the scene, as well as some new offerings from additional ones, which I’ll cover in [...]
Category: Amazon Apache Big Data Gartner Hadoop Hbase HDFS Lucene MapR MapReduce Tags: Apache, DBMS, DDN, EMC, Gartner, HANA, Hbase, HDFS, Hortonworks, HP, HPC, IBM, Intel, Lustre, MapR, MapReduce, MarkLogic, Pentaho, S3, SAP, SAS, Talend, VMware, WANdisco
by Merv Adrian | February 23, 2013 | 4 Comments
In the first two posts in this series, I talked about performance and projects as key themes in Hadoop’s watershed year. As it moves squarely into the mainstream, organizations making their first move to experiment will have to make a choice of platform. And – arguably for the first time in the early mainstreaming of an information [...]
Category: Amazon Apache Aster Big Data BigInsights Cisco Cloudera data warehouse appliance Elastic MapReduce EMC Gartner graph databases Hadoop HP IBM MapReduce NetApp Oracle Teradata Yarc Tags: Amazon, Apache, Aster, big data, Cisco, Cloudera, Elastic MapReduce, EMC, EMR, graph database, Hadoop, HP, IBM, MapReduce, NetApp, Oracle, Teradata, Yarc
by Merv Adrian | February 16, 2013 | 11 Comments
It’s no surprise that we’ve been treated to many year-end lists and predictions for Hadoop (and everything else IT) in 2013. I’ve never been that much of a fan of those exercises, but I’ve been asked so much lately that I’ve succumbed. Herewith, the first of a series of posts on what I see as [...]
Category: Big Data BigInsights Cloudera EMC Hadoop Hbase HDFS Hortonworks IBM MapReduce Sqoop Tags: Apache, BigInsights, Cloudera, EMC, Flume, Hadoop, Hbase, HDFS, Hive, Hortonworks, IBM, MapR, MapReduce, Pig, Sqoop, zookeeper
by Merv Adrian | February 10, 2013 | 15 Comments
“Hadoop people” and “RDBMS people” – including some DBAs who have contacted me recently – clearly have different ideas about what Data Integration is. And both may differ from what Ted Friedman (twitter: @ted_friedman) and I (@merv) were talking about in our Gartner research note Hadoop Is Not a Data Integration Solution, although I think [...]
Category: Big Data data integration Hadoop Hortonworks Magic Quadrant Talend Uncategorized Tags: Apache, data integration, Hadoop, Hortonworks, Magic Quadrant
by Merv Adrian | January 30, 2013 | 8 Comments
2013 promises to be a banner year for Apache Hadoop, platform providers, related technologies – and analysts who try to sort it out. I’ve been wrestling with ways to make sense of it for Gartner clients bewildered by a new set of choices, and for them and myself, I’ve built a stack diagram that describes [...]
Category: Apache Big Data Cloudera data integration Hadoop Hbase HDFS Hortonworks MapReduce open source OSS Sqoop Tags: Apache, Cassandra, Cloudera, Datastax, Flume, Hadapt, Hadoop, Hbase, HDFS, Hive, Hortonworks, Hstreaming, Karmasphere, MapR, MapReduce, Oozie, open source, OSS, Pig, Sqoop, zookeeper
by Merv Adrian | December 8, 2012 | 7 Comments
At its first re:Invent conference in late November, Amazon announced Redshift, a new managed service for data warehousing. Amazon also offered details and customer examples that made AWS’ steady inroads toward enterprise, mainstream application acceptance very visible. Redshift is made available via MPP nodes of 2TB (XL) or 16TB (8XL), running the Paraccel PADB high-performance columnar, compressed [...]
Category: Amazon Big Data data warehouse data warehouse appliance DBMS Hadoop MapReduce Tags: Amazon, big data, data warehouse, DynamoDB, MapReduce, Paraccel, Redshift
by Merv Adrian | January 23, 2012 | Comments Off
In early January 2012, the world of big data was treated to an interesting series of product releases, press announcements, and blog posts about Hadoop versions. To begin with, we had the announcement of Apache Hadoop version 1.0 at long last, in a press release. Although there were grumblings here and there in the twittersphere that [...]
Category: Apache Big Data Cloudera Hadoop Hbase HDFS Hortonworks IBM MapReduce NetApp open source Sqoop Tags: Apache Software Foundation, ASF, Aster, Avro, CDH, Cloudera, Datastax, EMC, Greenplum, Hadoop, Hbase, Hive, Hortonworks, IBM, Mahout, MapReduce, NetApp, open source, Pig, Sqoop, Teradata
by Merv Adrian | November 3, 2011 | 6 Comments
Another guest post, this time from my colleague and friend Mark Beyer. My name is Mark Beyer, and I am the “father of the logical data warehouse”. So, what does that mean? First, if, like any father, you are not willing to address your ancestry with full candor, you will lose your place in the universe and [...]
Category: Big Data data warehouse DBMS Tags: data warehouse
by Merv Adrian | July 19, 2011 | 4 Comments
The big players are moving in for a piece of the Big Data action. IBM, EMC, and NetApp have stepped up their messaging, in part to prevent startup upstarts like Cloudera from cornering the Apache Hadoop distribution market. They are all elbowing one another to get closest to “pure Apache” while still “adding value.” Numerous [...]
Category: Big Data Hadoop IBM MapReduce Microsoft OSS Yahoo! Tags: Apache, BigInsights, Brisk, Cassandra, Cloudera, Datarush, Datastax, Eigenbase, EMC, Facebook, Flume, Hadapt, Hadoop, Hbase, HDFS, Hive, Hortonworks, Hstreaming, IBM, InfoSphere, Isilon, Karmasphere, Linux, MapR, MapReduce, Microsoft, Mondrian, NetApp, NFS, Oozie, open source, Oracle, OSS, Pervasive, Pig, Platform Computing, SQLStream, Sqoop, Watson, Yahoo!, zookeeper