Merv Adrian

A member of the Gartner Blog Network

Entries Tagged as 'HDFS'


Strata Spark Tsunami – Hadoop World, Part One

by Merv Adrian  |  October 31, 2014  |  2 Comments

New York’s Javits Center is a cavernous triumph of form over function. Giant empty spaces were everywhere at this year’s empty-though-sold-out Strata/Hadoop World, but the strangely-numbered, hard to find, typically inadequately-sized rooms were packed. Some redesign will be needed next year, because the event was huge in impact and demand will only grow. A few of […]

2 Comments »

Category: Accumulo Amazon Apache Apache Yarn Aster Avro Big Data BigInsights Cascading Cassandra Cloudera Elastic MapReduce Gartner Hadoop HDFS Hive Hortonworks IBM MapR MapReduce Microsoft Spark Uncategorized YARN     Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

Hadoop Is A Recursive Acronym

by Merv Adrian  |  October 13, 2014  |  3 Comments

Hopefully, that title got your attention. A recursive acronym – the term first appeared in the book Gödel, Escher, Bach: An Eternal Golden Braid and is likely more familiar to tech folks who know Gnu – is self-referential (as in “Gnu’s not Unix.”) So how did I conclude Hadoop, whose name origin we know, fits the definition? […]

3 Comments »

Category: Accumulo Apache Apache Yarn Big Data Cascading Flume Gartner Hadoop Hbase HDFS Hive Mahout MapReduce Oozie Pig Spark Sqoop Teradata YARN Zookeeper     Tags: , , , , , , , , , , , ,

Hadoop is in the Mind of the Beholder

by Merv Adrian  |  March 24, 2014  |  11 Comments

This post was jointly authored by Merv Adrian (@merv) and Nick Heudecker (@nheudecker) and appears on both blogs. In the early days of Hadoop (versions up through 1.x), the project consisted of two primary components: HDFS and MapReduce. One thing to store the data in an append-only file model, distributed across an arbitrarily large number […]

11 Comments »

Category: Accumulo Ambari Apache Apache Drill Apache Yarn Big Data BigInsights Cloudera Elastic MapReduce Gartner Giraph Hadoop Hbase HCatalog HDFS Hive Hortonworks IBM Intel Lucene MapR MapReduce Oozie open source OSS Pig Solr Sqoop Storm YARN Zookeeper     Tags: , , , , , , , , , , , , , , , , , , , , , , , ,

AAA is Not Enough Security in the Big Data Era

by Merv Adrian  |  January 13, 2014  |  11 Comments

Talk to security folks, especially network ones, and AAA will likely come up. It stands for authentication, authorization and accounting (sometimes audit). There are even protocols such as Radius (Remote Authentication Dial In User Service, much evolved from its first uses) and Diameter, its significantly expanded (and punnily named) newer cousin, implemented in commercial and […]

11 Comments »

Category: Big Data data integration data warehouse DBMS Gartner Hadoop HDFS Industry trends Security     Tags: , , , , , ,

BYOH – Hadoop’s a Platform. Get Used To It.

by Merv Adrian  |  October 9, 2013  |  9 Comments

When is a technology offering a platform? Arguably, when people build products assuming it will be there. Or extend their existing products to support it, or add versions designed to run on it. Hadoop is there. The age of Bring Your Own Hadoop (BYOH) is clearly upon us.  Specific support for components such as Pig […]

9 Comments »

Category: Big Data BigInsights Cloudera DBMS Hadoop HDFS Hive Hortonworks IBM Lucene MapR MapReduce Microsoft Oracle Pig Rainstor RDBMS Security Solr SQL Server SQLstream Talend Teradata YARN     Tags: , , , , , , , , , , , , , , , , , , , , , , ,

What, Exactly, Is “Proprietary Hadoop”? Proposed: “distribution-specific.”

by Merv Adrian  |  September 6, 2013  |  12 Comments

Many things have changed in the software industry in an era when the use of open source software has pervaded the mainstream IT shop. One of them is the significance – and descriptive adequacy – of the word “proprietary.” Merriam-Webster defines it as “something that is used, produced, or marketed under exclusive legal right of […]

12 Comments »

Category: Apache Apache Yarn Big Data BigInsights Cassandra Cloudera Hadoop Hbase IBM MapR MapReduce open source OSS Pig YARN     Tags: , , , , , , , , , , , , , ,

Hadoop Summit Recap Part Two – SELECT FROM hdfs WHERE bigdatavendor USING SQL

by Merv Adrian  |  July 15, 2013  |  10 Comments

Probably the most widespread, and commercially imminent, theme at the Summit was “SQL on Hadoop.” Since last year, many offerings have been touted, debated, and some have even shipped. In this post, I offer a brief look at where things stood at the Summit and how we got there. To net it out: offerings today […]

10 Comments »

Category: Apache Apache Drill Apache Yarn Aster Big Data Cloudera data warehouse DBMS Gartner Hadapt Hadoop HCatalog HDFS Hive Hortonworks IBM MapR MapReduce Microsoft Netezza Oozie Oracle Rainstor RDBMS Real-time SQL Server Sqoop Teradata YARN     Tags: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,

That Exciting New Stuff? Yeah… Wait Till It Ships.

by Merv Adrian  |  July 13, 2013  |  3 Comments

A brief rant here: I am asked with great frequency how this RDBMS will hold off that big data play, how data warehouses will survive in a world where Hadoop exists, or whether Apple is done now that Android is doing well. There is a fundamental fallacy implicit in these questions. Comparing what someone new […]

3 Comments »

Category: Big Data data warehouse Gartner Hadoop HDFS MapR RDBMS SQL Server     Tags: , , , , , , ,

Hadoop Summit Recap Part One – A Ripping YARN

by Merv Adrian  |  July 10, 2013  |  7 Comments

I had the privilege of keynoting this year’s Hadoop Summit, so I may be a bit prejudiced when I say the event confirmed my assertion that we have arrived at a turning point in Hadoop’s maturation. The large number of attendees (2500, a solid increase – and more “suits”) and sponsors (70, also a significant uptick) made […]

7 Comments »

Category: Apache Apache Yarn Big Data Cloudera Gartner graph databases Hadoop HDFS Hortonworks IBM Intel MapR MapReduce Storm Yahoo! YARN     Tags: , , , , , , , , ,

Hadoop 2013 – Part Four: Players

by Merv Adrian  |  March 8, 2013  |  1 Comment

The first three posts in this series talked about performance,  projects and platforms as key themes in what is beginning to feel like a  watershed year for Hadoop. All three are reflected in the surprising emergence of a number of new players on the scene, as well as some new offerings from additional ones, which I’ll cover in […]

1 Comment »

Category: Amazon Apache Big Data Gartner Hadoop Hbase HDFS Lucene MapR MapReduce     Tags: , , , , , , , , , , , , , , , , , , , , , , ,