Gartner Blog Network

Posts from Date: 2014-2

Apache Tajo Enters the SQL-on-Hadoop Space

by Nick Heudecker  |  February 19, 2014

The number of SQL options for Hadoop expanded substantially over the last 18 months. Most get a large amount of attention when announced, but a few slip under the radar. One of these low-flying options is Apache Tajo. I learned about Tajo in November of 2013 at a Hadoop User Group meeting. Billed as a big […]

How Square Secured Your Data in Hadoop

by Nick Heudecker  |  February 14, 2014

If any company must face issues around data security, a credit card payment processor is a likely candidate. Square’s card readers and applications likely process millions of payments per day. One output of this processing is a lot of data. According to a presentation delivered at Facebook, Square stores a substantial amount of this data in […]

NoSQL Shouldn’t Mean NoDBA

by Nick Heudecker  |  February 6, 2014

Last September I conducted an informal survey of NoSQL adopters to improve our understanding of who is using NoSQL and why. The results were largely what I expected, except for the respondent profile. Database administrators (DBAs) appear to be significantly underrepresented in the NoSQL space, representing only 5.5% of respondents: The possibility of selection bias […]

Spark and Tez Highlight MapReduce Problems

by Nick Heudecker  |  February 4, 2014

On February 3rd, Cloudera announced support for Apache Spark as part of Cloudera Enterprise. I’ve blogged about Spark before so I won’t go into substantial detail here, but the short version is Spark improves upon MapReduce by removing the need to write data to disk between steps. Spark also takes advantage of in-memory processing and […]
