Gartner Blog Network


There is No Such Thing as a Data Mashup

by Anthony J. Bradley  |  April 13, 2009  |  12 Comments

I feel compelled to step in and voice my concern over the drifting of the term mashup. I’m not sure why I felt that mashups would not undergo the typical hype expansion other terms experience when the bandwagon begins to roll. I think much of my initial belief stemmed from the fact that mashups had a clear point of origin: the term was taken from music track mixing and sampling, and the Google Maps-based mashups that drove mashup notoriety were clear examples. Unlike terms such as SOA, cloud computing, and Web 2.0, mashups at least started with a clear differentiation. Unfortunately, those days seem to be fading.

Let me address “Data Mashups.” There is no such thing. A mashup is a composite application. Like any application, it is some purposeful utilization of data and not the data itself (feed or stored). “Data mashups” is a term created by data integration vendors looking to jump on the mashup hype. One vendor called itself a “data mashup” vendor without making any change to its basic data integration capabilities and initially (though it has changed since) didn’t even offer its integrated data as a Web-technology-based API or feed. We already have a term for accessing data from multiple sources (XML sources included) and combining it in various permutations: data integration. Relabeling data integration as data mashups does not serve mashups well, as it hides the differentiated benefits of mashups within the quagmire of data integration.

Now, I’m certainly not saying that data integration isn’t important to mashups. What I am saying is that data integration is not a mashup or a “data mashup.” Almost two years ago Gartner delineated the difference between a mashup platform and a mashup enabler (see “Reference Architecture for Enterprise Mashups” – subscription or fee required). A mashup platform (such as JackBe, Serena, IBM’s Mashup Center) facilitates the building, management, assembly, and publishing of mashups, while the mashup enablers (such as Denodo, Kapow, Connotate, Lixto) focus on accessing existing IT capabilities and exposing them as something mashable (see #3 below). Mashup enablement is very important for enterprise mashups, but mashup enablement doesn’t create mashups or “data mashups”; it enables mashups.

It is important that we stick to a clear and differentiated definition of mashups, or it will be impossible to capitalize on what makes them unique and gives them the potential to deliver new application value. Mashups are:

  1. A type of composite application where a new application is assembled from existing capabilities (data, logic, and visualization).
  2. The original sourced existing capabilities maintain their essence (meaning that you as the creator and/or user know explicitly where the capabilities came from). This is crucial for governance and enabling socially-driven mashups.
  3. Mashups employ open Web-based technologies such as HTTP, XML, XHTML, RSS, and ATOM.
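
To make the three characteristics concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the two feeds, their URLs, and the `build_mashup` function are hypothetical, and the feed contents are inline strings standing in for HTTP responses. It assembles a new application view (map pins) from existing capabilities, keeps the origin of each capability visible, and consumes open Web formats (RSS and XML).

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS feed of rental listings (stands in for an HTTP response).
LISTINGS_RSS = """<rss version="2.0"><channel>
  <title>Example Listings</title>
  <link>http://listings.example.com/feed</link>
  <item><title>2BR apartment</title><category>Springfield</category></item>
  <item><title>Studio loft</title><category>Shelbyville</category></item>
</channel></rss>"""

# Hypothetical XML geocoding response from a second, unrelated source.
GEO_XML = """<places source="http://geo.example.org/api">
  <place name="Springfield" lat="39.80" lon="-89.64"/>
  <place name="Shelbyville" lat="39.41" lon="-88.79"/>
</places>"""


def build_mashup(listings_rss, geo_xml):
    """Assemble map pins from two existing sources, keeping provenance."""
    channel = ET.fromstring(listings_rss).find("channel")
    listings_source = channel.findtext("link")

    geo_root = ET.fromstring(geo_xml)
    geo_source = geo_root.get("source")
    coords = {
        place.get("name"): (float(place.get("lat")), float(place.get("lon")))
        for place in geo_root.findall("place")
    }

    pins = []
    for item in channel.findall("item"):
        city = item.findtext("category")
        if city not in coords:
            continue  # skip listings the second source cannot place
        pins.append({
            "label": item.findtext("title"),
            "position": coords[city],
            # Characteristic 2: each capability's origin stays explicit.
            "sources": {"listing": listings_source, "position": geo_source},
        })
    return pins


pins = build_mashup(LISTINGS_RSS, GEO_XML)
for pin in pins:
    print(pin["label"], pin["position"], pin["sources"])
```

Note that the pins are the output of an application that uses the data for a purpose; the feeds themselves, even if combined, would not be the mashup under this definition.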

See “Mashups and Their Relevance to the Enterprise” for a full definition – subscription or fee required.

It is the combination of these three characteristics that makes mashups different from other forms of composite applications and systems integration. They are the root of the tremendous potential of mashup applications. We should not dilute that value by obscuring the definition, or the IT world will miss what is different and will either ignore mashups as “the same old thing” or, even worse, try and fail to achieve their promise.

Rant concluded, comments welcome.

Category: mashups  

Tags: data-mashups  

Anthony J. Bradley
GVP
10 years at Gartner
26 years in IT

Anthony J. Bradley is a group vice president in Gartner Research responsible for the research content that Gartner publishes through its three internet businesses (softwareadvice.com, capterra.com and getapp.com). These responsibilities include creating and leading the research organization and infrastructure needed for the strategy formulation, planning, research, creation, editing, production and distribution of the content. He has four global teams of highly talented people who are advancing towards the world's greatest destination for content on how small businesses succeed through information technology.


Thoughts on There is No Such Thing as a Data Mashup


  1. Byron Igoe says:

    I agree that marketing departments re-branding old data integration as new data mashup is a shame. However, in trying to define mashup as only application mashup, you are leaving behind the music industry, which coined the term.

    I defined mashup as “the creation of a new work from two sources that were not initially designed to be combined” in this article: http://bionbi.blogspot.com/2008/07/data-mashup-defined.html

    From there, Application Mashup is self-explanatory, and I define a Data Mashup as a dataset produced by users (not IT) performing data integration tasks interactively and collaboratively via the web.

    For example, my company (http://www.inetsoft.com) provides an Adobe Flash interface for drag-and-drop ad hoc querying that allows data from different sources (databases, web services, other data mashups, etc.) to be connected (join, union, etc.) with just a mouse. Calling this data integration doesn’t do it justice.

    -Byron

  2. Anthony Bradley says:

    Byron, On the contrary, I am including the music industry which provides music applications (songs/pieces) by assembling parts of other applications (tracks) in new ways. A “data mashup” analogy to music would be the production process of integrating music from several instruments into a music track. That isn’t called a “note mashup” it is called music.

    You are illustrating the challenge in that you are spinning the term “data mashup” to suit the needs of your product. I know many, many “data mashup” proponents who would strongly disagree with your “dataset produced by users” restriction.

    I would describe your product as a visually oriented data integration tool. Admittedly, this isn’t as sexy as “data mashups” which is, of course, why you use it.

  3. Anthony Bradley says:

    Also, your definition of mashup is far too broad. Under your definition, most portals, many SOA orchestration apps, numerous BI reports, and a host of other systems integration initiatives would qualify as mashups.

  4. Mike Ogrinz says:

    I think you are missing an important point in your harkening back to the early days of mashups. Remember Paul Rademacher and HousingMaps.com (the “first” mashup), which sprang from an ad hoc combination of craigslist and Google Maps?

    Mashups began as “innovation without permission” in both the computer + music space. Sure, this sentiment doesn’t fit well with a cut & dry technical definition, but I think somehow we need to capture the spirit that brought us mashups in the first place. (side note: mashups have spread to a third form as people mix Christian Bale rants w/ YouTube videos. But I digress..)

    I have some other concerns with your definition:
    >>1) (data, logic, and visualization)
    This would seem to contradict w/ some of the open standards listed in #3. RSS doesn’t have a visualization component, for example. You might counter that an RSS Reader is the visualization – but the Data Mashup vendors might fire back that the tool (e.g., widget, spreadsheet, etc) that ultimately displays the data they provide is theirs. After all, isn’t some form of output inevitable?

    >>2) you as the creator and/or user know explicitly where the capabilities came from
    I’ve used lots of mashups to find cheap gas, restaurants, etc. And I don’t know where the data comes from. Maybe it’s 1 site – maybe hundreds. I don’t really care. I’m just happy to have the help I need. You are completely right that this presents huge governance, security, auditability, etc issues – especially within the enterprise. But if we immediately disqualify apps that don’t credit their sources as not being true mashups, I think we’d be throwing out some good stuff. This strikes me as more of a “best practice”.

    >>3) Mashups employ open Web-based technologies such as HTTP, XML, XHTML, RSS, and ATOM.
    Ideally, yes. But the first computer mashup (again, housingmaps.com) didn’t. Paul wrote a custom parser for craigslist and kludged his way into Google Maps before it was officially opened up. I love open protocols and standards. But I love a good hack, too. If providing a new solution means getting my hands dirty – so be it. In fact, some content owners have only chosen to provide an API *after* people started mashing up their data the hard way.

    Another side point: By your definition, Portals easily qualify as mashups if we add an open standard like JSR-168 to your #3. Most of the mashup enablers you list can spit out content in this form (and by extension that would make the portal vendor really a mashup platform??) To me portals aren’t mashups because they don’t give the user any/enough control. Even with personalization the user is limited to what the portlet creator opens up for customization.

    I completely agree that there are EII and EAI vendors who have jumped on the mashup bandwagon. At least they have the decency to call it a “Data Mashup” tool and not just a “Mashup” one. You know right away to look closer at these guys :-) I think the bigger danger is the BPM and BPEL guys who are suddenly saying they are now a mashup product.

  5. Anthony Bradley says:

    Mike, Thanks for your reply. BTW, I am reading your book and will review it in this blog shortly :-) Let me address a couple of your comments.
    1. JSR-168 is not an open Web standard and therefore doesn’t qualify. However, portals are moving quickly to consume Web standards and soon there may be no distinction. They will be getting far more flexible.
    2. Yes, very early on with mashups there was some hacking required but that was relatively embryonic and before the big mashup movement. I don’t think you will argue that easier is better and easy is a foundation of mashups (although maybe with your 2+ million patterns you may think they are more complex than I do :-)
    3. RSS is a feed not a mashup. A reader that mashes multiple feeds is the mashup application. In your book you have a set of “Harvest” patterns. I’m all for that. Call it Data Harvesting not Data Mashups. You have yet another definition of data mashups in your book where you call it an alternative to the term “Enterprise Mashups.” I’ll wait for my review to address that :-)

    4. You hit on the tough one which is the “maintaining their essence” criterion. I admit it sits on the line between criterion and best practice but it is so important that I pushed it over to criterion. I don’t believe enterprise mashups will be successful without this criterion. We use this one to differentiate mashups from the variety of other integration techniques where the capabilities are homogenized (e.g., BI/DW reporting). As an enterprise user you certainly should care and know where the capabilities are coming from. In fact you should be able to choose based on that knowledge.

    Finally I will say that you don’t really define mashups or enterprise mashups in your book, you dance around it. You do state that, “All composite applications are mashups but not all mashups are composite applications.” This is seriously flawed. Under that umbrella portals, SOA orchestration, SharePoint web parts, spreadsheet linking, and maybe even tiling of several open windows would qualify as mashups. I am trying to hold to a clear mashup definition upon which you can hang unique benefits. You have turned mashups into everything and therefore into nothing :-)

  6. Anthony Bradley says:

    BTW, I met Mike and he gave me a draft copy of his recent book, “Mashup Patterns: Designs and Examples for the Modern Enterprise.” I liked Mike right away and I appreciate his bravery in handing an analyst a copy of his book. :-) Notice all the smiley faces.

  7. Anthony Bradley says:

    We need to remember there is a fundamental disconnect in the industry right now with mashups. Vendors more and more are saying, “When you get down to it, everything is a mashup.” Meanwhile, clients are asking, “What is a mashup really, and why should I care?” I sit between the two and try to help our clients understand what is unique about mashups and why that uniqueness is valuable.

  8. Mike Ogrinz says:

    Thanks for the quick reply :-) (see, I can sprinkle those smiley faces in there as well). And don’t forget you got a rough-cut of the book, though I can’t say I made any edits to the sections you mention. :-)

    I think we share very similar concerns. Last Dec, I started a thread over on the book’s companion site called, “Are mashups all hype” where I wrote:

    “At this nascent stage of their development, I will not argue that there are not some clouds of uncertainty surrounding mashups. And disparate vendor marketing is partially to blame. Let’s work together to bring clarity to this space”

    So I can hardly complain when you make a bold step to address this issue. :-) As for “data mashups” being “enterprise mashups”.. hmm – I’ll plead writer’s fatigue on that one. Sounds like a moment of stupidity (like reading Gartner blogs while I’m on vacation in St. John :-) )

    In my defense though, I think the set of “core abilities” (Data Extraction, Data Entry, Clipping, Support for Open Standards, Transformation, Visualization) on which all the mashup patterns in my book are based are very similar to the 3 criteria you put forth. This watermark eliminates tools like Excel, portals, etc. And hopefully budding mashup builders will use these criteria to bring clarity to any vendor discussions they wind up having.

  9. Anthony Bradley says:

    I like the data harvesting terminology and patterns you use in the book and much prefer that terminology to data mashups. Data harvesting is more of an umbrella term where mashup enabling of existing systems is one goal of data harvesting. I am using mashup enabling of existing systems building upon the well known terms of web enabling and service enabling.

    Anyway, I appreciate the dialog and look forward to finishing your book.

  10. Nick Gall says:

    Whether “data mashup” is a useful term embraced by users or merely a term foisted on the market by vendors I will leave for others to debate. I will claim that data designs can be more or less “mashable” and that this is the most important aspect of mashup architecture. Thus it is data designs (or whatever you want to call the concept, e.g., schemas, data models, information architectures), such as RSS, ATOM, JSON, GData, that are most important to “mashability.” For a good discussion of this concept see http://derivadow.com/2007/12/28/web-design-20-its-all-about-the-resource-and-its-url/ .

  11. Anthony Bradley says:

    Agreed, and I would argue that data design and information architecture are central to any systems interoperability, certainly not just enterprise mashups. This is why we talk about the importance of mashup enablement. However, mashup enablement is about exposing capabilities as mashable, which certainly is improved by the use of RSS, ATOM, etc., but that won’t give you mashup interoperability. The challenges of syntactic and semantic interoperability will still persist. An underlying information model or other means of mapping the data from multiple mashable sources is still required regardless of the individual sources’ data exposure formats and protocols.



Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.