Epiphany: Replace HATEOAS With "Hypermedia Describes Protocols"

by Nick Gall  |  June 2, 2009  |  5 Comments

As a few of my friends know, I live for epiphanies. I love to connect concepts. So I’m really happy to be having one now (it’s been a while as regular readers of my blog — if any remain — can tell).

For a LONG time, I’ve been talking about how all interfaces can be defined in terms of IFaPs (Identifiers, Formats, and Protocols). My canonical example of an interface composed of IFaPs is of course the Web: URL (I), HTML (F), and HTTP (P). All three intersect in a particular instance of HTML, say my blog’s home page. The HTML for my blog’s home page is filled with URLs, HTML tags, and even HTTP "verbs" (though these are quite rare, appearing mostly in HTML forms or embedded JavaScript).
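
To make this concrete, here is a bare-bones sketch (the URLs are invented for illustration) of a page in which all three IFaPs intersect. The format (HTML tags) carries the identifiers (URLs) and even a protocol verb (an HTTP POST via a form):

    <html>
      <body>
        <!-- Identifier: a URL carried inside the format -->
        <a href="http://example.com/blog/archive">Older posts</a>

        <!-- Protocol: an HTTP verb (POST) carried inside the format -->
        <form action="http://example.com/blog/comments" method="post">
          <input type="text" name="comment">
          <input type="submit" value="Add comment">
        </form>
      </body>
    </html>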

Then along came REST and with it the concept of HATEOAS: Hypermedia As The Engine of Application State. And everyone, myself included, spent a lot of time trying to grok it and explain it to others. We’re still trying. One way I try to explain it is by highlighting that HATEOAS requires that each server response contain not only the requested data, but also control information (in the form of specially tagged URLs) describing the next set of permitted interactions with the server. It is this additional control information (at a bare minimum, just some links to more data) that turns mere media into hypermedia.
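
Here is a sketch of what such a response might look like (the paths and representation are invented for illustration). The first part is the requested data; the links and form are the control information describing the next permitted interactions:

    HTTP/1.1 200 OK
    Content-Type: text/html

    <html>
      <body>
        <!-- the requested data -->
        <p>Here is the blog post you asked for...</p>

        <!-- control information: the next permitted interactions -->
        <a href="/posts/41" rel="prev">Previous post</a>
        <a href="/posts/43" rel="next">Next post</a>
        <form action="/posts/42/comments" method="post">
          <input type="text" name="comment">
        </form>
      </body>
    </html>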

Now along comes Jim Webber with a much better (dare I say brilliant) way of explaining HATEOAS and hypermedia: "Hypermedia Describes Protocols!" (See slide 26.) At first this might seem counterintuitive, since I said that HTTP is the Protocol and HTML is the Format in the WWW. But URLs, HTML, and HTTP are just generic description languages for describing domain-specific identifiers, formats, and protocols. Thus, think of a web of specific HTML pages as a domain-specific protocol. Jim Webber uses the example of ordering a Starbucks coffee. (What’s important is that each hypermedia DSL is composed using the generic languages of URL, HTML, and HTTP.)
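
To paraphrase the idea (this is my own sketch with invented URLs, not Jim’s actual example), the coffee-ordering DSL is just a web of pages: each state of the order is an HTML page, and each legal transition is a link or form on that page:

    /menu               --(form: POST /orders)--------->  /orders/17 (unpaid)
    /orders/17          --(link: rel="payment")-------->  /orders/17/payment
    /orders/17/payment  --(form: POST card details)---->  /orders/17 (paid)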

This notion of bringing together identifiers, formats and verbs to describe a protocol is not new. One of the best descriptions of this was in the WS-BPEL 1.1 spec:

In thinking about the data handling aspects of business protocols it is instructive to consider the analogy with network communication protocols. Network protocols define the shape and content of the protocol envelopes that flow on the wire, and the protocol behavior they describe is driven solely by the data in these envelopes. In other words, there is a clear physical separation between protocol-relevant data and "payload" data. The separation is far less clear cut in business protocols because the protocol-relevant data tends to be embedded in other application data.

So if WS-BPEL was already thinking about mixing protocol data with "payload" data, what’s so new about HATEOAS? The fundamental difference is that WS-BPEL is based on the concept of providing an entire static protocol description up front, once and for all, and providing it out of band. But HATEOAS is based on the notion of progressive description (don’t bother Googling the term, I coined it; and not to be confused with progressive disclosure). More and more of the description of the protocol is provided to the client, in band in the protocol itself, as the client executes its part of the protocol. I guess another good term might be JIT (Just In Time) Protocol Description; yet another might be "self-describing protocol". So now when explaining HATEOAS, instead of saying "each server response must contain control information" (huh?), I can say "each server response progressively self-describes the current protocol."
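
Concretely, a progressively self-describing exchange might look something like this (again my own sketch, with invented URLs). Note that the client never sees the whole protocol up front; each response reveals only the next legal step:

    Client: POST /orders HTTP/1.1
            (body: 1 grande latte)

    Server: HTTP/1.1 201 Created
            Location: /orders/17
            <!-- the client learns ONLY the next step in the protocol -->
            <a href="/orders/17/payment" rel="payment">Pay for this order</a>

    Client: POST /orders/17/payment HTTP/1.1
            (body: payment details)

    Server: HTTP/1.1 200 OK
            <!-- only now is the next part of the protocol revealed -->
            <a href="/orders/17/receipt" rel="receipt">Get your receipt</a>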

Now there are pros and cons to static/complete vs dynamic/progressive protocol descriptions. How can I program a client to execute its part of a protocol if I don’t have a full description of it up front? But if I encode the complete static description of the protocol into my client up front, how can I change the protocol dynamically?

Love to hear others’ thoughts. I’m going to think about this some more. That’s why I love epiphanies — they make you think about things in new ways.

5 responses so far

  • 1 Jerald Murphy   June 2, 2009 at 9:16 am

    My main concern about dynamic vs. static in this context has to do with flexibility vs. latency. While dynamic progressive self-description allows rapid evolution of capabilities in context, the progressive nature of this self-description will potentially kill transaction latency over a wide-area network. So, in large volume, low latency environments, this self-description will kill you. In application environments where change is the norm, this will be the preferred method of application evolution.

  • 2 Anthony Bradley   June 2, 2009 at 11:35 am

    When talking (especially to non-geeks) about HATEOAS, the RESTfulness of applications, and the payload delivery of control information, I use Wikipedia as an example. I point out that the content payload for the requested URL is the direct “data” response, but that same data also serves as the contextual metadata for the potential URL transitions in the response. The number of links (application state transition possibilities) and the effectiveness of data as metadata in your payloads give you a measure of the RESTfulness of the application.

    To put this more plainly, pages with great content that point to many relevant and high-value links in the pursuit of an overall goal are RESTful and employ HATEOAS. Those that don’t are either a dead end (i.e., no links) or aimless (i.e., no relevant connection through metadata).

  • 3 Nick Gall   June 3, 2009 at 6:50 am

    Thanks for the comments, Jerry and Anthony (wow, I feel like I’m back in a META Group Thursday Research Meeting!).

    Jerry, I think you meant “large volume, HIGH latency environment”. If so, then agreed. Hypermedia is bulkier and higher latency than binary data optimized just for machine processing. Despite that, the web (hypermedia) works “good enough” for most apps in most parts of the world. We all know it doesn’t work as well in many parts of Africa and Southeast Asia. Why? Because the networks there are low bandwidth and high latency (with high packet loss).

    Anthony, I agree that Wikipedia and many other “pure HTML” sites (e.g., Craigslist) are great examples of HATEOAS. I use them myself. But what I and others are trying to convey is an easy-to-understand NON-UI example of HATEOAS or “HYpermedia DEscribes PRotocols” (HYDEPR — only 55 Google hits!). I’m also trying to explain the difference in philosophy between BPEL/WSDL descriptions and HYDEPR descriptions. Since there’s no BPEL/WSDL for UI web sites, there’s nothing to compare/contrast with the Wikipedia UI.

  • 4 Ian Robinson   June 5, 2009 at 8:29 am

    “What’s important is that each hypermedia DSL is composed using the generic languages of URL, HTML, and HTTP.”

    The trick is in composing DSLs without stamping all over those generic languages: that is, our hypermedia DSLs ought not to abuse the generic elements upon which they depend. A technique that can help prevent mangling these “generic description languages” is to consider the protocol from the point of view of an intermediary. If the intermediary needs to know something about our application-specific protocol in order to behave correctly, we’re doing something wrong.

    Consider, for example, a (non-RESTful) situation in which we tunnel RPC-like commands over POST. If some of those commands behave like queries that return eminently cacheable results, and we do indeed want to cache those results, we’d likely have to bake some application-specific knowledge into our intermediaries (“this application-specific XML payload returns cacheable results”).

    By asking ourselves what intermediaries would have to do to support our hypermedia DSL, we steer ourselves towards a solution better aligned with the architecture of the Web (in this case: don’t tunnel cacheable requests over POST).

    The fact that intermediaries remain agnostic to good application-specific protocols (a.k.a. hypermedia DSLs) sometimes leads me to suggest that in a RESTful application, business-meaningful behaviours (application-specific behaviours) emerge almost as a side effect of the transfer of representations according to standardised media types.

  • 5 Galen Tosham   September 30, 2009 at 10:11 am

    I was just telling RD that I’m also quite the … epiphany slut!