Nick Gall

A member of the Gartner Blog Network

Nicholas Gall
VP Distinguished Analyst
14 years at Gartner
35 years IT industry

Nick Gall is a vice president in Gartner Research. As a founding member of Gartner’s Enterprise Planning and Architecture Strategies, Mr. Gall advises clients on enterprise strategies for interoperability, innovation and execution. Mr. Gall is a leading authority on middleware…

Tim Berners-Lee Doesn’t Seem to Think “Linked Data” Requires RDF

by Nick Gall  |  July 21, 2010  |  21 Comments

I just plunged back into the linked data scene after having been more focused on other topics, like design thinking/hybrid thinking. I was surprised to find that the controversy about whether linked data requires RDF is still raging: When is Linked Data not Linked Data? – A summary of the debate. I assumed this would have been settled by Tim Berners-Lee’s two TED talks on linked data: one last year (Tim Berners-Lee on the next Web) and an update this year (Tim Berners-Lee: The year open data went worldwide).

Tim does not mention RDF at all in either of them. Here is how he defines linked data in his 2009 TED talk:

So I want us now to think about not just two pieces of data being connected, or six like he did, but I want to think about a world where everybody has put data on the web and so virtually everything you can imagine is on the web, and then calling that linked data. The technology is linked data, and it’s extremely simple. If you want to put something on the web there are three rules: first thing is that those HTTP names — those things that start with "http:" — we’re using them not just for documents now, we’re using them for things that the documents are about. We’re using them for people, we’re using them for places, we’re using them for your products, we’re using them for events. All kinds of conceptual things, they have names now that start with HTTP.

Second rule, if I take one of these HTTP names and I look it up and I do the web thing with it and I fetch the data using the HTTP protocol from the web, I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event. Who’s at the event? Whatever it is about that person, where they were born, things like that. So the second rule is I get important information back.

Third rule is that when I get back that information it’s not just got somebody’s height and weight and when they were born, it’s got relationships. Data is relationships. Interestingly, data is relationships. This person was born in Berlin, Berlin is in Germany. And when it has relationships, whenever it expresses a relationship then the other thing that it’s related to is given one of those names that starts with HTTP. So, I can go ahead and look that thing up. So I look up a person — I can look up then the city where they were born; I can look up the region it’s in, and the town it’s in, and the population of it, and so on. So I can browse this stuff.
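
To make the three rules concrete, here is a minimal sketch in Python. It is only an illustration, not anything Tim shows in the talk: the `WEB` dictionary stands in for real HTTP lookups, and every `example.org` URL and property name is a made-up assumption.

```python
# Hypothetical in-memory "web": each HTTP name (rule 1) maps to the data
# a lookup would return. All example.org URLs are illustrative assumptions.
WEB = {
    "http://example.org/person/alice": {
        "http://example.org/prop/name": "Alice",
        "http://example.org/prop/bornIn": "http://example.org/place/berlin",
    },
    "http://example.org/place/berlin": {
        "http://example.org/prop/name": "Berlin",
        "http://example.org/prop/isIn": "http://example.org/place/germany",
    },
    "http://example.org/place/germany": {
        "http://example.org/prop/name": "Germany",
    },
}


def look_up(http_name):
    """Rule 2: looking up an HTTP name returns useful data about the thing."""
    return WEB[http_name]


def browse(start, relationship_path):
    """Rule 3: follow relationships whose values are themselves HTTP names."""
    current = start
    for relationship in relationship_path:
        current = look_up(current)[relationship]
    return current


country = browse(
    "http://example.org/person/alice",
    ["http://example.org/prop/bornIn", "http://example.org/prop/isIn"],
)
print(look_up(country)["http://example.org/prop/name"])  # Germany
```

Note that nothing in the sketch depends on a particular serialization; the only requirement is that things and relationships are named with HTTP names that can be looked up.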

Now one might argue that Tim was simply avoiding geek-speak in front of a general-interest audience. But what’s telling is his choice of examples of linked data successes. In particular, he highlights OpenStreetMap (OSM) in both talks. AFAICT, the OSM data format is linked, and it is XML, but it’s not RDF.

So if Tim is going to use OSM as a prime example of linked data (actually an example of linked open data), then he’s going to have to open the linked data tent to formats other than RDF. BTW, in the update video this year he also cites examples from both the UK and US open data efforts, many of which I’m sure are not in RDF.

And for those looking for a name for data that does require RDF, SPARQL, et al.? How about Semantic Web Data? Don’t get me wrong. I’m not against the full-blown Semantic Web standards per se. I just feel that they fail the critical test of the simplest thing that could possibly work or, as Tim describes it, the Principle of Least Power.

Remember, linked data is all about the links and the relationships—not the format.

So for me at least, linked data refers to any machine-readable data with URLs pointing to it and URLs pointing out of it. It doesn’t get any simpler than that. I think Tim would agree.
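
As a rough sketch of how little that definition asks for, the following Python checks a JSON payload against it. The function names, the JSON shape, and the `example.org` URLs are all assumptions made for illustration; a real check would also have to confirm that the URLs actually resolve.

```python
import json

URL_PREFIXES = ("http://", "https://")


def outbound_urls(value):
    """Recursively collect every string value that looks like an HTTP(S) URL."""
    if isinstance(value, str):
        return [value] if value.startswith(URL_PREFIXES) else []
    if isinstance(value, dict):
        return [u for v in value.values() for u in outbound_urls(v)]
    if isinstance(value, list):
        return [u for v in value for u in outbound_urls(v)]
    return []


def looks_linked(own_url, payload):
    """True if the data has a URL pointing to it and at least one URL pointing out."""
    links = [u for u in outbound_urls(json.loads(payload)) if u != own_url]
    return own_url.startswith(URL_PREFIXES) and len(links) > 0


record = (
    '{"id": "http://example.org/node/1", "name": "Cafe", '
    '"in_city": "http://example.org/place/berlin"}'
)
print(looks_linked("http://example.org/node/1", record))              # True
print(looks_linked("http://example.org/node/2", '{"name": "Cafe"}'))  # False
```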

Category: semantics WOA     Tags:

21 responses so far ↓

  • 1 Tweets that mention Tim Berners-Lee Doesn’t Seem to Think “Linked Data” Requires RDF -- Topsy.com   July 21, 2010 at 1:52 pm

    [...] This post was mentioned on Twitter by August Jackson, Julie Hunt. Julie Hunt said: RT @8of12 rss: Tim Berners-Lee Doesn’t Seem to Think “Linked Data” Requires RDF http://bit.ly/bFjjFh frm Nick Gall #Gartner blogs [...]

  • 2 Andreas Blumauer   July 22, 2010 at 8:21 am

    I think that any source which is worth being part of the L(O)D Cloud will be triplified, e.g.: OSM –> http://linkedgeodata.org/OnlineAccess

    In our discussion we should make a difference between technical details and the principles of Linked Data. I don’t think that TED talks should deal a lot with SPARQL endpoints, OWL, Inferencing etc.

  • 3 Juan Sequeda   July 22, 2010 at 8:21 am

    Just a reminder. It’s not fair to compare XML with RDF. That is apple-oranges comparison. XML is a serialization format. RDF is a data model. RDF can be represented in XML, RDFa, Turtle, N3, JSON, etc.

  • 4 Tom Heath   July 22, 2010 at 8:47 am

    Hi Nick,

    I think you’re conflating Linked Data and Open Data. The 2009 TED talk discusses Linked Data and Open Data (relatively distinctly); the 2010 talk is about Open Data (check the title).

    By my counting Linked Data gets one mention (on screen) in the 2010 talk, specifically in the context of some UK Government data that *is* published as RDF – the rest of the talk is about the increased opening of data in the UK, US and elsewhere.

    The excerpt you’ve transcribed doesn’t mention RDF, but it talks about relationships, and these are quite hard to make explicit in formats such as CSV or (non-RDF) XML. Yes, “linked data is all about the links and the relationships”, but most “formats” don’t make these links explicit. This is what RDF is designed to do. I wrote a blog post that explains this in more detail:
    http://tomheath.com/blog/2010/06/why-carry-the-cost-of-linked-data/

    Contrary to your claim, I don’t think TimBL uses OSM as an example of Linked Data success. Open Data success perhaps, but not Linked Data. We need to be absolutely clear that not all Open Data will be linked and not all Linked Data will be open. This is a point which is frequently lost in the tidal wave of commentary.

    Cheers, Tom.

  • 5 Christine Connors   July 22, 2010 at 9:50 am

    Nice pick up Nick. It is true that linked data doesn’t require RDF. It could use SKOS or OWL or XML; Microformats or OpenGraph or RDFa …. However, just as Henry Ford’s assembly lines revolutionized mass production, common frameworks such as these will mean less ‘negotiating’ for accurate linking of like items. The market will decide what is easiest, the producers of content and software will make business decisions – akin to choosing Internet Explorer’s version of rendering websites vs. Firefox’s or Safari’s. Each user will choose, some with more thought than others, which interaction mechanism they prefer – and inherit, with or without explicit knowledge, all of the business decisions that went into building that mechanism. The true battle, I believe, lies in the user interface design and user interaction design – not the format of linked data.

  • 6 Nick Gall   July 22, 2010 at 10:45 am

    Juan,

    Thanks for your comment. I didn’t think I was “comparing” XML with RDF. All I said was “it is XML, but it’s not RDF.” But if I had wanted to be more precise I could have said something like “it’s a data model described by XSD and serialized as XML, i.e., it doesn’t use the RDF data model.”

  • 7 Edward Thomas   July 22, 2010 at 10:49 am

    It seems that you have taken two speeches made by TimBL to lay audiences, and used this to characterise his position on RDF and linked data. In particular, you took a section of the talk out of context and used this to prove your hypothesis.

    I don’t think that RDF is the only possible format for linked data, much as HTML is not the only possible format for the Web (Flash and PDF amongst others have niche uses). The great thing about open data and open, machine readable, data formats is that the format doesn’t matter. This has been shown by projects such as LinkedGeoData which have taken Open Street Map’s data, turned it into RDF, linked it with DBPedia, and provided a RESTful API for accessing the data.

    The Open Government Data efforts have been characterised by the refrain “raw data now”. Of course data does not all magically appear in perfectly structured RDF. The data.gov.* portals have data in a lot of different formats, and many people are involved in the effort to make them more accessible.

  • 8 Nick Gall   July 22, 2010 at 11:33 am

    Tom,

    Thanks for the feedback.

    I think it’s Tim who’s doing the conflating, which is of course my point! :-) But if you want to get into a close reading debate about the talks, I’m game.

    Regarding the 2009 talk, I don’t know how you can claim that the 2009 talk discusses Linked Data and Open Data “relatively distinctly”! Tim never mentions the word “open” in the talk, other than in the name of the openstreetmap.org! (BTW, I can say this with confidence because the TED video page provides a full transcript–see the “open interactive transcript” hotspot in the upper right side of the page.)

    He talks extensively about linked data. In fact, he mentions it as part of the openstreetmap.org example: “It’s about people doing their bit to produce a little bit, and it all connecting. That’s how linked data works.” Given this statement directly in the context of the openstreetmap example, I can’t understand how you can possibly claim that you “don’t think TimBL uses OSM as an example of Linked Data success.”

    As for the 2010 talk, I agree that there is less direct mention of “linked data” (only one as you point out). But Tim clearly positions the talk as a continuation of the 2009 talk, and again he makes no attempt to distinguish between concepts like “linked data”, “raw data”, “community-generated data”, “open data”, etc. In fact, one could interpret the 2010 talk as deprecating the importance of “linked data” entirely since Tim seems to emphasize that the only important thing is to share as much raw data as possible!

    Finally, I don’t understand how you can say that relationships are “quite hard to make explicit in formats such as CSV or (non-RDF) XML.” The use of rel-tag in HTML/XHTML and APP makes this simple, e.g., microformats. As for CSV, one can include relationships (even ones denoted by URLs) in the header line of the file.
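
    A minimal sketch of the CSV idea, assuming a file whose header cells are URLs that name each relationship. The column URLs, the sample rows, and the `rows_to_statements` helper are hypothetical; this is one possible reading of the header-line approach, not an established convention.

```python
import csv
import io

# Hypothetical CSV: each header cell is a URL naming a relationship, and
# cell values may themselves be URLs. All example.org names are assumptions.
CSV_TEXT = """http://example.org/prop/id,http://example.org/prop/name,http://example.org/prop/bornIn
http://example.org/person/alice,Alice,http://example.org/place/berlin
http://example.org/person/bob,Bob,http://example.org/place/paris
"""


def rows_to_statements(text, subject_column):
    """Turn each non-subject cell into a (subject, relationship, value) statement."""
    statements = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row[subject_column]
        for relationship, value in row.items():
            if relationship != subject_column:
                statements.append((subject, relationship, value))
    return statements


for statement in rows_to_statements(CSV_TEXT, "http://example.org/prop/id"):
    print(statement)
```

    Each printed statement pairs a subject URL with a relationship URL and a value, which is the kind of explicit link the header line is meant to carry.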

  • 9 Nick Gall   July 22, 2010 at 11:47 am

    Christine,

    I wholeheartedly agree with everything you say…except the last sentence: “The true battle, I believe, lies in the user interface design and user interaction design – not the format of linked data.” I’ve been around long enough and seen enough examples of simple vs complex IFaPs (Identifiers, Formats, and Protocols) to know that the battle is won or lost by the IFaPs themselves. No amount of “hiding” of the IFaPs, whether it be via packaging, tooling, UI, etc., can save a very complex standard from a simpler one. The most successful IFaPs inevitably start out as ad hoc, inelegant, toy, simple, lightweight, almost trivial (e.g. Ethernet, TCP/IP, URL/HTML/HTTP, RSS, JSON) and then evolve into complex, robust, heavyweight standards. See Gall’s Law. See also Sowa’s corollary.

  • 10 Nick Gall   July 22, 2010 at 11:57 am

    Edward,

    I’m glad we agree on the big picture: “[We] don’t think that RDF is the only possible format for linked data.”

    As for whether my blog post took Tim’s comments out of context, we’ll have to agree to disagree. As I demonstrated in my comment to Tom, I think my use of Tim’s statements is completely IN context.

  • 11 Nick Gall   July 22, 2010 at 12:11 pm

    Andreas,

    Interesting prediction. We’ll see.

    I can’t think of a more principled and less technically detailed description of linked data than

    1. Linked data is all about the links and the relationships—not the format.
    2. Linked data refers to any machine-readable data with URLs pointing to it and URLs pointing out of it

  • 12 Kingsley Idehen   July 22, 2010 at 1:33 pm

    Nick,

    Simple glossary:

    1. RDF based Linked Data — this is Linked Data meme executed specifically using Entity-Attribute-Model variant of RDF and its associated Data Representation formats

    2. HTTP based Linked Data — this is Linked Data meme executed using a standard Entity-Attribute-Data model where the associated representation formats don’t need to be those associated with RDF e.g. OData, GData etc.

    I wrote the Data 3.0 manifesto to separate RDF and Linked Data because there’s absolutely no need for such conflation.

    I use RDFa, RDF, and SPARQL extensively, but they are implementation choices I’ve made re. Linked Data meme execution.

    Links:

    1. http://bit.ly/bmdv5N — Data 3.0 Manifesto Blog Post .

  • 13 Nick Gall   July 22, 2010 at 2:05 pm

    Kingsley,

    I came across your glossary the other day and I think it’s fine. But I prefer a “levels” approach akin to the levels of RESTfulness proposed by Leonard Richardson. It has become quite popular. TimBL proposes something like this in his linked data star system, which he discusses in his recent Gov 2.0 talk (via @janzemanek):
    1 data is made available
    2 machine readable
    3 format is an open standard, not proprietary
    4 linked data format: URL for all things and properties
    5 Declaring equivalence of properties

    It would need to be tightened up of course. But it’s a good start.

  • 14 Kingsley Idehen   July 22, 2010 at 2:39 pm

    Nick,

    Yes, a good start re. Link Data Fidelity :-)

    Some comments though:

    URL for all things is as confusing as URIs for all things. It’s really important that we split the Roles inherent within the URI abstraction when talking about Linked Data.

    1. Use HTTP URIs for Unambiguous Names (i.e. an HTTP URL can be used as a Naming mechanism) that resolve to Structured Descriptor Documents

    2. Use Unambiguous Names as the Subject of aforementioned Structured Descriptor Documents

    3. Also use HTTP URI Names for Attributes and optionally as Reference values (how you point to related Things) within aforementioned Structured Documents

    4. Expose Structured Descriptor Documents published to the Web or other HTTP networks (e.g. Intranets and Extranets) via URLs (Address Role of HTTP URI abstraction).

    Kingsley

  • 15 Nick Gall   July 22, 2010 at 2:52 pm

    Kingsley,

    Couldn’t agree more. At the risk of ambiguity (:-)), I would shorthand 1-3: Use HTTP URLs to name entities, their attributes (aka their relationships), and the values of such attributes.

  • 16 Paul Miller   July 22, 2010 at 3:09 pm

    Nick,

    hear, hear! :-)

    As you may know, this is something I’ve been concerned about for a while, and detailed in two posts last year at http://cloudofdata.com/2009/07/does-linked-data-need-rdf/ (cited by Lorna in the post you reference here) and a follow-up at http://cloudofdata.com/2009/07/more-linked-data-and-rdf/.

    It’s an issue, it seems, that still has legs… :-(

    Paul

  • 17 Paul Miller   July 22, 2010 at 4:11 pm

    It’s perhaps also worth considering, with the greatest of respect for Tim and all that he has done and continues to do for the web, that his opinion as to RDF’s role within Linked Data is simply that; the OPINION of an influential and important individual within a global group of adopters and implementers. His Design Principles note isn’t a W3C Recommendation, or anything of the sort; it’s a personal perspective by an indisputably important individual.

    We are all just expressing opinions here, as part of seeking clarity in an evolving area. I BELIEVE my opinions, but that doesn’t necessarily make them right now, or over the longer term.

  • 18 Tom Heath   July 23, 2010 at 5:40 am

    Nick,

    Irrespective of whether Tim is giving the impression of conflating Linked Data and Open Data (and clearly you and I disagree on this), let’s you and I and others concentrate on raising the level of understanding in the community by reinforcing the complementarity yet distinctness of these two concepts, rather than picking over transcripts that leave significant room for interpretation.

    On the subject of ‘formats’ for Linked Data, I’ll extend to you the same challenge as I extended to Paul Miller: show me a concrete implementation of a ‘format’ for Linked Data, capable of scaling to the size and diversity of the Web, that isn’t RDF, and perhaps then I’ll take the “Linked Data doesn’t need RDF” claims seriously. As yet Paul hasn’t taken me up on my challenge, but if you’re keen let me know and I’ll make explicit my success criteria :)

    Cheers, Tom.

  • 19 Nick Gall   July 23, 2010 at 9:29 am

    Paul,

    Couldn’t agree more. And in my opinion, by far and away the most important aspect of an interface (which can be completely described in terms of IFaPs: Identifiers, Formats, and Protocols) to unify is the identifiers. (BTW, that’s Tim’s opinion too. :-)) The essence of “semantics” is agreement on the meaning of names/identifiers (please let’s not quibble over the distinctions between them). Unifying how you format sets of names and unifying how you exchange such sets via a unified protocol matters FAR less than simply unifying identifiers. So rather than scare those interested in the concept of making data “more like the web” with a set of seemingly daunting standards like RDF and SPARQL, why not just say, “You can begin linking data by simply using URLs for all your reference data (aka standard attribute names and values)”? I find people get the value of that immediately.

    Once you’ve got them hooked on the power of unified identifiers using URLs, it’s easier to convince them of the power of unified formats and protocols.

  • 20 Nick Gall   July 23, 2010 at 9:56 am

    Tom,

    I’d rather focus on getting people enthusiastic about doing BOTH linked data and open data. And quibbling about the one true format is a surefire buzzkill for enthusiasm. Let’s both concentrate on getting people to embrace URLs for all reference data (attributes and values). (See my comments to Paul Miller.)

    I’ll decline your challenge because successful standards rarely start off being scalable. They start off being easy and therefore popular! As Chris Dixon so aptly puts it: “The next big thing always starts out being dismissed as a ‘toy.’”

    Remember John Gall’s Law and Sowa’s corollary (linked to in another comment)! The most successful standards are the ones that are most widely used. Once everyone is using them, then there’s an incentive to scale them. Cf. the World Wide Web.

  • 21 Paul Miller   July 23, 2010 at 10:02 am

    Nick,

    Yup.