I just plunged back into the linked data scene after having been more focused on other topics, like design thinking/hybrid thinking. I was surprised to find that the controversy about whether linked data requires RDF is still raging: When is Linked Data not Linked Data? – A summary of the debate. I assumed this would have been settled by Tim Berners-Lee’s two TED talks on linked data: one last year (Tim Berners-Lee on the next Web) and an update this year (Tim Berners-Lee: The year open data went worldwide).
Tim does not mention RDF at all in either of them. Here is how he defines linked data in his 2009 TED talk:
So I want us now to think about not just two pieces of data being connected, or six like he did, but I want to think about a world where everybody has put data on the web and so virtually everything you can imagine is on the web, and then calling that linked data. The technology is linked data, and it’s extremely simple. If you want to put something on the web there are three rules: first thing is that those HTTP names, those things that start with "http:", we’re using them not just for documents now, we’re using them for things that the documents are about. We’re using them for people, we’re using them for places, we’re using them for your products, we’re using them for events. All kinds of conceptual things, they have names now that start with HTTP.
Second rule, if I take one of these HTTP names and I look it up and I do the web thing with it and I fetch the data using the HTTP protocol from the web, I will get back some data in a standard format which is kind of useful data that somebody might like to know about that thing, about that event. Who’s at the event? Whatever it is about that person, where they were born, things like that. So the second rule is I get important information back.
Third rule is that when I get back that information it’s not just got somebody’s height and weight and when they were born, it’s got relationships. Data is relationships. Interestingly, data is relationships. This person was born in Berlin, Berlin is in Germany. And when it has relationships, whenever it expresses a relationship then the other thing that it’s related to is given one of those names that starts HTTP. So, I can go ahead and look that thing up. So I look up a person; I can look up then the city where they were born; I can look up the region it’s in, and the town it’s in, and the population of it, and so on. So I can browse this stuff.
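To make the three rules concrete, here is a minimal Python sketch of how I read them. It is only an illustration: the `example.org` names, the `bornIn` and `locatedIn` properties, and the in-memory `WEB` dictionary standing in for real HTTP lookups are all my own assumptions, not anything from Tim's talk.

```python
# Toy "web" of linked data: every key is an HTTP name (rule 1).
# All example.org URIs and property names are invented for illustration.
WEB = {
    "http://example.org/person/alice": {
        "name": "Alice",
        "bornIn": "http://example.org/place/berlin",
    },
    "http://example.org/place/berlin": {
        "name": "Berlin",
        "locatedIn": "http://example.org/place/germany",
    },
    "http://example.org/place/germany": {"name": "Germany"},
}


def look_up(uri):
    """Rule 2: looking up a name returns useful data about that thing.

    A real client would issue an HTTP GET here; this stand-in reads a dict.
    """
    return WEB[uri]


def browse(uri, relations):
    """Rule 3: follow relationships whose values are themselves HTTP names."""
    names = [look_up(uri)["name"]]
    for relation in relations:
        uri = look_up(uri)[relation]  # the related thing has its own HTTP name
        names.append(look_up(uri)["name"])
    return names


print(browse("http://example.org/person/alice", ["bornIn", "locatedIn"]))
```

Notice that nothing in the sketch depends on a particular serialization. The browsing works because the values are names that can be looked up.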
Now one might argue that Tim was simply avoiding geek-speak in front of a general-interest audience. But what’s telling is his choice of examples of linked data successes. In particular, he highlights OpenStreetMap (OSM) in both talks. AFAICT, the OSM data format is linked, and it is XML, but it’s not RDF.
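Here is a rough sketch of what I mean by "linked, XML, but not RDF". The fragment below is hand-written in the style of OSM XML, where a `way` points at its `node` elements through `nd ref` attributes. The ids, coordinates, and tag are invented, and the helper `way_coordinates` is a hypothetical name of my own.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the style of OSM XML; ids, coordinates and the
# tag are invented for illustration.
OSM_XML = """
<osm version="0.6">
  <node id="1" lat="52.5200" lon="13.4050"/>
  <node id="2" lat="52.5205" lon="13.4060"/>
  <way id="10">
    <nd ref="1"/>
    <nd ref="2"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>
"""

root = ET.fromstring(OSM_XML.strip())

# Index nodes by id so that the references inside a way can be resolved.
nodes = {
    node.get("id"): (float(node.get("lat")), float(node.get("lon")))
    for node in root.findall("node")
}


def way_coordinates(way_id):
    """Follow each <nd ref="..."> link from a way to the node it points at."""
    way = root.find(f"way[@id='{way_id}']")
    return [nodes[nd.get("ref")] for nd in way.findall("nd")]


print(way_coordinates("10"))
```

The relationships live in plain XML attributes, and a few lines of standard-library code are enough to follow them.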
So if Tim is going to use OSM as a prime example of linked data (actually an example of linked open data), then he’s going to have to open the linked data tent to formats other than RDF. BTW, in the update video this year he also cites examples from both the UK and US open data efforts, many of which I’m sure are not in RDF.
And for those looking for a name for data that does require RDF, SPARQL, et al? How about Semantic Web Data? Don’t get me wrong. I’m not against the full-blown Semantic Web standards per se. I just feel that they fail the critical test of the simplest thing that could possibly work or, as Tim describes it, the Principle of Least Power.
Remember, linked data is all about the links and the relationships—not the format.
So for me at least, linked data refers to any machine-readable data with URLs pointing to it and URLs pointing out of it. It doesn’t get any simpler than that. I think Tim would agree.
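As a sketch of how little that definition demands, the snippet below checks a piece of JSON for outbound URLs. JSON is just one possible machine-readable format here, and the function names `outbound_urls` and `is_linked`, like the `example.org` addresses, are hypothetical names I made up for the illustration.

```python
import json
from urllib.parse import urlparse


def outbound_urls(data):
    """Collect every http(s) URL string found anywhere in parsed JSON data."""
    found = []
    if isinstance(data, dict):
        for value in data.values():
            found.extend(outbound_urls(value))
    elif isinstance(data, list):
        for item in data:
            found.extend(outbound_urls(item))
    elif isinstance(data, str):
        parts = urlparse(data)
        if parts.scheme in ("http", "https") and parts.netloc:
            found.append(data)
    return found


def is_linked(own_url, data):
    """True when data published at own_url points out to some other URL."""
    return any(url != own_url for url in outbound_urls(data))


# Invented document, assumed to be published at the example.org URL below.
doc = json.loads(
    '{"name": "Berlin", "country": "http://example.org/place/germany", "tags": ["city"]}'
)
print(outbound_urls(doc))
print(is_linked("http://example.org/place/berlin", doc))
```

This only tests the "URLs pointing out" half. The "URLs pointing to it" half is covered simply by publishing the data at a URL of its own.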