Google is about to get a whole lot more useful. Yesterday, the search titan announced the “Knowledge Graph,” a functional enhancement that attempts to provide actual information about the subject of your query rather than just a list of links. This might be helpful, but the really interesting bit is the part about the graph. As Google SVP Amit Singhal put it in his blog post:
“The Knowledge Graph also helps us understand the relationships between things. Marie Curie is a person in the Knowledge Graph, and she had two children, one of whom also won a Nobel Prize, as well as a husband, Pierre Curie, who claimed a third Nobel Prize for the family. All of these are linked in our graph. It’s not just a catalog of objects; it also models all these inter-relationships. It’s the intelligence between these different entities that’s the key.”
That’s what a graph is, a structured set of meaningful relationships. The great challenge of the web is to bring some sort of useful order to the chaos of available online resources. Search is pretty good at finding stuff, but does little to show how things relate to each other. I am likely to miss huge swaths of useful information just because I don’t know enough to ask the right questions. I need a guide, something like a knowledgeable clerk in a bookstore or a good librarian who can point me to important titles and authors I would have otherwise missed. This is what Google is attempting to provide with the Knowledge Graph. Not just the answer to what you asked, but also the answers to the questions you probably should have asked. They are linking information together in a meaningful way and presenting the integrated results to the user. Pretty neat trick. Of course, the dirty little secret of the Knowledge Graph is that you don’t need to be Google to create one. You just need to know a little about how the Semantic Web works.
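To make the idea concrete, here is a minimal sketch of the Marie Curie example as a graph, using nothing but plain Python. The relationship names (`is_a`, `spouse`, `child`, `won`) and the helper functions are made up for illustration; this is not Google's schema or implementation, just one simple way to store facts as subject-relationship-object triples and follow the links between them.

```python
from collections import defaultdict

# Each fact is a (subject, relationship, object) triple.
# The relationship names are assumptions for this sketch, not a real schema.
TRIPLES = [
    ("Marie Curie", "is_a", "Person"),
    ("Marie Curie", "spouse", "Pierre Curie"),
    ("Marie Curie", "child", "Irène Joliot-Curie"),
    ("Marie Curie", "child", "Ève Curie"),
    ("Marie Curie", "won", "Nobel Prize"),
    ("Pierre Curie", "won", "Nobel Prize"),
    ("Irène Joliot-Curie", "won", "Nobel Prize"),
]


def build_index(triples):
    """Index triples as subject -> relationship -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, relationship, obj in triples:
        index[subject][relationship].append(obj)
    return index


def related(index, subject):
    """Return everything directly linked to a subject, grouped by relationship."""
    return {rel: list(objs) for rel, objs in index.get(subject, {}).items()}


def family_laureates(index, person):
    """Follow spouse and child links, keeping relatives who won a Nobel Prize."""
    relatives = index[person]["spouse"] + index[person]["child"]
    return [r for r in relatives if "Nobel Prize" in index[r]["won"]]


INDEX = build_index(TRIPLES)
print(related(INDEX, "Marie Curie"))
print(family_laureates(INDEX, "Marie Curie"))
```

The second query is the interesting one: it answers a question ("who else in this family won a Nobel Prize?") that a keyword search would not surface unless you already knew to ask it.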
A couple of years ago, Google purchased a company called Metaweb. As part of the deal, Google took ownership of Freebase, a massive public database of Linked Open Data: data that is structured in a semantically meaningful way and linked to other useful information. In other words, Freebase was a huge graph of knowledge available to the public, one of many. With a few tools, some semantic know-how and a bit of elbow grease, you could create your own knowledge graph that integrated these public sources with your own internal, proprietary data. The biotech and intelligence industries have been doing it for years.
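As a rough illustration of that integration step, the sketch below merges a tiny "public" graph with a tiny "internal" one. Everything in it is hypothetical: the `ex:` and `corp:` identifiers, the relationship names, and the internal report are invented for the example, and a real project would more likely use an RDF toolkit than Python sets. The point is only that when two sources share an identifier, combining them is little more than a union.

```python
# Hypothetical public triples, standing in for a Linked Open Data source.
PUBLIC = {
    ("ex:Marie_Curie", "ex:spouse", "ex:Pierre_Curie"),
    ("ex:Marie_Curie", "ex:child", "ex:Irene_Joliot-Curie"),
}

# Hypothetical internal triples, standing in for proprietary enterprise data.
INTERNAL = {
    ("corp:report-17", "corp:mentions", "ex:Marie_Curie"),
    ("corp:report-17", "corp:owner", "corp:research-team"),
}


def merge(*graphs):
    """Union several sets of triples into a single graph."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def about(graph, node):
    """Return every triple in which the node is the subject or the object."""
    return sorted(t for t in graph if t[0] == node or t[2] == node)


MERGED = merge(PUBLIC, INTERNAL)
for triple in about(MERGED, "ex:Marie_Curie"):
    print(triple)
```

Because both sources refer to the same identifier (`ex:Marie_Curie`), a single lookup on the merged graph returns the public facts alongside the internal document that mentions her.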
Google mentions Freebase in passing, but otherwise doesn’t say much about the semantic sources they are leveraging. I think this is the result of a couple of trends in the semantic realm. Last year I wrote a document for Gartner entitled “Finding Meaning in the Enterprise: A Semantic Web and Linked Data Primer.” In a section on the future of the Semantic Web, I said:
“Semantic technology vendors … are beginning to learn that their customers don’t want to hear about ontologies, inference rules, and other nuances of the semantic technologies underlying their products. … As a result of this dynamic, semantic technologies are being absorbed into the platform and hidden from users. This trend will continue as more and more platforms add semantic capabilities and adopt semantic standards.”
When published, this document was received with the deafening sound of … crickets. I shouldn’t have been surprised. Unless you are an information science geek, it can be hard to relate to this stuff. One vendor recently reported that, during a meeting with a potential customer, “the client put a hat in the middle of the table and said that anyone who used the word ‘ontology’ would have to put a dollar into it.” Google understands this and is using it to its advantage, and potentially to our disadvantage.
The Knowledge Graph is not on a par with PageRank and the rest of the Google secret sauce. While they have certainly invested a lot of resources and brain sweat in the Knowledge Graph, Google didn’t invent Linked Data and certainly didn’t create the vast majority of the information they are exposing. Linked Open Data is a public resource created by countless hours of effort from anonymous stewards. Acknowledging that contribution would not only be respectful, it would incentivize the creation of even more Linked Data, which would in turn make the Knowledge Graph even more powerful and valuable. The potential for a virtuous cycle is being missed here. Google has done a tremendous service in exposing some Linked Data to the end user. They could do a much greater service if they exposed it as a SPARQL endpoint. Somehow I don’t expect it to show up in the Google API anytime soon.
I’ve expressed concern over the privatization of the semantic web before. I don’t think this is quite the same thing. Maybe this is more of a “don’t show us how the sausage is made” dynamic. It’s hard to blame Google for letting people assume the Knowledge Graph is more of their magic. But if IT leaders and practitioners continue to think they can’t do this stuff because they aren’t Google, opportunities are going to be missed. In fact, they already are. I find it ironic that one of the objections raised to the Semantic Web is that it all sounds too much like science fiction. In his blog post, Singhal hails the Knowledge Graph as Google’s first baby step towards the Star Trek computer. If we don’t start to step up, when that computer eventually materializes it will be ad-driven. We need to get more comfortable with semantic technologies and with bringing them into the enterprise. The more Linked Open Data available, the more powerful the graph becomes for all of us. It’s time to get more involved or, as Jean-Luc Picard might say, “Engage!”