Apologies for the provocative headline, but I just couldn’t resist. Actually, there is no single secret sauce, but rather multiple key ingredients of varying degrees of obscurity.
Google Wave is a fascinating melange of audacity and hope. Google has blended multiple powerful sauces into this melange, some secret and others less so. One in particular that was quite obscure to me is operational transformation (OT) theory, the algorithmic framework that enables multiple people to edit a single document in real time across a wide-area network with unpredictable latency. More about this in a moment.
As I stated in last week’s blog entry, written minutes after the keynote ended at the Google I/O conference, Wave has surprisingly broad scope, cutting across formerly separate application categories like email, blogging, wikis and instant messaging. My initial reaction was colored by an instinctive reflex of cynicism, and basically amounted to: Yes, it’s very cool and innovative, but what has Google done for the enterprise lately?
After a healthy debate with Gartner colleagues, spanning a range of views pro and con, I reviewed the Wave video and the documentation, and felt greater excitement than I did during the keynote (where I was one of the few sitting down during the standing ovation). I won’t use this post to make one of those forecasts, such as “Google Wave will kill X”, where X can be any number of well-known vendors or products. That kind of statement is overly glib, because we are just a few days into a scenario that will take 5 years or more to play out, with many twists and turns along the way.
Instead, I want to first draw your attention to some of the enabling technologies that represent the key ingredients behind Wave, and then talk about OT.
Key enabling technologies include:
- GWT Ajax library, which not only powers complex interactions on a browser, but also provides a reasonably tuned rendering for small form-factor mobile devices, such as Android and iPhone
- XMPP protocol that provides a foundation for the Wave Federation Protocol
- the real-time keystroke-by-keystroke communication pioneered several years ago in Google Suggest, and now in production in a high-scalability deployment
- the AppEngine cloud computing platform on which are hosted the Robot extensions to Wave
- the Big Table data persistence mechanism that powers Google’s implementation of a Wave server (something which is not required by the spec, but which facilitates developer productivity and production scalability)
Note that these are only enabling technologies, and Wave is a large collection of software built using these enabling technologies by a hundred-person team over the past two years. That system is currently in closed beta, so the best we can do is circle around it and measure where we can.
All of the above items have been discussed over the past months or years in industry circles, so none of these qualifies as a “secret sauce”. Imho, if there is a secret sauce, it is Google’s extensions to operational transformation theory (OT).
OT is an area of computer science that spans decades, but is nevertheless obscure. The name is derived from the on-the-fly transformation of operations (in the case of a word processing document, the operations would be things like inserting or deleting text) to enable multiple independent reads and writes without locking. One phrase that comes up again and again in the literature is that this is an “optimistic” approach: not in the everyday sense of audacity and hope, but in a technical sense where each parallel repository of content takes its own path to an eventually consistent state.
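To make the idea of transforming operations concrete, here is a minimal sketch of my own in Python. It is not Google's algorithm or any published one: the `Insert` and `Delete` types, the `site` tie-breaker and the `transform` function are all names I made up for illustration, and the sketch only handles two sites, each issuing one operation against the same starting text.

```python
from dataclasses import dataclass, replace
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text is inserted
    text: str
    site: int   # hypothetical site id, used only to break ties


@dataclass(frozen=True)
class Delete:
    pos: int    # index of the single character to delete
    site: int


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    """Apply one operation to a string; None means 'nothing left to do'."""
    if op is None:
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(a: Op, b: Op) -> Optional[Op]:
    """Rewrite a so that it can be applied after b and still mean the same thing."""
    if isinstance(b, Insert):
        if isinstance(a, Insert):
            # Two inserts at the same index: the lower site id goes first.
            a_first = a.pos < b.pos or (a.pos == b.pos and a.site < b.site)
        else:
            a_first = a.pos < b.pos
        return a if a_first else replace(a, pos=a.pos + len(b.text))
    # From here on, b is a Delete.
    if isinstance(a, Insert):
        return a if a.pos <= b.pos else replace(a, pos=a.pos - 1)
    if a.pos == b.pos:
        return None  # both sites deleted the same character
    return a if a.pos < b.pos else replace(a, pos=a.pos - 1)


doc = "abc"
a = Insert(1, "X", site=1)   # site 1 inserts "X" before "b"
b = Delete(1, site=2)        # site 2 concurrently deletes "b"

# Each site applies its own operation first, then the transformed remote one.
site1 = apply(apply(doc, a), transform(b, a))
site2 = apply(apply(doc, b), transform(a, b))
print(site1, site2)  # aXc aXc
```

Neither site waits for a lock: each applies its own edit immediately and adjusts the other's edit when it arrives, which is the "optimistic" part. With three or more sites the bookkeeping gets considerably harder, and this sketch does not attempt it.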
So this Saturday evening, instead of going out to the movies, I decided to entertain my curiosity by looking further into OT. Here are some tidbits I found while scanning an area that is new to me. It is possible I have missed major landmarks and garbled some of the concepts in this rich and fascinating area. Like a tourist in an unfamiliar realm, I am enjoying the sights even though their significance might not be fully apparent.
Here is a simple illustrative sketch of OT, from well-known researcher Dr Chengzheng Sun, that hopefully will become clearer by the end of this blog post:
Seminal work that preceded OT was done by Leslie Lamport, one of the giants in computer science, in 1978 in a classic paper called “Time, Clocks, and the Ordering of Events in a Distributed System”. This paper presented a fundamental algorithm for distributed computing based on distributed state machines and logical clocks (timestamps), an idea that later researchers extended into vector clocks. Lamport later said that his algorithm was based on his understanding of Special Relativity, which “teaches us that there is no invariant total ordering of events in space-time; different observers can disagree about which of two events happened first. There is only a partial order in which an event e1 precedes an event e2 iff e1 can causally affect e2.”
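As far as I can tell, the timestamp rule at the heart of that paper is small enough to sketch in a few lines of Python. The class and method names below are my own invention, not Lamport's notation: each process keeps a counter, bumps it on every local event, stamps outgoing messages with it, and on receipt jumps ahead of whatever timestamp the message carried.

```python
class LamportClock:
    """A logical clock for one process (hypothetical names, for illustration)."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Any local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the message carries the new timestamp.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Move past both our own clock and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
p.tick()                    # p does some local work      -> p is at 1
sent = p.send()             # p sends a message           -> stamped 2
received = q.receive(sent)  # q was at 0, jumps past 2    -> q is at 3
print(sent, received)       # 2 3
```

The guarantee runs in one direction only: if one event can causally affect another, it gets the smaller timestamp. Two unrelated events can still end up with timestamps in either order, which is the partial order Lamport describes.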
In an interesting side note, Lamport works for Microsoft Research these days.
Here is a simplified version of Lamport’s algorithm for distributed mutual exclusion, from a paper by RR Hoogerwoord in 2002.
Lamport’s work is generally applicable across many aspects of computer science, and is referenced in particular by researchers in the field known as group editing. A pioneering work in collaborative authoring systems was the 1989 paper by C.A. Ellis and S.E. Gibbs on “Concurrency Control in Groupware Systems”. There is a whole community of researchers who have been working in this field, but Dr. Clarence Ellis is one of the pioneers. In an interesting side note, in 1969 Ellis became the first African-American to get a PhD in Computer Science (see his photo from that era).
Later on, in 1998, there was an important paper by Chengzheng Sun and Clarence Ellis on “Operational transformation in real-time group editors”. This paper was followed by further work by Chengzheng Sun at Griffith University in Australia. (Is it a coincidence that the Google Wave team is based there?)
One of Dr Sun’s projects is CoWord, an attempt to retrofit Microsoft Word with real-time collaborative editing features (like those demonstrated by Google). CoWord seeks
“to apply the state-of-the-art collaborative editing technologies to widely used single-user commercial word processors in a transparent way, i.e., without modifying the source code of single-user applications. MS Word has been chosen as the first target of this application. A collaborative Word (named as CoWord) is being built in our lab, which not only retains existing MS Word features, but also includes new features that enable multiple users to perform and undo MS Word editing operations on the same MS Word document concurrently and consistently.”
Interestingly, Sun presented his work in a lecture to Microsoft Research in 2003. (Also worth noting is that Sun recently moved to Nanyang Technological University in Singapore.)
The above are only a couple of representative pieces in a swath of research in this field. My understanding is that Google has built on this work and come up with their own innovations, which I know about only as a distant observer. One of those innovations, I understand, has to do with maintaining a representation of a Wave as a tree structure in the browser, and keeping this structure in sync with the copies on the server and on other machines. If that is the case, I ran across a patent issued in 2005 that seems to be directly relevant: “Browser to browser, DOM-based, peer-to-peer communication with delta synchronization”:
“The present invention solves the complexity of integration of Web browsing, Web authoring and Instant Messaging, and offers the uniqueness of browser-based rich media manipulation and synchronization of Web contents among participating peers. The advantages are exemplified in the combination of the three…”
I’ll stop here to let the patent lawyers fire up their engines.
One last, possibly unrelated, landmark in my Saturday night tour of OT is a variant approach which seeks the same goal as OT: to maintain consistency of shared data as it is manipulated by distributed authors in real time. The authors have come up with an approach that does not have to use vector clocks and is supposedly simpler and more efficient than classic OT. Because this approach is “With Out Operational Transform”, the authors chose the charming name of WOOT.
Updated (6/5/09): Corrected a typo (pointed out by Shantanu in comments).