On June 9, Google Labs announced Google Fusion Tables, a new system for managing data in the Google cloud. I want to be clear about one point: this is an experiment from Google Research, not something ready for production systems (Google is clear about this as well). My issue is with how the press has exaggerated the announcement, warning Database Management System (DBMS) vendors to watch out because they are being blindsided by Google. You must be kidding!
First, what is Fusion Tables? It is a system for managing data in the cloud that lets people collaborate on data from disparate sources in a simple way, including the ability to “drill down” to the sources of the data. It allows the user to “join” data (in a loose sense of the word) without the data-model constraints normally found in a relational DBMS. What it is not is a DBMS for managing the data behind an On-Line Transaction Processing (OLTP) system or a Data Warehouse. Fusion Tables is based on Data Spaces, defined in Wikipedia as “a container for domain specific data” and further “A Data Space system is a multi-model data management system that manages data sourced from a variety of local or external sources”. Data Spaces were originally defined in the early 1990s, during the Object Oriented DBMS (OODBMS) era.
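To make the idea of a "loose" join concrete, here is a minimal Python sketch. It is not Fusion Tables code and does not use any Google API; the function name `loose_join`, the source names and the sample rows are all hypothetical, chosen only for illustration. It merges rows from sources that share no schema, matching on a single key, and keeps track of where each value came from so that a reader could drill down to the origin.

```python
from collections import defaultdict


def loose_join(sources, key):
    """Merge rows from several named sources on a shared key.

    No common schema is required: each source may carry different columns.
    Every merged value remembers which source supplied it, which is a
    simple stand-in for drilling down to the origin of the data.
    """
    merged = defaultdict(dict)
    for source_name, rows in sources.items():
        for row in rows:
            if key not in row:
                continue  # rows lacking the key cannot be matched; skip them
            for column, value in row.items():
                if column == key:
                    continue
                # Keep every (value, source) pair rather than overwriting.
                merged[row[key]].setdefault(column, []).append((value, source_name))
    return dict(merged)


# Hypothetical sample data standing in for a spreadsheet and a CSV export.
sources = {
    "sales_spreadsheet": [
        {"country": "France", "revenue": 120},
        {"country": "Japan", "revenue": 95},
    ],
    "census_csv": [
        {"country": "France", "population_millions": 64},
        {"country": "Brazil", "population_millions": 191},
    ],
}

combined = loose_join(sources, key="country")
```

Note what is missing here compared with a relational join: there are no declared types, no foreign keys and no requirement that every key appear in every source. That is convenient for ad hoc analysis, and it is also why this style of merging is no substitute for the integrity guarantees an OLTP system depends on.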
As with many new ideas, there are elements of the technology that may have value. When this happens, we find that the original relational model evolves to incorporate the new technology or model. We saw this occur with OODBMS: the modern DBMS does use inheritance and user-defined classes. We saw this happen with XML: the modern DBMS now supports native XML as a data type, as robust as the original pure-play XML DBMSs. Today we are seeing this happen with MapReduce, as several DBMS vendors have incorporated it into their DBMS engines. We will also see this happen with the column-store construct, which we believe will be incorporated into many modern DBMS engines as an indexing technique for optimization. As to the validity of Fusion Tables and the ability to mix disparate data sources and types, there is little question about its usefulness. Oracle already offers such a capability in its current release (11g) as SecureFiles, and Microsoft SQL Server 2008 has a feature called FILESTREAM. These are not experimental or beta-test features; they are implemented in full production releases.
Is Fusion Tables worth watching? Of course! The concept of easily combining disparate sources of data for analysis and collaboration is important and has been around since the inception of IT. Mashups and other Web 2.0 constructs have made some of this available today (see The Rise of Collaborative Decision Making). Google has made a good start on this: the initial version of Fusion Tables can use data from Google Apps and other spreadsheet-style data. Organizations must take care, or these types of applications will cause additional turmoil in the governance and security space (see Developing a Strategy for Dealing With Desktop Database Management System Proliferation). Will this technology replace your DBMS for OLTP and data warehouse (DW) systems? Not soon, and not in the future. Many have tried (e.g., OODBMS). There are other new techniques and systems being researched today that have promise (e.g., Akiba); however, the relational model has continued to demonstrate flexibility and resiliency for over 30 years, and you can expect that to continue. Products like DB2, Informix, Ingres, MySQL, Oracle, PostgreSQL, SQL Server and Sybase ASE will be used in new IT systems for many years to come.