This post was authored by Rick Greenwald, Merv Adrian, and Donald Feinberg.
Last week, Google launched its internal Cloud Spanner DBMS into public beta. Cloud Spanner claims to be both strongly consistent (like a relational DBMS) and horizontally scalable (like a NoSQL DBMS), and years of internal use have given Google time to exploit the unique physical characteristics of its cloud.
The great weakness of distributed DBMS offerings has been their inability to offer both consistency and performance across broad scale-out architectures. The CAP Theorem demonstrated that any system subject to network partitioning must prefer either availability or consistency and sacrifice the other. Spanner promises to address this problem. One feature relies on Google’s control of its network, which has been optimized to reduce the number and duration of partition events; Google claims to deliver five 9s (99.999%) of availability on its internal network. As with its competitors, such claims will need to be backed by contracted SLAs.
A truly differentiating feature is TrueTime – a Google technology that uses atomic clock-based timestamps at every location to provide synchronized clock time across the world. TrueTime gives Cloud Spanner a crucial piece of information for write activity: the system can absolutely determine the serial order of writes, even across distributed nodes. With this information, consistency issues across nodes can be detected and prevented with a version of two-phase commit. Cloud Spanner uses a central write leader to enforce write consistency.
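The idea behind TrueTime ordering can be sketched as a toy "commit wait" loop. Everything here is illustrative: the 7 ms uncertainty bound, the `tt_now` and `commit_wait` names, and the use of the system clock are assumptions for the sketch, not Google's actual API – the real system derives its uncertainty interval from GPS receivers and atomic clocks in each datacenter.

```python
import time

# Assumed clock-uncertainty bound (epsilon), in seconds. This 7 ms figure
# is hypothetical; TrueTime computes a real bound from its clock hardware.
EPSILON = 0.007

def tt_now():
    """Return a TrueTime-style interval (earliest, latest) around 'now'.

    The true time is guaranteed to lie somewhere inside this interval.
    """
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit_wait(commit_ts):
    """Block until commit_ts is guaranteed to be in the past everywhere.

    Once every node's 'earliest' bound has moved past commit_ts, any later
    write will receive a strictly larger timestamp, so the serial order of
    writes can be determined from timestamps alone.
    """
    while tt_now()[0] <= commit_ts:
        time.sleep(EPSILON / 2)

# A write takes the 'latest' bound as its commit timestamp, then waits out
# the uncertainty before acknowledging the commit to the client.
commit_ts = tt_now()[1]
commit_wait(commit_ts)
assert tt_now()[0] > commit_ts
```

The cost of this scheme is that every write pays a short wait proportional to the clock uncertainty – which is why tight, hardware-backed clock bounds matter so much to Spanner's design.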
It also exploits a widely used form of multi-version concurrency control (MVCC), in which multiple versions of each piece of data are retained. MVCC allows a DBMS to produce consistent reads without read locks. In Cloud Spanner, MVCC also helps provide consistency across distributed replicas: even if a replica has not yet received all updates, it knows the TrueTime timestamp of its last completed update and can perform a fully consistent read as of that time. In its initial release, Cloud Spanner will only support distributed nodes within a region, but there are plans to support cross-region nodes in future releases.
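A minimal sketch can make the MVCC read path concrete. The class and method names below are invented for illustration and the storage is a plain in-memory dict, not Spanner's actual structures; the point is only that a reader picks a timestamp and sees the latest version at or before it, with no locks taken.

```python
class MVCCStore:
    """Toy multi-version store: each key maps to a list of
    (timestamp, value) versions, appended in timestamp order."""

    def __init__(self):
        self.versions = {}        # key -> [(ts, value), ...]
        self.last_applied_ts = 0  # highest update timestamp this replica has seen

    def write(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))
        self.last_applied_ts = max(self.last_applied_ts, ts)

    def read_at(self, key, ts):
        """Return the latest value with timestamp <= ts, without locking.

        Readers never block writers: a newer version may be appended
        concurrently, but it has a later timestamp and is simply ignored.
        """
        for version_ts, value in reversed(self.versions.get(key, [])):
            if version_ts <= ts:
                return value
        return None  # key did not exist as of ts

store = MVCCStore()
store.write("balance", 100, ts=10)
store.write("balance", 80, ts=20)
store.read_at("balance", ts=15)                   # -> 100 (snapshot at 15)
store.read_at("balance", store.last_applied_ts)   # -> 80  (latest consistent read)
```

This mirrors the replica behavior described above: a node that has applied everything up to `last_applied_ts` can serve a fully consistent read as of that timestamp, even if later updates are still in flight.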
Will systems using Cloud Spanner automatically be consistent? No more than with any other DBMS – when a scenario arises that could result in inconsistent data, developers will have to build in logic to handle the inconsistency. But it does promise to at least offer a globally distributed system where consistency can be guaranteed – which is no small feat. Our recent Gartner research note Data Consistency Flaws Can Destroy the Value of Your Data shows why you cannot be assured of ever getting consistency back once it is lost. Although this full consistency is not required for all systems, when a system needs consistency, there are no shortcuts – only wishful illusions.
Some important capabilities are lacking. Migration tools, like those Amazon has leveraged in its successful assault on the DBMS market, are absent, suggesting that most usage is likely to come from new applications. Data integration is not yet available, limiting hybrid scenarios involving on-premises data. Finally, Cloud Spanner does not use standard ANSI SQL, which will impact database migrations and slow both integration and adoption by existing tools. The messaging needs a little work as well: the competitive table on the website suggests that RDBMSs lack high availability and failover, and that nonrelational DBMSs lack schemas. Such claims don’t strengthen credibility.
As Cloud Spanner approaches general availability, it will need proof points from long-term production use in the wild to be considered for mainstream deployment. But this cloud DBMS service offers some new possibilities and bears watching.