by Joerg Fritsch | January 27, 2014
Another movie that I like very much is “Minority Report” from 2002, made roughly half a decade before AWS EC2 opened its portals. In the film, the government keeps three frail people (the so-called “pre-cogs”) in a sort of vegetative state and uses them to forecast crimes before they actually happen. Citizens are arrested based on these forecasts. Now, it turns out that the forecasts are only “eventually consistent” (see where I am going?): from time to time there is a minority report, in which one of the pre-cogs claims that reality is not quite what her two friends say it is. Needless to say, the cops discard these minority reports and go on arresting as usual.
If you have never heard of eventually consistent data stores: the similarity between the plot of the film and the real world is that distributed key/value stores and NoSQL data stores are prone to disagreements about which version of a data set is the right one. Frequently this is solved by an API that exposes either vector clocks or a means for the end user to define their own rules for how the data should eventually converge. If you have conflicting data sets on 50 servers, for example after a power outage, you can decide whether the version held by the majority of servers is “true” or whether the version from the server that replies fastest is considered temporarily “true”. Temporarily, because after that server has sent you the minority reply, its siblings may just go ahead and overwrite its data set to make things look more consistent.
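To make the two ideas a bit more tangible, here is a minimal sketch in Python. It is not the API of any particular data store; the names `compare`, `Reply`, `resolve_majority` and `resolve_fastest` are made up for illustration. The first function shows how two vector clocks can be compared to detect a genuine conflict, the other two show the "majority wins" and "fastest reply wins" rules from the paragraph above.

```python
from collections import Counter
from typing import Dict, List, NamedTuple

# A vector clock maps a node name to the number of updates seen from that node.
VectorClock = Dict[str, int]


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'equal', 'before', 'after' or 'concurrent' for clock a relative to b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a is an ancestor of b, so b can safely replace a
    if b_le_a:
        return "after"       # b is an ancestor of a
    return "concurrent"      # neither descends from the other: a real conflict


class Reply(NamedTuple):
    """Hypothetical reply from one replica."""
    server: str
    value: str
    latency_ms: float


def resolve_majority(replies: List[Reply]) -> str:
    """The value held by the most replicas is treated as 'true'.

    Ties are not handled specially in this sketch.
    """
    counts = Counter(r.value for r in replies)
    return counts.most_common(1)[0][0]


def resolve_fastest(replies: List[Reply]) -> str:
    """The value from the quickest replica is treated as (temporarily) 'true'."""
    return min(replies, key=lambda r: r.latency_ms).value


if __name__ == "__main__":
    replies = [
        Reply("s1", "v2", 12.0),
        Reply("s2", "v2", 15.0),
        Reply("s3", "v1", 3.0),   # the "minority report", but it answers first
    ]
    print(compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 2}))
    print(resolve_majority(replies))
    print(resolve_fastest(replies))
```

With these sample replies the two rules disagree: the majority holds one version while the fastest server returns the other, which is exactly the situation in which a client has to pick a rule.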
At first I wondered how this could ever work acceptably and what the implications for data security might be. Did certified heartbeat cables for database file systems, which gave us the desired ACID guarantees, not cost as much as a Volkswagen a decade ago? Do we not spend considerable effort in C or MPI to prevent race conditions on data (small data, of course)?
One possible answer is that these systems give the right answer most of the time, but do so with much simpler and cheaper means. We are slowly turning away from an IT in which we insured ourselves against every inconsistency, failure and single point of failure, toward an IT that accepts that it will take a loss and recover. “The ability to avoid loss never makes up for the ability to absorb loss.” (Dan Geer, 1998, in “Risk Management is Where the Money Is,” http://catless.ncl.ac.uk/Risks/20.06.html). Or, as Mario Andretti (one of the most successful US racing drivers) told us: “If everything seems under control you are just not going fast enough”.