I feel the need to checkpoint this. Life is getting confusing.
The most important contribution that Morris makes, in my mind, is the claim that there are four models of scale-out RDBMS: Shared Disk, Shared Nothing, Synchronous Commit and their own Durable Distributed Cache, invented (or maybe substantiated) by Jim Starkey.
Unsurprisingly, Morris’ third article, extolling the superiority of what he has to sell, does not, as far as I can see, describe how the consistency property is met. I need to re-read the MVCC part of the article. MVCC is based on a file/item append model. MVCC obviates locks (How?) and thus removes a massive part of the seriality of a DBMS, which is good because not only do we have Brewer’s Theorem to deal with, but also Amdahl’s Law. The unanswered question for me is this: how does the relevant cache partition ensure that the page copy it gets from a remote node is the most recent and does not need to be locked for update? He states that the relationships between nodes are asynchronous, so we are back to eventually consistent, it would seem.
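To pin down the "How?" for myself, here is a minimal single-process sketch of the append idea behind MVCC. It is not taken from Morris' article and is not how NuoDB works internally; the class name `MVCCStore`, its methods and the integer timestamps are all my own assumptions for illustration. Writers append a new version rather than overwriting, and a reader picks the newest version at or before its snapshot, so a read never has to wait on a lock.

```python
class MVCCStore:
    """Toy multi-version store (hypothetical; illustration only)."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # single logical clock; assumes one process

    def _tick(self):
        self.clock += 1
        return self.clock

    def begin(self):
        """Start a reader: returns the snapshot timestamp it will read at."""
        return self._tick()

    def write(self, key, value):
        """Append a new version instead of updating the old one in place."""
        ts = self._tick()
        self.versions.setdefault(key, []).append((ts, value))
        return ts

    def read(self, key, snapshot):
        """Return the newest value committed at or before the snapshot."""
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= snapshot:
                return value
        return None  # key did not exist as of this snapshot


store = MVCCStore()
store.write("balance", 100)
snapshot = store.begin()        # reader starts here
store.write("balance", 80)      # a later writer appends, it does not block
old_view = store.read("balance", snapshot)
new_view = store.read("balance", store.begin())
```

The reader that started before the second write still sees its own consistent view, while a later reader sees the new value. The sketch deliberately ignores write-write conflicts and, more to the point, says nothing about how a remote node would know it holds the most recent version, which is exactly the unanswered question above.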
From Morris’ article we learn that NuoDB (like MarkLogic?), and in fact like MySQL, where Starkey worked for a while, consists of a Transaction Engine and a Storage Manager entity.
Morris mentions Google F1, which is used to support their ad keywords database. It is based on Google’s Spanner, which seems to be pretty much their answer to the CAP theorem. We’ll have to see what the latency cost is like but, being Google, it may not be released as open source.
Morris’ article does not reference Brewer’s CAP theorem. I have collected the following links tagged Brewer,
At some stage I found the proof that turned the CAP conjecture into a theorem; I think the Barnes article above references it.
Can we break Brewer’s theorem?
I need a definition of Consistent, Available and Partition Tolerant that is accessible to me personally. (The first two are easy.) The Wikipedia entry, CAP Theorem, has a pretty good set of definitions:
In theoretical computer science, the CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
It’s likely, I suppose, that we might engineer things so that the failing condition is so trivial it can be ignored.
The commonest compromise is between availability and consistency, although eventual consistency is a relatively modern construction.
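A rough sketch of that compromise, as I understand it, for two copies of a single value. The `Replica` class, the "CP"/"AP" mode labels and the `write` helper are hypothetical names of my own, not anything from a real product. When the follower is cut off, it either refuses to answer (keeping consistency, giving up availability) or answers with whatever it last saw (keeping availability, accepting a possibly stale read).

```python
class Replica:
    """One copy of a single value; `mode` is a hypothetical "CP" or "AP" label."""

    def __init__(self, mode):
        self.mode = mode
        self.value = None
        self.partitioned = False  # True while cut off from its peer

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency first: refuse rather than risk a stale answer.
            raise RuntimeError("unavailable during partition")
        # Availability first (or no partition): answer with what we have.
        return self.value


def write(primary, follower, value):
    """Apply a write on the primary and copy it across if the link is up."""
    primary.value = value
    if not follower.partitioned:
        follower.value = value


def stale_read_demo(mode):
    """Write, partition the follower, write again, then read the follower."""
    primary, follower = Replica(mode), Replica(mode)
    write(primary, follower, "v1")
    follower.partitioned = True
    write(primary, follower, "v2")  # never reaches the follower
    return follower.read()
```

Calling `stale_read_demo("AP")` returns the old value, while `stale_read_demo("CP")` raises instead of answering. Eventual consistency is then the promise that, once the partition heals, the follower catches up; that healing step is not modelled here.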
Shared disk clusters engineered for HA on a fail-fast-and-recover algorithm are a solution that fails the Availability requirement of the CAP theorem, although they have a zero RPO and can have relatively short RTOs.
The Jim Starkey Wikipedia article references a 2012 patent for “A multi-user, elastic, on-demand, distributed relational database management system.” We’ll see? This is probably one of the patents that protect the NuoDB products.
The NHS have decided to replace Oracle with Riak for the “spine”. Riak claims partition tolerance and availability.
http://www.aerospike.com/ is another high-performance, scale-out database.