WordPress Database Schema

While working on Nulis, and while attempting to solve some other problems, I began to need to document the contents of WordPress databases. The first problem I sought to solve was building a visual map of the site. This requires understanding the schema and extracting the table contents, which in turn requires the login parameters and a connection route. The second problem involves creating a new template, which will need its SQL designed and encapsulated in PHP.

RDBMS theory

I feel the need to checkpoint this. Life is getting confusing.

Barry Morris, CEO of NuoDB, has written a series of articles about the “Holy Grail”, which he published at the Cloud Computing Journal and somewhere within the NuoDB site.

The most important contribution that Morris makes, in my mind, is that there are four models of scale-out RDBMS: Shared Disk, Shared Nothing, Synchronous Commit and their own Durable Distributed Cache, invented (or maybe substantiated) by Jim Starkey.

Unsurprisingly, Morris’ third article, extolling the superiority of what he has to sell, does not, as far as I can see, describe how the consistency property is met. I need to re-read the MVCC part of the article. MVCC is based on a file/item append model. MVCC obviates locks (How?) and thus removes a massive part of the seriality of a DBMS, which is good because we have not only Brewer’s theorem to deal with but also Amdahl’s Law. The unanswered question, to me, is how the relevant cache partition ensures that the page copy it gets from a remote node is the most recent and does not need to be locked for update. He states that the relationships between nodes are asynchronous, so we are back to eventually consistent, it would seem.
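On the “(How?)” above, here is a minimal sketch of the general append-style MVCC idea, as I understand it. It is not NuoDB's implementation: the class name VersionedStore, its methods and the single logical clock are all hypothetical, and it runs in one process with no distribution. Each write appends a new version rather than overwriting, so a reader working from an older snapshot never needs a read lock.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class VersionedStore:
    """Toy append-only MVCC store (hypothetical, single node)."""

    # key -> list of (commit_timestamp, value), appended in commit order
    versions: Dict[str, List[Tuple[int, Any]]] = field(default_factory=dict)
    clock: int = 0  # logical commit counter, an assumption of this sketch

    def begin(self) -> int:
        """Start a reader; its snapshot is the current commit counter."""
        return self.clock

    def write(self, key: str, value: Any) -> int:
        """Append a new version instead of overwriting the old one."""
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read(self, key: str, snapshot: int) -> Optional[Any]:
        """Return the newest version committed at or before the snapshot."""
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= snapshot:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
snap = store.begin()          # a reader takes its snapshot here
store.write("balance", 250)   # a later write does not disturb that reader
old_view = store.read("balance", snap)
new_view = store.read("balance", store.begin())
```

The reader holding snap keeps seeing the version that existed when it started, while a new reader sees the later one. This only shows why readers need not block writers; two concurrent writers to the same item still need some form of conflict detection, and nothing here answers the question of how a remote node knows its copy is the most recent.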

From Morris’ article we learn that NuoDB (like MarkLogic?), and in fact like MySQL, where Starkey worked for a while, consists of a Transaction Engine and a Storage Manager entity.

Morris mentions Google F1, which is used to support their ad keywords database. It is based on Google’s Spanner, which seems pretty much their answer to the CAP theorem. We’ll have to see what the latency cost is like, but being Google it may not be publicly open source.

Morris’ article does not reference Brewer’s CAP theorem. I have collected the following links tagged Brewer,

At some stage I found the proof that the CAP conjecture is in fact a theorem; I think the Barnes article above references it.

Can we break Brewer’s theorem?

I need a personally accessible definition of Consistent, Available and Partition Tolerant. (The first two are easy.) The Wikipedia entry, CAP Theorem, has a pretty good set of definitions:

In theoretical computer science, the CAP theorem, also known as Brewer’s theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

  • Consistency (all nodes see the same data at the same time)
  • Availability (a guarantee that every request receives a response about whether it was successful or failed)
  • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system)

It’s likely, I suppose, that we might engineer a system so that the failing condition is so trivial it can be ignored.

The commonest compromise is between availability and consistency although eventual consistency is a relatively modern construction.
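To make that compromise concrete for myself, here is a toy sketch of two copies of one value with a link that can be cut. Everything in it (TwoNodeStore, Unavailable, the "CP" and "AP" mode names) is hypothetical and models no real product. In "CP" mode a node refuses writes it cannot replicate; in "AP" mode it accepts them and the copies diverge until something reconciles them.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request rather than risk inconsistency."""


class TwoNodeStore:
    """Two replicas of one integer, joined by a link that can be cut."""

    def __init__(self, mode: str) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.values = {"a": 0, "b": 0}  # one copy per node
        self.partitioned = False        # True while the link is cut

    def write(self, node: str, value: int) -> None:
        other = "b" if node == "a" else "a"
        if self.partitioned:
            if self.mode == "CP":
                # Cannot replicate, so refuse: consistent but not available.
                raise Unavailable(f"node {node} cannot reach node {other}")
            # AP: accept locally; the two copies now disagree.
            self.values[node] = value
            return
        # Link is up: update both copies together.
        self.values[node] = value
        self.values[other] = value

    def read(self, node: str) -> int:
        # Reads are always served from the local copy in this sketch.
        return self.values[node]


ap = TwoNodeStore("AP")
ap.write("a", 1)
ap.partitioned = True
ap.write("a", 2)                           # accepted during the partition
diverged = ap.read("a") != ap.read("b")    # the copies now differ
```

Running the same sequence with TwoNodeStore("CP") raises Unavailable on the second write, and both copies stay at 1. The sketch leaves out the interesting part, which is how the copies are reconciled once the link heals.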

Shared disk clusters engineered for HA on a fail-fast-and-recover algorithm are a solution that fails the Availability requirement of the CAP theorem, although they have a zero RPO and can have relatively short RTOs.

Here’s the sponsored Bloor paper on NuoDB.

The Jim Starkey Wikipedia article references a 2012 patent for “A multi-user, elastic, on-demand, distributed relational database management system.” We’ll see? This is probably among the patents that protect the NuoDB products.

ooOOOoo

The NHS have decided to replace Oracle with Riak for the “spine”. Riak claims partition tolerance and availability.

http://www.aerospike.com/ is another high-performance, scale-out database.

When considering XML/RDF optimised databases, I have been pointed at Virtuoso, which has a Wikipedia page here and a white papers page here.

Snipsnap Problems

This was copied across from the snipsnap bliki on 26th July. It’s all a bit redundant now, but it might be useful for others. This was a page that documented my work in building the configuration, and in some places it represented a work in progress. In some cases the resolution is not documented, and in others I failed to resolve the problem. I eventually gave up on snipsnap. Hope this page helps someone!