We at Cockroach Labs absolutely love Aphyr’s work. We are avid readers of the Jepsen series, which some know as a high-quality review of the correctness and consistency claims of modern database systems, but which we really know as “Aphyr’s hunting tales about the highest-profile bugs in our industry.” Most of us read each new blog entry with a mix of thrill, excitement, and curiosity about which new system will be eviscerated and which exotic error will be discovered next.

Aphyr’s Jepsen posts have changed the dialogue about distributed data stores, placing correctness on equal footing with scalability and performance. (Peter Mattis, Co-Founder of Cockroach Labs)
We are grateful for that.

That the Jepsen toolbox is open source should be an inspiration to the entire industry. Additional gratitude is owed to Aphyr for his accompanying series on how to use Clojure, without which we would not have been able to appreciate how elegantly Clojure simplifies writing the Jepsen event generators. Seriously, wow.

We’re looking forward to seeing CockroachDB put under Aphyr’s microscope one day, but for the time being we decided to apply the Jepsen testing tools ourselves.

We are completely aware of the shortcomings of self-testing the correctness of a system we built ourselves. We aren’t fooling ourselves into thinking we could do this and then announce to the world, “Look! Aphyr’s magic wand hasn’t found any bugs!” That’s not really how things go in our industry, now, is it? Independent validation still matters. Yet, as we’ll see in this post, we found value in running the Jepsen tests ourselves.

And we learned so much in the process! It gave us both a boost in confidence and the opportunity to fix important issues in time for our Beta release.

Contents:

– First steps
– What’s in a transaction? The ACID model
– A simple consistency test: unique appends
– Snapshot isolation: banking transactions
– Client-side retries
– Intermezzo: Clojure’s threading gave us a cold sweat
– From snapshot isolation to serializable isolation
– From serializability to linearizability
– Snapshot isolation over multiple ranges?
– Linearizability on single ranges
– Woes with CockroachDB timestamps
– Linearizability vs. serializability
– Linearizability vs. network partitions
– Linearizability vs. clock skews
– More nemeses!
– Wrapping up

FIRST STEPS – START HERE…

Written By: Raphael ‘kena’ Poss
