Verified Vector Clocks: An Experience Report, Part 1
04 Nov 2013
This post outlines a bunch of experimental work I’ve completed to model
data structures in Coq, leveraging Tim Carstens’ verlang
project to extract the data structures into executable Erlang.
Updated March 8th, 2014: A full talk about this work was presented at Erlang
Factory, San Francisco 2014. Both the
slides
and video are available.
Modeling vector clocks
Let us start by modeling vector clocks in Coq. We will start by
focusing on the API exposed by riak_core’s vector clock
implementation.
For this first implementation, we’re going to implement the API without
the timestamps that vclock uses for pruning for a couple reasons: they
are not core to an implementation of vector clocks or version vectors,
and there does not appear to be a good way to model UNIX timestamps in
Coq.
First, some boilerplate and type definitions. We’ll be modeling the
vclock as a list of pairs, each pair representing a particular actor
and the count of events observed. In the actual Riak implementation,
this is modeled as the product of atoms and integers.
Keep in mind these are not the most efficient implementations of these
methods, we are mainly focusing on naive implementations where we can
easily debug the translation to Core Erlang incase it fails.
Now, let’s add our constructor functions:
Now, our functions to increment an actor.
Some functions to compare equality amongst vector clocks:
Some functions to compare vector clocks and ensure a proper partial
ordering:
And, finally some functions to perform the merge betwen two vector
clocks.
Qed, one might say.
Examining the extracted code
Let us now examine the generated code.
One of the first problems that we encounter is that the function used to
construct the vector clock is not extracted. Let’s manually create this
function in the extracted Core Erlang.
We also modify the module exports at the top to reflect this as well.
Next, we’ll notice that there are a bunch of function calls which are
incorrectly qualified as calls to vvclock.VVClock, which we’ll
manually modify to contain the correct call. These are calls to
functions within the vvclock module.
Another problem we run into are function calls which have not been fully
qualified with their arity. For example, the call to descends@ missing
its arity, descends@/2.
Finally, we run into a problem with currying, where function calls like
the following…
…get exported to something like the following:
…which causes code like the following, to fail:
In cases like this, we have solved the problem by manually inlining the
call to the find function.
Now, given that we have modeled these clocks as products of Peano
numbers, we can take the existing riak_core test suite, and execute
it with only a slight modification and watch the tests pass.
And, for those of you interested in the output:
Cool, that’s not a bad start. We can’t directly apply this to Riak yet,
as the true and false atoms are represented as constructors for the
Boolean type in Coq, and we’re stuck using Peano numbers. We also don’t
have any proofs yet of these functions.
We will be addressing both of these issues, as well as how we adapted
this library for use in the Riak data store in subsequent posts.
This code is available on GitHub as an OTP application and
video and slides are available of a lightning talk I
did on this work at RICON|West 2013.