This post outlines a bunch of experimental work I’ve completed to model data structures in Coq, leveraging Tim Carstens’ verlang project to extract the data structures into executable Erlang.
Here’s a link to the first and second posts in this series.
Updated March 8th, 2014: A full talk about this work was presented at Erlang Factory, San Francisco 2014. Both the slides and video are available.
Let us now look at writing the wrapper module that we will use to
integrate our extracted code with the existing riak_core
OTP application.
We first need to make some adjustments to the generated Core Erlang
code, because it uses the literal atoms init_timestamp and init_count
instead of calling those functions, which generates
data structures we cannot operate over in Erlang.
We will make the following adjustments to the Core Erlang, as well as add these new functions to the module exports.
We also modify the default case to call these functions, instead of just using the atoms as values.
You can see these changes here.
Now, let’s add back the original test from riak_core.
I have added this back in the following commit.
Now, let’s take a look at the wrapper module.
Let’s start by providing the same API as the existing vclock module.
First, a function to return a timestamp as a natural number.
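As a minimal sketch of such a function, assuming the timestamp is measured in Gregorian seconds in UTC (the module name here is hypothetical and only for illustration):

```erlang
-module(vclock_time_sketch).
-export([timestamp/0]).

%% Current time as a non-negative integer: the number of seconds since
%% year 0 of the Gregorian calendar, computed from the UTC wall clock.
%% Assumption: second resolution is good enough for the wrapper.
-spec timestamp() -> non_neg_integer().
timestamp() ->
    calendar:datetime_to_gregorian_seconds(erlang:universaltime()).
```

Note that a value like this is on the order of tens of billions, which matters later once timestamps have to be turned into Peano numbers.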
When generating a fresh vector clock, we can just call directly to our exported code.
When incrementing, since actors can be any Erlang term, we first need to convert that term into a Peano number, and then call the generated increment function.
We’re going to skip the increment/3 call which takes a timestamp.
When using the accessor functions, we need to convert the actor to a Peano number, and then convert the count back to an Erlang term.
Same thing applies for the timestamps.
When handling descends or equality, we need to match against the extracted True and False constructors and turn them into Erlang booleans.
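A small sketch of that matching, assuming the extracted constructors show up in Erlang as the atoms 'True' and 'False' (the module and function names are hypothetical):

```erlang
-module(extracted_bool_sketch).
-export([to_boolean/1, from_boolean/1]).

%% Assumption: the extracted code represents Coq's bool constructors as
%% the atoms 'True' and 'False'. Anything else crashes with
%% function_clause, which is what we want for unexpected values.
-spec to_boolean('True' | 'False') -> boolean().
to_boolean('True')  -> true;
to_boolean('False') -> false.

%% The reverse direction, for passing Erlang booleans into extracted code.
-spec from_boolean(boolean()) -> 'True' | 'False'.
from_boolean(true)  -> 'True';
from_boolean(false) -> 'False'.
```

A wrapper for descends or equality would then pass the result of the extracted call through to_boolean/1 before returning it to the caller.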
Again, when dealing with the actors, which are stored as Peano numbers, we convert back and forth between Erlang terms.
When pruning, we need to extract items out of the application environment, and pass them directly into the Core Erlang code that’s been extracted so it can operate over them.
With the merge function, we simply adapt the single arity list call to recurse through the list and perform the merges using the extracted merge.
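One way to sketch that adaptation is a fold over the list. To keep the example self-contained, the two-argument merge and the fresh clock are passed in as parameters; in the real wrapper they would come from the extracted module, and the names here are hypothetical:

```erlang
-module(vclock_merge_sketch).
-export([merge/3]).

%% MergeFun: a two-argument merge, standing in for the extracted merge.
%% Fresh:    the value to return when there is nothing to merge.
%% Clocks:   the list of vector clocks to combine.
merge(_MergeFun, Fresh, []) ->
    Fresh;
merge(MergeFun, _Fresh, [First | Rest]) ->
    %% lists:foldl/3 calls MergeFun(Clock, Acc) for each remaining clock.
    %% The argument order only matters if the merge is not commutative.
    lists:foldl(MergeFun, First, Rest).
```

For example, using lists:umerge/2 as a stand-in merge over sorted lists, `vclock_merge_sketch:merge(fun lists:umerge/2, [], [[1,3],[2],[3,4]])` returns `[1,2,3,4]`.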
Cool.
Finally, we need to modify our Core Erlang code to call back into this wrapper module to generate proper Erlang timestamps needed when incrementing vector clocks. We support that by exporting a function only used by the Core Erlang, which returns a current timestamp, in Peano format.
Then, we modify our callers in the Core Erlang to use that function in our support wrapper.
Seems straightforward.
In supporting data type conversion, it gets a bit more interesting.
For example, writing the functions to perform the natural to Peano conversion is trivial, but the conversion is extremely slow because data constructors are modeled in the extracted Erlang code as nested tuples.
Here’s the function to convert to a Peano.
…and here’s the function to convert from a Peano.
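As a rough sketch of what such a pair of functions can look like, assuming zero is extracted as the atom 'O' and the successor constructor as the tuple {'S', N} (the module name is hypothetical):

```erlang
-module(peano_sketch).
-export([natural_to_peano/1, peano_to_natural/1]).

%% Assumption: zero is the atom 'O' and successor is {'S', Predecessor}.

%% Build a Peano number by wrapping 'O' in N successor tuples.
natural_to_peano(0) ->
    'O';
natural_to_peano(N) when is_integer(N), N > 0 ->
    {'S', natural_to_peano(N - 1)}.

%% Count the successor tuples to get back to an Erlang integer.
peano_to_natural('O') ->
    0;
peano_to_natural({'S', Peano}) ->
    1 + peano_to_natural(Peano).
```

With this representation, `peano_sketch:natural_to_peano(3)` returns `{'S',{'S',{'S','O'}}}`, so the size of the term grows linearly with the number itself.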
Granted, these could be converted to tail calls for efficiency, but most of the problems I ran into came from the sheer amount of execution time required to convert a large integer to a Peano number.
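For illustration, a tail-recursive version might look like the following sketch, again assuming 'O' for zero and {'S', N} for successor (hypothetical module name). It avoids growing the call stack, but it still performs one step and allocates one tuple per unit, so it does not fix the underlying cost:

```erlang
-module(peano_tail_sketch).
-export([natural_to_peano/1, peano_to_natural/1]).

%% Assumption: zero is the atom 'O' and successor is {'S', Predecessor}.

%% Wrap the accumulator in one more 'S' per step, counting N down to 0.
natural_to_peano(N) when is_integer(N), N >= 0 ->
    natural_to_peano(N, 'O').

natural_to_peano(0, Acc) -> Acc;
natural_to_peano(N, Acc) -> natural_to_peano(N - 1, {'S', Acc}).

%% Peel off one 'S' per step, counting up in the accumulator.
peano_to_natural(Peano) ->
    peano_to_natural(Peano, 0).

peano_to_natural('O', Acc) -> Acc;
peano_to_natural({'S', Peano}, Acc) -> peano_to_natural(Peano, Acc + 1).
```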
Now, we look at how we can handle conversion of arbitrary terms to integers.
In the version used to evaluate the test suite, I’ve stubbed these functions out the following way to verify that the rest of the code executed correctly.
For example, converting a natural to a term…
…and converting a term to a natural.
I’m not sure what the best way is to convert a term in Erlang to an integer representation without hashing the binary representation of it. However, hashing yields extremely large integers, which take an extremely long time to convert to Peano numbers.
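To make the size problem concrete, here is a sketch of two hashing approaches using standard library calls. Neither is what the wrapper actually uses; the module and function names are hypothetical:

```erlang
-module(term_nat_sketch).
-export([term_to_natural_sha/1, term_to_natural_phash/1]).

%% Hash the external binary form of the term with SHA-1 and read the
%% 20-byte digest as one unsigned integer (values up to 2^160 - 1).
term_to_natural_sha(Term) ->
    binary:decode_unsigned(crypto:hash(sha, term_to_binary(Term))).

%% erlang:phash2/1 gives a much smaller range (0 to 2^27 - 1), at the
%% cost of a far higher chance of two actors colliding.
term_to_natural_phash(Term) ->
    erlang:phash2(Term).
```

Both are one-way, so converting a natural back into the original term would need some kind of lookup table on the side. Even the smaller phash2 range can mean a Peano term with over a hundred million nested tuples.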
For example, in one run it took multiple minutes just to perform the Peano number conversion for an extremely small integer.
I believe the next step in moving forward is to rework the extraction code to use some sort of String to model the actors, to eliminate the need for Peano numbers there. However, we still run into a similar problem when modeling timestamps as Peano numbers, so an optimized representation or conversion mechanism is probably desired.
The wrapper has been added in the following commit.
At this point, we have gotten a bit further, and now have a version of
the library that can work directly as a replacement for riak_core’s
vclock module. However, the performance is abysmal due to the data
structure modeling, which will be the topic of the next post in this
series.
Thanks!