Updated January 9th, 2015: Derflow has since been renamed to Lasp, which is open source on GitHub. For more information surrounding the name change, see this post.
As we discussed in our first post, Derflow is the name of our distributed deterministic programming model that is the basis of our research into providing a more expressive way of working with CRDTs and eventual consistency.
Today, we look at how we can go about building the first version of a Erlang QuickCheck model for testing our distributed variable store works as expected.
So, how do we start building a model for our language to QuickCheck with?
First, if you recall, our language uses a distributed single assignment
store in which distribution is supported by riak_core
. Unfortunately,
testing applications with dependencies on riak_core
is extremely
difficult, due to the amount of setup required. We’ve attempted it
before at Basho and it requires a lot of setup.
To make this easier, we first separate out the logic in our virtual node which controls the distributed store with the backend that actually writes values to disk. For this purpose, we’ve created a derflow_ets backend, which provides an API to the distributed store, which we can run our tests against.
Building the model is straightforward: how can you model a variable
store that is backed by ets
? We can use a dict
!
So, what are the invariants of our langauge? Let’s enumerate them.
If you recall, our language provides operations over both single-assignment variables and lattices. When operating with a lattice, we want to ensure that we allow variables to rebind if the given value is an inflation of that lattice.
So, we want to test the following operations:
What are the properties of our single-assignment store?
What are the properties of our lattice store?
For testing, we use the eqc_statem
behavior, which is a state-machine
based approach for testing functions with side-effects. For more
information on testing using this facility, check out this
talk from Laura M. Castro.
Let’s take a look at our code that generates commands:
When generating commands, we need to model our three basic types of
operations: read
, bind
, and declare
. We use the oneof
generator
to select one of the following commands – declare
operations will be
generated without constraints composed of one of the lattice types, or
the undefined
type to specify that we want a single-assignment
variable.
For read
and bind
operations: we need to ensure we only generate
these for variables which have been declared; this is done by the
length(Variables) > 0
predicate on the list comprehension.
Additionally, these commands also need to be called with the correct
types of values. For example, a read threshold over a lattice needs to
be called with a valid lattice value; similarly a bind needs to be
triggered with the correct type as well. In our example here, we use
the ?LET
macro to materialize the symbolic into an actual value, which
we can inspect the type of and generate the appropriate threshold and
value.
Here’s our threshold and value generators, which currently, are the same:
In this case, the type of one of the callers will be a CRDT provided by
the riak_dt library. Each of the data types provided, such
as riak_dt_gset
, for the grow-only set, exports a function gen_op()
,
which generates a random operation, allowing us to generate random types
of CRDTs.
Here’s an example of the operations generator for the riak_dt_gset
:
Now, let’s examine the preconditions on command generation.
What preconditions do we want to check when running our model?
Well, given that derflow read
operations are blocking – blocking on
unbound single-assignment variables and threshold read operations where
the threshold is not yet met, we want to prevent us from executing the
tests with those values.
We do that with the following precondition on read:
Now, let’s examine the postconditions on command generation.
There are two important postconditions of our model we must verify:
bind
operations which fail should only fail for a valid reason;
read
operations which succeed must succeed with the correct value.
When performing a read
, we assert that we only ever read the value
that we observed being bound to that variable when executing the model.
If it is a threshold read, we assert that the value returned is the
threshold – if the value wasn’t at least the threshold, we would have
failed the precondition shown above.
When asserting that bind
fails only when it is supposed to, we verify
that the only time it is allowed to fail is if we are attempting to
re-assign a already bound variable to a new value.
True.
This was just a very brief overview of how we began testing derflow using Erlang QuickCheck. For a deeper view, please checkout our test on GitHub.
I’d love to hear about how I could clean this model up and make it easier to understand. Please let me know, or send me a pull request.
Thanks!