As we discussed in our first post, Lasp is the name of
our distributed deterministic programming model that is the basis of our
research into providing a more expressive way of working with CRDTs and
This post is a continuation of our work on building an eventually
consistent advertisement counter using Lasp. To get the most benefit
from this article, you should read the first post in this
In this post, we look at an alternative method for tracking the list of
active advertisements: a replicated data structure which supports an
arbitrary number of concurrent addition and removal operations to
elements in a set. This data structure is called the Observed-Remove
set (OR-set), and was originally formalized by Shapiro et al. in “A
Comprehensive Study of Convergent and Commutative Replicated Data
Using the observed-remove set for tracking the active advertisements is
beneficial for several reasons:
Removals can be handled correctly without coordination, given the
propagation delay is acceptable.
Clients no longer need to maintain a list of advertisements
themselves; clients can track just a reference to the set of active
advertisements which is maintained by the variable store.
Clients can periodically cache the active advertisements list based on
a divergence control strategy of their chosing.
When clients tracked their own list of active advertisements, they
would have to wait for messages from the server about when to disable
an advertisement. Inevitably, these “disable” messages can (and
demonstrated during evaluation of the previous example) be delayed by
requests to view advertisements. This leads to advertisements being
displayed significantly more than they should.
The Advertisement Counter Example
As before, we’ve broken our example application into four components:
Create a counter for each advertisement
Initialize a process for each client who will be viewing ads
Initialize a process for each server who will be tracking ads
Simulate a bunch of advertisements being viewed
Creating advertisement counters
We do not alter our original approach of modeling each advertisement
counter as a grow-only counter (G-Counter). However, instead of
tracking the advertisement counters in a normal Erlang list, we use a
observed-remove set, as shown below.
We begin by declaring a new variable, of type riak_dt_orset, and then
for each advertisement we want to count impressions of, update the
observed-remove to include it.
Initialize client processes
Again, we spawn “client” processes, which respond to requests to view
Each of these clients only needs to track the identifier of the active
advertisement set, instead of the list of advertisements themselves.
Now, clients are no longer responsible for removing advertisements from
their list to display when requested by the “server” processes. When a
request to “view” an advertisement arrives, each client process either
uses a locally cached copy of advertisements that are displayable, or
request from the variable store the current list of active
The read operation used here is what we are referring to as a monotonic
read. A monotonic read operation takes an previously observed value
in the provided data type’s lattice and blocks until the variable’s
current value is an inflation of the previous.
For simplicity, think the greater-than-or-equal-to relationship over
natural numbers; we want to ensure we never view the value 1 if we have
already observed the value 2.
This behavior is extremely important: if our variable store is
replicated using an optimisic replication strategy, during failure
conditions we may read from a replica which contains an earlier value,
which would render our program incorrect.
In the case of our observed-remove set, the monotonic read operation
allows us ensure we always read values in causal order; we will never
read the empty set after reading a set with a value, unless that value
had been specifically removed (compared to the alternative case in
coordination-free cases, where you would observe and earlier value where
the value had not been added yet.)
Initialize server processes
Just as with our previous example, we initialize one “server” process
However, we take a slightly different approach. First, we iterate the
current list of advertisements spawning a process for each one. When
spawning the process we provide the identifier to the list of
advertisements, and not the actual list of advertisements.
Like before, we do a blocking monotonic read, which will not unblock
until the counter for the given advertisement reaches at least five.
Once the read unblocks, instead of sending a message to each client
notifying them to remove the ad, we modify the set directly by issuing
Simulating the requests.
Finally, some code to run the advertisement counter simulation.
In this example, we launch 100 requests to view a random sequence of
advertisements to exercise our code and verify the behavior is correct.
Let’s compare divergence with both approaches:
With our original ad counter, because the advertisement removal messages
are interleaved with requests to view advertisements, we suffer a high
amount of divergence: most counters stop around 20, when they should
stop at 5. This only gets worse when higher levels of concurrency are
Obviously, reading directly from the variable store each time cuts down
on divergence. However, when dealing with offline applications, this
approach is not viable.
What’s valuable in this approach compared to the original, is that
clients can cache the advertisements locally and choose when to
synchronize. This alters the model to shift divergence control to the
client – clients can update as connectivity is available, and diverge
during offline periods, instead of relying on the delivery of messages
from the server.
In this post, we introduced a few new concepts:
The monotonic read operation, which ensures that reads are always
advancing in a lattice defined by the data type of the object being
read. This ensures that we never read an earlier value and our
programs evolve monotonically.
Using the observed-remove set in Lasp to build richer applications.
Additionally, we alleviated the following problems in our previous
Clients no longer need to track all possible state they need to act
Clients now control their own divergence: they choose how long to
cache the active list of advertisements and when to refresh it,
instead of relying on messages from the server. We’ve shown in the
evaluation that this method is much better in terms of divergence.
Thanks for reading.
If you’re interested in this research and would like to discuss further
or assist, feel free to contact me using the details in the footer.
For more information on SyncFree and the use cases we have
focused our research on, I recommend this talk given by Annette
Bieniusa and myself at RICON 2014 in Las Vegas.