A CAPable Distributed Programming Model

15 Nov 2018

This is one post in a series about programming models and languages for distributed computing that I’m writing as part of my history of distributed programming techniques.

Relevant Reading

  • Diciplined Inconsistency with Consistency Types, Holt et al., SoCC 2016 Holt et al. (2016)

  • On the Design of Distributed Programming Models, Meiklejohn PMLDC 2017 Meiklejohn (2017).

  • A CAPable Distributed Programming Model, Myter, Scholliers, De Meuter, Onward! 2018 Myter, Scholliers, and De Meuter (2018)

Commentary

Building on previous work from Holt and Meiklejohn, this paper further explores the idea around allowing programmers to specify individual components of their system as either strongly consistent or weakly consistent. Why do you want some components to be weakly consistent? Well, performance and availability, of course!

By now, it should be clear that a system cannot simultaneously achieve both AP and CP as described by the CAP conjecture and later proof of the CAP theorem. For practitioners, this means they have to make a choice in their application: either coordinate for each operation in their application to ensure a total order over all events in the system, thereby keeping a concurrent application correct under distribution but face issues of latency and availability under failure; or, avoid coordination and risk sacrificing application correctness under network partitions and failure. Holt et al. work on the previously discussed IPA system admits this, and allows developers to optimize their application using a type system that allows programmers to exploit weak consistency, with the protection of a type system to ensure consistent values (CP) are not overridden or influenced by inconsistent values (AP), allowing the models to be mixed in the same application. In their system, type safety implies consistency safety. They realize this through a novel programming model, type system and middleware layer designed to run on top of data stores that support configurable consistency models.

The authors propose a new programming model that also attempts to bridge the AP-CP programming gap in programming models today. CAPtain introduces new main programming abstractions: availables, or data types that can always be operated on through a local copy, and consistents, data types that always coordinate for every operation. As it should be obvious, availables map directly to the AP aspect of the CAP theorem; whereas consistents map directly to the CP aspect of the CAP theorem.

As with Holt’s work, CAPtain ensures that values from availables cannot be used with objects or methods that are consistents without explicit promotion from the application developer. Holt’s work uses an explicit endorse operation to move types up the lattice from Inconsistent to Consistent; CAPtain provides two new primitives, thaw and freeze. Thaw can be used to create an available from a consistent: a local copy is created at each replica from the current value; freeze performs the opposite, by creating a new consistent object using consensus from an available.

In CAPtain, nodes use the Global Sequence Protocol from Burckhardt et al. for sequencing of updates to consistents. Each object in the system, whether an available or consistent, has a master node that is used to sequence updates. Updates to consistents are placed in a tentative queue, and eventually sequenced by the master; updates to availables are performed locally and lazily sequenced. Consistents block until this sequencing is completed and all nodes agree; availables can proceed while synchornization occurs in the background. CAPtain’s mechanism for operating on remote references to consistents located at a different node is based on previous work on far-references from the AmbientTalk system.

CAPTain is available on GitHub at http://github.com/myter/CAPtain.

Holt, Brandon, James Bornholt, Irene Zhang, Dan Ports, Mark Oskin, and Luis Ceze. 2016. “Disciplined Inconsistency with Consistency Types.” In Proceedings of the Seventh ACM Symposium on Cloud Computing, 279–93. ACM.

Meiklejohn, Christopher S. 2017. “On the Design of Distributed Programming Models.” In Proceedings of the Programming Models and Languages for Distributed Computing, 1. ACM.

Myter, Florian, Christophe Scholliers, and Wolfgang De Meuter. 2018. “A CAPable Distributed Programming Model.” In Proceedings of the 2018 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software, 88–98. ACM.