Remote Procedure Call

12 Apr 2016

This is one post in a series about programming models and languages for distributed computing that I’m writing as part of my history of distributed programming techniques.

Relevant Reading

A Critique of the Remote Procedure Call Paradigm, Tanenbaum and van Renesse, 1987 Tanenbaum and Renesse (1987).
A Note On Distributed Computing, Kendall, Waldo, Wollrath, Wyant, 1994 Kendall et al. (1994).
It’s Just A Mapping Problem, Vinoski, 2003 Vinoski (2003).
Convenience Over Correctness, Vinoski, 2008 Vinoski (2008).

Commentary

“Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?” Vinoski (2008)

Timeline

1974: RFC 674,
“Procedure Call Protocol Documents, Version 2”
RFC 674 attempts to define a general way to share resources across all 70 nodes of the Internet. This work is performed at Bolt, Beranek and Newman (BBN Technologies).¹
1975: RFC 684,
“A Commentary on Procedure Calling as a Network Protocol”
First outlines the problems of RPC and how they related to fundamental problems in distributed systems.
1976: RFC 707,
“A High-Level Framework for Network-Based Resource Sharing”
An attempt to “mechanize” the TELNET and FTP protocols through a generalization to functions.
1984: “Implementing Remote Procedure Calls” Birrell and Nelson (1984)
In this work, the Cedar RPC mechanism is designed at Xerox PARC.
1987: Distribution and Abstract Types in Emerald Black et al. (1987)
One of the first major implementations of distributed objects containing distribution-specific calling conventions, such as call-by-move.
1987: A Critique of the Remote Procedure Call Paradigm Tanenbaum and Renesse (1987)
Tanenbaum and van Renesse provide a criticism, similar to the one in RFC 684, on why the RPC model is the wrong model for distributed computing.
1988: Distributed Programing in Argus Liskov (1988)
In Argus, atomic units of computation referred to as “guardians” coordinate application of effects and rollback.
1988: RFC 1057,
“Remote Procedure Call Protocol Specification, Version 2”
Defines sunrpc as a standard along with its supporting infrastructure, such as portmapper, that’s used in the creation of NFS.
1991: CORBA 1.0
CORBA 1.0 introduces distributed objects.
1994: A Note On Distributed Computing Kendall et al. (1994)
Kendall et al. also talk in great length about why the RPC model, extended to objects, is problematic.
1996: A Distributed Object Model for the Java System Wollrath, Riggs, and Waldo (1996)
Introduces the Java RMI system.
1997: CORBA 2.0
CORBA 2.0 represents the major release of CORBA that most people are familiar with.
1999-: EJB, XML-RPC, SOAP, REST, Thrift, Finagle, gRPC, etc.
Modern day RPC mechanisms.

Overview

Remote Procedure Call (RPC) is a general term for executing a subroutine in a different address space without writing the actual code used to perform the remote execution. To provide an example, we can imagine a user wishing to invoke the random number generator function on another machine, but, the only difference between the local and remote invocation is supplying an additional node identifier where it should occur. While not the first implementation, because it was preceded by Apollo Computer’s Network Computing System (NCS), the first major implementation to be widely known and adopted was the SunRPC mechanism, from Sun Microsystems, used to back their Network File System (NFS).

Remote Procedure Call mechanisms you may be more familiar with are Java’s Remote Method Invocation (RMI), it’s predecessor Modula-3’s Network Objects, XML-RPC, SOAP, CORBA, Avro, Facebook’s, (now Apache) Thrift, Google’s Protocol Buffers with Stubby, Twitter’s Finagle, and Google’s gRPC.

RFC 684

“Rather, we take exception to PCP’s underlying premise: that the procedure calling discipline is the starting point for building multi-computer systems.”

RFC 684 is commentary on RFC 674 that introduced the Procedure Call Paradigm, Version 2 (PCP). This commentary highlights, what boils down to, three major points from a critical analysis of the Procedure Call Paradigm.

Procedure calling is usually a primitive operation; by primitive, it should be an extremely fast context switch operation performed by the underlying abstraction.
Local and remote calls each have different cost profiles; remote calls can be delayed, and in the event of failure, may never return.
Asynchronous message passing, or sending a message and waiting for a response when the response is needed, is a much better model because it makes the passing of messages explicit.

Following from these three points, we see a series of concerns develop about this programming paradigm, all of which become a common theme across the 40+ years in RPC’s history. These are:

Difficulty in recovery after malfunction or error. For instance, do we rollback or throw exceptions? How do we handle these errors? Can we just try again?²
Difficulty in sequencing operations. If all calls are synchronous and some of these calls can fail, it can require a significant amount of code to ensure correct re-execution to preserve order moving forward.
Remote Procedure Call forces synchronous programming: a method is invoked and the invoking process waits for a response.
Backpressure, or blocking on previous actions completing, load-shedding, or dropping messages on the floor when the system is overloaded, and priority servicing become more difficult with the call-and-response model of Remote Procedure Call.

RFC 707

“Because of this cost differential, the applications programmer must exercise discretion in his use of remote resources, even though the mechanics of their use will have been greatly simplified by the RTE. Like virtual memory, the procedure call model offers great convenience, and therefore power, in exchange for reasonable alertness to the possibilities of abuse.”

RFC 707 generalizes the ideas from RFC 684 and discusses the problem of resources sharing for services such as TELNET and FTP: each of these services presents a different interface for interacting with it, which requires the operator to know the specific protocol for interacting with that service. In realizing that both services like TELNET and FTP follow the call-and-response model, the authors propose an alternative idea: rather than needing to know all of the available commands and protocols on the remote machine, can we define a generic interface for executing a remote procedure that takes an argument list and follows the call-and-response model?

While we can, the problems of control flow and priority servicing outlined in RFC 684 remain; however, not enough that it prevents this model from being adopted by many systems in the future.

“A Critique of the Remote Procedure Call Paradigm”

“We propose the following test for a general-purpose RPC system. Imagine that two programmers are working on a project. Programmer 1 is writing the main program. Programmer 2 is writing a collection of procedures to be called by the main program. The subject of RPC has never been mentioned and both programmers assume that all their code will be compiled and linked together into a single executable binary program and run on a free-standing computer, not connected to any networks.”

“At the very last minute, after all the code has been thoroughly tested, debugged, and documented and both programmers have quit their jobs and left the country, the project management is forced by unexpected, external circumstances to run the program on a distributed system. The main program must run on one computer, and each procedure must run on a different computer. We also assume that all the stub procedures are produced mechanically by a stub generating program.”

“It is our contention that a large number of things may now go wrong due to the fact that RPC tries to make remote procedure calls look exactly like local ones, but is unable to do it perfectly. Many of the problems can be solved by modifying the code is various ways, but then the transparency is lost. Once we admit that true transparency is impossible, and that programmers must know which calls are remote and which ones are local, we are faced with the question of whether a partially transparent mechanism is really better than one that was designed specifically for remote access and makes no attempt to make remote computations look local at all.”

Tanenbaum and van Renesse directly attack the Remote Procedure Call paradigm stating that it is fundamentally wrong to treat local and remote calls the same. They state that the transparency that RPC tries to achieve is impossible and posit that a protocol that is designed specifically for remote access is better.

The authors go on to describe an alternative paradigm: a “virtual circuit”. This alternative comes off sounding very similar to Distributed Erlang running with TCP: non-blocking send and receive operations across a sliding-window network protocol.

“An alternative model that does not attempt any transparency in the first place is the virtual circuit model (e.g., the ISO OSI reference model [Zimmermann, 1980]). In this model, a full-duplex virtual circuit using a sliding window protocol is set up between the client and server. If nonblocking SEND and RECEIVE primitives are used, incoming messages can be signalled by interrupts to allow the maximum amount of parallelism between communication and computation.”

Tanenbaum and van Renesse outline many of the same criticisms as RFC 684: latency, lack of parallelism, lack of streaming, exception handling and failure detection. In addition, they raise a few additional criticisms that we will highlight below.

Unexpected Messages

With a synchronous call-and-response protocol between client and server, how is one supposed to send a message to the client that is unexpected if that client is not waiting for a message?

Single Threaded Servers

How does one handle the situation where the server does not have a response ready for a client immediately – for instance, if it needs to wait on input from another server? Not only does this block the server, but it also blocks the client from proceeding further with local computation.³

“There is, in fact, no protocol that guarantees that both sides definitely and unambiguously know that the RPC is over in the face of a lossy network.” Tanenbaum and Renesse (1987)

The Two Army Problem

How do we handle requesting irreplaceable data? (or, more generally have two servers come to agreement that some RPC was successfully executed and the response received.) Well, we can acknowledge the receipt of that information, but then we would also have to acknowledge the acknowledgements to be sure the acknowledgement was delivered, right? This topic, central to the agreement problem, has been discussed extensively in distributed systems literature Halpern and Moses (1990).

Parameters

Tanenbaum and van Renesse discuss the problem of parameter passing and parameter marshalling. This issue becomes exacerbated with references in object systems such as CORBA, where specific distributed references have to be used to ensure they remain accessible and valid over time.

“However, if there are reference parameters or pointers, things are more complicated. While it is obviously possible to copy pointers into the message, when the server tries to use them, it will not work correctly because the object pointed to will not be present.”

“Two possible solutions suggest themselves, each with major drawbacks. The first solution is to have the client stub not only put the pointer itself in the message, but also the thing pointed to. However, if the thing pointed to is the middle of a complex list structure containing pointers in both directions, sublists, etc., copying the entire structure into the message will be expensive. Furthermore, when it arrives, the structure will have to be reassembled at the same memory addresses that it had on the client side, because the server code will just perform indirection operations on the pointers as though it were working on local variables.”

...

“The other solution is just to pass the pointer itself. Every time the pointer is used, a message is sent back to the client to read or write the relevant word. The problem here is that we violate one of the basic rules: the compiler should not have to know that it is dealing with RPC. Normally the code produced for reading from a pointer is just to indirect from it. If remote pointers work differently from local pointers, the transparency of the RPC is lost.”

Idempotence

Finally, the authors highlight the problem of providing exactly-once semantics across the network and the power of idempotence. Tanenbaum and van Renesse say it very concisely.

“Suppose a client does a nonidempotent RPC and the server crashes one machine instruction after finishing the operation, but before the server stub has had a chance to reply. The client stub times out and sends the request again. If the server has rebooted by then, there is a chance that the operation will be performed two or more times and thus fail.”

CORBA

The Common Object Request Broker Architecture (CORBA) is an abstraction for object-oriented languages, popularized by C++, that allows you to communicate between different languages and different address spaces running on different machines. CORBA relied on the use of an Interface Definition Language (IDL) for specifying the interfaces of remote classes of objects; this IDL was used to generate stubs of what the remote systems object interfaces appeared as on the local machine. These IDL’s would be used to generate mappings between the abstract interfaces provided by the IDL’s and the actual implementations in languages such as C++ and Java.

CORBA attempted to provide several benefits to the application developer: language independence, OS-independence, architecture-independence, static typing through a mapping of abstract types in the IDL to machine and language specific implementations of those types, and object transfer, where objects can be migrated over the wire between different machines. CORBA’s promise was that through the use of mapping that remote calls could appear as local calls, and that distributed systems related exceptions could be mapped into local exceptions and handled by local exception handling mechanisms.

However, as Vinoski points out in 2003, the evaluation of programming languages and abstractions based on transparency alone is flawed:

“The goal is to merge middleware abstractions directly into the realm of the programming language, minimizing the impedance mismatch between the programming language world and the middleware world. For example, mappings make request invocations on distributed objects and services appear as normal programming-language function calls, and they map distributed system exceptions into native programming language exception-handling mechanisms.” Vinoski (2003)

“A Note On Distributed Computing”

“It is the thesis of this note that this unified view of objects is mistaken.” Kendall et al. (1994)

In this pinnacle Waldo paper, they argue that “it is perilous to ignore the differences” between local and distributed computing and that the unified view of objects is flawed. Kendall et al. (1994) They cite two independent groups of work, the systems of Emerald and Argus, and the modern equivalents of those systems, Microsoft’s DCOM and OMG’s CORBA: all systems that extended the RPC mechanism to objects and method invocation.

We can summarize the “promise” of the unified view of objects, as Waldo does in the paper.

Applications are designed using interfaces on the local machine.
Objects are relocated, because of the transparency of location, to gain the desired application performance.
The application is then tested with “real bullets.”

This strategy for the design of a distributed application has two fundamental flaws. First, that the design of an application can be done with interfaces alone, and that this design will be discovered during the development of the application. Second, that application correctness does not depend on object location, but only the interfaces to each object.

Waldo swats down this design with the “three false principles” Kendall et al. (1994):

“there is a single natural object-oriented design for a given application, regardless of the context in which that application will be deployed”

“failure and performance issues are tied to the implementation of the components of an application, and consideration of these issues should be left out of an initial design”

“‘the interface of an object is independent of the context in which that object is used’’

“Every 10 years...”

“The hard problems in distributed computing are not the problems of getting things on and off the wire.” Kendall et al. (1994)

Waldo argues that every ten years we approach the problem of attempting to unify the view of local and remote computing and run into the same problems, again and again: local and remote computing are fundamentally different.

Latency

Waldo argues that the most obvious difference should be the issues of latency: if you ignore latency, you will end up directly impacting software performance. He states that is it wrong to “rely on steadily increasing speed of the underlying hardware” and that it is not always possible to test with “real bullets”. Performance analysis and relocation is non-trivial and a design that is optimal at one point will not necessarily stay optimal.

Memory Access

His criticisms of memory access are very specific to CORBA and its predecessors in the object space: objects can retain pointers to objects in the same address space, but once moved these pointers will no longer be valid. He states that one approach to solving the problem is distributed shared memory, but more practically techniques such as marshalling or replacement by CORBA references, which are marshalled for distributed access, are used.

Partial Failure

Finally, the most fundamental problem: partial failure. In local computing, he argues, failures are detectable, total, and result in a return of control. This is not true with distributed computing: independent components may fail, failures are partial and a failure of a link is indistinguishable from a failure of a remote processor.

As always, Waldo says it best:

“The question is not ‘can you make remote method invocation look like local method invocation?’ but rather ‘what is the price of making remote method invocation identical to local method invocation?’” Kendall et al. (1994)

Waldo argues that there is only two paths forward if we want to achieve the goal of the unified object model.

Treat all objects as local.
Treat all objects as remote.

However, he states that if the real goal is to “make distributed computing as simple as local computing”, that the only real path forward is the first. This approach, he believes, is flawed, and that distribution is fundamentally different, and must be treated so.

“This approach would also defeat the overall purpose of unifying the object models. The real reason for attempting such a unification is to make distributed computing more like local computing and thus make distributed computing easier. This second approach to unifying the models makes local computing as complex as distributed computing.” Kendall et al. (1994)

The paper provides two examples of where this paradigm is problematic, but we will highlight one case, that builds upon RPC.

Network File System

Sun Microsystem’s Network File System (NFS), built upon RPC, is one of the first distributed file systems to gain popularity. Network File System adhered to the existing filesystem API, but introduced an entire new class of failures resulting from network partitions, partial failure, and high latency. Network File System is a stateless protocol implemented in UDP; the decision to implement it this way is motivated by crash recovery avoidance and protocol simplification Sandberg et al. (1988).

Network File System operated in two modes: soft mounting and hard mounting. Soft mounting introduced a series of new error codes related to the additional ways file operations could fail: these error codes were not known by existing UNIX applications and led to smaller adoption of this approach. Hard mounting introduced the opposite behavior for failures related to the network: operations would block until they could be completed successfully.

It’s just a mapping problem, right? Vinoski (2003)

“Convenience Over Correctness”

“We have a general-purpose imperative programming-language hammer, so we treat distributed computing as just another nail to bend to fit the programming models.” Vinoski (2008)

Vinoski highlights three very important points in “Convenience Over Correctness” criticism of RPC many years later.

Interface Definition Languages (IDL) “impedance mismatch”: base types may be easy to map, but more complex types may be less so.
Scalability: the RPC paradigm does not have any first class support for caching, or mechanisms for mitigating high latency, and is remains a rather primitive operation to build distributed applications with.
Representational State Transfer (REST): REST is good: it specifically addresses the problem of managing distributed resources; but most frameworks built on top of REST alter the abstraction and present something that repeats the problem ⁴.

Impact and Implementations

Remote Procedure Call (RPC) has been around for a very long time and while many opponents of RPC have been extremely critical of it, it still remains one of the most widely used way of writing distributed applications. There’s an unprecedented amount of use of frameworks for RPC like Google’s Protocol Buffers, and Apache’s Thrift deployed and used in production applications every day.

Frameworks such as Google’s gRPC for HTTP/2.0 and Twitter’s Finagle continue to reduce the amount of complexity in building applications with them, attempting to bring RPC to an even wider audience. For instance, Twitter’s Finagle is protocol-independent and attempts to deal with the problems of distribution directly. Finagle does this through the use of futures, which allow composition and explicit sequencing; Google’s gRPC does similar. These frameworks claim that since they do not attempt to hide the fact that the calls are remote, that it provides a better abstraction. However, now we have returned to the aforementioned problem of soft mounting in NFS: explicitly handling the flow control from the array of possible network exceptions that could occur, though mitigated through the use of promises/futures ⁵.

But, the question we have to ask ourselves is whether the abstraction of an individual method invocation or function call is the correct paradigm for building distributed applications. Is the idea of treating all remote objects as local, and making distribution as transparent as possible, the correct decision moving forward? Does it mask failure modes that will allow developers to build applications that will not operate correctly under partial failure?

When we talk about distributed programming languages today, many developers equate this to programming languages that can be, and have been used, to build distributed systems. For example, any language with concurrency primitives and the ability to open a network socket would suffice to build these systems; this does not imply that these languages are distributed programming languages.

But, a distributed programming language is where the distribution is first class. Languages like Go are more closely related to concurrent languages, where concurrency is first class; and, while concurrency is a requirement for distribution, these are different topics. CORBA is an example of trying to make distribution first class in languages such as C++.

Erlang Claessen and Svensson (2005; Svensson and Fredlund 2007a; Svensson and Fredlund 2007b) is one language where distribution is first class. Erlang has a RPC mechanism, but prefers the use of asynchronous message passing between processes. In fact, the RPC mechanism in Erlang is implemented using Erlang’s native asynchronous message passing. While you can peek under the covers and see where processes are running, Erlang tries to make the programmer assume that each process could be executing on a different node. Motivated by the expressiveness of the design, both Distributed Process, from the Cloud Haskell group, and Akka, in Scala, are examples that attempt to bring Erlang-style semantics to Haskell and Scala, respectively.

One approach taken in the Scala community for distributed programming is serializable closures Miller, Haller, and Odersky (2014), or what’s known as the function shipping paradigm. In this model, entire functions are moved across the network, where the type system is used to ensure that all of values in scope can be properly serialized or marshalled as these closures move across the network. While this solves some of the problematic points in systems like CORBA and DCOM, it does not have a solution for the problem of how to ensure exactly-once execution of functions, or to handle partial failure where you can not distinguish the failure of the remote node from the network.

Languages like Bloom, and Bloom_L, and Lasp Alvaro et al. (2011; Conway et al. 2012; Meiklejohn and Van Roy 2015) take an alternative approach: can we build abstractions that rely on asynchronous programming, very weak ordering and structuring our applications in such a way where they are tolerant to network anomalies such as message duplication and message re-ordering. While this approach is more expensive in terms of state transmission, and more restrictive in what types of computations can be expressed, this style of programming supports the creation of correct-by-construction distributed applications. These applications are highly tolerant to anomalies resulting from network failures by assuming all actors in the system are distributed. The restrictions, however, might prohibit wide adoption of these techniques.

So, we ask again:

“Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?” Vinoski (2008)

Alvaro, Peter, Neil Conway, Joseph M Hellerstein, and William R Marczak. 2011. “Consistency Analysis in Bloom: A CALM and Collected Approach.” In CIDR, 249–60. Citeseer.

Birrell, Andrew D, and Bruce Jay Nelson. 1984. “Implementing Remote Procedure Calls.” ACM Transactions on Computer Systems (TOCS) 2 (1). ACM: 39–59.

Black, A., N. Hutchinson, E. Jul, H. Levy, and L. Carter. 1987. “Distribution and Abstract Types in Emerald.” IEEE Transactions on Software Engineering SE-13 (1): 65–76. doi:10.1109/TSE.1987.232836.

Claessen, Koen, and Hans Svensson. 2005. “A Semantics for Distributed Erlang.” In Proceedings of the 2005 ACM SIGPLAN Workshop on Erlang, 78–87. ACM.

Conway, Neil, William R Marczak, Peter Alvaro, Joseph M Hellerstein, and David Maier. 2012. “Logic and Lattices for Distributed Programming.” In Proceedings of the Third ACM Symposium on Cloud Computing, 1. ACM.

Halpern, Joseph Y, and Yoram Moses. 1990. “Knowledge and Common Knowledge in a Distributed Environment.” Journal of the ACM (JACM) 37 (3). ACM: 549–87.

Kendall, Samuel C, Jim Waldo, Ann Wollrath, and Geoff Wyant. 1994. “A Note on Distributed Computing.” Sun Microsystems, Inc.

Lee, Collin, Seo Jin Park, Ankita Kejriwal, Satoshi Matsushita, and John Ousterhout. 2015. “Implementing Linearizability at Large Scale and Low Latency.” In Proceedings of the 25th Symposium on Operating Systems Principles, 71–86. ACM.

Liskov, Barbara. 1988. “Distributed Programming in Argus.” Communications of the ACM 31 (3). ACM: 300–312.

Liskov, Barbara, and Liuba Shrira. 1988. Promises: Linguistic Support for Efficient Asynchronous Procedure Calls in Distributed Systems. Vol. 23. 7. ACM.

Meiklejohn, Christopher, and Peter Van Roy. 2015. “Lasp: A Language for Distributed, Eventually Consistent Computations with CRDTs.” In Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data, 7. ACM.

Miller, Heather, Philipp Haller, and Martin Odersky. 2014. “Spores: A Type-Based Foundation for Closures in the Age of Concurrency and Distribution.” In ECOOP 2014–Object-Oriented Programming, 308–33. Springer.

Sandberg, R., D. Golgberg, S. Kleiman, D. Walsh, and B. Lyon. 1988. “Innovations in Internetworking.” In, edited by C. Partridge, 379–90. Norwood, MA, USA: Artech House, Inc. http://dl.acm.org/citation.cfm?id=59309.59338.

Svensson, Hans, and Lars-Ake Fredlund. 2007a. “A More Accurate Semantics for Distributed Erlang.” In Erlang Workshop, 43–54. Citeseer.

Svensson, Hans, and Lars-Åke Fredlund. 2007b. “Programming Distributed Erlang Applications: Pitfalls and Recipes.” In Proceedings of the 2007 SIGPLAN Workshop on ERLANG Workshop, 37–42. ACM.

Tanenbaum, Andrew Stuart, and Robbert van Renesse. 1987. A Critique of the Remote Procedure Call Paradigm. Vrije Universiteit, Subfaculteit Wiskunde en Informatica.

Vinoski, Steve. 2003. “It’s Just a Mapping Problem [Computer Application Adaptation].” Internet Computing, IEEE 7 (3). IEEE: 88–90.

———. 2008. “Convenience over Correctness.” Internet Computing, IEEE 12 (4). IEEE: 89–92.

Wollrath, Ann, Roger Riggs, and Jim Waldo. 1996. “A Distributed Object Model for the Java TM System.” In Proceedings of the 2nd Conference on USENIX Conference on Object-Oriented Technologies (COOTS)-Volume 2, 17–17. USENIX Association.

Fun fact: BBN’s Internet division would later become Genuity and then Level3 out of bankruptcy and has autonomous system number 1 (ASN 1).↩
Some systems that attempt to address this include Liskov’s promises Liskov and Shrira (1988) and Lee’s Lee et al. (2015) work on a reusable framework for linearizability.↩
Our section on promises talks about this problem in depth Liskov and Shrira (1988).↩
For instance, if one was to build an object model on top of REST.↩
Our related post on promises and futures challenges whether that abstraction is right either.↩