Chapter 3 Message Driven Computing and Its Relationship to ACTORS

Copyright © 1994. Thomas W. Christopher
Illinois Institute of Technology
Chicago IL 60616


  We define Message Driven Computation as a style of computing with two characteristics: 1) Messages, not sequential processes, convey both control and data. Indeed, there are no sequential processes unless they are programmed explicitly in terms of messages. 2) Computation is invoked by the presence of a collection of messages at the same location.
We have embodied MDC in two languages, extensions of Icon and of C. MDC is designed to be coherent with the global structure of multicomputers and hence forms a basis for general purpose multicomputing. MDC may also be viewed as a generalization of Actors [AGHA86].
Our development of Message Driven Computation has three main goals: 1) to be coherent with the structure of multicomputers, 2) to be efficient, 3) to be useful for general purpose computation. Here we introduce the features of Message Driven Computation and relate them to the features of multicomputers, to Actors, to other parallel programming systems, and to our goals.
It should be understood from the outset that our current version of MDC is nearly at the assembly language level. It is not much easier to use than any other low level language.
We base the design of MDC on the structure of multicomputers. Consider the differences between von Neumann computers and multicomputers at their global level.
3.1 Von Neumann vs. Multicomputer
  In a von Neumann computer, processors and memories are separate and connected through a switch, which forms a bottleneck. In a multicomputer, processors and memories are combined at nodes, which are interconnected, in most designs, without bottlenecks.
In a von Neumann computer, memory locations are small. In a multicomputer, the node memories are large and can store large, structured objects.
In a von Neumann computer, addressing logic is simple and memory locations are addressed by integer. In a multicomputer, node memories are not directly addressable from outside, and accesses to non-local data must use the processor of the node on which the data resides. Since a full processor will be used anyway, creation of some form of an associative memory is trivial.
In a von Neumann computer, a process contains data in its local memory and executes instructions to control the transformation of the data. In a multicomputer, messages carry data between the nodes and the arrival of messages trigger the execution of instructions and the transformation of data. We hypothesize that messages in a multicomputer are the equivalent of processes on a von Neumann computer.
Von Neumann Multicomputer
P-S-M (Processors connected to memories through a switch C-C(Full computers at nodes)
small memory locations large memories on nodes
integer memory addresses associative memories as easily as linear address spaces
sequential processes encapsulate control and data hypothesis: Messages encapsulate control and data

Considering the global structure of multicomputers, we have designed MDC with the following features:
Loci : Since on a multicomputer processing and storage are performed at the same locations, we provide loci (singular locus), locations where data are stored, and computations are performed.
Portals : Each locus is a collection of labeled message queues, called portals. Each message at a locus is in one queue, at one portal. Messages are not sent to loci directly, but to portals: the destination locus and the portal label must both be specified.
Message Driven Computations : On multicomputers, messages encapsulate data and
control and thus stand in the place of processes. In MDC, messages contain all data and
messages cause all computation. There is no storage except at loci and no storage at a
locus except in the messages waiting there. Messages may contain an arbitrary number
of fields and the fields may be tuples.
Behaviors : Computation is specified by behavior declarations * which consist of a pattern and a procedure body (the behavior body). The pattern specifies a pattern of messages. When a collection of messages at a locus match the pattern, they may be removed and the behavior body executed ("triggered") with the fields of the messages as its parameters. The body may send messages to portals at other loci or leave messages at the current locus.
Pattern matching : There are currently two forms of pattern that may trigger behaviors. They are sufficient to provide control flow, data flow, and demand driven models of computation [TREL82a][TREL82b]. One form of behavior pattern may specify a certain number of messages are to be removed from each of several portals. When each of the portals has the required number of messages, the behavior may be triggered. This kind of pattern permits data-flow and control-flow computation models. Another form of pattern specifies that if the first message to arrive at a locus comes to a specified portal, the behavior is to be triggered. We use this for demand-driven models and for certain dynamic programming algorithms, where the first demand for a value starts its computation and the subsequent demands simply wait.
Data types: Locus and portal are data types, which permits them to be passed in messages and manipulated easily. Two examples, when a message corresponding to a procedure call is sent, it contains a portal-type field giving the continuation to which the result is to be sent. In a MDC interpreter for a data-flow language, a message corresponding to a data-flow instruction contains portals indicating where the result token messages are to be sent.
Associative memory : We use the processing power available in a multicomputer to create an associative memory of loci. In MDC, the names of loci are tuples of arbitrary length. The elements of the tuple may be any primitive data type: integer, character, string, symbol *, real. It is required that the first element of the locus name be a symbol to make it easier to write a garbage collector - if a symbol can no longer be accessed by any computation, then all loci that begin with it may be removed.
A locus will occupy one node of the multicomputer. The number of the node it occupies is computed from its name by hashing or by some other user-specified function. The storage it occupies on that node is found by looking its name up in a hash table. The name of a locus may be constructed independently on many computers and its single exact location computed. Notice that MDC's global data base differs from that of Linda [AHUJ88]. Linda maintains a global tuple space and permits a pattern matching fetch of tuples. In MDC the data base contains loci which are named by tuples. A locus contains potentially any number of messages. A message is sent to a single locus by specifying its exact name so the exact node on which it is located can be computed. The Linda and the MDC associative memories have different functions and uses and are not comparable.
Adjustable granularity : MDC permits fields of messages to be tuples of information, small arrays. Many parallel processing language implementations are fine grain, passing a single data value in a message. MDC permits the granularity to be adjusted by passing tuples of values in messages. The programming problem is to lower the message overhead per datum adequately while not depriving the system of too many opportunities for parallelism.

*. The name behavior we take from Actors. We will compare our work to Actors below.
*. internally generated unique values
3.2 Relationship to Actors
  We were inspired in our design of MDC by Actors.
An actor is, in effect, a message queue with a procedure and an actual parameter list attached (the parameterized procedure is called a "behavior").
When one or more messages (called "tasks") are in the queue, the procedure is called with the parameter list and with the first message, which is removed. The procedure can create new actors, send messages, and can replace the procedure and parameter list attached to its message queue.
Actors are easily translated into MDC as follows. The Actors construct given on the left may be implemented by the corresponding MDC construct on the right.
Actors MDC
mailbox locus
behavior message sent to a "behavior" portal
behavior name portal label
acquaintance list field in a message sent to a "behavior" portal
task message sent to a "task" portal
behavior definitions behavior defenitions

To implement an actor system in MDC, associate the loci with the mailboxes. Partition the portals into those associated with behaviors and those associated with tasks. Require that there be precisely one message at a "behavior" portal of a locus, while there may be many at a task portal. Restrict the MDC behavior definitions to specify precisely two messages in their patterns: one at a behavior portal and one at the task portal.
We classify Actors as another message driven computing system. The features of Actors we have incorporated into MDC are

Features where we differ are:

3.3 Relationship to other parallel processing languages:
  What particularly makes us believe in the generality of our approach is the ease with which we can encode other programming languages in MDC. Both code-copying and tagged-token implementations of dataflow [VEEN86][ACKE82][ARVI82] are trivially encoded in MDC (a one-for-one translation). We are also running an FP [BACK78][BURT87] interpreter in MDC and have a demand-driven interpreter for an applicative system providing static scoping and lazy evaluation via demand-driven graph copying and graph reduction [TREL82a]. Conventional sequential process languages, e.g. Pascal, are a bit messy to translate into MDC, but no more difficult than compiling them at all. (Lower-level sequential languages, such as C, which assume a linear address space and encourage pointer arithmetic, would be very difficult to translate efficiently.)
3.4 Concluding Remarks
  Message Driven Computation, derived in part from the parallel object oriented language Actors, provides a more general computational style than strict object orientation, but does include object oriented programming as a special case. Message Driven Computation can also support control-flow, data-flow, and demand-driven computational models. We consider Message Driven Computation to be an excellent foundation on which to build parallel programming software.
3.5 References
 

Copyright ©1994.Thomas W.Christopher