Chapter 2 Experience With Message Driven Computing and the Language LLMDC/C

Copyright ©1994. Thomas W. Christopher


2.1 Abstract
  A model for parallel computing, Message Driven Computing, is presented, which is based on distributed memory and pattern-driven control. One implementation of that model, LLMDC/C, is presented, and it is shown that a wide variety of parallel programming techniques are made possible. These techniques include Actors, communicating processes, active messages, dataflow, demand-driven programming, futures, I-structures, streams, broadcast and accumulation.
2.2 Introduction
  Parallel computing models may be categorized by whether they use centralized or distributed data storage and by whether they use control-driven, data-driven, demand- driven, or pattern-driven control [TREL82] [TREL84]. Message Driven Computing (MDC) is based on distributed data and pattern-driven control. The term Message Driven Computing is also used by Athas [ATHA87] for a similar underlying model, although his elaboration of that model differs greatly from ours.
In MDC (hereafter, as we use the term), computation takes place as computational events at abstract locations which have structured names.
The only communication between computational events is by means of message passing - both between locations and at the same location. A collection of messages at a location which match a pattern can trigger a computational event that will consume some or all of those messages. There is mutual exclusion among the computational events at a location.

Message Driven Computing is an unconventional computational model, and in order to justify interest in MDC, it must be shown possible to write programs in it and to run those programs efficiently. Most of this paper will be devoted to describing a particular implementation of MDC and describing MDC programming techniques. These programming techniques are derived from experience with four Message Driven Computing languages. Students have implemented programs that include an LALR(1) parser generator, implementations of FP [AMEI89][BACK78] (three in fact, with strict, lenient, and lazy evaluation), of Lucid [WADG85], of Joyce [BRIN87a][BRIN87b], and of Dynamo II [PUGH73]. We have numerous implementations of parallel algorithms, and have efficient simulations of neural networks [WOHL90a] [WOHL90b].
The techniques we use include Actors [AGHA86] [ATHA86], communicating processes [HOAR78] [BRIN87a] [BRIN87b], dataflow [ACKE82] [ARVI78] [DAVI82] [DENN80] [TREL82] [VEEN86], futures [HALS85], I-structures [ARVI86], streams [ARVI78] [WADG85], lazy evaluation, and active messages [WALL82].
Experiments with the efficiency and speed-up of algorithms written in MDC are reported in other papers [CHRI90b] [CHRI90c] [WOHL90a] [WOHL90b], but we will summarize a few of our results under Location and Grain size..

2.3 LLMDC/C
  Here we describe some of the features of the language LLMDC/C (low-level message driven computing built atop C). The intent of many of the features will not become apparent until the section on programming techniques, where it will be shown how to implement various control and data structures in LLMDC/C.

Symbols.In LLMDC, the analog to a pointer is a symbol. Symbols are unique values, generated by calls to the function Symbol, that are used to construct names of locations. The symbol corresponds to the name of a potentially distributed object. LLMDC/C has a data type SYMBOL that contains symbol values.

Locations.A location is a named place to which messages are sent and at which computation occurs. A location has a name composed of two fields
[S,X]
where S is a symbol and X (the "index") is an unsigned long integer. The data type LOCUS carries location names.

Classes.Symbol are divided into classes. The function to generate a symbol is provided a class name and creates a symbol of the specified class.
Symbol(class_id)
A location has the same class as the symbol component of its name.

Locate.On a multicomputer, locations are assigned to nodes so that a location with a particular name will only occupy one particular node of the computer. Similarly, in our multiprocessing implementations, a location will occupy a particular process. In the absence of any other specifications, a location name will be hashed to determine the number of the node it occupies. For some algorithms, better layout can be achieved by specifying explicitly the formula to calculate a location's assignment. The locate declaration
locate class_id @ expr ;
will use the formula expr to determine where a location of class class_id will be assigned. Within the expression, the two fields of the location name are referred to as S and X, and the location name as a whole by locus.

Messages.Messages are represented as C structures. The message declaration
message id { members } ;
is translated into (inter al.)
typedef struct id_S { members } *id;
That is, id is made a pointer type through which messages bodies may be referenced.
Further, a macro "new_id" is declared which will allocate and initialize a message of
type id.
The members are allowed to be scalars and arrays, but pointer and user-declared struct and union types are not permitted.
There are two other forms of message declarations:
message id { } ;
message id1 = id2 ;

The first creates a message with no fields, commonly referred to as a signal. The second creates a message id1 which is equivalent to message id2.

Behavior declarations.A collection of messages at a location which matches a pattern can trigger a computational event, the execution of a behavior. A behavior is declared
behavior pattern =>
{ procedure body }

There are three types of pattern elements.

  1. msg_type id1, id2, ..., idn;
    which declares parameters id1, id2, ..., idn to be messages of type msg_type. This pattern element will match only if there are at least n messages of type msg_type present at the location.
  2. msg_type relop num ;
    which will match only if the number of messages of type msg_type present at the location bears the relationship relop to the number num. Relop may be any of ==, !=, >, <,>=, or <=.
  3. precedence num
    which gives the precedence or priority of the behavior. It is used to arbitrate when several patterns match. A matching behavior with a higher precedence will be executed rather than a lower precedence behavior. The higher the value of num, the higher is the precedence of the behavior.

Within a behavior, the parameters are the identifiers declared by type (1) patterns and a special parameter here, the name of the location at which the behavior is executing.
There is mutual exclusion among the behaviors at a location; at most one may be executing at any one time.

Class declarations. In addition to its use in locate declarations, the class of a location also restricts the set of behaviors that may execute there. The declaration
class class_id1, class_id2, ...,class_idn :
declares class_id1, class_id2, ..., class_idn to be class names and declares the behaviors following the declaration up to the next class declaration or the end of the program may execute at locations of any of those classes.

Send.Within a behavior, a message may be sent to a destination by a statement of the form
send message_expr to dest ;

LeaveA message may be sent back to the same location at which the behavior is executing by the leave statement
leave message_expr ;
The leave statement may be more efficient than a send to oneself. (Although it is not, in the current implementation.) The major difference between send and leave is that leave guarantees that the message will be present at the location immediately after the behavior finishes executing and thus is guaranteed to be able to influence subsequent pattern matches.

Casts.The message expression in the send and the leave statements may specify a cast which will convert the message to a structurally equivalent type. E.g.
leave (msg_type) x ;
will convert the message pointed to by x to type msg_type and leave it at the current location.
A run-time error is reported if the new type of the message is not structurally equivalent to the previous type. Two types are structurally equivalent if they have the same number of members, and the members have the same types and occur in the same order.

PORTAL.The data type PORTAL contains a location name, L, and a message type, T. When a PORTAL is used as a destination of a send statement, the message is converted to message type T and sent to location L. A PORTAL expression has one of the forms
msg_type @ e
msg_type @ [ e1 , e2 ]

Forms of send.There are different forms of the send statement depending on whether
the message is being sent to a LOCUS or to a PORTAL. If the destination is a LOCUS,
the forms are
send message_expr to [ e1 , e2 ] ;
send message_expr to locus e ;

If the destination is a PORTAL, the forms are
send message_expr to e ;
send message_expr to msg_type @ e ;
send message_expr to
msg_type @ [ e1 , e2 ] ;

Delay and undelayBecause message passing is asynchronous, messages may arrive out of order. A behavior may delay processing a message by executing a delay statement
delay message_expr ;
which behaves like a leave, except that the message no longer participates in pattern matches. Execution of an undelay statement
undelay ;
in a subsequent behavior execution will result in all the delayed messages again participating in pattern matches.

Initial behavior.An initial behavior starts the execution of an MDC program by sending messages. It's syntax is
initial behavior => { body }

2.4 Programming Techniques
  Actors.Perhaps the best way to describe Actors is to describe an implementation. An Actor is a mailbox with a message queue, a procedure pointer, and an argument list attached. When the queue is non-empty, the scheduler can call the procedure passing it the argument list and the first message from the queue. The procedure can do three things in addition to the usual calculations. 1) It can create a new actor by supplying a procedure and an initial argument list, getting a pointer to its mail box. 2) It can send messages to those actors to whose mailboxes it has pointers. 3) It can change itself into a new actor by specifying a new procedure or argument list. (The argument list is called an "acquaintance list," and the messages are called "tasks.")
Actors may be approximated in LLMDC/C by the following translations:
A mailbox is a location, [s,0], where the symbol s may be considered the pointer to the mailbox. The index field is unused but must be consistently some value, here chosen to be zero.
The procedure pointer and argument list are combined into a message; the message type corresponds to the procedure pointer and the fields of the message to the arguments. The body of the procedure is broken up into several LLMDC/C behaviors, one for each type of message that may be received. These behaviors have the form:
behavior actorType self;
messageType msg; =>
{ code }

The messages (tasks) in Actors also translate into MDC messages. To avoid errors, the MDC message types should be partitioned into actor messages and task messages.
Creating a new actor involves creating a new symbol, s, for its location name, allocating a message to represent the initial procedure, filling in the fields of that message with the initial parameter list, and sending the message to the location [s,0].
Sending a message in Actors corresponds to sending a message in MDC. To become a new actor, the behavior leaves a new actor message in place of the previous one. To remain the same actor, the behavior leaves the same input message (self). To terminate, the behavior does not leave any actor message.
This translation does not completely imitate Actors. In Actors, the incoming messages must be processed in the same order they arrive. In our translation, they might not be. In Actors, therefore, the procedure must be able at any time to handle any task message.
This is often quite difficult. For example, the actor A calls a subroutine by creating an actor B and sending it a task. Actor A cannot respond to other messages until B returns (i.e. B's response message arrives). But A cannot read the response until it handles the messages waiting on its queue ahead of the response. The solution to this problem in Actors is to create a queue (composed of more actors) to which the incoming tasks are sent. The solution in MDC is to leave an actor message that will only trigger a behavior when the response message comes. That behavior will then leave an actor message that will handle the other tasks.
There are ways to guarantee identical behavior to Actors, but given a choice, would a programmer restrict him- or herself to Actors?
Distributed arrays.When using Actors, arrays must be built out of linked structures,
which can require a great deal of ingenuity just to set up. (See for example, the Cantor User Report[ATHA86].) In MDC, one would simply create arrays of locations using the index field of the location names. In MDC, the symbol component of the location name corresponds to the name of the distributed object; the index field selects the specific component.
Communicating processes.The fundamental technique for simulating processes is representing activation records as messages. The fields of the message correspond to local variables, the return address, and other information. One way to represent the program counter is by the message type. One rewrites the code for the control flow program with explicit goto's and labels on each basic block, and creates a message type for each label which is structurally equivalent to the activation record message.
message act_rec { variables etc. };
message labeli = act_rec ;

Create one behavior for each labeled section of code which will trigger the execution of the code without requiring any other messages be present.
behavior labeli state ; =>
{ code_blocki }
Translate each "goto labelj" into a leave statement.
leave (labelj) state ;
The translation of input and output commands can be quite complex for many of the CSP derivative languages. Here we only outline a translation of asynchronous message passing with direct naming of the destination and guards only on input commands, not on output.
An output command is translated directly into a send statement. Consider a guarded input command of the form
[
B1 & s1 ? m1(vars1) => C1 |
B2 & s2 ? m2(vars2) => C2 |
...
Bn & sn ? mn(varsn) => Cn
]
where Bi is a Boolean guard, si is a source, mi is a message type, and Ci is a command. If label L is associated with the entry to this command, LL with the exit, and each message type, mi, occurs in only one guard, generate behavior definitions
behavior L state; mi msg; => {
if (Bi &&
msg->source is si ) {
Ci
undelay;
leave (LL) state;
} else {
delay msg;
leave state;
}
}
Messages must contain a field, source, that names the sender. The behavior triggered by a message first checks to see if the corresponding Boolean guard is satisfied and if the
sender is correct. The message is delayed if either of those checks fail. If they succeed,
the corresponding code is executed and all messages which may have been delayed are
undelayed for further processing.
If more than one clause has the same message type, they are inserted into the same
behavior with a series of else-if's.

  Procedure calls. To call a procedure, the caller creates a location name at which the procedure is to execute and sends off a call message to trigger the starting behavior of the procedure. The call message contains the parameter list and return address. The caller leaves its continuation.
The best form for a return address is a PORTAL containing the location to which the results are to be sent and the message type expected. The called procedure packs up the results in a message and sends them to the portal. The system converts the message to the type the caller expects which must be structurally equivalent. Since the type of the result message will be pattern matched to trigger a continuation in the caller, it should be up to the caller to specify the message type, but the called procedure should not have to have separatecode to handle each resulting type.
This implementation of procedure calls permits the implementation of remote procedure calls. A remote procedure call is a procedure call that executes in another process having access to shared variables. The call message is sent to the location of another process where the shared variables are stored in one or more messages. The pattern on the behavior for the remote procedure matches the call message and the shared variable messages.
Dataflow graphs.We have used the following code for dynamic dataflow graphs:
message add { PORTAL dst;};
message lopnd {long v;};
message ropnd = lopnd;
message result {long rslt;};

behavior add a; lopnd left; ropnd right; =>
{result r = new_result;
r->rslt = left->v + right->v;
send r to a->dst;
}

The add message corresponds to an add dataflow instruction. The field dst represents the destination to receive the result. The behavior is triggered when an add instruction is present at the same location as both a left and a right operand (lopnd and ropnd). The behavior allocates a new result message, fills it with the sum of the operands, and sends it to the destination contained in the add message. When the result message is sent, it is converted into the message type contained in the PORTAL a->dst, which will very likely be lopnd or ropnd.

Broadcast.It is often necessary to broadcast a message to each of an array of locations, e.g. to start an algorithm. The following are four techniques we have used:

  1. A behavior loops allocating messages and sending them to each location. This does not make use of parallel processing and can flood the communication system on a multicomputer.
  2. If the locations are numbered 1, 2, ..., n, a broadcast is sent to location 1 which sends to 2 and 3, 2 sends to 4 and 5, 3 sends to 6 and 7, and in general, i sends to 2i and 2i+1, if they exist.
  3. If the locations are numbered m, m+k, m+2k, ..., n, a broadcast to the sequence is recursively broken down into broadcasts to the two sequences, i) m, m+2k, m+4k, ..., and ii) m+k, m+3k, .... If m=0, k=2J;, and a proper locate declaration has been provided, this technique uses neighbor connections on a hypercube computer.
  4. If the locations are numbered 0, 1, ..., n, a broadcast can use the edges of a multiway tree with location 0 as the root node and the parent of location j>0 being j-lowbit(j). On a twos complement machine, lowbit(j) = j & -j, that is, the number formed by clearing all but the rightmost bit of j. When broadcasting to the children, location j sends messages to locations j+2k such that j+2k <= n, and 2k < lowbit(j). This broadcast can start from anywhere in the tree and can use the neighbor connections on a hypercube computer given proper locate declarations.

Accumulation.It is often necessary to accumulate information from an array of locations, indeed most often simply the information that they have finished some processing. The two methods we have most often used to accumulate information are

  1. Building a dataflow graph to accumulate the information. The graph may be built during a broadcast congruent to the broadcast tree (but with the edges pointing "up").
  2. Passing the information among the locations rootwards along the multiway tree described in broadcasting method (d).

Futures, I-structures and Streams. A future [HALS85] may be considered a single- assignment variable connecting a producer and a consumer. An I-structure [ARVI80] [ARVI86] is an aggregate, e.g. an array, of futures. The term "I-structure" is derived from "incremental structure," a structure that becomes defined incrementally as its components are assigned. The consumer may proceed with other work and be delayed upon attempting to access the variable later if the variable has not yet been assigned. We frequently use futures and I-structures in our code. A future is a location to which a value will be sent. A message attempting to access the value goes to the location where it waits for the value to arrive.
During execution in our FP implementation, we create dataflow graphs linked by futures. In the lenient version, the sequences (FP's equivalent of lists) are I-structures. The most common form of I-structure we use is the stream [ARVI78]. A stream is an I-structure in the form of a one dimensional array. The elements are assigned approximately in sequential order and consumed approximately in the same order. A special value (we use the empty message eos) marks the end of the stream. Streams provide an unbounded buffer between activities and are used for pipelining.
In lazy streams, a suspension is placed in the stream. An attempt to access the suspension will trigger further generation of stream elements. This permits some bounding of the buffer the stream represents.
We use streams in out output package and in various algorithms such as merge sort.

Active messages. We have found the most convenient way to process streams is with active messages, or traveling activations, wherein an activation record message (the representation of a process) does not remain at a single location fetching data to itself, but rather moves from location to location.

Demand-driven.In demand-driven programming, the demand for, or attempt to use, a value causes its computation to be initiated. Lazy streams are one example of demand-driven programming. Similarly, our lazy FP implementation leaves suspensions at locations which represent futures in the lenient and strict implementations.
Interestingly enough, it is not always necessary to have a suspension message waiting at a location. Suppose that there is a way to uniquely assign the computation of a function applied to particular arguments to a single location. When one wishes to compute the function, one sends the call message to the assigned location. If the value has already been computed, it will be sitting there in a message, and the call will simply return it.
If the value is already being computed, but has not arrived yet, there will be a please_wait message, and the call message waits for the value to be computed. If there is neither a value nor a please_wait message, the call leaves a please_wait and starts evaluation of the function.
The code fragments are:
message value{...};
message please_wait{};
message demand{PORTAL dst;};
behavior demand d;
value v; => {
leave v;
send CopyMsg(v)
to d->dst;
}
behavior demand d;
value==0;
please_wait==0; => {
leave new_please_wait;
...start computing function...
}
behavior value v;
please_wait w; => {
leave v;
}

Grain size. We have learned that the grain size of computations is highly significant. Grain size refers to the amount of computation performed by each computational event.
If the grain size is too small, the system overhead is too large a proportion of the execution time. If the grain size is too large, there are likely to be opportunities for parallelism being missed.
Grain size is adjusted by adjusting the amount of data processed in a computational event, which entails keeping arrays of data in messages - the larger the array, the more data to be processed.
We ran a fine grained version of bitonic sort on 8192 elements and a coarse grained version with blocks of sizes two through 1024 in powers of two [CHRI90b]. We observed a difference of over 9000% in the run time of a very fine grained (scalar field) version of the algorithm over a version with 256 element blocks. We have observed similarly large differences in other algorithms [CHRI90c].

Layout. The locate declaration allows some control of the layout of locations over the nodes of a multicomputer. We have not yet run experiments on a genuine multicomputer to find out how significant location is, but we have run experiments on shared-memory machines. The messages are allocated in shared memory, but the locations are in the private heaps of processes. Messages are sent between processes through shared linked queues.
We ran a fine-grained version of bitonic sort with 1024 locations each handling one element of the array. We observed the worst possible compared to best possible gave about a 16% difference. Hashed placement compared to the best possible placement gave only about a 5% difference[CHRI90b].

Partial ordering of time. One great difference from conventional programming is that time is only partially ordered. Computational event x precedes computational event y only if there is a chain of message transmissions and behavior executions from x to y.
One of the difficulties in learning to program in a message driven style is learning to pay attention to this. It leads to a number of other problems. Termination detection is a problem; there is no way to instantaneously examine the state of the system and observe there is no more work to do. In fact, messages that have been handed over to the communications system on a multicomputer are not visible until they arrive. Termination of algorithms can be detected by, for example, arranging for acknowledgements on all operations and building acknowledgement trees to determine when an aggregate operation is complete, but it does require explicit programming.
Since we cannot capture global snapshots, debugging can be a problem. We use trace debugging. Typically we first check the input/output behavior of behaviors, then we look for orphaned messages (produced but never consumed) and unexpected location names, and finally we resort to following chains of messages and behaviors.

2.5 Discussion
  Message Driven Computing offers a number of advantages in programming multicomputers. Although it bears a family resemblance to Actors languages, it is considerably more pleasant to program in. The computable location names allow distributed arrays to be allocated without the difficulties of encoding them in linked structures. Pattern matching in behavior headers and through the delay statement allows messages to be handled in the order they are needed, rather than the order in which they arrive.
The conversion of message types among equivalence classes is both convenient and efficient. It is quite important for building dataflow graphs, passing procedure continuations, and handling state changes in processes.
Array fields within messages allow us to adjust the grain size of computations. We observe that fine-grained computations are horrendously expensive even on shared memory machines which have low message-passing overhead.
The locate declaration gives a degree of control over the layout of distributed data and process structures.

The automatic creation of locations upon message arrival is useful for building I-structures and for demand-driven programming.
MDC also allows the mixture of multiple parallel programming paradigms within the same program. Indeed, we often have locations metamorphose from Actors to processes and back to Actors.
This multiparadigm programming comes from the low-level nature of MDC. We are not separated that far from the important features of multicomputers: the outer message-passing level of algorithms versus the inner control-flow level, the partial ordering of time, the influence of layout and grain size. The cost, as with any low-level language, is the amount of detail required to be expressed.

2.6 Acknowledgements
  We are porting OOMDC/C to several shared-memory parallel processors and running experiments at the Advanced Computing Research Facility, Mathematics and Computer Science Division, Argonne National Laboratory, whose provision of computing facilities we gratefully acknowledge. NSF has provided us a 16-processor Ncube/7 through grant CDA-8820762 upon which we are planning to run a number of experiments.
2.7 References
 
  • ACKE82 Ackerman, W. B., "Data Flow Languages," Computer, Feb. 1982, pp. 15-25.
  • AGHA86 Agha, G. A.; Actors: A Model of Concurrent Computation in Distributed Systems, MIT Press, Cambridge, MA., 1986.
  • AMEI89 Ameiss, David, FP In MDC/C, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1989.
  • ARVI78 Arvind, Gostelow, K. P., Plouffe, W. An Asynchronous Programming Language and Computing Machine, Department of Information and Computer Science, University of California, Irvine, Irvine, CA 92717, Dec. 8, 1978.
  • ARVI80 Arvind, Thomas, R. E., I-Structures: An Efficient Data Type for Functional Languages, MIT Laboratory for Computer Science Technical Manual TM-178, MIT, Cambridge, MA, Sept. 1980.
  • ARVI86 Arvind, R. S. Nikhil, K. K. Pingali, "I-Structures: Data Structures for Parallel Computing," in Fasel, Joseph H. and Robert M. Keller (ed), Graph Reduction, Proceedings of a workshop in Sante Fe, New Mexico, Sep./Oct. 1986, Lecture Notes in Computer Science 279, Springer-Verlag, 1987.
  • ATHA86 Athas, W. C., and Seitz, C. L., Cantor User Report, Version 2.0, 5232:TR:86, Computer Science Department, California Institute of Technology, Pasadena, CA, 91125, January, 1987.
  • ATHA87 Athas, William C., Fine Grain Concurrent Computations, Ph. D. Dissertation, 5242:TR:87, California Institute of Technology, Pasadena, California, May, 1987.
  • ATHA88 Athas, William C. and Charles L. Seitz, "Multicomputers: Message-Passing Concurrent Computers," IEEE Computer, August, 1988.
  • BACK78 Backus, J.; "Can Programming be Liberated from the von Neumann Style," Commun. ACM, 21,8, August 1978, pp. 613-641.
  • BRIN87a Brinch Hansen, P., "Joyce - A Programming Language for Distributed Systems," Software - Practice and Experience 17,1, Jan., 1987.
  • BRIN87b Brinch Hansen, P., "A Joyce Implementation," Software - Practice and Experience 17,4, Apr., 1987.
  • CHRI89 Christopher, T. W., OOMDC/C Language Definition, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1989.
  • CHRI90a Christopher, T. W., and Freitag, Gregory A. LLMDC/C Language Description, Version 1.0, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1990.
  • CHRI90b Christopher, T. W., Experiments in Multicomputing With Bitonic Sort, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1990. Unpublished.
  • CHRI90c Christopher, T. W. "Exploration of the Limits That Grain Size Imposes on the Speed-Up and Efficiency of Two Transitive Closure Algorithms," Fourth Annual Parallel Processing Symposium, Orange County IEEE, April 4-6, 1990.
  • DAVI82 Davis, A. L., Keller, R. M., "Data Flow Program Graphs," Computer, Feb. 82, pp. 26-41.
  • DENN80 Dennis, J. B., "Data Flow Architectures," Computer, Nov. 1980, pp. 48- 56.
  • HALS85 Halstead, Robert H., Jr., "Multilisp: A Language for Concurrent Symbolic
    Computation," ACM Transactions on Programming Languages and Systems, 7:4, October, 1985.
  • HOAR78 Hoare, C. A. R.; "Communicating Sequential Processes," Commun. ACM, 21,8 Aug. 1978, pp. 666-677.
  • PUGH73 Pugh, A. L. III, Dynamo II User's Manual, The MIT Press, Cambridge MA, 1973.
  • TREL82 Treleaven, P. C., Brownbridge, D. R., and Hopkins, R. P. "Data-Driven and Demand-Driven Computer Architecture," ACM Computing Surveys, 14,1, March, 1982.
  • TREL84 Treleaven, P. C., Lima, I. G. "Future Computers: Logic, Data Flow, ..., Control Flow?" Computer, March, 1984.
  • VEEN86 Veen, A. H.; "Data Flow Machine Architecture," ACM Computing Surveys, 18,4, Dec. 1986, pp. 365-396.
  • WADG85 Wadge, William W., Ashcroft, Edward A. Lucid, the Dataflow Programming Language, APIC Studies in Data Processing No. 22, Academic Press, 1985.
  • WALL82 Wall, D. W. "Messages As Active Agents," 9th Annual ACM Symposium on Principles of Programming Languages, Jan., 1982.
  • WOHL90a Wohl, Peter, and T. W. Christopher, "A Fine Grained Neural Net Simulator Encoded in Coarse Grained OOMDC," Fourth Annual Parallel Processing Symposium, Orange County IEEE, April 4-6, 1990.
  • WOHL90b Wohl, Peter, and T. W. Christopher, "SIMD Neural Net Mapping on MIMD Architectures," International Conference on Parallel Processing, August, 1990.

Copyright ©1994. Thomas W. Christopher