| 2.1 |
Abstract |
|
A model for parallel computing, Message Driven Computing, is presented, which is
based on distributed memory and pattern-driven control. One implementation of that
model, LLMDC/C, is presented, and it is shown that a wide variety of parallel programming
techniques are made possible. These techniques include Actors, communicating
processes, active messages, dataflow, demand-driven programming, futures, I-structures,
streams, broadcast and accumulation.
|
| 2.2 |
Introduction |
|
Parallel computing models may be categorized by whether they use centralized or distributed
data storage and by whether they use control-driven, data-driven, demand-
driven, or pattern-driven control [TREL82] [TREL84]. Message Driven Computing
(MDC) is based on distributed data and pattern-driven control. The term Message
Driven Computing is also used by Athas [ATHA87] for a similar underlying model,
although his elaboration of that model differs greatly from ours.
In MDC (hereafter, as we use the term), computation takes place as computational
events at abstract locations which have structured names.
The only communication between computational events is by means of message passing
- both between locations and at the same location. A collection of messages at a location
which match a pattern can trigger a computational event that will consume some or
all of those messages. There is mutual exclusion among the computational events at a
location.
Message Driven Computing is an unconventional computational model, and in order to
justify interest in MDC, it must be shown possible to write programs in it and to run
those programs efficiently. Most of this paper will be devoted to describing a particular
implementation of MDC and describing MDC programming techniques. These programming
techniques are derived from experience with four Message Driven Computing
languages. Students have implemented programs that include an LALR(1) parser
generator, implementations of FP [AMEI89][BACK78] (three in fact, with strict,
lenient, and lazy evaluation), of Lucid [WADG85], of Joyce [BRIN87a][BRIN87b], and
of Dynamo II [PUGH73]. We have numerous implementations of parallel algorithms,
and have efficient simulations of neural networks [WOHL90a] [WOHL90b].
The techniques we use include Actors [AGHA86] [ATHA86], communicating processes
[HOAR78] [BRIN87a] [BRIN87b], dataflow [ACKE82] [ARVI78] [DAVI82] [DENN80] [TREL82] [VEEN86],
futures [HALS85], I-structures [ARVI86], streams [ARVI78] [WADG85], lazy evaluation, and
active messages [WALL82].
Experiments with the efficiency and speed-up of algorithms written in MDC are
reported in other papers [CHRI90b] [CHRI90c] [WOHL90a] [WOHL90b], but we will
summarize a few of our results under Location and Grain size..
|
| 2.3 |
LLMDC/C |
|
Here we describe some of the features of the language LLMDC/C (low-level message
driven computing built atop C). The intent of many of the features will not become
apparent until the section on programming techniques, where it will be shown how to
implement various control and data structures in LLMDC/C.
Symbols.In LLMDC, the analog to a pointer is a symbol. Symbols are unique values,
generated by calls to the function Symbol, that are used to construct names of locations.
The symbol corresponds to the name of a potentially distributed object. LLMDC/C has
a data type SYMBOL that contains symbol values.
Locations.A location is a named place to which messages are sent and at which
computation occurs. A location has a name composed of two fields
[S,X]
where S is a symbol and X (the "index") is an unsigned long integer. The data type
LOCUS carries location names.
Classes.Symbol are divided into classes. The function to generate a symbol is provided
a class name and creates a symbol of the specified class.
Symbol(class_id)
A location has the same class as the symbol component of its name.
Locate.On a multicomputer, locations are assigned to nodes so that a location with a
particular name will only occupy one particular node of the computer. Similarly, in our
multiprocessing implementations, a location will occupy a particular process. In the
absence of any other specifications, a location name will be hashed to determine the
number of the node it occupies. For some algorithms, better layout can be achieved by
specifying explicitly the formula to calculate a location's assignment. The locate declaration
locate class_id @ expr ;
will use the formula expr to determine where a location of class class_id will be
assigned. Within the expression, the two fields of the location name are referred to as S
and X, and the location name as a whole by locus.
Messages.Messages are represented as C structures. The message declaration
message id { members } ;
is translated into (inter al.)
typedef struct id_S { members } *id;
That is, id is made a pointer type through which messages bodies may be referenced.
Further, a macro "new_id" is declared which will allocate and initialize a message of
type id.
The members are allowed to be scalars and arrays, but pointer and user-declared struct
and union types are not permitted.
There are two other forms of message declarations:
message id { } ;
message id1 = id2 ;
The first creates a message with no fields, commonly referred to as a signal. The second
creates a message id1 which is equivalent to message id2.
Behavior declarations.A collection of messages at a location which matches a pattern
can trigger a computational event, the execution of a behavior. A behavior is declared
behavior pattern =>
{ procedure body }
There are three types of pattern elements.
- msg_type id1, id2, ..., idn;
which declares parameters id1, id2, ..., idn to be messages of type msg_type. This
pattern element will match only if there are at least n messages of type msg_type
present at the location.
- msg_type relop num ;
which will match only if the number of messages of type msg_type present at the
location bears the relationship relop to the number num. Relop may be any of ==, !=,
>, <, >=, or <=.
- precedence num
which gives the precedence or priority of the behavior. It is used to arbitrate when
several patterns match. A matching behavior with a higher precedence will be executed
rather than a lower precedence behavior. The higher the value of num, the
higher is the precedence of the behavior.
Within a behavior, the parameters are the identifiers declared by type (1) patterns and a
special parameter here, the name of the location at which the behavior is executing.
There is mutual exclusion among the behaviors at a location; at most one may be executing
at any one time.
Class declarations. In addition to its use in locate declarations, the class of a location
also restricts the set of behaviors that may execute there. The declaration
class class_id1, class_id2, ...,class_idn :
declares class_id1, class_id2, ..., class_idn to be class names and declares the behaviors
following the declaration up to the next class declaration or the end of the program may
execute at locations of any of those classes.
Send.Within a behavior, a message may be sent to a destination by a statement of the form
send message_expr to dest ;
LeaveA message may be sent back to the same location at which the behavior is executing
by the leave statement
leave message_expr ;
The leave statement may be more efficient than a send to oneself. (Although it is not, in
the current implementation.) The major difference between send and leave is that leave
guarantees that the message will be present at the location immediately after the behavior
finishes executing and thus is guaranteed to be able to influence subsequent pattern
matches.
Casts.The message expression in the send and the leave statements may specify a cast
which will convert the message to a structurally equivalent type. E.g.
leave (msg_type) x ;
will convert the message pointed to by x to type msg_type and leave it at the current
location.
A run-time error is reported if the new type of the message is not structurally equivalent
to the previous type. Two types are structurally equivalent if they have the same number
of members, and the members have the same types and occur in the same order.
PORTAL.The data type PORTAL contains a location name, L, and a message type, T.
When a PORTAL is used as a destination of a send statement, the message is converted
to message type T and sent to location L. A PORTAL expression has one of the forms
msg_type @ e
msg_type @ [ e1 , e2 ]
Forms of send.There are different forms of the send statement depending on whether
the message is being sent to a LOCUS or to a PORTAL. If the destination is a LOCUS,
the forms are
send message_expr to [ e1 , e2 ] ;
send message_expr to locus e ;
If the destination is a PORTAL, the forms are
send message_expr to e ;
send message_expr to msg_type @ e ;
send message_expr to
msg_type @ [ e1 , e2 ] ;
Delay and undelayBecause message passing is asynchronous, messages may arrive
out of order. A behavior may delay processing a message by executing a delay statement
delay message_expr ;
which behaves like a leave, except that the message no longer participates in pattern
matches. Execution of an undelay statement
undelay ;
in a subsequent behavior execution will result in all the delayed messages again participating
in pattern matches.
Initial behavior.An initial behavior starts the execution of an MDC program by sending
messages. It's syntax is
initial behavior => { body }
|
| 2.4 |
Programming Techniques |
|
Actors.Perhaps the best way to describe Actors is to describe an implementation. An
Actor is a mailbox with a message queue, a procedure pointer, and an argument list
attached. When the queue is non-empty, the scheduler can call the procedure passing it
the argument list and the first message from the queue. The procedure can do three
things in addition to the usual calculations. 1) It can create a new actor by supplying a
procedure and an initial argument list, getting a pointer to its mail box. 2) It can send
messages to those actors to whose mailboxes it has pointers. 3) It can change itself into
a new actor by specifying a new procedure or argument list. (The argument list is called
an "acquaintance list," and the messages are called "tasks.")
Actors may be approximated in LLMDC/C by the following translations:
A mailbox is a location, [s,0], where the symbol s may be considered the pointer to the
mailbox. The index field is unused but must be consistently some value, here chosen to
be zero.
The procedure pointer and argument list are combined into a message; the message type
corresponds to the procedure pointer and the fields of the message to the arguments. The
body of the procedure is broken up into several LLMDC/C behaviors, one for each type
of message that may be received. These behaviors have the form:
behavior actorType self;
messageType msg; =>
{ code }
The messages (tasks) in Actors also translate into MDC messages. To avoid errors, the
MDC message types should be partitioned into actor messages and task messages.
Creating a new actor involves creating a new symbol, s, for its location name, allocating
a message to represent the initial procedure, filling in the fields of that message with the
initial parameter list, and sending the message to the location [s,0].
Sending a message in Actors corresponds to sending a message in MDC. To become a
new actor, the behavior leaves a new actor message in place of the previous one. To
remain the same actor, the behavior leaves the same input message (self). To terminate,
the behavior does not leave any actor message.
This translation does not completely imitate Actors. In Actors, the incoming messages
must be processed in the same order they arrive. In our translation, they might not be. In
Actors, therefore, the procedure must be able at any time to handle any task message.
This is often quite difficult. For example, the actor A calls a subroutine by creating an
actor B and sending it a task. Actor A cannot respond to other messages until B returns
(i.e. B's response message arrives). But A cannot read the response until it handles the
messages waiting on its queue ahead of the response. The solution to this problem in
Actors is to create a queue (composed of more actors) to which the incoming tasks are
sent. The solution in MDC is to leave an actor message that will only trigger a behavior
when the response message comes. That behavior will then leave an actor message that
will handle the other tasks.
There are ways to guarantee identical behavior to Actors, but given a choice, would a
programmer restrict him- or herself to Actors?
Distributed arrays.When using Actors, arrays must be built out of linked structures,
which can require a great deal of ingenuity just to set up. (See for example, the Cantor
User Report[ATHA86].) In MDC, one would simply create arrays of locations using the
index field of the location names. In MDC, the symbol component of the location name
corresponds to the name of the distributed object; the index field selects the specific
component.
Communicating processes.The fundamental technique for simulating processes is
representing activation records as messages. The fields of the message correspond to local
variables, the return address, and other information. One way to represent the program
counter is by the message type. One rewrites the code for the control flow program with
explicit goto's and labels on each basic block, and creates a message type for each label
which is structurally equivalent to the activation record message.
message act_rec { variables etc. };
message labeli = act_rec ;
Create one behavior for each labeled section of code which will trigger the execution of
the code without requiring any other messages be present.
behavior labeli state ; =>
{ code_blocki }
Translate each "goto labelj" into a leave statement.
leave (labelj) state ;
The translation of input and output commands can be quite complex for many of the
CSP derivative languages. Here we only outline a translation of asynchronous message
passing with direct naming of the destination and guards only on input commands, not
on output.
An output command is translated directly into a send statement. Consider a guarded
input command of the form
[
B1 & s1 ? m1(vars1) => C1 |
B2 & s2 ? m2(vars2) => C2 |
...
Bn & sn ? mn(varsn) => Cn
]
where Bi is a Boolean guard, si is a source, mi is a message type, and Ci is a command.
If label L is associated with the entry to this command, LL with the exit, and each message
type, mi, occurs in only one guard, generate behavior definitions
behavior L state; mi msg; => {
if (Bi &&
msg->source is si ) {
Ci
undelay;
leave (LL) state;
} else {
delay msg;
leave state;
}
}
Messages must contain a field, source, that names the sender. The behavior triggered by
a message first checks to see if the corresponding Boolean guard is satisfied and if the
sender is correct. The message is delayed if either of those checks fail. If they succeed,
the corresponding code is executed and all messages which may have been delayed are
undelayed for further processing.
If more than one clause has the same message type, they are inserted into the same
behavior with a series of else-if's.
|
|
Procedure calls. To call a procedure, the caller creates a location name at which the
procedure is to execute and sends off a call message to trigger the starting behavior of
the procedure. The call message contains the parameter list and return address. The
caller leaves its continuation.
The best form for a return address is a PORTAL containing the location to which the
results are to be sent and the message type expected. The called procedure packs up the
results in a message and sends them to the portal. The system converts the message to
the type the caller expects which must be structurally equivalent. Since the type of the
result message will be pattern matched to trigger a continuation in the caller, it should
be up to the caller to specify the message type, but the called procedure should not have
to have separatecode to handle each resulting type.
This implementation of procedure calls permits the implementation of remote procedure
calls. A remote procedure call is a procedure call that executes in another process having
access to shared variables. The call message is sent to the location of another process
where the shared variables are stored in one or more messages. The pattern on the
behavior for the remote procedure matches the call message and the shared variable
messages.
Dataflow graphs.We have used the following code for dynamic dataflow graphs:
message add { PORTAL dst;};
message lopnd {long v;};
message ropnd = lopnd;
message result {long rslt;};
behavior add a; lopnd left; ropnd right; =>
{result r = new_result;
r->rslt = left->v + right->v;
send r to a->dst;
}
The add message corresponds to an add dataflow instruction. The field dst represents
the destination to receive the result. The behavior is triggered when an add instruction is
present at the same location as both a left and a right operand (lopnd and ropnd). The
behavior allocates a new result message, fills it with the sum of the operands, and sends
it to the destination contained in the add message. When the result message is sent, it is
converted into the message type contained in the PORTAL a->dst, which will very
likely be lopnd or ropnd.
Broadcast.It is often necessary to broadcast a message to each of an array
of locations, e.g. to start an algorithm. The following are four techniques we have used:
- A behavior loops allocating messages and sending them to each location. This does
not make use of parallel processing and can flood the communication system on a
multicomputer.
- If the locations are numbered 1, 2, ..., n, a broadcast is sent to location 1 which sends
to 2 and 3, 2 sends to 4 and 5, 3 sends to 6 and 7, and in general, i sends to 2i and
2i+1, if they exist.
- If the locations are numbered m, m+k, m+2k, ..., n, a broadcast to the sequence is
recursively broken down into broadcasts to the two sequences, i) m, m+2k, m+4k,
..., and ii) m+k, m+3k, .... If m=0, k=2J;,
and a proper locate declaration has been provided, this technique uses neighbor
connections on a hypercube computer.
- If the locations are numbered 0, 1, ..., n, a broadcast can use the edges of a multiway
tree with location 0 as the root node and the parent of location j>0 being j-lowbit(j).
On a twos complement machine, lowbit(j) = j & -j, that is, the number formed by
clearing all but the rightmost bit of j. When broadcasting to the children, location j sends
messages to locations j+2k such that j+2k <= n, and 2k < lowbit(j). This
broadcast can start from anywhere in the tree and can use the neighbor connections
on a hypercube computer given proper locate declarations.
Accumulation.It is often necessary to accumulate information from an array of locations,
indeed most often simply the information that they have finished some processing.
The two methods we have most often used to accumulate information are
- Building a dataflow graph to accumulate the information. The graph may be built
during a broadcast congruent to the broadcast tree (but with the edges pointing
"up").
- Passing the information among the locations rootwards along the multiway tree
described in broadcasting method (d).
Futures, I-structures and Streams. A future [HALS85] may be considered a single-
assignment variable connecting a producer and a consumer. An I-structure [ARVI80]
[ARVI86] is an aggregate, e.g. an array, of futures. The term "I-structure" is derived
from "incremental structure," a structure that becomes defined incrementally as its components are
assigned. The consumer may proceed with other work and be delayed upon
attempting to access the variable later if the variable has not yet been assigned.
We frequently use futures and I-structures in our code. A future is a location to which a
value will be sent. A message attempting to access the value goes to the location where
it waits for the value to arrive.
During execution in our FP implementation, we create dataflow graphs linked by
futures. In the lenient version, the sequences (FP's equivalent of lists) are I-structures.
The most common form of I-structure we use is the stream [ARVI78]. A stream is an I-structure
in the form of a one dimensional array. The elements are assigned approximately
in sequential order and consumed approximately in the same order. A special
value (we use the empty message eos) marks the end of the stream. Streams provide an
unbounded buffer between activities and are used for pipelining.
In lazy streams, a suspension is placed in the stream. An attempt to access the suspension
will trigger further generation of stream elements. This permits some bounding of
the buffer the stream represents.
We use streams in out output package and in various algorithms such as merge sort.
Active messages. We have found the most convenient way to process streams is with
active messages, or traveling activations, wherein an activation record message (the representation of a
process) does not remain at a single location fetching data to itself, but
rather moves from location to location.
Demand-driven.In demand-driven programming, the demand for, or attempt to use, a
value causes its computation to be initiated. Lazy streams are one example of demand-driven programming.
Similarly, our lazy FP implementation leaves suspensions at locations which represent futures in the
lenient and strict implementations.
Interestingly enough, it is not always necessary to have a suspension message waiting at
a location. Suppose that there is a way to uniquely assign the computation of a function
applied to particular arguments to a single location. When one wishes to compute the
function, one sends the call message to the assigned location. If the value has already
been computed, it will be sitting there in a message, and the call will simply return it.
If the value is already being computed, but has not arrived yet, there will be a
please_wait message, and the call message waits for the value to be computed. If there
is neither a value nor a please_wait message, the call leaves a please_wait and starts
evaluation of the function.
The code fragments are:
message value{...};
message please_wait{};
message demand{PORTAL dst;};
behavior demand d;
value v; => {
leave v;
send CopyMsg(v)
to d->dst;
}
behavior demand d;
value==0;
please_wait==0; => {
leave new_please_wait;
...start computing function...
}
behavior value v;
please_wait w; => {
leave v;
}
Grain size. We have learned that the grain size of computations is highly significant.
Grain size refers to the amount of computation performed by each computational event.
If the grain size is too small, the system overhead is too large a proportion of the execution time.
If the grain size is too large, there are likely to be opportunities for parallelism
being missed.
Grain size is adjusted by adjusting the amount of data processed in a computational
event, which entails keeping arrays of data in messages - the larger the array, the more
data to be processed.
We ran a fine grained version of bitonic sort on 8192 elements and a coarse grained version
with blocks of sizes two through 1024 in powers of two [CHRI90b]. We observed a
difference of over 9000% in the run time of a very fine grained (scalar field) version of
the algorithm over a version with 256 element blocks. We have observed similarly large
differences in other algorithms [CHRI90c].
Layout. The locate declaration allows some control of the layout of locations over the
nodes of a multicomputer. We have not yet run experiments on a genuine multicomputer
to find out how significant location is, but we have run experiments on shared-memory
machines. The messages are allocated in shared memory, but the locations are in the private heaps of
processes. Messages are sent between processes through shared linked
queues.
We ran a fine-grained version of bitonic sort with 1024 locations each handling one element
of the array. We observed the worst possible compared to best possible gave about
a 16% difference. Hashed placement compared to the best possible placement gave only
about a 5% difference[CHRI90b].
Partial ordering of time. One great difference from conventional programming is that
time is only partially ordered. Computational event x precedes computational event y
only if there is a chain of message transmissions and behavior executions from x to y.
One of the difficulties in learning to program in a message driven style is learning to pay
attention to this. It leads to a number of other problems. Termination detection is a problem;
there is no way to instantaneously examine the state of the system and observe
there is no more work to do. In fact, messages that have been handed over to the communications
system on a multicomputer are not visible until they arrive. Termination of
algorithms can be detected by, for example, arranging for acknowledgements on all
operations and building acknowledgement trees to determine when an aggregate operation
is complete, but it does require explicit programming.
Since we cannot capture global snapshots, debugging can be a problem. We use trace
debugging. Typically we first check the input/output behavior of behaviors, then we
look for orphaned messages (produced but never consumed) and unexpected location
names, and finally we resort to following chains of messages and behaviors.
|
| 2.5 |
Discussion |
|
Message Driven Computing offers a number of advantages in programming multicomputers.
Although it bears a family resemblance to Actors languages, it is considerably
more pleasant to program in. The computable location names allow distributed arrays to
be allocated without the difficulties of encoding them in linked structures. Pattern
matching in behavior headers and through the delay statement allows messages to be
handled in the order they are needed, rather than the order in which they arrive.
The conversion of message types among equivalence classes is both convenient and
efficient. It is quite important for building dataflow graphs, passing procedure continuations,
and handling state changes in processes.
Array fields within messages allow us to adjust the grain size of computations. We
observe that fine-grained computations are horrendously expensive even on shared
memory machines which have low message-passing overhead.
The locate declaration gives a degree of control over the layout of distributed data and
process structures.
The automatic creation of locations upon message arrival is useful for building I-structures
and for demand-driven programming.
MDC also allows the mixture of multiple parallel programming paradigms within the
same program. Indeed, we often have locations metamorphose from Actors to processes
and back to Actors.
This multiparadigm programming comes from the low-level nature of MDC. We are not
separated that far from the important features of multicomputers: the outer message-passing
level of algorithms versus the inner control-flow level, the partial ordering of
time, the influence of layout and grain size. The cost, as with any low-level language, is
the amount of detail required to be expressed.
|
| 2.6 |
Acknowledgements |
|
We are porting OOMDC/C to several shared-memory parallel processors and running
experiments at the Advanced Computing Research Facility, Mathematics and Computer
Science Division, Argonne National Laboratory, whose provision of computing facilities we
gratefully acknowledge. NSF has provided us a 16-processor Ncube/7 through
grant CDA-8820762 upon which we are planning to run a number of experiments.
|
| 2.7 |
References |
|
- ACKE82 Ackerman, W. B., "Data Flow Languages," Computer, Feb. 1982, pp. 15-25.
- AGHA86 Agha, G. A.; Actors: A Model of Concurrent Computation in Distributed
Systems, MIT Press, Cambridge, MA., 1986.
- AMEI89 Ameiss, David, FP In MDC/C, Technical Report, Computer Science
Department, Illinois Institute of Technology, Chicago, IL 60616, 1989.
- ARVI78 Arvind, Gostelow, K. P., Plouffe, W. An Asynchronous Programming Language
and Computing Machine, Department of Information and Computer Science,
University of California, Irvine, Irvine, CA 92717, Dec. 8, 1978.
- ARVI80 Arvind, Thomas, R. E., I-Structures: An Efficient Data Type for Functional
Languages, MIT Laboratory for Computer Science Technical Manual TM-178,
MIT, Cambridge, MA, Sept. 1980.
- ARVI86 Arvind, R. S. Nikhil, K. K. Pingali, "I-Structures: Data Structures for Parallel
Computing," in Fasel, Joseph H. and Robert M. Keller (ed), Graph Reduction,
Proceedings of a workshop in Sante Fe, New Mexico, Sep./Oct. 1986, Lecture Notes
in Computer Science 279, Springer-Verlag, 1987.
- ATHA86 Athas, W. C., and Seitz, C. L., Cantor User Report, Version 2.0,
5232:TR:86, Computer Science Department, California Institute of Technology,
Pasadena, CA, 91125, January, 1987.
- ATHA87 Athas, William C., Fine Grain Concurrent Computations, Ph. D. Dissertation,
5242:TR:87, California Institute of Technology, Pasadena, California, May,
1987.
- ATHA88 Athas, William C. and Charles L. Seitz, "Multicomputers: Message-Passing
Concurrent Computers," IEEE Computer, August, 1988.
- BACK78 Backus, J.; "Can Programming be Liberated from the von Neumann
Style," Commun. ACM, 21,8, August 1978, pp. 613-641.
- BRIN87a Brinch Hansen, P., "Joyce - A Programming Language for Distributed
Systems," Software - Practice and Experience 17,1, Jan., 1987.
- BRIN87b Brinch Hansen, P., "A Joyce Implementation," Software - Practice and
Experience 17,4, Apr., 1987.
- CHRI89 Christopher, T. W., OOMDC/C Language Definition, Technical Report,
Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616,
1989.
- CHRI90a Christopher, T. W., and Freitag, Gregory A. LLMDC/C Language
Description, Version 1.0, Technical Report, Computer Science Department, Illinois
Institute of Technology, Chicago, IL 60616, 1990.
- CHRI90b Christopher, T. W., Experiments in Multicomputing With Bitonic Sort,
Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616,
1990. Unpublished.
- CHRI90c Christopher, T. W. "Exploration of the Limits That Grain Size Imposes on
the Speed-Up and Efficiency of Two Transitive Closure Algorithms," Fourth Annual
Parallel Processing Symposium, Orange County IEEE, April 4-6, 1990.
- DAVI82 Davis, A. L., Keller, R. M., "Data Flow Program Graphs," Computer, Feb.
82, pp. 26-41.
- DENN80 Dennis, J. B., "Data Flow Architectures," Computer, Nov. 1980, pp. 48-
56.
- HALS85 Halstead, Robert H., Jr., "Multilisp: A Language for Concurrent Symbolic
Computation," ACM Transactions on Programming Languages and Systems, 7:4,
October, 1985.
- HOAR78 Hoare, C. A. R.; "Communicating Sequential Processes," Commun.
ACM, 21,8 Aug. 1978, pp. 666-677.
- PUGH73 Pugh, A. L. III, Dynamo II User's Manual, The MIT Press, Cambridge
MA, 1973.
- TREL82 Treleaven, P. C., Brownbridge, D. R., and Hopkins, R. P. "Data-Driven and
Demand-Driven Computer Architecture," ACM Computing Surveys, 14,1, March,
1982.
- TREL84 Treleaven, P. C., Lima, I. G. "Future Computers: Logic, Data Flow, ...,
Control Flow?" Computer, March, 1984.
- VEEN86 Veen, A. H.; "Data Flow Machine Architecture," ACM Computing Surveys,
18,4, Dec. 1986, pp. 365-396.
- WADG85 Wadge, William W., Ashcroft, Edward A. Lucid, the Dataflow Programming
Language, APIC Studies in Data Processing No. 22, Academic Press, 1985.
- WALL82 Wall, D. W. "Messages As Active Agents," 9th Annual ACM Symposium
on Principles of Programming Languages, Jan., 1982.
- WOHL90a Wohl, Peter, and T. W. Christopher, "A Fine Grained Neural Net Simulator
Encoded in Coarse Grained OOMDC," Fourth Annual Parallel Processing Symposium,
Orange County IEEE, April 4-6, 1990.
- WOHL90b Wohl, Peter, and T. W. Christopher, "SIMD Neural Net Mapping on
MIMD Architectures," International Conference on Parallel Processing, August,
1990.
|