|
Early Notes, September 24, 1996.Released to the public domain, 1996, by Thomas W. Christopher
Warning: This material is copied from earlier papers on other members of the MDC family of languages and from the Memo system. It must still be translated into AMDC. IntroductionThe great difference in AMDC from traditional programming systems is that in AMDC, operations can be performed by active messages at data locations rather than requiring the data be sent to processes to be worked on. It is the operation at locations that are the distinguishing feature and real power of AMDC.
Block-Matrix OperationsAn "object" consists of a data structure representing its state, an address or mailbox at which it receives messages, and a collection of methods (procedures) by which it handles the messages it receives. A "reactive object" only executes code in response to messages. It is trivial to implement reactive objects in MDC90. The state of the object is a message sitting at a location. The address or mailbox of the object is the location at which the state object sits. The methods are behaviors triggered by the state message and an incoming message. Consider a matrix block server. A matrix is represented by blocks containing Grain x Grain elements each. Let the total number of rows in the matrix be R, and the total number of columns C. The total number of blocks will be ceiling(R/Grain) x ceiling(C/Grain). Each block will be represented by a reactive object. Their location names will be [A,k,m] where A is the symbol naming the matrix, 0 <= k < floor(R/Grain) and 0 <= m < floor(C/Grain). Each block will also keep track of the total number of rows and columns in the matrix. The state message is
message mat_block{int R, C; float a[Grain][Grain];};
Note that with this representation, matrix element A[i,j] is in the block at location [A,i/Grain,j/Grain]. The number of rows in the block at [A,X,Y] is min(Grain,R-Grain*X) and the number of columns is min(Grain,C-Grain*Y). The operations defined on matrix blocks are storing elements, fetching elements, fetching entire blocks, and fetching a descriptor for the array. There is no explicit create operation: a mat_block messages is just sent to its location. (There needs to be a delete operation. We omit that for lack of space.) To store an element, one sends a store_element message to the matrix block containing the element to be stored.
message store_elem{int i,j; float v; PORTAL stored;};
message done{};
To store value v at position A[i,j], this message should be sent to location [A,i/Grain,j/Grain]. The i and j values in the message are the full indices of the matrix element. A done signal is sent to 'stored' when the store is complete. It is vitally important that a completion signal be provided; further operations will need to be delayed until the store is complete, and the only way to communicate its completion is by message passing.
behavior mat_block b; store_elem s; =>
{
b->a[s->i%Grain][s->j%Grain] = s->v;
leave b;
send new done to s->stored;
}
message fetch_elem{int i,j; PORTAL dest;};
message scalar{float v;};
Fetching an element is similar to storing an element. A fetch_elem message is sent to the block containing the element specifying a destination, dest, to which to send the value. The value is placed in a 'scalar' message and sent to dest. (Since the value is sent to a PORTAL, the actual return message can be any type that is structurally compatible with 'scalar'.)
behavior mat_block b; fetch_elem f; =>
{ scalar s = new scalar;
s->v = b->a[f->i%Grain][f->j%Grain];
send s to f->dest;
leave b;
}
message fetch_block{PORTAL dest;};
message left_mat_block{int X, Y, NR, NC;
float a[Grain][Grain];};
message right_mat_block=left_mat_block;
The command to fetch an entire block of the array A is fetch_block. If fetch_block is sent to location [A,X,Y], then the array, 'a', of the block is placed in a message structurally compatible with 'left_mat_block' and sent to 'dest'. In addition, the other members are set as follows: X and Y are copied from the block's location name. NR is the number of rows present in the block, and NC is the number of columns.
behavior mat_block b; fetch_block f; =>
{ left_mat_block r = new left_mat_block;
int dif;
r->X = here.X;
r->Y = here.Y;
dif = b->R - Grain*here.X;
r->NR = Grain<dif ? Grain : dif;
dif = b->C - Grain*here.Y;
r->NC = Grain<dif ? Grain : dif;
memcpy(&r->a, &b->a, sizeof(b->a));
send r to f->dest;
leave b;
}
message fetch_descr{PORTAL dest;};
message mat_descr{int R,C;};
To find out the number of rows and columns in matrix 'A', a fetch_descr message should be sent to [A,0,0] where there will be a mat_block message. A 'mat_descr' message will then be sent to 'dest'.
behavior mat_block b; fetch_descr f; =>
{
mat_descr d = new mat_descr;
d->R = b->R;
d->C = b->C;
send d to f->dest;
leave b;
}
ContinuationsProgramming using objects requires a certain amount of contortion of code. Consider the problem of writing out an array without using a process. One would like to write something like this pseudo-code:
print_array =>
<fetch the bounds descriptor>
for (i=0; i<number of rows; i++) {
for (j=0; j<number of columns; j++) {
<fetch a[i][j]>
print a[i][j]
}
print new line
}
As a first step in transforming this code, we must make the control structure more explicit and annotate continuation points - points where we will have to begin a new behavior to process a message returning with a value:
print_array =>
<fetch the bounds descriptor>
CONTIN1=>
i=0;
while (i<number of rows){
j=0;
while (j<number of columns) {
<fetch a[i][j]>
CONTIN2=>
print a[i][j]
j++;
}
print new line
i++;
}
The behaviors we need to encode are
The pseudo-code for (1) is simple: print_array => <fetch the bounds descriptor> leave self The "leave self" means "leave the state of this procedure in a message to trigger the continuation when the fetched value arrives." The pseudo-code for (2) is simplified by the assumption that there will be no empty arrays. It is especially simplified by the realization that a while-statement that unconditionally creates a continuation within it will never actually loop, and therefore can be replaced with an if-statement: CONTIN1 => i=0; j=0; <fetch a[i][j]> leave self The pseudo-code for (3) is particularly contorted because control must start in the middle of two nested loops. Armed with the knowledge that coming to the bottom of a while loop is the same as reentering it, the first step might be:
CONTIN2=>
print a[i][j]
j++;
while (j<number of columns) {
<fetch a[i][j]>
CONTIN2=>
print a[i][j]
j++;
}
print new line
i++;
while (i<number of rows){
j=0;
while (j<number of columns) {
<fetch a[i][j]>
CONTIN2=>
print a[i][j]
j++;
}
print new line
i++;
}
Next, using the transformations we used in constructing pseudo-code (2), we arrive at:
CONTIN2=>
print a[i][j]
j++;
if (j<number of columns) {
<fetch a[i][j]>
leave self
return
}
print new line
i++;
if (i<number of rows){
j=0;
<fetch a[i][j]>
leave self
return
}
exit print_array
With a bit more contortion, reversing the conditionals, we arrive at:
CONTIN2=>
print a[i][j]
j++;
if (j>=number of columns) {
print new line
i++;
if (i>=number of rows){
exit print_array
}
j=0;
}
<fetch a[i][j]>
leave self
return
}
This pseudo-code is implemented by the following message and behavior declarations:
message print_array{SYMBOL A; PORTAL done;};
message print_array_wt_descr=print_array;
message printing_array{SYMBOL A;
int i,j,R,C; PORTAL done;};
message elem=scalar;
behavior print_array p; =>
{
fetch_descr f = new fetch_descr;
let f->dest = mat_descr @ here;
send f to [p->A,0,0];
leave (print_array_wt_descr) p;
}
behavior print_array_wt_descr p; mat_descr d; =>
{
printing_array c = new printing_array;
fetch_elem f = new fetch_elem;
c->A = p->A;
c->i = c->j = 0;
c->R = d->R;
c->C = d->C;
c->done = p->done;
f->i = f->j = 0;
let f->dest = elem@here;
send f to [p->A,0,0];
leave c;
}
behavior printing_array p; elem e; =>
{
fetch_elem f = new fetch_elem;
printf("%4.0f ", e->v);
p->j++;
if (p->j >= p->C) {
printf("\n");
p->i++;
p->j=0;
if (p->i >= p->R) {
send new done to p->done;
return;
}
}
f = new fetch_elem;
f->i = p->i;
f->j = p->j;
let f->dest = elem @ here;
send f to [p->A,p->i/Grain,p->j/Grain];
leave p;
}
Partial ordering of timePartial ordering of time. One great difference from conventional programming is that time is only partially ordered. Computational event x precedes computational event y only if there is a chain of message transmissions and behavior executions from x to y. One of the difficulties in learning to program in a message driven style is learning to pay attention to this. It leads to a number of other problems. Termination detection is a problem; there is no way to instantaneously examine the state of the system and observe there is no more work to do. In fact, messages that have been handed over to the communications system on a multicomputer are not visible until they arrive. Termination of algorithms can be detected by, for example, arranging for acknowledgements on all operations and building acknowledgement trees to determine when an aggregate operation is complete, but it does require explicit programming. Since we cannot capture global snapshots, debugging can be a problem. We use trace debugging. Typically we first check the input/output behavior of behaviors, then we look for orphaned messages (produced but never consumed) and unexpected location names, and finally we resort to following chains of messages and behaviors. Asynchronous proceduresTo call a procedure, the caller creates a location name at which the procedure is to execute and sends off a call message to trigger the starting behavior of the procedure. The call message contains the parameter list and return address. The caller leaves its continuation message. The best form for a return address is a Destiny containing the location to which the results are to be sent, the message type expected, and the script the result message is to execute when it gets back. The called procedure packs up the results in a message and sends them to the destiny. This implementation of procedure calls permits the implementation of remote procedure calls. StreamsA "spatial stream" is a stream that is spread over memory space. A spatial stream can be represented as a one dimensional array of locations, the N elements in the stream are messages in locations [s,0] through [s,N-1]. Location [s,N] contains an end-of-stream message. Consider a program for a prime sieve. We wish to compute a stream of primes. We start by generating a stream of candidate integers.
message elem{int val;}; /*stream element*/
message eos{}; /*end of stream*/
message gen{int i,step,til;}; /*generator process*/
Spatial streams are processed by active messages [WALL82] (or perhaps they might be better called travelling processes). An example of an active message is the generator message, gen, which, when sent to a location [s,j], will leave at locations [s,j], [s,j+1], [s,j+2], ... the messages elem{i}, elem{i+step}, elem{i+2*step},..., eos{}. Gen does not send messages to locations, but goes to them to leave the messages.
behavior gen g; =>
{
if (g->i <= g->til) {
elem e=new elem;
e->val = g->i;
leave e;
g->i += g->step;
send g to [here.S, here.X+1];
}
else
leave new eos;
}
message cast_out{int m; LOCUS out;};
A cast_out message sent to a location will copy the elements of the stream that begins at that location which are not multiples of m. They will be copied to the stream beginning at "out". The cast_out message copies them by visiting each element of the input stream, consuming them in the process.
behavior cast_out c; eos e; =>
{send e to locus c->out;}
behavior cast_out c; elem e; =>
{
if (e->val % c->m != 0) {
send e to locus c->out;
c->out.X++;
}
send c to [here.S, here.X+1];
}
A filter message starts the sieving of a stream. Given a stream of values X = x1, x2, x3, ..., a filter message performs three functions: 1) It sends the first stream element, x1, to the output stream. 2) It sends a cast_out message to the remainder of the stream (x2, x3, ...) creating an intermediate stream by casting out multiples of the first element. (Y = x2, x3, ... with multiples of x1 removed.) 3) It sends a filter message to the intermediate stream, filtering that stream into the output. (Let X = Y and repeat from (1)). (This would perhaps be easier to understand as a recursive function, although it runs in a pipeline.)
message filter{LOCUS out;};
behavior filter f; eos e; =>
{send e to locus f->out;}
behavior filter f; elem e; =>
{SYMBOL s=Symbol(Primes);
cast_out c=new cast_out;
c->m = e->val;
let c->out = [s,0];
send c to [here.S, here.X+1];
send e to locus f->out;
f->out.X++;
send f to [s,0];
}
In a "temporal stream," the values in the stream arrive in messages at the same location -- spread out, so to speak, over time. The messages themselves, rather than their locations, must now carry the sequence number indicating their position in the stream. The prime sieve may be rephrased for temporal streams as follows:
message elem{int pos,val;};
message eos{int pos;};
message gen{int i,step,til,pos; LOCUS out;};
The element and the end-of-stream messages now have a "pos" field to indicate their position in the stream. The generator message, gen, now also contains a "pos" field to indicate the position in the output for the next value to be generated. In addition, the generator contains the name of the location, "out", to send the stream to. The behavior for the generator has been coded to generate one stream element per dispatch. It demonstrates an active object or process: The gen message is the activation record of the process, and the presence of the activation record is sufficient to trigger a burst of computation.
behavior gen g; =>
{
if (g->i <= g->til) {
elem e=new elem;
e->val = g->i;
e->pos = g->pos++;
g->i += g->step;
send e to locus g->out;
leave g;
} else {
eos e = new eos;
e->pos = g->pos;
send e to locus g->out;
}
}
message cast_out{int m; int inpos,outpos; LOCUS out;};
A cast_out message at a location corresponds to a process that will copy those elements of an incoming stream which are not multiples of m into an output stream, out. The positions in the input and output streams must be represented explicitly: inpos is the position of the incoming element that will be handled next, outpos is the position of the next element to write into the output. Since messages in MDC are not required to be delivered in the order sent, provision must be made for out-of-order messages. In the following code, a message which arrives out of order is delayed, i.e. left at the location but removed from further pattern matching. When the correct message is processed, the delayed messages are "undelayed" to allow them again to be processed.
behavior cast_out c; eos e; =>
{
if (c->inpos != e->pos) {
delay e;
leave c;
} else {
e->pos = c->outpos;
send e to locus c->out;
}
}
behavior cast_out c; elem e; =>
{
if (c->inpos != e->pos) {
delay e;
} else {
if (e->val % c->m != 0) {
e->pos = c->outpos;
send e to locus c->out;
c->outpos++;
c->inpos++;
undelay;
} else {
c->inpos++;
undelay;
}
}
leave c;
}
A filter message must perform the same functions for a temporal stream as for a spatial stream. There are three major differences. The filter message needs to know the position in the output stream for the prime it finds. The filter message must explicitly wait for stream element zero. And the cast_out message must be left at the same location as the filter: conceptually, the filter object becomes a cast_out object.
message filter{int outpos; LOCUS out;};
behavior filter f; eos e; =>
{
if (e->pos == 0) {
e->pos = f->outpos;
send e to locus f->out;
} else {
delay e;
leave f;
}
}
behavior filter f; elem e; =>
{
if (e->pos != 0) {
delay e;
leave f;
} else {
SYMBOL s=Symbol(Primes);
cast_out c=new cast_out;
undelay;
c->m = e->val;
let c->out = [s];
c->inpos = 1;
c->outpos = 0;
leave c;
e->pos = f->outpos;
send e to locus f->out;
f->outpos++;
send f to [s];
}
}
It is worth remarking that spatial streams are a lot cleaner than temporal streams to implement, and if the activation record messages are small (or the implementation is on a shared-memory machine), a traveling process is about as efficient as a process that executes at a single location. BroadcastIt is often necessary to broadcast a message to each of an array of locations, e.g. to start an algorithm. The following are four techniques we have used:
AccumulateIt is often necessary to accumulate information from an array of locations, indeed most often simply the information that they have finished some processing. The two methods we have most often used to accumulate information are
Itinerant threadsWe have found the most convenient way to process streams is with active messages, traveling activations, or itinerant threads, wherein an activation record message (the representation of a process) does not remain at a single location fetching data to itself, but rather moves from location to location. Dataflow and I-structuresFutures and I-structures. A future is an assign-once variable used to communicate between a producer (typically a subroutine) and a consumer (its caller). Both the producer and consumer may run in parallel, with the consumer only being delayed if it attempts to fetch from the variable before it has been assigned. An I-structure (an "incremental structure") is a collection (e.g. an array) of futures. I-structures were invented for dataflow hardware [ARVI80]. In Memo, any folder that will have only one memo ever placed in it may correspond to a future. The consumer executing a memo_get, memo_get_copy, or memo_alt fetching from that folder will be delayed until the value has been produced. The folder will vanish once the memo is removed. Since it is usually better not to block an entire process, the consumer can delay a memo for the job jar in the future's folder which will trigger the desired computation when the data becomes available: FOLDER_NAME future,job_jar; ... memo_write_delayed(future, job_jar,operation,sizeof(operation)); Dataflow. Dataflow programming triggers execution of code when its operands become available [ACKE82] [ARVI86] [DAVI82] [DENN80] [TREL82] [VEEN86]. The Memo system facilitates dataflow programming by providing the memo_write_delayed procedure. Assume the operands are futures. One simply arranges to have an operation dropped into a job jar when an operand memo arrives in a folder, thus: FOLDER_NAME operand,job_jar; ... memo_write_delayed(operand, job_jar,operation,sizeof(operation)); If the operation requires more than one operand, then the operation can poll for each of the operands (e.g. by memo_present) and delay itself in absent operands' folders until all are available. Dataflow graphs. We have used the following code for dynamic dataflow graphs:
message add { PORTAL dst;};
message lopnd {long v;};
message ropnd = lopnd;
message result {long rslt;};
behavior add a; lopnd left; ropnd right; =>
{ result r = new_result;
r->rslt = left->v + right->v;
send r to a->dst;
}
The add message corresponds to an add dataflow instruction. The field dst represents the destination to receive the result. The behavior is triggered when an add instruction is present at the same location as both a left and a right operand (lopnd and ropnd). The behavior allocates a new result message, fills it with the sum of the operands, and sends it to the destination contained in the add message. When the result message is sent, it is converted into the message type contained in the PORTAL a>dst, which will very likely be lopnd or ropnd. ActorsReactive objects. A reactive object is an object that executes only in response to messages it receives. Reactive objects are central to the Actors model of parallel computation [AGHA86] [ATHA86] [ATHA87] [ATHA88] [BODE88]. A reactive object can be implemented with a job jar folder plus one input folder per object. A memo containing the object is delayed on its input folder. When an input memo arrives, the object is placed in the job jar. Eventually some process fetches the object from the job jar and executes its code, which reads an input message and responds to it. (For efficiency, it is better to loop reading all available input messages with memo_get_skip, rather than delaying the object after processing each one.) Actors. Perhaps the best way to describe Actors is to describe an implementation. An Actor is a mailbox with a message queue, a procedure pointer, and an argument list attached. When the queue is non-empty, the scheduler can call the procedure passing it the argument list and the first message from the queue. The procedure can do three things in addition to the usual calculations. 1) It can create a new actor by supplying a procedure and an initial argument list, getting a pointer to its mail box. 2) It can send messages to those actors to whose mailboxes it has pointers. 3) It can change itself into a new actor by specifying a new procedure or argument list. (The argument list is called an "acquaintance list," and the messages are called "tasks.") Actors may be approximated in LLMDC/C by the following translations: A mailbox is a location, [s,0], where the symbol s may be considered the pointer to the mailbox. The index field is unused but must be consistently some value, here chosen to be zero. The procedure pointer and argument list are combined into a message; the message type corresponds to the procedure pointer and the fields of the message to the arguments. The body of the procedure is broken up into several LLMDC/C behaviors, one for each type of message that may be received. These behaviors have the form:
behavior actorType self;
messageType msg; =>
{ code }
The messages (tasks) in Actors also translate into MDC messages. To avoid errors, the MDC message types should be partitioned into actor messages and task messages. Creating a new actor involves creating a new symbol, s, for its location name, allocating a message to represent the initial procedure, filling in the fields of that message with the initial parameter list, and sending the message to the location [s,0]. Sending a message in Actors corresponds to sending a message in MDC. To become a new actor, the behavior leaves a new actor message in place of the previous one. To remain the same actor, the behavior leaves the same input message (self). To terminate, the behavior does not leave any actor message. This translation does not completely imitate Actors. In Actors, the incoming messages must be processed in the same order they arrive. In our translation, they might not be. In Actors, therefore, the procedure must be able at any time to handle any task message. This is often quite difficult. For example, the actor A calls a subroutine by creating an actor B and sending it a task. Actor A cannot respond to other messages until B returns (i.e. B's response message arrives). But A cannot read the response until it handles the messages waiting on its queue ahead of the response. The solution to this problem in Actors is to create a queue (composed of more actors) to which the incoming tasks are sent. The solution in MDC is to leave an actor message that will only trigger a behavior when the response message comes. That behavior will then leave an actor message that will handle the other tasks. There are ways to guarantee identical behavior to Actors, but given a choice, would a programmer restrict him- or herself to Actors? GrainsizeWe have learned that the grain size of computations is highly significant. Grain size refers to the amount of computation performed by each computational event. If the grain size is too small, the system overhead is too large a proportion of the execution time. If the grain size is too large, there are likely to be opportunities for parallelism being missed. Grain size is adjusted by adjusting the amount of data processed in a computational event, which entails keeping arrays of data in messages - the larger the array, the more data to be processed. We ran a fine grained version of bitonic sort on 8192 elements and a coarse grained version with blocks of sizes two through 1024 in powers of two [CHRI90b]. We observed a difference of over 9000% in the run time of a very fine grained (scalar field) version of the algorithm over a version with 256 element blocks. We have observed similarly large differences in other algorithms [CHRI90c]. LayoutThe locate declaration allows some control of the layout of locations over the nodes of a multicomputer. We have not yet run experiments on a genuine multicomputer to find out how significant location is, but we have run experiments on shared-memory machines. The messages are allocated in shared memory, but the locations are in the private heaps of processes. Messages are sent between processes through shared linked queues. We ran a fine-grained version of bitonic sort with 1024 locations each handling one element of the array. We observed the worst possible compared to best possible gave about a 16% difference. Hashed placement compared to the best possible placement gave only about a 5% difference[CHRI90b]. Tree rewritingVery much under construction... Demand-drivenIn demand-driven programming, the demand for, or attempt to use, a value causes its computation to be initiated. Lazy streams are one example of demand-driven programming. Similarly, our lazy FP implementation leaves suspensions at locations which represent futures in the lenient and strict implementations. Interestingly enough, it is not always necessary to have a suspension message waiting at a location. Suppose that there is a way to uniquely assign the computation of a function applied to particular arguments to a single location. When one wishes to compute the function, one sends the call message to the assigned location. If the value has already been computed, it will be sitting there in a message, and the call will simply return it. If the value is already being computed, but has not arrived yet, there will be a please_wait message, and the call message waits for the value to be computed. If there is neither a value nor a please_wait message, the call leaves a please_wait and starts evaluation of the function. The code fragments are:
message value{...};
message please_wait{};
message demand{PORTAL dst;};
behavior demand d;
value v; => {
leave v;
send CopyMsg(v)
to d>dst;
}
behavior demand d;
value==0;
please_wait==0; => {
leave new_please_wait;
...start computing function...
}
behavior value v;
please_wait w; => {
leave v;
}
ReferencesACKE82 Ackerman, W. B., "Data Flow Languages," Computer, Feb. 1982, pp. 15-25. AGHA86 Agha, G. A.; Actors: A Model of Concurrent Computation in Distributed Systems, MIT Press, Cambridge, MA., 1986. AMEI89 Ameiss, David, FP In MDC/C, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1989. ARVI78 Arvind, Gostelow, K. P., Plouffe, W. An Asynchronous Programming Language and Computing Machine, Department of Information and Computer Science, University of California, Irvine, Irvine, CA 92717, Dec. 8, 1978. ARVI80 Arvind, Thomas, R. E., I-Structures: An Efficient Data Type for Functional Languages, MIT Laboratory for Computer Science Technical Manual TM-178, MIT, Cambridge, MA, Sept. 1980. ARVI86 Arvind, R. S. Nikhil, K. K. Pingali, "I-Structures: Data Structures for Parallel Computing," in Fasel, Joseph H. and Robert M. Keller (ed), Graph Reduction, Proceedings of a workshop in Sante Fe, New Mexico, Sep./Oct. 1986, Lecture Notes in Computer Science 279, Springer-Verlag, 1987. ATHA86 Athas, W. C., and Seitz, C. L., Cantor User Report, Version 2.0, 5232:TR:86, Computer Science Department, California Institute of Technology, Pasadena, CA, 91125, January, 1987. ATHA87 Athas, William C., Fine Grain Concurrent Computations, Ph. D. Dissertation, 5242:TR:87, California Institute of Technology, Pasadena, California, May, 1987. ATHA88 Athas, William C. and Charles L. Seitz, "Multicomputers: Message-Passing Concurrent Computers," IEEE Computer, August, 1988. BACK78 Backus, J.; "Can Programming be Liberated from the von Neumann Style," Commun. ACM, 21,8, August 1978, pp. 613-641. BRIN87a Brinch Hansen, P., "Joyce - A Programming Language for Distributed Systems," Software - Practice and Experience 17,1, Jan., 1987. BRIN87b Brinch Hansen, P., "A Joyce Implementation," Software - Practice and Experience 17,4, Apr., 1987. CHRI89 Christopher, T. W., OOMDC/C Language Definition, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1989. CHRI90a Christopher, T. W., and Freitag, Gregory A. LLMDC/C Language Description, Version 1.0, Technical Report, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1990. CHRI90b Christopher, T. W., Experiments in Multicomputing With Bitonic Sort, Computer Science Department, Illinois Institute of Technology, Chicago, IL 60616, 1990. Unpublished. CHRI90c Christopher, T. W. "Exploration of the Limits That Grain Size Imposes on the Speed-Up and Efficiency of Two Transitive Closure Algorithms," Fourth Annual Parallel Processing Symposium, Orange County IEEE, April 4-6, 1990. DAVI82 Davis, A. L., Keller, R. M., "Data Flow Program Graphs," Computer, Feb. 82, pp. 26-41. DENN80 Dennis, J. B., "Data Flow Architectures," Computer, Nov. 1980, pp. 48-56. HALS85 Halstead, Robert H., Jr., "Multilisp: A Language for Concurrent Symbolic Computation," ACM Transactions on Programming Languages and Systems, 7:4, October, 1985. HOAR78 Hoare, C. A. R.; "Communicating Sequential Processes," Commun. ACM, 21,8 Aug. 1978, pp. 666-677. PUGH73 Pugh, A. L. III, Dynamo II User's Manual, The MIT Press, Cambridge MA, 1973. TREL82 Treleaven, P. C., Brownbridge, D. R., and Hopkins, R. P. "Data-Driven and Demand-Driven Computer Architecture," ACM Computing Surveys, 14,1, March, 1982. TREL84 Treleaven, P. C., Lima, I. G. "Future Computers: Logic, Data Flow, ..., Control Flow?" Computer, March, 1984. VEEN86 Veen, A. H.; "Data Flow Machine Architecture," ACM Computing Surveys, 18,4, Dec. 1986, pp. 365-396. WADG85 Wadge, William W., Ashcroft, Edward A. Lucid, the Dataflow Programming Language, APIC Studies in Data Processing No. 22, Academic Press, 1985. WALL82 Wall, D. W. "Messages As Active Agents," 9th Annual ACM Symposium on Principles of Programming Languages, Jan., 1982. WOHL90a Wohl, Peter, and T. W. Christopher, "A Fine Grained Neural Net Simulator Encoded in Coarse Grained OOMDC," Fourth Annual Parallel Processing Symposium, Orange County IEEE, April 4-6, 1990. WOHL90b Wohl, Peter, and T. W. Christopher, "SIMD Neural Net Mapping on MIMD Architectures," International Conference on Parallel Processing, August, 1990. |
|