# Specifying concurrent objects as communicating processes

Specifying Concurrent Objects as Communicating Processes

Jayadev Misra Department of Computer Sciences The University of Texas at Austin Austin, Texas 78712 (512) 471-9547 misra@cs.utexas.edu

1 Introduction

An object and its associated operations may be speci ed in many ways. One way is to give an abstract representation of the object data structure (viz., representing a queue by a sequence) and the e ects of various operations on this abstract representation 3,4,5]. Another way 2] is to leave the representation aspects unspeci ed but to give a set of equations that relate the e ects of various operations (the equations de ne a congruence relation in a term algebra). The common goal of a speci cation, however, is to serve as a legal document that spells out a contract between a user and an implementer: The user may assume no more about an object than what is speci ed and the implementer must satisfy the speci cation through his implementation. We view a speci cation not merely as a legal contract but additionally as a means (1) to deduce properties of a speci ed object, (2) to deduce properties of other objects in which the speci ed object is a component (i.e., inheritance of properties) and (3) to implement the object by stepwise re nement of its speci cation. Therefore we require that a speci cation not merely be formal but also be in a form that admits of e ective manipulation. This requirement rules out speci cation schemes in which program fragments (in some high level language) appear as part of speci cations; typically such program fragments cannot be manipulated e ectively. In many speci cation schemes it is assumed that (1) each operation on an object is deterministic (i.e., applying the operation to a given state of the object results in a unique next state and/or unique values being returned), (2) an operation once started always terminates in every state of the object, and (3) operations are not applied concurrently. In many cases of interest arising in applications such as operating systems, process

This material is based in part on work supported by the Texas Advanced Research Program under Grant No. 003658{065 and by the ONR Contract N00014-90-J-1640.

1

control systems and concurrent databases, these assumptions are rarely met. For instance, a \queue object" acts as an intermediary between a producer and a consumer, temporarily storing the data items output by the producer and later delivering them to the consumer. The queue object is required to (1) deliver data in the same order to the consumer as they were received from the producer, (2) receive data from the producer provided its internal queue spaces are nonfull, and (3) upon demand, send data to the consumer provided its internal queue spaces are nonempty. Requests from the producer and the consumer may be processed concurrently: The producer is delayed until there is some space in the queue and the consumer is delayed until there is some data item in the queue. Observe that a request from the producer, to add a data item to the queue, may not terminate if the queue remains full forever, and similarly, a request from the consumer may not terminate if the queue remains empty forever. In this paper, we propose a speci cation scheme for concurrent objects that allows e ective manipulations of speci cations and admits nondeterministic, nonterminating and concurrent operations on objects. Our approach is to view a concurrent object as an asynchronous communicating process. Such a process can be speci ed by describing its initial state, how each communication|send or receive|alters the state, and the conditions under which a communication action is guaranteed to take place. On the other hand, since the internal state can be determined from the sequence of communications (provided the process is deterministic), the process can also be speci ed in terms of the sequence of communications in which it engages. The point to note is that the internal state and the sequence of communications can be viewed as auxiliary variables which may be altered as a result of communications. Our speci cation scheme allows us to de ne auxiliary variables and state properties using these variables. We advocate using any auxiliary variable that allows a simple and manipulable speci cation, leaving open the question whether it is preferable to use \observable" input-output sequences or \unobservable" internal states. The speci cation mechanism is based on \UNITY logic" 1]. Auxiliary variables can be de ned directly using the constructs in this logic. The inference rules of the logic can be applied to deduce various properties of an object from its speci cation. Also, we may use the logic to prove that one speci cation re nes (i.e., implements) another speci cation and that an implementation meets a speci cation.

1.1 Outline of the Paper

We give a brief introduction to UNITY in Section 2 including all the notations and logic to understand this paper. The main example treated in this paper is a bounded bag (or multiset). Section 3 contains an informal description and a formal speci cation of this object. We demonstrate the usefulness of the speci cation by showing that concatenation of two bags results in a bag of the appropriate size. The speci cation in Section 3 is based on an auxiliary variable that encodes the internal state. We provide an alternative speci cation in Section 4 that uses the sequences of input/output as auxiliary variables; we show that this speci cation is a 2

re nement of the earlier one. Section 5 contains speci cations for a \queue" and a \stack." We show that a queue implements a bag and a stack implements a bag, showing in each case that the speci cation of the former implies the speci cation of the latter. Section 6 contains a re nement of the bag speci cations of Section 4, and Section 7 gives an implementation of the speci cation of Section 6. The usefulness of the method is discussed in Section 8. All the proofs are given in complete detail to emphasize that such proofs need not be excessively long or tedious, as is often the case with formal proofs. A preliminary version of this paper appears in 7]. The speci cations and the proofs have been considerably simpli ed in the current version.

2 A Brief Introduction to UNITY

A UNITY program consists of (1) declarations of variables, (2) a description of their initial values and (3) a nite set of statements. We shall not describe the program syntax except brie y in Section 7 of this paper because it is unnecessary for understanding this paper. However an operational description of the execution of a program is helpful in understanding the logical operators introduced later in this section. An initial state of a program is a state in which all variables have their speci ed initial values (there can be several possible initial states if the initial values of some variables are not speci ed). In a step of program execution an arbitrary statement is selected and executed. Execution of every statement terminates in every program state. (This assumption is met in our model by restricting the statements to assignment statements only, and function calls, if any, must be guaranteed to terminate.) A program execution consists of an in nite number of steps in which each statement is executed in nitely often. This program model captures many notions useful for programming such as: synchrony, by allowing several variable values to be modi ed by a single (atomic) statement; asynchrony, by specifying little about the order in which the individual statements are executed; processes communicating through shared variables, by partitioning the statements of the program into subsets and identifying a process with a subset; processes communicating via messages, by restricting the manner in which shared variables are accessed and modi ed, etc. We shall not describe these aspects of the model; however, it will become apparent from the bag example in this paper that process networks can be described e ectively within this model.

2.1 UNITY logic

A fragment of UNITY logic, that is used in this paper, is described in this section. Three logical operators, unless, ensures, and leads-to, are at the core of UNITY logic. Each of these is a binary relation on predicates; unless is used to describe the safety properties, and the other two to describe 3

progress properties of a program. Notation: Throughout this section p; q; r denote predicates which may name program variables, bound variables and free variables (free variables are those that are neither program variables nor bound variables). 2

2.1.1 unless

For a given program

p unless q

denotes that once predicate p is true it remains true at least as long as q is not true. Formally (using Hoare's notation)

h for all statements s of the program :: fp ^ :qg s fp _ qgi

i.e., if p ^ :q holds prior to execution of any statement of the program then p _ q holds following the execution of the statement. It follows from this de nition that if p holds and q does not hold in a program state then in the next state either q holds, or p ^ :q holds; hence, by induction on the number of statement executions, p ^ :q continues to hold as long as q does not hold. Note that it is possible for p ^:q to hold forever. Also note that if :p _ q holds in a state then p unless q does not tell us anything about future states. Notation: We write h8u :: P:ui and h9u :: P:ui for universal quanti cation of u in P:u and existential quanti cation of u in P:u; respectively. The dummy u could denote a variable, a statement in a program, or even a program. Any property having a free variable (i.e., a variable that is neither bound nor a program variable) is assumed to be universally quanti ed over all possible values of the variable. Thus,

p unless q

u = k unless u > k

where k is free, is a shorthand for

h8 k :: u = k unless u > ki .

2

Examples

1. Integer variable u does not decrease. u = k unless u > k 4

or, u k unless false 2. A message is received (i.e., predicate rcvd holds) only if it had been sent earlier (i.e., predicate sent already holds). :rcvd unless sent

2

Example (De ning Auxiliary Variables)

Let u be an integer-valued variable and let v count the number of times u's value has been changed during the course of a program execution. The value of v is completely de ned by initially v = 0 u = m ^ v = n unless u 6= m ^ v = n + 1 The traditional way to de ne v, given a program in which u is a variable, is to augment the program text with the assignment,

v := v + 1

whenever u's value is changed; v is called an auxiliary variable. Our way of de ning v, without appealing to the program text, is preferable because it provides a direct relationship between u and v which may be exploited in speci cations and proofs. 2

2.1.2 Stable, Constant, Invariant

Some special cases of unless are of importance. The predicate p unless false, from de nition, denotes that p remains true forever once it becomes true; we write, \p stable" as a shorthand for \p unless false ." An expression e is constant means e = x is stable, for all possible values x of e; then e never changes value. If p holds in every initial state and p is stable then p holds in every state during any execution; we then say that \p is invariant."

2.1.3 ensures

For a given program,

p ensures q

implies that p unless q holds for the program and if p holds at any point in the execution of the program then q holds eventually. Formally,

p unless q ^ h9 statement s :: fp ^ :qg s fqgi p ensures q

5

It follows from this de nition that once p is true it remains true at least as long as q is not true (from p unless q). Furthermore, from the rules of program execution, statement s will be executed sometime after p becomes true. If q is still false prior to the execution of s then p ^ :q holds and the execution of s establishes q.

2.1.4 leads-to

For a given program, p leads-to q, abbreviated as p 7! q, denotes that once p is true, q is or becomes true. Unlike ensures, p may not remain true until q becomes true. The relation leads-to is de ned inductively by the following rules. The rst rule is the basis for the inductive de nition of leads-to. The second rule states that leads-to is a transitive relation. In the third rule, p:m, for di erent m, denote a set of predicates. This rule states that if every predicate in a set leads-to q then their disjunction also leads-to q. (basis)

p ensures q p 7! q

(transitivity)

p 7! q ; q 7! r p 7! r

(disjunction) In the following m ranges over any arbitrary set, and m is not free in q.

h8 m :: p:m 7! qi h9 m :: p:mi 7! q

Notes on Inference Rules

We have explained the meaning of each logical operator in terms of program execution. However, neither the de nitions nor our proofs make any mention of program execution. We use only the de nitions, and a few rules derived from these de nitions, in proofs; we believe that our proofs are succinct because we avoid operational arguments about program executions.

2.1.5 Derived Rules

The following rules for unless are used in this paper; for their proofs, see 1]. (re exivity, antire exivity)

p unless p ; p unless :p

6

(consequence weakening)

p unless q ; q ) r p unless r

(stable conjunction)

p unless q ; r stable p ^ r unless q ^ r

We need the following facts about constants; see 6] for proofs. (constant formation) Any expression built out of constants and free variables is a constant. (constant introduction) For any function f over program variables u,

u = k unless u 6= k ^ f (u) = f (k) f constant Corollary 1: For any predicate p over program variables u, u = k unless u 6= k ^ p p stable

We use the following results about leads-to. (Implication)

p ) q p 7! q

The following rule allows us to deduce a progress property from another progress property and a safety property. (PSP)

p 7! q ; r unless b p ^ r 7! (q ^ r) _ b

2.1.6 Substitution Axiom

Substitution axiom allows us to replace an invariant by \true" and vice versa, in any predicate. Thus, if I is an invariant and it is required to prove that

p 7! q ^ I

7

it su ces to prove

p 7! q .

It is important to note that when several programs are composed (see Section 2.2), the substitution axiom can be applied only with an invariant of the composite program.

2.2 Program Composition through

union

Given two programs F; G, their union, written F ] G, is obtained by appending their codes together: The initial conditions of both F; G are satis ed in F ] G (and hence, we assume that initial conditions of F; G are not contradictory) and the set of statements of F ] G is the union of the statements of F and G. Programs F; G may be thought of as executing asynchronously in F ] G. The union operator is useful for understanding process networks where each process may be viewed as a program and the entire network is their union; in this view, the bag program, the producer, and the consumer are composed through the union operator to form a system. The following theorem is fundamental for understanding union. It says that an unless property holds in F ] G i it holds in both F and G; an ensures property holds in F ] G i the corresponding unless property holds in both components and the ensures property holds in at least one component. (When there are multiple programs we write the program name with a property, such as p unless q in F .) Union Theorem:

p unless q in F ] G p unless q in F ^ p unless q in G p ensures q in F ] G (p unless q in F ^ p ensures q in G) _ (p ensures q in F ^ p unless q in G)

Corollary 1:

p unless q in F ; p stable in G p unless q in F ] G

Corollary 2:

e constant in F ; e constant in G e constant in F ] G

Corollary 3:

p stable in F ; p invariant in G p invariant in F ] G

Corollary 4: 8

logic.

p ensures q in F ; p stable in G p ensures q in F ] G If variable x cannot be accessed by a program F |i.e., x is local to some other program|then x is constant in F ; hence|using the constant formation rule|any expression, e(x), over x is constant in F . This fact is expressed in Corollary 5. We write \x is not accessed in F " in quotes because this is not a proposition in our

Corollary 5: \x is not accessed in F " e(x) constant in F We shall also use properties of the following form in the speci cation of F , where P; Q are arbitrary properties. In the following, G is quanti ed over programs:

h8 G :: (P in G) ) (Q in F ] G)i

This says that if P is a property of any program G then Q is a property of F ] G. This is a convenient way of specifying the properties of the environment with which F may be composed and the resulting properties of the composite program. This kind of formula will be crucial in speci cations of concurrent objects, because the objects typically assume certain properties of their environments.

3 Speci cation Methodology; Bag Speci cation

3.1 Speci cation Methodology

We treat a concurrent object as an asynchronous communicating process. A process and its environment communicate over channels that can typically hold several items of data. To simplify matters, we dispense with channels; we assume that communications are through certain shared variables each of which can hold at most one item of data. The access protocol to the shared variables is as follows: An \input shared variable" of the object can be written only by the environment and read by the object, an \output shared variable" of the object can be written only by the object and read by the environment. Typically, we will have a shared variable corresponding to each operation on the object (if an operation also delivers a result, we may require two shared variables|one to request the operation and the other to store the result). In Figure 1, B implements a bag that is shared by a group of producers and consumers. Producers add items to the bag by successively storing them in r; consumers remove successive items from w. Program F is the environment of B , representing the producers and the consumers. There might be multiple producers and 9

F

r

B

w

Figure 1: A Program B and its environment F consumers or even a single process that is both the producer and the consumer; the exact number is irrelevant for the speci cation. The values that can be written into r; w are, again, irrelevant for speci cation. However we do postulate a special value, , which is written into a variable to denote that it is \empty," i.e., it contains no useful data. The protocol for reading and writing is as follows. Program F writes into r only if r = ; program B reads a value from r only if r 6= and then it may set r to . Program B stores a value in w only if w is ; program F reads from w and it may set w to to indicate that it is ready to consume the next piece of data. To formalize the access protocol of the last paragraph, we introduce an ordering relation, , over the data values and as follows: is \smaller than" all non- values, i.e.,

X

Then it follows that

Y Y

X = ^ Y 6= X= _ X=Y

X

Property P1, below, states that B removes only non- data from r and it may set r to . Property P2 states that B writes only non- values into w provided w is . In the following, R; W are free variables (of type, data or ). P1. r P2. W

R stable in B w stable in B

The access protocol speci cation as given by P1 and P2 is identical for all concurrent objects. The initial conditions can usually be speci ed for output shared variables; for this example P0. initially w = in B Next, we specify the safety properties|the e ect of reading inputs or how the outputs are related to the current state|and progress properties|the conditions under which inputs are read and outputs are written. These properties are speci c to each object. They are explored next for bags. 10

3.2 Internal State as an Auxiliary Variable (An Informal Speci cation)

In the following speci cation we introduce an auxiliary variable, b, that represents the bag of data items internally stored by B . Variable b is local to B . The size of b does not exceed N , a given positive integer (Property P3). Property P4 states that any item read from r is added to b or stored in w (and r is then set to ) and any item removed from b is stored in w (w was previous to this step); hence the union of r; b; w remains constant. Property P5 states that independent of the environment if the bag is nonfull (jbj < N ) an item, if there is any, will be removed from r and analogously, if the bag is nonempty (jbj > 0) w is or will become non- . Notation: All bags in this paper are nite. We treat r; w to be bags of size at most 1. Union of bags u; v is written as u + v; the bag u ? v is the bag obtained by removing as many items of v as possible from u. The value is treated as an empty bag. Combining with P0, P1, P2 de ned previously, we obtain

3.3 Speci cation of a Concurrent Bag of Size

There exists a variable b, local to B , of type bag such that P0. P1. P2. P3. P4. P5. initially b + w = in B r R stable in B W w stable in B jbj N invariant in B r + b + w constant in B h8 F :: (P5.1) jbj < N 7! r = (P5.2) jbj > 0 7! w 6=

N; N >

0

i

in B ] F in B ] F

Note: A program that implements B may not have a variable named b. It is merely required that b be computable from the internal variables of B ; in this sense, b is an auxiliary variable of B . Note: Since b is a local variable of B , it is not accessed by the environment F . Using Corollary 5 of Section 2.2, jbj N stable in F . Since jbj N is invariant in B , using Corollary 3 of Section 2.2, P6. h8 F :: jbj N invariant in B ] F i Observe that P5 is a property of program B ; it says that if B is composed with any program F then certain properties hold in the composite program, B ] F . In particular, P5 does not require F to obey the appropriate 11

protocols in accessing r; w. A di erent bag speci cation is given in Section 4.2 where F is assumed to access r; w appropriately.

3.4 Deducing Properties from the Speci cation

We deduce one safety and one progress property from the speci cation given in Section 3.3. P7. w = unless jbj < N _ r = in B In the following proof all properties are in B . jr + b + wj N + 1 constant , constant formation rule applied to P4 jr + b + wj N + 1 stable , a constant predicate is stable w = unless w 6= , antire exivity (see Section 2.1.5) w = ^ jr + b + wj N + 1 unless w 6= ^ jr + b + wj N + 1 , stable conjunction (Section 2.1.5) applied to the above two w = ^ jr + bj N + 1 unless jr + bj N , rewriting the lhs and weakening the consequence (Section 2.1.5) w = unless jr + bj N , substitution axiom applied to the lhs with invariant jr + bj N + 1 (see P6) w = unless jbj < N _ r = , consequence weakening (Section 2.1.5) P8. If w = stable in F then w = 7! jbj < N _ r = in B ] F In the following proof all properties are in B ] F unless otherwise stated. w = stable in F , given w = unless jbj < N _ r = in B , from P7 w = unless jbj < N _ r = in B ] F , Corollary 1 of Section 2.2 jbj > 0 7! w 6= , from P5.2 jbj > 0 ^ w = 7! jbj < N _ r = , PSP on the above two jbj 0 ^ w = 7! jbj < N , implication rule and using N > 0. w = 7! jbj < N _ r = , disjunction on the above two

2

12

F

r

B1

v

B2

w

Size N Size M B = B 1 ] B 2 has Size M + N + 1 Figure 2: Concatenation of bags B 1; B 2

3.5 Bag Concatenation

Let B 1 implement a bag of size M; M > 0; with input, output varibles, r and v, respectively, and B 2 implement a bag of size N; N > 0; with input, output variables v and w, respectively. We show that B = B 1 ] B 2 implements a bag of size M + N + 1 with input, output variables r and w, respectively. The arrangement is shown pictorially in Fig. 2. Observe that r and w are di erent variables (i.e., B 1; B 2 are not connected cyclically) and hence, r cannot be accessed by B 2 nor can w be accessed by B 1. We rst rewrite the speci cation from Section 3.3 for B 1; B 2. In the following, R; V; W are free variables. There exists b1, local to B 1, such that P0. initially b1 + v = in B 1 P1. r R stable in B 1 P2. V v stable in B 1 P3. jb1j M invariant in B 1 P4. r + b1 + v constant in B 1 P5. h8 G :: (P5.1) jb1j < M 7! r = in B 1 ] G (P5.2) jb1j > 0 7! v 6= in B 1 ] G There exists b2, local to B 2, such that initially b2 + w = in B 2 v V stable in B 2 W w stable in B 2 jb2j N invariant in B 2 v + b2 + w constant in B 2 h8 H :: jb2j < N 7! v = in B 2 ] H jb2j > 0 7! w 6= in B 2 ] H

i

i

The properties of B , to be proven, are (again from Section 3.3) There exists b, local to B , such that P0. initially b + w = in B P1. r R stable in B P2. W w stable in B 13

P3. jbj M + N + 1 invariant in B P4. r + b + w constant in B P5. h8 F :: (P5.1) jbj < M + N + 1 7! r = (P5.2) jbj > 0 7! w 6=

i

in B ] F in B ] F

We start the proof by de ning b to be b1 + v + b2. To show that b is local to B observe that b1 + v + b2 is constant in the environment F since none of the variables|b1; v or b2|can be accessed by F . Proof of P0: All properties in this proof are in B . initially b1 + v = initially b2 + w = initially b1 + v + b2 + w = initially b + w = Proof of P1: r constant in B 2 r R stable in B 2 r R stable in B 1 r R stable in B 1 ] B 2 Proof of P2: Similar to the proof of P1. Proof of P3: jb1j M invariant in B 1 ] B 2 jvj 1 invariant in B 1 ] B 2 jb2j N invariant in B 1 ] B 2 jb1 + v + b2j M + N + 1 invariant in B 1 ] B 2 Proof of P4: r + b + w = r + b1 + v + b2 + w r + b1 + v constant in B 1 b2 + w constant in B 1 r + b1 + v + b2 + w constant in B 1 r + b1 + v + b2 + w constant in B 2 14 , from P0 of B 1 , from P0 of B 2 , from the above two , de nition of b

2

, r cannot be accessed by B 2 , constant formation; a constant predicate is stable , from P1 of B 1 , from Corollary 1 of Section 2.2 2

2

, from P6 applied to B 1 (with F = B 2) , de nition of v , from P6 applied to B 2 (with F = B 1) , from the above three

2

, from P4 of B 1 , b2 and w cannot be accessed by B 1 , constant formation rule , similarly

r + b1 + v + b2 + w constant in B 1 ] B 2

, Corollary 2 of Section 2.2

2

Proof of P5: Consider any arbitrary F . Setting G to B 2 ] F in P5 for B 1 we get 1. jb1j < M 7! r = in B 1 ] B 2 ] F 2. jb1j > 0 7! v 6= in B 1 ] B 2 ] F Similarly, setting H to B 1 ] F in P5 for B 2 we get 3. jb2j < N 7! v = in B 1 ] B 2 ] F 4. jb2j > 0 7! w 6= in B 1 ] B 2 ] F We will only show P5.1 for B , i.e. jbj < M + N + 1 7! r = in B ] F

We use the following fact that is easily seen from the implication and disjunction rules for leads-to.

q 7! q p _ q 7! p _ q In the following proof all properties are in B ] F , i.e., B 1 ] B 2 ] F .

0 0

jbj < M + N + 1 ) jb1j < M _ v = _ jb2j < N 7! jb1j < M _ v = 7! jb1j < M _ r = 7! r =

, since b = b1 + v + b2 , above fact about 7! and (3). , above fact about 7! and P8 for B 1 (v = stable in B 2 ] F ) , above fact about 7! and (1).

3.6 A Note on the Bag Speci cation

It is interesting to note that the speci cation of Section 3.3 does not apply if N = 0, i.e., the internal space for storing bag items is empty. In such a case, we would expect program B to move data from r to w whenever r 6= and w = . However, the progress condition P5 becomes,

h8 F :: jbj < 0 7! r = jbj > 0 7! w 6= i

in B ] F in B ] F

Since N = 0, it follows (from P6 and the de nition of b) that jbj = 0 is an invariant in B ] F . Hence, using the substitution axiom, the antecedent of each progress condition is false . In UNITY, false 7! p, for any p. 15

Therefore, for N = 0 the speci cation provides no guarantees on progress. In particular, the speci cation allows a state r 6= ^ w = to persist. It may seem that the following strengthening of the progress speci cation could generalize the speci cation to N 0.

h8 F :: jbj < N _ w = 7! r = jbj > 0 _ r 6= 7! w 6= i

in B ] F in B ] F

However since nothing is assumed about F , program F could conceivably change r 6= ^ w = to r 6= ^ w 6= (by storing data in w), thus making it impossible for B to implement the rst progress condition (while satisfying P4: r + b + w is constant in B ). Thus, such a speci cation cannot be implemented unless we assume something more about the way F behaves, in particular that it never removes data from r nor stores into w. A speci cation along this line is given in the next section.

4 Alternative Speci cation of a Bag

4.1 Communication Sequences as Auxiliary Variables

We propose another bag speci cation in this section. We take as auxiliary variables the bags r and w, where b b r is the bag of data items written into r (and similarly w). Such a bag is typically de ned by augmenting the b b program text appropriately: whenever r is changed r is altered by adding the new value of r to it. As shown in b an example in Section 2.1.1 our logic provides a direct way of de ning r, w without appealing to the program b b text. In the following, R; W are free variables (of the same type as data) and X; Y are free variables that are of type bag. A0. initially (r,w) = (r; w) in B ] F bb A1. r = X ^ r = R unless r = X + r ^ r 6= R in B ] F b b A2. w = Y ^ w = W unless w = Y + w ^ w 6= W in B ] F b b To understand the relationship between r and r and (similarly for w and w), note that r remains unchanged b b b as long as r is unchanged and r may be modi ed by adding the (new) value of r to it whenever r is changed. b Furthermore, in our case, since the values of r will alternate between and non- , successive values assumed by r are di erent. (Note that whenever r is set to , r remains unchanged.) b We deduce two simple facts. (Here \ " is the subbag relation.) A3. r A4. w

r invariant in B ] F b w invariant in B ] F b

16

We prove A3; proof of A4 is similar. Proof of A3: All properties in the following proof are in B ] F .

r = X ^ r = R unless r = X + r ^ r 6= R b b , from A1 (r; r) = (X; R) unless (r; r) 6= (X; R) ^ r r b b b , consequence weakening r r stable b , Corollary 1 of Section 2.1.5 Initially r r because r = r. Hence r r invariant. b b b

4.2 Bag Speci cation

The following speci cation is for a bag of size N; N 0. The properties R1,R2 are the same as P1,P2. In order to overcome the problems described in Section 3.6, we make some assumptions about the environment, F . Speci cally, we require F to obey a similar access protocol to w; r as the ones obeyed by B for access to r; w: Environment F may write into r only if r = and it may change w|to |only if w 6= . The property R3 states that under these conditions on F , the bag w is a subbag of r ? r (property R3.1); these two bags di er b b by no more than N (property R3.2); if r; w di er by no more than N then r is or will be set to (property b b R3.3); if jrj exceeds jwj then w is or will be set to non- (property R3.4). b b R0. initially w = in B R1. r R stable in B R2. W w stable in B R3. h8 F :: B:hypo in F ) B:conc in B ] F i where B:hypo :: R r stable , w W stable B:conc :: R3.1 w r ? r invariant b b R3.2 jr ? r ? wj N invariant b b R3.3 jr ? wj N 7! r = b b R3.4 jrj > jwj 7! w 6= b b We show, in Section 4.3, that under certain conditions the proposed speci cation is a re nement of the speci cation in Section 3.3. Now we deduce a few properties from the given speci cation. Assuming B:hypo in

F

R4. R5.

r ? w + w constant in B b b r ? w ? r constant in F . b b

17

The proof of R5 is similar to that of R4 by interchanging the roles of r; w. We prove R4. Proof of R4: All properties in the following proof are in B .

r = X ^ r = R unless r = X + r ^ r 6= R b b r R stable r = X ^ r = R unless r = X + r ^ r R b b (r; r) = (X; R) unless (r; r) 6= (X; R) ^ r = X b b b r constant b

, union theorem on A1 , from R1 , stable conjunction on the above two , consequence weakening , constant introduction

Similarly, we can show|starting from A2,R2 and applying stable conjunction|that w ? w is constant in B . b Using the constant formation rule, r ? (w ? w) is constant in B . Since w b b w (from A4) and w r (from b b b R3.1, which can be assumed since B:hypo holds in F ), r ? (w ? w) = r ? w + w, which is then constant in B . b b b b

4.3 Proof of Re nement

We show that for N > 0, the speci cation of Section 4.2 is a re nement of the speci cation of Section 3.3, provided that B:hypo holds in the environment F . In the absence of such a requirement on F , the speci cation of Section 4.2 does not prescribe B 's behavior (that is when B is composed with an arbitrary F ) whereas the speci cation of Section 3.3 prescribes B 's behavior even when it is composed with an arbitrary F . Hence, our proof obligation for the re nement is to show that, given

B:hypo in F , N > 0 and the properties R0{R3 of Section 4.2

there exists b, local to B , satisfying properties P0{P5 of Section 3.3. We let

b=r?w?r b b

Observe that b is local to B since b is constant in F (from R5). Proof of P0: b + w = r ? w ? r + w b b initially (r; w) = (r; w). Hence initially b + w = . b b Proofs of P1, P2: directly from R1, R2. Proof of P3: jr ? w ? rj N invariant b b , from R3.2 jbj N invariant , using the de nition of b Proof of P4: r + b + w = r ? w + w b b r ? w + w constant b b , from R4 18

2 2 2 2

Proof of P5.1:

, implication The result follows using disjunction on the above two. The proof of P5.2 is similar to that of P5.1.

jbj < N ^ r 6= ) jr ? w ? rj < N ^ r 6= b b ) jr ? wj N b b 7! r = Hence, jbj < N ^ r = 7! r = 6 Also, jbj < N ^ r = 7! r =

, using the de nition of b , r r (A3) b , from R3.3

2

5 Speci cations of Concurrent Queue and Concurrent Stack

We consider two other concurrent data objects|queue and stack|in this section. Each has variables r; w which are accessed similarly as in the case of the bag. Our interest in these speci cations is mainly to show that each of these speci cations re nes a bag speci cation, i.e., each implements a bag (of the appropriate size).

5.1 A Speci cation of a Concurrent Queue

The operation of a concurrent queue is analogous to that of a concurrent bag. The important di erence is that in the former case the items are written into w in the same order in which they were read from r. We propose a speci cation analogous to that of Section 3.3. In the following \uv" denotes concatenation of sequences u; v and denotes the null sequence. A queue of size N , N > 0, is de ned by a program Q where: There exists a variable q, local to Q, of type sequence such that Q0. Q1. Q2. Q3. Q4. Q5. initially q = ^ w = in Q r R stable in Q W w stable in Q jqj N invariant in Q rqw constant in Q h8 F :: (Q5.1) jqj < N 7! r = in Q ] F (Q5.2) jqj > 0 7! w 6= in Q ] F

i

19

5.1.1 Queue Implements Bag

We show that from the speci cation of Section 5.1 we can deduce the speci cation in Section 3.3 for an appropriate b. Denote the bag of items in q by q]. Let b = q]. Proofs of P0,P1,P2 are trivial. Property P3 follows from Q3 and P5 from Q5 by noting that

jbj = j q]j = jqj:

To prove P4|that r + b + w constant in B |we observe (below all properties are in B )

rqw constant rqw] constant r] + q] + w] constant r + b + w constant

, from Q4 , constant introduction , rqw] = r] + q] + w] , r] = r; q] = b; w] = w

5.2 A Speci cation of a Concurrent Stack

The operation of a concurrent stack is distinguished from that of a concurrent bag by the requirements that the items read from r be pushed onto a stack (read is permitted only if there is room in the stack) and the item written into w at any point be the top of the stack which is then removed from the stack. A stack is seldom accessed concurrently because speed di erences between the producer|process that writes into the stack|and the consumer|that reads from the stack|a ects the outcome of the reads. We propose a speci cation, analogous to that of Section 4.2, for a program S that implements a stack of size N , N > 0. Let r; w denote the sequences of data items written into r; w respectively. Analogous to the de nitions of r; w by (A0,A1,A2) we de ne r; w in S ] F (where F is any arbitrary environment of S ) by b b (A0 ; A1 ; A2 ) below. As in Section 5.1, concatenations of sequences are shown by juxtapositions and denotes the null sequence.

0 0 0

(A0 ) initially (r; w) = (r; w) in S ] F (A1 ) r = X ^ r = R unless r = Xr ^ r 6= R in S ] F (A2 ) w = Y ^ w = W unless w = Y w ^ w 6= W in S ] F

0 0 0

Now, we de ne X N Y |a binary relation between sequences X; Y expressing that Y is the complete output sequence of a stack of size N given X as the input sequence. For N 0, it is the strongest relation satisfying

N X N X ; Y (N + 1) Y ) aXY (N + 1) X aY

0 0 0 0

20

where a is any arbitrary data item. The second rule states that given an input string aXY to a stack of size N +1, the item a appears in the output at some point. Prior to output of a, item a is at the bottom of the stack and hence, the stack behaves as if its size is N in converting some portion of the input, X , to X . Following the output of a, the stack is empty and hence the remaining input sequence Y is converted to Y using a stack of size N + 1. Several interesting properties|using induction on the length of X |can be proven about this relation. In particular,

0 0

X N Y ) X] = Y ]

where X ] is the bag of items in X . Now we specify program S that implements a concurrent stack of size N , N > 0, with shared variables r; w as before. Note that S3.1 is the only property that di ers substantially from the corresponding property of the bag in Section 4.2. S0. initially w = in S S1. r R stable in S S2. W w stable in S S3. h8 F :: B:hypo in S ) B:conc in S ] F i where B:hypo :: R r stable , w W stable B:conc :: fIn S3.1, r ? r is the pre x of r excluding r (if r = , r ? r = r). Also, v is the pre x relation over sequencesg S3.1 h9 z :: (r ? r) N z ^ w v zi invariant S3.2 jrj jrj + jwj + N invariant S3.3 jrj jwj + N 7! r = S3.4 jrj > jwj 7! w 6=

5.2.1 Stack Implements a Bag

We show that from the speci cation of a concurrent stack we can deduce the bag speci cation in Section 4.2 for N > 0. Note that r = r], w = w]. It follows that r ? r = r ? r]. Proofs of properties R0{R3 are entirely b b b b straightforward, except for R3.1: w r ? r. b b

w v z (r ? r) N z ^ w v z

(r ? r) N z

) r ? r] = z ] ) w] z ] ) w ] r ? r]

21

, property of the binary relation stated earlier , fact about building a bag from a sequence , from the above two

h9 z :: (r ? r) N z ^ w v z i ) w b

r?r b

, from the above using w] = w and r ? r] = r ? r. b b

We leave it as an exercise for the reader to show that a queue of size 1 implements a stack of size 1, and vice versa.

6 Re nement of the Bag Speci cation

We re ne the speci cation of Sec. 4.2 as a step toward implementing a bag. It can be shown from that speci cation that concatenation of two bags of size M; N , where M; N are non-negative, results in a bag of size M + N + 1; the proof is similar in structure to the proof in Sec. 3.5; also, a similar proof for a queue appears in 7]. Hence a bag of size N , N > 0, can be implemented by concatenating (N + 1) bags of size 0 each. Therefore we restrict our attention to bags of size 0 and propose a re nement. If a bag has size 0, the only possible action is to move data items from r to w directly. The e ect of this action is to keep r + w constant (property T3), and such an action is guaranteed to take place provided r 6= ^ w = resulting in w 6= (property T4). T0. T1. T2. T3. T4. initially w = in B r R stable in B W w stable in B r + w constant in B r 6= ensures w 6=

in B

6.1 Proof of the Re nement

We show that the properties R0{R3 of Sec. 4.2, for N = 0, can be deduced from T0{T4. Proofs of R0,R1,R2 are immediate from T0,T1,T2. Property R3 is of the form,

h8 F :: B:hypo in F ) B:conc in B ] F i

We rst show that T5: h8 F :: B:hypo in F ^ (T0 ^ T3) ) r = w + r invariant in B ] F i b b Proof:

r ? w + w constant in B b b r + w constant in B r ? w + w ? (r + w) constant in B b b r ? w ? r constant in B b b r ? w ? r constant in F b b

22

, from R4 , from T3 , constant formation , simplifying the above expression , from R5 assuming B:hypo in F

r ? w ? r constant in B ] F b b , Corollary 2 of Sec. 2.2 Initially r ? w ? r = b b in B ] F , from r = r and w = w = b b Hence r ? w ? r = invariant in B ] F , i.e., r = w + r invariant in B ] F . b b b b

Proof of R3: Consider an arbitrary F in which B:hypo holds. R3.1. Proof is immediate given T5. R3.2 Proof is immediate given T5 and that N = 0 R3.3 We have to show that jrj jwj 7! r = b b in B ] F . In the following all properties are in B ] F . jrj jwj ) r = b b , from T5 jrj jwj 7! r = b b , using implication rule on the above R3.4 We have to show that jrj > jwj 7! w 6= b b in B ] F r = R unless r 6= R in F , antire exivity of unless r = R unless r 6= R ^ r 6= in F , stable conjunction with R r stable (from B:hypo) r 6= stable in F , Corollary 1 of Sec. 2.1.5 r 6= ensures w 6= in B , from T4 r 6= ensures w 6= in B ] F , using Corollary 4 of Sec. 2.2 on the above two r 6= 7! w 6= in B ] F , de nition of 7! jrj > jwj 7! w 6= in B ] F b b , from T5 jrj > jwj r 6= b b

2

Next we prove the four properties in B:conc, from B:hypo in F , T1, T2, and r = w + r invariant in B ] F . b b

2 2

2

7 An Implementation

The speci cation of Section 6 can be implemented by a program whose only statement moves data from r to w provided w = (if r = , the movement has no e ect):

r; w := ; r

if w =

The proof that this fragment has the properties T1{T4 is immediate from the de nition of unless and ensures. The initial condition of this program is w = , and hence (T0) is established. An implementation for a bag of size N , N > 0, uses the union of N + 1 such statements: One statement each for moving data from a location to an adjacent location (closer to w) provided the latter is . We show how this program may be expressed in the UNITY programming notation. Rename the variables r and w to be b 0] and b N + 1], respectively. The internal bag words are b 1] through b N ]. In the following program we write h ] i : 0 < i N +1 :: t(i)i as a shorthand for t(1) ] t(2) : : : ] t(N +1), 23

where t(1), for instance, is obtained by replacing every occurrence of i by 1 in t(i). The program speci es the initial values of b 1] through b N + 1] to be (in the part followed by initially). The statements of the program are given after assign; the generic statement shown moves b i ? 1] to b i] provided the latter is .

Program bag fof size N , N > 0g initially h ] i : 0 < i N + 1 :: b i] = i assign h ] i : 0 < i N + 1 :: b i ? 1]; b i] := ; b i ? 1] end fbagg

if b i] = i

8 Evaluation of the Proposed Methodology

Equational notation, as in 2], is generally preferred for speci cations because it enables a programmer to separate two concerns: deducing facts about an object independent of its implementation, and implementing the given equations to minimize time or storage. As we have argued, the objects that are concurrently accessed are, as yet, not amenable to equational speci cations. An important research problem is to determine the conditions under which an object can be speci ed equationally, and yet accessed concurrently. Our speci cation of the bag illustrates how some of the requirements described in the introduction are met. The speci cation is concise even though it makes concurrent access explicit. The notation admits of e cient manipulation|see, for instance, the lengths of the re nement proofs in Sections 4, 5 and 6. We believe that one of the most important applications of formal speci cations is reasoning about an object; a cumbersome notation|such as a programming notation|defeats the main purpose of speci cation. It may seem from our speci cation that we have arti cially simpli ed the problem by introducing shared variables|r; w in case of bag, queue, and stack|through which the object communicates with this environment. How about multiple processes accessing an object through remote procedure calls, for instance? First, remote procedure call by a process is equivalent to the process storing the procedure parameters in certain (shared) variables which the object can later access; in this sense, we have not introduced any arti cial simplicity. Second, accesses by multiple processes|for instance, multiple producers writing into shared variables r:j , 0 j < L| can be treated within our theory as follows. Imagine that a sequencer process copies some r:j into r whenever r = ; the di erent r:j 's are chosen fairly during the course of the program execution. The index j could be part of the data written into r, if, for instance, the object needs to respond to a request in r:j by writing into w:j . (Similarly, outputs for di erent consumers may be sequenced.) The entire concurrent object may be viewed as a union of an input sequencer, data manager (the program that actually implements the data structures) and the output sequencer, and each of these may be speci ed individually. We have shown how to specify the data manager; speci cations of input/output sequencers are entirely straightforward. A direct speci cation of a queue with multiple input/output variables appears in 8]. 24

9 Acknowledgment

It was a suggestion from Leslie Lamport which led me to study this problem and to understand how auxiliary variables can be de ned and e ectively used. I am indebted to Ambuj Singh for suggesting simpli cations for my original speci cation; his work clearly reveals the importance of formal reasoning as I was unable to justify his simpli cations using intuitive arguments alone. Comments from the participants of the 9th International Summer School in Marktoberdorf, W. Germany, were most helpful; particular thanks go to Jan L. A. van de Snepscheut. Thorough and critical readings of di erent versions of the manuscript by Mani Chandy, Edgar Knapp and J. R. Rao have improved the presentation.

10 References

1. K. Mani Chandy and Jayadev Misra, Parallel Program Design: A Foundation, Addison-Wesley, Reading, MA, 1988. 2. John Guttag, \Abstract Data Types and the Development of Data Structures," C.ACM, 20:6, pp. 396{404, June 1977. 3. B. Hailpern, \Verifying Concurrent Processes Using Temporal Logic," Springer-Verlag, Lecture Notes in Computer Science 129, 1982. 4. Speci cation Case Studies, Ian Hayes (ed.), Prentice-Hall International, Englewood Cli s, N.J., 1987. 5. L. Lamport, \A Simple Approach to Specifying Concurrent Systems," Digital Systems Research Center Report 15, December 1986, 130 Lytton Avenue, Palo Alto, CA 94301. 6. J. Misra, \Monotonicity, Stability and Constants," Notes on Unity 10-89, University of Texas at Austin, 1989. 7. J. Misra, \Speci cations of Concurrently Accessed Data," Proc. International Conference on Mathematics of Program Construction, Groningen University, The Netherlands, June 26{30, 1989, LNCS 375, SpringerVerlag. 8. Ambuj Singh, \Ranking in Distributed Systems," (see Sec. 3.4, in particular), Ph.D. thesis, The University of Texas at Austin, 1989.

25