PROGRAM PROOF AND MANIPULATION

Some Techniques for Proving Correctness of Programs which Alter Data Structures

R. M. Burstall
Department of Machine Intelligence
University of Edinburgh

1. INTRODUCTION

Consider the following sequence of instructions in a list-processing language with roughly ALGOL 60 syntax and LISP semantics (hd (head) is LISP CAR and tl (tail) is LISP CDR).

    x := cons(1, nil);
    y := cons(2, x);
    hd(x) := 3;    (compare LISP RPLACA)
    print(x); print(y);

The intention is as follows.
x becomes the list (1).
y becomes the list (2, 1).
The head (first element) of the list x becomes 3.
Since y was manufactured from x it 'shares' a list cell with x, and hence is side-effected by the assignment to hd(x). When x is printed it is (3) and y when printed is (2, 3) rather than (2, 1) as it would have been had the last assignment left it undisturbed.

How are we to prove assertions about such programs? Figure 1 traces the course of events in the traditional picture language of boxes and arrows. Our task will be to obtain a more formal means of making inferences, which, unlike the picture language, will deal with general propositions about lists. We will extend Floyd's proof system for flow diagrams to handle commands which process lists. The principles which apply to lists would generalise in a straightforward way to multi-component data structures with sharing and circularities.

Figure 1. [Box-and-arrow pictures of the state after each command: after x := cons(1, nil), x points to a cell holding 1 and nil; after y := cons(2, x), y points to a cell holding 2 whose tail is the cell of x; after hd(x) := 3, the shared cell holds 3 and nil.]

Although this technique permits proofs, they are rather imperspicuous and fiddling for lack of appropriate higher level concepts. Investigating the special case of linear lists in more depth we define 'the list from x to y' and consider systems of such lists (or perhaps we should say list fragments) which do not share with each other or within themselves. (By a linear list we mean one which either terminates with nil, such as LISP (A, (B, C), D), or is circular; by a tree we mean a list structure which terminates with atoms rather than with nil, such as LISP ((A . B) . (C . D)).) We thus get a rather natural way of describing the states of the machine and the transformations on them and hence obtain easy proofs for programs. Some ideas from the application of category theory to tree automata help us to extend this treatment from lists to trees: fragments of lists or trees turn out to be morphisms in an appropriate category. Acquaintance with category-theoretic notions is not however needed to follow the argument.

Our aim has been to obtain proofs which correspond with the programmer's intuitive ideas about lists and trees. Extension to other kinds of data structures awaits further investigation.

2. PREVIOUS WORK

Since McCarthy (1963) raised the problem a number of techniques for proving properties of programs have been proposed. A convenient and natural method is due to Floyd (1967) and it has formed the basis of applications to non-trivial programs, for example by London (1970) and Hoare (1971). Floyd's technique as originally proposed dealt with assignments to numerical variables, for example, x := x + 1, but did not cater for assignments to arrays, for example, a[i] := a[j] + 1, or to lists, such as tl(x) := cons(hd(x), tl(tl(x))).

McCarthy and Painter (1967) deal with arrays by introducing 'change' and 'access' functions so as to write a[i] := a[j] + 1 as a := change(a, i, access
(a, j) + 1), treating arrays as objects rather than functions. King (1969) in mechanising Floyd's technique gives a method for such assignments which, however, introduces case analysis that sometimes becomes unwieldy. Good (1970) suggests another method which distinguishes by subscripts the various versions of the array. We will explain below how Good's method can be adapted to list processing. Although the proofs mentioned above by London (1970) and Hoare (1971) involve arrays they do not give rigorous justification of the inferences involving array assignments, which are rather straightforward.

List processing programs in the form of recursive functions have received attention from McCarthy (1963), Burstall (1969) and others, but quite different problems arise when assignments are made to components of lists. This was discussed in Burstall (1970) as an extension to the axiomatic semantics of ALGOL, but the emphasis there was on semantic definition rather than program proof.

Hewitt (1970), Chapter 7, touches on proofs for list processing programs with assignments. J. Morris of Berkeley has also done some unpublished work, so have B. Wegbreit and J. Poupon of Harvard (Ph.D. thesis, forthcoming).

3. FLOYD'S TECHNIQUE

Let us recall briefly the technique of Floyd (1967) for proving correctness of programs in flow diagram form. We attach assertions to the points in the flow diagram and then verify that the assertion at each point follows from those at all the immediately preceding points in the light of the intervening commands. Floyd shows, by induction on the length of the execution path, that if this verification has been carried out, then whenever the program is entered with a state satisfying the assertion at the entry point it will exit, if at all, only with a state satisfying the assertion at the exit point.

The rules for verification distinguish two cases: tests and assignments.

(1) A triple consisting of an assertion, a test and another assertion,
thus $S_1$, test $P$ (YES branch), $S_2$, is said to be verified if $S_1, P \vdash_A S_2$, that is, assertion $S_2$ is deducible from $S_1$ and test $P$ using some axioms $A$, say the axioms for integer arithmetic. If $S_2$ is attached to the NO branch the triple is verified if $S_1, \neg P \vdash_A S_2$.

(2) A triple consisting of an assertion, an assignment and another assertion, thus
$S_1$, $I := E$, $S_2$, where $I$ is some identifier and $E$ some expression, is said to be verified if $S_1 \vdash_A [S_2]^I_E$, where $[S_2]^I_E$ means the statement $S_2$ with $E$ substituted for $I$ throughout. (This is called backward verification by King (1969); it is a variant of Floyd's original method, which introduces an existentially quantified variable.)

We will here retain the inductive method of Floyd for dealing with flow diagrams containing loops, but give methods for coping with more complex kinds of assignment command.

4. EXTENSION OF THE VERIFICATION PROCEDURE TO ARRAY ASSIGNMENTS

Consider the following command and assertions

    $1 \le i \le 9$ and for all $x$ such that $1 \le x \le 100$, $a(x) = x$
    a((a(i) + 1) × a(i + 1)) := 0
    $a(i \times i + 2 \times i + 1) \ne a(i)$

The second assertion does hold if the first one does, but the verification rule given above for assignments to numeric variables, such as j := 2 × j, is inadequate for array assignments such as this. Thus attempts to substitute 0 for $a((a(i)+1) \times a(i+1))$ in $a(i \times i + 2 \times i + 1) \ne a(i)$ merely leave it unchanged, but the unchanged assertion does not follow from the first assertion. (Floyd's version of the rule, using an existential quantifier, is equally inapplicable.)

Following Good (1970), with a slightly different notation, we can overcome the difficulty by distinguishing the new version of the array from the old one by giving it a distinct symbol, say $a'$. We also make explicit the fact that other elements have not changed. We thus attempt to show that

$$1 \le i \le 9,\ (\forall x)(1 \le x \le 100 \Rightarrow a(x) = x),\ a'((a(i)+1) \times a(i+1)) = 0,\ (\forall y)(y \ne (a(i)+1) \times a(i+1) \Rightarrow a'(y) = a(y)) \ \vdash_{\text{arith}}\ a'(i \times i + 2 \times i + 1) \ne a'(i).$$

Once we note that $(a(i)+1) \times a(i+1)$ is $(i+1)^2$, as is $i \times i + 2 \times i + 1$, and that, since $1 \le i$, $(i+1)^2 \ne i$, we have $a'(i \times i + 2 \times i + 1) = 0$ and $a'(i) = i$, so
$a'(i) > 0$. The distinction between $a$ and $a'$ and the condition that all elements which are not assigned to are unchanged reduce the problem to elementary algebra.

5. COMMANDS, CHANGE SETS AND TRANSFORMATION SENTENCES

The following technique is a variant of the one given by Good (1970). He uses it for assignments to arrays but not for list processing.

With each assignment command we associate a set of identifiers called the change set and a set of sentences called the transformation sentences. (Rules for specifying these are given below.) The basic rule of inference is:

If a set $S$ of sentences is written before a command $C$, and $C$ has a set $T$ of transformation sentences, then we may write a sentence $U$ after $C$ if $S, T \vdash U'$, where $U'$ is $U$ with each identifier $i$ in the change set of $C$ replaced by $i'$.

(1) Simple assignment
    Command:                   I := E                 e.g. i := i + j
    Change set:                {I}                    e.g. {i}
    Transformation sentences:  $I' = E$               e.g. $i' = i + j$

Example: with $i > 0$ and $j > 0$ written before the command i := i + j, writing $i > 1$ after it is legitimate because $i > 0,\ j > 0,\ i' = i + j \vdash i' > 1$.

Note. By $A \vdash B$ we mean $B$ is provable from $A$ using axioms of arithmetic or other axioms about the operations of the language.
Note. Variables (identifiers) in the program become constants in the logic.

(2) Array assignment
    Command:                   A[E1] := E2            e.g. a[a[i]] := a[i] + a[j]
    Change set:                {A}                    e.g. {a}
    Transformation sentences:  $A'(E_1) = E_2$        e.g. $a'(a(i)) = a(i) + a(j)$
                               $(\forall x)(x \ne E_1 \Rightarrow A'(x) = A(x))$
                                                      e.g. $(\forall x)(x \ne a(i) \Rightarrow a'(x) = a(x))$

(3) List processing
(a) Command:                   hd(E1) := E2           e.g. hd(tl(hd(i))) := hd(i)
    Change set:                {hd}                   e.g. {hd}
    Transformation sentences:  $hd'(E_1) = E_2$       e.g. $hd'(tl(hd(i))) = hd(i)$
                               $(\forall x)(x \ne E_1 \Rightarrow hd'(x) = hd(x))$
                                                      e.g. $(\forall x)(x \ne tl(hd(i)) \Rightarrow hd'(x) = hd(x))$
(b) Command: tl(E1) := E2. As for hd.
(c) Command:                   I := cons(E1, E2)      e.g. i := cons(2, j)
    Change set:                {I, used, new}         e.g. {i, used, new}
    Transformation sentences:  $new' \notin used$
                               $used' = used \cup \{new'\}$
                               $hd(new') = E_1$       e.g. $hd(new') = 2$
                               $tl(new') = E_2$       e.g. $tl(new') = j$
                               $I' = new'$            e.g. $i' = new'$

Note. We assume E1 and E2 do not contain cons. Complex cons expressions must be decomposed.

It should be clear that the choice of two-element list cells is quite arbitrary, and exactly analogous rules could be given for a language allowing 'records' or 'plexes', that is a variety of multi-component cells.

6. CONCEPTS FOR LIST PROCESSING

We could now proceed to give examples of proofs for simple list processing programs, but with our present limited concepts it would be difficult to express the theorems and assertions except in a very ad hoc way. We want to talk about the list pointed to by an identifier i and say that this is distinct from some other list. We want to be able to define conveniently such concepts as reversing or concatenating (appending) lists. To do this we now specialise our discussion to the case where cons, hd and tl are used to represent linear lists. Such lists terminate in nil or else in a cycle, and we do not allow atoms other than nil in the tail position. Thus we exclude binary trees and more complicated multi-component record structures.

First we define the possible states of a list processing machine.

A machine is characterised by:
    $C$, a denumerable set of cells
    nil, a special element
    $A$, a set of atoms ($C$, {nil} and $A$ are disjoint)
    a function from finite subsets of $C$ to $C$ which, applied to a set $X$, yields a cell not in $X$: a function for producing a new cell.
It is convenient to write $X^0$ for $X \cup \{\mathrm{nil}\}$ where $X \subseteq C$.

A state is characterised by:
    $U \subseteq C$, the cells in use
    $hd: U \to A \cup U^0$
    $tl: U \to U^0$ (thus we are talking about lists rather than binary trees)

Let $X^*$ be the set of all strings over a set $X$, including the empty string which we write as 1. Also let $T = \{\text{true}, \text{false}\}$.
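The machine state just described can be modelled directly, which makes the sharing in the introductory example easy to observe. The following is only a minimal sketch under stated assumptions: the class name `Machine`, the method `as_list`, the use of strings such as 'c1' for cells, integers for atoms and `None` for nil are all illustrative choices that do not come from the paper.

```python
class Machine:
    """Toy list-processing machine (illustrative only).

    Cells are strings 'c1', 'c2', ...; atoms are integers; nil is None,
    so the three kinds of value stay disjoint.
    """

    def __init__(self):
        self.used = set()   # U: the cells in use
        self.hd = {}        # hd: U -> atoms, cells or nil
        self.tl = {}        # tl: U -> cells or nil

    def cons(self, head, tail):
        # Produce a cell that is not yet in `used`, then record its fields.
        new = "c%d" % (len(self.used) + 1)
        assert new not in self.used
        self.used.add(new)
        self.hd[new] = head
        self.tl[new] = tail
        return new

    def as_list(self, u):
        # Follow tl pointers to nil (loops forever on a cyclic list).
        items = []
        while u is not None:
            items.append(self.hd[u])
            u = self.tl[u]
        return items


m = Machine()
x = m.cons(1, None)    # x := cons(1, nil)
y = m.cons(2, x)       # y := cons(2, x)
m.hd[x] = 3            # hd(x) := 3
print(m.as_list(x))    # [3]
print(m.as_list(y))    # [2, 3]
```

Because y's tail is the very cell that x names, the assignment to hd(x) is visible through y, which is the side effect the proof rules have to account for.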
It is convenient to define unit strings over $A \cup U^0$ thus
    Unit: $(A \cup U^0)^* \to T$
    Unit($\sigma$) $\Leftrightarrow \sigma \in (A \cup U^0)$

We can now associate a set $\mathcal{L}$ of triples with a state. $(\alpha, u, v) \in \mathcal{L}$ means
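that $\alpha$ is a list from $u$ to $v$ (or perhaps we should say a list fragment from $u$ to $v$). Thus $\mathcal{L} \subseteq (A \cup U^0)^* \times U^0 \times U^0$. We shall write $u \xrightarrow{\alpha} v$ as a mnemonic abbreviation for $(\alpha, u, v)$, hd u for hd(u) and tl u for tl(u). We define $\mathcal{L}$ inductively, thus:
(i) $u \xrightarrow{1} u \in \mathcal{L}$ for each $u \in U^0$
(ii) if $u \xrightarrow{\alpha} v \in \mathcal{L}$ and $v \xrightarrow{\beta} w \in \mathcal{L}$ then $u \xrightarrow{\alpha\beta} w \in \mathcal{L}$
(iii) $u \xrightarrow{hd\,u} tl\,u \in \mathcal{L}$ for each $u \in U$.

For example, in figure 2, $x \to y$, $y \to \ldots$ and $z \to \mathrm{nil}$ (each arrow labelled with the string it represents) are all in $\mathcal{L}$.

Figure 2. [Box-and-arrow picture of a list structure with cells reached from x, y and z.]

Identity lists are defined thus
    Identity: $\mathcal{L} \to T$
    Identity($u \xrightarrow{\alpha} v$) $\Leftrightarrow \alpha = 1$

A partial operation of composition ($\cdot$) between lists is defined thus
    $\cdot : \mathcal{L} \times \mathcal{L} \to \mathcal{L}$
    $(u \xrightarrow{\alpha} v) \cdot (v \xrightarrow{\beta} w) = u \xrightarrow{\alpha\beta} w$
Thus a list from $u$ to $v$ can be composed with one from $v'$ to $w$ if and only if $v = v'$.

The reader familiar with category theory will notice that $\mathcal{L}$ forms a category with $U^0$ as objects and the lists (triples in $\mathcal{L}$) as morphisms. (A definition of 'category' is given in the Appendix.) Indeed it is the free category generated by the graph whose arrows are the lists $u \xrightarrow{hd\,u} tl\,u$ for each $u$. There is a forgetful functor from $\mathcal{L}$ to $(A \cup U^0)^*$, where the latter is regarded as a category with one object. This functor gives the string represented by a list fragment (cf. Wegbreit and Poupon's notion of a covering function in the unpublished work mentioned in Section 2).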
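To make the inductive definition of list fragments concrete, the sketch below builds fragments from the one-cell generators of clause (iii) using the partial composition of clause (ii). It is an illustration under assumptions, not the paper's formulation: a fragment is a Python tuple `(alpha, u, v)`, nil is `None`, the state is a pair of dictionaries, and the names `compose`, `generator` and `shortest_fragment` are hypothetical.

```python
def compose(lam, mu):
    """Partial composition: defined only when lam ends where mu starts."""
    alpha, u, v = lam
    beta, v2, w = mu
    if v != v2:
        return None                      # composition undefined
    return (alpha + beta, u, w)


def generator(hd, tl, u):
    """The one-cell fragment from u to tl u, labelled with hd u."""
    return ((hd[u],), u, tl[u])


def shortest_fragment(hd, tl, u, v):
    """Shortest fragment from u to v, or None if v is never reached."""
    frag = ((), u, u)                    # identity fragment, empty string
    seen = set()
    while frag[2] != v:
        cell = frag[2]
        if cell is None or cell in seen: # ran off the end, or went round a cycle
            return None
        seen.add(cell)
        frag = compose(frag, generator(hd, tl, cell))
    return frag


# A three-cell list holding a, b, c and ending in nil.
hd = {"c1": "a", "c2": "b", "c3": "c"}
tl = {"c1": "c2", "c2": "c3", "c3": None}
print(shortest_fragment(hd, tl, "c1", "c3"))   # (('a', 'b'), 'c1', 'c3')
```

In a cyclic structure there are several fragments between the same two cells (going round the cycle more than once), which is why the function is described as returning the shortest one.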
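We define the number of occurrences of a cell in a list by induction
    $\delta: U \times \mathcal{L} \to N$
    (i) $\delta_u(v \xrightarrow{1} v) = 0$
    (ii) $\delta_u(x \xrightarrow{hd\,x} tl\,x \xrightarrow{\alpha} w) = \delta_u(tl\,x \xrightarrow{\alpha} w) + 1$ if $x = u$
         $\phantom{\delta_u(x \xrightarrow{hd\,x} tl\,x \xrightarrow{\alpha} w)} = \delta_u(tl\,x \xrightarrow{\alpha} w)$ if $x \ne u$
It follows by an obvious induction that $\delta_u(\lambda \cdot \mu) = \delta_u(\lambda) + \delta_u(\mu)$.

To keep track of the effects of assignments we now define a relation of distinctness between lists, that is, they have no cells in common.
    Distinct: $\mathcal{L} \times \mathcal{L} \to T$
    Distinct($\lambda, \mu$) $\Leftrightarrow$ $\delta_u(\lambda) = 0$ or $\delta_u(\mu) = 0$ for all $u \in U$.
We also define a property of non-repetition for lists
    Nonrep: $\mathcal{L} \to T$
    Nonrep($\lambda$) $\Leftrightarrow$ $\delta_u(\lambda) \le 1$ for all $u \in U$

Lemma 1
(i) Distinct($\lambda$, $u \xrightarrow{1} u$) for all $\lambda$ and $u$
(ii) Nonrep($u \xrightarrow{1} u$) for all $u$
(iii) Distinct($\lambda, \mu$) and Nonrep($\lambda$) and Nonrep($\mu$) $\Rightarrow$ Nonrep($\lambda \cdot \mu$), if $\lambda \cdot \mu$ is defined.
Proof. Immediate.

We are now equipped to state the correctness criterion for a simple list processing program and to supply and prove the assertions. Consider the problem of reversing a list by altering the pointers without using any new space (figure 3). We first need to define an auxiliary function to reverse a string
    rev: $X^* \to X^*$
    rev(1) = 1
    rev($x\alpha$) = rev($\alpha$)$x$ for $x \in X$, $\alpha \in X^*$

Figure 3. [Box-and-arrow pictures of a list ending in nil, before and after reversal.]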
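The occurrence count, Distinct, Nonrep and rev can all be computed for a concrete state, which may help in reading the definitions. This is a sketch under assumptions rather than the paper's own formulation: a fragment is a tuple `(alpha, u, v)` that is taken on trust to be in the set of list fragments, the cells it passes through are recovered by following `tl` once per element of `alpha`, and all function names are hypothetical.

```python
def cells(tl, frag):
    """Cells a fragment (alpha, u, v) passes through, in order (v itself excluded)."""
    alpha, u, _ = frag
    visited = []
    for _ in alpha:          # one tl-step per element of the string
        visited.append(u)
        u = tl[u]
    return visited


def delta(tl, cell, frag):
    """Number of occurrences of `cell` in the fragment."""
    return cells(tl, frag).count(cell)


def nonrep(tl, frag):
    """No cell occurs more than once."""
    visited = cells(tl, frag)
    return len(visited) == len(set(visited))


def distinct(tl, lam, mu):
    """The two fragments have no cell in common."""
    return not (set(cells(tl, lam)) & set(cells(tl, mu)))


def rev(s):
    """rev(1) = 1, rev(x alpha) = rev(alpha) x, on tuples used as strings."""
    return () if not s else rev(s[1:]) + (s[0],)


# A two-cell cyclic list: c1 -> c2 -> c1.
tl = {"c1": "c2", "c2": "c1"}
once_round = (("a", "b"), "c1", "c1")
further = (("a", "b", "a"), "c1", "c2")
print(nonrep(tl, once_round), nonrep(tl, further))   # True False
```

Going once round the cycle visits each cell once, so the fragment is non-repeating; continuing one step further revisits c1.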
Figure 4. Reversing a list. [Flow diagram for j = REVERSE(k), assuming rev: $U^* \to U^*$ reverses strings and K is a constant string. The commands are j := nil, the test k = nil, and in the loop body i := tl k; tl k := j; j := k; k := i. Each point carries a full assertion written in terms of membership of $\mathcal{L}$, Nonrep and Distinct; the loop assertion is $(\exists \alpha\beta)[k \xrightarrow{\alpha} \mathrm{nil} \in \mathcal{L}.\ j \xrightarrow{\beta} \mathrm{nil} \in \mathcal{L}.$ Nonrep($k \xrightarrow{\alpha} \mathrm{nil}$). Nonrep($j \xrightarrow{\beta} \mathrm{nil}$). Distinct($k \xrightarrow{\alpha} \mathrm{nil}$, $j \xrightarrow{\beta} \mathrm{nil}$). rev($\alpha$)$\beta$ = rev(K)$]$.]
By well-known methods of induction on the length of the string (structural induction) we can prove simple lemmas such as
    rev($\alpha\beta$) = rev($\beta$) rev($\alpha$) for $\alpha, \beta \in X^*$

Notice the distinction between the function rev which works on strings and has easily proved properties and the function or procedure REVERSE to reverse lists using assignment, which we are about to define. The latter is essentially a function from machine states to machine states, where a machine state is characterised by the two functions hd and tl.

The flow diagram for REVERSE with assertions is given in figure 4. Notice that the assertions are long and tedious. We can verify the assertions by using the techniques given above, distinguishing between tl and tl' and consequently between List, Distinct and List', Distinct'. The verification proofs are quite long for such a simple matter and very boring. We will not weary the reader with them; instead we will try to do better.

7. DISTINCT NON-REPEATING LIST SYSTEMS

We will now gather together the concepts introduced so far into a single notion, that of a Distinct Non-repeating List System (DNRL System). Using a suitable abbreviated notation we can render the assertions brief and perspicuous. To make the verification proofs equally attractive we show how the various kinds of commands, namely assignment to head or tail and cons commands, correspond to simple transformations of these systems. We can prove this once and for all using our previous technique and then use the results on a variety of programs.

We define a Distinct Non-repeating List System as an n-tuple of triples $\lambda_i$, $i = 1, \ldots, n$, such that
(i) $\lambda_i \in \mathcal{L}$ for each $i = 1, \ldots, n$
(ii) Nonrep($\lambda_i$) for each $i = 1, \ldots, n$
(iii) If $j \ne i$ then Distinct($\lambda_i, \lambda_j$) for each $i, j = 1, \ldots, n$

It is clear that if S is a DNRL System then so is any permutation of S. Thus the ordering of the sequence is immaterial. We should not think of S merely as a set of triples, however, since it is important whether S contains a triple $x \xrightarrow{\alpha} y$ once only or more than once (in the latter case it fails to be a DNRL System unless $\alpha = 1$).
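The three conditions can be checked mechanically for a given state. The sketch below is illustrative only: it assumes fragments are tuples `(alpha, u, v)`, nil is `None`, the state is given by dictionaries `hd` and `tl`, and the function names `walk` and `is_dnrl` are hypothetical. It relies on the observation that conditions (ii) and (iii) together say exactly that no cell is visited twice when all the fragments are traversed one after another.

```python
def walk(hd, tl, frag):
    """Cells of the fragment (alpha, u, v), or None if it is not a list from u to v."""
    alpha, u, v = frag
    visited = []
    for item in alpha:
        if u is None or hd[u] != item:   # ran into nil, or wrong head
            return None
        visited.append(u)
        u = tl[u]
    return visited if u == v else None


def is_dnrl(hd, tl, system):
    """True if `system` (a tuple of fragments) is a DNRL System in this state."""
    all_cells = []
    for frag in system:
        visited = walk(hd, tl, frag)
        if visited is None:              # condition (i) fails
            return False
        all_cells.extend(visited)
    # Conditions (ii) and (iii): no cell occurs twice, within or across fragments.
    return len(all_cells) == len(set(all_cells))


# c1 -> c2 -> nil, with a third cell c3 whose tail also points at c2.
hd = {"c1": "a", "c2": "b", "c3": "c"}
tl = {"c1": "c2", "c2": None, "c3": "c2"}

ok = ((("a", "b"), "c1", None), (("c",), "c3", "c2"))
overlapping = ((("a", "b"), "c1", None), (("c", "b"), "c3", None))
print(is_dnrl(hd, tl, ok), is_dnrl(hd, tl, overlapping))   # True False
```

The second system fails because both fragments pass through the cell c2, whereas the first stops the fragment from c3 at c2 without including it.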
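Abbreviation. We will write $u_1 \xrightarrow{\alpha_1} u_2 \xrightarrow{\alpha_2} u_3 \ldots \xrightarrow{\alpha_{k-1}} u_k$ for $u_1 \xrightarrow{\alpha_1} u_2,\ u_2 \xrightarrow{\alpha_2} u_3,\ \ldots,\ u_{k-1} \xrightarrow{\alpha_{k-1}} u_k$. We also write *S for 'S is a DNRL System'.

For example, an abbreviated state description for the state shown in figure 2 is
    $*(x \to u \to y \to u,\ z \to \mathrm{nil},\ w \to z)$, with each arrow labelled by its string,
or a less explicit description
    $(\exists \alpha\beta\gamma\delta)(*(x \xrightarrow{\alpha} u \xrightarrow{\beta} y \xrightarrow{\gamma} u,\ z \xrightarrow{\delta} \mathrm{nil})$ and $*(w \to \mathrm{nil})$ and Atom(a))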
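Figure 5. Reversing a list.

j = REVERSE(k)
Assume rev: $(A \cup U)^* \to (A \cup U)^*$ reverses strings. K is a constant string. Assume $a, b, c, \ldots, \alpha, \beta, \gamma$ are existentially quantified before each assertion.

    entry:              $*(k \xrightarrow{K} \mathrm{nil})$
    j := nil
    loop:               $*(k \xrightarrow{\alpha} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($\alpha$)$\beta$ = rev(K).
    test k = nil
      YES (exit):       $*(j \xrightarrow{rev(K)} \mathrm{nil})$
      NO:               $*(k \xrightarrow{a\gamma} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($a\gamma$)$\beta$ = rev(K). Unit(a).
    i := tl k
                        $*(k \xrightarrow{a} i \xrightarrow{\gamma} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($a\gamma$)$\beta$ = rev(K). Unit(a).
    tl k := j
                        $*(k \xrightarrow{a} j \xrightarrow{\beta} \mathrm{nil},\ i \xrightarrow{\gamma} \mathrm{nil})$. rev($\gamma$)$a\beta$ = rev(K).
    j := k; k := i      (return to loop)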
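Figure 5 shows the REVERSE program again with much pithier assertions. We will now consider how to verify assertions such as these painlessly. The following proposition which enables us to manipulate DNRL Systems follows easily from the definitions above.

Proposition 1
(i) Permutation
    $*S \Rightarrow *S'$ if $S'$ is a permutation of $S$.
(ii) Deletion
    $*(\lambda_1, \lambda_2, \ldots, \lambda_n) \Rightarrow *(\lambda_2, \ldots, \lambda_n)$
(iii) Identity
    $*(\lambda_1, \ldots, \lambda_n) \Rightarrow *(u \xrightarrow{1} u, \lambda_1, \ldots, \lambda_n)$
(iv) Composition
    $*(u \xrightarrow{\alpha} v \xrightarrow{\beta} w, \lambda_1, \ldots, \lambda_n) \Rightarrow *(u \xrightarrow{\alpha\beta} w, \lambda_1, \ldots, \lambda_n)$
    and conversely
    $*(u \xrightarrow{\alpha\beta} w, \lambda_1, \ldots, \lambda_n) \Rightarrow (\exists v)\, *(u \xrightarrow{\alpha} v \xrightarrow{\beta} w, \lambda_1, \ldots, \lambda_n)$
(v) Distinctness
    $*(u \xrightarrow{\alpha} v,\ u \xrightarrow{\beta} w) \Rightarrow \alpha = 1$ or $\beta = 1$
(vi) Inequality
    $*(u \xrightarrow{\alpha} v)$ and $u \ne v \Rightarrow (\exists b\,\beta)$(Unit(b) and $\alpha = b\beta$).

Proof. (i) and (ii) Immediate from the definition of *. (iii) By Lemma 1 (i) and (ii). (iv) By Lemma 1 (iii). (v) If $\alpha \ne 1$ and $\beta \ne 1$ then $\delta_u(u \xrightarrow{\alpha} v) = 1$ and $\delta_u(u \xrightarrow{\beta} w) = 1$ so they are not distinct. (vi) By definition of $\mathcal{L}$.

We are now able to give the transformation sentences associated with the various commands, in terms of * rather than in terms of hd and tl. They enable us to verify the assertions in figure 5 very easily. The transformation sentences all involve replacing or inserting a component of a *-expression, leaving the other components unchanged. They are displayed in figure 6. We will make some comments on these transformation sentences and how to use them. Their correctness may be proved using the transformation sentences given earlier for hd and tl and the following lemma.

Lemma 2. Suppose for all $y \ne x$, $hd'\,y = hd\,y$ and $tl'\,y = tl\,y$. Then for all $\lambda \in \mathcal{L}$ such that $\delta_x(\lambda) = 0$ we have $\lambda \in \mathcal{L}'$ and $\delta'_y(\lambda) = \delta_y(\lambda)$ for all $y$.

Corollary. Under the same supposition if $\lambda, \mu \in \mathcal{L}$ and $\delta_x(\lambda) = 0$ and $\delta_x(\mu) = 0$ then Nonrep($\lambda$) $\Rightarrow$ Nonrep'($\lambda$) and Distinct($\lambda, \mu$) $\Rightarrow$ Distinct'($\lambda, \mu$).

Proof. By an obvious induction on $\mathcal{L}$.

The rule for assignment to a simple identifier is as before. Transformation sentences are given for the YES and NO branches of a test, even though these do not alter the state (their change sets are empty).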
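Figure 6. Transformation sentences for state descriptions.

Assume $n, \lambda_1, \ldots, \lambda_n, x, y, a$ are universally quantified. The transformation sentence is shown after the change set of each command. We assume that E, E1 and E2 do not contain cons.

    I := E                   Change set = {I}        $I' = E$
    test E1 = E2, YES        Change set = { }        $E_1 = E_2$
    test E1 = E2, NO         Change set = { }        $E_1 \ne E_2$
    hd E1 := E2              Change set = {hd, *}    $*(E_1 \xrightarrow{a} y, \lambda_1, \ldots, \lambda_n)$ and Unit(a) $\Rightarrow *'(E_1 \xrightarrow{E_2} y, \lambda_1, \ldots, \lambda_n)$
    tl E1 := E2              Change set = {tl, *}    $*(E_1 \xrightarrow{a} y, \lambda_1, \ldots, \lambda_n)$ and Unit(a) $\Rightarrow *'(E_1 \xrightarrow{a} E_2, \lambda_1, \ldots, \lambda_n)$
    I := cons(E1, E2)        Change set = {*, I}     $*(\lambda_1, \ldots, \lambda_n) \Rightarrow *'(I' \xrightarrow{E_1} E_2, \lambda_1, \ldots, \lambda_n)$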
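As a dynamic sanity check (not a substitute for the proof), the loop assertion of the REVERSE program can be tested at run time: the lists from k and from j must be non-repeating and distinct, and rev(α)β must equal rev(K). The sketch below assumes a state given as dictionaries `hd` and `tl` with nil as `None`; the helper `string_to_nil` and the Python function `reverse` are hypothetical names introduced for illustration.

```python
def rev(s):
    """Reverse a string represented as a tuple."""
    return tuple(reversed(s))


def string_to_nil(hd, tl, u):
    """String and cell set of the list from u to nil; fails if a cell repeats."""
    items, seen = [], set()
    while u is not None:
        assert u not in seen, "list is not non-repeating"
        seen.add(u)
        items.append(hd[u])
        u = tl[u]
    return tuple(items), seen


def reverse(hd, tl, k):
    """In-place list reversal, checking the loop assertion on every pass."""
    K, _ = string_to_nil(hd, tl, k)
    j = None                                     # j := nil
    while True:
        alpha, cells_k = string_to_nil(hd, tl, k)
        beta, cells_j = string_to_nil(hd, tl, j)
        assert not (cells_k & cells_j)           # the two lists are distinct
        assert rev(alpha) + beta == rev(K)       # rev(alpha) beta = rev(K)
        if k is None:                            # test k = nil, YES branch
            return j
        i = tl[k]                                # i := tl k
        tl[k] = j                                # tl k := j
        j = k                                    # j := k
        k = i                                    # k := i


hd = {"c1": 1, "c2": 2, "c3": 3}
tl = {"c1": "c2", "c2": "c3", "c3": None}
j = reverse(hd, tl, "c1")
print(string_to_nil(hd, tl, j)[0])   # (3, 2, 1)
```

On exit k is nil, so α is empty and the assertion collapses to β = rev(K), which is the exit condition of the program.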
Consider for example the NO branch of the test 'k = nil' in the REVERSE program of figure 5. Before the command we have
    $(\exists \alpha\beta)[*(k \xrightarrow{\alpha} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($\alpha$)$\beta$ = rev(K)$]$    (1)
The transformation sentence, putting n = 1, is
    $k \ne \mathrm{nil}$.    (2)
After the command, we have
    $(\exists a\gamma\beta)[*(k \xrightarrow{a\gamma} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($a\gamma$)$\beta$ = rev(K). Unit(a)$]$    (3)
But (3) follows immediately from (1) and (2) using Proposition 1 (vi).

In general, Proposition 1 will be used to reorder or otherwise manipulate the *-expressions and if E1 or E2 contain references to hd or tl these will need to be removed by replacing them by $\bar{E}_1$ and $\bar{E}_2$ using Proposition 2. Still, the verification is quite trivial.

Consider another example from the REVERSE program, the command 'tl k := j'. Before the command, we have
    $(\exists a\gamma\beta)[*(k \xrightarrow{a} i \xrightarrow{\gamma} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($a\gamma$)$\beta$ = rev(K). Unit(a)$]$    (1)
The transformation sentence, putting n = 2, is
    $*(k \xrightarrow{a} y, \lambda_1, \lambda_2) \Rightarrow *'(k \xrightarrow{a} j, \lambda_1, \lambda_2)$, for all $a, y, \lambda_1, \lambda_2$    (2)
Rewriting the statement after the command with *' for *, we have
    $(\exists a\gamma\beta)[*'(k \xrightarrow{a} j \xrightarrow{\beta} \mathrm{nil},\ i \xrightarrow{\gamma} \mathrm{nil})$. rev($\gamma$)$a\beta$ = rev(K)$]$    (3)
We must prove this from (1) and (2). Combining (1) with (2) we get
    $(\exists a\gamma\beta)[*'(k \xrightarrow{a} j,\ i \xrightarrow{\gamma} \mathrm{nil},\ j \xrightarrow{\beta} \mathrm{nil})$. rev($a\gamma$)$\beta$ = rev(K). Unit(a)$]$
But permuting the *' expression (Proposition 1 (i)) and using obvious properties of rev we get (3).

The sentences for hd and cons are used in an analogous way.

Because their meaning changes in a relatively complex way it is advisable to debar hd and tl from appearing in state descriptions and work in terms of * alone. We now consider how to reduce an expression involving hd or tl with respect to a state description so as to eliminate references to these. For example the expression hd(tl(tl(i))) with respect to the state description $*(i \xrightarrow{a} x \xrightarrow{b} y \xrightarrow{c} z, \ldots)$ with Unit(a), Unit(b), Unit(c) reduces to c. If such an expression occurs in a transformation sentence we reduce it to obtain an equivalent transformation sentence not involving hd or tl.

The reduction of an expression E with respect to a state description D, written $\bar{E}$, is defined recursively by
(i) If E is an identifier, constant or variable then $\bar{E} = E$
(ii) If E is hd E1 and D contains $*(\bar{E}_1 \xrightarrow{a} x, \ldots)$ and Unit(a) then $\bar{E} = a$
(iii) If E is tl E1 and D contains $*(\bar{E}_1 \xrightarrow{a} x, \ldots)$ and Unit(a) then $\bar{E} = x$.

Proposition 2. $D \vdash E = \bar{E}$.
The proof is straightforward by induction on the structure of E, using the definition of *.

8. EXAMPLES OF PROOFS FOR LIST PROCESSING

The transformation sentences can best be understood by seeing how we use them in proving some simple list processing programs which alter the list structures. Although these programs are short their mode of operation, and hence their correctness, is not immediately obvious.

We have already looked at the program REVERSE. Another example, involving cons, is concatenation of two lists, copying the first and terminating it with a pointer to the second, see figure 7. Notice that the input lists need not necessarily be distinct. The other, destructive, method of concatenation is to overwrite the final nil of the first list with a pointer to the second. In this case our initial condition would have to be $*(k \xrightarrow{K} \mathrm{nil},\ l \xrightarrow{L} \mathrm{nil})$, ensuring distinctness.

Our next example, figure 8, is reversing a cyclic list, where instead of terminating with nil the list eventually reaches the starting cell again.

The next example, figure 9, involves a list with sub-lists. A list of lists is to be concatenated together and the REVERSE and CONCAT routines already defined are used. The suspicious reader may notice that we are making free with '...' in the assertions, but we can always replace $(s_1, \ldots, s_n)$ by, say, $(s_i)_{i=1}^{n}$ or sequence(s, 1, n), so nothing mysterious is involved.

Some of the attraction of such proofs seems to come from the fact that the form of the assertions is graph-like and so (at least in figures 6, 7 and 8) strictly analogous to the structures they describe.

9. TREES

We will now pass from lists to trees. We will consider general 'record' or 'plex' structures with various kinds of cell each having a specified number of components, as for example in Wirth and Hoare (1966). A particular case is that of binary trees. Thus, instead of
    $hd: U \to A \cup U^0$
    $tl: U \to U^0$
we now allow
    $hd: U \to A \cup U$
    $tl: U \to A \cup U$
(nil has no special place in binary trees; it can be considered an atom like any other).
More generally we may replace trees built using cons with trees (terms or expressions) built using any number of operators with arbitrary numbers of arguments, corresponding to different kinds of records.
Figure 7. Concatenation. [Flow diagram for m = CONCAT(k, l), which copies the list k cell by cell using cons and makes the tail of the last copied cell point to l, annotated with *-assertions. K and L are constant strings of atoms, and a and b units.]
Figure 8. Reversing a cyclic list. [Flow diagram for j = CYCLICREVERSE(l), annotated with *-assertions relating the strings at each point to rev(L).]
Figure 9. Multiple concatenation. [Flow diagram for l = MULTICONCAT(i), which uses the REVERSE and CONCAT routines, annotated with *-assertions. Assume $N, C_1, \ldots, C_N, A_1, \ldots, A_N$ are constants.]
Our previous discussion depended rather heavily on the concatenation operation for lists. Fortunately Lawvere (1963) with his notion of a 'Free Theory' has shown how concatenation can be extended from strings to trees (see also Eilenberg and Wright 1967). His category theory approach is convenient in view of the remark in an earlier section that $\mathcal{L}$ forms a category. We will not present our ideas in category-theoretic language, since this may not be familiar to some readers, but will be content to point out where the concept of category is appropriate.

Our first task is to define a system of trees or expressions with a composition operation, analogous to the strings used above. We will then apply them to tree-like data structures under assignment commands.

Let $\Omega = \{\Omega_n\}$, $n = 0, 1, \ldots$ be a sequence of sets of operators, $\{x_i\}$, $i = 1, 2, \ldots$ a sequence of distinct variables, and let $X_m = \{x_1, \ldots, x_m\}$. We define a sequence of sets of terms (or trees) $T = \{T_\Omega(X_m)\}$, $m = 0, 1, \ldots$. $T_\Omega(X_m)$ means all terms, in the usual sense of logic, in the operators $\Omega$ and variables $X_m$. We can regard $T_\Omega(X_m)$ as a subset of the strings $(\bigcup\Omega \cup X_m)^*$ and define it inductively thus
(i) $x \in T_\Omega(X_m)$ if $x \in X_m$
(ii) $\omega t_1 \ldots t_n \in T_\Omega(X_m)$ if $\omega \in \Omega_n$ for some $n$ and $t_1, \ldots, t_n \in T_\Omega(X_m)$.
(Algebraically speaking $T_\Omega(X_m)$ forms the word algebra or generic algebra over $\Omega$ with generating set $X_m$.)

If $t \in T_\Omega(X_m)$ and $(s_1, \ldots, s_m) \in T_\Omega(X_n)^m$, $m \ge 0$, by the composition $t \cdot (s_1, \ldots, s_m)$ we mean the term in $T_\Omega(X_n)$ obtained by substituting $s_i$ for $x_i$ in $t$, $1 \le i \le m$. To preserve the analogy with our previous treatment of strings we would like to make composition associative. For this we consider n-tuples of trees. Thus, more generally, if $(t_1, \ldots, t_l) \in T_\Omega(X_m)^l$ and $(s_1, \ldots, s_m) \in T_\Omega(X_n)^m$ we define the composition by
    $(t_1, \ldots, t_l) \cdot (s_1, \ldots, s_m) = (t_1 \cdot (s_1, \ldots, s_m), \ldots, t_l \cdot (s_1, \ldots, s_m))$.
This composition is in $T_\Omega(X_n)^l$.

For example if $\Omega_2 = \{cons\}$, $\Omega_0 = \{a, b, c, \ldots\}$, and we allow ourselves to write parentheses in the terms for readability
    $(cons(x_1, a),\ cons(x_2, x_1),\ b) \in T_\Omega(X_2)^3$
    $(c, x_1) \in T_\Omega(X_1)^2$
and their composition is
    $(cons(c, a),\ cons(x_1, c),\ b) \in T_\Omega(X_1)^3$.
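Composition of tuples of terms is simultaneous substitution, and the worked example above can be replayed mechanically. The encoding in this sketch is an assumption made for illustration: a variable $x_i$ is the Python integer `i`, an operator term is a tuple `(name, arg1, ..., argn)`, so an atom a is `("a",)`; the function names `subst` and `compose` are hypothetical.

```python
def subst(t, s):
    """t . (s_1, ..., s_m): replace each variable x_i in t by s[i - 1]."""
    if isinstance(t, int):               # a variable x_i
        return s[t - 1]
    op, *args = t                        # an operator applied to arguments
    return (op, *(subst(a, s) for a in args))


def compose(ts, ss):
    """(t_1, ..., t_l) . (s_1, ..., s_m), applied componentwise."""
    return tuple(subst(t, ss) for t in ts)


a, b, c = ("a",), ("b",), ("c",)

ts = (("cons", 1, a), ("cons", 2, 1), b)   # (cons(x1, a), cons(x2, x1), b)
ss = (c, 1)                                # (c, x1)

result = compose(ts, ss)
print(result == (("cons", c, a), ("cons", 1, c), b))   # True
```

The result is the tuple (cons(c, a), cons(x1, c), b) given in the text. Composition is only meaningful when the second tuple has at least as many components as the variables used in the first, which is the partiality mentioned in the text; the sketch does not check this and would raise an IndexError otherwise.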
The composition is now associative and $(x_1, \ldots, x_n) \in T_\Omega(X_n)^n$ is an identity for each $n$. It is, however, a partial operation (compare matrix multiplication, which is only defined for matrices of matching sizes).

We see now that the disjoint union of the $T_\Omega(X_n)^m$ forms a category, $T_\Omega$, with as objects the integers $n = 0, 1, \ldots$ and with a set of morphisms $T_\Omega(X_n)^m$ from $m$ to $n$ for each $m, n$. Indeed, this is just what Lawvere calls the Free Theory on $\Omega$. Eilenberg and Wright (1967) use it to extend automata theory to trees as well as strings, giving a category theory presentation (see
also Arbib and Give'on (1968)). The category $T_\Omega$ replaces the monoid $(A \cup U)^*$ used in our treatment of lists, the main difference being that the composition operation, unlike string concatenation, is only partial. The strings over an alphabet $\Sigma$ can be regarded as a special case by taking $\Omega_1$ to be $\Sigma$ and $\Omega_0$ to be {nil}. In the Appendix we summarise the abstract definition of a Free Theory as given by Eilenberg and Wright. We will not use any category theory here, but it is perhaps comforting to know what kind of structure one is dealing with.

We will consider a fixed $\Omega$ and $X$ for the time being and write $T_{mn}$ for $T_\Omega(X_n)^m$, the set of m-tuples of terms in n variables. If $\omega \in \Omega_n$ we will take the liberty of writing $\omega$ for the term $\omega x_1 \ldots x_n$ in $T_{1n}$. This abbreviation may appear more natural if we think of an element $\tau$ of $T_{1n}$ as corresponding to the function $\lambda x_1, \ldots, x_n.\ \tau$ from $T_\Omega(\emptyset)^n$ to $T_\Omega(\emptyset)$. For example cons is short for $\lambda x_1, x_2.\ cons(x_1, x_2)$, but not of course for $\lambda x_1, x_2.\ cons(x_2, x_1)$. We will write 1 for the identity $(x_1, \ldots, x_n) \in T_{nn}$ and 0 for the 0-tuple $() \in T_{0n}$. We will sometimes identify the 1-tuple $(x)$ with $x$.

10. STATE DESCRIPTIONS USING TREES

We can now use the m-tuples of terms, $T_{mn}$, just as we previously used strings to provide state descriptions. Suppose now that each $\omega \in \Omega_n$, $n = 0, 1, \ldots$, corresponds to a class of cells $U_\omega$ (records) with n components and that these components can be selected by functions
    $\delta^\omega_i: U_\omega \to U$, $i = 1, \ldots, n$
We put $U = \bigcup\{U_\omega : \omega \in \bigcup\Omega\}$.

For example if $\Omega_2 = \{cons\}$, $\Omega_0 = A$ and $\Omega_n = \emptyset$ for $n \ne 0$ or 2 then
    $\delta^{cons}_1$ is $hd: U_{cons} \to U_{cons} \cup A$
    $\delta^{cons}_2$ is $tl: U_{cons} \to U_{cons} \cup A$
(here we have put $U_a = \{a\}$ for each $a \in A$, and $U_{cons}$ is just our previous $U$, the set of list cells).

A state is defined by the $U_\omega$ and the $\delta^\omega_i$ and we associate with it a set $\mathcal{T}$ of triples, just as the set $\mathcal{L}$ was previously associated with a state. If a triple $(\tau, u, v)$ is in $\mathcal{T}$ then for some $m, n$, $u \in U^m$, $v \in U^n$ and $\tau \in T_{mn}$. As before we write $u \xrightarrow{\tau} v$ for $(\tau, u, v)$.
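We define $\mathcal{T}$ inductively thus:
(i) If $\tau \in T_{mn}$ is $(x_{i_1}, \ldots, x_{i_m})$ and $i_j \in \{1, \ldots, n\}$ for $j = 1, \ldots, m$ (that is, $\tau$ involves only variables and not operators) then $(u_{i_1}, \ldots, u_{i_m}) \xrightarrow{\tau} (u_1, \ldots, u_n)$ is in $\mathcal{T}$
(ii) If $u \xrightarrow{\sigma} v \in \mathcal{T}$ and $v \xrightarrow{\tau} w \in \mathcal{T}$ then their composition $(u \xrightarrow{\sigma} v) \cdot (v \xrightarrow{\tau} w)$ is defined as $u \xrightarrow{\sigma \cdot \tau} w$, and it is in $\mathcal{T}$.
(iii) If $(u_1) \xrightarrow{\tau_1} v, \ldots, (u_n) \xrightarrow{\tau_n} v$ are in $\mathcal{T}$ then $(u_1, \ldots, u_n) \xrightarrow{(\tau_1, \ldots, \tau_n)} v$ is in $\mathcal{T}$
(iv) If $\omega \in \Omega_n$ and $\delta^\omega_i(u) = v_i$ for $i = 1, \ldots, n$ then $(u) \xrightarrow{\omega} (v_1, \ldots, v_n)$ is in $\mathcal{T}$.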
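Taking the case where $\Omega$ consists of cons and atoms, (iv) means that $(u) \xrightarrow{cons} (v, w) \in \mathcal{T}$ if hd(u) = v and tl(u) = w, also that $(a) \xrightarrow{a} ()$ is in $\mathcal{T}$ for each atom $a \in A$.

An example will make this clearer. In the state pictured in figure 10 $\mathcal{T}$ contains the following triples, amongst others,
(i) $(i) \xrightarrow{cons(cons(cons(c,d),a),\ cons(cons(c,d),b))} ()$
(ii) $(i) \xrightarrow{cons(cons(x_1,a),\ cons(x_1,b))} (j)$
(iii) $(j) \xrightarrow{cons(c,d)} ()$
(iv) $(i) \xrightarrow{cons(cons(x_1,a),\ x_2)} (j, k)$
(v) [a triple labelled $cons(x_1, x_1)$; end points illegible]
The first of these is the composition of the second and third.

Figure 10. [Box-and-arrow picture of a tree structure reached from i, with shared sub-tree j.]

It will be noticed that $\mathcal{T}$ forms a category with objects $u$ for each $u \in U^n$, $n = 0, 1, \ldots$. The triples $(\tau, u, v)$ in $\mathcal{T}$ are the morphisms from $u$ to $v$. There is a forgetful functor represents: $\mathcal{T} \to T_\Omega$.

We can now define a Distinct Non-repeating Tree System analogous to a Distinct Non-repeating List System. We take over the definitions of Distinctness and Non-repetition almost unchanged except that they now apply to $\mathcal{T}$ rather than to $\mathcal{L}$. We call $(\tau, u, v) \in \mathcal{T}$ elementary if $\tau$ involves variables and operators in $\Omega_0$ but none in $\Omega_n$, $n \ge 1$ (for example, trees which share only elementary constituents can share atoms but not list cells).

We now define, as for lists, the number of occurrences of a cell in an n-tuple of trees, thus
    $\delta: U \times \mathcal{T} \to N$
    (i) $\delta_u(v \xrightarrow{\tau} w) = 0$ if $\tau$ is elementary
    (ii) $\delta_u((x) \xrightarrow{\omega} v \xrightarrow{\tau} w) = \delta_u(v \xrightarrow{\tau} w) + 1$ if $x = u$
         $\phantom{\delta_u((x) \xrightarrow{\omega} v \xrightarrow{\tau} w)} = \delta_u(v \xrightarrow{\tau} w)$ if $x \ne u$
         where $\omega \in \Omega_n$, $n > 0$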
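    (iii) $\delta_u((u_1, \ldots, u_n) \xrightarrow{(\tau_1, \ldots, \tau_n)} v) = \delta_u((u_1) \xrightarrow{\tau_1} v) + \ldots + \delta_u((u_n) \xrightarrow{\tau_n} v)$
Notice that for $a \in \Omega_0$ we have $\delta_u((a) \xrightarrow{a} ()) = 0$, since we do not wish to notice sharing of atoms, which is innocuous.

The definitions of Distinct and Nonrep are as before replacing $\mathcal{L}$ by $\mathcal{T}$. Lemma 1 holds unchanged with the obvious addition that Distinctness and Non-repetition are preserved by formation of n-tuples in the same way as by composition.

The definition of a Distinct Non-repeating Tree System is, as before, a k-tuple of elements of $\mathcal{T}$, $\lambda_i$, $i = 1, \ldots, k$, such that
(i) $\lambda_i$ is in $\mathcal{T}$ for $i = 1, \ldots, k$
(ii) Nonrep($\lambda_i$) for each $i = 1, \ldots, k$
(iii) If $j \ne i$ then Distinct($\lambda_i, \lambda_j$) for each $i, j \in 1, \ldots, k$.
We employ the same abbreviation as before, writing *S for 'S is a Distinct Non-repeating Tree System'. We can adapt Proposition 1 and specify some useful properties of such systems.

Proposition 3.
(i) Permutation
    $*S \Rightarrow *S'$ if $S'$ is a permutation of $S$.
(ii) Deletion
    $*(\lambda_1, \lambda_2, \ldots, \lambda_k) \Rightarrow *(\lambda_2, \ldots, \lambda_k)$.
(iii) Rearrangement of variables
    $*(\lambda_1, \ldots, \lambda_k) \Rightarrow *((u_{i_1}, \ldots, u_{i_m}) \xrightarrow{(x_{i_1}, \ldots, x_{i_m})} (u_1, \ldots, u_n), \lambda_1, \ldots, \lambda_k)$ if $i_j \in \{1, \ldots, n\}$ for $j = 1, \ldots, m$.
(iv) Composition
    $*(u \xrightarrow{\sigma} v \xrightarrow{\tau} w, \lambda_1, \ldots, \lambda_k) \Rightarrow *(u \xrightarrow{\sigma \cdot \tau} w, \lambda_1, \ldots, \lambda_k)$
    and conversely
    $*(u \xrightarrow{\sigma \cdot \tau} w, \lambda_1, \ldots, \lambda_k) \Rightarrow (\exists v)\, *(u \xrightarrow{\sigma} v \xrightarrow{\tau} w, \lambda_1, \ldots, \lambda_k)$
(v) Tupling
    $*((u_1) \xrightarrow{\tau_1} v, \ldots, (u_n) \xrightarrow{\tau_n} v, \lambda_1, \ldots, \lambda_k) \Leftrightarrow *((u_1, \ldots, u_n) \xrightarrow{(\tau_1, \ldots, \tau_n)} v, \lambda_1, \ldots, \lambda_k)$ $(n > 0)$.
(vi) Distinctness
    $*(u \xrightarrow{\sigma} v,\ u \xrightarrow{\tau} w) \Rightarrow u \xrightarrow{\sigma} v$ is elementary or $u \xrightarrow{\tau} w$ is elementary.
(vii) Inequality
    $*((u) \xrightarrow{\tau} (v_1, \ldots, v_n))$ and $u \ne v_i$, $i = 1, \ldots, n$ and $u \in U_\omega \Rightarrow (\exists \sigma)(\tau = \omega \cdot \sigma)$.

Proof. Obvious using Lemma 1 adapted for trees.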
In the case of cons and atoms we assume a predicate Atom, Atom$(u) \Leftrightarrow u \in A$, that is, $(u) \xrightarrow{a} ()$ for some $a \in A$, and (vii) yields
$*((u) \xrightarrow{\tau} (v_1, \ldots, v_n))$ and $u \neq v_i$, $i = 1, \ldots, n$ and Atom$(u)$ $\Rightarrow \tau = u$
and
$*((u) \xrightarrow{\tau} (v_1, \ldots, v_n))$ and $u \neq v_i$, $i = 1, \ldots, n$ and not Atom$(u)$ $\Rightarrow (\exists \sigma)\, \tau = cons\ \sigma$.

In figure 11 we give the transformation sentences for this case. Their correctness is provable easily in just the same manner as those for lists. The extension to a general $\Omega$ should be obvious. The sentences for I := E and E1 = E2 are unchanged and for the test Atom(E) we just add the sentence Atom(E) or its negation. Proposition 2 and the definition of E go through unchanged.

hd(E1) := E2. Change set = {hd, *}
$*((E_1) \xrightarrow{cons} (u, v), \lambda_1, \ldots, \lambda_n) \Rightarrow *'((E_1) \xrightarrow{cons} (E_2, v), \lambda_1, \ldots, \lambda_n)$
tl(E1) := E2. Change set = {tl, *}
$*((E_1) \xrightarrow{cons} (u, v), \lambda_1, \ldots, \lambda_n) \Rightarrow *'((E_1) \xrightarrow{cons} (u, E_2), \lambda_1, \ldots, \lambda_n)$
I := cons(E1, E2). Change set = {*, I}
$*(\lambda_1, \ldots, \lambda_n) \Rightarrow *'((I') \xrightarrow{cons} (E_1, E_2), \lambda_1, \ldots, \lambda_n)$
Figure 11. Transformation sentences for tree processing.

We now give a couple of examples of tree processing (or perhaps we should say 'expression processing'). Figure 12 gives a program in the form of a recursive subroutine for reversing a tree; for example cons(a, cons(b, c)) becomes cons(cons(c, b), a). We have identified the one-tuple $\langle \tau \rangle$ with $\tau$ to save on parentheses. The proof is by induction on the structure of $\tau$. Strictly we are doing induction on the free theory T. We define a tree reversing function recursively by
$rev(a) = a$ if $a$ is an atom
$rev(cons(\tau_1, \tau_2)) = cons(rev(\tau_2), rev(\tau_1))$.
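The recursive definition of rev and the destructive subroutine it specifies can be compared in a short Python sketch. This is an illustration under assumptions rather than the paper's own program text: atoms are modelled as strings, immutable terms as pairs, and the names `Cell`, `rev`, `reverse`, `build` and `term` are hypothetical. The order of assignments in `reverse` follows those legible in figure 12 (k := hd i; hd i := REVERSE(tl i); tl i := REVERSE(k)).

```python
class Cell:
    """A mutable cons cell; atoms are represented by plain strings."""

    def __init__(self, hd, tl):
        self.hd, self.tl = hd, tl


def rev(t):
    """Specification: rev on immutable terms, where cons is a 2-tuple."""
    if isinstance(t, str):  # rev(a) = a if a is an atom
        return t
    t1, t2 = t
    return (rev(t2), rev(t1))  # rev(cons(t1, t2)) = cons(rev(t2), rev(t1))


def reverse(i):
    """Destructive reversal of the tree of cells rooted at i."""
    if isinstance(i, str):
        return i
    k = i.hd
    i.hd = reverse(i.tl)
    i.tl = reverse(k)
    return i


def build(t):
    """Build a tree of fresh cells from an immutable term."""
    return t if isinstance(t, str) else Cell(build(t[0]), build(t[1]))


def term(c):
    """Read back the immutable term represented by a tree of cells."""
    return c if isinstance(c, str) else (term(c.hd), term(c.tl))


example = ("a", ("b", "c"))  # cons(a, cons(b, c))
reversed_in_place = term(reverse(build(example)))

# With a shared cell the destructive routine no longer computes rev:
shared = Cell("c", "d")
root = Cell(shared, shared)  # hd and tl are the same cell
shared_result = term(reverse(root))
```

For the tree built from fresh cells, `reversed_in_place` equals `rev(example)`, namely `(("c", "b"), "a")`. In the shared case the common cell is reversed twice, so `shared_result` is `(("c", "d"), ("c", "d"))` rather than `rev` of that term, which suggests why the entry assertion asks for a non-repeating tree.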
Figure 12. Reversing a tree. [Flow diagram of the recursive subroutine j = reverse(i). Entry assertion: $*((i) \xrightarrow{\tau} ())$. The diagram tests Atom(i); on the non-atom branch, with $cons(\tau_1, \tau_2) = \tau$, it executes k := hd i; hd i := REVERSE(tl i); tl i := REVERSE(k); it ends with j := i and EXIT. Exit assertion: $*((j) \xrightarrow{rev(\tau)} ())$.]
Figure 13. Substitution in a tree. [Flow diagram of the recursive subroutine k = subst(i, a, j): substitute i for a in j, where a is an atom. Entry assertion: $*((j) \xrightarrow{\tau} (), (i) \xrightarrow{\sigma} ())$, Atom(a). If $\tau = a$: k := i, EXIT with $*((k) \xrightarrow{\sigma} ())$, $subst(\sigma, a, \tau) = \sigma$. If Atom(j): k := j, EXIT with $*((k) \xrightarrow{\tau} ())$, $subst(\sigma, a, \tau) = \tau$. Otherwise, with $cons(\tau_1, \tau_2) = \tau$: hd j := SUBST(i, a, hd j); tl j := SUBST(i, a, tl j); k := j; EXIT. Exit assertion: $*((k) \xrightarrow{subst(\sigma, a, \tau)} ())$.]
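The subroutine of figure 13 can be sketched in Python alongside a purely functional substitution on terms. This is a hedged illustration, not the paper's program text: atoms are modelled as strings, immutable terms as pairs, and the names `Cell`, `subst`, `subst_in_place`, `build` and `term` are hypothetical.

```python
class Cell:
    """A mutable cons cell; atoms are represented by plain strings."""

    def __init__(self, hd, tl):
        self.hd, self.tl = hd, tl


def subst(sigma, a, tau):
    """Specification on immutable terms: replace the atom a in tau by sigma."""
    if tau == a:
        return sigma
    if isinstance(tau, str):  # some other atom b, left unchanged
        return tau
    t1, t2 = tau
    return (subst(sigma, a, t1), subst(sigma, a, t2))


def subst_in_place(i, a, j):
    """Destructive version: overwrite hd and tl fields below j."""
    if j == a:
        return i
    if isinstance(j, str):
        return j
    j.hd = subst_in_place(i, a, j.hd)
    j.tl = subst_in_place(i, a, j.tl)
    return j


def build(t):
    """Build a tree of fresh cells from an immutable term."""
    return t if isinstance(t, str) else Cell(build(t[0]), build(t[1]))


def term(c):
    """Read back the immutable term represented by a tree of cells."""
    return c if isinstance(c, str) else (term(c.hd), term(c.tl))


tau = ("a", ("b", "a"))  # cons(a, cons(b, a))
sigma = ("c", "d")  # cons(c, d)
j = build(tau)
i = build(sigma)
k = subst_in_place(i, "a", j)
```

After the call, `term(k)` equals `subst(sigma, "a", tau)`, that is `(("c", "d"), ("b", ("c", "d")))`, and `k` is the original cell `j`. In this sketch every occurrence of the atom is replaced by the same cell `i`, so `k.hd` and `k.tl.tl` are one and the same object.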
Figure 13 gives a recursive subroutine for substituting one tree into another wherever a given atom appears. Again the proof is by induction on the structure of $\tau$. We define substitution for morphisms of T by
$subst(\sigma, a, a) = \sigma$
$subst(\sigma, a, b) = b$ if $b \neq a$ and $b$ is an atom
$subst(\sigma, a, cons(\tau_1, \tau_2)) = cons(subst(\sigma, a, \tau_1), subst(\sigma, a, \tau_2))$.
We should really have included some arbitrary extra morphisms $\lambda_i$ in the * expressions at entry and exit so as to carry through the induction, since the recursive calls act in the presence of other distinct morphisms; but this is clearly admissible and we have omitted it.

In our examples we have used the LISP case of cons and atoms, but even for LISP programs it might be useful to consider a more general $\Omega$. List structures are often used to represent expressions such as arithmetic expressions, using cons('PLUS', cons(x, cons(y, nil))) for 'x + y', similarly for 'x × y'. We can then allow T to have binary operators + and × defining $(u) \xrightarrow{+} (v, w)$ if hd(u) = 'PLUS' and hd(tl(u)) = v and hd(tl(tl(u))) = w. This enables us to write the assertions intelligibly in terms of + and ×. In fact this representation of expressions corresponds to an injective functor between the category $T_{+,\times}$ and the category $T_{cons,A}$.

Free Theories as above seem appropriate where the trees have internal sharing only at known points. Data structures other than lists and trees remain to be investigated. More general categories of graphs with inputs and outputs might be worth considering. In general the choice of a suitable category analogous to T would seem to depend on the subject matter of the computation, since we wish the vocabulary of the assertions to be meaningful to the programmer.

Acknowledgements
I would like to thank John Darlington and Gordon Plotkin for helpful discussions and Michael Gordon for pointing out some errors in a draft, also Pat Hayes and Erik Sandewall for helpful criticism during the Workshop discussion. I am grateful to Miss Eleanor Kerse for patient and expert typing.
The work was carried out with the support of the Science Research Council. A considerable part of it was done with the aid of a Visiting Professorship at Syracuse University, for which I would like to express my appreciation.

REFERENCES
Arbib, M.A. & Give'on, Y. (1968) Algebra automata I: parallel programming as a prolegomena to the categorical approach. Information and Control, 12, 331-45. Algebra automata II: the categorical framework for dynamic analysis. Information and Control, 12, 346-70. Academic Press.
Burstall, R. M. (1969) Proving properties of programs by structural induction. Comput. J., 12, 41-8.
Burstall, R. M. (1970) Formal description of program structure and semantics in first order logic. Machine Intelligence 5, pp. 79-98 (eds Meltzer, B. & Michie, D.). Edinburgh: Edinburgh University Press.
Eilenberg, S. & Wright, J.B. (1967) Automata in general algebras. Information and Control, 11, 4, 452-70.
Floyd, R.W. (1967) Assigning meanings to programs. Mathematical Aspects of Computer Science, 19-32. Providence, Rhode Island: Amer. Math. Soc.
Good, D.I. (1970) Toward a man-machine system for proving program correctness. Ph.D. thesis. University of Wisconsin.
Hewitt, C. (1970) PLANNER: a language for manipulating models and proving theorems in a robot. A.I. Memo. 168. Project MAC. Cambridge, Mass.: MIT.
Hoare, C.A.R. (1971) Proof of a program: FIND. Comm. Ass. comput. Mach., 14, 39-45.
King, J.C. (1969) A program verifier. Ph.D. thesis. Department of Computer Science, Carnegie-Mellon University, Pittsburgh, Pennsylvania.
Lawvere, F.W. (1963) Functorial semantics of algebraic theories. Proc. Natl. Acad. Sci. U.S.A., 50, 869-72.
London, R.L. (1970) Certification of Algorithm 245 TREESORT 3. Comm. Ass. comput. Mach., 13, 371-3.
MacLane, S. (1971) Categories for the working mathematician. Graduate Texts in Mathematics 5. Springer-Verlag.
McCarthy, J. (1963) A basis for a mathematical theory of computation. Computer Programming and Formal Systems, pp. 33-70 (eds Braffort, P. & Hirschberg, D.). Amsterdam: North Holland Publishing Co.
McCarthy, J. & Painter, J. A. (1967) Correctness of a compiler for arithmetic expressions. Mathematical Aspects of Computer Science, pp. 33-41. Providence, Rhode Island: Amer. Math. Soc.
Wirth, N. & Hoare, C.A.R. (1966) A contribution to the development of ALGOL. Comm. Ass. comput. Mach., 9, 413-32.

APPENDIX: FREE THEORIES

Categories
By a category C we mean a set O of objects and a set M of morphisms, together with a pair of functions $\partial_0, \partial_1 : M \to O$ (domain and co-domain), a partial operation of composition, $\cdot : M \times M \to M$, and an identity operation $1 : O \to M$ such that
(i) $f \cdot g$ is defined iff $\partial_1 f = \partial_0 g$,
(ii) If $f \cdot g$ and $g \cdot h$ are both defined then $(f \cdot g) \cdot h = f \cdot (g \cdot h)$
(iii) $\partial_0(1_a) = \partial_1(1_a) = a$
(iv) If $\partial_0 f = a$ and $\partial_1 f = b$ then $1_a \cdot f = f = f \cdot 1_b$
We write $f : a \to b$ as an abbreviation for $\partial_0 f = a$ and $\partial_1 f = b$ and write $C(a, b)$ for the set of all $f$ such that $f : a \to b$ in C.
(For further development see, for example, MacLane 1971.)

Functions over finite sets
For each integer $n$ we write $[n]$ for the set $\{1, \ldots, n\}$. We write $0$ for $[0]$ and $I$ for $[1]$ and notice that there are unique functions $0 \to [n]$ and $[n] \to I$ for any $n$. A set $[n]$ and an integer $i \in [n]$ determine a unique function $I \to [n]$ which we will denote by $i$.
The free theory on $\Omega$
Let $\Omega = \{\Omega_n\}$, $n = 0, 1, \ldots$ be a sequence of sets of operators. We define the free theory on $\Omega$ inductively as the smallest category T such that
(i) The objects of T are the sets $[n]$, $n = 0, 1, \ldots$
(ii) There is a function $d$ from the morphisms of T to the non-negative integers. We call $d(f)$ the degree of the morphism $f$, and write $T_j([m],[n])$ for the set of all morphisms of degree $j$ from $m$ to $n$.
(iii) $T_0([n],[m])$ is the set of all functions from the set $[n]$ to the set $[m]$. Composition and identity are defined as usual for functions.
(iv) There is an operation of 'tupling' on morphisms from $I$ (as well as composition and identity). Thus for each n-tuple of morphisms $f_i$, $i = 1, \ldots, n$, there is a unique morphism $f$, written $\langle f_1, \ldots, f_n \rangle$, such that $i \cdot f = f_i$ for each $i = 1, \ldots, n$. (Recall that $i : I \to [n]$ is the function taking 1 to $i$ in $[n]$.) From the uniqueness we see that for any $f : [n] \to [k]$, $f = \langle 1 \cdot f, \ldots, n \cdot f \rangle$. The degree of $\langle f_1, \ldots, f_n \rangle$ is $d(f_1) + \ldots + d(f_n)$.
(v) $\Omega_n \subseteq T_1(I,[n])$.
(vi) If $\omega \in \Omega_m$ and $f \in T_j([m],[n])$ then $\omega \cdot f \in T_{1+j}(I,[n])$, and conversely if $g \in T_j(I,[n])$ for $j > 0$ then there is a unique $m$, unique $\omega \in \Omega_m$ and unique $f$ such that $g = \omega \cdot f$.
Now $T_{mn}$ in the body of the paper may be equated (to within an isomorphism) with $T([m],[n])$ here.
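One concrete way to picture these morphisms is as tuples of terms, with composition given by substitution for variables. The Python sketch below is an assumption-laden illustration and not part of the paper: a morphism from $[m]$ to $[n]$ is modelled as an m-tuple of terms over variables 1, ..., n, a variable is an integer, an operator application is a tuple headed by the operator name, and the names `substitute`, `compose`, `degree`, `tupling`, `proj` and `cons` are hypothetical. Composition is written in diagram order, as in the definition of a category above.

```python
def substitute(term, g):
    """Replace each variable i in term by the i-th component of g."""
    if isinstance(term, int):
        return g[term - 1]
    op, *args = term
    return (op, *(substitute(t, g) for t in args))


def compose(f, g):
    """Diagram-order composition f . g for f: [m] -> [n] and g: [n] -> [k]."""
    return tuple(substitute(t, g) for t in f)


def degree(f):
    """Number of operator occurrences in the components of f."""

    def d(term):
        if isinstance(term, int):
            return 0
        return 1 + sum(d(t) for t in term[1:])

    return sum(d(t) for t in f)


def tupling(*fs):
    """<f1, ..., fn> built from morphisms fi: I -> [k], each a 1-tuple."""
    return tuple(f[0] for f in fs)


def proj(i):
    """The degree-0 morphism i: I -> [n] taking 1 to i."""
    return (i,)


def cons(x, y):
    """Term constructor for the binary operator cons."""
    return ("cons", x, y)


A, B, C, D = ("a",), ("b",), ("c",), ("d",)  # atoms as nullary operators

second = (cons(cons(1, A), cons(1, B)),)  # I -> [1]
third = (cons(C, D),)  # I -> [0]
first = compose(second, third)  # I -> [0]

pair = tupling(second, (cons(1, 1),))  # [2] -> [1]
```

With these definitions `first` is the term cons(cons(cons(c,d),a),cons(cons(c,d),b)), matching the remark in the tree example that the first triple is the composition of the second and third. `compose(proj(2), pair)` recovers the second component of `pair`, and `degree(pair)` is the sum of the degrees of its components. Degrees need not add under composition when a variable is repeated, as with `second` and `third` here.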