Computer enumeration and generation of trees and rooted trees


Computer enumeration and generation of trees and rooted trees...

0 downloads 151 Views 994KB Size

J . Chem. Znf. Comput. Sci. 1981, 21, 91-99

91

Computer Enumeration and Generation of Trees and Rooted Trees J. V. KNOP, W. R. MULLER, 2. JERICEVIe, and N. TRINAJSTIe* Computer Centre, The University of Diisseldorf, 4000 Diisseldorf, Federal Republic of Germany

Received June 20, 1980 A computer-adopted method for enumerating and plotting trees (alkanes) and rooted trees

(substituted alkanes) with Nvertices (carbon atoms) is described. Results are compared to some previous work in this area. INTRODUCTION This paper presents a computer-oriented method for enumerating and plotting trees (alkanes) and rooted trees (substituted alkanes). A tree is a connected graph with no cycles.' A rooted tree is a tree in which one vertex has been distinguished from others.z This vertex is usually called the root. Alkanes C N H ~ Nare + ~ represented by trees in which the maximal vertex degree isfour. Alkane trees used here depict only carbon skeletons of alkane hydrocarbons. Substituted alkanes CNHzN+IX are represented by hydrogen-suppressed rooted trees in which the maximal vertex degree is also four.

primary r o o t

secondary root

tertiary root

1-hydroxy-2-methylbutane

2-hydroxy-3-methylbutane

(primary alcohol1

(secoddary alcohol)

2-hydroxy-2-methylbutane (tertiary alcohol)

0 1

I

d

b tree

rooted tree

r3

P

2-methyl-propane

2-methyl-propane i lree)

graph

CH 3

G 3

c<

CH

~

2-chlor-2-methyl-propane

A 2-chlor-2-methyl-propane (rooted t r e e )

graph

In the case of rooted trees, representing substituted alkanes, we differ the primary root (a rooted vertex with degree one), the secondary root (a rooted vertex with degree two), the tertiary root (a rooted vertex with degree three), and the quaternary root (a rooted vertex with degree four). A substituent is, of course, never attached to the quaternary atom. If, for example, alcohols CNHZN+lOHare considered, the rooted tree(s) with the primary (secondary, tertiary) root would represent the primary (secondary, tertiary) alcohol. The mathematical theory of trees was developed in the middle of the last century.35 However, Caylef17 was the first who realized the potential of this theory for the enumeration of (hydrocarbon) isomers. He enumerated alkane isomers up to N = 13, but the numbers of isomers obtained for C12and CI3alkanes (357 and 799) were incorrect.' They were corrected 5 years later by Herrmann8 (355 and 802). It is interesting to note that Losanitsch9 was arguing that the number 0095-2338/8 1/ 1621-0091$01.25/0

of isomers for Cl2alkane is neither 357 nor 355 but 354. There was quite a discussion between Herrmann1031zand Lcsanitsch" at the end of the last century about whose number is correct. Herrmann, of course, produced the correct value.13 Since the work of Cayley, the mathematical theory of isomers was continuously developed in two directions. One direction was a development of mathematically well-founded isomer enumeration methods,13-" while the other was a development of practical schemes for enumeration of the particular kind of structural i ~ o m e r s . * ~ ~In ~ ' the * - ~early ~ thirties, Blair and Henze were especially active in this area at the University of Texas at Austin. While Henze and Blairz0 enumerated correctly the number of primary, secondary, and tertiary alcohols for a given number of carbon atoms, there is an error in their work on the number of isomeric alkanes. The Henze-Blair number of isomers for C19 alkane (147 284) should be corrected (148 284).'* The Henze-Blair approach was much later a basis for a computer program which allows the calculation of alkane isomers.z6 This program handles alkanes up to 57 carbon atoms in double precision. However, it is not only important to enumerate the isomers but also to display them all. This became possible with the advancement of computer technology and with the development of techniques for the production of graphs by c o m p ~ t e r . ~The ~-~~ present work represents an effort in this direction. METHOD First we introduce a special representation for trees and rooted trees with N vertices by N tuples of nonnegative integers smaller than N . It allows, as this paper shows, a very efficient and easy way to produce just these tuples and to obtain graphic output from them [or other forms of representation for the underlying (rooted) tree]. This method allows us to mark identity trees and homeomorphically irreducible trees while generating all trees and also to restrict the generation to special types, as, e.g., alkane 0 1981 American Chemical Society

KNOP ET AL.

92 J . Chem. It$ Comput. Sci., Vol. 21, No. 2, 1981 (rooted) trees. We make extensive use of the notion of lexicographic order, so let us begin with a definition: A K tuple ( a l ,a2, . . ., aK)of integers is defined lexicographically smaller than an L tuple (b,, 62, . . ,,6 3 , if there exists an index j with 1 I j IL so that ai = bi for 1 I i I j and either j = K + 1 or aj < bi. Now we map the nonempty rooted trees into the N tuples of nonnegative integers by induction: The trivial (rooted) tree with one vertex is represented by the 1 tuple (0). A given rooted tree with N > 1 vertices and M edges incident to the root vertex gives rise to M rooted subtrees by removing the root vertex and all of its edges. These rooted subtrees (taking as the root in the subtree the neighbor of the removed vertex) with L1, L2,. . ., LM vertices (where the sum L1 + L2 . . . +LM is N - 1) are by induction equipped with L, subtuples. We sort these subtuples into the reverse lexicographic order, concatenate the 1 subtuple (M) and these subtuples, and get a tuple of 1 L1+ LM = N components which we define to be the representative for the rooted tree. Given a tree, we inspect all rooted trees above it (Le., we select the vertices one after the other as the root for a rooted tree) and assign to the tree the lexicographically largest one of the N tuples obtained. The same tuples can be constructed by inspecting all Ariadne threads for a given (rooted) tree, i.e., all closed sequences of edges which consider all the edges exactly twice and every vertex at least once (for a rooted tree only those starting at the root vertex). Every such sequence numbers, by the order of the first consideration, the vertices from 1 to N , and we may assign an N tuple to it by setting the ith component to the number of yet unconsidered neighboring vertices when the ith vertex is considered for the first time (Le., the number of attached edges for the first vertex and one less for the others). The lexicographically largest tuple is then just the tuple defined for the (rooted) tree above. Some useful properties of this representation by N tuples are as follows. (i) Given an N tuple representing a (rooted) tree, the sum of the first K components is greater than or equal to K for K < N and N - 1 for K = N . This is useful when computing the extent of a subtree beginning at a given position. It further implies that tuples of different lengths must necessarily have unequal components at a position common to both tuples. When comparing subtrees, this may serve to save one of the two tests for the end of the subtuple. (ii) Given the N tuple for a rooted tree, the N tuple for a rooted tree above the same underlying tree with a neighbor of the old root as the new root vertex may be found by the following simple rearrangement of the given tuple: The given tuple consists of a first component M > 0 (the number of branches or neighbors) and subtuples of M , {M, ..., Al,L,I, [AZJ, . * * , AM-l,LII-,1, [AM,,, . ’ A M L M L v l in l , lexicagraphically descendant order, one of which, say [A,+,,,,. . ., begins with the new root and consists of a first component J = A,+,,, and subtuples of J , {J,[B1,1,. ., Blx,I, [Bz,,. . ., B,-,?], B,, 1, . . ., BJS,]),again in lexicographically descendent or er. Now the new root has J + 1 subtrees represented by the subtuples which are old subsubtuples of J becoming the subtuples of the new root [B,,,, . . ., B l x , ] ,[Bz,l, . . ., B,-lx,-,], [BJ,l,. . ., BJ,KJ]and the extuple (without the subtuple representing the new root), [M - 1, Al,,, . . ., Ax,&x+z,~, . . ., AM,,] becoming the subtuple and properly inserted in a tuple of the new root according to the reverse . . ., BYKJ and lexicographic order, say not greater than [By,,, . . ., BY+I,Ky+,].This leads to the new not less than . . .,BY& I, [M-1, AI,,,. . .,AxL~AX+Z,!, N-tuple ( J + 1, [BI,~, . . ., AM‘ I, [BY+!,,,. . ., BJx,Jj. This manipulation, which is useful wgen testing whether a given tuple for a rooted tree represents also the underlying tree, is best illustrated by an example (Chart I).

chart I old root

v

new root

v

(3,3,1,0,0,0,2,4,0,0,0,0,2,1,0,0,2,0,0)

+

+

- ..+

-9

.

o\ooo

O

0

O

The old root with its remaining branches forms one branch of the new root, which must be inserted at the right place to achieve reverse lexicographic ordering.

(3,4,0,0,0,0,2,3,1,0,0,0,2,0,0,2,1,0,0~

.

(iii) A tree represented by an N tuple (Al, . ., AN) is homeomorphically irreducible (Le., no vertex has exactly two adjacent edges) if, and only if, neither the first component, Al, is 2 nor any other component, A,, is 1. (iv) A tree represented by an N tuple is an identity tree if, and only if, there exists neither a second vertex which taken as the root delivers the same N tuple nor any vertex with two identical subtuples. This can be seen by a simple case study. COMPUTER PROGRAM

Now we describe the computer program for enumerating and plotting (rooted) trees by applying the method which we developed above. The main program generates the N tuples representing rooted trees in reverse lexicographic order, calls a logical function TREEKO to test for tree property, and, if wanted for this type of (rooted) tree, calls a subroutine PLTREE to produce graphic output. The main program (Figure 1) nests N loops and selects bounds for inner loops dependant on actual values of outer controlled variables. Although the number of nested loops is variable, we do not need any recursion, since we formulate the N “virtual” loops explicitly by using a vector to store the loop variables and so to contain automatically the generated N tuple. Therefore the main program may confine itself to three visible nested loops: the outer one controls the number of vertices, the second counts the (rooted) trees for a given number of vertices, and at the inner level there is one loop to find the innermost nonexhausted “virtual” loop and then one to initiate the inner “virtual” loops from there. The upper bound and first value for a “virtual” loop originates from the fact that the sum of all components for an N tuple must be N - 1 and the request that adjacent branches (or subtuples) of the same vertex must be in reverse lexicographic order, Le., they must be equal, or at the first differing position the second must have a smaller value than the first. Now many subtuples themselves consist of further subtuples, which consist in their turn of still more subtuples, and so one could expect that a lot of conditions would have to be fulfilled and monitored at the same time. Fortunately, we must pay

J . Chem. In$ Comput. Sci., Vol. 21, No. 2, 1981 93

COMPUTER ENUMERATION AND GENERATION OF TREES ,- , , ”:

TREEKO operates as follows:

.i

The main program operates as follows:

1 0

-4 0

Read in parameters. Initiate some variables, especially set N, the actual number of vertices, to the smallest wanted value. Set first lower bound to 1 . Set fir‘st component of the N-tuple to N-1 (to obtain the lexicographically largest posslble N-tuple first). Reset enumeration Counters. Set an index to the first position. Set last valid (Nth) component to 0. If N < 3 this completes the only poss~bleN-tuple, so ccntinue at I ) . Test indexed position, whether it makes the developing N-tuple lose its hameamorphical ii-peducibility (here could also be insellred a test fop alksneDPODePtY). inc>easi the index. If beyond the N-tuple, then continue at 6 ) . Compute upper bound and lower bound far this position (using RLOC and LBC as desoribed above, but also taking into account that the sum of a l l components must not exceed N-1). If the old RLOC is satisfied by exhaustion this implies a non-identical automorphism. If the upper bound is lower than the lower bound, we have tne remaining failure e a s e mentioned above, s o continue at 8 ) for the next N-tuple. Set the indexed component to the upper bound. Repeat from 3). A valid rooted tPee has been obtalned. If the first component is not greater than any other one, this is no tree-representative (by an input-switch the program c a n be made to omit the generation of these c a s e s ) . I f the first component is by 2 greater than any other o n e , this is a tree-representative. The remaining c a s e s must be Studied further by the logical function T R E E K O to decide about treeprcperty. Count c a ~ e s . If the actual N-tuple belongs to the wanted c a s e s , let subroutine PLTREE produce graphic output. Set the index to the la3t valid (Nth) position. If the indexed component IS greater than the c o r ~ e s ponding lower bound, decrease the ccmponent and repeat from 3 ) . I f the index is not at the first position, decrease the index and repeat from 8 ) . All N-tuples generated. Write out enumeration counters. Increase N . If N-tuples wanted, repeat from 2 ) .

BEGIN

7

1 ) Set an index to 1 .

2 ) Increase the index.

If beyond the N-tuple, the original N-tuple was the tree-representative, so continue at 6). If the indexed component is less than the first component minus 1, repeat from 2 ) .

3) Search towards the original root to find a vertex for which the N-tuple with respect to this vertex as the root has already been constructed (may be the original root). Name this vertex the father.

4) Name the next vertex from the father towards the indexed vertex the son. Use the permutation trace to find the position of the son in the N-tuple of the father. Use property (i) to determine the extent of the subtuple beginning at the son. Compare the remainder of the father’s N-tuple to the subtuples of the son to determine the right place for insertion. Execute the permutation according to property (ii). Collect trace-information of applied permutations. If the son i s not the indexed vertex, now name the former son the father and repeat from 4 ) . Compare the new N-tuple (of the indexed vertex) to the original N-tuple. Equality implies a non-identical automorphism. If the new N-tuple is lexicographically larger, the orignal N-tuple was no tree-representative, so then continue at 6 ) for return. Otherwise repeat from 2 ) . Return the value .FALSE. if the given N-tuple was the tree-reprensentative, and return .TRUE. if the candidate was knocked out.

Figure 2. :Description of the operation of the logical function

TREEKO.

PLTREE operates as follows:

C W 9

Figure 1. Description of the main program operation.

attention only to the oldest pending structure (i.e., neither exhausted nor fulfilled by a significantly smaller value), because any newer structures are automatically satisfied with the fulfillment of the oldest structure by the latest substructure. This is so since the comparative subtuple of the oldest structure has been built properly and by the pending condition transfers its own reverse lexicographic ordering of subtuples to the compared subtuple. Using this, a simple variable suffices to point to the actual comparative component. A smaller chosen value naturally fulfills the actual reverse lexicographic order condition (abbreviated “RLOC” in the sequel), and this is easily detected. However, we must also provide for the case that RLOC is satisfied by exhaustion of the subtuples, and to this end we keep a pointer to the very beginning of the branch which is subjected to the current RLOC. Two types of failure are yet possible when generating a “last” branch, i.e., a subtuple which must extend to the end of the N tuple as it is the only remaining branch. The first case is when the starting subtuples of this last branch are chosen so small that the remaining subtuples f o r d by RLOC cannot fill the rest of the N tuple. This enforces a lower bound condition (abbreviated LBC in the sequel) which is not very difficult to control. The only proper subtuples not allowing a longer and at the same time lexicographically smaller tuple are of the form (1 ,l , . . ., 1,O) since any value of 2 or larger anywhere in a well-formed tuple allows smaller ones of arbitrary length, namely (l,l, . . ,, 1,O). So the LBC requires that for the subtuples of a last branch, the kth last subtuple must either contain a value greater than 1 or must consume at least one kth of the remaining room in the N tuple. If a subtuple of a last branch is begun, the necessary number of 1’s is computed into a counter, which is decremented for every generated 1 and reset to 0 immediately

1)

For every component count the number of zeroes in the subtuple beginning at this component (i.e. the number of final vertices reachable from the root through the considered vertex). This number for the first component is the total number of final vertices (if the first component is 1 the root itself is final and a 1 has to be added to the number). The edges to the final vertices are spread equally to all directions (divide PI by the number of final vertices to obtain the angle increment). Plot first vertex. Set an index to 1 . Initialize a base angle.

2 ) Copy indexed component to a counter.

I f indexed component is 0, the corresponding vertex is final, s o increase base angle by 2 increments, increase index and continue at 4)

3 ) Push plot-position and counter onto a stack, Increase index. Plot an edge and a vertex (the now indexed one) using base-angle + K * increment where K is the number of final vertices reachable through the new indexed vertex. Repeat from 2 ) .

4) (Trace back along the Ariadne-thread.) Pop plot-position and counter off the stack. If nothing was on the stack, w e are ready, so continue at 6 ) . 5 ) Draw back to popped plot-position. Decrement the counter. If the counter becomes zero, repeat from 4 ) Otherwise repeat from 3).

6) Advance to the next frame.

Figure 3. Description of the operation of the subroutine PLTREE.

for any value greater than 1 generated, thus indicating by a countervalue greater than 0 that the lower bound for the actual component is 1 and not 0. The other case happens when the last branch must be longer than the comparative branch of the RLOC. Because of the above provisions the wanted branch always exists, but often

KNOP ET AL.

94 J. Chem. In5 Comput. Sci., Vol. 21, No. 2, 1981 c

T h e ~ r o g r a m llzting h a i n Pragraa

c

c

C I

c c c c

c c c

c

c c c

c c c c

c c c c

c c c

c c c

c c

c C

c c

c

c

c

c

c

c c

C C

C

c c c c

c c c C C

c

c C

c c c C

c

c c c

c c c

c c c

c

c

c c

c 1000 1001

c

c

GOTO 30 20

30

c c

NiNlllN RERUN=.FALSE. UINROTZO i f r o o t e d t r e e s a r e o f no i n t e r e s t , p r o d u c t i o n of N - t u p l e i w i t h a f i r s t c o m p o n e n t 1 i s o m i t t e d f o r N>2 UINROTil t+tt*+* i n s t a l l a t i o n dependant *+*+*** i n r t a i L a t i o o denendent D l 0 t t e r - i o r t i l l i r a t i o n If(.NOT.NOPLiT) CALL P S T A k T ( 4 0 0 0 0 . ~ O ~ 9 9 9 ) in,t,a1i*e counters HIRSiO lDTS=O TREES-0 ROOTS=O TESTS=O MAROTS'O i a s t vertex i s always f i n a l (no f u r t h e r edger) TREE(N)=O b e g i n w i t h t P i v 1 a l l e z i c o g r a p h i s s l l y Largest N - t u p l e IFl.NOT.RERUN) TREE(1)iN-1 valid 5 e C t 7 n g l f e r N < 3 NOTHIRS.fALSE. NOTIDTIN.EI.2 lF(N.LT.3) GOTO 500 f i r s t v e r t e x has a l l edger a t hand

lf(N0ROOT)

C

c c

50

c c c

c

c

c c (90

C

c

c

c

1

C

ializationr

rooted

c f

c c c the

question

for

i i n.

each I-value

If(OLD0RG.LE.PATER) GOT6 190 t h e RLOC w a s f u l f i l l e d b y e x h a u s t i o n o f t h e s u b f u p l e s , i.C. t h e y a r e e u ~ a l , i . e . we h a v e a n o n - i d e n t i c a l auto.Orphisa NOTIDT=.TRUE.

G O T O 200 t h e P I E Y ~ O U I RLOC i s i n h e r i t e d REF=OLDREF+l ORGiOLDORG V e r t e x 1-1 h a s u s e d T I e d g e s uSI=USED(1-1)tTI e v a l u a t e L8C MlNI=II~NIM~I-1~-1 i s previous coneition exhausted 1FlUlNI.LE.O) GOT0 3 1 0 i f p r e v ~ o u oC o n d i t i o n s t i l l p e n d 7 n g lF(Tl.E9.1> GO10 4 0 0 no p r e v i o u s c o n d i t t o n

'

*1n1=0 i f t h e b r a n c h b e g i n n i n g a t F A T H E R ( I j e x t e n d s t o t h e end of t h e H-tYple,thiS e r t 8 b l i l h e I II new L B C IflUSEn(PATER).EP.PATER) nINI=(KK-USl+8RAI)~BRAI a s s i g n a l l r e m a i n i n g edges t o C u r r e n t v e r t e x TI-N-US1

~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~

OLDREFSREF OLDORG=ORG t e s t f o r h o n c o n o r p h i c a l i r r e d u c i b i l i t y (PROP3) lF(Tl.E9,1) NOlHIR=.TRUE. nCTID(l)=NOTIDT NCHIR(I)-NOTHIR t e s t i f current loop alreaoy exhausted lF(Tl.GT.01 GOTO 4 9 0 r f the f o l l o u l n g branch 1s taken,"* have fhe Only r e m s i n i n g f a i l u r e c a r e m e n t i o n e d above iF(UINI.GT.0) GOTO 920 CONTINUE If(.NOT.RERUN) GOTO 4 9 5 o t h e r v e c t o r s have been i n i t i a l i r e d a c c o r a i n g to t h e r e i d v a l u e s f o r TREE RERUN=.FALSE. If(REf.LE.0)

GOTO S O 0 i f t h e l a s t RLOC 1 3 3 tun

identical

fulfilled

by e i h s u s t r O n , u e have a non-identica:

~ ~ b f u ~ al ned s t h e r e f o r e

500

'

simply a l l rooted trees ranted G O T O 800 c l a s s i f y rooted t r e e ( t e s t whethe? the g i v e n N-tuple 1 t k r L~1 a r o e s t o b f a r n a b l e a b o v e t h r ~ 0 r r e r p o n d i n Qt r e e ) UAXTEX=.TRUE. IF(N.LT.3) GOTO 530

If(N0TREE)

c

c

t h e f i r s t component never s e t t l e r a non-rdentical automorphism

G OTO ..

a p p l y (PROPA) NOTUlR=NOHIRlI-1) 1F(TI.E9.1) NOTHIR=.TRUE. K=l+? NOHIR(l)=NOTHIR NOT1 D T * N O T l D ( 1 > D O 4 9 0 1;K.N 1FtTl.GT.Oj GOTO 1 1 0 D i e c e d i n g v e r t e x was f i n a l J=l-l i s J brother of I ? If(BRA(J).GT.l) GOTO 1 3 0 no,go b a c k one g e n e r a t r o n J=FATHER(J) happens i n f r e q u e n t l y b u t O Y C T ~ L L o n l y once f o r GOTO 1 2 0 J 7s brother of 1 REF'I P r e p a l F f o r " C Y RLOC oRG=l same f a t h e r n a t u r a l l y PATER=FATHER(J) 1 7 1 n e i r b r a n c h 01 f a t h e r BRAI=BRA(J)-I GOTO 1 8 0 pveceding vertex 7s father,no elder b r o t h e r t o be r e s p e c t e d REf.0 PATER=1-1 f i r s t brother 8RAI;TI

c

c

ORGiO

GOTO 1 0 5 I f no r o o t e d t r e e s a r e i a n t C d , d v O l d the generation o f any N - t u p l e s f o r w h i c h t h e f i r s t Conponrnt I f n o t t h e I t r o n Q l I IdlQielt IFlNOROOT) U A X I k = T l - l apply (PROPI) NOTHIR*Ti.E9.2 r v i t s h ALKANE B L I O I I no more t h a n 4 e 0 f g e ~ a t a v e r t e x lF(.NOT.ALKLNE) GOTO 1 1 0 lF(lI.GT.&) TI=& TREE(l)=TI IF(UAXIU.GT.3) UAXlM=3

c

c c

tree5

49s

OLDREF;: If(I.GT.1)

c

9G

c c

i f t h e following l l n e i s r e a c h e d f r o m a b a r e , t h e first component 3 % c o n f e r n e d , i h i c h n e v e r ha3 any R L O C ; i t r e a c h e d fTOn b e l o r , t h e r e e x i s t s no R L 0 C , b e ~ d u 1 e t h e c o m p o n e n t c o n c e r n e d h a s j u a r b e e n d e c r e m e n t e d f r o m an a l l o w e d v a l u e , u h i s h d e f i n i t e l y f u l f i l l s a n y o l d RLOC

c

I U P L I C I T INTEGER (A-1) c L O G I C A L TREEKO c LOGICAL NOROOT,NOTREE,NOTEST,NOPLOT,SPPLOT,ALKANE lo0 LOGICAL NOTID,NOHIR,NOTIDT,NOTUIR,UAXTEX,RERUN DlPlENSlON T R E E ( 3 0 ~ , 8 R b ( J O ~ , F A T H E R l 3 ~ l C DlUENSlON UfNIU(3O),USED(30) c DlUENSION NOTID(30l,NOHlR(30) C TREE(1) c o n t a i n s t h e c o n t r o l l e d v * r i i l b l e o f t h e I - t h c n e s t e d " v i r t u a l " loop and t h e r e f o r e v i t h r n t h e N - t h ( i n n e r m o s t ) l o o p TREE c o n t a i n s t h e N - t u p l e d e f i n e d c above f o r the a c t u a l (rooted) t r e e FATHERlI) c o n t a i n 3 t h e rnder of t h a t v e r t e x from which v e r t e x 1 v a s r e a c h e d f i r s t ( t r e e - t h e o r e t i c f a t h e ? o f I) i.e. 1 i S one O f t h e v e r t i c e s C o u n t e d b y T R E E ( F A T H E R ( 1 ) ) B R A C I ) c o n t a i n s t h e number o f r e m a i n i n g branches o f v e r t e x PATHER(1) i n c l u d i n g t h a t b e g i n n i n g a t I c U S E D ( 1 ) d e n o t e s t h e n u n b c r O f e d g e s ( p l u s 1 ) u s e d up t o 101 b u t n o t i n c l u d i n g 1,i.c. t h e sum T R E E ( l ) + . . . + T R E E ( I - l ) NOTLO(1) v h r n . T R U E . S a y s t h a t l R E E ( l > , . . .,TREE(I) do 110 n o t represent t h e beginning o f an i d e n t i t y t r e e N O H I R ( 1 ) when .TRUE. s a y 1 t h a t no t r e e h a v i n g t h e s a m e f i r s t 1 c o n p o n e n t r a s TREE c a n b e h o m e o m o r p h i c a l l y rrreduc7ble I I I N I U ( 1 ) d e s c r i b e s t h e Lower b o u n d c o n d i t i o n ( i n t h e c sequel a b b r e v i a t e d L B 0 , i . e . IIINIM(1) > 0 says t h a t t h e l o u e s t a l l o u a b i e v a l u e f o r TREE(1) i s 1 c TI,PATER,8RAl,USI,NOTIDT,NOTHIR,M1HI serve t o o p t i m i z e 120 a c c e s s to t h e c u r r e n t c o m p o n e n t o f t h e ~ c s pv e c t o r c TREE,FATHER,8RA,USED.hOTIO,NOH~R,IIlN~II c REf a n d ORG d e s c r i b e t h e a c t u a l r e v e r s e l e x i c o g r a p h i c a i d e r c o n d i t i o n (RLOC i n t h e r e a u c l ) c REF c o n t a i n s t h e i n d e x O f t h e v e r t e x # h o s e n u m b e r o f 130 b r a n c h e s m u s t b e >I T R E E l l ) t o a c h i e v e RLO o f b r a n c h e s c ORG C o n t a i n s t h e i n d e x o f t h e v e r t e x v h l c h w a s f i r s t S u b l e c t e d t o t h e p r e s e n t RLOC c O L D R E f a n d OLDORG c o n t a i n REF a n d O R G f r o m t h e p r e v i o u s posit ion c RERUN 7 5 .TRUE. during the f i r s t pais through the loop r f r e s t a r t i n f o r m a t i o n was d e l i v e r e d , duping such a f l r i t p a s s no H - t u p l e I S g e n e r a t e d b u t t h e o t h e r v e c t o r s c a r e VeCOnDtiUCted c HlRS,IOTS,TREES,ROOlS,TESTS,MAROTS are respectively 150 homeomorphically i r r e d u c i b l e treer,identity trees, trees,rooted treer,calla t o TREEK0,raoted trees with c maximal f i r i t component component^, U A X I M c o n t a i n s an a d d i t i o n a l u p p e r b o u n d f o r c If NOROOT i s .FALSE. t h e n UAXIM i S t Set t o N and h a s no 180 c effect,if NOROOT 1 s .TRUE. t h e n U A X I M i s s e t t o t h e f i r s t component TREE(1) m i n u s 1 t h e r e b y a v o i d i n g t h e c generation o f those rooted t r e e s which obviously are n o t the trees c UAXTEX i s o n l y u s e d t o r e m e m b e r t h e r e s u l t o f a t e s t c ( I P e t t o .TRUE. o r i g i n a l l y a n d w i t h i n a DO-lOOp,it c r e s e t t o .FALSE. i f t h e examined r o o t e d t r e e must be t e s t e d b y TREEKO t o d e c i d e a b o u t t r e e - p r o p e r t y *a** i n p u t d a t I c NOROOT w h e n .TRUE. s a y s t h a t t h e program s h a l l as f a r 190 a S Possible avoid genersting looted trees rhiCh u r l l o b v i o u s l y b e no t r e e s ( o t h e r w i r e t h o s e r o o t e d t r e e s a r e c generated i n Order t o be counted) 200 NOTREE w h e n .TRUE. I I Y 5 t h a t t h e Crogran % h a i l p l o t a l l c l o o t e d t r e e s and may t h e r e f o r e o m i t a n y t e s t s f o r 300 tree-property c N O T E S T # h e n .TRUE. s a y s t h a t t h e p r o g r a m s h a l l n o t use t h e f u n c t i o n TREEIo ( w h i c h i m p l i e s t h e n t h a t lame t r e e s c nay appear t h a n o n c e ) . T h i s was b u i l t i n a s a t e s t this f a c i l r t y t o m e a s u r e t h e t i m e s p e n t i n TREEK0,ar c i s t h e only p a r t f o r u h r c h t h e expense c o u l d not be 310 t o b e b e L o u o r e q u a l t o t h e same O r d e r c a s t h e number o f v e r t i c e s i n t h e g e n e r a t e d t r e e s . c N O P L O T w h e n .TRUE. s a y s t h a t no g r a p h i c O u t p u t s h a l l b e c p r o d Y C ed 400 S P P L O T w h e n .TRUE. say5 t h a t Only i d e n t , t y t r e e s and h o n e a n o r p h i c a l l y i r r e d u c i b l e t r e e s s h a l l be p l o t t e d c C N d e n o t e s t h e a c t u a l n u m b e r o f v e r t i c e s f o r one t r e e A L K A N E w h e n .TRUE. says t h a t o n l y (rooted) t r e e s " 4 t h atmost f o u r edges a t any v e r t e x s h a l l be g e n e r a t e d c N U I N a n d NUAX a r e l o v e r a n d u p p e r b o u n d s f o r N ( 7 f on i n p u t H M I N < = N < = N I I A X t h e n t h e p r o g r a s e x p e c t s 8% f u r t h e r i n p u t a n tuple a n d g e n e r a t e s n e w N - t u p l e s c b e g i n n i n g j u s t b e l o w t h e g i v e n one, t h e o n l y n e C e s s a r Y 420 c h e r k p o i n t l r e r t a r t i n f o r m a t i o n t o resuae t h e p r o d u c t i o n c o f t r e e s i s t h e last v a l i d ( r o o t e d ) t r e e generated) 460 ~ ~ ~ ~ ( 5 , 1 0 N0 M0 I ) N , N U A X , N , N O R O O T , N O T R E E , N O ~ ~ ~ ~ ~ N O ~ L O ~ ~ FORUAT(3I2,lOLl) IF(N.LT.NMIN) GOTO 2 0 READ(5,lOOl) (TREElI), Ii1.N) fORUAT(3012) RERUN=.TRUE.

TI'TREE(1 ) - 1 D O 52D 1 = 2 XK ;OO:?f a n o t h e r component i s a s l s i g e a2 t h e f l : s t , ? I w i l l p l o d u c e a l e x i c o g i a p h i c a l l y l a r g e r N - t U p l e when u s e d as t h e r o ~ f , t h e r e f o r e t h i s i s d e f i n i t e l y no t h e SYpprcSlel t h l l Case) tree-representative (NOROOT=.TRUE. 51O:if t h e f i r s t component i s n o t b y 2 g r e a t e r t h a n any o t h e r o n e , t h i s o t h e r v e r t e x m u i t b e t C I t C d b y TREEKO w h e t h e r 1 t a s t h e roof d e l i v e r s a l a r g e r N - t u p l e

COMPUTER ENUMERATION AND GENERATION OF TREES 510

520 530 C

c

IF(TREE(I)-Tl) 520,510,900 MAXTEX=.FALSE. CONTINUE IAROTS=MAROTS+l can tree-property I F ( M A X 1 E X ) GOTO 800

J. Chem. If. Comput. Sci., Vol. 21, No. 2, 1981 95 639 640

be seen w i t h o u t

a s a i l to T R E E K O ?

c

TESTSSTESTSIl C

C a l l t o TREEKO w a n t e d ? GOTO U O O test for tree-property 1F(TREEKO(h,TREE,NOTIDT)) GOTO 9 0 0 LO""1 cases TREES=TREES+l 1F(.NOT.NOTIQT) IDTS=IDTS+l IF(.hOT.NOTnxR) H l R s.= H . r a. r+i g r a p h i t s f o r t h i s c a s e wanted ? IF(SPPLOT.AND.NOTHlR.AhD.NOTIDT) GOTO 9 0 0 g r a p h i c s Yented 7 I F ( N O P L 0 T ) GOTO 8 9 9 p?DdYcI giephlCI CALL PLTREE(h,TREES,TREE) CONTINUE CONTINUE * * * * * + * P o s s i b l y i n s t a l l a t i o n d e p c n a a n t *+.+*+. t h i s would be t h e l i g h t p l l c c f o r a p e t l o d i s a l s t o r i n g of Checkpoint information (only vector TREE a n d ( I f n e e d e d ) t h e C n Y D l C r l t l o n C O U n t L I s HIRS,IDTS, TREES,ROOTS,TESTS,.. M A R O T S must be s a v e d ) "0

IfCNOTEST)

C

C

800 C C C

899 900 C C C C C C

c 910

650 C C

c

C C C

660 C C

670

f o r N < 3 t h e r e i s o n l y one r o o t e d t r e e GOTO 960 f i n d innermost non-exhausted "Yi?tUal" loop

IF(N.LT.3)

C

C C

FOUnt

C C

1=*-7

920 930 940

YI=T~EEII)-~ lf(1I)940,930,950 l f ~ M l N I M ~ I ~ . L E . GOTO O ~ 950

C

l=l-1

c

GOTO 9 2 0

c

advance t h i s

loop

950

TREE(I)=TI

960

Outermost " v i r t u a l " loop exhausted ? IF(TREE(l).GT.IINROl) GOTO 1 0 0 * + * + * * * 3 L i n e s i n S t a L L . t i O n dependlnt *t*+*+* i n s t a l l a t i o n dependent r o u t i n e t o get CPU-tile CPU=TIME(I)

C C C

C C

used

675 C C

c C

680

n e x t nunbar o f v e r t i c e s "anted ? lF(N.LE.NMAX1 GOTO 50 i n i t a l I a t i o n dependent p l o t t e p - t e r . i n a t i o n * + * + * + * i n s t a l l a t i o n d e p e n d i n t *+et*+. lI(.NOT.NOPLOT) C A L L PEhD STOP END L O G I C A L FUNCTION TREEKO(h,TREE,hOTIDT)

C C

...

c

C C C

c c

t e s t

c

f o r

t r e e

1=1 690 700

c

rid

C C C

T l i T. l E. f.I 1

C

c C C

C

GOTO

C 610

C C C C

720

K=l-Vtl GOlO 7 4 0 If11.6T.fINV)

C

730 7bO C

750 C

620

770 77s 780 C C

790 795

c C C

c for

valid h-tuple

730 Of

nor r o o t

smaller branches Of o l d root K.1-FINVtJ-1 PLRM(SON,K)~PERM(FATHER,I) l=OLdrK.nll position PERMUT(I)*K CONTINUE c o l l e c t t r a c e i n f o r m a t i o n o f applies p e r m u t a t i o n s DO 7 6 0 I.1,N TfI=lRACE(fATHER,I) lRACE(SON,I)*PERMUT(Tfl) happens i n f r c q u c n t i y b u t OVLIILL o n l y once for e s c h L - v a l u e GOTO 650 compare N - t u p l e s CONTINUE D O 7 8 0 1.l.N If(PERM(L,I)-PERM(l,I)) 790,780,775 NOTREE=.TRUE. GOTO 7 9 5 CONTINUE tuples arc equa1,i.e. lRACE(L,*) deicribcr I non-identical .utonorphism NOTlOT=.TRUE. CONTINUE CONTINUE TREEKOSNOTREE RETURN

.....

ttt..*tt*.t.t~.".~...**.*".~*.

t

g r a p h i c

o u t p u t

t**tt...t**.*t..~t.*.t...~..~*~~~

I M P L I C I T INlEGER ( A - 1 ) REAL X , Y , X I , Y I , X J , Y J , X O , I O , X O , Y D , S l 7 . E , Y I D E , L E N G LOGICAL V A L I D DIMENSION TREEOO),VALID(30) DIRENSION f A T ~ E R ~ 3 1 ~ , B R A N C H l 3 1 ~ , F I N ~ L S ~ 3 1 ~ DIMENSION X ~ 6 0 , 3 0 ~ , ~ ~ 6 0 , 3 0 ~ , X I ~ 3 O ~ , I ~ ~ 3 0 ~ TREE 4 % d e s c r i b e d i n t h e m a i n P r o g r a m BRANCH(1) c o n t a i n s a t a n y m O m m t , r h e n s c a n n i n g t h e Ariadne-thread,the n u s b e r Of yet unused edges S t a l t i n g

C

.

new r o o t

SUBROUTINE PLTREE(N,SERNU(I,TREE) C C C

t h e r o o t a n d must

vALID(I)=.FALSE. m a r e u s e f u l when r e a r r a n g i n g PERM(1,l)iTO f i r s t h-tuple i s defined VALID(l)=.TRUE. ieacch fifhCr,grPndf.ther,..

760

c have

620

C C

GOT0

Of

K=ItN-FINV GOTO 7 4 0

c

b r a n c h used BRASP=BRANCH(SP)-l BRANCH(SP)=BRASP lF(BRASP.GT.0) GOTO 6 2 0 r f no n o p e b r a n c h e s u n s t a c k SP-SP-1 i f TREE h a $ b e e n b u i l t a c c o r d i n g t o d e f i n i t i o n , t h e f o l l o w i n g GOTO w i l l a l w a y s b e t a k e n happens i n f r r s u e n t l y b u t o v c r l l l o n l y once f o r r a s h L - v a l u e IF(SP.GE.1) GOTO 6 1 0

GOTO 7 2 0 larger branches

rmillcr branches

C

C

630

710

>

i n i t i a t e s t a c k p o i n t e r and f i r s t s t a c k elements SPil POs(t)=l BRANCH(l)=Tl L a s t v e r t e x i s a l w a y s f i n a l and t h e r e f o r e c a n n o t 8 large? N-tuple KK=N-I inspect other vertices D O 7 9 0 L=Z,KK number o f b r a n c h e s TL=TREE(L) 1fCTL.LE.O) GOTO 610 *tack "on-final vertex SP=SP+l P O S ( S P ) =L BRAhCH(SP)=Tl

o l d root

I_"

GOTO 7 4 0 If(1.GE.J)

C

TOiTl-1 C

C

--..., "=.A

f o u n d to b e no i d e n t i t y t r e e P O S a n d BRANCH a r e s t a c k s U s C d , l h e n s c a n n i n g t h e A r i a d n e t h r e a d d e f i n e d by t h e N-tYple,tO S t o r e i n d e x and r e m a i n i n g number o f ions f o p a c t u a l f a t h e r s V A L I D ( 1 ) s a y s w h e t h e r PERM(I,t) a n d TRACE(I,*) have been defined PERRUT s t o r e s t h e l a s t p e r m u t a t i o n PERPIC1 * ) c o n t a i n s t h e h - t u p l e f o r I a s t h e r o o t TRACE(;.J) c o n t a i n s t h e p o s i t i o n Of Vertex J f r o n t h e o r i g r n l l h-tuple i n t h e N-tuple fer root I UNOEC s a y s w h e t h e r t h e c a s e i s y e t u n d e c i d e d Y A L D A T S I V I whet he^ PERM a n d TRACE h a v e a L r C # d Y b e e n in7iiali NOTREE i s a t e m p o r a r y c o n t a i n e r f o r t h e f Y n s t i o n r e s u l t VALDAT=.FALSE. r o o t e d t r e e n o t y e t found t o be n o t t h e t r e e NOTREE=.FALSE.

C C

C

UNDEC=.TRUE. CONTINUE CONTINUE DO 7 5 0 I=l,N IF(1.GE.V) G O l O 710 Larger b r a n c h e s O f

C

c c c

C

f i n d p o s i t i o n f o r new b r a n c h O f n e w r o o t ( o l d r o o t a n d remaining branches of old root) UNDEC=.TRUE. JK=J D O 690 KrJK,FINV HELP=PER(l(FATHER,I)-PERM(FATHER,K) L e a v e Loop i f f i r s t d i f f e r e n c e b a t u c c n o l d a n d n c r branch i s p o s i t i v e IF(HELP.GT.O.AND.UhDEC) GOTO 7 0 0 If(HELP.GE.0) GOTO 6 7 5 f i r s t d i f f e r e n c e i s negative.10 t h i s old branch i s L a r g e ? t h a n t h e new o n e UNDEC=.FALSE. GOlO 6 8 0 1.1+1 n e w b r a n c h o f new root i s b u i l t b y c o n c a t e n a t i n g remaining branches e f old r o o t lF(1.EP.V) I*FINVtl t e s t cod o f o l d b r a n c h o f n r r r o o t SUM=SUM*PERM(fATnER,K)-1 IF(SUM.GE.0) GOTO 6 9 0 o l d b r a n c h o f ncr m o t e X h i u i t e d , L C a v r the loop if i t w a s n o t l a r g e ? t h a n new b r a n c h ( n e w o n e m a y b e D U t i n front of this old brmsh) I F ( U N D E 0 GOTO 7 0 0 o l d b r a n c h l a ' l a r g e r t h a n n C l , S e t ~ ~to c O m p l r C new t o next old branch J=K+l

(I

tt**.*..~t~l~~*.*~***~~~~*~*

C C C C C C C C

sun=o

sun-o

tt.*t*..t*.lt**.t......tt*****t

C

JJXSP POSJJ=POS(JJ) I F ( V A L I D 1 P O S J J ) ) GOlO 650 JJSJJ-1 h a p p e n s i n f r e q u e n t l y b u t o v e r a l l o n l y once f o r c a s h L - v a l u e GOTO 6 4 0 g o a n d c o m p a r e when N - t u p l e f o r v e r t e x C o n c e r n e d h a s been developed IF(JJ.GE.SP) GOlO 7 7 0 N-tuple d e f i n e d FATHER=POS(JJ) JJ=JJ*l N - t u p l e to b e d r f i n c a SON*POS(JJ) p o s i t i o n o f t h e son i n t h e N - t u p l e O f t h e f a t h e r V=TRACE(FATHER,SON) VALID(SON)*.TRUE. 1.1 b e g i n n i n g o f f i p s t o l d b r a n c h O f n e w loot J*Vtl s e a r c h t h e e x t e n t o f t h e r u b t r e e b e g i n n i n g a t V,by v i r t u e O f (PROP1) BUM=PERM(FATHER,V)-l F1NV.V IF(SUM.LT.0) GOTO 6 7 0 sum i s n o t n e g a t i v e , t h r r r f e r e t h e next vertex belongs to t h e r u b t r e e a s r e 1 1 FINV.fINVt1 SUM~SUMtPERI(fAlHER,FlNV)-l GOTO 660

.t y c r t e x I f A T H E R ( 1 ) c o n t a i n s t h e i n d e x O f t h a t v e r t e i tiom w h i s h v e r t e x I was p e a c h e d f i r s t I t r e e - t h e o r e t i c f a t h e r o f I), 1.e. 1 i s m e O f t h e vLrtiC.s C o u n t e d b y TREE(FATHER(1)) FINALS(1) c o n t a i n s t h e number o f f i n a l v e r t i c e s u h l c h can be r e a c h e d f r o n v e r t e x 3 t h r o u g h t h i s v e r t e x I xI(I),YI(I) contain the plot-cmrdinltCO Of Vertex 1 XJ,IJ c o n t i i n the plot-ceordin.tcr o f t h e current v e r t e x of vertex 1 X0,YO c o n t a i n the plot-coordin.trr X(K J ) Y(K,J) c o n t a i n ( o n l y when VALID(J).EP..TRUE.) the p l . ; - C 0 0 ~ d i n l t C - a i f f ~ ~ ~ " f~o~r * a l l n e e d e d d i I C C t l 0 n S O f e d g e s (to s a v e S I N a n d COS I v a l Y a t i e n r ) XD,YD contain pl~t-C001dln.t~-diff~~C.SCI for V C I t C I 1 Of different trees

KNOP ET AL.

96 J. Chem. If: Comput. Sci., Vol. 21, No. 2, 1981

Table 1. Number of Trees and Rooted Trees with N Vertices and the CPU Time Needed for the Calculation

c

no. of vertices ( N )

no. of treesa

no. of rooted treesa

CPU time,b s

1 1 2 4 9 20 48 115 286 719 1842 4766 12486 32973 87811 235381 634847 1721159 4688676 12826228

1 2 4 12 32 90 24 8 689

c

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

IOiYD 81:

XCiXGiXD CCNTiIUE

c

"refL,

PATERiNil QHANCH1PATER)iPATER iINALSlPATER)=G C0Y"t f >"a 1 vert ' cer D G 830 l = l , h branclei i t a r t l r g at 1 ~ I i T R E E ( 1 ) f a t h e r i s the youngest vertex r l t h oranche~ IP'HER(I)=PATER . r ~ : i a l l y a c t b r a n c i e i a r e unused QRANCHlI)=l1 1 s Yertex 1 final 7 1FiTI.GT.G) GOTG 8 2 5

c c

C

1 1 %

C

825

c c

l r 1 t 1 a L ~ i a f 1 a n r

830

c

840 850

c

c c

c c

c

c

Temdlnlng

f'la!

FlNP=l iiNALS(li=iIhP bacr t o f a t h e r iINP=IlNP+FlNALS(PATER) i n c r e a s e f a t h e r ' s f i n a l coint b y t h a t o f son FINALS(PATER)=fINP r h l i b r a n c h has been Used BRAP=BRANCH(PATER)-1 QRANCHlPATER)=QRAP branches left ? G O T O 830 ii(BRAP.GT.0) n o more b r d n i h e s , s o b a c k t o f a : h e r PATER=iATHERlPAlER) h a o p e n s ~ n f r e a u e n t l ? b L f 0 ~ E i a 1 1o n l y o n c e f o r e a c t I - v a l u e GOTO 8 2 D v e r t e x 1 I S PO'. f i n a l a n d t h e r e f o r e h a s s o n s ( b r a n c h e s ) PATERil ilNALSli)=C CCNT:NUE n u m b e r o t f , n a l v e r t i c e s I S number C f f i n a l V e r f i i e P r e a c h a b l e through v e r t e x 1 ilNMAX~flNALS11) (+l I f verter 1 ,s I t s e l f flna!) iFlTREEll).LE.l! iINMAXiilNMAXt1

~(L,iINMAX)=LENG*SIN(XJ'FLCATlI-l)) BRANCHll)=TREE(l) start coord1ndtes XJiX0 YJXYD first u e r t e r Xl(:>=XJ I l ( l ) i Y , move and mark *+*+A+* p o s s i b l y 1 n s T d l l l t 1 o n dependant t i * + * + * C A L L SYMBOL(XJ,IJ,SIIE,l,0.,-1! I F l Y . L T . 2 ) G O T O 880 i n l t i a t e eOge d l r e c t l o n ( f i r r l edge I 1 1 f t l e lower than hoi1rontal) L o b i 1 - f 1N A L I ? > D O 870 I = ? , N a n g l e o f a n y e d g e I S t h e mean v a l u e b e t w e e n t h e f l r r t I L o Y + l ) and l a s t v a l u e s ( L O U t Z t i I N A L S l I 1 - 1 1 f i n a l edge r e a c h a b l e t h r o u g h t h i s edge LNDEX=LCY+FINALS(I)

a n g l e modulo ZIP' 1ilINDEX.LE.O) INDEX=lNDEX+FINZ XJ=XJ+XIINDEX,FINllAX) YJ'Y1*I(INDEX,FINMAX) c o o r d ~ n a t e sverter I

c

XIlI)=XJ Yl(i!ilJ

c c c

c 860 C

c

C

c

170

880

.+.+.+.

draw a l a m a r k tit**+* pr,lllb,y 1 n r t a 1 1 a t . o n dependant :ALL SYUBOL(XJ,IJ,SIIE,l,O.,-2) TIiTREEII) a u a ~ i a b l eb l a n c h e s ERAHCH(1)iTI IilT1.GT.O) G O T O 870 final brdnch,Increare iawei bound c f mean value LcY=LOY+z PATER=FATHERli1 o r d l odck t o fatbel XJ=XIlPATER) Y.=Yl(PATER) .+t+*+t p ~ i i l b l ?1 n r t a l l a t 1 o n dependsrt *+*+e+* :ALL P L G T ( X J . Y J . 2 ) b i a n c h "Sed ERAPiQRANCHlPATERI-1 QRANCH(PATER>=BRAP iFlBRAP.GT.3) GOTO 870 no more b r d n C h e i , s o b a c k t o f a t h e r ( I f e x i s t e n t ) l i ( P A T E R . E Q . 1 1 GOT0 870 PATER=iATHERlPATER1 happers ~ n f r e q u e n ; l y but O ~ e r s l lo n l y Once f o i e a c h GOTC 860 CONTINUE next ~ t a r icoora>nates 10iYOtlD l F l A B S I ~ 0 - Y 1 D E ~ . L E . ~ I D E - A B S l l D ~ + . G1 O~ T C 8 9 0 Y D i - I D '0=1O+Yc

I-value

X0=xo'xD 890

RETURN

END

Figure 4. Computer program.

the program will run into a contradiction first between RLOC and LBC. This could be excluded by additional tests for all

n

L

3 6 11 23 47 106 235 551 1301 3159 7741 19320 48629 123867 317955 823065

The corresponding diagrams of the (rooted) tree graphs may be obtained from the author on request. CDCCYBER 76.

components of all N tuples, but it proved less expensive to let the program generate all tuples and then ignore the contradicted ones. Even if many failures of this type occur before generating the next valid N tuple, the expense remains of the order N , because for every such failure the number of components generated and then canceled by searching for the innermost nonexhausted loop is less than or equal to the length of comparative branch of the current RLOC. Now this RLOC is no longer valid for the next trial (it has been satisfied by a smaller value) so that any new failure must originate from an RLOC whose comparative branch must be part of the rest of the newly generated N tuple. But this implies that the total number of backsteps cannot exceeed N . Therefore special provisions for this remaining failure case have been omitted. The logical function TREEKO (Figure 2) tests whether a given N tuple representing a rooted tree is the representative of the tree or the representative of the underlying tree. This is the case if no other vertex chosen as the root delivers a lexicographically larger N tuple. The comparison is only carried out for the nontrival cases where the new root has as many adjacent edges as the original root. To get the new N tuple, TREEKO makes extensive use of property ii to transfer the root-property edge by edge on the way from the old to the new root. The subroutine PLTREE (Figure 3) transforms the representation of a (rooted) tree as an N tuple to a graphic representation. It draws the Ariadne thread described by the N tuple and spreads final edges @e., edges which lead to vertices with no other edges) equally to all directions. Other edges take the mean value of the largest and smallest angles of final edges reachable through the edge concerned. The vertices were represented by a small octagon (SYMBOL, special symbol no. 1).

The computer program is listed in Figure 4. At the end of this section we want to say a few words about the efficiency of the program. The maximal expense to find out the next N tuple representing a rooted tree is of order N , because there are no nested DO loops and backward jumps occur perhaps infrequently but overall at most once for each step of the DO loop. The expense in PLTREE to produce graphic output from an N tuple is almost proportional to N for the same reason, and the same argument bounds the CPU time for a call to

J . Chem. Inf. Comput. Sci., Vol. 21, No. 2, 1981 91

COMPUTER ENUMERATION AND GENERATION OF TREES Table 11. Number of Identity Trees and Homeographically Irreducible Trees with N Vertices

no. of no. of

no. of vertices ( N )

identity trees“

homeographically irreducible tree%

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

1 0 0 0 0 0 1 1 3 6 15 29 67 139 310 667 1480 3 244 7241 16104

1 1 0 1 1 2 2 4 5 10 14 26 42 78 132 249 445 842 1561 2988

The corresponding diagrams may b e obtained on request.

TREEKO to N X N . Therefore the average expense to find an N tuple representing a tree is of an order less than or equal to N X N X N because there are at most N rooted trees above, each of which is produced in order N , tested by the main program in order N and by TREEKO within order N X N (only if necessary). Actual runs of the program have shown that these extensive tests are seldom enough to allow estimation of the expense for the generation of trees proportional to the total number of vertices in all generated trees (physical plotter actions as well need real time proportional to the total number of vertices). The efficiency of the method is best illustrated by test runs with dummy PLOT and SYMBOL which save about 90% of CPU time. RESULTS In Table I we give the number of all trees and rooted trees for N = 1,2, . . ., 20 vertices and the CPU time (in s) needed for the calculation on the computing machine CDC-CYBER

76. These values were compared with the numbers given by Harary in his classical book “Graph Theory”.30 In a separate table (Table 11) we give the number of identity trees and homeographically irreducible trees with N vertices. Values for trees with N = 1,2, , . ., 12 vertices in Table I1 were checked against the numbers given by Harary and The diagrams of the trees, rooted trees, identity trees, and homeographically irreducible trees may be obtained on request from the Computer Centre of the University of Diisseldorf. If we wish to enumerate only trees corresponding to the carbon skeletons of alkanes (Le,, alkane tree graphs) or the rooted trees corresponding to the structures which could be generated from alkanes on substitution (i.e., alkane rooted tree graphs), the condition of the maximal valency of vertices must be set to four. In Table I11 we give the number of alkanes CNH2N+2, the number of alkane rooted trees with the primary, secondary, tertiary, and quaternary roots, respectively, the number of substituted alkanes, CNH2N+1X,and the total number of “rooted” alkanes, respectively. These results were checked against the numbers given by Read.13 The alkane diagrams may be also obtained on request from the Computer Centre of the University of Dusseldorf. However, as an example we present in Table IV the output listing containing graphs of all alkane trees with 11 vertices. We are quite aware that it is very difficult to augment the already rich literature on the enumeration of trees and alk a n e ~ . ’ , ~ ’However, .~~ we have presented here, and well documented, a method with several excellent features. (a) It enumerates trees and alkanes with high accuracy as any existing, cumbersome or elegant, method in literature. (b) It is easily adopted for the computer calculations. (c) It possesses an important advantage in comparison with other methods: it produces directly graphs of trees and alkanes. To our knowledge this is a unique method in this respect. (d) It could be easily adopted for the classroom demonstration of the chemical enumeration problem on the computer. Therefore, we believe that the presented method should be of general interest because it offers a very efficient approach for handling a class of important chemical (and graph-theoretical) structures by computer.

ACKNOWLEDGMENT Discussions with Dr. A. J. Harget (Birmingham) and Dr. T. givkovie (Zagreb) are gratefully acknowledged. We thank

Table 111. Number of Alkanes and Substituted Alkanes w i t h N Atoms

no. of atoms (N)

no. of alkanes“

no. of alkanes with a primary root

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

1 1 1 2 3 5 9 18 35 75 159 355 802 1858 4347 10359 24894 60523 148284 366319

1 1 1 2 4 8 17 39 89 211 507 1238 3057 7639 19241 48865 124906 321198 830219 2156010

no. of alkanes with a secondary root

no. of alkanes with a tertiary root

no. of alkanes with quaternary root

no. of substituted alkanes

no. of “rooted” alkane trees

0 0 1 1 3 6 15 33 82 194 482 1188 2988 7528 19181 49060 126369 326863 849650 2216862

0 0 0 1 1 3 7 17 40 102 249 631 1594 4074 10443 26981 69923 182158 476141 1249237

0 0 0 0 1 1 3 7 18 42 109 269 691 1759 4542 11733 30559 79743 209136 549959

1 1 2 4 8 17 39 89 211 507 1238 3057 7639 19241 48865 124906 321198 830219 2156010 5622109

1 1 2 4 9 18 42 96 229 549 1347 3326 8330 21000 53407 136639 351757 909962 2365146 6172068

The corresponding diagrams may be obtained from the author on request.

98 J . Chem. InJ Comput. Sci., Vovol. 21, No. 2, 1981

KNOPET

Table IV. Alkane Graphs with 11 Vertices (Copy of the Computer Output)

1 t t

?

"t: L + L-c

t

L I

jx

'I

AL.

J . Chem. In& Comput. Sci. 1981, 21, 99-101

the reviewers for their helpful comments which led to improvement of the paper. REFERENCES AND NOTES (1) F. Harary, “Graph Theory”, Addison-Wesley, Reading, MA, 1972, p I)*

JL.

(2) (3) (4) (5) (6j (7) (8) (9) (10) (11) (12) (13)

Reference 1, p 187. G.Kirchhoff, Ann. Phvs. Chem., 72,497 (1847). A. Cayley, Philos. Mag., 13, 1-2 (1857). C. Jordan. J . Reine Anpew Math.. 70. 185 (1869). . , A. Cayley; Philos. Mag., 67, 444 (1874). A. Cayley, Ber., 8, 1056 (1875). F. Herrmann, Ber., 13, 792 (1880). S.M. Losanitsch, Ber., 30, 1917 (1898). F. Herrmann, Ber., 30, 2423 (1898). S.M. Losanitsch, Ber., 30, 3059 (1898). F. Herrmann, Ber., 31, 91 (1898). R. C. Read, in “Chemical Applications of Graph Theory”, A. T. Balaban, Ed., Academic, London, 1916, p 25. (14) G.Wlya, Acta Math., 68, 145 (1937). (15) R. Otter, Ann. Math., 49, 583 (1948). (16) L. Weinberg, Proc. IRE, 46, 1954 (1958).

(28) (29) . , (30) (31) (32)

99

R. C. Read, in “Graph Theory and Applications”, Y. Alavi, D. R. Lick, and A. T. White, Eds., “Lecture Notes in Mathematics”, No. 303, Springer-Verlag, Berlin, 1972, p 243. H. Schiff, Ber., 8, 1542 (1815). F. Tiemann, Ber., 26, 1595 (1893). H. R. Henze and C. M. Blair, J . Am. Chem. SOC.,53, 3042, 3077 ( 1932). C. M. Blair and H. R. Henze, J . Am. Chem. SOC.,54, 1098, 1538 ( 1932). D. Perry, J. Am. Chem. SOC.,54, 2918 (1932). D. D. Coffman and C. M. Blair, J . Am. Chem. Soc., 55,252 (1933). H. R. Heme and C. M. Blair, J . Am. Chem. SOC.,55, 680 (1933). H. R. Henze and C. M. Blair, J . Am. Chem. SOC.,56, 157 (1934). C. C. Davis, K.Cross, and M. Ebel, J. Chem. Educ., 48,675 (1971). See, for example, various articles in “Graph Theory and Computing”, R. C. Read. Ed.. Academic. New York. 1972. L. M. Masinter,”. S.Sridharan, J. Lederberg, and D. H. Smith, J. Am. Chem. Soc., 96, 7702 (1974). J. B. Hendrikson, J. Am. Chem. Soc.,93,6847 (1971); 93,6854 (1971); 97, 5763 (1975); 97, 5784 (1975). Reference 1, p 232. F. Harary and G.Prins, Acta Math., 101, 141 (1959). D. H. Rouvray, Chem. SOC.Reu., 3, 355 (1974).

ACS Committee on Nomenclature: Annual Report for 1980 KURT L. LOENING Chemical Abstracts Service, Columbus, Ohio 43210 Received February 17, 1981 Nomenclature committees, both national and international,were very active in 1980, resulting in substantial progress in many different fields. A summary of the more important meetings and accomplishments follows. The ACS Committee on Nomenclature held its annual meeting at CAS in November.’ Progress of the work of the divisional committees and international commissions was reviewed. In addition, ways of working more closely with ACS Divisions, journal editors and authors as well as general means of promoting good nomenclature were explored. The chairman of the committee addressed the editors of ACS journals on the subject of chemical nomenclature at their conference in Columbus; this should lead to closer cooperation between the two groups. The feasibility study on compiling an authoritative chemical dictionary was extended, while the project on visual aids for chemical nomenclature was dropped. Contact was established between the committee and its recently established British equivalent. The subcommittee on chemical pronunciation continues to be active. The IUPAC Interdivisional Committee on Nomenclature and Symbols (IDCNS) functioned effectively this year. It held its annual meeting in Cambridge in September. In addition to the IUPAC publications listed in the Appendix, specific documents in process and thus not yet recorded in this Appendix deal with the following topics: straightforward transformations, transport phenomena, biochemical equilibrium data, chemical kinetics, physicochemical quantities and units in clinical chemistry, calorimetric measurements on cellular systems, and various classes of carbohydrates. The IUPAC Inorganic Nomenclature Commission met in September in Cambridge. Topics under discussion included neutral molecules and compounds, ions and radicals, rings and chains, polyhedral clusters, isopoly- and heteropolyanions, oxo acids, inorganic polymers, and stereochemical nomenclature. These topics were discussed in the context of providing a ‘Abbreviations used, not identified in the text, are ACS, American Chemical Society; CAS, Chemical Abstracts Service; IUPAC, International Union of Pure and Applied Chemistry; JCBN, Joint Commission on Biochemical Nomenclature; NC-IUB, Nomenclature Committec of International Union of Biochemistry. 0095-2338/81/1621-0099$01.25/0

revision of the 1970 edition of the Red Book. Revised recommendations on the nomenclature of nitrogen hydrides and isotopically modified compounds are expected to be issued next year. The IUPAC Organic Nomenclature Commission met in September in Cambridge. The commission continued its study of the reorganization and revision of the present rules according to a more logical arrangement (Section R) and of a more drastic long-range approach (Section G). In connection with Section G, several specific projects are under way: nodal nomenclature, radial nomenclature, “inorganic” ring nomenclature, nomenclature for delocalized ions and radicals, nomenclature of oxo acids, and general priority rules for numbering. The following topics are so well advanced that publications should be forthcoming within a year or two: lambda convention, classical ions and radicals, cyclophanes, and a revision of the Section E rules on stereochemistry. The 1979 provisional recommendations for the revision of the Hantzsch-Widman nomenclature system for naming heteromonocycles have generated so many diverse opinions and comments that the commission requires additional time and study before issuing definitive recommendations. The IUPAC Macromolecular Nomenclature Commission met in September in Naples. The commission is continuing its work on (a) nomenclature and symbolism of copolymers, (b) subsidiary definitions of terms relating to polymers, (c) definitions for physical properties of polymers, (d) substitutive nomenclature for reacted polymers, (e) nomenclature of inorganic polymers, (f) classification and family names of polymers, and (g) interpenetrating polymer networks. Of these items (a), (e), and (f) are at the most advanced stage with recommendations expected to be issued in 1981 or 1982. A definitive version of the recommendations dealing with stereochemical definitions and notations relating to polymers will be issued in 1981. 0 1981 American Chemical Society