perm filename LABEL.DOC[DOC,LMM]1 blob
sn#044084 filedate 1973-05-18 generic text, type T, neo UTF8
LABELLING OBJECTS HAVING SYMMETRY
L. Masinter
N.S. Sridharan
Computer Science Department
Stanford University
Stanford, California
May, 1973
ABSTRACT. An algorithm for finding a complete set of non-
equivalent labellings of a symmetric object and applications of
the algorithm to problems in chemistry are presented.
␈↓αINTRODUCTION␈↓β
The class of combinatorial problem dealing with finding a complete
set of non-isomorphic objects under varying constraints and concepts
of isomorphism, have wide applications in a variety of fields. The
problem attacked in this paper is one of finding all distinct ways to
assign a given number of labels or colors to the parts of a symmetric
object when it is also known how many parts get each of the labels or
colors. In chemistry, one manefestation of this problem is to make
all assignments of ligands (from a fixed set) to a given carbocyclic
skeleton. Part A of this paper may be read as a brief tutorial on
the nature of the problem and an introduction to the terminology
found in more technical treatments. Part B describes a method for
the solution of this type of problem; a formal description and proof
of correctness is available elsewhere. Part C gives examples and
applications of this algorithm in both organic and inorganic
chemistry.
This problem is encountered in a wide range of applications beyond
chemistry-- within many areas of graph theory and combinatorics, for
example. It has been known how to compute the number of solutions
[1], but an efficent method of actually constructing the solutions
has not previously been reported.
1
The discussion in this paper is restricted to the labelling of graphs
with the "topological" symmetry groups; the algorithm, however, is
immediatly applicable to other types of labellings; one could, for
example, generate all distinct "permutational isomers" for a given
three dimensional ring system as defined in Ruch [2].
␈↓αPART A: DEFINITIONS␈↓β
␈↓αSYMMETRY AND ITS RELATIONSHIP TO LABELLING␈↓β
Consider the special case of the general problem: suppose all of the
labels are distinct. This means that, for example,one wishes to
number the faces of a cube with the numbers {1,2,3,4,5,6}, or the
"nodes" (atom positions in a graph) of the decalin skeleton with
numbers {1,2,3,4,5,6,7,8,9,10}. There are n! (n factorial) ways of
labelling where n is the number of parts. If the object has no
symmetry then each of these n! ways are distinct from the rest.
However for the decalin skeleton (where n! = 10! = 3,628,800 ways)
there is some symmetry. Pick one of the numberings as a point of
reference (Fig 2a). Some of the 10! ways are different (Fig 2b);
some of them are essentially the same (Fig 2c).
2
------------------------------------------------------------
Figure 1
The Decalin Skeleton
⊗ ⊗
/ \ / \
/ \ / \
⊗ ⊗ ⊗
| | |
| | |
⊗ ⊗ ⊗
\ / \ /
\ / \ /
⊗ ⊗
------------------------------------------------------------
Figure 2
Three of the 10! numberings of
the Decalin Skeleton.
2 4 7 1 4 2
/ \ / \ / \ / \ / \ / \
/ \ / \ / \ / \ / \ / \
1 3 5 2 8 9 5 3 1
| | | | | | | | |
| | | | | | | | |
10 8 6 3 4 5 6 8 10
\ / \ / \ / \ / \ / \ /
\ / \ / \ / \ / \ / \ /
9 7 6 10 7 9
(2a) (2b) (2c)
------------------------------------------------------------
Intuitively, Figs 2a and 2c are equivalent because one could take 2a,
flip it over and get 2c. There is another way of determining the
"sameness" of such numberings which is easier in the more complicated
cases and is in general more suited to computer application:
3
Write down the respective "connection tables". (See Table I). Note
that the connection table for Fig 2c is identical to that of Fig 2a;
that of Fig 2b is different.
------------------------------------------------------------
Table I.
Structure (2a) Structure (2b) Structure (2c)
node|connected | node|connected | node|connected
| to | | to | | to
| |
1 2,10 | 1 8,9 | 1 2,10
2 1,3 | 2 7,3 | 2 1,3
3 2,4,8 | 3 2,6 | 3 2,4,8
4 3,5 | 4 6,8,10 | 4 3,5
5 4,6 | 5 9,10 | 5 4,6
6 5,7 | 6 3,4 | 6 5,7
7 6,8 | 7 2,8 | 7 6,8
8 3,7,9 | 8 1,4,7 | 8 3,7,9
9 8,10 | 9 1,5 | 9 8,10
10 1,9 | 10 4,5 | 10 1,9
------------------------------------------------------------
DEFINITION: two numberings of the nodes of a graph are
␈↓↓equivalent␈↓β if the connection tables with the respective
numberings are identical when the node numbers are
written in ascending order and each "connected to" list
is in ascending order.
This means, among other things, that properties such as valency are
preserved: If two numberings are equivalent and in the first, node 1
is trivalent then in the second, node 1 is trivalent. If there are
other properties of the nodes (already colored or labelled, for
example), then this definition can be extened to include the
preserving of those properties.
4
For example, suppose there are atom names associated with (some of)
the nodes of the graph. Then one can define equivalent numberings to
be those which not only have identical connection tables, but also
the same atom names for the corresponding nodes. Then Fig 3a would
still be equivalent to Fig 3c but no longer to Fig 3b since, although
the connection tables of 3a and 3b are identical, node 1 in Fig 3a is
labelled with an "N" while while in 3b it unlabelled.
----------------------------------------------------------
Figure 3.
Partially labelled graphs reduce
the equivalencies.
2 4 9 7 4 2
/ \ / \ / \ / \ / \ / \
/ \ / \ / \ / \ / \ / \
N1 3 5N N10 8 6N N 3 1N
| | | | | | | | |
| | | | | | | | |
10 8 6 1 3 5 6 8 10
\ / \ / \ / \ / \ / \ /
\ / \ / \ / \ / \ / \ /
9 7 2 4 7 9
(3a) (3b) (3c)
----------------------------------------------------------
␈↓αPERMUTATIONS AND PERMUTATION GROUPS␈↓β
Given one numbering, one can use a condensed notation to write down
other numberings in terms of the first. In the 2b case (Table II),
the row of numbers means that, in sequence, the node numbered 1 in
the reference numbering 2a is now numbered 2, the node originally
numbered 2 is now numbered 10, and so on. All of these are written
5
down with respect to the original "reference" numbering of figure 2a.
-------------------------------------------------------------
Table II
Condensed Notation for Numberings
Figure 2a numbers: 1 2 3 4 5 6 7 8 9 10
Figure 2b numbers: 2 10 3 7 4 6 5 8 1 9
Figure 2c numbers: 5 4 3 2 1 10 9 8 7 6
-------------------------------------------------------------
One can conceptualize a numbering as a transformation or as a
function: The transformation for 2c is π (1)=5, π (2)=4, π (3)=3,
2c 2c 2c
... π (10)=6. These transformations are ␈↓↓permutations␈↓β : one to one
2c
mappings from the integers {1,2,...,n} to themselves. The
transformation for the "reference" numbering is the identity i(x)=x.
It can be shown that whatever the graph, the set of transformations
satisfying the "equivalency" requirement above satisfies the property
of a group. One may then say:
The ␈↓↓symmetry ␈↓↓group␈↓β of a graph is the set of all
transformations which yield identical connection tables.
(If there are other properties to be considered, one may
include them as part of the connection table). For the
decalin skeleton there are 4 permutations in the group.
6
--------------------------------------------------------------------
The Symmetry Group
of the Decalin Skeleton
π 1 2 3 4 5 6 7 8 9 10
i
π 5 4 3 2 1 10 9 8 7 6
v
π 10 9 8 7 6 5 4 3 2 1
h
π 6 7 8 9 10 1 2 3 4 5
180
-------------------------------------------------------------------
These correspond directly to the intuitive geometric symmetries
π =identity, π =reflection about horizontal axis, π =reflection about
i h v
vertical axis, π = rotation 180 degrees. It is not too difficult
180
for a computer program to figure out the symmetry group of a graph
given its connection table.
In many cases, one might wish to label the "edges" (interconnecting
lines) of a graph. In this case, the symmetry group on the edges
rather than on the nodes is required. It is very easy to calculate
this group. First find the group on the nodes. Then, for each edge
in the graph, write down the nodes that form the end-points. Then
the corresponding permutations can be obtained as follows:
π {n ,n }={π (n ),π (n )}
i 1 2 i 1 i 2
This is most easily shown by way of an example. (Table IV).
7
------------------------------------------------------------------
Table IV
Group of Decalin Skeleton acting on the edges
π 1-2 2-3 3-8 3-4 4-5 5-6 6-7 7-8 8-9 9-10 1-10
i
π 4-5 3-4 3-8 2-3 1-2 1-10 9-10 8-9 7-8 6-7 5-6
v
π 9-10 8-9 3-8 7-8 6-7 5-6 4-5 3-4 2-3 1-2 1-10
h
π 6-7 7-8 3-8 8-9 9-10 1-10 1-2 2-3 3-4 4-5 5-6
180
------------------------------------------------------------------
Finding the group of an object is a special kind of labelling
problem. Given one way of numbering (labelling with distinct labels)
the parts of the object, one finds all other ways which are
equivalent. The labelling problem attacked in this paper is the
converse: to find a maximal list of labellings, none of which are
equivalent to each other. In general the set of all posssible
labellings can be split into subsets, such that:
1) If two labellings are in different subsets, they are
not equivalent.
2) If two labellings are in the same subset, they are
equivalent.
These subsets are called ␈↓↓equivalence ␈↓↓classes.␈↓β
The relationship of finding the group, and of finding labellings
where all the labels are distinct, can be viewed as follows: Take the
8
n! possible labellings and lay them out in a matrix: each row will
contain one equivalence class. (It can be shown that in this special
cases where the labels are distinct, all of the equivalence classes
are of the same size). To find the group, one needs to find a row.
To find the labellings, one needs to pick one element from each row.
-------------------------------------------------------------
Figure 4
Equivalence classes, Groups, and Labellings
-------------------------------------------------------------
␈↓αPART B. SOLUTION TO THE LABELLING PROBLEM␈↓β
␈↓αA Naive Method␈↓β
An obvious method to find the distinct labellings would be to
generate all n! of the possible ones, and for each one to check if an
equivalent one was previously generated. Unfortunately, this method
takes an exhorbitant amount of computation (proportional to n!
9
squared). Below a method is discussed which takes an amount of time
roughly proportional to the number of solutions (i.e. the number of
equivalence classes of labellings) and requires only knowledge of the
symmetry group. Thus it is useful for labelling objects using their
geometric symmetry as well as the topological symmetry defined above.
␈↓αA Better Method␈↓β
␈↓αSpecial case: distinguish 1 node.␈↓β First consider the special case
where there are only two types of labels such that there is one label
of the first type and all the rest of the second. (E.g., color one
red, and the rest green, or label one N and the rest C.) Intuitively,
for the decalin skeleton, one can see that there are three classes of
symmetric nodes, or orbits, marked with "⊗", "+" and "&" in Fig. 5,
and that each distinct labelling corresponds to selecting one node
from each type. (see Fig 6.)
-------------------------------------------------------------
Figure 5
Orbits in the Decalin Skeleton
⊗ ⊗
/ \ / \
/ \ / \
+ & +
| | |
| | |
+ & +
\ / \ /
\ / \ /
⊗ ⊗
-------------------------------------------------------------
10
-------------------------------------------------------------
Figure 6
Three Labellings of Decalin with
1 N and 9 C's.
⊗ ⊗ N ⊗ ⊗ ⊗
/ \ / \ / \ / \ / \ / \
/ \ / \ / \ / \ / \ / \
N ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ N ⊗
| | | | | | | | |
| | | | | | | | |
⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗ ⊗
\ / \ / \ / \ / \ / \ /
\ / \ / \ / \ / \ / \ /
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
-------------------------------------------------------------
Thus there are three distinct labellings (the ten possible labellings
split into three equivalalence classes).
␈↓αComputing orbits.␈↓β In general, the parts of an object with symmetry
split into different orbits (sometimes there is only one type, as in
the faces of a cube, or the nodes of the cyclohexane skeleton). To
label the parts of an object such that one is distinguished, it is
necessary only to find the orbits and then, for each type, pick one
of the parts in that type arbitrarily. Note that if the object has
no symmetry each type has exactly one part in it.
It is very simple to find the different types from the table of the
symmetry group: if one writes out the group, as in Table III, with
11
each permutation as a row, then the numbers in each column, taken as
a set, form an orbit. The orbits of the group in Table III are:
{1,5,6,10}, {2,4,7,9}, {3,8}.
After finding the set of orbits, one then can do the special
labelling described above (distinguishing only one node): the number
of distinct labellings is the number of orbits. Each corresponds to
selecting an element from an corresponding orbit and labelling it.
For reasons to be explained later, the first element of each orbit
should be chosen (i.e. the one with the smallest number in the
reference numbering).
One might note here that if one has n-1 labels and 1 "blank" it is
the same as 1 label and n-1 "blanks". This special case can be
applied here as well. ␈↓αThe reduced symmetry group.␈↓β Once a group has
been calculated for a structure, many times one wants to know what
the group is after some labels have been attached. The group of a
labelled structure is always a ␈↓↓subset␈↓β of the group of the
unlabelled structure. Thus one needs to know which permutations in
the group must be thrown out. To do this, write the "labels"
associated with each node next to the node number in the permutation
table as in Table II. If in any column, there is an element which
has a different label than the label in the "reference" numbering
12
(identity permutation), then that row can be discarded. That is,
every permutation in the reduced symmetry group must satisfy:
for every node i, label(π(i))=label(i).
Exactly those permutations in the original group that satisfy this
equation are the ones in the reduced symmetry group of the labelled
structure. ␈↓αReduction techniques␈↓β In the general labelling problem,
there are two important techniques used to reduce the problem. The
*
first is called LABEL RECURSION and the second ORBIT RECURSION. The
idea behind LABEL RECURSION is that one can do one label at a time.
The idea behind ORBIT RECURSION is that one can label one orbit at a
time. These reductions are discussed in detail below.
------------------------
* A ␈↓↓recursive␈↓β technique is one which is defined in terms of itself.
For example, n! can be defined by:
{ 1 if n=1
n! = {
{ n*(n-1)! otherwise
The computation of 10! involves computing another factorial. It is
necessary that at each step, the problem is reduced. Here the
general solution of the labelling problem is described in terms of
less complicated labellings. Several special cases (analogous to the
n=1 case in the definition of factorial) are also defined.
13
␈↓αLABEL RECURSION␈↓β
If one is given many (more than 2) kinds of labels, say n of type 1,
1
n of type 2, ... n of type k, proceed as follows: (1) solve the
2 k
labelling problem for n of one type of label and n +n +...+n of
1 2 3 k
another type. (Call the second type of label "blank"). The result
will be a list of partially labelled objects (along with their
reduced symmetry groups). Take each of the results and label the
"blank" parts with n labels of one kind, n of another, ... ,n of
2 3 k
another. It can be proved that the results will be a list of all
distinct solutions to the original problem. For example, to label
the decalin skeleton with 1 label "N", 1 label "S" and 8 labels "C",
one first labels with 1 "N" and 9 "blanks" obtaining the 3 results in
figure 7a. (Fig 7a1,7a2,7a3). There are now 3 new problems to
label The "blanks" if Figs 7a1-3 (under their respective reduced
symmetry), with 1 "S" and 8 "C"'s. In Figs 7a1 and 7a2, there is no
symmetry left, and so each of the "blanks" has its own orbit, thus
there are 9 distinct labellings apiece. In Fig. 7a3, there are 5
orbits under the reduced symmetry, and thus there are 5 additional
possiblities. (Fig 7b).
14
-------------------------------------------------------------
Figure 7
Labellings with 1 N, 1 S, and 8 C's.
A B C
⊗ ⊗ ⊗ ⊗
/ \ / \ / \ / \
N ⊗ ⊗ N ⊗ ⊗
7a1 | | | 7a1a | | |
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
\ / \ / \ / \ /
⊗ ⊗ ⊗ ⊗
N ⊗ N ⊗
/ \ / \ / \ / \
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
7a2 | | | 7a2 | | | this diagram
⊗ ⊗ ⊗ ⊗ ⊗ ⊗ is not complete
\ / \ / \ / \ /
⊗ ⊗ ⊗ ⊗
⊗ ⊗ ⊗ ⊗
/ \ / \ / \ / \
⊗ N ⊗ ⊗ N ⊗
7a3 | | | 7a3 | | |
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
\ / \ / \ / \ /
⊗ ⊗ ⊗ ⊗
-------------------------------------------------------------
␈↓αORBIT RECURSION␈↓β
There are 3 cases in the 1 N, 9 C on the decalin skeleton problem.
(case 1) 1 N goes to orbit {1,5,6,10}.
(case 2) 1 N goes to orbit {2,4,7,9},
(case 3) 1 N goes to orbit {3,8}.
There is only one possibility in each of these cases. Suppose there
were 3 N's. Then there would be 9 cases. (Table IV).
15
------------------------------------------------------------
Table IV.
(3 N's on a decalin)
# N's going to
case orbit orbit orbit
number {1,5,6,10} {2,4,7,9} {3,8}
1 3 - -
2 2 1 -
3 2 - 1
4 1 2 -
5 1 1 1
6 1 - 2
7 - 3 -
8 - 2 1
9 - 1 2
------------------------------------------------------------
In some of these cases there are more than one possibility (cases
2,3,4,5 and 8). However, every labelling fits into one of these
cases, and labellings from different cases cannot be equivalent.
Thus, each of these cases can be done independantly, and the results
collected together. To do any one of the cases, the labellings of the
orbits can be done sequentially.
Take case 5. First label one of {1,5,6,10} with one N, (and the rest
"blanks"). Since {1,5,6,10} is an orbit, we can chose one
arbitrarily and get Fig 8. (The first one is chosen).
16
------------------------------------------------------------
Figure 8
1 N to orbit {1,5,6,10}
⊗ ⊗
/ \ / \
/ \ / \
N ⊗ ⊗
| | |
| | |
⊗ ⊗ ⊗
\ / \ /
\ / \ /
⊗ ⊗
------------------------------------------------------------
Second, label {2,4,7,9} with 1 N (and 3 blanks). Note that {2,4,7,9}
is no longer an orbit under the reduced group. Stick to the original
plan-- it is just necessary to find ␈↓↓new␈↓β orbits under the reduced
group to label {2,4,7,9}. Since there is no symmetry left, each of
{2,4,7,9} falls into its own orbit. The "special case" can be used
directly to find the 4 solutions (Fig 9).
17
------------------------------------------------------------
Figure 9
Second step of case 5
N ⊗ ⊗ N
/ \ / \ / \ / \
N ⊗ ⊗ N ⊗ ⊗
9a | | | 9b | | |
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
\ / \ / \ / \ /
⊗ ⊗ ⊗ ⊗
⊗ ⊗ ⊗ ⊗
/ \ / \ / \ / \
N ⊗ ⊗ N ⊗ ⊗
9c | | | 9d | | |
⊗ ⊗ ⊗ ⊗ ⊗ ⊗
\ / \ / \ / \ /
⊗ N N ⊗
------------------------------------------------------------
Then, for each of these solutions, {3,8} must be labelled with 1 N
(and 1 blank). The final result is 8 possibilities for case 5, none
of which have any remaining symmetry.
␈↓αFINAL TECHNIQUE.␈↓β
Now the only problem to be solved is this: suppose there are two
types of labels, n of the first, and n of the second, neither n or
1 2 1
n are 1, and there is only one orbit. No more simple reductions
2
left. The solution, however, is another trick. Pick the first node
and label it, calculating the reduced symmetry group and new orbits.
18
Label the rest of the nodes (under the reduced group) with n -1
1
labels of type 1 and n labels of type 2. The result will contain a
2
representative of each equivalence class of labellings; however, if
n >2 then there may be some duplicates (i.e., two of the results may
1
actually be equivalent).
For example, the cyclohexane skeleton has a group of order twelve
(has twelve permutations), and there is only one orbit. To label it
with three N's, one labels node 1 with a N, calculates the reduced
group and new orbits; then finds the various cases for distributing
the remaining two N's among those orbits. (Table V.) Then for each
case, do each orbit sequentially. Cases 1, 3, 4 and 5 are fairly
straightforward; in case 2, first label {1,2} with 1 N. (Figure
9). Now the group reduces even further, and one gets the two results
depicted in Figure 9b. Note that cases 1 and 2a are equivalent as
well as 2b, 3, and 5. What to do? Fortunately, there is a good
way of throwing out the impostors without having to check each of the
results against each of the others for equivalency. If there is a
permutation π in the group, such that π(labelled set) > labelled set
then the labelled set is an impostor-- throw it out. Furthermore,
every impostor is detected this way. All that is necessary is that
when doing these "lower level" labellings, that one is careful to
19
pick the "first" element of each orbit to label and the "first orbit"
when there are a choice of orbits.
------------------------------------------------------------
Table V
cases for 2 N's on
{2,3,4,5,6} of cyclo-
hexane
case {2,6} {3,5} {4}
1 2 - -
2 1 1 -
3 1 - 1
4 - 2 -
5 - 1 1
------------------------------------------------------------
Fortunately, this technique is rarely necessary -- usually in the
course of a computation, the "special cases" catch almost everything.
For example, to label the decalin skeleton, it is never needed since
even when one is labelling say orbit {2,4,7,9}, there is either 1, 2,
n-1 or n-2 labels to be attached. For the cyclohexane skeleton, it
is only needed if there are 3 of one label and 3 of another (if there
were 3,2 and 1 of three respective label types, just do the singleton
first -- the group will then reduce and again this "FINAL TECHNIQUE"
will not be necessary. Only in cases where there is at least 6 fold
symmetry (i.e., every orbit has at least 6 elements) and there are at
least 3 of each of the label types is this technique required.
20
␈↓αC. SUMMARY OF LABELLING STEPS␈↓β
1) Calculate the group if not previously calculated
2) if more than 2 different node types, do the entire labelling
process with 1 of the label types, and the rest "blanks"; then
for each of the solutions obtained, label the "blanks" with
the remaining label types using the reduced symmetry group.
and collect the results.
3) otherwise,
a) find the orbits
b) if more than one orbit, then
1) find the different "cases" -- ways of distributing
the labels among the orbits
2) for each case,
a) label the first orbit
b) label the rest of the orbits using the reduced
symmetry group obtained from a).
and collect the results
c) otherwise (only 1 orbit and 2 label types)
1) if there is only one of one of the label types,
pick the "first" node and label it with that
label type. This is the only distinct possibility.
2) otherwise, if there are two of one of the label
types, label the first node with that label type,
calculate the reduced symmetry group and new orbits,
and from each of the new orbits, pick the "first"
node. The solutions consist of those possibilities
(one for each new orbit).
3) otherwise, (3 or more of each label type, and one orbit)
label the "first" node, calculate the reduced symmetry
group, label the rest of the nodes under the reduced
group, and for each of the results, check if for every
permutation π in the original group that
π(labelled set)≥labelled set
If this is not true of a labelled set, discard it
as a solution. The result is every such labelling
that satisfies this property.
21
␈↓αREFERENCES␈↓β
[1] De Bruijn, N. G., Generalization of Polya's Fundamental theorem
in Enumerative Combinatorial Analysis, Nedarl. Akad. Wentensh. Proc.
Ser. A, 62 (1959), 59-69; also Polya's Theory of Counting, in
Applied Combinatorial Mathematics, E. Beckenbach, ed, Wiley, New
York, 1964, pp 144-184.
[2] I. Ugi
[3] Previous DENDRAL papers
[4] Perlman?
22