Thursday, May 10, 2007

On Burnside's lemma and colorings of the cube

I’d like to discuss counting for my carnival post. Counting seems to be a trivial problem: it is the first math lesson anyone gets, because it is at the heart of what numbers are and how they connect with the real world. Surprisingly, this most basic use of mathematics is the most subtle, and there’s a whole branch of math that discusses not only counting but also other discrete (non-continuous) topics called combinatorics.

There are two main difficulties to counting, when you break it down: 1) distinguishing between distinct elements of a collection of things, and 2) enumerating them. This may seem trivial, yet both steps are crucial. There are simple examples of objects that are countable yet notoriously hard to distinguish (the set of unique graphs with N vertices up to isomorphism), and sets that have distinguishable elements yet cannot be enumerated (the set of unique real numbers). Mathematicians have numerous (yet countable :P) tools to deal with these issues. This article will discuss one of the most powerful in the toolkit.

I’ve been inspired by an earlier carnival post where there was a rather lengthy discussion of the permutations of the Rubik’s Cube, in which it is stated that there are 43,252,003,274,489,856,000 states of the cube. What I find more interesting than this ginormous number is the smaller number 12. That’s the number of different ‘kinds’ of cube, of which only one of these is actually solvable (if you ever had someone bust a cube in order to scramble it, you know what I mean).

Any time you ask how many ‘kinds’ of object there are, you invoke of my favorite results of combinatorics: Burnside’s lemma. Instead of answering the question “How many objects do I have?” Burnside’s lemma allows us to answer the question “How many kinds of object do I have?” I’ll attempt to break down for you why this lemma is important as simply as possible, how it is used, and how ubiquitous it is. While I won’t tackle the number 12 in this article (we’ll need Polya for that), I will tackle another interesting question that has puzzled me since I saw a beginner’s cube with only two colors in a toy store in New York: how many ‘kinds’ of solvable cube are there?

Let’s start with some history, because even using the name Burnside’s lemma is controversial. William Burnside published this lemma and its proof in 1897, but the technique existed before Burnside; Cauchy and Frobenius independently discovered this method on their own. Some mathematicians call it the orbit-counting theorem or the lemma that isn’t Burnside’s (as opposed to the many lemmas that were; Burnside was rather prolific in group theory). I’ll still use the term Burnside’s lemma, because that’s how I learned it, and in fact it really is a lemma, not a theorem. Heck, it’s even a corollary.

The problem domain we will be discussing will be the counting of finite objects, so the focus will be on the first half of counting discussed above: Distinction. It is a difficult problem because of the fact that two objects can be distinct and yet counted as two elements of the same ‘kind’ of object. There are two ways to look at this. One way is to make piles of items such that any two objects in a given pile are similar, and then just count the number of piles (I know that the Ace and Queen of Spades are the same suit, because if I change the A to a Q it’s still a spade, so let’s put them in the same pile and count all the piles later). The other way is to count the number of ways a pair of objects can be different, and see how many of those kinds are represented in the object set ( I know there are thirteen ranks in a deck of cards by scanning the deck and counting the different ranks I see as I come along them).

It’s interesting that when discussing distinction, we get into equivalence. This is in fact the pure genius behind the lemma that proves Burnside’s lemma: equivalence creates both 1) A collection of distinct piles of elements in the set called a partition and 2) a collection of symmetries for each object that each object seems immune to. The technical way of expressing this is to say that in the actions of a finite group upon a set, there is something like a one-to-one mapping between the orbits and the stabilizers. It’s called the orbit-stabilizer theorem and will make more sense later.

Let’s explain the terminology. Start with a set of objects that we want to count and call it X (for example the set of all possible states of 3x3x3 Rubik’s Cubes). There are ways of changing one object in X into another object in X. These are actions, literally injective functions from X to X. If these actions are reversible (no blowing up the Cube, drawing on stickers, or taking a hammer to it), then these injective functions are actually bijective, and by Cayley’s theorem, each bijective (reversible) action is actually a permutation of the elements of X. Therefore the set of all actions on X constitutes a finite group with composition as its operator.

This is where you ask the question “What is meant by ‘kind of’?” . If you take the cube and turn it around, is it the same cube? This depends on the type of question you’re trying to answer. If a cube is the same no matter its orientation, we must consider that a rotation of the cube is really a symmetry of the cube. The set of all symmetries is a subgroup of the set of all actions, and definition of this subgroup is crucial to defining the question. At this point, two objects in X are considered the ‘same kind of object’ if there is an action in this symmetry subgroup that transforms one object into the other. As a mental exercise, verify that ‘same kind of object’ is actually an equivalence relation: you’ll find that the laws of a group line up nicely with the properties of equivalence relations.

Call this symmetry subgroup G. It is important to know the relationship between X and G: X is the set of objects placed under the influence of actions in G, or G acts upon the elements of X. Take an action g in G and an object x in X. The action g turns object x into another object called gx, and x is considered to be the same kind of object as gx. There are three important concepts in this model to consider:

1) Take the set of all objects that are equivalent to x. This is called the orbit of x or Gx. The family of all orbits is called X/G.

2) Take the set of all actions upon an object that don’t do anything to that particular object. This is the set of stabilizers of x or Gx.

3) Take the set of objects such that a given action doesn’t change that object. This is the fixed set of g or Xg.

How are these related? If two objects have the same orbit, then there has to be some action turning one object into the other, meaning they are considered equivalent. If you take the set of all actions and apply each one to the same object, it generates all the elements of that orbit. In fact if you squint at this a little bit, you can imagine seeing that orbit of x, being driven from object to object in its tight circle by the whims of the actions of G. We discussed orbits before when talking about distinction. In fact, this partitioning of X into orbits is the disjoint family X/G, and the size of this family is the answer to our question.

So how do we count orbits? Recall that there are two points to distinguishing objects. Here’s the cool part: that stabilizer of that x whose orbit we were looking at? It’s a kernel of this mapping of G onto x, and the moment I say kernel, you should immediately be thinking of the quotient group formed by it. By calculating that quotient group G/Gx suddenly we have a collection of cosets. Each of these cosets is a collection of actions that will always translate x to a y and always that same y in that same orbit. This G/Gx becomes a way of saying that the elements of this orbit are different in the same way, and they match up one-to-one. This bijective mapping is the orbit-stabilizer theorem and it’s the lemma to Burnside’s lemma, because instead of counting the orbits, we can count the stabilizers instead and use the bijective mapping to get our answer.

So how do we count the stabilizers? Let’s look at the third concept above. Fixed sets are a really useful thing to look at: it’s relatively easy to count the size of a fixed set, because you’re looking at one symmetry at a time, and analyzing the effect of that symmetry. So how are fixed sets related to stabilizers? Well, fixed sets are to actions what stabilizers are to objects. If we made a big chart with actions g going down the left side, objects x written across the top, and a check mark everywhere gx=x, you could see that each row of checkmarks represents the fixed set of that action, and each column of check marks represents the stabilizer for that object. They’re two different ways of counting the same set of checkmarks:

gXg={(g,x) gx=x} = ∑xGx

Now we are getting somewhere. Instead of counting the stabilizers we’re counting the sum of the sizes of all the stabilizers. What’s the size of a stabilizer? This is where that orbit-stabilizer theorem comes in. Remember that the stabilizer is a kernel of that big mapping that finds orbits, and that the cosets of that kernel are representatives of each element in the orbit. Now use Lagrange’s theorem: the number of cosets times the size of any one coset (they’re all the same size) is the size of the group. The number of cosets is just the size of the order (by orbit-stabilizer theorem), so divide:

gXg={(g,x) gx=x} = ∑xGx=∑xG/Gx

Pull G out of the summation, because it’s a constant. But now it looks like we’re adding up a bunch of fractions, the sum of the inverses of the size of each orbit. How is that supposed to be a whole number? Remember that if two objects are similar, they are part of the same orbit. Moreover, no two orbits can intersect. Why? Because if x in one orbit intersected with z in another orbit, there would be an object y in both orbits such that gx=y and hy=z thus (h*g)x=z and h*g would prove that x and z are in the same orbit. As discussed above, orbits partition the object space into distinct piles. So if there are 5 elements in an orbit, then that orbit will be counted 5 times for 1/5 each time, and volia we’re back to 1. So, this summation is really the count of the number of orbits or X/G, the very number we’re looking for:

gXg={(g,x) gx=x} = ∑xGx=∑xG/Gx
=GX/G

Divide by G on both sides and we have Burnside’s lemma in the flesh:

X/G=(1/G)∑Xg

Or: The number of distinct orbits (read kinds of object) is equal to the average size of the fixed sets.

Now that we know why it’s true, how do we use it? A simple example should help here: say you have a strip of 5 red and green squares in a row. How many kinds of strips are there? Start with X: There are 5 choices of red or green or 25=32 different strips. But that over-counts: RGRGG and GGRGR are actually the same strip, just one is the other flipped over. The action to factor over is “flipping over the strip.”. Therefore the two actions in our symmetry group are e, which is leaving the strip alone, and f, which flips the strip over. By Burnside’s lemma:

X/G=(1/2)*(Ge+Gf)

Ge is the easier one to calculate: since nothing happens, everything is fixed by it, therefore 32. Gf is fixed if the first and fifth squares match, and the second and fourth squares match. Choosing the first, second, and third squares are sufficient to choose the pattern, so there are 23 or 8 such patterns. Therefore there are (1/2)(32+8) or 20 different patterns.

Burnside’s lemma gives us a striking derivation of the Mississippi formula, counting the number of permutations of a set of symbols with duplicates. The formula is P=(n!)/∏(ri), where n is the number of symbols overall and ri is the number of duplicates of symbol i. From the perspective of group theory, let X be all the different permutations of those symbols with duplicates numbered. There are X! of those permutations, and they are the objects. Let ai,P be the action of rotating the duplicate i symbols using permutation P. For each symbol i there are ri! permutations, and the action group is the direct product of those permutation subgroups, a group with ∏ri! actions. Now, here’s the trick: only the identity action will fix any objects and there are n! of them. Therefore X/G=(1/G)∑Xg=(1/∏ri)(n!+0+0+0+0+…)=n!/∏ri!. Many times when using Burnside’s lemma, you have to set up the object set in such a way that the division by the action set is clean, like in the example above. When you do this, a lot of the problem’s contours come into sharp relief. In the example above, by numbering the duplicates, you first reduce the problem in a way that makes it easier to divide by the duplicates later.

By now you recognize the pattern in solving a “how many kinds of” Burnside’s lemma type of question:

1) Determine the population of your object set X

2) Calculate the permutations present in the symmetry you wish to factor out

3) Calculate all Xg for each permutation

4) Divide by the size of the action group to calculate the number of orbits

Now with Burnside’s lemma in grasp, we’ll tackle the question we started with… how many solvable cubes could they sell? First we know that there could be at most 6 colors (any more than that and one side would have to have two colors, violating solvability). Let’s assume that manufacturers will only retool the machines they already have, so that they can only produce the six colors on a normal cube, or red, yellow, blue, white, green and orange. Let’s label the sides 1-6. Then there are 66 or 46656 different combinations of the cube. This is the population of our object set X.

Next, we calculate the actions on the faces of a cube. The easy way to count this is that any automorphism in R3 is a rotation around some axis, a flip across some plane, or a combination of the two. The nice thing for us is that we don’t have to worry about the flipping part, unless some of the audience is 4-dimensional. Therefore, it’s sufficient to analyze rotations. Furthermore, this line has to pass through the center of the cube. This limits us to three kinds of points that the rotation can happen: through a corner, the midpoint of an edge, or the centroid of a face. There are 3 pairs of faces that rotate three ways, 6 pairs of edges that rotate one way, and 4 pairs of corners that rotate two ways. That and the identity rotation are our (1+3*3+6*1+4*2)=24 rotations.

That’s still a lot of rotations. To simplify the rotation, we can group the rotations based upon their actions on the faces. For example the face rotation (1234) will do a similar action as the similar face action (1536) even though they are technically different rotations. When rearranged, we have 5 types of rotation: the identity, a 90-degree face rotation (abcd), a 180-degree face rotation(ac)(bd), an edge rotation(ab)(cd)(ef), and a corner rotation (abc)(def). Each cycle fixes to a single color, which makes this representation very fortunate.
Now we are ready to calculate the Xg. The identity obviously has 66 (everything fixes with the identity). The 90-face will only have 3 colors because the 4-cycle can only have one color so 63. The 180 face has 4 cycles (2 2-cycles and 2 fixed points) so 64. By similar logic the corner rotation has 62 and edge rotation has 63. We now have all the pieces to calculate X/G:

X/G=(1/G)∑(Xg)=(1/24)( 66+6*63+3*64+8*62+6*63)=2224.

So, let’s say that we want to figure out the answer for 7 colors instead of 6. Do we have to do all that work again? Nope. The only time 6 colors was used was in calculating the size of the fix points, and it was always raised to a constant power. We could have just as easily calculated for 2,9, or even k colors. For k colors the number of cubes is (1/24)( k6+6k3+3k4+8k2+6k3), and if you see enough of these you can recognize a particular symmetry by this formula. The special name for this form is called a cycle index.

Which brings me to my final question... why is Burnside’s lemma a lemma? A lemma is a step in a proof that leads to another result but isn’t really significant in its own right. The reason is because of these cycle indices. Polya took the idea of a cycle index a lot further by recognizing something rather brilliant: the cycle index is really a generating function for the counts of each of the rotations! By separating the cycles out instead of multiplying them all together, automating the process of producing the index, and coming up with a way to throw a weight function into the generating function, the Polya Enumeration Theorem was born which allows us to count more complicated symmetries and represent them in a much more elegant fashion. This is really what Burnside’s lemma is a lemma to, and it will be the topic of my next carnival post.

No comments: