Haskell for Maths

Sunday, 10 June 2012

CHAs V: More Hopf Algebra morphisms

Last time we looked at the descending tree morphism between the combinatorial Hopf algebras SSym and YSym with fundamental bases consisting of (indexed by) permutations and binary trees respectively. We previously also looked at a Hopf algebra QSym with a basis consisting of compositions.

There are also morphisms between SSym/YSym and QSym. However, before we look at these, we need to look at an alternative basis for QSym.

When I introduced QSym, I defined a type QSymM for the basis, without explaining what the M stands for. It actually stands for "monomial" (but I'm not going to explain why quite yet). Now, of course it is possible to construct any number of alternative bases for QSym, by taking linear combinations of the QSymM basis elements. However, most of these alternative bases are not likely to be very mathematically useful. (By mathematically useful, I mean, for example, that it leads to a simple expression for the multiplication rule.) When looking at the relation between QSym and SSym/YSym, there is another basis for QSym that leads to a clearer picture of their relationship, called the fundamental basis.

We will represent the fundamental basis by a new type:

newtype QSymF = QSymF [Int] deriving (Eq)

instance Ord QSymF where
    compare (QSymF xs) (QSymF ys) = compare (sum xs, xs) (sum ys, ys)

instance Show QSymF where
    show (QSymF xs) = "F " ++ show xs

qsymF :: [Int] -> Vect Q QSymF
qsymF xs | all (>0) xs = return (QSymF xs)
         | otherwise = error "qsymF: not a composition"

In a moment, I'll describe the relationship between the monomial and fundamental bases, but first, there's something I need to explain.

If the monomial and fundamental bases are bases for the same Hopf algebra (QSym), how can they be different types? So I think what it comes down to is that if we have different types then we get to have different Show instances. So we will be able to choose whether to view an element of QSym in terms of the monomial or the fundamental basis.

We could have achieved this in other ways, say by designating the monomial basis as the "true" basis, and then providing functions to input and output using the fundamental basis. Giving the fundamental basis its own type is more egalitarian: it puts the two bases on an equal footing.

Okay then, so in order to make this all work, we need to define the relationship between the two bases, and provide functions to convert between them. Let's take a look.

A (proper) refinement of a composition is any composition which can be obtained from the first composition by splitting one or more of the parts of the first composition.

refinements (x:xs) = [y++ys | y <- compositions x, ys <- refinements xs]
refinements [] = [[]]

> refinements [1,3]

Then the fundamental basis can be expressed in terms of the monomial basis, as follows:

qsymFtoM :: (Eq k, Num k) => Vect k QSymF -> Vect k QSymM
qsymFtoM = linear qsymFtoM' where
    qsymFtoM' (QSymF alpha) = sumv [return (QSymM beta) | beta <- refinements alpha]

For example:

> qsymFtoM (qsymF [1,3])
M [1,1,1,1]+M [1,1,2]+M [1,2,1]+M [1,3]

Conversely, elements of the monomial basis can be expressed as sums of elements of the fundamental basis, as follows:

qsymMtoF :: (Eq k, Num k) => Vect k QSymM -> Vect k QSymF
qsymMtoF = linear qsymMtoF' where
    qsymMtoF' (QSymM alpha) = sumv [(-1) ^ (length beta - length alpha) *> return (QSymF beta) | beta <- refinements alpha]

> qsymMtoF (qsymM [1,3])
F [1,1,1,1]-F [1,1,2]-F [1,2,1]+F [1,3]

So we can input elements of QSym using either the monomial or fundamental basis (using the qsymM and qsymF constructors). Shortly, we'll define Algebra and Coalgebra instances for QSymF, so that we can perform arithmetic in either basis. Finally, we can output in either basis, by using the conversion functions if necessary.

How do we know that QSymF is a basis? How do we know that its elements are linearly independent, and span the space? In Vect Q QSymF, this is guaranteed by the free vector space construction. But what the question is really asking is, how do we know that the image of the "basis" QSymF in Vect Q QSymM (via qsymFtoM) is a basis?

Well, it will be linearly independent if qsymFtoM is injective, and spanning if qsymFtoM is surjective. So we require that qsymFtoM is bijective. This follows if we can show that qsymFtoM and qsymMtoF are mutual inverses. Well, quickCheck seems to think so:

> quickCheck (\x -> x == (qsymMtoF . qsymFtoM) x)
+++ OK, passed 100 tests.
> quickCheck (\x -> x == (qsymFtoM . qsymMtoF) x)
+++ OK, passed 100 tests.

(For the cognoscenti: The reason this works is that qsymMtoF' is the Mobius inversion of qsymFtoM' in the poset of compositions ordered by refinement.)

Okay, so we have an alternative basis for QSym as a vector space. What do the multiplication and comultiplication look like relative to this new basis? Now, it is possible to define the algebra, coalgebra and Hopf algebra structures explicitly in terms of the QSymF basis, but I'm going to cheat, and just round-trip via QSymM:

instance (Eq k, Num k) => Algebra k QSymF where
    unit x = x *> return (QSymF [])
    mult = qsymMtoF . mult . (qsymFtoM `tf` qsymFtoM)

instance (Eq k, Num k) => Coalgebra k QSymF where
    counit = unwrap . linear counit' where counit' (QSymF xs) = if null xs then 1 else 0
    comult = (qsymMtoF `tf` qsymMtoF) . comult . qsymFtoM

instance (Eq k, Num k) => Bialgebra k QSymF where {}

instance (Eq k, Num k) => HopfAlgebra k QSymF where
    antipode = qsymMtoF . antipode . qsymFtoM

(Recall that `tf` is the tensor product of linear maps.)

It's kind of obvious from the definitions that the algebra, coalgebra and Hopf algebra laws will be satisfied. (It's obvious because we already know that these laws are satisfied in Vect Q QSymM, and the definitions for Vect Q QSymF are just the same, but under the change of basis.) However, for additional confidence, we can for example:

> quickCheck (prop_Algebra :: (Q, Vect Q QSymF, Vect Q QSymF, Vect Q QSymF) -> Bool)
+++ OK, passed 100 tests.

Okay, so the reason for introducing the fundamental basis for QSym is that there is a Hopf algebra morphism from SSym to QSym, which is easiest to express in terms of their respective fundamental bases. Specifically, we can define a map between the bases, SSymF -> QSymF, which lifts (using fmap, ie using the free vector space functor) to a map between the Hopf algebras.

Given a permutation p of [1..n], a descent is an index i such that p(i) > p(i+1). For example, the permutation [2,3,5,1,6,4] has descents from the 5 to the 1 and from the 6 to the 4. We can think of the descents as splitting the permutation into segments, each of which is strictly ascending. Thus 235164 splits into 235-16-4. If we count the lengths of these segments, we get a composition, which I call the descent composition. Here's the code:

descentComposition [] = []
descentComposition xs = descComp 0 xs where
    descComp c (x1:x2:xs) = if x1 < x2 then descComp (c+1) (x2:xs) else (c+1) : descComp 0 (x2:xs)
    descComp c [x] = [c+1]

> descentComposition [2,3,5,1,6,4]

We can lift this map between the bases to a map between the Hopf algebras.

descentMap :: (Eq k, Num k) => Vect k SSymF -> Vect k QSymF
descentMap = nf . fmap (\(SSymF xs) -> QSymF (descentComposition xs))

Now, it turns out that this is a Hopf algebra morphism. That is, it commutes with the algebra, coalgebra and Hopf algebra structures.

> quickCheck (prop_AlgebraMorphism descentMap)
+++ OK, passed 100 tests.
> quickCheck (prop_CoalgebraMorphism descentMap)
+++ OK, passed 100 tests.
> quickCheck (prop_HopfAlgebraMorphism descentMap)
+++ OK, passed 100 tests.

Why does this work? Well, let's work through an example, for comultiplication. (In the following I omit brackets and commas for brevity.) If we do the descent map before the comultiplication, we get:

2341 (SSymF)
-> (descentMap)
31 (QSymF)
-> (qsymFtoM - sum of refinements)
31+211+121+1111 (QSymM)
-> (comult - deconcatenations)
[]⊗31 + 3⊗1 + 31⊗[] +
[]⊗211 + 2⊗11 + 21⊗1 + 211⊗[] +
[]⊗121 + 1⊗21 + 12⊗1 + 121⊗[] +
[]⊗1111 + 1⊗111 + 11⊗11 + 111⊗1 + 1111⊗[]

(We convert to QSymM at the second step because it's in QSymM that we know how to comultiply. It is possible to give an explicit expression for the comultiplication in terms of the QSymF basis, but I wanted to keep things simple.)

Conversely, if we do the comultiplication before the descent map:

2341 (SSymF)
-> (comult - flattened deconcatenations)
[]⊗2341 + 1⊗231 + 12⊗21 + 123⊗1 + 2341⊗[] (SSymF⊗SSymF)
-> (descentMap⊗descentMap)
[]⊗31 + 1⊗21 + 2⊗11 + 3⊗1 + 31⊗[] (QSymF⊗QSymF)
-> (qsymFtoM⊗qsymFtoM - sum of refinements)
[]⊗(31+211+121+1111) +
1⊗(21+111) +
(2+11)⊗11 +
(3+21+12+111)⊗1 +

The result comes out the same, whichever way round you go, as required. But why does it work? Well, you can imagine the inventor going through the following thought process:

  • Comult in SSymF is by flattened deconcatenations (of permutations), and in QSymM is by deconcatenations (of compositions). If we split a permutation at a descent, the descents on either side are preserved. So we could try sending a permutation in SSymF to its descent composition in QSymM. For example, ssymF [2,3,4,1] -> qsymM [3,1], which deconcatenates to [2,3,4]⊗[1] -> [3]⊗[1].
  • However, a deconcatenation in SSymF might split a permutation partway through an ascending segment. For example, [2,3,4,1] -> [2,3]⊗[4,1] (which flattens to [1,2]⊗[2,1]). Taking this to descent compositions would give [2]⊗[1,1]. This is not a deconcatenation of qsymM [3,1] - it is however a deconcatenation of [2,1,1], which is a one-step refinement of [3,1].
  • So we could try sending a permutation in SSymF to its descent composition in QSymM, and its one-step refinements. For example, ssymF [2,3,4,1] -> qsymM [3,1] + qsymM [2,1,1] + qsymM [1,2,1].
  • But now that means that [2,3]⊗[4,1] (flattening omitted for clarity) -> [2]⊗[1,1] + [1,1]⊗[1,1]. The second term is a deconcatenation of [1,1,1,1], a two-step refinement of [3,1].
  • It's pretty obvious that the way to make it all work out is to send a permutation in SSymF to its descent composition in QSymM, and all its proper refinements.
  • But this sum, of a composition and all its refinements (in QSymM) is just exactly how we defined the QSymF basis.

Exercise: Explain why descentMap commutes with mult.

Exercise: Last time we looked at a descendingTreeMap : SSym -> YSym. Show that the descentMap : SSym -> QSym factors through the descendingTreeMap, and describe the other factor f : YSym -> QSym.

Monday, 30 April 2012

CHAs IV: Hopf Algebra Morphisms

In the last few posts, we've been looking at combinatorial Hopf algebras, specifically:
- SSym, a Hopf algebra with a basis of (indexed by) permutations
- YSym, with a basis of binary trees
- QSym, with a basis of compositions

It turns out that these three Hopf algebras are quite closely related (and indeed, there are a few others in the same "family"). Let's start with SSym and YSym.

Given a permutation p of [1..n], we can construct a binary tree called the descending tree as follows:
- Split the permutation as p = ls ++ [n] ++ rs
- Place n at the root of the tree, and recursively place the descending trees of ls and rs as the left and right children of the root
- To bottom out the recursion, the descending tree of the empty permutation is of course the empty tree

For example, the following diagram shows the descending tree of [3,5,1,4,2].

Here's the Haskell code:

descendingTree [] = E
descendingTree [x] = T E x E
descendingTree xs = T l x r
    where x = maximum xs
          (ls,_:rs) = L.break (== x) xs
          l = descendingTree ls
          r = descendingTree rs

> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra
> descendingTree [3,5,1,4,2]
T (T E 3 E) 5 (T (T E 1 E) 4 (T E 2 E))

Now you'll recall that for the Hopf algebra YSym, although we sometimes carry around the node labels to help us see what is going on, we're really only interested in the shapes of the trees. So here's a function to erase the labels:

shape :: PBT a -> PBT ()
shape t = fmap (\_ -> ()) t

> shape $ T (T E 3 E) 5 (T (T E 1 E) 4 (T E 2 E))
T (T E () E) () (T (T E () E) () (T E () E))

Thus (shape . descendingTree) is a function from permutations to unlabelled trees. We can consider this as a map between the fundamental bases of SSym and YSym, which therefore induces a linear map between the corresponding Hopf algebras:

descendingTreeMap :: (Eq k, Num k) => Vect k SSymF -> Vect k (YSymF ())
descendingTreeMap = nf . fmap (\(SSymF xs) -> (YSymF . shape .  descendingTree) xs)

Now it turns out that this map is in fact a Hopf algebra morphism. What does that mean? Basically it means that descendingTreeMap plays nicely ("commutes") with the unit, mult, counit, comult, and antipode maps in the two Hopf algebras.

For example, for an algebra morphism, we require:
f (x1*x2) = f x1 * f x2
f . unit = unit

It's not immediately clear why descendingTreeMap should have these properties. The unit property is clear:
> descendingTreeMap 1 == 1
or put another way
> descendingTreeMap (ssymF []) == ysymF E

But what about the mult property?

Recall that in SSymF, we multiply permutations by shifting the right operand, and then summing all possible shuffles of the two lists:

> ssymF [3,2,1] * ssymF [1,2]
F [3,2,1,4,5]+F [3,2,4,1,5]+F [3,2,4,5,1]+F [3,4,2,1,5]+F [3,4,2,5,1]+F [3,4,5,2,1]+F [4,3,2,1,5]+F [4,3,2,5,1]+F [4,3,5,2,1]+F [4,5,3,2,1]

On the other hand, in YSymF, we multiply by multi-splitting the left operand, and then grafting the parts onto the leafs of the right operand, in all possible ways. The following diagram shows one possible multi-split-grafting, corresponding to one of the summands in [3,5,1,4,2] * [1,2]:

The numbers along the tops of the trees are the node labels generated by the descending tree construction. We can see from these that there is an exact correspondence between a shifted shuffle in SSymF and a multi-split grafting in YSymF. The asymmetry in the mult for YSymF, where we graft multi-splits of the left operand onto the right operand, corresponds to the asymmetry in the mult for SSymF, where we shift the right operand. This shifting in SSymF ensures that the nodes for the right operand end up at the root of the descending tree, as required by the grafting in YSymF. When we defined shuffling, we gave a recursive definition, but it's fairly clear that the grafting of the parts of the multi-split onto the right tree is accomplishing the same thing.

Similarly, a coalgebra morphism is a linear map which commutes with counit and comult:

In SSymF, counit is 1 on the empty permutation, 0 on anything else. In YSymF, counit is 1 on the empty tree, 0 on anything else. The descendingTreeMap sends the empty permutation to the empty tree, so it's clear that it commutes with counit in the required way.

What about comult? In SSymF, the comult of a permutation is the sum of its two-part deconcatenations (with each part flattened back to a permutation).

> comult $ ssymF [3,5,1,4,2]
(F [],F [3,5,1,4,2])+(F [1],F [4,1,3,2])+(F [1,2],F [1,3,2])+(F [2,3,1],F [2,1])+(F [2,4,1,3],F [1])+(F [3,5,1,4,2],F [])

In YSymF, the comult of a tree is the sum of its two-part splits. Now it's clear that flattening makes no difference to the descending tree. Then, it's also clear that if you take descending trees of the two parts of a deconcatenation, this corresponds to a split of the descending tree of the whole.

Finally, a Hopf algebra morphism is a bialgebra morphism which in addition commutes with antipode.

In SSymF and YSymF, we didn't give an explicit expression for the antipode, but instead derived it from mult and comult using the fact that they are both graded connected bialgebras. So it's actually kind of obvious that descendingTreeMap will be a Hopf algebra morphism, but just to check:

prop_HopfAlgebraMorphism f x = (f . antipode) x == (antipode . f) x

> quickCheck (prop_HopfAlgebraMorphism descendingTreeMap)
+++ OK, passed 100 tests.

Given any tree, there are many ways to label the nodes so that the tree is descending (ie such that the label on each child node is less than the label on its parent). For example, we could first label all the leaf nodes [1..k], and then all their immediate parents [k+1..], and so on. (For our example tree, this would lead to the alternative labelling [1,5,2,4,3].)

This shows that the descending tree map is surjective, but not injective. Hence YSym is a quotient of SSym.

Monday, 23 April 2012

CHAs III: QSym, a combinatorial Hopf algebra on compositions

The compositions of a number n are the different ways that it can be expressed as an ordered sum of positive integers. For example, the compositions of 4 are 1+1+1+1, 1+1+2, 1+2+1, 2+1+1, 1+3, 2+2, 3+1, 4. Equivalently, we can forget about the plus signs, and just consider the ordered lists of positive integers that sum to n. Here's the Haskell code:

compositions 0 = [[]]
compositions n = [i:is | i <- [1..n], is <- compositions (n-i)]

> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra
> compositions 4

We will define a CHA QSym whose basis elements are (indexed by) compositions:

newtype QSymM = QSymM [Int] deriving (Eq)

instance Ord QSymM where
    compare (QSymM xs) (QSymM ys) = compare (sum xs, xs) (sum ys, ys)

instance Show QSymM where
    show (QSymM xs) = "M " ++ show xs

qsymM :: [Int] -> Vect Q QSymM
qsymM xs | all (>0) xs = return (QSymM xs)
         | otherwise = error "qsymM: not a composition"

We will use QSymM as the basis for a Hopf algebra, indexed by compositions. (In practice, we're going to stick with smallish compositions, as the calculations would take too long otherwise.) We form the free vector space over this basis. An element of the free vector space is a linear combination of compositions. For example:

> qsymM [1,2] + 2 * qsymM [3,1]
M [1,2]+2M [3,1]

The algebra structure on QSymM is similar to the algebra structure we saw last time on SSymM. Instead of shifted shuffles, we will use overlapping shuffles or quasi-shuffles. This means that when shuffling (x:xs) and (y:ys), then instead of just choosing between taking x first or taking y first, we also have a third choice to take them both and add them together.

quasiShuffles (x:xs) (y:ys) = map (x:) (quasiShuffles xs (y:ys)) ++
                              map ((x+y):) (quasiShuffles xs ys) ++
                              map (y:) (quasiShuffles (x:xs) ys)
quasiShuffles xs [] = [xs]
quasiShuffles [] ys = [ys]

For example:

> quasiShuffles [1,2] [3]

For our algebra structure, we say that the product of two compositions is the sum of all quasi-shuffles of the compositions:

instance (Eq k, Num k) => Algebra k QSymM where
    unit x = x *> return (QSymM [])
    mult = linear mult' where
        mult' (QSymM alpha, QSymM beta) = sumv [return (QSymM gamma) | gamma <- quasiShuffles alpha beta]

It's fairly obvious that this is associative and satisfies the algebra requirements. (And there are quickCheck tests in the package to confirm it.)

The coalgebra structure on QSymM is also similar to the coalgebra structure on SSymM. (We'll see in due course that SSym and QSym are closely related.) For the comultiplication of a composition, we'll just use the sum of the deconcatenations of the composition (without the flattening that we did with SSymM):

instance (Eq k, Num k) => Coalgebra k QSymM where
    counit = unwrap . linear counit' where counit' (QSymM alpha) = if null alpha then 1 else 0
    comult = linear comult' where
        comult' (QSymM gamma) = sumv [return (QSymM alpha, QSymM beta) | (alpha,beta) <- deconcatenations gamma]

For example:

> comult $ qsymM [1,2,3]
(M [],M [1,2,3])+(M [1],M [2,3])+(M [1,2],M [3])+(M [1,2,3],M [])

(Recall that (x,y) should be read as x⊗y.)

This comultiplication, along with those we have seen for other combinatorial Hopf algebras, is obviously coassociative - but perhaps I should spell out what that means. Coassociativity says that:
(comult⊗id) . comult = (id⊗comult) . comult

In other words, if you split a composition in two, and then split the left part in two - in all possible ways - then you get the same sum of possibilities as if you had done the same but splitting the right part in two. This is obvious, because it's just the sum of possible ways of splitting the composition in three (modulo associativity).

Then for a coalgebra we also require:
(id⊗counit) . comult = id = (counit⊗id) . comult

But this is clear. For example:
((id⊗counit) . comult) [1,2,3] =
(id⊗counit) ([]⊗[1,2,3] + [1]⊗[2,3] + [1,2]⊗[3] + [1,2,3]⊗[]) =
0 + 0 + 0 + [1,2,3]
(modulo the isomorphism C⊗k = C)

Then, as we did previously, we can quickCheck that the algebra and coalgebra structures are compatible, and hence that we have a bialgebra. (See the detailed explanation last time: mainly it comes down to the fact that deconcatenating quasi-shuffles is the same as quasi-shuffling deconcatenations.)

instance (Eq k, Num k) => Bialgebra k QSymM where {}

As before, this is a connected graded bialgebra in an obvious way (using the sum of the composition as the grading), and hence it automatically has an antipode. In this case we can give an explicit expression for the antipode. The coarsenings of a composition are the compositions which are "more coarse" than the given composition, in the sense that they can be obtained by combining two or more adjacent parts of the given composition.

coarsenings (x1:x2:xs) = map (x1:) (coarsenings (x2:xs)) ++ coarsenings ((x1+x2):xs)
coarsenings xs = [xs] -- for xs a singleton or null

For example:

> coarsenings [1,2,3]

Then the antipode can be defined as follows:

instance (Eq k, Num k) => HopfAlgebra k QSymM where
    antipode = linear antipode' where
        antipode' (QSymM alpha) = (-1)^length alpha * sumv [return (QSymM beta) | beta <- coarsenings (reverse alpha)]

Why does this work? Remember that - in the case of a combinatorial Hopf algebra, where counit picks out the empty structure - antipode has to perform the disappearing trick of making everything cancel out.

That is, we require that
mult . (id⊗antipode) . comult = unit . counit
where the right hand side is zero for any non-empty composition.

Now, in order to try to figure out how it works, I decided to go through an example. However, in the cold light of day I have to admit that perhaps this is one of those times when maths is not a spectator sport. So the rest of this blog post is "optional".

Ok, so let's look at an example:


-> (comult = deconcatenations)

[]⊗[1,2,3] + [1]⊗[2,3] + [1,2]⊗[3] + [1,2,3]⊗[]

-> (id⊗antipode = reversal coarsening of right hand side, with alternating signs)

- []⊗([3,2,1] + [5,1] + [3,3] + [6])
+ [1]⊗([3,2] + [5])
- [1,2]⊗[3]
+ [1,2,3]⊗[]

-> (mult = quasi-shuffles)

- [3,2,1] - [5,1] - [3,3] - [6]
+ [1,3,2] + [4,2] + [3,1,2] + [3,3] + [3,2,1] + [1,5] + [6] + [5,1]
- [1,2,3] - [1,5] - [1,3,2] - [4,2] - [3,1,2]
+ [1,2,3]

= 0

(The way I like to think of this is:
- comult splits the atom into a superposition of fragment states
- antipode turns the right hand part of each fragment pair into a superposition of anti-matter states
- mult brings the matter and anti-matter parts together, causing them to annihilate
However, I think that is just fanciful - there's no flash of light - so I'll say no more.)

So why does it work? Well, in the final sum, the terms have to cancel out in pairs. If we look at some of the terms, it appears that there are three possibilities:

Case 1:

Consider the case of [4,2]. It arises in two ways in the final sum:
123 -> 1,23 -> 1,32 -> 42
123 -> 12,3 -> -12,3 -> -42

In both cases, the 1 and the 3 are combined during the quasi-shuffle phase to make a 4. In order to be combined in this way, they have to end up on opposite sides of the split during the deconcatenation phase. Because there is a 2 separating them, there are two ways this can happen, with the 2 ending up on either the left or right hand side of the split. And then the alternating signs ensure that the two outcomes cancel out at the end.

(In this case there was just one part separating them, but the same thing happens if there are more. For example:
1234 -> 1,234 -> -1,432 -> -532 + other terms
1234 -> 12,34 -> 12,43 -> 532 + 523
1234 -> 123,4 -> -123,4 -> -523)

Case 2:

Consider the case of [3,3]. This arises in two ways in the final sum:
123 -> [],123 -> -[],33 -> -33
123 -> 1,23 -> 1,32 -> 33

In this case it is the 1 and the 2 that combine. Unlike the 42 case, in this case the parts that combine are adjacent to one another at the beginning. In the top interaction, they combine during the reverse coarsening phase, in the bottom interaction, they combine during the quasi-shuffle phase. If x and y are adjacent, then the only way they can combine during the quasi-shuffle phase is if the split happens between them during the deconcatenation phase. This will always cancel with the term we get by splitting just before both x and y, and then combining them during coarsening.

wxyz -> w,xyz -> +/- w,z(y+x) -> {wz}(y+x)
wxyz -> wx,yz -> -/+ wx,zy -> {wz}(x+y)

Case 3:

Finally, it may happen that x and y don't combine. This can happen in two different ways, depending whether x and y are adjacentparts or not.

If they're not adjacent, then we get a failed case 1. For example, the [3,1,2] and [1,3,2] terms arise when the 1 and 3 fail to combine into a 4:
123 -> 1,23 -> 1,32 -> 132 + 312
123 -> 12,3 -> -12,3 -> -132-312

If they're adjacent, we get a failed case 2. For example, the [3,2,1] terms arise when the 1 and 2 fail to combine into a 3:
123 -> [],123 -> -[],321 -> -321
123 -> 1,23 -> 1,32 -> 321

This isn't quite a proof, of course. In a longer composition, there might be an opportunity for more than one of these cases to happen at the same time, and we need to show how that all works out. Also, I'm not sure that the [1,2,3] term is either a failed case 1 or a failed case 2. I hope at least though that this analysis has shed some light on why it works. (Exercise: Complete the proof.)

Sunday, 25 March 2012

CHAs II: The Hopf Algebra SSym of Permutations

Last time we looked at YSym, the (dual of the) Loday-Ronco Hopf algebra of binary trees. This time I want to look at SSym, the Malvenuto-Reutenauer Hopf algebra of permutations. (In due course we'll see that YSym and SSym are quite closely related.)

The fundamental basis for SSym is (indexed by) the permutations of [1..n] for n <- [0..]. As usual, we work in the free vector space over this basis. Hence a typical element of SSym might be something like [1,3,2] + 2 [4,1,3,2]. Here's some code:

newtype SSymF = SSymF [Int] deriving (Eq)

instance Ord SSymF where
    compare (SSymF xs) (SSymF ys) = compare (length xs, xs) (length ys, ys)

instance Show SSymF where
    show (SSymF xs) = "F " ++ show xs

ssymF :: [Int] -> Vect Q SSymF
ssymF xs | L.sort xs == [1..n] = return (SSymF xs)
         | otherwise = error "Not a permutation of [1..n]"
         where n = length xs

(The "F" in SSymF stands for fundamental basis: we may have cause to look at other bases in due course.)

Let's try it out:

$ cabal update
$ cabal install HaskellForMaths
$ ghci
> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra
> ssymF [1,3,2] + 2 * ssymF [4,1,3,2]
F [1,3,2]+2F [4,1,3,2]

Ok, so how can we define an algebra structure on this basis? How can we multiply permutations? (We want to consider permutations as combinatorial rather than algebraic objects here. So no, the answer isn't the group algebra.) One possibility would be to use the following shifted concatenation operation:

shiftedConcat (SSymF xs) (SSymF ys) = let k = length xs in SSymF (xs ++ map (+k) ys)

For example:

> shiftedConcat (SSymF [1,2]) (SSymF [2,1,3])
F [1,2,4,3,5]

This has the required properties. It's associative:
> quickCheck (\x y z -> shiftedConcat (shiftedConcat x y) z == shiftedConcat x (shiftedConcat y z))
+++ OK, passed 100 tests.

And it's pretty obvious that the empty permutation, SSymF [], is a left and right identity. Hence we could form a monoid algebra using this operation.

However, for the Hopf algebra we're going to look at, we will use a slightly more complicated multiplication. We will retain the idea of shifting the second permutation, so that the two lists are disjoint. However, instead of just concatenating them, we will form the sum of all possible "shuffles" of the two lists. Here's the shuffle code:

shuffles (x:xs) (y:ys) = map (x:) (shuffles xs (y:ys)) ++ map (y:) (shuffles (x:xs) ys)
shuffles xs [] = [xs]
shuffles [] ys = [ys]

So shuffles takes two input "decks of cards", and it outputs all possible ways that they can be shuffled together, while preserving the order between cards from the same deck. For example:

> shuffles [1,2] [3,4,5]

Notice how in each of the output shuffles, we have 1 before 2, and 3 before 4 before 5. This enables us to define an algebra structure on permutations as follows:

instance (Eq k, Num k) => Algebra k SSymF where
    unit x = x *> return (SSymF [])
    mult = linear mult'
        where mult' (SSymF xs, SSymF ys) =
                  let k = length xs
                  in sumv [return (SSymF zs) | zs <- shuffles xs (map (+k) ys)]

For example:

> ssymF [1,2] * ssymF [2,1,3]
F [1,2,4,3,5]+F [1,4,2,3,5]+F [1,4,3,2,5]+F [1,4,3,5,2]+F [4,1,2,3,5]+F [4,1,3,2,5]+F [4,1,3,5,2]+F [4,3,1,2,5]+F [4,3,1,5,2]+F [4,3,5,1,2]

It's clear that ssymF [] is indeed a left and right unit for this multiplication. It's also fairly clear that this multiplication is associative (because both shifting and shuffling are). Let's just check:

> quickCheck (prop_Algebra :: (Q, Vect Q SSymF, Vect Q SSymF, Vect Q SSymF) -> Bool)
+++ OK, passed 100 tests.

(The test code isn't exposed in the package, so you'll have to dig around in the source if you want to try this. It takes a minute or two, because every time we multiply we end up with lots of terms.)

What about a coalgebra structure? As I mentioned last time, in a combinatorial Hopf algebra, the comultiplication is usually a sum of the different ways to take apart our combinatorial object into two parts. In this case, we take a permutation apart by "deconcatenating" it (considered as a list) into two pieces. This is like cutting a deck of cards:

deconcatenations xs = zip (inits xs) (tails xs)

For example:

> deconcatenations [2,3,4,1]

However, most of those parts are no longer permutations of [1..n] (for any n), because they are missing some numbers. In order to get back to permutations, we need to "flatten" each part:

flatten xs = let mapping = zip (L.sort xs) [1..]
        in [y | x <- xs, let Just y = lookup x mapping]

For example:

> flatten [3,4,1]

Putting the deconcatenation and flattening together we get the following coalgebra definition:

instance (Eq k, Num k) => Coalgebra k SSymF where
    counit = unwrap . linear counit' where counit' (SSymF xs) = if null xs then 1 else 0
    comult = linear comult'
        where comult' (SSymF xs) = sumv [return (SSymF (flatten us), SSymF (flatten vs))
                                        | (us, vs) <- deconcatenations xs]

For example:

> comult $ ssymF [2,3,4,1]
(F [],F [2,3,4,1])+(F [1],F [2,3,1])+(F [1,2],F [2,1])+(F [1,2,3],F [1])+(F [2,3,4,1],F [])

(Recall that the result should be read as F[]⊗F[2,3,4,1] + F[1]⊗F[2,3,1] + ...)

It's fairly clear that this comultiplication is coassociative. The counit properties are equally straightforward. Hence:

> quickCheck (prop_Coalgebra :: Vect Q SSymF -> Bool)
+++ OK, passed 100 tests.

Is it a bialgebra? Do the algebra and coalgebra structures commute with one another? In previous posts I've been a bit hand-wavy about this, so let's take a short time out to look at what this actually means. For example, what does it mean for mult and comult to commute? Well, it means that the following diagram commutes:

But hold on a sec - what is the mult operation on (B⊗B)⊗(B⊗B) - is B⊗B even an algebra? And similarly, what is the comult operation on B⊗B - is it even a coalgebra?

Yes they are. Given any algebras A and B, we can define an algebra structure on A⊗B via (a1⊗b1) * (a2⊗b2) = (a1*a2)⊗(b1*b2). In code:

instance (Eq k, Num k, Ord a, Ord b, Algebra k a, Algebra k b) => Algebra k (Tensor a b) where
    unit x = x *> (unit 1 `te` unit 1)
    mult = (mult `tf` mult) . fmap (\((a,b),(a',b')) -> ((a,a'),(b,b')) )

Similarly, given coalgebras A and B we can define a coalgebra structure on A⊗B as follows:

instance (Eq k, Num k, Ord a, Ord b, Coalgebra k a, Coalgebra k b) => Coalgebra k (Tensor a b) where
    counit = unwrap . linear counit'
        where counit' (a,b) = (wrap . counit . return) a * (wrap . counit . return) b
    comult = nf . fmap (\((a,a'),(b,b')) -> ((a,b),(a',b')) ) . (comult `tf` comult)

(Recall that those pesky wrap and unwrap calls are the isomorphisms k <-> Vect k (). What the counit definition really says is that counit (a⊗b) = counit a * counit b.)

Notice how in both the mult and comult definitions, we have to swap the middle two terms of the fourfold tensor product over, in order to have something of the right type.

So how does this work for SSym? Well, we start in the top left with SSym⊗SSym. You can think of that as two decks of cards.

If we go along the top and down the right, then we first shuffle the two decks together (in all possible ways), and then deconcatenate the results (in all possible ways).

If we go down the left and along the bottom, then we first deconcatenate each deck independently (in all possible ways), and then shuffle the first parts of both decks together and separately shuffle the second parts together (in all possible ways).

You can kind of see that this is going to lead to the same result. It's just saying that it doesn't matter whether you shuffle before you cut or cut before you shuffle. (There's also shifting and flattening going on, of course, but it's clear that doesn't affect the result.)

For a bialgebra, we require that each of unit and mult commutes with each of counit and comult - so four conditions in all. The other three are much easier to verify, so I'll leave them as an exercise.

Just to check:

> quickCheck (prop_Bialgebra :: (Q, Vect Q SSymF, Vect Q SSymF) -> Bool)
+++ OK, passed 100 tests.

Okay so what about a Hopf algebra structure? Is there an antipode operation?

Well, recall that when we looked last time at YSym, we saw that it was possible to give a recursive definition of the antipode map. This isn't always possible. The reason it was possible for YSym was:
- We saw that the comultiplication of a tree t is a sum of terms u⊗v, where with the exception of the term 1⊗t, all the other terms have a smaller tree on the right hand side.
- We saw that the counit is 1 on the smallest tree, and 0 otherwise.

It turns out that these are instances of a more general concept, of a graded and connected coalgebra.
- A graded vector space means that there is a concept of the size or degree of the basis elements. For YSym, the degree was the number of nodes in the tree. A graded coalgebra means that the comultiplication respects the degree, in the sense that if comult t is a sum of terms u⊗v, then degree u + degree v = degree t.
- A connected coalgebra means that the counit is 1 on the degree zero piece, and 0 otherwise.

(There is a basis-independent way to explain this, for the purists.)

Now, SSym is also a graded connected coalgebra:
- the degree of SSymF xs is simply length xs. Comultiplication respects the degree.
- counit is 1 on the degree zero piece, 0 otherwise.

Hence we can again give a recursive definition for the antipode:

instance (Eq k, Num k) => HopfAlgebra k SSymF where
    antipode = linear antipode'
        where antipode' (SSymF []) = return (SSymF [])
              antipode' x@(SSymF xs) = (negatev . mult . (id `tf` antipode) . removeTerm (SSymF [],x) . comult . return) x

For example:

> antipode $ ssymF [1,2,3]
-F [3,2,1]
> antipode $ ssymF [1,3,2]
-F [2,1,3]-F [2,3,1]+F [3,1,2]
> antipode $ ssymF [2,1,3]
-F [1,3,2]+F [2,3,1]-F [3,1,2]

It is possible to give an explicit expression for the antipode. However, it's a little complicated, and I haven't got around to coding it yet.

Let's just check:

> quickCheck (prop_HopfAlgebra :: Vect Q SSymF -> Bool)
+++ OK, passed 100 tests.

However, this isn't quite satisfactory. I'd like a little more insight into what the antipode actually does.

Recall that the definition requires that (mult . (id ⊗ antipode) . comult) == (unit . counit).

The left hand side is saying:
- first cut (deconcatenate) the deck of cards into two parts (in all possible ways)
- next apply the antipode to just one of the two parts
- finally shuffle the two parts together (in all possible ways)

The right hand side, remember, sends the empty permutation SSymF [] to itself, and every other permutation to zero.

So what this is saying is, you cut a (non-empty) deck of cards, wave a wand over one part, shuffle the two parts together again, and the cards all disappear!

Or is it? No, not quite. Remember that comult (resp. mult) is a sum of all possible deconcatenations (resp. shuffles). So what this is saying is that the antipode arranges things so that when you sum over all possible deconcatenations and shuffles, they cancel each other out. Cool!

(This sounds like it might have something to do with renormalization in physics, where we want to get a bunch of troublesome infinities to cancel each other out. There is apparently a connection between Hopf algebras and renormalization, but I don't know if this is it. Unfortunately my understanding of physics isn't up to figuring this all out.)

So we've now seen how to define Hopf algebra structures on two different sets of combinatorial objects: trees and permutations. We'll see that these two Hopf algebras are actually quite closely related, and that there is something like a family of combinatorial Hopf algebras with interesting connections to one another.

My main source for this article was Aguiar, Sottile, Structure of the Malvenuto-Reutenauer Hopf algebra of permutations.

Sunday, 18 March 2012

Combinatorial Hopf Algebras I: The Hopf Algebra YSym of Binary Trees

Last time we looked at the definition of Hopf algebras, using the group algebra as a motivating example. This time I want to look at YSym, a Hopf algebra of binary trees. This is an example of a Combinatorial Hopf Algebra (CHA), meaning a Hopf algebra defined over some combinatorial object such as partitions, compositions, permutations, trees.

The binary trees we're concerned with are the familiar binary trees from computer science. In the math literature on this, however, they're called (rooted) planar binary trees. As far as I can tell, that's because in math, a tree means a simple graph with no cycles. So from that point of view, a CS binary tree is in addition rooted - it has a distinguished root node - and planar - so that you can distinguish between left and right child nodes.

Here's a Haskell type for (rooted, planar) binary trees:

data PBT a = T (PBT a) a (PBT a) | E deriving (Eq, Show, Functor)

instance Ord a => Ord (PBT a) where


As a convenience, the trees have labels at each node, and PBT is polymorphic in the type of the labels. The labels aren't really necessary: the Hopf algebra structure we're going to look at depends only on the shapes of the trees, not the labels. However, it will turn out to be useful to be able to label them, to see more clearly what is going on.

There's more than one way to set up a Hopf algebra structure on binary trees, so we'll use a newtype wrapper to identify which structure we're using.

newtype YSymF a = YSymF (PBT a) deriving (Eq, Ord, Functor)

instance Show a => Show (YSymF a) where

    show (YSymF t) = "F(" ++ show t ++ ")"

ysymF :: PBT a -> Vect Q (YSymF a)

ysymF t = return (YSymF t)

(The "F" in YSymF signifies that we're looking at the fundamental basis. We may have occasion to look at other bases later.)

So as usual, we're going to work in the free vector space on this basis, Vect Q (YSymF a), consisting of linear combinations of binary trees. Here's what a typical element might look like:

$ cabal update

$ cabal install HaskellForMaths

$ ghci

> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra

> ysymF (E) + 2 * ysymF (T (T E () E) () E)

F(E)+2F(T (T E () E) () E)

Ok, so how do we multiply two binary trees together? Well actually, let's look at comultiplication first, because it's a little easier to explain. In CHAs (combinatorial Hopf algebras), the comultiplication is often a sum of the different ways of taking a combinatorial structure apart into two pieces. In the case of binary trees, we take them apart by splitting down the middle, starting at a leaf and continuing down to the root. Each node that we pass through goes to the side which has both its branches.
The diagram shows one possible split of the tree. Each leaf of the tree gives rise to a split. Hence the possible splits are:

> mapM_ print $ splits $ T (T E 1 E) 2 (T (T E 3 E) 4 (T E 5 E))

(E,T (T E 1 E) 2 (T (T E 3 E) 4 (T E 5 E)))

(T E 1 E,T E 2 (T (T E 3 E) 4 (T E 5 E)))

(T (T E 1 E) 2 E,T (T E 3 E) 4 (T E 5 E))

(T (T E 1 E) 2 (T E 3 E),T E 4 (T E 5 E))

(T (T E 1 E) 2 (T (T E 3 E) 4 E),T E 5 E)

(T (T E 1 E) 2 (T (T E 3 E) 4 (T E 5 E)),E)

The definition of splits is wonderfully simple:

splits E = [(E,E)]

splits (T l x r) = [(u, T v x r) | (u,v) <- splits l] ++ [(T l x u, v) | (u,v) <- splits r]

We use this to define a coalgebra structure on the vector space of binary trees as follows:

instance (Eq k, Num k, Ord a) => Coalgebra k (YSymF a) where

    counit = unwrap . linear counit' where counit' (YSymF E) = 1; counit' (YSymF (T _ _ _)) = 0

    comult = linear comult'

        where comult' (YSymF t) = sumv [return (YSymF u, YSymF v) | (u,v) <- splits t]

In other words, the counit is the indicator function for the empty tree, and the comult sends a tree t to the sum of u⊗v for all splits (u,v).

> comult $ ysymF $ T (T E 1 E) 2 (T E 3 E)

(F(E),F(T (T E 1 E) 2 (T E 3 E)))+(F(T E 1 E),F(T E 2 (T E 3 E)))+(F(T (T E 1 E) 2 E),F(T E 3 E))+(F(T (T E 1 E) 2 (T E 3 E)),F(E))

Let's just check that this satisfies the coalgebra conditions:

> quickCheck (prop_Coalgebra :: Vect Q (YSymF ()) -> Bool)

+++ OK, passed 100 tests.

[I should say that although the test code is included in the HaskellForMaths package, it is not part of the exposed modules, so if you want to try this you will have to fish around in the source.]

Multiplication is slightly more complicated. Suppose that we are trying to calculate the product of trees t and u. Suppose that u has k leaves. Then we look at all possible "multi-splits" of t into k parts, and then graft the parts onto the leaves of u (in order). This is probably easiest explained with a diagram:

The diagram shows just one possible multi-split, but the multiplication is defined as the sum over all possible multi-splits. Here's the code:

multisplits 1 t = [ [t] ]

multisplits 2 t = [ [u,v] | (u,v) <- splits t ]

multisplits n t = [ u:ws | (u,v) <- splits t, ws <- multisplits (n-1) v ]

graft [t] E = t

graft ts (T l x r) = let (ls,rs) = splitAt (leafcount l) ts

                     in T (graft ls l) x (graft rs r)

instance (Eq k, Num k, Ord a) => Algebra k (YSymF a) where

    unit x = x *> return (YSymF E)

    mult = linear mult'

        where mult' (YSymF t, YSymF u) = sumv [return (YSymF (graft ts u)) | ts <- multisplits (leafcount u) t]

For example:

> ysymF (T (T E 1 E) 2 E) * ysymF (T E 3 E)

F(T (T (T E 1 E) 2 E) 3 E)+F(T (T E 1 E) 3 (T E 2 E))+F(T E 3 (T (T E 1 E) 2 E))

It's fairly clear that the empty tree E is a left and right identity for this multiplication. It seems plausible that the multiplication is also associative - let's just check:

> quickCheck (prop_Algebra :: (Q, Vect Q (YSymF ()), Vect Q (YSymF ()), Vect Q (YSymF ())) -> Bool)

+++ OK, passed 100 tests.

So we've defined both algebra and coalgebra structures on the free vector space of binary trees. For a bialgebra, the algebra and coalgebra structures need to satisfy compatibility conditions: roughly, the multiplication and comultiplication need to commute, ie comult (mult x y) == mult (comult x) (comult y); plus similar conditions involving unit and counit.

Given the way they have been defined, it seems plausible that the structures are compatible (roughly, because it doesn't matter whether you split before or after grafting), but let's just check:

> quickCheck (prop_Bialgebra :: (Q, Vect Q (YSymF ()), Vect Q (YSymF ())) -> Bool)

+++ OK, passed 100 tests.

This entitles us to declare a Bialgebra instance:

instance (Eq k, Num k, Ord a) => Bialgebra k (YSymF a) where {}

(The Bialgebra class doesn't define any "methods". So this is just a way for us to declare in the code that we have a bialgebra. For example, we could write functions which require Bialgebra as a context.)

Finally, for a Hopf algebra, we need an antipode operation. Recall that an antipode must satisfy the following diagram:

In particular,

mult . (id ⊗ antipode) . comult = unit . counit

Let's assume that an antipode for YSym exists, and see what we can deduce about it. The right hand side of the above equation is the function that takes a linear combination of trees, and drops everything except the empty tree. Informally:
(unit . counit) E = E
(unit . counit) (T _ _ _) = 0

For example:

> (unit . counit) (ysymF E) :: Vect Q (YSymF ())


> (unit . counit) (ysymF (T E () E)) :: Vect Q (YSymF ())


Since we also know that comult E = E⊗E, and mult E⊗E = E, it follows that antipode E = E.

Now what about the antipode of non-empty trees? We know that
comult t = E⊗t + ... + t⊗E
where the sum is over all splits of t.

((id ⊗ antipode) . comult) t = E⊗(antipode t) + ... + t⊗(antipode E)
(mult . (id ⊗ antipode) . comult) t = E * antipode t + ... + t * antipode E
where the right hand side of each multiplication symbol has antipode applied.

Now, E is the identity for multiplication of trees, so
(mult . (id ⊗ antipode) . comult) t = antipode t + ... + t * antipode E

Now, the Hopf algebra condition requires that (mult . (id ⊗ antipode) . comult) = (unit . counit). And we saw that for a non-empty tree, (unit . counit) t = 0. Hence:

antipode t + ... + t * antipode E = 0

Notice that all the terms after the first involve the antipodes of trees "smaller" than t (ie with fewer nodes). As a consequence, we can use this equation as the basis of a recursive definition of the antipode. We recurse through progressively smaller trees, and the recursion terminates because we know that antipode E = E. Here's the code:

instance (Eq k, Num k, Ord a) => HopfAlgebra k (YSymF a) where

    antipode = linear antipode'

        where antipode' (YSymF E) = return (YSymF E)

              antipode' x = (negatev . mult . (id `tf` antipode) . removeTerm (YSymF E,x) . comult . return) x

For example:

> antipode $ ysymF (T E () E)

-F(T E () E)

> antipode $ ysymF (T (T E () E) () E)

F(T E () (T E () E))

> quickCheck (prop_HopfAlgebra :: Vect Q (YSymF ()) -> Bool)

+++ OK, passed 100 tests.

It is also possible to give an explicit definition of the antipode (exercise: find it), but I thought it would be more illuminating to do it this way.

It's probably silly, but I just love being able to define algebraic structures on pictures (ie of trees).

Incidentally, I understand that there is a Hopf algebra structure similar to YSym underlying Feynman diagrams in physics.

I got the maths in the above mainly from Aguiar, Sottile, Structure of the Loday-Ronco Hopf algebra of trees. Thanks to them and many other researchers for coming up with all this cool maths, and for making it freely available online.

Saturday, 3 March 2012

What is a Hopf algebra?

A while ago I looked at the concepts of an algebra and a coalgebra, and showed how to represent them in Haskell. I was intending to carry on to look at bialgebras and Hopf algebras, but I realised that I wasn't sufficiently clear in my own mind about the motivation for studying them. So, I confess that I have been keeping this blog just about afloat with filler material, while behind the scenes I've been writing loads of code - some fruitful, some not - to figure out how to motivate the concept of Hopf algebra.

Some concepts in mathematics have an obvious motivation at the outset: groups arise from thinking about (eg geometric) symmetry, (normed) vector spaces are a representation of physical space. (Both concepts turn out to have a usefulness that goes far beyond the initial intuitive motivation.) With Hopf algebras this doesn't really appear to be the case.

Hopf algebras appear to have been invented for narrow technical reasons to solve a specific problem, and then over time, it has turned out both that they are far more widespread and far more useful than was initially realised. So for me, there are two motivations for looking at Hopf algebras:
- There are some really interesting structures that are Hopf algebras. I'm particularly interested in combinatorial Hopf algebras, which I hope to cover in this blog in due course. But there are also examples in physics, associated with Lie algebras, for example.
- They have some interesting applications. I'm interested in the applications to knot theory (which unfortunately are a little technical, so I'll have to summon up some courage if I want to go over them in this blog). But again, there seem to be applications to physics, such as renormalization in quantum field theory (which I don't claim to understand, btw).

As a gentle introduction to Hopf algebras, I want to look at the group Hopf algebra, which has some claim to be the fundamental example, and provides a really good anchor for the concept.

So last time we looked at the group algebra - the free k-vector space of k-linear combinations of elements of a group G. We saw that many elements in the group algebra have multiplicative inverses, and we wrote code to calculate them. However, if you look again at that code, you'll see that it doesn't actually make use of group inverses anywhere. The code really only relies on the fact that finite groups are finite monoids. We could say that actually, the code we wrote is for finding inverses in finite monoid algebras. (But of course it just so happens that all finite monoids are groups.)

[Edit: Oops - that last claim in brackets is not true. Thanks to reddit readers for pointing it out.]

So the group algebra is an algebra, indeed a monoid algebra, but with the special property that the basis elements (the group elements) all have multiplicative inverses.

Now, mathematicians always prefer, if possible, to come up with definitions that are basis-independent. (In HaskellForMaths, we've been working with free vector spaces, which encourages us to think of vector spaces in terms of a particular basis. But many interesting properties of vector spaces are true regardless of the choice of basis, so it is better to find a way to express them that doesn't involve the basis.)

Can we find a basis-independent way to characterise this special property of the group algebra?

Well, our first step has to be to find a way to talk about the group inverse without having to mention group elements. We do that by encapsulating the group inverse in a linear map, which we call the antipode:

antipode x = nf (fmap inverse x)

So in other words the antipode operation just inverts each group element in a group algebra element (a k-linear combination of group elements). (The nf call just puts the vector in normal form.) For example:

> antipode $ p [[1,2,3]] + 2 * p [[2,3,4]]

The key point about the antipode is that we now have a linear map on the group algebra, rather just an operation on the group alone. Although we defined the antipode in terms of a particular basis, linear maps fundamentally don't care about the basis. If you choose some other basis for the group algebra, I can tell you how the antipode transforms your basis elements.

Ok, so what we would like to do is find a way to express the group inverse property (ie the property that every group element has an inverse) as a property of the group algebra, in terms of the antipode.

Well, no point beating about the bush. Define:

trace (V ts) = sum [x | (g,x) <- ts]
diag = fmap (\g -> (g,g))

(So in maths notation, diag is the linear map which sends g to g⊗g.)

Note that these are both linear maps. In particular, once again, although we have defined them in terms of our natural basis (the group elements), as linear maps they don't really care what basis we choose.

Then it follows from the group inverse property that the following diagram commutes:

Why will it commute? Well, think about what happens to a group element, going from left to right:
- Going along the top, we have g -> g⊗g -> g⊗g-1 -> g*g-1 = 1
- Going along the middle, we have g -> 1 -> 1
- Going along the bottom, we have g -> g⊗g -> g-1⊗g -> g-1*g = 1
All of the maps are linear, so they extend from group elements to arbitrary k-linear combinations of group elements.

(Shortly, I'll demonstrate that it commutes as claimed, using a quickcheck property.)

We're nearly there. We've found a way to express the special property of the group algebra, in a basis-independent way. We can say that what is special about the group algebra is that there is a linear antipode map, such that the above diagram commutes.

[In fact I think it may be true that if you have a monoid algebra such that the above diagram commutes, then it follows that the monoid is a group. This would definitely be true if we constrained antipode to be of the form (fmap f), for f a function on the monoid, but I'm not absolutely sure that it's true if antipode is allowed to be an arbitrary linear function.]

Now, the concept of a Hopf algebra is just a slight generalization of this.

Observe that trace and diag actually define a coalgebra structure:
- diag is clearly coassociative
- the left and right counit properties are also easy to check

So we can define:

instance (Eq k, Num k) => Coalgebra k (Permutation Int) where

    counit = unwrap . linear counit' where counit' g = 1  -- trace

    comult = fmap (\g -> (g,g))                           -- diagonal

(In fact, trace and diag define a coalgebra structure on the free vector space over any set. Of course, some free vector spaces also have other more interesting coalgebra structures.)

Let's just quickcheck:

> quickCheck (prop_Coalgebra :: GroupAlgebra Q -> Bool)
+++ OK, passed 100 tests.

So for the definition of a Hopf algebra, we allow the antipode to be defined relative to other coalgebra structures besides the trace-diag structure (with some restrictions to be discussed later). So a Hopf algebra is a vector space having both an algebra and a coalgebra structure, such that there exists an antipode map that makes the following diagram commute:

Thus we can think of Hopf algebras as a generalisation of the group algebra. As we'll see (in future posts), there are Hopf algebras with rather more intricate coalgebra structures and antipodes than the group algebra.

Here's a Haskell class for Hopf algebras:

class Bialgebra k b => HopfAlgebra k b where

    antipode :: Vect k b -> Vect k b

(A bialgebra is basically an algebra plus coalgebra - but with one more condition that I'll explain in a minute.)

We've already seen the antipode for the group algebra, but here it is again in its proper home, as part of a Hopf algebra instance:

instance (Eq k, Num k) => HopfAlgebra k (Permutation Int) where

    antipode = nf . fmap inverse

And here's a quickCheck property:

prop_HopfAlgebra x =

    (unit . counit) x == (mult . (antipode `tf` id) . comult) x &&

    (unit . counit) x == (mult . (id `tf` antipode) . comult) x

> quickCheck (prop_HopfAlgebra :: GroupAlgebra Q -> Bool)

+++ OK, passed 100 tests.

So there you have it. That's what a Hopf algebra is.

Except that I've cheated slightly. What I've defined so far is actually only a weak Hopf algebra. There is one other condition that is needed, called the Hopf compatibility condition. This requires that the algebra and coalgebra structures are "compatible" in the following sense:
- counit and comult are algebra morphisms
- unit and mult are coalgebra morphisms

I don't want to dwell too much on this. It seems a pretty reasonable requirement, although other compatibility conditions are possible (eg Frobenius algebras). An algebra plus coalgebra satisfying these conditions (even if it doesn't have an antipode) is called a bialgebra. And it turns out that the group algebra is one.

> quickCheck (prop_Bialgebra :: (Q, GroupAlgebra Q, GroupAlgebra Q) -> Bool)
+++ OK, passed 100 tests.

Friday, 10 February 2012

Introducing the Group Algebra

Here's an interesting example of an algebra.

Given a group, form the free vector space on the elements of the group. For example, if g and h are elements of the group, then the following are some elements of the free vector space:
- g + h
- 1 + 2*g
- 2 + g*h + h/3

It's pretty obvious how to define an algebra structure on this vector space:
- the unit is 1, the identity element of the group
- the multiplication is the multiplication in the group, lifted to the vector space by linearity.

So for example:
(1 + 2g)(g + h/3) = g + 2g^2 + h/3 + 2gh/3

This is called the group algebra. (It's a special case of the monoid algebra construction that we looked at previously.) Given some particular field k and group G, it is usually written as kG.

How can we represent this in Haskell? Well in HaskellForMaths, we already have code for working in permutation groups, and code for forming free vector spaces. So it's fairly straightforward:

module Math.Algebras.GroupAlgebra where

-- ... imports ...

instance (Eq k, Num k) => Algebra k (Permutation Int) where

    unit x = x *> return 1

    mult = nf . fmap (\(g,h) -> g*h)

type GroupAlgebra k = Vect k (Permutation Int)

p :: [[Int]] -> GroupAlgebra Q

p = return . fromCycles

Then for example we can do the following:

$ cabal update
$ cabal install HaskellForMaths
$ ghci
> :m Math.Core.Utils Math.Algebras.GroupAlgebra
> (1 + p[[1,2,3],[4,5]])^2

(Actually, in HaskellForMaths <= 0.4.3, the first term will be shown as [] instead of 1. That's just a "bug" in the Show instance, which I have a fix for in the next release.)

For reference, in a maths book, the same result would be written:
( 1 + (1 2 3)(4 5) )^2 = 1 + 2(1 2 3)(4 5) + (1 3 2)

So I guess one thing to point out is that in effect this code defines the group algebra for the group of all permutations of the integers. In practice however, we can always think of ourselves as working in some finite subgroup of this group. For example, if we want to work in the group of symmetries of a square, generated by a rotation (1 2 3 4) and a reflection (1 2)(3 4), then we just need to consider only sums of the eight elements in the generated group.

Another thing to point out is that this code could easily be modified to allow permutations over an arbitrary type, since that is supported by the underlying permutation code.

So what is this group algebra then? What sort of thing is it, and how should one think about it?

Well, first, as an algebra, it has zero divisors. For example:

> (1+p[[1,2]])*(1-p[[1,2]])

However, a lot of the elements aren't zero divisors, and whenever they're not, they have inverses. The group elements themselves have inverses of course, but so do many sums of group elements. For example:

> (1+p[[1,2,3]])^-1
> (1+2*p[[1,2,3]])^-1

Just to check:

> (1+p[[1,2,3]]) * (1-p[[1,2,3]]+p[[1,3,2]])
> (1+2*p[[1,2,3]]) * (1-2*p[[1,2,3]]+4*p[[1,3,2]])

How do we calculate the inverses? Well it's quite clever actually. Let's work through an example. Suppose we want to find an inverse for
x = 1+2*p[[1,2]]+3*p[[1,2,3]]
The inverse, if it exists, will be a linear combination of elements of the group generated by 1, p[[1,2]] and p[[1,2,3]]. So it will be a sum
y = a*1 + b*p[[1,2]] + c*p[[1,3]] + d*p[[2,3]] + e*p[[1,2,3]] + f*p[[1,3,2]]
By supposition x*y = 1, so
(1+2*p[[1,2]]+3*p[[1,2,3]]) * (a*1+b*p[[1,2]]+c*p[[1,3]]+d*p[[2,3]]+e*p[[1,2,3]]+f*p[[1,3,2]]) =
1 + 0*p[[1,2]] + 0*p[[1,3]] + 0*p[[2,3]] + 0*p[[1,2,3]] + 0*p[[1,3,2]]

If we multiply out and equate coefficients, we will get a linear system in a,b,c,d,e,f. Something like:

 a+2b      +3e    = 1 (coefficients of 1)

2a+ b+3c          = 0 (coefficients of p[[1,2]])

       c+3d+2e    = 0 (coefficients of p[[1,3]])

   3b   + d   +2f = 0 (coefficients of p[[2,3]])


So we just solve the linear system to find a,b,c,d,e,f.

Here's the code:

newtype X a = X a deriving (Eq,Ord,Show)

instance HasInverses (GroupAlgebra Q) where

    inverse x@(V ts) =

        let gs = elts $ map fst ts -- all elements in the group generated by the terms

            n = length gs

            y = V $ zip gs $ map (glexvar . X) [1..n] -- x1*1+x2*g2+...+xn*gn

            x' = V $ map (\(g,c) -> (g, unit c)) ts -- lift the coefficients in x into the polynomial algebra

            one = x' * y

            m = [ [coeff (mvar (X j)) c | j <- [1..n]] | i <- gs, let c = coeff i one] -- matrix of the linear system

            b = 1 : replicate (n-1) 0

        in case solveLinearSystem m b of -- find v such that m v == b

            Just v -> nf $ V $ zip gs v

            Nothing -> error "GroupAlgebra.inverse: not invertible"

I won't explain it in detail. I'll just remark that this is one of the places where HaskellForMaths shows its power. We happen to have a type for polynomials lying around. They're a Num instance, so we can use them as the coefficients in GroupAlgebra k. We can create new variables x1, x2, ... (using glexvar . X), lift field elements into the polynomial algebra (using unit), multiply x and y in the group algebra, extract the coefficients into a matrix, and solve the linear system.

I should point out that unfortunately, since this method involves solving a linear system in |G| variables, it's only going to be efficient for small groups.

So what is the group algebra useful for? Well actually quite a lot. It's fundamental to the study of representation theory - representing groups as matrices. It's also used for "Fourier analysis of groups" - though I don't know much about that. But those will have to wait for another time.