Monday 30 April 2012

CHAs IV: Hopf Algebra Morphisms


In the last few posts, we've been looking at combinatorial Hopf algebras, specifically:
- SSym, a Hopf algebra with a basis indexed by permutations
- YSym, with a basis of binary trees
- QSym, with a basis of compositions

It turns out that these three Hopf algebras are quite closely related (and indeed, there are a few others in the same "family"). Let's start with SSym and YSym.

Given a permutation p of [1..n], we can construct a binary tree called the descending tree as follows:
- Split the permutation as p = ls ++ [n] ++ rs
- Place n at the root of the tree, and recursively place the descending trees of ls and rs as the left and right children of the root
- To bottom out the recursion, the descending tree of the empty permutation is of course the empty tree

For example, the following diagram shows the descending tree of [3,5,1,4,2].


Here's the Haskell code:

-- The tree type PBT, with constructors E (empty) and T left label right,
-- comes from the package; this code also needs: import qualified Data.List as L
descendingTree :: Ord a => [a] -> PBT a
descendingTree [] = E
descendingTree [x] = T E x E
descendingTree xs = T l x r
    where x = maximum xs
          (ls, _:rs) = L.break (== x) xs  -- split the list around its maximum
          l = descendingTree ls
          r = descendingTree rs

> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra
> descendingTree [3,5,1,4,2]
T (T E 3 E) 5 (T (T E 1 E) 4 (T E 2 E))

Now you'll recall that for the Hopf algebra YSym, although we sometimes carry around the node labels to help us see what is going on, we're really only interested in the shapes of the trees. So here's a function to erase the labels:

shape :: PBT a -> PBT ()
shape t = fmap (\_ -> ()) t

> shape $ T (T E 3 E) 5 (T (T E 1 E) 4 (T E 2 E))
T (T E () E) () (T (T E () E) () (T E () E))

Thus (shape . descendingTree) is a function from permutations to unlabelled trees. We can consider this as a map between the fundamental bases of SSym and YSym, which therefore induces a linear map between the corresponding Hopf algebras:

descendingTreeMap :: (Eq k, Num k) => Vect k SSymF -> Vect k (YSymF ())
descendingTreeMap = nf . fmap (\(SSymF xs) -> (YSymF . shape . descendingTree) xs)

Now it turns out that this map is in fact a Hopf algebra morphism. What does that mean? Basically it means that descendingTreeMap plays nicely ("commutes") with the unit, mult, counit, comult, and antipode maps in the two Hopf algebras.

For example, for an algebra morphism, we require:
f (x1*x2) = f x1 * f x2
f . unit = unit
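
In the style of the quickCheck property we'll use later for the antipode, these conditions can be expressed as testable properties (the property names here are mine, not from the package):

prop_AlgebraMorphismMult f (x,y) = f (x*y) == f x * f y
prop_AlgebraMorphismUnit f = f 1 == 1   -- by linearity, f . unit = unit iff f 1 == 1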


It's not immediately obvious why descendingTreeMap should have these properties. The unit property, at least, is straightforward:
> descendingTreeMap 1 == 1
True
Or, put another way:
> descendingTreeMap (ssymF []) == ysymF E
True

But what about the mult property?

Recall that in SSymF, we multiply permutations by shifting the right operand, and then summing all possible shuffles of the two lists:

> ssymF [3,2,1] * ssymF [1,2]
F [3,2,1,4,5]+F [3,2,4,1,5]+F [3,2,4,5,1]+F [3,4,2,1,5]+F [3,4,2,5,1]+F [3,4,5,2,1]+F [4,3,2,1,5]+F [4,3,2,5,1]+F [4,3,5,2,1]+F [4,5,3,2,1]
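
As a reminder, here is a sketch of how the shifted shuffle can be implemented (the package's actual definitions may differ in detail):

shuffles (x:xs) (y:ys) = map (x:) (shuffles xs (y:ys)) ++
                         map (y:) (shuffles (x:xs) ys)
shuffles xs [] = [xs]
shuffles [] ys = [ys]

-- shift the right operand up by the length of the left, then shuffle
shiftedShuffles xs ys = shuffles xs (map (+ length xs) ys)

> shiftedShuffles [3,2,1] [1,2]
[[3,2,1,4,5],[3,2,4,1,5],[3,2,4,5,1],[3,4,2,1,5],[3,4,2,5,1],[3,4,5,2,1],[4,3,2,1,5],[4,3,2,5,1],[4,3,5,2,1],[4,5,3,2,1]]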

On the other hand, in YSymF, we multiply by multi-splitting the left operand, and then grafting the parts onto the leaves of the right operand, in all possible ways. The following diagram shows one possible multi-split-grafting, corresponding to one of the summands in [3,5,1,4,2] * [1,2]:

The numbers along the tops of the trees are the node labels generated by the descending tree construction. We can see from these that there is an exact correspondence between a shifted shuffle in SSymF and a multi-split grafting in YSymF.

The asymmetry in the mult for YSymF, where we graft multi-splits of the left operand onto the right operand, corresponds to the asymmetry in the mult for SSymF, where we shift the right operand. The shifting in SSymF ensures that the nodes for the right operand end up at the root of the descending tree, as required by the grafting in YSymF. When we defined shuffling, we gave a recursive definition, but it's fairly clear that grafting the parts of the multi-split onto the right tree accomplishes the same thing.
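
We can spot-check the mult property on this example (one instance, not a proof):

> descendingTreeMap (ssymF [3,5,1,4,2] * ssymF [1,2]) == descendingTreeMap (ssymF [3,5,1,4,2]) * descendingTreeMap (ssymF [1,2])
True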


Similarly, a coalgebra morphism is a linear map f which commutes with counit and comult:
comult . f = (f⊗f) . comult
counit . f = counit

In SSymF, counit is 1 on the empty permutation, 0 on anything else. In YSymF, counit is 1 on the empty tree, 0 on anything else. The descendingTreeMap sends the empty permutation to the empty tree, so it's clear that it commutes with counit in the required way.
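
A quick spot check:

> (counit . descendingTreeMap) (ssymF [3,5,1,4,2]) == counit (ssymF [3,5,1,4,2])
True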

What about comult? In SSymF, the comult of a permutation is the sum of its two-part deconcatenations (with each part flattened back to a permutation).

> comult $ ssymF [3,5,1,4,2]
(F [],F [3,5,1,4,2])+(F [1],F [4,1,3,2])+(F [1,2],F [1,3,2])+(F [2,3,1],F [2,1])+(F [2,4,1,3],F [1])+(F [3,5,1,4,2],F [])

In YSymF, the comult of a tree is the sum of its two-part splits. Now it's clear that flattening makes no difference to the shape of the descending tree. It's also clear that taking the descending trees of the two parts of a deconcatenation corresponds to a split of the descending tree of the whole.
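
For example, the deconcatenation ([3,5],[1,4,2]) from the sum above flattens to ([1,2],[1,3,2]), and flattening a list doesn't change the shape of its descending tree:

> (shape . descendingTree) [1,4,2] == (shape . descendingTree) [1,3,2]
True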

Finally, a Hopf algebra morphism is a bialgebra morphism which in addition commutes with antipode.

In SSymF and YSymF, we didn't give an explicit expression for the antipode, but instead derived it from mult and comult, using the fact that they are both connected graded bialgebras. Since the antipode is determined by mult and comult in this way, a bialgebra morphism between connected graded bialgebras automatically commutes with the antipode. So descendingTreeMap is bound to be a Hopf algebra morphism, but just to check:

prop_HopfAlgebraMorphism f x = (f . antipode) x == (antipode . f) x

> quickCheck (prop_HopfAlgebraMorphism descendingTreeMap)
+++ OK, passed 100 tests.

Given any tree, there are many ways to label the nodes so that the tree is descending (ie such that the label on each child node is less than the label on its parent). For example, we could first label all the leaf nodes [1..k], and then all their immediate parents [k+1..], and so on. (For our example tree, this would lead to the alternative labelling [1,5,2,4,3].)
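
We can verify that this alternative labelling has the same descending tree shape:

> (shape . descendingTree) [1,5,2,4,3] == (shape . descendingTree) [3,5,1,4,2]
True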

This shows that the descending tree map is surjective, but not injective. Hence YSym is a quotient of SSym.

Monday 23 April 2012

CHAs III: QSym, a combinatorial Hopf algebra on compositions


The compositions of a number n are the different ways that it can be expressed as an ordered sum of positive integers. For example, the compositions of 4 are 1+1+1+1, 1+1+2, 1+2+1, 2+1+1, 1+3, 2+2, 3+1, 4. Equivalently, we can forget about the plus signs, and just consider the ordered lists of positive integers that sum to n. Here's the Haskell code:

compositions :: Int -> [[Int]]
compositions 0 = [[]]
compositions n = [i:is | i <- [1..n], is <- compositions (n-i)]

> :m Math.Algebras.Structures Math.Combinatorics.CombinatorialHopfAlgebra
> compositions 4
[[1,1,1,1],[1,1,2],[1,2,1],[1,3],[2,1,1],[2,2],[3,1],[4]]

We will define a CHA QSym whose basis elements are (indexed by) compositions:

newtype QSymM = QSymM [Int] deriving (Eq)

instance Ord QSymM where
    compare (QSymM xs) (QSymM ys) = compare (sum xs, xs) (sum ys, ys)

instance Show QSymM where
    show (QSymM xs) = "M " ++ show xs

qsymM :: [Int] -> Vect Q QSymM
qsymM xs | all (>0) xs = return (QSymM xs)
         | otherwise = error "qsymM: not a composition"

We will use QSymM as the basis for a Hopf algebra, indexed by compositions. (In practice, we're going to stick with smallish compositions, as the calculations would take too long otherwise.) We form the free vector space over this basis. An element of the free vector space is a linear combination of compositions. For example:


> qsymM [1,2] + 2 * qsymM [3,1]
M [1,2]+2M [3,1]

The algebra structure on QSymM is similar to the algebra structure we saw last time on SSymM. Instead of shifted shuffles, we will use overlapping shuffles or quasi-shuffles. This means that when shuffling (x:xs) and (y:ys), then instead of just choosing between taking x first or taking y first, we also have a third choice to take them both and add them together.

quasiShuffles (x:xs) (y:ys) = map (x:) (quasiShuffles xs (y:ys)) ++
                              map ((x+y):) (quasiShuffles xs ys) ++
                              map (y:) (quasiShuffles (x:xs) ys)
quasiShuffles xs [] = [xs]
quasiShuffles [] ys = [ys]

For example:

> quasiShuffles [1,2] [3]
[[1,2,3],[1,5],[1,3,2],[4,2],[3,1,2]]

For our algebra structure, we say that the product of two compositions is the sum of all quasi-shuffles of the compositions:

instance (Eq k, Num k) => Algebra k QSymM where
    unit x = x *> return (QSymM [])
    mult = linear mult' where
        mult' (QSymM alpha, QSymM beta) = sumv [return (QSymM gamma) | gamma <- quasiShuffles alpha beta]

It's fairly obvious that this is associative and satisfies the algebra requirements. (And there are quickCheck tests in the package to confirm it.)
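
For example, a spot check of associativity:

> (qsymM [1] * qsymM [2]) * qsymM [3] == qsymM [1] * (qsymM [2] * qsymM [3])
True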

The coalgebra structure on QSymM is also similar to the coalgebra structure on SSymM. (We'll see in due course that SSym and QSym are closely related.) For the comultiplication of a composition, we'll just use the sum of the deconcatenations of the composition (without the flattening that we did with SSymM):

instance (Eq k, Num k) => Coalgebra k QSymM where
    counit = unwrap . linear counit' where counit' (QSymM alpha) = if null alpha then 1 else 0
    comult = linear comult' where
        comult' (QSymM gamma) = sumv [return (QSymM alpha, QSymM beta) | (alpha,beta) <- deconcatenations gamma]
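
Here deconcatenations is the same helper we used for SSymM last time; as a reminder, it amounts to the following sketch:

import Data.List (inits, tails)

-- all two-part splits of a list, from ([],xs) to (xs,[])
deconcatenations xs = zip (inits xs) (tails xs)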

For example:

> comult $ qsymM [1,2,3]
(M [],M [1,2,3])+(M [1],M [2,3])+(M [1,2],M [3])+(M [1,2,3],M [])

(Recall that (x,y) should be read as x⊗y.)

This comultiplication, along with those we have seen for other combinatorial Hopf algebras, is obviously coassociative - but perhaps I should spell out what that means. Coassociativity says that:
(comult⊗id) . comult = (id⊗comult) . comult

In other words, if you split a composition in two, and then split the left part in two - in all possible ways - then you get the same sum of possibilities as if you had instead split the right part in two. This is obvious, because either way you get the sum of all the ways of splitting the composition into three parts.
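
We can see this concretely at the level of lists: both sides enumerate the three-part splits. A quick check, using sort from Data.List and the deconcatenations helper above:

> :m + Data.List
> sort [(a,b,c) | (ab,c) <- deconcatenations [1,2,3], (a,b) <- deconcatenations ab] == sort [(a,b,c) | (a,bc) <- deconcatenations [1,2,3], (b,c) <- deconcatenations bc]
True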

Then for a coalgebra we also require:
(id⊗counit) . comult = id = (counit⊗id) . comult

But this is clear. For example:
((id⊗counit) . comult) [1,2,3] =
(id⊗counit) ([]⊗[1,2,3] + [1]⊗[2,3] + [1,2]⊗[3] + [1,2,3]⊗[]) =
0 + 0 + 0 + [1,2,3]
(modulo the isomorphism C⊗k ≅ C)

Then, as we did previously, we can quickCheck that the algebra and coalgebra structures are compatible, and hence that we have a bialgebra. (See the detailed explanation last time: mainly it comes down to the fact that deconcatenating quasi-shuffles is the same as quasi-shuffling deconcatenations.)

instance (Eq k, Num k) => Bialgebra k QSymM where {}
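
We can spot-check the key fact at the level of lists (again using sort, and the deconcatenations sketch above): deconcatenating quasi-shuffles gives the same multiset as quasi-shuffling deconcatenations.

> sort [d | s <- quasiShuffles [1,2] [3], d <- deconcatenations s] == sort [(c,d) | (a,b) <- deconcatenations [1,2], (a',b') <- deconcatenations [3], c <- quasiShuffles a a', d <- quasiShuffles b b']
True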

As before, this is a connected graded bialgebra in an obvious way (using the sum of the composition as the grading), and hence it automatically has an antipode. In this case we can give an explicit expression for the antipode. The coarsenings of a composition are the compositions which are "more coarse" than the given composition, in the sense that they can be obtained by combining adjacent parts of the given composition (including the trivial coarsening where nothing is combined, ie the composition itself).

coarsenings (x1:x2:xs) = map (x1:) (coarsenings (x2:xs)) ++ coarsenings ((x1+x2):xs)
coarsenings xs = [xs] -- for xs a singleton or null

For example:

> coarsenings [1,2,3]
[[1,2,3],[1,5],[3,3],[6]]

Then the antipode can be defined as follows:

instance (Eq k, Num k) => HopfAlgebra k QSymM where
    antipode = linear antipode' where
        antipode' (QSymM alpha) = (-1)^length alpha * sumv [return (QSymM beta) | beta <- coarsenings (reverse alpha)]

Why does this work? Remember that - in the case of a combinatorial Hopf algebra, where counit picks out the empty structure - antipode has to perform the disappearing trick of making everything cancel out.

That is, we require that
mult . (id⊗antipode) . comult = unit . counit
where the right hand side is zero for any non-empty composition.

Now, in order to try to figure out how it works, I decided to go through an example. However, in the cold light of day I have to admit that perhaps this is one of those times when maths is not a spectator sport. So the rest of this blog post is "optional".

Ok, so let's look at an example:

[1,2,3]

-> (comult = deconcatenations)

[]⊗[1,2,3] + [1]⊗[2,3] + [1,2]⊗[3] + [1,2,3]⊗[]

-> (id⊗antipode = reverse and coarsen the right hand side of each term, with alternating signs)

- []⊗([3,2,1] + [5,1] + [3,3] + [6])
+ [1]⊗([3,2] + [5])
- [1,2]⊗[3]
+ [1,2,3]⊗[]

-> (mult = quasi-shuffles)

- [3,2,1] - [5,1] - [3,3] - [6]
+ [1,3,2] + [4,2] + [3,1,2] + [3,3] + [3,2,1] + [1,5] + [6] + [5,1]
- [1,2,3] - [1,5] - [1,3,2] - [4,2] - [3,1,2]
+ [1,2,3]

= 0

(The way I like to think of this is:
- comult splits the atom into a superposition of fragment states
- antipode turns the right hand part of each fragment pair into a superposition of anti-matter states
- mult brings the matter and anti-matter parts together, causing them to annihilate
However, I think that is just fanciful - there's no flash of light - so I'll say no more.)

So why does it work? Well, in the final sum, the terms have to cancel out in pairs. If we look at some of the terms, it appears that there are three possibilities:

Case 1:

Consider the case of [4,2]. It arises in two ways in the final sum:
123 -> 1,23 -> 1,32 -> 42
123 -> 12,3 -> -12,3 -> -42

In both cases, the 1 and the 3 are combined during the quasi-shuffle phase to make a 4. In order to be combined in this way, they have to end up on opposite sides of the split during the deconcatenation phase. Because there is a 2 separating them, there are two ways this can happen, with the 2 ending up on either the left or right hand side of the split. And then the alternating signs ensure that the two outcomes cancel out at the end.

(In this case there was just one part separating them, but the same thing happens if there are more. For example:
1234 -> 1,234 -> -1,432 -> -532 + other terms
1234 -> 12,34 -> 12,43 -> 532 + 523
1234 -> 123,4 -> -123,4 -> -523)

Case 2:

Consider the case of [3,3]. This arises in two ways in the final sum:
123 -> [],123 -> -[],33 -> -33
123 -> 1,23 -> 1,32 -> 33

In this case it is the 1 and the 2 that combine. Unlike the [4,2] case, the parts that combine here are adjacent to one another at the start. In the top interaction they combine during the reverse-coarsening phase; in the bottom interaction, during the quasi-shuffle phase. If x and y are adjacent, then the only way they can combine during the quasi-shuffle phase is if the split happens between them during the deconcatenation phase. This will always cancel with the term we get by splitting just before both x and y and then combining them during coarsening.

wxyz -> w,xyz -> +/- w,z(y+x) -> {wz}(y+x)
wxyz -> wx,yz -> -/+ wx,zy -> {wz}(x+y)

Case 3:

Finally, it may happen that x and y don't combine. This can happen in two different ways, depending on whether x and y are adjacent parts or not.

If they're not adjacent, then we get a failed case 1. For example, the [3,1,2] and [1,3,2] terms arise when the 1 and 3 fail to combine into a 4:
123 -> 1,23 -> 1,32 -> 132 + 312
123 -> 12,3 -> -12,3 -> -132-312

If they're adjacent, we get a failed case 2. For example, the [3,2,1] terms arise when the 1 and 2 fail to combine into a 3:
123 -> [],123 -> -[],321 -> -321
123 -> 1,23 -> 1,32 -> 321


This isn't quite a proof, of course. In a longer composition, there might be an opportunity for more than one of these cases to happen at the same time, and we need to show how that all works out. Also, I'm not sure that the [1,2,3] term is either a failed case 1 or a failed case 2. I hope at least though that this analysis has shed some light on why it works. (Exercise: Complete the proof.)
