Friday, 18 March 2011

Tensor Products, part 2: Monoids and Arrows

[New release, HaskellForMaths v0.3.2, available on Hackage]

Last time we looked at the tensor product of free vector spaces. Given A = Vect k a, B = Vect k b, then the tensor product A⊗B can be represented as Vect k (a,b). As we saw, the tensor product is the "mother of all bilinear functions".

In the HaskellForMaths library, I have defined a couple of type synonyms for direct sum and tensor product:

type DSum a b = Either a b
type Tensor a b = (a,b)

This means that in type signatures we can write the type of a direct sum as Vect k (DSum a b), and of a tensor product as Vect k (Tensor a b). The idea is that this will remind us what we're dealing with, and make things clearer.

During development, I initially called the tensor type TensorBasis. In maths, tensor product is thought of as an operation on vector spaces - A⊗B - rather than on their bases. It would be nicer if we could define direct sum and tensor product as operators on the vector spaces themselves, rather than their bases.

Well, we can have a go, something like this:

{-# LANGUAGE TypeFamilies, RankNTypes #-}

type DirectSum k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (DSum a b)

type TensorProd k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (Tensor a b)

type En = Vect Q EBasis

This appears to work:

$ ghci -XTypeFamilies
> :l Math.Test.TAlgebras.TTensorProduct
> e1 `te` e2 :: TensorProd Q En En

I'll reserve judgement. (Earlier in the development of the quantum algebra code for HaskellForMaths, I tried something similar to this, and ran into problems later on - but I can't now remember exactly what I did, so perhaps this will work.)

Okay, so what can we do with tensor products. Well first, given vectors u in A = Vect k a and v in B = Vect k b, we can form their tensor product, u⊗v, an element of A⊗B = Vect k (Tensor a b). To calculate u⊗v, we use the bilinearity of tensor product to reduce the tensor product of arbitrary vectors to a linear combination of tensor products of basis elements:
(x1 a1 + x2 a2)⊗(y1 b1 + y2 b2) = x1 y1 a1⊗b1 + x1 y2 a1⊗b2 + x2 y1 a2⊗b1 + x2 y2 a2⊗b2
Here's the code:

te :: Num k => Vect k a -> Vect k b -> Vect k (Tensor a b)
te (V us) (V vs) = V [((a,b), x*y) | (a,x) <- us, (b,y) <- vs]

This is in essence just the "tensor" function from last time, but rewritten to take its two inputs separately rather than in a direct sum. Mnemonic: "te" stands for "tensor product of elements". Note that the definition respects normal form: provided the inputs are in normal form (the as and bs are in order, and the xs and ys are non-zero), then so is the output.


We can form tensor products of tensor products, such as A⊗(B⊗C) = Vect k (a,(b,c)), and likewise (A⊗B)⊗C = Vect k ((a,b),c). These two are isomorphic as vector spaces. This is obvious if you think about it in the right way. Recall from last week that we can think of elements of A⊗B = Vect k (a,b) as 2-dimensional matrices with rows indexed by a, columns indexed by b, and entries in k. Well A⊗B⊗C (we can drop the parentheses as it makes no difference) is the space of three-dimensional matrices, with one dimension indexed by a, one by b, and one by c.

We can define isomorphisms either way with the following Haskell code:

assocL :: Vect k (Tensor u (Tensor v w)) -> Vect k (Tensor (Tensor u v) w)
assocL = fmap ( \(a,(b,c)) -> ((a,b),c) )

assocR :: Vect k (Tensor (Tensor u v) w) -> Vect k (Tensor u (Tensor v w))
assocR = fmap ( \((a,b),c) -> (a,(b,c)) )

It's clear that these functions are linear, since they're defined using fmap. It's also clear that they are bijections, since they are mutually inverse. Hence they are the required isomorphisms.


Last time we saw that the field k is itself a vector space, which can be represented as the free vector space Vect k (). What happens if we take the tensor product k⊗A of the field with some other vector space A = Vect k a? Well, if you think about it in terms of matrices, Vect k () is a one-dimensional vector space, so Vect k ((),a) will be a 1*n matrix (where n is the number of basis elements in a). But a 1*n matrix looks just the same as an n-vector:

    a1 a2 ...      a1 a2 ...
() ( .  .    ) ~= ( .  .    )

So we should expect that k⊗A = Vect k ((),a) is isomorphic to A = Vect k a. And indeed it is - here are the relevant isomorphisms:

unitInL = fmap ( \a -> ((),a) )

unitOutL = fmap ( \((),a) -> a )

unitInR = fmap ( \a -> (a,()) )

unitOutR = fmap ( \(a,()) -> a )

So tensor product is associative, and has a unit. In other words, vector spaces form a monoid under tensor product.

Tensor product of functions

Given linear functions f: A -> A', g: B -> B', we can define a linear function f⊗g: A⊗B -> A'⊗B' by
(f⊗g)(a⊗b) = f(a)⊗g(b)

Exercise: Prove that f⊗g is linear

Here's the Haskell code:

tf :: (Num k, Ord a', Ord b') => (Vect k a -> Vect k a') -> (Vect k b -> Vect k b')
   -> Vect k (Tensor a b) -> Vect k (Tensor a' b')
tf f g (V ts) = sum [x *> te (f $ return a) (g $ return b) | ((a,b), x) <- ts]
    where sum = foldl add zero

(Mnemonic: "tf" stands for "tensor product of functions".)

Let's just check that this is linear:

prop_Linear_tf ((f,g),k,(a1,a2,b1,b2)) = prop_Linear (linfun f `tf` linfun g) (k, a1 `te` b1, a2 `te` b2)
    where types = (f,g,k,a1,a2,b1,b2) :: (LinFun Q ABasis SBasis, LinFun Q BBasis TBasis, Q,
                                          Vect Q ABasis, Vect Q ABasis, Vect Q BBasis, Vect Q BBasis)

> quickCheck prop_Linear_tf
+++ OK, passed 100 tests.

So we now have tensor product operations on objects and on arrows. In each case, tensor product takes a pair of objects/arrows, and returns a new object/arrow.

There is a product category k-Vect×k-Vect, consisting of pairs of objects and pairs of arrows from k-Vect. The identity arrow is defined to be (id,id), and composition is defined by (f,g) . (f',g') = (f . f', g . g'). Given these definitions, it turns out that tensor product is a functor from k-Vect×k-Vect to k-Vect. (Another way to say this is that tensor product is a bifunctor in the category of vector spaces.)

Recall that a functor is just a map that "commutes" with the category operations, id and . (composition).
So the conditions for tensor product to be a functor are:
id⊗id = id
(f' . f)⊗(g' . g) = (f'⊗g') . (f⊗g)

Both of these follow immediately from the definition of f⊗g that was given above. However, just in case you don't believe me, here's a quickCheck property to prove it:

prop_TensorFunctor ((f1,f2,g1,g2),(a,b)) =
    (id `tf` id) (a `te` b) == id (a `te` b) &&
    ((f' . f) `tf` (g' . g)) (a `te` b) == ((f' `tf` g') . (f `tf` g)) (a `te` b)
    where f = linfun f1
          f' = linfun f2
          g = linfun g1
          g' = linfun g2
          types = (f1,f2,g1,g2,a,b) :: (LinFun Q ABasis ABasis, LinFun Q ABasis ABasis,
                                        LinFun Q BBasis BBasis, LinFun Q BBasis BBasis,
                                        Vect Q ABasis, Vect Q BBasis)

> quickCheck prop_TensorFunctor
+++ OK, passed 100 tests.

We can think of composition as doing things in series, and tensor as doing things in parallel.

Then the second bifunctor condition can be paraphrased as "Doing things in parallel, in series, is the same as doing things in series, in parallel", as represented by the following diagram.

You might recall that there are a couple of Haskell type classes for that kind of thing. The Category typeclass from Control.Category is about doing things in series. Here is the definition:

class Category cat where
    id :: cat a a
    (.) :: cat b c -> cat a b -> cat a c

The Arrow typeclass from Control.Arrow is about doing things in parallel:

class Category arr => Arrow arr where
    arr :: (a -> b) -> arr a b
    first :: arr a b -> arr (a,c) (b,c)
    second :: arr a b -> arr (c,a) (c,b)
    (***) :: arr a b -> arr a' b' -> arr (a,a') (b,b')
    (&&&) :: arr a b -> arr a b' -> arr a (b,b')

Intuitively, linear functions (Vect k a -> Vect k b) are arrows, via the definitions:
id = id
(.) = (.)
arr = fmap
first f = f `tf` id
second f = id `tf` f
f *** g = f `tf` g
(f &&& g) a = (f `tf` g) (a `te` a)

However, in order to define an Arrow instance we'll have to wrap the functions in a newtype.

import Prelude as P
import Control.Category as C
import Control.Arrow

newtype Linear k a b = Linear (Vect k a -> Vect k b)

instance Category (Linear k) where
    id = Linear
    (Linear f) . (Linear g) = Linear (f P.. g)

instance Num k => Arrow (Linear k) where
    arr f = Linear (fmap f)
    first (Linear f) = Linear f *** Linear
    second (Linear f) = Linear *** Linear f
    Linear f *** Linear g = Linear (f `tf2` g)
        where tf2 f g (V ts) = V $ concat
                  [let V us = x *> te (f $ return a) (g $ return b) in us | ((a,b), x) <- ts]
    Linear f &&& Linear g = (Linear f *** Linear g) C.. Linear (\a -> a `te` a)

Note that we can't use tf directly, because it requires Ord instances for a and b, and Haskell doesn't give us a way to require these. For this reason we define a tf2 function, which is equivalent except that it doesn't guarantee that results are in normal form.

There is loads of other stuff I could talk about:
Exercise: Show that direct sum is also a monoid, with the zero vector space as its identity. (Write Haskell functions for the necessary isomorphisms.)
Exercise: Show that tensor product distributes over direct sum - A⊗(B⊕C) is isomorphic to (A⊗B)⊕(A⊗C). (Write the isomorphisms.)
Exercise: Show that given f: A->A', g: B->B', it is possible to define a linear function f⊕g: A⊕B->A'⊕B' by (f⊕g)(a⊕b) = f(a)⊕g(b). (Write a dsumf function analogous to tf.)

There is another arrow related typeclass called ArrowChoice. It represents arrows where you have a choice of doing either one thing or another thing:

class Arrow arr => ArrowChoice arr where
    left :: arr a b -> arr (Either a c) (Either b c)
    right :: arr a b -> arr (Either c a) (Either c b)
    (+++) :: arr a a' -> arr b b' -> arr (Either a b) (Either a' b')
    (|||) :: arr a c -> arr b c -> arr (Either a b) c

Exercise: Show that the dsumf function can be used to give an ArrowChoice instance for linear functions, where the left summand goes down one path and the right summand down another.