Saturday 12 November 2011

New release of HaskellForMaths


I've just uploaded a new version v0.4.1 of HaskellForMaths, containing three new modules and a couple of other improvements. The additions are as follows:

Math.Algebras.Quaternions

This module was already present: it defines the quaternion algebra on the basis {1,i,j,k}, where multiplication is defined by:
i^2 = j^2 = k^2 = ijk = -1

This is enough information to figure out the full multiplication table. For example:
ijk = -1
=> (ijk)k = -k
=> ij(kk) = -k (associativity of multiplication)
=> -ij = -k (since kk = -1)
=> ij = k
It turns out that the basis elements i,j,k anti-commute in pairs, eg ij = -ji, etc.
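
For reference, working through the other products in the same way gives the full multiplication table:

ij = k,  jk = i,  ki = j
ji = -k, kj = -i, ik = -j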

In this release I've added a couple of new things.

First, the quaternions are a division algebra, so I've added a Fractional instance.

Specifically, we can define a conjugation operation on the quaternions (similar to complex conjugation) via
conj (w+xi+yj+zk) = w-xi-yj-zk
Then we can define a quadratic norm via
sqnorm q = q * conj q = w^2+x^2+y^2+z^2
Since the norm is always a scalar, we can define a multiplicative inverse by
q^-1 = conj q / sqnorm q

For example:

$ cabal install HaskellForMaths
$ ghci
> :m Math.Algebras.Quaternions
> (2*i+3*j)^-1 :: Quaternion Q
-2/13i-3/13j

(If you leave out the type annotation, you'll be working in Quaternion Double.)
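
If you want to see the formulas above spelled out concretely, here's a minimal standalone sketch (independent of the library), representing a quaternion by its four coefficients:

type Quat = (Double, Double, Double, Double)   -- (w,x,y,z) representing w+xi+yj+zk

conjQ :: Quat -> Quat
conjQ (w,x,y,z) = (w, -x, -y, -z)

sqnormQ :: Quat -> Double
sqnormQ (w,x,y,z) = w*w + x*x + y*y + z*z

invQ :: Quat -> Quat
invQ q = let (w,x,y,z) = conjQ q
             n = sqnormQ q
         in (w/n, x/n, y/n, z/n)

For example, invQ (0,2,3,0) gives (0.0, -0.1538..., -0.2307..., 0.0), ie -2/13 i - 3/13 j, matching the library result above.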

Second, the quaternions have an interesting role in 3- and 4-dimensional geometry.

Given any non-zero quaternion q, the map x -> q^-1 x q turns out to be a rotation of the 3-dimensional space spanned by {i,j,k}. To multiply rotations together (ie do one then another), just multiply the quaternions. This turns out to be a better way to represent rotations than 3*3 matrices:
- It's more compact - four scalars rather than nine
- They're faster to multiply - 16 scalar multiplications versus 27
- It's more robust against rounding error - whatever quaternion you end up with will still represent a rotation, whereas a sequence of matrix multiplications of rotations might not be quite a rotation any more, due to rounding error.

If you're curious, the function reprSO3 converts a quaternion to the corresponding 3*3 matrix:

> reprSO3 (1+2*i) :: [[Q]]
[[1,0,0],[0,-3/5,-4/5],[0,4/5,-3/5]]

(Exercise: Figure out why we got this matrix.)

Quaternions can also be used to represent rotations of 4-dimensional space - see the documentation.


Math.Algebras.Octonions

This is a new module, providing an implementation of the 8-dimensional non-associative division algebra of octonions. I follow Conway's notation [1], so the octonions have basis {1,i0,i1,i2,i3,i4,i5,i6}, with multiplication defined by:
i_n * i_n = -1, for n in [0..6]
i_(n+1) * i_(n+2) = i_(n+4), where the indices are taken modulo 7.
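
For example, taking n = 0 gives i1 * i2 = i4, and taking n = 3 gives i4 * i5 = i0 (since 3 + 4 = 7 ≡ 0 mod 7).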

The octonions are not associative, but they are an inverse loop, so they satisfy x^-1(xy) = y = (yx)x^-1. This is enough to enable us to deduce the full multiplication table from the relations above.

Like the quaternions, the octonions have conjugation and a norm, and multiplicative inverses:

> :l Math.Algebras.Octonions
> (2+i0+2*i3)^-1
2/9-1/9i0-2/9i3

The octonions are an exceptional object in mathematics: there's nothing else quite like them. They can be used to construct various other exceptional objects, such as the root lattice E8, or the Lie group G2. Hopefully I'll be able to cover some of that stuff in a future installment.

[1] Conway and Smith, On Quaternions and Octonions


Math.NumberTheory.Prime

The main function in this module is isPrime :: Integer -> Bool, which tells you whether a number is prime or not. It's implemented using the Miller-Rabin test.

The basic idea of the test is:
- If p is prime, then Zp is a field
- In a field, the equation x^2 = 1 has only two solutions, 1 and -1
- Given an arbitrary b coprime to p, we know from Fermat's little theorem that b^(p-1) = 1 (mod p)
- So if p-1 = q * 2^s, with q odd, then either b^q = 1 (mod p), or there is some r, 0 <= r < s with b^(q*2^r) = -1 (mod p)

The idea of the algorithm is to try to show that p isn't prime by trying to find a b where the above is not true. We take several different values of b at random, and repeatedly square b^q, to see whether we get -1 or not.
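
To make the idea concrete, here is a minimal standalone sketch of the test (not the library's implementation - the library picks its witnesses b at random, which is what gives the probabilistic error bound discussed below; this sketch just uses a fixed list of small witnesses):

-- modular exponentiation by repeated squaring
powMod :: Integer -> Integer -> Integer -> Integer
powMod m b e
    | e == 0    = 1
    | even e    = let h = powMod m b (e `div` 2) in (h * h) `mod` m
    | otherwise = (b * powMod m b (e - 1)) `mod` m

-- write n as q * 2^s with q odd
splitOdd :: Integer -> (Integer, Int)
splitOdd n = go n 0
    where go q s | even q    = go (q `div` 2) (s + 1)
                 | otherwise = (q, s)

-- one round with witness b: True means "no evidence that n is composite"
millerRabinRound :: Integer -> Integer -> Bool
millerRabinRound n b =
    let (q, s) = splitOdd (n - 1)
        x      = powMod n b q
    in x == 1 || (n - 1) `elem` take s (iterate (\y -> (y * y) `mod` n) x)

isPrimeSketch :: Integer -> Bool
isPrimeSketch n
    | n < 2     = False
    | n < 4     = True
    | even n    = False
    | otherwise = all (millerRabinRound n) (takeWhile (< n) [2,3,5,7,11,13,17,19,23,29,31,37])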

The advantage of the Miller-Rabin test, as compared to trial division say, is that it has a fast running time even for very large numbers. For example:

> :m Math.NumberTheory.Prime
> :set +s
> nextPrime $ 10^100
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267
(0.09 secs, 14904632 bytes)

The potential disadvantage of the Miller-Rabin test is that it is probabilistic: there is a very small chance (1 in 10^15 in this implementation) that it could just fail to hit on a b which disproves n's primality, so that it would say n is prime when it isn't. In practice, at those odds it's not worth worrying about.


Math.NumberTheory.Factor

The main function in this module is pfactors :: Integer -> [Integer], which returns the prime factors of a number (with multiplicity). It uses trial division to try to find prime factors less than 10000. After that, it uses the elliptic curve method to try to split what remains. The elliptic curve method relies on some quite advanced maths, but the basic idea is this:
- If p is a prime, then Zp is a field
- Given a field, we can do "arithmetic on elliptic curves" over the field.
- So to factor n, pretend that n is prime, try doing arithmetic on elliptic curves, and wait till something goes wrong.
- It turns out that if we look at what went wrong, we can figure out a non-trivial factor of n
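
The mechanism by which "something goes wrong" can be illustrated without any elliptic curves. Curve arithmetic modulo n requires taking inverses modulo n; inverting x mod n fails exactly when gcd(x,n) > 1, and that gcd then divides n. Here's a small standalone sketch (not the library's code) of that step:

-- extended gcd: returns (g, a, b) with a*x + b*y == g
extGcd :: Integer -> Integer -> (Integer, Integer, Integer)
extGcd x 0 = (x, 1, 0)
extGcd x y = let (q, r) = x `divMod` y
                 (g, a, b) = extGcd y r
             in (g, b, a - q*b)

-- either the inverse of x mod n, or a divisor of n (non-trivial unless n divides x)
inverseOrFactor :: Integer -> Integer -> Either Integer Integer
inverseOrFactor n x = case extGcd (x `mod` n) n of
    (1, a, _) -> Right (a `mod` n)   -- the inverse exists
    (g, _, _) -> Left g              -- inversion failed: g divides n

For example, inverseOrFactor 91 2 is Right 46 (since 2*46 = 92 ≡ 1 mod 91), while inverseOrFactor 91 35 is Left 7, exposing the factor 7 of 91.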

Here it is in action:

> :m Math.NumberTheory.Factor
> pfactors $ 10^30+5
[3,5,4723,1399606163,10085210079364883]
(0.55 secs, 210033624 bytes)
> pfactors $ 10^30+6
[2,7,3919,758405810021,24032284101871]
(2.31 secs, 883504748 bytes)

I love the way it can crunch through 12-digit prime factors with relative ease.

Sunday 18 September 2011

Commutative Algebra and Algebraic Geometry


Last time we saw how to create variables for use in polynomial arithmetic. This time I want to look at some of the things we can do next.

First, let's define the variables we are going to use:

> :l Math.CommutativeAlgebra.GroebnerBasis
> let [t,u,v,x,y,z,x',y',z'] = map glexvar ["t","u","v","x","y","z","x'","y'","z'"]


So now we can do arithmetic in the polynomial ring Q[t,u,v,x,y,z,x',y',z']. For example:

> (x+y)^2
x^2+2xy+y^2


The branch of mathematics dealing with the theory of polynomial rings is called commutative algebra, and it was "invented" mainly in support of algebraic geometry. Algebraic geometry is roughly the study of the curves, surfaces, etc that arise as the solution sets of polynomial equations. For example, the solution-set of the equation x^2+y^2=1 is the unit circle.

If we are given any polynomial equation f = g, then we can rewrite it more conveniently as f-g = 0. In other words, we only need to track individual polynomials, rather than pairs of polynomials. Call the solution set of f = 0 the zero-set of f.

Sometimes we're interested in the intersection of two or more curves, surfaces, etc. For example, it is well known that the hyperbola, parabola and ellipse all arise as "conic sections" - that is, as the intersection of a cone with a plane. So define the zero-set of a collection (or system) of polynomials to be the set of points which are zeros of all the polynomials simultaneously. For example, the zero-set of the system [x^2+y^2-z^2, z-1] is the unit circle x^2+y^2=1 situated on the plane z=1.


Okay, so how can commutative algebra help us to investigate curves and surfaces? Is there a way for us to "do geometry by doing algebra"? Well, first, what does "doing geometry" consist of? At least some of the following:
- Looking at the shapes of curves and surfaces
- Looking at intersections, unions, differences and products of curves and surfaces
- Looking at when curves or surfaces can be mapped into or onto other curves or surfaces
- Looking at when two different curves or surfaces are equivalent, in some sense (for example, topologically equivalent)

(That phrase "curves and surfaces" is not only clumsy but also inaccurate, so from now on I'll use the proper term, "variety", for the zero-set of a system of polynomials, whether it's a set of isolated points, a curve, a surface, some higher dimensional thing, or a combination of some of the preceding.)

So can we do all those things using algebra? Well, let's have a go.

Let's start by looking at intersections and unions of varieties (remember, that's just the fancy name for curves, surfaces, etc.).

Well, we've already seen how to do intersections. If a variety V1 is defined by a system of polynomials [f1...fm], and a variety V2 is defined by [g1...gn], then their intersection is defined by the system [f1...fm,g1...gn] - the zero-set of both sets of polynomials simultaneously. We'll call this the "sum" of the systems of polynomials. (Note to the cognoscenti: yes, I'm really talking about ideals here.)

sumI fs gs = gb (fs ++ gs)

Don't worry too much about what that "gb" (Groebner basis) call is doing. Let's just say that it's choosing the best way to represent the system of polynomials. For example:

> sumI [x^2+y^2-z^2] [z-1]
[x^2+y^2-1,z-1]


Notice how the gb call has caused the first polynomial to be simplified slightly. The same variety might arise as the zero-set of many different systems of polynomials. That's something that we're going to need to look into - but later.

Okay, so what about unions of varieties? Suppose V1 is defined by [f1...fm], and V2 is defined by [g1...gn]. A point in their union is in either V1 or V2, so it is in the zero-set of either [f1...fm] or [g1...gn]. So how about multiplying the polynomials together in pairs? That is, let's look at the system [fi*gj | fi <- fs, gj <- gs]. Call the zero-set of this system V. Then clearly, any point in either V1 or V2 is in V, since we know that either all the fs or all the gs vanish at that point, and hence so do all the products. Conversely, suppose that some point is not in the union of V1 and V2. Then there must exist some fi and some gj which are non-zero at that point. Hence the product fi*gj is non-zero at that point, so the point is not in V.

This construction is called, naturally enough, the product of the systems of polynomials.

productI fs gs = gb [f * g | f <- fs, g <- gs]


> productI [x^2+y^2-z^2] [z-1]
[x^2z+y^2z-z^3-x^2-y^2+z^2]


Just in case you're still a little unsure, let's confirm that a few arbitrary points in the union are in the zero-set of this polynomial:

> eval (x^2*z+y^2*z-z^3-x^2-y^2+z^2) [(x,100),(y,-100),(z,1)]
0
> eval (x^2*z+y^2*z-z^3-x^2-y^2+z^2) [(x,3),(y,4),(z,5)]
0


The first expression evaluates the polynomial at the point (100,-100,1), an arbitrary point on the plane z=1. The second evaluates at (3,4,5), an arbitrary point on the cone x^2+y^2=z^2. Both points are in the zero-set of our product polynomial.
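
As a small convenience - a hypothetical helper, not part of the library - one could wrap eval to test whether a point lies in the zero-set of a whole system:

inZeroSet fs point = all (\f -> eval f point == 0) fs

For instance, inZeroSet [x^2+y^2-z^2, z-1] [(x,3),(y,4),(z,5)] should come out False, since (3,4,5) lies on the cone but not on the plane z=1, whereas the same point is in the zero-set of the product system above.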

Since we're in the neighbourhood, let's have a look at the other conic sections. First, let's rotate our coordinate system by 45 degrees, using the substitution x'=x+z, z'=z-x. (Okay, so this also scales - to save us having to handle a sqrt 2 factor.)



> let cone' = subst (x^2+y^2-z^2) [(x,(x'-z')/2),(y,y'),(z,(x'+z')/2)]
> cone'
-x'z'+y'^2


In these coordinates, the intersection of the cone with the plane z'=1 is the parabola x'=y'^2:

> sumI [cone'] [z'-1]
[y'^2-x',z'-1]


Alternatively, the intersection with the plane y'=1 is the hyperbola x'z'=1:

> sumI [cone'] [y'-1]
[x'z'-1,y'-1]


Okay, so we've made a start on seeing how to do geometry by doing algebra, by looking at unions and intersections of varieties. There's still plenty more to do. We mustn't forget that we have some unfinished business: we need to understand when different polynomial systems can define the same variety, and in what sense the gb (Groebner basis) function finds the "best" representation. That will have to wait for another time.

Incidentally, for the eval and subst functions that I used above, you will need to take the new release HaskellForMaths v0.4.0. In this release I also removed the older commutative algebra modules, so I revved the minor version number.

Sunday 4 September 2011

Commutative Algebra in Haskell, part 1


Once again, it's been a little while since my last post, and once again, my excuse is partly that I've been too busy writing code.

I've just uploaded a new release, HaskellForMaths 0.3.4, which contains the following new modules:

Math.Core.Utils - this is a collection of utility functions used throughout the rest of the library. I've belatedly decided that it's better to put them all in one place rather than scattered here and there throughout other modules.

Math.Core.Field - this provides new, more efficient implementations of several finite fields. There already were implementations of these finite fields, in the Math.Algebra.Field.Base and ...Extension modules, as discussed here and here. However, that code was written to make the maths clear, rather than for speed. This new module is about speed. For the prime power fields in particular (eg F4, F8, F9), these implementations are significantly faster.

Math.Combinatorics.Matroid - Matroids are a kind of combinatorial abstraction of the concept of linear independence. They're something that I heard about years ago - both of my favourite combinatorics books have brief introductions - but I never bothered to follow up. Well anyway, so something finally piqued my curiosity, and I got Oxley's Matroid Theory. It turned out to be really interesting stuff, and this module is pretty much a translation of a large part of that book into Haskell code, written as I taught myself all about matroids.

Math.CommutativeAlgebra.Polynomial - Although I hadn't yet got around to discussing them in the blog, HaskellForMaths has always had modules for working with multivariate polynomials, namely Math.Algebra.Commutative.Monomial and ...MPoly. However, this was some of the earliest code I wrote, before my more recent free vector space and algebra code. So I saw an opportunity to simplify and improve this code, by building it on top of the free vector space code. Also, I'm trying to rationalise the module naming convention in HaskellForMaths, to more closely follow the categories used in arxiv.org or mathoverflow.net. In the long run, I expect this module to supersede the older modules.

Math.CommutativeAlgebra.GroebnerBasis - Again, there was already code for Groebner bases in Math.Algebra.Commutative.GBasis. This is pretty much the same code, ported to the new polynomial implementation, but I've also begun to build on this, with code to find the sum, product, intersection, and quotient of ideals.


So the matroid code was just new code that I wrote while teaching myself some new maths. But most of the other code comes from an ambition to organise and simplify the HaskellForMaths library. I've also been trying to improve the documentation.

My ultimate ambition is to get more people using the library. To do that, the structure of the library needs to be clearer, the documentation needs to be better, and I need to explain how to use it. So I thought I'd start by explaining how to use the new commutative algebra modules.

(So this is a bit of a digression from the series on quantum algebra that I've been doing the last few months. However, in terms of the cumulative nature of maths, it's probably better to do this first.)


Okay, so suppose we want to do some polynomial arithmetic. Well, first we need to create some variables to work with. How do we do that?

First, decide on a monomial ordering - that is, we need to decide in what order monomials are to be listed within a polynomial. For the moment, let's use "graded lexicographic" or Glex order. This says that you should put monomials of higher degree before those of lower degree (eg y^3 before x^2), and if two monomials have the same degree, you should use lexicographic (dictionary) order (eg xyz before y^3).

Next, decide on a field to work over. Most often, we'll want to work over Q, the rationals.

Then, our variables themselves can be of any Haskell type - but there are usually only two sensible choices:

The easiest way is to use String as the type for our variables.

Then we could make some variables like this:

> :l Math.CommutativeAlgebra.Polynomial
> let [x,y,z] = map glexvar ["x","y","z"]

And then we can do polynomial arithmetic:

> (x+y+z)^3
x^3+3x^2y+3x^2z+3xy^2+6xyz+3xz^2+y^3+3y^2z+3yz^2+z^3


If we want to use any other field besides Q, then we will have to use a type annotation to tell the compiler which field we're working over:

> let [x,y,z] = map var ["x","y","z"] :: [GlexPoly F3 String]
> (x+y+z)^3
x^3+y^3+z^3


The alternative to using String for our variables is to define our own type. For example


data Var = X | Y | Z | W deriving (Eq,Ord)

instance Show Var where
    show X = "x"
    show Y = "y"
    show Z = "z"
    show W = "w"

[x,y,z,w] = map glexvar [X,Y,Z,W]


So there you have it - now you can do polynomial arithmetic in Haskell.

So how does it work?

Well, fundamentally, k-polynomials are a free k-vector space on the basis of monomials. So we define a type to implement monomials:


data MonImpl v = M Int [(v,Int)] deriving (Eq)
-- The initial Int is the degree of the monomial. Storing it speeds up equality tests and comparisons

instance Show v => Show (MonImpl v) where
    show (M _ []) = "1"
    show (M _ xis) = concatMap (\(x,i) -> if i==1 then showVar x else showVar x ++ "^" ++ show i) xis
        where showVar x = filter ( /= '"' ) (show x) -- in case v == String

Notice that our monomial implementation is polymorphic in v, the type of the variables.

Next, monomials form a monoid, so we make them an instance of Mon (the HaskellForMaths class for monoids):


instance (Ord v) => Mon (MonImpl v) where
    munit = M 0 []
    mmult (M si xis) (M sj yjs) = M (si+sj) $ addmerge xis yjs

In principle, all we need to do now is define an Ord instance, and then an Algebra instance, using the monoid algebra construction.

However, for reasons that will become clear in future postings, we want to be able to work with various different orderings on monomials, such as Lex, Glex, or Grevlex. So we provide various newtype wrappers around this basic monomial implementation. Here's the code for the Glex ordering that we used above:


newtype Glex v = Glex (MonImpl v) deriving (Eq, Mon) -- GeneralizedNewtypeDeriving

instance Show v => Show (Glex v) where
    show (Glex m) = show m

instance Ord v => Ord (Glex v) where
    compare (Glex (M si xis)) (Glex (M sj yjs)) =
        compare (-si, [(x,-i) | (x,i) <- xis]) (-sj, [(y,-j) | (y,j) <- yjs])

type GlexPoly k v = Vect k (Glex v)

glexvar :: v -> GlexPoly Q v
glexvar v = return $ Glex $ M 1 [(v,1)]

instance (Num k, Ord v, Show v) => Algebra k (Glex v) where
    unit x = x *> return munit
    mult xy = nf $ fmap (\(a,b) -> a `mmult` b) xy

We also have similar newtypes for Lex and Grevlex orderings, which I'll discuss another time.

And that's pretty much it. Now that we have an instance of Algebra k (Glex v), we get a Num instance for free, so we get +, -, *, and fromInteger. That means we can enter expressions like the following:

> (2*x^2-y*z)^2
4x^4-4x^2yz+y^2z^2

Note that division is not supported: you can't write x/y, for example. However, as a convenience, I have defined a partial instance of Fractional, which does let you divide by scalars. That means that it's okay to write x/2, for example.

Next time, some more things you can do with commutative algebra.

Sunday 10 July 2011

The Tensor Algebra Monad

It's been a little while since my last post. That's partly because I've been busy writing new code. I've put up a new release, HaskellForMaths 0.3.3, which contains three new modules:
- Math.Combinatorics.Digraph
- Math.Combinatorics.Poset
- Math.Combinatorics.IncidenceAlgebra

I'll go through their contents at some point, but this time I want to talk about the tensor algebra.

So recall that previously we defined the free vector space over a type, tensor products, algebras and coalgebras in Haskell code.

In HaskellForMaths, we always work with the free vector space over a type: that means, we take some type b as a basis, and form k-linear combinations of elements of b. This construction is represented by the type Vect k b.

Given two vector spaces A = Vect k a, B = Vect k b, we can form their tensor product A⊗B = Vect k (Tensor a b). So Tensor is a type constructor on basis types, which takes basis types a, b for vector spaces A, B, and returns a basis type for the tensor product A⊗B.

We also defined a type constructor DSum, which returns a basis type for the direct sum A⊕B.

Now, we saw that tensor product is a monoid (at the type level, up to isomorphism):
- it is associative: (A⊗B)⊗C is isomorphic to A⊗(B⊗C)
- it has a unit: the field k itself is an identity for tensor product, in the sense that k⊗A is isomorphic to A, which is in turn isomorphic to A⊗k

Given some specific vector space V, we can consider the tensor powers of V:
k, V, V⊗V, V⊗V⊗V, ...
(We can omit brackets in V⊗V⊗V because tensor product is associative.)

And indeed we can form their direct sum:
T(V) = k ⊕ V ⊕ V⊗V ⊕ V⊗V⊗V ⊕ ...
(where an element of T(V) is understood to be a finite sum of elements of the tensor powers.)

This is a vector space, since tensor products and direct sums are vector spaces. If V has a basis e1,e2,e3,..., then a typical element of T(V) might be something like 3 + 5e2 + 2e1⊗e3⊗e1.

Now the interesting thing is that T(V) can be given the structure of an algebra, as follows:
- for the unit, we use the injection of k into the first direct summand
- for the mult, we use tensor product

For example, we would have
e2 * (2 + 3e1 + e4⊗e2) = 2e2 + 3e2⊗e1 + e2⊗e4⊗e2

With this algebra structure, T(V) is called the tensor algebra.


So how should we represent the tensor algebra in HaskellForMaths? Suppose that V is the free vector space Vect k a over some basis type a. (Recall also that the field k itself can be represented as the free vector space Vect k () over the unit type.) Can we use the DSum and Tensor type constructors to build the tensor algebra? Something like:
Vect k (DSum () (DSum a (DSum (Tensor a a) (DSum ...))))

Hmm, that's not going to work - we can't build the whole of what we want that way. (Unless some type system wizard knows otherwise?) So instead of representing the direct sum and tensor product at the type level, we're going to have to do it at the value level. Here's the definition:

data TensorAlgebra a = TA Int [a] deriving (Eq,Ord)

Given the free vector space V = Vect k a over basis type a, then TensorAlgebra a is the basis type for the tensor algebra over a, so that Vect k (TensorAlgebra a) is the tensor algebra T(V). The Int in TA Int [a] tells us which direct summand we're in (ie which tensor power), and the [a] tells us the tensor multiplicands. So for example, e2⊗e1⊗e4 would be represented as TA 3 [e2,e1,e4]. Then Vect k (TensorAlgebra a) consists of k-linear combinations of these basis elements, so it is the vector space T(V) that we are after.

Here's a Show instance:


instance Show a => Show (TensorAlgebra a) where
    show (TA _ []) = "1"
    show (TA _ xs) = filter (/= '"') $ concat $ L.intersperse "*" $ map show xs

It will be helpful to have a vector space basis to work with, so here's one that we used previously:

newtype EBasis = E Int deriving (Eq,Ord)

instance Show EBasis where show (E i) = "e" ++ show i


Then, for example, our Show instance gives us:

> :l Math.Algebras.TensorAlgebra
> return (TA 0 []) <+> return (TA 2 [E 1, E 3])
1+e1*e3


(Recall that the free vector space is a monad, hence our use of return to put a basis element into the vector space.)

So note that in the show output, the "*" is meant to represent tensor product, so this is really 1+e1⊗e3. You can actually get Haskell to output the tensor product symbol - just replace "*" by "\x2297" in the definition of show - however I found that it didn't look too good in the Mac OS X terminal, and I wasn't sure it would work on all OSes.

Ok, how about an Algebra instance? Well, TensorAlgebra a is basically just a slightly frilly version of [a], so it's a monoid, and we can use the monoid algebra construction:


instance Mon (TensorAlgebra a) where
    munit = TA 0 []
    mmult (TA i xs) (TA j ys) = TA (i+j) (xs++ys)

instance (Num k, Ord a) => Algebra k (TensorAlgebra a) where
    unit x = x *> return munit
    mult = nf . fmap (\(a,b) -> a `mmult` b)

So now we can do arithmetic in the tensor algebra:

> let e_ i = return (TA 1 [E i]) :: Vect Q (TensorAlgebra EBasis)
> let e1 = e_ 1; e2 = e_ 2; e3 = e_ 3; e4 = e_ 4
> (e1+e2) * (1+e3*e4)
e1+e2+e1*e3*e4+e2*e3*e4


We've got into the habit of using QuickCheck to check algebraic properties. Let's just check that the tensor algebra, as we've defined it, is an algebra:


instance Arbitrary b => Arbitrary (TensorAlgebra b) where
    arbitrary = do xs <- listOf arbitrary :: Gen [b] -- ScopedTypeVariables
                   let d = length xs
                   return (TA d xs)

prop_Algebra_TensorAlgebra (k,x,y,z) = prop_Algebra (k,x,y,z)
    where types = (k,x,y,z) :: ( Q, Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis) )

> quickCheck prop_Algebra_TensorAlgebra
+++ OK, passed 100 tests.



Ok, so what's so special about the tensor algebra? Well, it has a rather nice universal property:
Suppose A = Vect k a, B = Vect k b are vector spaces, and we have a linear map f : A -> B. Suppose that B is also an algebra. Then we can "lift" f to an algebra morphism f' : T(A) -> B, such that the following diagram commutes.


In other words, f' agrees with f on the copy of A within T(A): f = f' . i, where i is the canonical injection of A into T(A) (the injectTA function below).

Ah, but hold on, I didn't say what an algebra morphism is. Well, it's just the usual: a function which "commutes" with the algebra structure. Specifically, it's a linear map (so that it commutes with the vector space structure), which makes the following diagrams commute:
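
(Written out as equations, the two diagrams say that f commutes with unit and mult:

f . unit == unit
f . mult == mult . (f⊗f)

where the unit and mult on the left-hand sides are those of the source algebra, and those on the right-hand sides are those of the target.)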


So how does this universal property work then? Here's the code:


injectTA :: Num k => Vect k a -> Vect k (TensorAlgebra a)
injectTA = fmap (\a -> TA 1 [a])

liftTA :: (Num k, Ord b, Show b, Algebra k b) =>
     (Vect k a -> Vect k b) -> Vect k (TensorAlgebra a) -> Vect k b
liftTA f = linear (\(TA _ xs) -> product [f (return x) | x <- xs])

In other words, any tensor product u⊗v⊗... is sent to f(u)*f(v)*...

Let's look at an example. Recall that the quaternion algebra H has the basis {1,i,j,k}, with i^2 = j^2 = k^2 = ijk = -1.

> let f = linear (\(E n) -> case n of 1 -> 1+i; 2 -> 1-i; 3 -> j+k; 4 -> j-k; _ -> zerov)
> let f' = liftTA f
> e1*e2
e1*e2
> f' (e1*e2)
2


Recall that we usually define a linear map by linear extension from its action on a basis - that's what the "linear" is doing in the definition of f. It's fairly clear what f' is doing: it's basically just variable substitution. That is, we can consider the basis elements ei as variables, and the tensor algebra as the algebra of non-commutative polynomials in the ei. Then the linear map f assigns a substitution to each basis element, and f' just substitutes and multiplies out in the target algebra. In this case, we have:
e1⊗e2 -> (1+i)*(1-i) = 1-i^2 = 2

We can use QuickCheck to verify that liftTA f is indeed the algebra morphism required by the universal property. Here's a QuickCheck property for an algebra morphism. (We don't bother to check that f is a linear map, since it's almost always clear from the definition. If in doubt, we can test that separately.)


prop_AlgebraMorphism f (k,x,y) =
    (f . unit) k == unit k &&
    (f . mult) (x `te` y) == (mult . (f `tf` f)) (x `te` y)

This is just a transcription of the diagrams into code.

In order to test the universal property, we have to check that liftTA f is an algebra morphism, and that it agrees with f on (the copy of) V (in T(V)):


prop_TensorAlgebra_UniversalProperty (fmatrix,(k,x,y),z) =
    prop_AlgebraMorphism f' (k,x,y) &&
    (f' . injectTA) z == f z
    where f = linfun fmatrix
          f' = liftTA f
          types = (fmatrix,(k,x,y),z) :: (LinFun Q EBasis HBasis,
                                         (Q,Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis)),
                                         Vect Q EBasis)

So the key to this code is the parameter fmatrix, which is an arbitrary (sparse) matrix from Q^n to H, the quaternions, from which we build an arbitrary linear function. Note that the universal property of course implies that we can choose any algebra as the target for f - I just chose the quaternions because they're familiar.

> quickCheck prop_TensorAlgebra_UniversalProperty
+++ OK, passed 100 tests.


With this construction, tensor algebra is in fact a functor from k-Vect to k-Alg. The action on objects is V -> T(V), Vect k a -> Vect k (TensorAlgebra a). But a functor also acts on the arrows of the source category.


How do we get an action on arrows? Well, we can use the universal property to construct one. If we have an arrow f: A -> B, then (injectTA . f) is an arrow A -> T(B). Then we use the universal property to lift to an arrow f': T(A) -> T(B).


Here's the code:


fmapTA :: (Num k, Ord b, Show b) =>
    (Vect k a -> Vect k b) -> Vect k (TensorAlgebra a) -> Vect k (TensorAlgebra b)
fmapTA f = liftTA (injectTA . f)

For example:


newtype ABasis = A Int deriving (Eq,Ord,Show)
newtype BBasis = B Int deriving (Eq,Ord,Show)

> let f = linear (\(A i) -> case i of 1 -> return (B 1) <+> return (B 2);
                                      2 -> return (B 3) <+> return (B 4);
                                      _ -> zerov :: Vect Q BBasis)
> let f' = fmapTA f
> return (TA 2 [A 1, A 2]) :: Vect Q (TensorAlgebra ABasis)
A 1*A 2
> f' it
B 1*B 3+B 1*B 4+B 2*B 3+B 2*B 4

So this is variable substitution again. In this case, as f is just a linear map between vector spaces, we can think of it as something like a change of basis of the underlying space. Then f' shows us how the (non-commutative) polynomials defined over the space transform under the change of basis.

Let's just verify that this is a functor. We have to show:
- That fmapTA f is an algebra morphism (ie it is an arrow in k-Alg)
- That fmapTA commutes with the category structure, ie fmapTA id = id, and fmapTA (g . f) = fmapTA g . fmapTA f.

Here's a QuickCheck property:


prop_Functor_Vect_TensorAlgebra (f,g,k,x,y) =
    prop_AlgebraMorphism (fmapTA f') (k,x,y) &&
    (fmapTA id) x == id x &&
    fmapTA (g' . f') x == (fmapTA g' . fmapTA f') x
    where f' = linfun f
          g' = linfun g
          types = (f,g,k,x,y) :: (LinFun Q ABasis BBasis, LinFun Q BBasis CBasis,
                                  Q, Vect Q (TensorAlgebra ABasis), Vect Q (TensorAlgebra ABasis) )

> quickCheck prop_Functor_Vect_TensorAlgebra
+++ OK, passed 100 tests.

So can we declare a Functor instance? Well no, actually. Haskell only allows us to declare type constructors as Functor instances, whereas what we would want to do is declare the type function (\Vect k a -> Vect k (TensorAlgebra a)) as a Functor, which isn't allowed.


Ok, so we have a functor T: k-Vect -> k-Alg, the tensor algebra functor. We also have a forgetful functor going the other way, k-Alg -> k-Vect, which consists in taking an algebra, and simply forgetting that it is an algebra, and seeing only the vector space structure. (As it does at least remember the vector space structure, perhaps we should call this a semi-forgetful, or merely absent-minded functor.)

The cognoscenti will no doubt have seen what is coming next: we have an adjunction, and hence a monad.

How so? Well, it's obvious from its type signature that injectTA is return. For (>>=) / bind, we can define the following:


bindTA :: (Num k, Ord b, Show b) =>
    Vect k (TensorAlgebra a) -> (Vect k a -> Vect k (TensorAlgebra b)) -> Vect k (TensorAlgebra b)
bindTA = flip liftTA

Note that in addition to flipping the arguments, bindTA also imposes a more restrictive signature than liftTA: the target algebra is constrained to be a tensor algebra.


> let f = linear (\(A i) -> case i of 1 -> return (TA 2 [B 1, B 2]);
                                      2 -> return (TA 1 [B 3]) + return (TA 1 [B 4]);
                                      _ -> zerov :: Vect Q (TensorAlgebra BBasis))
> return (TA 2 [A 1, A 2]) `bindTA` f
B 1*B 2*B 3+B 1*B 2*B 4

So the effect of bind is to feed a non-commutative polynomial through a variable substitution.


Monads are meant to satisfy the following monad laws:
- "Left identity": return a >>= f  ==  f a
- "Right identity": m >>= return  ==  m
- "Associativity": (m >>= f) >>= g  ==  m >>= (\x -> f x >>= g)

As usual, we write a QuickCheck property:


prop_Monad_Vect_TensorAlgebra (fmatrix,gmatrix,a,ta)=
    injectTA a `bindTA` f == f a                                  && -- left identity
    ta `bindTA` injectTA == ta                                    && -- right identity
    (ta `bindTA` f) `bindTA` g == ta `bindTA` (\a -> f a `bindTA` g) -- associativity
    where f = linfun fmatrix
          g = linfun gmatrix
          types = (fmatrix,gmatrix,a,ta) :: (LinFun Q ABasis (TensorAlgebra BBasis),
                                             LinFun Q BBasis (TensorAlgebra CBasis),
                                             Vect Q ABasis, Vect Q (TensorAlgebra ABasis) )

> quickCheck prop_Monad_Vect_TensorAlgebra
+++ OK, passed 100 tests.

Once again, we can't actually declare a Monad instance, because our type function (\Vect k a -> Vect k (TensorAlgebra a)) is not a type constructor.


So, we have a functor, and indeed a monad, T: k-Vect -> k-Alg. Now recall that the free vector space construction (\a -> Vect k a) was itself a functor, indeed a monad, from Set -> k-Vect. What happens if we compose these two functors? Why then of course we get a functor, and a monad, from Set -> k-Alg. In Haskell terms, this is a functor a -> Vect k (TensorAlgebra a).

What does this functor look like? Well, relative to a, Vect k (TensorAlgebra a) is the free algebra on a, consisting of all expressions in which the elements of k and the elements of a are combined using (commutative) addition and (non-commutative) multiplication. In other words, the elements of a can be thought of as variable symbols, and Vect k (TensorAlgebra a) as the algebra of non-commutative polynomials in these variables.

Here's the code:


injectTA' :: Num k => a -> Vect k (TensorAlgebra a)
injectTA' = injectTA . return

liftTA' :: (Num k, Ord b, Show b, Algebra k b) =>
     (a -> Vect k b) -> Vect k (TensorAlgebra a) -> Vect k b
liftTA' = liftTA . linear

fmapTA' :: (Num k, Ord b, Show b) =>
    (a -> b) -> Vect k (TensorAlgebra a) -> Vect k (TensorAlgebra b)
fmapTA' = fmapTA . fmap

bindTA' :: (Num k, Ord b, Show b) =>
    Vect k (TensorAlgebra a) -> (a -> Vect k (TensorAlgebra b)) -> Vect k (TensorAlgebra b)
bindTA' = flip liftTA'

The only one of these which might require a little explanation is liftTA'. This works by applying a universal property twice, as shown by the diagram below: first, the universal property of free vector spaces is used to lift a function a -> Vect k (TensorAlgebra b) to a function Vect k a -> Vect k (TensorAlgebra b); then the universal property of the tensor algebra is used to lift that to a function Vect k (TensorAlgebra a) -> Vect k (TensorAlgebra b).


Here's an example, which shows that in the free algebra as in the tensor algebra, bind corresponds to variable substitution:

> let [t,x,y,z] = map injectTA' ["t","x","y","z"] :: [Vect Q (TensorAlgebra String)]
> let f "x" = 1-t^2; f "y" = 2*t; f "z" = 1+t^2
> (x^2+y^2-z^2) `bindTA'` f
0


Saturday 23 April 2011

What is a Coalgebra?

Last time we saw how to define an algebra structure on a vector space, in terms of category theory. I think perhaps some readers wondered what we gained by using category theory. The answer may be: not much, yet. However, in due course, we would like to understand the connection between quantum algebra and knot theory, and for that, category theory is essential.

This week, I want to look at coalgebras. We already saw, in the case of products and coproducts, how given a structure in category theory, you can define a dual structure by reversing the directions of all the arrows. So it is with algebras and coalgebras.

Recall that an algebra consisted of a k-vector space A together with linear functions
unit :: k -> A
mult :: A⊗A -> A
satisfying two commutative diagrams, associativity and unit.

Well, a coalgebra consists of a k-vector space C together with two linear functions:
counit :: C -> k
comult :: C -> C⊗C
satisfying the following two commutative diagrams:

Coassociativity:

This diagram is actually shorthand for the following diagram:
The isomorphisms at the top are the assocL and assocR isomorphisms that we defined here.

Counit:


These are just the associativity and unit diagrams, but with arrows reversed and relabeled.

Recall that when we say that a diagram commutes, we mean that it doesn't matter which way you follow the arrows, you end up with the same result. In other words, what these diagrams are saying is:

comult⊗id . comult == id⊗comult . comult  (Coassoc)
counit⊗id . comult == unitInL             (Left counit)
id⊗counit . comult == unitInR             (Right counit)

(where unitInL, unitInR are the isomorphisms that we defined here.)


In HaskellForMaths, recall that we work with free vector spaces over a basis type, so the definition comes out slightly different:

module Math.Algebras.Structures where

...

class Coalgebra k c where
    counit :: Vect k c -> k
    comult :: Vect k c -> Vect k (Tensor c c)

What this definition really says is that c is the basis for a coalgebra. As before, we could try using type families to make this look more like the mathematical definition:

type TensorProd k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (Tensor a b)

class Coalgebra2 k c where
    counit2 :: c -> k
    comult2 :: c -> TensorProd k c c

In this definition, c is the coalgebra itself. However, I'm not going to pursue this approach for now.
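
Before looking at where coalgebras come from, here's a minimal sketch of an instance (not from the library): any basis can be given the "diagonal" coalgebra structure, in which comult sends each basis element b to b⊗b, and counit sends it to 1. Using the EBasis type from earlier, and the linear and unwrap helpers that appear below:

instance Num k => Coalgebra k EBasis where
    counit = unwrap . linear counit'
        where counit' _ = return ()
    comult = linear (\b -> return (b,b))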


What do coalgebras look like? Well, they look a bit like algebras would, if you looked at them through the wrong end of the telescope.

More specifically, given any basis b, define a dual basis as follows:

newtype Dual basis = Dual basis deriving (Eq,Ord)

instance Show basis => Show (Dual basis) where
    show (Dual b) = show b ++ "'"

(For those who know what a dual vector space is - this is it. For those who don't, I'll explain in a minute.)

Then, given an Algebra instance on some finite-dimensional basis b, we can define a Coalgebra instance on Dual b as follows:

1. Write out the unit and mult operations in the algebra as matrices.

For example, in the case of the quaternions, we have
unit:

     1 i j k

1 -> 1 0 0 0

mult:

        1  i  j  k

1⊗1 ->  1  0  0  0
1⊗i ->  0  1  0  0
1⊗j ->  0  0  1  0
1⊗k ->  0  0  0  1
i⊗1 ->  0  1  0  0
i⊗i -> -1  0  0  0
i⊗j ->  0  0  0  1
i⊗k ->  0  0 -1  0
j⊗1 ->  0  0  1  0
j⊗i ->  0  0  0 -1
j⊗j -> -1  0  0  0
j⊗k ->  0  1  0  0
k⊗1 ->  0  0  0  1
k⊗i ->  0  0  1  0
k⊗j ->  0 -1  0  0
k⊗k -> -1  0  0  0

2. Then transpose these two matrices, and use them as the definitions for counit and comult, but replacing each basis element by its dual.

So, in the case of the quaternions, we would get
counit:

      1'

1' -> 1
i' -> 0
j' -> 0
k' -> 0

comult:

     1'⊗1' 1'⊗i' 1'⊗j' 1'⊗k' i'⊗1' i'⊗i' i'⊗j' i'⊗k' j'⊗1' j'⊗i' j'⊗j' j'⊗k' k'⊗1' k'⊗i' k'⊗j' k'⊗k'

1' ->  1     0     0     0      0    -1     0     0     0     0    -1     0     0     0     0    -1
i' ->  0     1     0     0      1     0     0     0     0     0     0     1     0     0    -1     0
j' ->  0     0     1     0      0     0     0    -1     1     0     0     0     0     1     0     0
k' ->  0     0     0     1      0     0     1     0     0    -1     0     0     1     0     0     0


In code, we get:

instance Num k => Coalgebra k (Dual HBasis) where
    counit = unwrap . linear counit'
        where counit' (Dual One) = return ()
              counit' _ = zero
    comult = linear comult'
        where comult' (Dual One) = return (Dual One, Dual One) <+>
                  (-1) *> ( return (Dual I, Dual I) <+> return (Dual J, Dual J) <+> return (Dual K, Dual K) )
              comult' (Dual I) = return (Dual One, Dual I) <+> return (Dual I, Dual One) <+>
                  return (Dual J, Dual K) <+> (-1) *> return (Dual K, Dual J)
              comult' (Dual J) = return (Dual One, Dual J) <+> return (Dual J, Dual One) <+>
                  return (Dual K, Dual I) <+> (-1) *> return (Dual I, Dual K)
              comult' (Dual K) = return (Dual One, Dual K) <+> return (Dual K, Dual One) <+>
                  return (Dual I, Dual J) <+> (-1) *> return (Dual J, Dual I)

unwrap :: Num k => Vect k () -> k
unwrap (V []) = 0
unwrap (V [( (),x)]) = x

(Recall that when we want to think of k as a vector space, we have to represent it as Vect k ().)


We should check that this does indeed define a coalgebra. It's clear, by definition, that counit and comult are linear, as required. Here's a quickcheck property for coassociativity and counit:

prop_Coalgebra x =
    ((comult `tf` id) . comult) x == (assocL . (id `tf` comult) . comult) x  && -- coassociativity
    ((counit' `tf` id) . comult) x == unitInL x                              && -- left counit
    ((id `tf` counit') . comult) x == unitInR x                                 -- right counit
        where counit' = wrap . counit

wrap :: Num k => k -> Vect k ()
wrap 0 = zero
wrap x = V [( (),x)]


(It's a bit awkward that we have to keep wrapping and unwrapping between k and Vect k (). I think we could have avoided this if we had defined counit to have signature Vect k c -> Vect k () instead of Vect k c -> k. To be honest, I think that is probably the right thing to do, so perhaps I'll change it in some future release of HaskellForMaths. What does anyone else think?)


Anyway, here's a quickCheck property to test that the dual quaternions are a coalgebra:

instance Arbitrary b => Arbitrary (Dual b) where
    arbitrary = fmap Dual arbitrary

prop_Coalgebra_DualQuaternion x = prop_Coalgebra x
    where types = x :: Vect Q (Dual HBasis)

> quickCheck prop_Coalgebra_DualQuaternion
+++ OK, passed 100 tests.


So, given an algebra structure on some vector space (basis), we can define a coalgebra structure on the dual vector space (dual basis). But why did we use the dual? Why didn't we just use the above construction to define a coalgebra structure on the vector space itself? For example, for the quaternions, that would look like this:

instance Num k => Coalgebra k HBasis where
    counit = unwrap . linear counit'
        where counit' One = return ()
              counit' _ = zero
    comult = linear comult'
        where comult' One = return (One,One) <+> (-1) *> ( return (I,I) <+> return (J,J) <+> return (K,K) )
              comult' I = return (One,I) <+> return (I,One) <+> return (J,K) <+> (-1) *> return (K,J)
              comult' J = return (One,J) <+> return (J,One) <+> return (K,I) <+> (-1) *> return (I,K)
              comult' K = return (One,K) <+> return (K,One) <+> return (I,J) <+> (-1) *> return (J,I)

Well, we could indeed do that. However, it obscures the underlying mathematics. The point is that an algebra structure on a finite-dimensional vector space gives rise "naturally" to a coalgebra structure on the dual space. It is then also true that a finite-dimensional vector space is naturally isomorphic to its dual, which is I guess why the second construction works.

Okay, so why is the coalgebra structure on the dual vector space "natural"?

Well first, what is a dual vector space anyway? Given a vector space V, the dual space V* is Hom(V,k), the space of linear maps from V to k. If V is finite-dimensional with basis b, then as we have seen, we can define a basis Dual b for V* which is in one-to-one correspondence with b. Given a basis element ei, then the dual basis element Dual ei represents the linear map that sends ei to 1, and any other basis element ej, j /= i, to 0.

It is convenient to define a linear map called the evaluation map, ev: V*⊗V -> k, with ev(a⊗x) = a(x):

ev :: (Num k, Ord b) => Vect k (Tensor (Dual b) b) -> k
ev = unwrap . linear (\(Dual bi, bj) -> delta bi bj *> return ())

-- where delta i j = if i == j then 1 else 0


Then given an element a in V* = Vect k (Dual b), and an element x in V = Vect k b, we can evaluate a(x) by calling ev (a `te` x).

For example:

dual = fmap Dual

> ev $ dual e1 `te` (4 *> e1 <+> 5 *> e2)
4
> ev $ dual e2 `te` (4 *> e1 <+> 5 *> e2)
5
> ev $ dual (e1 <+> 2 *> e2) `te` (4 *> e1 <+> 5 *> e2)
14


Provided V is finite-dimensional, then every element of V* can be expressed in the form dual v, for some v in V.

If we want to turn an element of Vect k (Dual b) into a real Haskell function Vect k b -> k, we can use the following code:

reify :: (Num k, Ord b) => Vect k (Dual b) -> (Vect k b -> k)
reify a x = ev (a `te` x)


For example:

> let f = reify (dual e2)
> f (4 *> e1 <+> 5 *> e2)
5



Now, suppose that we have a linear map f: U -> V between vector spaces. This gives rise to a linear map f*: V* -> U*, by defining:
ev (f* a ⊗ x) = ev (a ⊗ f x)
It turns out that the matrix for f* will be the transpose of the matrix for f.

For, suppose f(ui) = sum [mij vj | j <- ...] and f*(vi*) = sum [m*ij uj* | j <- ...]. Then

ev (f* vk* ⊗ ui) = ev (vk* ⊗ f ui)                                           -- definition of f*
=> ev (sum [m*kl ul* | l <- ...] ⊗ ui) = ev (vk* ⊗ sum [mij vj | j <- ...])  -- expanding f and f*
=> sum [m*kl ev(ul* ⊗ ui) | l <- ...] = sum [mij ev (vk* ⊗ vj) | j <- ...]   -- linearity of ev
=> sum [m*kl (delta l i) | l <- ...] = sum [mij (delta k j) | j <- ...]       -- definition of ev
=> m*ki = mik                                                                 -- definition of delta

So m* is the transpose of m.

Hence we have a contravariant functor * from the category of finite-dimensional vector spaces to itself, which takes a space V to the dual space V*, and a linear map f: U -> V to the dual map f*: V* -> U*. ("Contravariant" just means that it reverses the directions of arrows.)

Hopefully this explains why it is natural to think of an algebra structure on V as giving rise to a coalgebra structure on V*: a coalgebra is basically an algebra but with arrows reversed - and in going from vector spaces to their duals, we reverse arrows.

By the way, the converse is also true: A coalgebra structure on V gives rise to an algebra structure on V*, in the same way.

Saturday 16 April 2011

What is an Algebra?

Over the last few months, we've spent somewhat longer than I originally expected looking at vector spaces, direct sums and tensor products. I hope you haven't forgotten that the reason we were doing this is because we want to look at quantum algebra, and "quantum groups". What are quantum groups? Well, one thing they are is algebras - so the next thing we need to do is define algebras.

Informally, an algebra is just a vector space which is also a ring (with unit) - or to put it another way, a ring (with unit) which is also a vector space. So a straightforward definition would be, A is an algebra if
(i) A is an additive group (this is required by both vector spaces and rings)
(ii) There is a scalar multiplication smult :: k×A -> A (satisfying some laws, as discussed in a previous post)
(iii) There is a multiplication mult :: A×A -> A, satisfying some laws:
- mult is associative: a(bc) = (ab)c
- mult distributes over addition: a(b+c) = (ab)+(ac), (a+b)c = (ac)+(bc)
(iv) There is a unit :: A, which is an identity for mult: 1a = a = a1

Some examples:
- C is an algebra over R (2-dimensional as a vector space)
- 2×2 matrices over a field k form a k-algebra (4-dimensional)
- polynomials over a field k form a k-algebra (infinite-dimensional)

It would be fairly straightforward to translate these definitions into a Haskell type class as they are. However, we're going to do things slightly differently, for two reasons.

Firstly, we would like to use the language of category theory, and define multiplication and unit in terms of linear maps (arrows in the category of vector spaces). Specifically, we define linear maps:
unit :: k -> A
mult :: A⊗A -> A

We then require that the following diagrams commute:

Associativity:



When we say that this diagram commutes, what it means is that it doesn't matter which way you decide to follow the arrows, the result is the same. Specifically:
mult . mult⊗id == mult . id⊗mult

Unit:



In this case there are two commuting triangles, so we are saying:
mult . unit⊗id == unitOutL
mult . id⊗unit == unitOutR
(where unitOutL, unitOutR are the relevant isomorphisms, which we defined last time.)

But hold on, haven't we forgotten one of the requirements? What about distributivity? Well, notice that the signature of mult was A⊗A -> A, not A×A -> A. A linear map from the tensor product is bilinear in each component (by definition of tensor product) - and bilinearity implies distributivity. So in this version, distributivity is built into the definition of mult. Neat, eh?
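
(To spell it out: bilinearity of the tensor product gives a⊗(b+c) = a⊗b + a⊗c, and applying the linear map mult to both sides yields a(b+c) = ab + ac; the other distributive law follows in the same way from (a+b)⊗c = a⊗c + b⊗c.)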

The second reason our definition will be different is that in HaskellForMaths, all our vector spaces are free vector spaces over some basis type b: V = Vect k b. Consequently, our algebras will be A = Vect k a, where a is a k-basis for the algebra. Because of this, it will turn out to be more natural to express some things in terms of a, rather than A.

With those forewarnings, here's the HaskellForMaths definition of an algebra:


module Math.Algebras.Structures where

import Math.Algebras.VectorSpace
import Math.Algebras.TensorProduct

class Algebra k a where
    unit :: k -> Vect k a
    mult :: Vect k (Tensor a a) -> Vect k a

In this definition, a represents the basis of the algebra, not the algebra itself, which is A = Vect k a. Recall that we defined a type Tensor a b, which is a basis for Vect k a ⊗ Vect k b.


If we wanted to stay a bit closer to the category theory definition, we could try continuing with the type family approach that we looked at last time:


type TensorProd k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (Tensor a b)

class Algebra2 k a where
    unit2 :: k -> a
    mult2 :: TensorProd k a a -> a

In this definition a is the algebra itself. I'm not going to pursue that approach any further here, but if anyone fancies giving it a go, I'd be interested to hear how they get on.


Anyway, as discussed, unit and mult are required to be linear maps. We could check this using QuickCheck, but in practice we will always define unit and mult in such a way that they are clearly linear.

However, we can write a QuickCheck property to check the other requirements:


prop_Algebra (k,x,y,z) =
    (mult . (id `tf` mult)) (x `te` (y `te` z)) ==
         (mult . (mult `tf` id)) ((x `te` y) `te` z)              && -- associativity
    unitOutL (k' `te` x) == (mult . (unit' `tf` id)) (k' `te` x)  && -- left unit
    unitOutR (x `te` k') == (mult . (id `tf` unit')) (x `te` k')     -- right unit
    where k' = k *> return ()

(Recall that when we wish to consider k as a vector space, we represent it as the free vector space Vect k ().)


When we have an algebra, then we have a ring, so we can define a Num instance:


instance (Num k, Eq b, Ord b, Show b, Algebra k b) => Num (Vect k b) where
    x+y = x <+> y
    negate x = neg x
    x*y = mult (x `te` y)
    fromInteger n = unit (fromInteger n)
    abs _ = error "Prelude.Num.abs: inappropriate abstraction"
    signum _ = error "Prelude.Num.signum: inappropriate abstraction"

This means that when we have an algebra, we'll be able to write expressions using the usual arithmetic operators.


Okay, so how about some examples of algebras? Well, we mentioned the complex numbers as an algebra over the reals, but let's go one better and define the quaternion algebra. This is a four dimensional algebra over any field k, which is generated by {1, i, j, k}, satisfying the relations i^2 = j^2 = k^2 = ijk = -1. (It follows, for example, that (ijk)k = (-1)k, so ij(k^2) = -k, so -ij = -k, so ij = k.)

Here's the code. First, we define our basis. The quaternions are traditionally denoted H, after Hamilton, who discovered them:


data HBasis = One | I | J | K deriving (Eq,Ord)

type Quaternion k = Vect k HBasis

i,j,k :: Num k => Quaternion k
i = return I
j = return J
k = return K

instance Show HBasis where
    show One = "1"
    show I = "i"
    show J = "j"
    show K = "k"

instance (Num k) => Algebra k HBasis where
    unit x = x *> return One
    mult = linear m
         where m (One,b) = return b
               m (b,One) = return b
               m (I,I) = unit (-1)
               m (J,J) = unit (-1)
               m (K,K) = unit (-1)
               m (I,J) = return K
               m (J,I) = -1 *> return K
               m (J,K) = return I
               m (K,J) = -1 *> return I
               m (K,I) = return J
               m (I,K) = -1 *> return J

Note that unit and mult are both linear by definition.

Let's just check that the code works as expected:

> :l Math.Algebras.Quaternions
> i^2
-1
> j^2
-1
> i*j
k


Now, are we sure that the quaternions are an algebra? Well, it's clear from the definition that the left and right unit conditions hold - see the lines m (One,b) = m (b,One) = return b. But it's not obvious that the associativity condition holds, so perhaps we should quickCheck:


instance Arbitrary HBasis where
    arbitrary = elements [One,I,J,K]

prop_Algebra_Quaternion (k,x,y,z) = prop_Algebra (k,x,y,z)
    where types = (k,x,y,z) :: (Q, Quaternion Q, Quaternion Q, Quaternion Q)

> quickCheck prop_Algebra_Quaternion
+++ OK, passed 100 tests.


Ok, how about 2*2 matrices. These form a four dimensional algebra generated by the elementary matrices {e11, e12, e21, e22}, where eij is the matrix with a 1 in the (i,j) position, and 0s elsewhere:


data Mat2 = E2 Int Int deriving (Eq,Ord,Show)

instance Num k => Algebra k Mat2 where
   unit x = x *> V [(E2 i i, 1) | i <- [1..2] ]
   mult = linear mult' where
        mult' (E2 i j, E2 k l) = delta j k *> return (E2 i l)

delta i j | i == j    = 1
          | otherwise = 0

Notice the way that we only have to define multiplication on our basis elements, the elementary matrices eij, and the rest follows by bilinearity. Notice also that unit and mult are linear by definition. Let's just sanity check that this works as expected:

> :l Math.Algebras.Matrix
> unit 1 :: Vect Q Mat2
E2 1 1+E2 2 2
> let a = 2 *> return (E2 1 2) + 3 *> return (E2 2 1)
> a^2
6E2 1 1+6E2 2 2


It's straightforward to define an Arbitrary instance for Mat2, and quickCheck that it satisfies the algebra conditions.
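
For example, it might look something like this (a sketch, following the pattern of the quaternion check above):

instance Arbitrary Mat2 where
    arbitrary = elements [E2 i j | i <- [1,2], j <- [1,2]]

prop_Algebra_Mat2 (k,x,y,z) = prop_Algebra (k,x,y,z)
    where types = (k,x,y,z) :: (Q, Vect Q Mat2, Vect Q Mat2, Vect Q Mat2)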


Finally, what about polynomials? Let's go one better, and define polynomials in more than one variable. An obvious basis for these polynomials as a vector space is the set of monomials. For example, polynomials in x,y,z are a vector space on the basis { x^i y^j z^k | i <- [0..], j <- [0..], k <- [0..] }.

Recall that our vector space code requires that the basis be an Ord instance, so we need to define an ordering on monomials. There are many ways to do this. We'll use the graded lex or glex ordering, which says that monomials of higher degree sort before those of lower degree, and among those of equal degree, lexicographic (alphabetical) ordering applies.

Given any set X of variables, we can construct the polynomial algebra over X as the vector space with basis the monomials in X. In the example above, we had X = {x,y,z}. For this reason, we'll allow our glex monomials to be polymorphic in the type of the variables. (In practice though, as you will see shortly, we will often just use String as the type of our variables.)

So a glex monomial over variables v is basically just a list of powers of elements of v:

data GlexMonomial v = Glex Int [(v,Int)] deriving (Eq)
-- The initial Int is the degree of the monomial. Storing it speeds up equality tests and comparisons


For example x^3 y^2 would be represented as Glex 5 [("x",3),("y",2)].


instance Ord v => Ord (GlexMonomial v) where
    compare (Glex si xis) (Glex sj yjs) =
        compare (-si, [(x,-i) | (x,i) <- xis]) (-sj, [(y,-j) | (y,j) <- yjs])
-- all the minus signs are to make things sort in the right order

[There's a bug in the HaskellForMaths v0.3.2 version of this Ord instance - the above code, which fixes it, will be in the next release.]

I won't bore you with the Show instance - it's a bit fiddly.

The Algebra instance uses the fact that monomials form a monoid under multiplication, with unit 1. Given any monoid, we can form the free vector space having the monoid elements as basis, and then lift the unit and multiplication in the monoid into the vector space, thus forming an algebra called the monoid algebra. Here's the construction:


instance (Num k, Ord v) => Algebra k (GlexMonomial v) where
    unit x = x *> return munit
        where munit = Glex 0 []
    mult xy = nf $ fmap (\(a,b) -> a `mmult` b) xy
        where mmult (Glex si xis) (Glex sj yjs) = Glex (si+sj) $ addmerge xis yjs

(The addmerge function merges the two lists of variable powers, adding the exponents of any variable that appears in both; it ensures that if the variables are listed in ascending order in both inputs, they are still in ascending order in the output.)
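The addmerge function itself isn't shown above. A minimal sketch of such a merge (the library's version may differ in detail) is:

-- Sketch of the merge used above: combine two ascending association lists of
-- (variable, exponent) pairs, adding the exponents when a variable occurs in both.
addmerge :: Ord v => [(v,Int)] -> [(v,Int)] -> [(v,Int)]
addmerge ((x,i):xis) ((y,j):yjs) =
    case compare x y of
        LT -> (x,i)   : addmerge xis ((y,j):yjs)
        EQ -> (x,i+j) : addmerge xis yjs
        GT -> (y,j)   : addmerge ((x,i):xis) yjs
addmerge xis [] = xis
addmerge [] yjs = yjs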

Finally, here's a convenience function for injecting a variable into the polynomial algebra:

glexVar v = V [(Glex 1 [(v,1)], 1)]

Then for example, we can do the following:

type GlexPoly k v = Vect k (GlexMonomial v)

> let x = glexVar "x" :: GlexPoly Q String
> let y = glexVar "y" :: GlexPoly Q String
> let z = glexVar "z" :: GlexPoly Q String
> (x+y+z)^3
x^3+3x^2y+3x^2z+3xy^2+6xyz+3xz^2+y^3+3y^2z+3yz^2+z^3


I hope you were impressed at how easy that all was. The foundation of free vector spaces, tensor products and algebras is only around a hundred lines of code. It then takes just another dozen lines or so to define an algebra: just define a basis, and define how the basis elements multiply - the rest follows by linearity.

Friday 18 March 2011

Tensor Products, part 2: Monoids and Arrows

[New release, HaskellForMaths v0.3.2, available on Hackage]

Last time we looked at the tensor product of free vector spaces. Given A = Vect k a, B = Vect k b, then the tensor product A⊗B can be represented as Vect k (a,b). As we saw, the tensor product is the "mother of all bilinear functions".

In the HaskellForMaths library, I have defined a couple of type synonyms for direct sum and tensor product:

type DSum a b = Either a b
type Tensor a b = (a,b)


This means that in type signatures we can write the type of a direct sum as Vect k (DSum a b), and of a tensor product as Vect k (Tensor a b). The idea is that this will remind us what we're dealing with, and make things clearer.

During development, I initially called the tensor type TensorBasis. In maths, tensor product is thought of as an operation on vector spaces - A⊗B - rather than on their bases. It would be nicer if we could define direct sum and tensor product as operators on the vector spaces themselves, rather than their bases.

Well, we can have a go, something like this:

{-# LANGUAGE TypeFamilies, RankNTypes #-}

type DirectSum k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (DSum a b)

type TensorProd k u v =
    (u ~ Vect k a, v ~ Vect k b) => Vect k (Tensor a b)

type En = Vect Q EBasis

This appears to work:

$ ghci -XTypeFamilies
...
> :l Math.Test.TAlgebras.TTensorProduct
...
> e1 `te` e2 :: TensorProd Q En En
(e1,e2)


I'll reserve judgement. (Earlier in the development of the quantum algebra code for HaskellForMaths, I tried something similar to this, and ran into problems later on - but I can't now remember exactly what I did, so perhaps this will work.)


Okay, so what can we do with tensor products? Well first, given vectors u in A = Vect k a and v in B = Vect k b, we can form their tensor product, u⊗v, an element of A⊗B = Vect k (Tensor a b). To calculate u⊗v, we use the bilinearity of tensor product to reduce the tensor product of arbitrary vectors to a linear combination of tensor products of basis elements:
(x1 a1 + x2 a2)⊗(y1 b1 + y2 b2) = x1 y1 a1⊗b1 + x1 y2 a1⊗b2 + x2 y1 a2⊗b1 + x2 y2 a2⊗b2
Here's the code:

te :: Num k => Vect k a -> Vect k b -> Vect k (Tensor a b)
te (V us) (V vs) = V [((a,b), x*y) | (a,x) <- us, (b,y) <- vs]


This is in essence just the "tensor" function from last time, but rewritten to take its two inputs separately rather than in a direct sum. Mnemonic: "te" stands for "tensor product of elements". Note that the definition respects normal form: provided the inputs are in normal form (the as and bs are in order, and the xs and ys are non-zero), then so is the output.
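For example, in ghci (with the EBasis test vectors e1, e2 in scope), we should see something like the following, matching the expansion above with x1 = 1, x2 = 2, y1 = 3, y2 = 1:

> (e1 <+> 2 *> e2) `te` (3 *> e1 <+> e2)
3(e1,e1)+(e1,e2)+6(e2,e1)+2(e2,e2)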


Associativity

We can form tensor products of tensor products, such as A⊗(B⊗C) = Vect k (a,(b,c)), and likewise (A⊗B)⊗C = Vect k ((a,b),c). These two are isomorphic as vector spaces. This is obvious if you think about it in the right way. Recall from last week that we can think of elements of A⊗B = Vect k (a,b) as 2-dimensional matrices with rows indexed by a, columns indexed by b, and entries in k. Well A⊗B⊗C (we can drop the parentheses as it makes no difference) is the space of three-dimensional matrices, with one dimension indexed by a, one by b, and one by c.

We can define isomorphisms either way with the following Haskell code:

assocL :: Vect k (Tensor u (Tensor v w)) -> Vect k (Tensor (Tensor u v) w)
assocL = fmap ( \(a,(b,c)) -> ((a,b),c) )

assocR :: Vect k (Tensor (Tensor u v) w) -> Vect k (Tensor u (Tensor v w))
assocR = fmap ( \((a,b),c) -> (a,(b,c)) )


It's clear that these functions are linear, since they're defined using fmap. It's also clear that they are bijections, since they are mutually inverse. Hence they are the required isomorphisms.
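A quick sanity check in ghci (again assuming the EBasis test vectors are in scope) should look something like this:

> assocL ( e1 `te` (e2 `te` e1) )
((e1,e2),e1)
> assocR ( assocL ( e1 `te` (e2 `te` e1) ) ) == e1 `te` (e2 `te` e1)
True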


Unit

Last time we saw that the field k is itself a vector space, which can be represented as the free vector space Vect k (). What happens if we take the tensor product k⊗A of the field with some other vector space A = Vect k a? Well, if you think about it in terms of matrices, Vect k () is a one-dimensional vector space, so Vect k ((),a) will be a 1*n matrix (where n is the number of basis elements in a). But a 1*n matrix looks just the same as an n-vector:

    a1 a2 ...      a1 a2 ...
() ( .  .    ) ~= ( .  .    )

So we should expect that k⊗A = Vect k ((),a) is isomorphic to A = Vect k a. And indeed it is - here are the relevant isomorphisms:

unitInL = fmap ( \a -> ((),a) )

unitOutL = fmap ( \((),a) -> a )

unitInR = fmap ( \a -> (a,()) )

unitOutR = fmap ( \(a,()) -> a )


So tensor product is associative, and has a unit. In other words, vector spaces form a monoid under tensor product.


Tensor product of functions

Given linear functions f: A -> A', g: B -> B', we can define a linear function f⊗g: A⊗B -> A'⊗B' by
(f⊗g)(a⊗b) = f(a)⊗g(b)

Exercise: Prove that f⊗g is linear.

Here's the Haskell code:

tf :: (Num k, Ord a', Ord b') => (Vect k a -> Vect k a') -> (Vect k b -> Vect k b')
   -> Vect k (Tensor a b) -> Vect k (Tensor a' b')
tf f g (V ts) = sum [x *> te (f $ return a) (g $ return b) | ((a,b), x) <- ts]
    where sum = foldl add zero

(Mnemonic: "tf" stands for "tensor product of functions".)

Let's just check that this is linear:

prop_Linear_tf ((f,g),k,(a1,a2,b1,b2)) = prop_Linear (linfun f `tf` linfun g) (k, a1 `te` b1, a2 `te` b2)
    where types = (f,g,k,a1,a2,b1,b2) :: (LinFun Q ABasis SBasis, LinFun Q BBasis TBasis, Q,
                                          Vect Q ABasis, Vect Q ABasis, Vect Q BBasis, Vect Q BBasis)

> quickCheck prop_Linear_tf
+++ OK, passed 100 tests.


So we now have tensor product operations on objects and on arrows. In each case, tensor product takes a pair of objects/arrows, and returns a new object/arrow.

There is a product category k-Vect×k-Vect, consisting of pairs of objects and pairs of arrows from k-Vect. The identity arrow is defined to be (id,id), and composition is defined by (f,g) . (f',g') = (f . f', g . g'). Given these definitions, it turns out that tensor product is a functor from k-Vect×k-Vect to k-Vect. (Another way to say this is that tensor product is a bifunctor in the category of vector spaces.)

Recall that a functor is just a map that "commutes" with the category operations, id and . (composition).
So the conditions for tensor product to be a functor are:
id⊗id = id
(f' . f)⊗(g' . g) = (f'⊗g') . (f⊗g)

Both of these follow immediately from the definition of f⊗g that was given above. However, just in case you don't believe me, here's a quickCheck property to prove it:

prop_TensorFunctor ((f1,f2,g1,g2),(a,b)) =
    (id `tf` id) (a `te` b) == id (a `te` b) &&
    ((f' . f) `tf` (g' . g)) (a `te` b) == ((f' `tf` g') . (f `tf` g)) (a `te` b)
    where f = linfun f1
          f' = linfun f2
          g = linfun g1
          g' = linfun g2
          types = (f1,f2,g1,g2,a,b) :: (LinFun Q ABasis ABasis, LinFun Q ABasis ABasis,
                                        LinFun Q BBasis BBasis, LinFun Q BBasis BBasis,
                                        Vect Q ABasis, Vect Q BBasis)

> quickCheck prop_TensorFunctor
+++ OK, passed 100 tests.

We can think of composition as doing things in series, and tensor as doing things in parallel.



Then the second bifunctor condition can be paraphrased as "Doing things in parallel, in series, is the same as doing things in series, in parallel", as represented by the following diagram.



You might recall that there are a couple of Haskell type classes for that kind of thing. The Category typeclass from Control.Category is about doing things in series. Here is the definition:

class Category cat where
    id :: cat a a
    (.) :: cat b c -> cat a b -> cat a c

The Arrow typeclass from Control.Arrow is about doing things in parallel:

class Category arr => Arrow arr where
    arr :: (a -> b) -> arr a b
    first :: arr a b -> arr (a,c) (b,c)
    second :: arr a b -> arr (c,a) (c,b)
    (***) :: arr a b -> arr a' b' -> arr (a,a') (b,b')
    (&&&) :: arr a b -> arr a b' -> arr a (b,b')

Intuitively, linear functions (Vect k a -> Vect k b) are arrows, via the definitions:
id = id
(.) = (.)
arr = fmap
first f = f `tf` id
second f = id `tf` f
f *** g = f `tf` g
(f &&& g) a = (f `tf` g) (a `te` a)

However, in order to define an Arrow instance we'll have to wrap the functions in a newtype.

import Prelude as P
import Control.Category as C
import Control.Arrow

newtype Linear k a b = Linear (Vect k a -> Vect k b)

instance Category (Linear k) where
    id = Linear P.id
    (Linear f) . (Linear g) = Linear (f P.. g)

instance Num k => Arrow (Linear k) where
    arr f = Linear (fmap f)
    first (Linear f) = Linear f *** Linear P.id
    second (Linear f) = Linear P.id *** Linear f
    Linear f *** Linear g = Linear (f `tf2` g)
        where tf2 f g (V ts) = V $ concat
                  [let V us = x *> te (f $ return a) (g $ return b) in us | ((a,b), x) <- ts]
    Linear f &&& Linear g = (Linear f *** Linear g) C.. Linear (\a -> a `te` a)

Note that we can't use tf directly, because it requires Ord instances on the target basis types, and the signatures of the Arrow methods don't give us a way to impose those constraints. For this reason we define a tf2 function, which is equivalent except that it doesn't guarantee that results are in normal form.
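To actually apply one of these arrows, we just unwrap the newtype. For example (runLinear and shift are illustrative names, not from the library, and the session assumes the EBasis test vectors are in scope):

-- Illustrative helper, not from the library: apply a Linear arrow to a vector
runLinear :: Linear k a b -> Vect k a -> Vect k b
runLinear (Linear f) = f

> let shift = arr (\(E i) -> E (i+1)) :: Linear Q EBasis EBasis
> runLinear (shift *** shift) (e1 `te` e2)
(e2,e3)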

There's loads of other stuff I could talk about:
Exercise: Show that direct sum is also a monoid, with the zero vector space as its identity. (Write Haskell functions for the necessary isomorphisms.)
Exercise: Show that tensor product distributes over direct sum - A⊗(B⊕C) is isomorphic to (A⊗B)⊕(A⊗C). (Write the isomorphisms.)
Exercise: Show that given f: A->A', g: B->B', it is possible to define a linear function f⊕g: A⊕B->A'⊕B' by (f⊕g)(a⊕b) = f(a)⊕g(b). (Write a dsumf function analogous to tf.)

There is another arrow related typeclass called ArrowChoice. It represents arrows where you have a choice of doing either one thing or another thing:

class Arrow arr => ArrowChoice arr where
    left :: arr a b -> arr (Either a c) (Either b c)
    right :: arr a b -> arr (Either c a) (Either c b)
    (+++) :: arr a a' -> arr b b' -> arr (Either a b) (Either a' b')
    (|||) :: arr a c -> arr b c -> arr (Either a b) c



Exercise: Show that the dsumf function can be used to give an ArrowChoice instance for linear functions, where the left summand goes down one path and the right summand down another.

Monday 21 February 2011

Tensor products of vector spaces, part 1

A little while back on this blog, we defined the free k-vector space over a type b:

newtype Vect k b = V [(b,k)] deriving (Eq,Ord)

Elements of Vect k b are k-linear combinations of elements of b.

Whenever we have a mathematical structure like this, we want to know about building blocks and new-from-old constructions.

We already looked at one new-from-old construction: given free k-vector spaces A = Vect k a and B = Vect k b, we can construct their direct sum A⊕B = Vect k (Either a b).

We saw that the direct sum is both the product and the coproduct in the category of free vector spaces - which means that it is the object which satisfies the universal properties implied by the following two diagrams:


So we have injections i1 : Vect k a -> Vect k (Either a b) and i2 : Vect k b -> Vect k (Either a b), to put elements of A and B into the direct sum A⊕B, and projections p1 : Vect k (Either a b) -> Vect k a and p2 : Vect k (Either a b) -> Vect k b to take them back out again.
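For reference, minimal definitions of these (the library's versions may differ in detail) look like:

-- Sketch definitions, possibly differing from the library's:
i1 = fmap Left                                -- inject A into A⊕B
i2 = fmap Right                               -- inject B into A⊕B
p1 (V ts) = V [(a,x) | (Left  a, x) <- ts]    -- project A⊕B onto A
p2 (V ts) = V [(b,x) | (Right b, x) <- ts]    -- project A⊕B onto B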


However, there is another obvious new-from-old construction: Vect k (a,b). What does this represent?

In order to answer that question, we need to look at bilinear functions. The basic idea of a bilinear function is that it is a function of two arguments, which is linear in each argument. So we might start by looking at functions f :: Vect k a -> Vect k b -> Vect k t.

However, functions of two arguments don't really sit very well in category theory, where arrows are meant to have a single source. (We can handle functions of two arguments in multicategories, but I don't want to go there just yet.) In order to stay within category theory, we need to combine the two arguments into a single argument, using the direct sum construction. So instead of looking at functions f :: Vect k a -> Vect k b -> Vect k t, we will look at functions f :: Vect k (Either a b) -> Vect k t.

To see that they are equivalent, recall from last time that Vect k (Either a b) is isomorphic to (Vect k a, Vect k b), via the isomorphisms:

to :: (Vect k a, Vect k b) -> Vect k (Either a b)
to = \(u,v) -> i1 u <+> i2 v
from :: Vect k (Either a b) -> (Vect k a, Vect k b)
from = \uv -> (p1 uv, p2 uv)

So in going from f :: Vect k a -> Vect k b -> Vect k t to f :: Vect k (Either a b) -> Vect k t, we're really just uncurrying.
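In code, this correspondence is just the following (a sketch; the names are illustrative, not from the library):

-- Sketch: converting between the curried and "direct sum" forms of a
-- two-argument function (illustrative names, not from the library)
uncurryDS f = \uv -> f (p1 uv) (p2 uv)
curryDS f = \u v -> f (i1 u <+> i2 v)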

Ok, so suppose we are given f :: Vect k (Either a b) -> Vect k t. It helps to still think of this as a function of two arguments, even though we've wrapped them up together in either side of a direct sum. Then we say that f is bilinear, if it is linear in each side of the direct sum. That is:
- for any fixed a0 in A, the function f_a0 :: Vect k b -> Vect k t, f_a0 = \b -> f (i1 a0 <+> i2 b) is linear
- for any fixed b0 in B, the function f_b0 :: Vect k a -> Vect k t, f_b0 = \a -> f (i1 a <+> i2 b0) is linear


Here's a QuickCheck property to test whether a function is bilinear:


prop_Bilinear :: (Num k, Ord a, Ord b, Ord t) =>
     (Vect k (Either a b) -> Vect k t) -> (k, Vect k a, Vect k a, Vect k b, Vect k b) -> Bool
prop_Bilinear f (k,a1,a2,b1,b2) =
    prop_Linear (\b -> f (i1 a1 <+> i2 b)) (k,b1,b2) &&
    prop_Linear (\a -> f (i1 a <+> i2 b1)) (k,a1,a2)

prop_BilinearQn f (a,u1,u2,v1,v2) = prop_Bilinear f (a,u1,u2,v1,v2)
    where types = (a,u1,u2,v1,v2) :: (Q, Vect Q EBasis, Vect Q EBasis, Vect Q EBasis, Vect Q EBasis)

What are some examples of bilinear functions?

Well, perhaps the most straightforward is the dot product of vectors. If our vector spaces A and B are the same, then we can define the dot product:


dot0 uv = sum [ if a == b then x*y else 0 | (a,x) <- u, (b,y) <- v]
    where V u = p1 uv
          V v = p2 uv

However, as it stands, this won't pass our QuickCheck property - because it has the wrong type! This has the type dot0 :: Vect k (Either a b) -> k, whereas we need something of type Vect k (Either a b) -> Vect k t.

Now, it is of course true that k is a k-vector space. However, as it stands, it's not a free k-vector space over some basis type t. Luckily, this is only a technicality, which is easily fixed. When we want to consider k as a free vector space in its own right, we take t = (), the unit type, and identify k with Vect k (). Since the type () has only a single inhabitant, the value (), Vect k () consists of just the scalar multiples of () - so it is essentially a single copy of k itself. The isomorphism from k to Vect k () is \k -> k *> return ().
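In code, the two directions of this isomorphism can be sketched as follows (illustrative names, not from the library):

-- Illustrative sketch, not from the library: the isomorphism between k and Vect k ()
wrap x = x *> return ()         -- scalar -> one-dimensional vector
unwrap (V [((),x)]) = x         -- one-dimensional vector -> scalar
unwrap (V [])       = 0         -- the zero vector corresponds to the scalar 0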

Okay, so now that we know how to represent k as a free k-vector space, we can define dot product again:


dot1 uv = nf $ V [( (), if a == b then x*y else 0) | (a,x) <- u, (b,y) <- v]
    where V u = p1 uv
          V v = p2 uv

This now has the type dot1 :: Vect k (Either a b) -> Vect k (). Here's how you use it:

> dot1 ( i1 (e1 <+> 2 *> e2) <+> i2 (3 *> e1 <+> e2) )
5()

(So thinking of our function as a function of two arguments, what we do is use i1 to inject the first argument into the left hand side of the direct sum, and i2 to inject the second argument into the right hand side.)

So we can now use the QuickCheck property:

> quickCheck (prop_BilinearQn dot1)
+++ OK, passed 100 tests.


Another example of a bilinear function is polynomial multiplication. Polynomials of course form a vector space, with basis {x^i | i <- [0..] }. So we could define a type to represent the monomials x^i, and then form the polynomials as the free vector space in the monomials. In a few weeks we will do that, but for the moment, to save time, let's just use our existing EBasis type, and take E i to represent x^i. Then polynomial multiplication is the following function:


polymult1 uv = nf $ V [(E (i+j) , x*y) | (E i,x) <- u, (E j,y) <- v]
    where V u = p1 uv
          V v = p2 uv

Let's just convince ourselves that this is polynomial multiplication:

> polymult1 (i1 (e 0 <+> e 1) <+> i2 (e 0 <+> e 1))
e0+2e1+e2

So this is just our way of saying that (1+x)*(1+x) = 1+2x+x^2.

Again, let's verify that this is bilinear:

> quickCheck (prop_BilinearQn polymult1)
+++ OK, passed 100 tests.


So what's all this got to do with Vect k (a,b)? Well, here's another bilinear function:


tensor :: (Num k, Ord a, Ord b) => Vect k (Either a b) -> Vect k (a, b)
tensor uv = nf $ V [( (a,b), x*y) | (a,x) <- u, (b,y) <- v]
    where V u = p1 uv; V v = p2 uv

> quickCheck (prop_BilinearQn tensor)
+++ OK, passed 100 tests.

So this "tensor" function takes each pair of basis elements a, b in the input to a basis element (a,b) in the output. The thing that is interesting about this bilinear function is that it is in some sense "the mother of all bilinear functions". Specifically, you can specify a bilinear function completely by specifying what happens to each pair (a,b) of basis elements. It follows that any bilinear function f :: Vect k (Either a b) -> Vect k t can be factored as f = f' . tensor, where f' :: Vect k (a,b) -> Vect k t is the linear function having the required action on the basis elements (a,b) of Vect k (a,b).


For example:


bilinear :: (Num k, Ord a, Ord b, Ord c) =>
    ((a, b) -> Vect k c) -> Vect k (Either a b) -> Vect k c
bilinear f = linear f . tensor

dot = bilinear (\(a,b) -> if a == b then return () else zero)

polymult = bilinear (\(E i, E j) -> return (E (i+j)))


We can check that these are indeed the same functions as we were looking at before:

> quickCheck (\x -> dot1 x == dot x)
+++ OK, passed 100 tests.
> quickCheck (\x -> polymult1 x == polymult x)
+++ OK, passed 100 tests.

So Vect k (a,b) has a special role in the theory of bilinear functions. If A = Vect k a, B = Vect k b, then we write A⊗B = Vect k (a,b) (pronounced "A tensor B").



[By the way, it's possible that this diagram might upset category theorists - because the arrows in the diagram are not all arrows in the category of vector spaces. Specifically, note that bilinear maps are not, in general, linear. We'll come back to this in a moment.]

So a bilinear map can be specified by its action on the tensor basis (a,b). This corresponds to writing out matrices. To specify any bilinear map Vect k (Either a b) -> Vect k t, you write out a matrix with rows indexed by a, columns indexed by b, and entries in Vect k t.


      b1  b2 ...
 a1 (t11 t12 ...)
 a2 (t21 t22 ...)
... (...        )

So this says that (ai,bj) is taken to tij. Then, given an element of A⊕B = Vect k (Either a b), which we can think of as a vector (x1 a1 + x2 a2 + ...) in A = Vect k a together with a vector (y1 b1 + y2 b2 + ...) in B = Vect k b, we can calculate its image under the bilinear map by doing matrix multiplication as follows:


 a1 a2 ...        b1  b2 ...
(x1 x2 ...)  a1 (t11 t12 ...)  b1 (y1)
             a2 (t21 t22 ...)  b2 (y2)
            ... (...        ) ... (...)

(Sorry, this diagram might be a bit confusing. The ai, bj are labeling the rows and columns. The xi are the entries in a row vector in A, the yj are the entries in a column vector in B, and the tij are the entries in the matrix.)

So each pair of components xi ai and yj bj contributes xi yj tij, and the image of the whole element is the sum of these contributions over all i and j.

For example, dot product corresponds to the matrix:


(1 0 0)
(0 1 0)
(0 0 1)

Polynomial multiplication corresponds to the matrix:


    e0 e1 e2 ...
e0 (e0 e1 e2 ...)
e1 (e1 e2 e3 ...)
e2 (e2 e3 e4 ...)
...

A matrix with entries in T = Vect k t is just a convenient way of specifying a linear map from A⊗B = Vect k (a,b) to T.

Indeed, any matrix, provided that all the entries are in the same T, defines a bilinear function. So bilinear functions are ten-a-penny.
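To make that concrete, here's a small sketch (fromMatrix and dot2 are illustrative names, not from the library) that turns an explicit table of entries into a bilinear function using the bilinear combinator above:

-- Illustrative sketch, not from the library: build a bilinear function from an
-- explicit "matrix", given as a lookup table on pairs of basis elements.
-- Missing entries correspond to zero entries in the matrix.
fromMatrix entries = bilinear (\ab -> maybe zero id (lookup ab entries))

-- For example, the identity matrix on the span of {e1,e2} gives the dot product there:
dot2 = fromMatrix [((E 1, E 1), return ()), ((E 2, E 2), return ())]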


Now, I stated above that bilinear functions are not in general linear. For example:

> quickCheck (prop_Linear polymult)
*** Failed! Falsifiable (after 2 tests and 2 shrinks):
(0,Right e1,Left e1)

What went wrong? Well:

> polymult (Right e1)
0
> polymult (Left e1)
0
> polymult (Left e1 <+> Right e1)
e2

So we fail to have f (a <+> b) = f a <+> f b, which is one of the requirements of a linear function.


Conversely, it's also important to realise that linear functions (on Vect k (Either a b)) are not in general bilinear. For example:

> quickCheck (prop_BilinearQn id)
*** Failed! Falsifiable (after 2 tests):
(1,0,0,e1,0)

The problem here is:

> id $ i1 (zero <+> zero) <+> i2 e1
Right e1
> id $ (i1 zero <+> i2 e1) <+> (i1 zero <+> i2 e1)
2Right e1

So we fail to have linearity in the left hand side (or the right for that matter).

Indeed we can kind of see that linearity and bilinearity are in conflict.
- Linearity requires that f (a1 <+> a2 <+> b) = f a1 <+> f a2 <+> f b
- Bilinearity requires that f (a1 <+> a2 <+> b) = f (a1 <+> b) <+> f (a2 <+> b)

Exercise: Find a function which is both linear and bilinear.
