<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5195188167565410449</id><updated>2012-01-28T05:26:06.189Z</updated><title type='text'>Haskell for Maths</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>45</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-590261774789912776</id><published>2011-11-12T21:24:00.001Z</published><updated>2011-11-12T21:36:40.971Z</updated><title type='text'>New release of HaskellForMaths</title><content type='html'>&lt;br /&gt;I've just uploaded a new version v0.4.1 of &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt;, containing three new modules and a couple of other improvements. 
The additions are as follows:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Algebras.Quaternions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This module was already present: it defines the quaternion algebra on the basis {1,i,j,k}, where multiplication is defined by:&lt;br /&gt;i^2 = j^2 = k^2 = ijk = -1&lt;br /&gt;&lt;br /&gt;This is enough information to figure out the full multiplication table. For example:&lt;br /&gt;ijk = -1&lt;br /&gt;=&amp;gt; (ijk)k = -k&lt;br /&gt;=&amp;gt; ij(kk) = -k (associativity of multiplication)&lt;br /&gt;=&amp;gt; ij = k&lt;br /&gt;It turns out that the basis elements i,j,k anti-commute in pairs, eg ij = -ji, etc.&lt;br /&gt;&lt;br /&gt;In this release I've added a couple of new things.&lt;br /&gt;&lt;br /&gt;First, the quaternions are a division algebra, so I've added a Fractional instance.&lt;br /&gt;&lt;br /&gt;Specifically, we can define a conjugation operation on the quaternions (similar to complex conjugation) via&lt;br /&gt;conj (w+xi+yj+zk) = w-xi-yj-zk&lt;br /&gt;Then we can define a quadratic norm via&lt;br /&gt;sqnorm q = q * conj q = w^2+x^2+y^2+z^2&lt;br /&gt;Since the norm is always a scalar, we can define a multiplicative inverse by&lt;br /&gt;q^-1 = conj q / sqnorm q&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;$ cabal install HaskellForMaths&lt;/code&gt;&lt;br /&gt;&lt;code&gt;$ ghci&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :m Math.Algebras.Quaternions&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (2*i+3*j)^-1 :: Quaternion Q&lt;/code&gt;&lt;br /&gt;&lt;code&gt;-2/13i-3/13j&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(If you leave out the type annotation, you'll be working in Quaternion Double.)&lt;br /&gt;&lt;br /&gt;Second, the quaternions have an interesting role in 3- and 4-dimensional geometry.&lt;br /&gt;&lt;br /&gt;Given any non-zero quaternion q, the map x -&amp;gt; q^-1 x q turns out to be a rotation of the 3-dimensional space spanned by {i,j,k}. 
To multiply rotations together (ie do one then another), just multiply the quaternions. This turns out to be a better way to represent rotations than 3*3 matrices:&lt;br /&gt;- It's more compact - four scalars rather than nine&lt;br /&gt;- They're faster to multiply - 16 scalar multiplications versus 27&lt;br /&gt;- It's more robust against rounding error - whatever quaternion you end up with will still represent a rotation, whereas a sequence of matrix multiplications of rotations might not be quite a rotation any more, due to rounding error.&lt;br /&gt;&lt;br /&gt;If you're curious, the function reprSO3 converts a quaternion to the corresponding 3*3 matrix:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; reprSO3 (1+2*i) :: [[Q]]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[[1,0,0],[0,-3/5,-4/5],[0,4/5,-3/5]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(Exercise: Figure out why we got this matrix.)&lt;br /&gt;&lt;br /&gt;Quaternions can also be used to represent rotations of 4-dimensional space - see the documentation.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Algebras.Octonions&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;This is a new module, providing an implementation of the 8-dimensional non-associative division algebra of octonions. I follow &lt;a href="http://en.wikipedia.org/wiki/John_H._Conway"&gt;Conway&lt;/a&gt;'s notation [1], so the octonions have basis {1,e0,e1,e2,e3,e4,e5,e6}, with multiplication defined by:&lt;br /&gt;e&lt;sub&gt;i&lt;/sub&gt; * e&lt;sub&gt;i&lt;/sub&gt; = -1, for i in [0..6]&lt;br /&gt;e&lt;sub&gt;i+1&lt;/sub&gt; * e&lt;sub&gt;i+2&lt;/sub&gt; = e&lt;sub&gt;i+4&lt;/sub&gt;, where the indices are taken modulo 7.&lt;br /&gt;&lt;br /&gt;The octonions are not associative, but they are an &lt;i&gt;inverse loop&lt;/i&gt;, so they satisfy x&lt;sup&gt;-1&lt;/sup&gt;(xy) = y = (yx)x&lt;sup&gt;-1&lt;/sup&gt;. 
This is enough to enable us to deduce the full multiplication table from the relations above.&lt;br /&gt;&lt;br /&gt;Like the quaternions, the octonions have conjugation and a norm, and multiplicative inverses:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.Algebras.Octonions&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (2+i0+2*i3)^-1&lt;/code&gt;&lt;br /&gt;&lt;code&gt;2/9-1/9i0-2/9i3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The octonions are an exceptional object in mathematics: there's nothing else quite like them. They can be used to construct various other exceptional objects, such as the root lattice E8, or the Lie group G2. Hopefully I'll be able to cover some of that stuff in a future installment.&lt;br /&gt;&lt;br /&gt;[1] Conway and Smith, On Quaternions and Octonions&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.NumberTheory.Prime&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The main function in this module is &lt;code&gt;isPrime :: Integer -&amp;gt; Bool&lt;/code&gt;, which tells you whether a number is prime or not. It's implemented using the &lt;a href="http://en.wikipedia.org/wiki/Miller-Rabin_primality_test"&gt;Miller-Rabin test&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The basic idea of the test is:&lt;br /&gt;- If p is prime, then Zp is a field&lt;br /&gt;- In a field, the equation x^2 = 1 has only two solutions, 1 and -1&lt;br /&gt;- Given an arbitrary b coprime to p, we know from Fermat's little theorem that b^(p-1) = 1 (mod p)&lt;br /&gt;- So if p-1 = q * 2^s, with q odd, then either b^q = 1 (mod p), or there is some r, 0 &amp;lt;= r &amp;lt; s with b^(q*2^r) = -1 (mod p)&lt;br /&gt;&lt;br /&gt;The idea of the algorithm is to try to show that p isn't prime by trying to find a b where the above is not true. 
We take several different values of b at random, and repeatedly square b^q, to see whether we get -1 or not.&lt;br /&gt;&lt;br /&gt;The advantage of the Miller-Rabin test, as compared to trial division say, is that it has a fast running time even for very large numbers. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :m Math.NumberTheory.Prime&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :set +s&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; nextPrime $ 10^100&lt;/code&gt;&lt;br /&gt;&lt;code&gt;10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000267&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(0.09 secs, 14904632 bytes)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The potential disadvantage of the Miller-Rabin test is that it is probabilistic: there is a very small chance (1 in 10^15 in this implementation) that it could just fail to hit on a b which disproves p's primality, so that it would say p is prime when it isn't. In practice, at those odds it's not worth worrying about.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.NumberTheory.Factor&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The main function in this module is &lt;code&gt;pfactors :: Integer -&amp;gt; [Integer]&lt;/code&gt;, which returns the prime factors of a number (with multiplicity). It uses trial division to try to find prime factors less than 10000. After that, it uses the elliptic curve method to try to split what remains. 
The elliptic curve method relies on some quite advanced maths, but the basic idea is this:&lt;br /&gt;- If p is a prime, then Zp is a field&lt;br /&gt;- Given a field, we can do "arithmetic on elliptic curves" over the field.&lt;br /&gt;- So to factor n, pretend that n is prime, try doing arithmetic on elliptic curves, and wait till something goes wrong.&lt;br /&gt;- It turns out that if we look at what went wrong, we can figure out a non-trivial factor of n&lt;br /&gt;&lt;br /&gt;Here it is in action:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :m Math.NumberTheory.Factor&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; pfactors $ 10^30+5&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[3,5,4723,1399606163,10085210079364883]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(0.55 secs, 210033624 bytes)&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; pfactors $ 10^30+6&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[2,7,3919,758405810021,24032284101871]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(2.31 secs, 883504748 bytes)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;I love the way it can crunch through 12-digit prime factors with relative ease.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/590261774789912776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/11/new-release-of-haskellformaths.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/590261774789912776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/590261774789912776'/><link 
rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/11/new-release-of-haskellformaths.html' title='New release of HaskellForMaths'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-6214386064203847480</id><published>2011-09-18T19:22:00.000+01:00</published><updated>2011-09-18T19:47:42.162+01:00</updated><title type='text'>Commutative Algebra and Algebraic Geometry</title><content type='html'>&lt;br /&gt;&lt;a href="http://haskellformaths.blogspot.com/2011/09/commutative-algebra-in-haskell-part-1.html"&gt;Last time&lt;/a&gt; we saw how to create variables for use in polynomial arithmetic. This time I want to look at some of the things we can do next.&lt;br /&gt;&lt;br /&gt;First, let's define the variables we are going to use:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.CommutativeAlgebra.GroebnerBasis&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let [t,u,v,x,y,z,x',y',z'] = map glexvar ["t","u","v","x","y","z","x'","y'","z'"]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So now we can do arithmetic in the polynomial ring Q[t,u,v,x,y,z,x',y',z']. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (x+y)^2&lt;/code&gt;&lt;br /&gt;&lt;code&gt;x^2+2xy+y^2&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The branch of mathematics dealing with the theory of polynomial rings is called commutative algebra, and it was "invented" mainly in support of algebraic geometry. Algebraic geometry is roughly the study of the curves, surfaces, etc that arise as the solution sets of polynomial equations. 
For example, the solution-set of the equation x^2+y^2=1 is the unit circle.&lt;br /&gt;&lt;br /&gt;If we are given any polynomial equation f = g, then we can rewrite it more conveniently as f-g = 0. In other words, we only need to track individual polynomials, rather than pairs of polynomials. Call the solution set of f = 0 the zero-set of f.&lt;br /&gt;&lt;br /&gt;Sometimes we're interested in the intersection of two or more curves, surfaces, etc. For example, it is well known that the hyperbola, parabola and ellipse all arise as "&lt;a href="http://en.wikipedia.org/wiki/Conic_section"&gt;conic sections&lt;/a&gt;" - that is, as the intersection of a cone with a plane. So define the zero-set of a collection (or system) of polynomials to be the set of points which are zeros of all the polynomials simultaneously. For example, the zero-set of the system [x^2+y^2-z^2, z-1] is the unit circle x^2+y^2=1 situated on the plane z=1.&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-CAlvn5cb6tQ/TnY0LiVc9pI/AAAAAAAAAJM/q_9xJf3BVQk/s1600/Cone.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="307" src="http://4.bp.blogspot.com/-CAlvn5cb6tQ/TnY0LiVc9pI/AAAAAAAAAJM/q_9xJf3BVQk/s320/Cone.png" style="cursor: move;" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;Okay, so how can commutative algebra help us to investigate curves and surfaces? Is there a way for us to "do geometry by doing algebra"? Well, first, what does "doing geometry" consist of? 
Well, at least some of the following:&lt;br /&gt;- Looking at the shapes of curves and surfaces&lt;br /&gt;- Looking at intersections, unions, differences and products of curves and surfaces&lt;br /&gt;- Looking at when curves or surfaces can be mapped into or onto other curves or surfaces&lt;br /&gt;- Looking at when two different curves or surfaces are equivalent, in some sense (for example, topologically equivalent)&lt;br /&gt;&lt;br /&gt;(That phrase "curves and surfaces" is not only clumsy but also inaccurate, so from now on I'll use the proper term, "variety", for the zero-set of a system of polynomials, whether it's a set of isolated points, a curve, a surface, some higher dimensional thing, or a combination of some of the preceding.)&lt;br /&gt;&lt;br /&gt;So can we do all those things using algebra? Well, let's have a go.&lt;br /&gt;&lt;br /&gt;Let's start by looking at intersections and unions of varieties (remember, that's just the fancy name for curves, surfaces, etc.).&lt;br /&gt;&lt;br /&gt;Well, we've already seen how to do intersections. If a variety V1 is defined by a system of polynomials [f1...fm], and a variety V2 is defined by [g1...gn], then their intersection is defined by the system [f1...fm,g1...gn] - the zero-set of both sets of polynomials simultaneously. We'll call this the "sum" of the systems of polynomials. (Note to the cognoscenti: yes, I'm really talking about ideals here.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;sumI fs gs = gb (fs ++ gs)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Don't worry too much about what that "gb" (Groebner basis) call is doing. Let's just say that it's choosing the best way to represent the system of polynomials. 
For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; sumI [x^2+y^2-z^2] [z-1]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[x^2+y^2-1,z-1]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Notice how the gb call has caused the first polynomial to be simplified slightly. The same variety might arise as the zero-set of many different systems of polynomials. That's something that we're going to need to look into - but later.&lt;br /&gt;&lt;br /&gt;Okay, so what about unions of varieties? So suppose V1 is defined by [f1...fm], and V2 is defined by [g1...gn]. A point in their union is in either V1 or V2, so it is in the zero-set of either [f1...fm] or [g1...gn]. So how about multiplying the polynomials together in pairs? That is, let's look at the system [fi*gj | fi &amp;lt;- fs, gj &amp;lt;- gs]. Call the zero-set of this system V. Then clearly, any point in either V1 or V2 is in V, since we then know that either all the fs or all the gs vanish at that point, and hence so do all the products. Conversely, suppose that some point is not in the union of V1 and V2. Then there must exist some fi, and some gj, which are non-zero at that point. 
Hence there is an fi*gj which is non-zero, so the point is not in V.&lt;br /&gt;&lt;br /&gt;This construction is called, naturally enough, the product of the systems of polynomials.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;productI fs gs = gb [f * g | f &amp;lt;- fs, g &amp;lt;- gs]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; productI [x^2+y^2-z^2] [z-1]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[x^2z+y^2z-z^3-x^2-y^2+z^2]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Just in case you're still a little unsure, let's confirm that a few arbitrary points in the union are in the zero-set of this polynomial:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; eval (x^2*z+y^2*z-z^3-x^2-y^2+z^2) [(x,100),(y,-100),(z,1)]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;0&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; eval (x^2*z+y^2*z-z^3-x^2-y^2+z^2) [(x,3),(y,4),(z,5)]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;0&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The first expression evaluates the polynomial at the point (100,-100,1), an arbitrary point on the plane z=1. The second evaluates at (3,4,5), an arbitrary point on the cone x^2+y^2=z^2. Both points are in the zero-set of our product polynomial.&lt;br /&gt;&lt;br /&gt;Since we're in the neighbourhood, let's have a look at the other conic sections. First, let's rotate our coordinate system by 45 degrees, using the substitution x'=x+z, z'=z-x. 
(Okay, so this also scales - to save us having to handle a sqrt 2 factor.)&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/-xAPGN0iKaFo/TnY0PhuiXeI/AAAAAAAAAJQ/GdXKnNFDlHw/s1600/Cone+rotated.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="300" src="http://2.bp.blogspot.com/-xAPGN0iKaFo/TnY0PhuiXeI/AAAAAAAAAJQ/GdXKnNFDlHw/s320/Cone+rotated.png" style="cursor: move;" width="320" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let cone' = subst (x^2+y^2-z^2) [(x,(x'-z')/2),(y,y'),(z,(x'+z')/2)]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; cone'&lt;/code&gt;&lt;br /&gt;&lt;code&gt;-x'z'+y'^2&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In these coordinates, the intersection of the cone with the plane z'=1 is the parabola x'=y'^2:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; sumI [cone'] [z'-1]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[y'^2-x',z'-1]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Alternatively, the intersection with the plane y'=1 is the hyperbola x'z'=1:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; sumI [cone'] [y'-1]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[x'z'-1,y'-1]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so we've made a start on seeing how to do geometry by doing algebra, by looking at unions and intersections of varieties. There's still plenty more to do. We mustn't forget that we have some unfinished business: we need to understand when different polynomial systems can define the same variety, and in what sense the gb (Groebner basis) function finds the "best" representation. That will have to wait for another time.&lt;br /&gt;&lt;br /&gt;Incidentally, for the eval and subst functions that I used above, you will need to take the new release &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths v0.4.0&lt;/a&gt;. 
In this release I also removed the older commutative algebra modules, so I revved the minor version number.&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/6214386064203847480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/09/commutative-algebra-and-algebraic.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6214386064203847480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6214386064203847480'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/09/commutative-algebra-and-algebraic.html' title='Commutative Algebra and Algebraic Geometry'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-CAlvn5cb6tQ/TnY0LiVc9pI/AAAAAAAAAJM/q_9xJf3BVQk/s72-c/Cone.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5561495038359270686</id><published>2011-09-04T21:11:00.000+01:00</published><updated>2011-09-04T21:11:32.677+01:00</updated><title type='text'>Commutative Algebra in Haskell, part 1</title><content type='html'>&lt;br /&gt;Once again, it's been a little while since my last post, and once again, my excuse is partly that I've been too 
busy writing code.&lt;br /&gt;&lt;br /&gt;I've just uploaded a new release, &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths 0.3.4&lt;/a&gt;, which contains the following new modules:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Core.Utils&lt;/b&gt; - this is a collection of utility functions used throughout the rest of the library. I've belatedly decided that it's better to put them all in one place rather than leaving them scattered here and there throughout other modules.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Core.Field&lt;/b&gt; - this provides new, more efficient implementations of several finite fields. There already were implementations of these finite fields, in the Math.Algebra.Field.Base and ...Extension modules, as discussed &lt;a href="http://haskellformaths.blogspot.com/2009/08/finite-fields-part-1.html"&gt;here&lt;/a&gt; and &lt;a href="http://haskellformaths.blogspot.com/2009/09/finite-fields-part-2.html"&gt;here&lt;/a&gt;.&amp;nbsp;However, that code was written to make the maths clear, rather than for speed. This new module is about speed. For the prime power fields in particular (eg F4, F8, F9), these implementations are significantly faster.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Combinatorics.Matroid&lt;/b&gt; - Matroids are a kind of combinatorial abstraction of the concept of linear independence. They're something that I heard about years ago - both of my favourite combinatorics books have brief introductions - but I never bothered to follow up. Well anyway, something finally piqued my curiosity, and I got Oxley's Matroid Theory. 
It turned out to be really interesting stuff, and this module is pretty much a translation of a large part of that book into Haskell code, written as I taught myself all about matroids.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.CommutativeAlgebra.Polynomial&lt;/b&gt; - Although I hadn't yet got around to discussing them in the blog, HaskellForMaths has always had modules for working with multivariate polynomials, namely Math.Algebra.Commutative.Monomial and ...MPoly. However, this was some of the earliest code I wrote, before my more recent free vector space and algebra code. So I saw an opportunity to simplify and improve this code, by building it on top of the free vector space code. Also, I'm trying to rationalise the module naming convention in HaskellForMaths, to more closely follow the categories used in &lt;a href="http://arxiv.org/archive/math"&gt;arxiv.org&lt;/a&gt;&amp;nbsp;or &lt;a href="http://mathoverflow.net/tags"&gt;mathoverflow.net&lt;/a&gt;. In the long run, I expect this module to supersede the older modules.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.CommutativeAlgebra.GroebnerBasis&lt;/b&gt; - Again, there was already code for Groebner bases in Math.Algebra.Commutative.GBasis. This is pretty much the same code, ported to the new polynomial implementation, but I've also begun to build on this, with code to find the sum, product, intersection, and quotient of ideals.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So the matroid code was just new code that I wrote while teaching myself some new maths. But most of the other code comes from an ambition to organise and simplify the HaskellForMaths library. I've also been trying to improve the documentation.&lt;br /&gt;&lt;br /&gt;My ultimate ambition is to get more people using the library. To do that, the structure of the library needs to be clearer, the documentation needs to be better, and I need to explain how to use it. 
So I thought I'd start by explaining how to use the new commutative algebra modules.&lt;br /&gt;&lt;br /&gt;(So this is a bit of a digression from the series on quantum algebra that I've been doing the last few months. However, in terms of the cumulative nature of maths, it's probably better to do this first.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so suppose we want to do some polynomial arithmetic. Well, first we need to create some variables to work with. How do we do that?&lt;br /&gt;&lt;br /&gt;First, decide on a monomial ordering - that is, we need to decide in what order monomials are to be listed within a polynomial. For the moment, let's use "graded lexicographic" or Glex order. This says that you should put monomials of higher degree before those of lower degree (eg y^3 before x^2), and if two monomials have the same degree, you should use lexicographic (dictionary) order (eg xyz before y^3).&lt;br /&gt;&lt;br /&gt;Next, decide on a field to work over. Most often, we'll want to work over Q, the rationals.&lt;br /&gt;&lt;br /&gt;Then, our variables themselves can be of any Haskell type - but there are usually only two sensible choices:&lt;br /&gt;&lt;br /&gt;The easiest way is to use String as the type for our variables.&lt;br /&gt;&lt;br /&gt;Then we could make some variables like this:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.CommutativeAlgebra.Polynomial&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let [x,y,z] = map glexvar ["x","y","z"]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;And then we can do polynomial arithmetic:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (x+y+z)^3&lt;/code&gt;&lt;br /&gt;&lt;code&gt;x^3+3x^2y+3x^2z+3xy^2+6xyz+3xz^2+y^3+3y^2z+3yz^2+z^3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If we want to use any other field besides Q, then we will have to use a type annotation to tell the compiler which field we're working over:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br 
/&gt;&lt;code&gt;&amp;gt; let [x,y,z] = map var ["x","y","z"] :: [GlexPoly F3 String]&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (x+y+z)^3&lt;/code&gt;&lt;br /&gt;&lt;code&gt;x^3+y^3+z^3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The alternative to using String for our variables is to define our own type. For example&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;data Var = X | Y | Z | W deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;instance Show Var where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show X = "x"&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show Y = "y"&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show Z = "z"&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show W = "w"&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;[x,y,z,w] = map glexvar [X,Y,Z,W]&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;So there you have it - now you can do polynomial arithmetic in Haskell.&lt;br /&gt;&lt;br /&gt;So how does it work?&lt;br /&gt;&lt;br /&gt;Well, fundamentally, k-polynomials are a free k-vector space on the basis of monomials. So we define a type to implement monomials:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;data MonImpl v = M Int [(v,Int)] deriving (Eq)&lt;br /&gt;&lt;br /&gt;-- The initial Int is the degree of the monomial. 
Storing it speeds up equality tests and comparisons&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;instance Show v =&amp;gt; Show (MonImpl v) where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show (M _ []) = "1"&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show (M _ xis) = concatMap (\(x,i) -&amp;gt; if i==1 then showVar x else showVar x ++ "^" ++ show i) xis&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; where showVar x = filter ( /= '"' ) (show x) -- in case v == String&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;Notice that our monomial implementation is polymorphic in v, the type of the variables.&lt;br /&gt;&lt;br /&gt;Next, monomials form a monoid, so we make them an instance of Mon (the HaskellForMaths class for monoids):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;instance (Ord v) =&amp;gt; Mon (MonImpl v) where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; munit = M 0 []&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; mmult (M si xis) (M sj yjs) = M (si+sj) $ addmerge xis yjs&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;In principle, all we need to do now is define an Ord instance, and then an Algebra instance, using the monoid algebra construction.&lt;br /&gt;&lt;br /&gt;However, for reasons that will become clear in future postings, we want to be able to work with various different orderings on monomials, such as Lex, Glex, or Grevlex. So we provide various newtype wrappers around this basic monomial implementation. 
Here's the code for the Glex ordering that we used above:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;br /&gt;&lt;br /&gt;newtype Glex v = Glex (MonImpl v) deriving (Eq, Mon) -- GeneralizedNewtypeDeriving&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;instance Show v =&amp;gt; Show (Glex v) where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; show (Glex m) = show m&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;instance Ord v =&amp;gt; Ord (Glex v) where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; compare (Glex (M si xis)) (Glex (M sj yjs)) =&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; compare (-si, [(x,-i) | (x,i) &amp;lt;- xis]) (-sj, [(y,-j) | (y,j) &amp;lt;- yjs])&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;type GlexPoly k v = Vect k (Glex v)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;glexvar :: v -&amp;gt; GlexPoly Q v&lt;br /&gt;&lt;br /&gt;glexvar v = return $ Glex $ M 1 [(v,1)]&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;instance (Num k, Ord v, Show v) =&amp;gt; Algebra k (Glex v) where&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; unit x = x *&amp;gt; return munit&lt;br /&gt;&lt;br /&gt;&amp;nbsp; &amp;nbsp; mult xy = nf $ fmap (\(a,b) -&amp;gt; a `mmult` b) xy&lt;br /&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;We also have similar newtypes for Lex and Grevlex orderings, which I'll discuss another time.&lt;br /&gt;&lt;br /&gt;And that's pretty much it. Now that we have an instance of Algebra k (Glex v), we get a Num instance for free, so we get +, -, *, and fromInteger. That means we can enter expressions like the following:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (2*x^2-y*z)^2&lt;/code&gt;&lt;br /&gt;&lt;code&gt;4x^4-4x^2yz+y^2z^2&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Note that division is not supported: you can't write x/y, for example. However, as a convenience, I have defined a partial instance of Fractional, which does let you divide by scalars. 
That means that it's okay to write x/2, for example.&lt;br /&gt;&lt;br /&gt;Next time, some more things you can do with commutative algebra.&lt;br /&gt;&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5561495038359270686/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/09/commutative-algebra-in-haskell-part-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5561495038359270686'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5561495038359270686'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/09/commutative-algebra-in-haskell-part-1.html' title='Commutative Algebra in Haskell, part 1'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5653873779426609723</id><published>2011-07-10T20:09:00.001+01:00</published><updated>2011-07-11T20:44:42.937+01:00</updated><title type='text'>The Tensor Algebra Monad</title><content type='html'>It's been a little while since my last post. That's partly because I've been busy writing new code. 
I've put up a new release, &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt; 0.3.3, which contains three new modules:&lt;br /&gt;- Math.Combinatorics.Digraph&lt;br /&gt;- Math.Combinatorics.Poset&lt;br /&gt;- Math.Combinatorics.IncidenceAlgebra&lt;br /&gt;&lt;br /&gt;I'll go through their contents at some point, but this time I want to talk about the tensor algebra.&lt;br /&gt;&lt;br /&gt;So recall that previously we defined the &lt;a href="http://haskellformaths.blogspot.com/2010/12/free-vector-space-on-type-part-1.html"&gt;free vector space over a type&lt;/a&gt;, &lt;a href="http://haskellformaths.blogspot.com/2011/02/tensor-products-of-vector-spaces-part-1.html"&gt;tensor products&lt;/a&gt;, &lt;a href="http://haskellformaths.blogspot.com/2011/04/what-is-algebra.html"&gt;algebras&lt;/a&gt; and &lt;a href="http://haskellformaths.blogspot.com/2011/04/what-is-coalgebra.html"&gt;coalgebras&lt;/a&gt; in Haskell code.&lt;br /&gt;&lt;br /&gt;In HaskellForMaths, we always work with the free vector space over a type: that means, we take some type b as a basis, and form k-linear combinations of elements of b. This construction is represented by the type Vect k b.&lt;br /&gt;&lt;br /&gt;Given two vector spaces A = Vect k a, B = Vect k b, we can form their tensor product A⊗B = Vect k (Tensor a b). 
So Tensor is a type constructor on basis types, which takes basis types a, b for vector spaces A, B, and returns a basis type for the tensor product A⊗B.&lt;br /&gt;&lt;br /&gt;We also defined a type constructor DSum, which returns a basis type for the direct sum A⊕B.&lt;br /&gt;&lt;br /&gt;Now, we saw that tensor product is a monoid (at the type level, up to isomorphism):&lt;br /&gt;- it is associative: (A⊗B)⊗C is isomorphic to A⊗(B⊗C)&lt;br /&gt;- it has a unit: the field k itself is an identity for tensor product, in the sense that k⊗A, A and A⊗k are all isomorphic&lt;br /&gt;&lt;br /&gt;Given some specific vector space V, we can consider the tensor powers of V:&lt;br /&gt;k, V, V⊗V, V⊗V⊗V, ...&lt;br /&gt;(We can omit brackets in V⊗V⊗V because tensor product is associative.)&lt;br /&gt;&lt;br /&gt;And indeed we can form their direct sum:&lt;br /&gt;T(V) = k ⊕ V ⊕ V⊗V ⊕ V⊗V⊗V ⊕ ...&lt;br /&gt;(where an element of T(V) is understood to be a &lt;i&gt;finite&lt;/i&gt; sum of elements of the tensor powers.)&lt;br /&gt;&lt;br /&gt;This is a vector space, since tensor products and direct sums are vector spaces. If V has a basis e1,e2,e3,..., then a typical element of T(V) might be something like 3 + 5e2 + 2e1⊗e3⊗e1.&lt;br /&gt;&lt;br /&gt;Now the interesting thing is that T(V) can be given the structure of an algebra, as follows:&lt;br /&gt;- for the unit, we use the injection of k into the first direct summand&lt;br /&gt;- for the mult, we use tensor product, extended linearly&lt;br /&gt;&lt;br /&gt;For example, we would have&lt;br /&gt;e2 * (2 + 3e1 + e4⊗e2) = 2e2 + 3e2⊗e1 + e2⊗e4⊗e2&lt;br /&gt;&lt;br /&gt;With this algebra structure, T(V) is called the tensor algebra.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So how should we represent the tensor algebra in HaskellForMaths? Suppose that V is the free vector space Vect k a over some basis type a. (Recall also that the field k itself can be represented as the free vector space Vect k () over the unit type.) 
Can we use the DSum and Tensor type constructors to build the tensor algebra? Something like:&lt;br /&gt;Vect k (DSum () (DSum a (DSum (Tensor a a) (DSum ...))))&lt;br /&gt;&lt;br /&gt;Hmm, that's not going to work - we can't build the whole of what we want that way. (Unless some type system wizard knows otherwise?) So instead of representing the direct sum and tensor product at the type level, we're going to have to do it at the value level. Here's the definition:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;data TensorAlgebra a = TA Int [a] deriving (Eq,Ord)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Given the free vector space V = Vect k a over basis type a, then TensorAlgebra a is the basis type for the tensor algebra over a, so that Vect k (TensorAlgebra a) is the tensor algebra T(V). The Int in TA Int [a] tells us which direct summand we're in (ie which tensor power), and the [a] tells us the tensor multiplicands. So for example, e2⊗e1⊗e4 would be represented as TA 3 [e2,e1,e4]. 
Then Vect k (TensorAlgebra a) consists of k-linear combinations of these basis elements, so it is the vector space T(V) that we are after.&lt;br /&gt;&lt;br /&gt;Here's a Show instance:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;import qualified Data.List as L -- for L.intersperse&lt;br /&gt;&lt;br /&gt;instance Show a =&amp;gt; Show (TensorAlgebra a) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show (TA _ []) = "1"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show (TA _ xs) = filter (/= '"') $ concat $ L.intersperse "*" $ map show xs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;It will be helpful to have a vector space basis to work with, so here's one that we used previously:&lt;br /&gt;&lt;code&gt;newtype EBasis = E Int deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show EBasis where show (E i) = "e" ++ show i&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then, for example, our Show instance gives us:&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.Algebras.TensorAlgebra&lt;br /&gt;&amp;gt; return (TA 0 []) &amp;lt;+&amp;gt; return (TA 2 [E 1, E 3])&lt;br /&gt;1+e1*e3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(Recall that the free vector space is a monad, hence our use of return to put a basis element into the vector space.)&lt;br /&gt;&lt;br /&gt;Note that in the show output, the "*" is meant to represent tensor product, so this is really 1+e1⊗e3. You can actually get Haskell to output the tensor product symbol - just replace "*" by "\x2297" in the definition of show - however I found that it didn't look too good in the Mac OS X terminal, and I wasn't sure it would work on all OSes.&lt;br /&gt;&lt;br /&gt;Ok, how about an Algebra instance? 
Well, TensorAlgebra a is basically just a slightly frilly version of [a], so it's a monoid, and we can use the monoid algebra construction:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Mon (TensorAlgebra a) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;munit = TA 0 []&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mmult (TA i xs) (TA j ys) = TA (i+j) (xs++ys)&lt;br /&gt;&lt;br /&gt;instance (Num k, Ord a) =&amp;gt; Algebra k (TensorAlgebra a) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unit x = x *&amp;gt; return munit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mult = nf . fmap (\(a,b) -&amp;gt; a `mmult` b)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So now we can do arithmetic in the tensor algebra:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let e_ i = return (TA 1 [E i]) :: Vect Q (TensorAlgebra EBasis)&lt;br /&gt;&amp;gt; let e1 = e_ 1; e2 = e_ 2; e3 = e_ 3; e4 = e_ 4&lt;br /&gt;&amp;gt; (e1+e2) * (1+e3*e4)&lt;br /&gt;e1+e2+e1*e3*e4+e2*e3*e4&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We've got into the habit of using QuickCheck to check algebraic properties. 
Let's just check that the tensor algebra, as we've defined it, is an algebra:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Arbitrary b =&amp;gt; Arbitrary (TensorAlgebra b) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = do xs &amp;lt;- listOf arbitrary :: Gen [b] -- ScopedTypeVariables&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let d = length xs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return (TA d xs)&lt;br /&gt;&lt;br /&gt;prop_Algebra_TensorAlgebra (k,x,y,z) = prop_Algebra (k,x,y,z)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (k,x,y,z) :: ( Q, Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis) )&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Algebra_TensorAlgebra&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Ok, so what's so special about the tensor algebra? Well, it has a rather nice universal property:&lt;br /&gt;Suppose A = Vect k a, B = Vect k b are vector spaces, and we have a linear map f : A -&amp;gt; B. Suppose that B is also an algebra. 
Then we can "lift" f to an algebra morphism f' : T(A) -&amp;gt; B, such that the following diagram commutes.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-dlk7g918R0Q/ThnzbgKU8DI/AAAAAAAAAJI/xFtMQ87Eftk/s1600/TensorAlgebra_UniversalProperty.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="219" src="http://1.bp.blogspot.com/-dlk7g918R0Q/ThnzbgKU8DI/AAAAAAAAAJI/xFtMQ87Eftk/s320/TensorAlgebra_UniversalProperty.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In other words, f' agrees with f on the copy of A within T(A): f = f' . i&lt;br /&gt;&lt;br /&gt;Ah, but hold on, I didn't say what an algebra morphism is. Well, it's just the usual: a function which "commutes" with the algebra structure. Specifically, it's a linear map (so that it commutes with the vector space structure), which makes the following diagrams commute:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-Z6NwSUqB0YQ/ThnzZascvMI/AAAAAAAAAI4/xBSU1R2tgok/s1600/Algebra_morphism.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="112" src="http://2.bp.blogspot.com/-Z6NwSUqB0YQ/ThnzZascvMI/AAAAAAAAAI4/xBSU1R2tgok/s320/Algebra_morphism.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So how does this universal property work then? 
Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;injectTA :: Num k =&amp;gt; Vect k a -&amp;gt; Vect k (TensorAlgebra a)&lt;br /&gt;injectTA = fmap (\a -&amp;gt; TA 1 [a])&lt;br /&gt;&lt;br /&gt;liftTA :: (Num k, Ord b, Show b, Algebra k b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; (Vect k a -&amp;gt; Vect k b) -&amp;gt; Vect k (TensorAlgebra a) -&amp;gt; Vect k b&lt;br /&gt;liftTA f = linear (\(TA _ xs) -&amp;gt; product [f (return x) | x &amp;lt;- xs])&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;In other words, any tensor product u⊗v⊗... is sent to f(u)*f(v)*...&lt;br /&gt;&lt;br /&gt;Let's look at an example. Recall that the quaternion algebra H has the basis {1,i,j,k}, with i^2 = j^2 = k^2 = ijk = -1.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let f = linear (\(E n) -&amp;gt; case n of 1 -&amp;gt; 1+i; 2 -&amp;gt; 1-i; 3 -&amp;gt; j+k; 4 -&amp;gt; j-k; _ -&amp;gt; zerov)&lt;br /&gt;&amp;gt; let f' = liftTA f&lt;br /&gt;&amp;gt; e1*e2&lt;br /&gt;e1*e2&lt;br /&gt;&amp;gt; f' (e1*e2)&lt;br /&gt;2&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Recall that we usually define a linear map by linear extension from its action on a basis - that's what the "linear" is doing in the definition of f. It's fairly clear what f' is doing: it's basically just variable substitution. That is, we can consider the basis elements ei as variables, and the tensor algebra as the algebra of non-commutative polynomials in the ei. Then the linear map f assigns a substitution to each basis element, and f' just substitutes and multiplies out in the target algebra. In this case, we have:&lt;br /&gt;e1⊗e2 -&amp;gt; (1+i)*(1-i) = 1-i^2 = 2&lt;br /&gt;&lt;br /&gt;We can use QuickCheck to verify that liftTA f is indeed the algebra morphism required by the universal property. Here's a QuickCheck property for an algebra morphism. 
(We don't bother to check that f is a linear map, since it's almost always clear from the definition. If in doubt, we can test that separately.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_AlgebraMorphism f (k,x,y) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(f . unit) k == unit k &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(f . mult) (x `te` y) == (mult . (f `tf` f)) (x `te` y)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This is just a transcription of the diagrams into code.&lt;br /&gt;&lt;br /&gt;In order to test the universal property, we have to check that liftTA f is an algebra morphism, and that it agrees with f on (the copy of) V (in T(V)):&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_TensorAlgebra_UniversalProperty (fmatrix,(k,x,y),z) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;prop_AlgebraMorphism f' (k,x,y) &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(f' . 
injectTA) z == f z&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f = linfun fmatrix&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;f' = liftTA f&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;types = (fmatrix,(k,x,y),z) :: (LinFun Q EBasis HBasis,&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (Q,Vect Q (TensorAlgebra EBasis), Vect Q (TensorAlgebra EBasis)),&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Vect Q EBasis)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So the key to this code is the parameter fmatrix, which is an arbitrary (sparse) matrix from Q^n to H, the quaternions, from which we build an arbitrary linear function. Note that the universal property of course implies that we can choose any algebra as the target for f - I just chose the quaternions because they're familiar.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck prop_TensorAlgebra_UniversalProperty&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;With this construction, tensor algebra is in fact a functor from &lt;b&gt;k-Vect&lt;/b&gt; to &lt;b&gt;k-Alg&lt;/b&gt;. The action on objects is V -&amp;gt; T(V), Vect k a -&amp;gt; Vect k (TensorAlgebra a). 
But a functor also acts on the arrows of the source category.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-UW1BR5IoBxE/ThnzawK-IWI/AAAAAAAAAJA/V3ihVlsxr2s/s1600/TensorAlgebra_Functor.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="217" src="http://2.bp.blogspot.com/-UW1BR5IoBxE/ThnzawK-IWI/AAAAAAAAAJA/V3ihVlsxr2s/s320/TensorAlgebra_Functor.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;How do we get an action on arrows? Well, we can use the universal property to construct one. If we have an arrow f: A -&amp;gt; B, then (injectTA . f) is an arrow A -&amp;gt; T(B). Then we use the universal property to lift to an arrow f': T(A) -&amp;gt; T(B).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-ydt_A6xCfzM/ThnzbW516TI/AAAAAAAAAJE/Sm4fYC8PxT8/s1600/TensorAlgebra_FunctorDerivation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="221" src="http://1.bp.blogspot.com/-ydt_A6xCfzM/ThnzbW516TI/AAAAAAAAAJE/Sm4fYC8PxT8/s320/TensorAlgebra_FunctorDerivation.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;fmapTA :: (Num k, Ord b, Show b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(Vect k a -&amp;gt; Vect k b) -&amp;gt; Vect k (TensorAlgebra a) -&amp;gt; Vect k (TensorAlgebra b)&lt;br /&gt;fmapTA f = liftTA (injectTA . 
f)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;newtype ABasis = A Int deriving (Eq,Ord,Show)&lt;br /&gt;newtype BBasis = B Int deriving (Eq,Ord,Show)&lt;br /&gt;&lt;br /&gt;&amp;gt; let f = linear (\(A i) -&amp;gt; case i of 1 -&amp;gt; return (B 1) &amp;lt;+&amp;gt; return (B 2);&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2 -&amp;gt; return (B 3) &amp;lt;+&amp;gt; return (B 4);&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;_ -&amp;gt; zerov :: Vect Q BBasis)&lt;br /&gt;&amp;gt; let f' = fmapTA f&lt;br /&gt;&amp;gt; return (TA 2 [A 1, A 2]) :: Vect Q (TensorAlgebra ABasis)&lt;br /&gt;A 1*A 2&lt;br /&gt;&amp;gt; f' it&lt;br /&gt;B 1*B 3+B 1*B 4+B 2*B 3+B 2*B 4&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So this is variable substitution again. In this case, as f is just a linear map between vector spaces, we can think of it as something like a change of basis of the underlying space. Then f' shows us how the (non-commutative) polynomials defined over the space transform under the change of basis.&lt;br /&gt;&lt;br /&gt;Let's just verify that this is a functor. We have to show:&lt;br /&gt;- That fmapTA f is an algebra morphism (ie it is an arrow in &lt;b&gt;k-Alg&lt;/b&gt;)&lt;br /&gt;- That fmapTA commutes with the category structure, ie fmapTA id = id, and fmapTA (g . f) = fmapTA g . 
fmapTA f.&lt;br /&gt;&lt;br /&gt;Here's a QuickCheck property:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Functor_Vect_TensorAlgebra (f,g,k,x,y) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;prop_AlgebraMorphism (fmapTA f') (k,x,y) &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(fmapTA id) x == id x &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fmapTA (g' . f') x == (fmapTA g' . fmapTA f') x&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f' = linfun f&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g' = linfun g&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;types = (f,g,k,x,y) :: (LinFun Q ABasis BBasis, LinFun Q BBasis CBasis,&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Q, Vect Q (TensorAlgebra ABasis), Vect Q (TensorAlgebra ABasis) )&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Functor_Vect_TensorAlgebra&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So can we declare a Functor instance? Well no, actually. Haskell only allows us to declare type constructors as Functor instances, whereas what we would want to do is declare the type function (\Vect k a -&amp;gt; Vect k (TensorAlgebra a)) as a Functor, which isn't allowed.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Ok, so we have a functor T: &lt;b&gt;k-Vect&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Alg&lt;/b&gt;, the tensor algebra functor. We also have a forgetful functor going the other way, &lt;b&gt;k-Alg&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Vect&lt;/b&gt;, which consists in taking an algebra, and simply forgetting that it is an algebra, and seeing only the vector space structure. 
(As it does at least remember the vector space structure, perhaps we should call this a semi-forgetful, or merely absent-minded functor.)&lt;br /&gt;&lt;br /&gt;The cognoscenti will no doubt have seen what is coming next: we have an adjunction, and hence a monad.&lt;br /&gt;&lt;br /&gt;How so? Well, it's obvious from its type signature that injectTA is return. For (&amp;gt;&amp;gt;=) / bind, we can define the following:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;bindTA :: (Num k, Ord b, Show b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Vect k (TensorAlgebra a) -&amp;gt; (Vect k a -&amp;gt; Vect k (TensorAlgebra b)) -&amp;gt; Vect k (TensorAlgebra b)&lt;br /&gt;bindTA = flip liftTA&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Note that in addition to flipping the arguments, bindTA also imposes a more restrictive signature than liftTA: the target algebra is constrained to be a tensor algebra.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;gt; let f = linear (\(A i) -&amp;gt; case i of 1 -&amp;gt; return (TA 2 [B 1, B 2]);&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2 -&amp;gt; return (TA 1 [B 3]) + return (TA 1 [B 4]);&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;_ -&amp;gt; zerov :: Vect Q (TensorAlgebra BBasis))&lt;br /&gt;&amp;gt; return (TA 2 [A 1, A 2]) `bindTA` f&lt;br /&gt;B 1*B 2*B 3+B 1*B 2*B 4&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So the effect of bind is to feed a non-commutative polynomial through a variable substitution.&lt;br /&gt;&lt;br 
/&gt;&lt;br /&gt;Monads are meant to satisfy the following &lt;a href="http://www.haskell.org/haskellwiki/Monad_Laws"&gt;monad laws&lt;/a&gt;:&lt;br /&gt;- "Left identity": return a &amp;gt;&amp;gt;= f &amp;nbsp;== &amp;nbsp;f a&lt;br /&gt;- "Right identity": m &amp;gt;&amp;gt;= return &amp;nbsp;== &amp;nbsp;m&lt;br /&gt;- "Associativity": (m &amp;gt;&amp;gt;= f) &amp;gt;&amp;gt;= g &amp;nbsp;== &amp;nbsp;m &amp;gt;&amp;gt;= (\x -&amp;gt; f x &amp;gt;&amp;gt;= g)&lt;br /&gt;&lt;br /&gt;As usual, we write a QuickCheck property:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Monad_Vect_TensorAlgebra (fmatrix,gmatrix,a,ta)=&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;injectTA a `bindTA` f == f a &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;amp;&amp;amp; -- left identity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;ta `bindTA` injectTA == ta &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;amp;&amp;amp; -- right identity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(ta `bindTA` f) `bindTA` g == ta `bindTA` (\a -&amp;gt; f a `bindTA` g) -- associativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f = linfun fmatrix&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g = linfun gmatrix&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;types = (fmatrix,gmatrix,a,ta) :: (LinFun Q ABasis (TensorAlgebra BBasis),&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
LinFun Q BBasis (TensorAlgebra CBasis),&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Vect Q ABasis, Vect Q (TensorAlgebra ABasis) )&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Monad_Vect_TensorAlgebra&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Once again, we can't actually declare a Monad instance, because our type function (\Vect k a -&amp;gt; Vect k (TensorAlgebra a)) is not a type constructor.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So, we have a functor, and indeed a monad, T: &lt;b&gt;k-Vect&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Alg&lt;/b&gt;. Now recall that the free vector space construction (\a -&amp;gt; Vect k a) was itself a functor, indeed a monad, from &lt;b&gt;Set&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Vect&lt;/b&gt;. What happens if we compose these two functors? Why then of course we get a functor, and a monad, from &lt;b&gt;Set&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Alg&lt;/b&gt;. In Haskell terms, this is a functor a -&amp;gt; Vect k (TensorAlgebra a).&lt;br /&gt;&lt;br /&gt;What does this functor look like? Well, relative to a, Vect k (TensorAlgebra a) is the &lt;i&gt;free algebra&lt;/i&gt; on a, consisting of all expressions in which the elements of k and the elements of a are combined using (commutative) addition and (non-commutative) multiplication. In other words, the elements of a can be thought of as variable symbols, and Vect k (TensorAlgebra a) as the algebra of non-commutative polynomials in these variables.&lt;br /&gt;&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;injectTA' :: Num k =&amp;gt; a -&amp;gt; Vect k (TensorAlgebra a)&lt;br /&gt;injectTA' = injectTA . 
return&lt;br /&gt;&lt;br /&gt;liftTA' :: (Num k, Ord b, Show b, Algebra k b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; (a -&amp;gt; Vect k b) -&amp;gt; Vect k (TensorAlgebra a) -&amp;gt; Vect k b&lt;br /&gt;liftTA' = liftTA . linear&lt;br /&gt;&lt;br /&gt;fmapTA' :: (Num k, Ord b, Show b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(a -&amp;gt; b) -&amp;gt; Vect k (TensorAlgebra a) -&amp;gt; Vect k (TensorAlgebra b)&lt;br /&gt;fmapTA' = fmapTA . fmap&lt;br /&gt;&lt;br /&gt;bindTA' :: (Num k, Ord b, Show b) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Vect k (TensorAlgebra a) -&amp;gt; (a -&amp;gt; Vect k (TensorAlgebra b)) -&amp;gt; Vect k (TensorAlgebra b)&lt;br /&gt;bindTA' = flip liftTA'&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;The only one of these which might require a little explanation is liftTA'. This works by applying a universal property twice, as shown by the diagram below: first, the universal property of free vector spaces is used to lift a function a -&amp;gt; Vect k (TensorAlgebra b) to a function Vect k a -&amp;gt; Vect k (TensorAlgebra b); then the universal property of the tensor algebra is used to lift that to a function Vect k (TensorAlgebra a) -&amp;gt; Vect k (TensorAlgebra b).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-qJkxa70qXsk/ThnzaJagvtI/AAAAAAAAAI8/jE0_1VIqZCo/s1600/FreeAlgebra_LiftDerivation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="90" src="http://1.bp.blogspot.com/-qJkxa70qXsk/ThnzaJagvtI/AAAAAAAAAI8/jE0_1VIqZCo/s320/FreeAlgebra_LiftDerivation.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Here's an example, which shows that in the free algebra as in the tensor algebra, bind corresponds to variable substitution:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let [t,x,y,z] = map injectTA' ["t","x","y","z"] :: [Vect Q (TensorAlgebra 
String)]&lt;br /&gt;&amp;gt; let f "x" = 1-t^2; f "y" = 2*t; f "z" = 1+t^2&lt;br /&gt;&amp;gt; (x^2+y^2-z^2) `bindTA'` f&lt;br /&gt;0&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The result is 0 because the substitution gives (1-t^2)^2 + (2t)^2 - (1+t^2)^2 = (1 - 2t^2 + t^4) + 4t^2 - (1 + 2t^2 + t^4) = 0; only the single variable t is involved, so non-commutativity plays no part.&lt;br /&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5653873779426609723/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/07/tensor-algebra-monad.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5653873779426609723'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5653873779426609723'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/07/tensor-algebra-monad.html' title='The Tensor Algebra Monad'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-dlk7g918R0Q/ThnzbgKU8DI/AAAAAAAAAJI/xFtMQ87Eftk/s72-c/TensorAlgebra_UniversalProperty.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5812047000238997018</id><published>2011-04-23T21:48:00.000+01:00</published><updated>2011-04-23T21:48:45.624+01:00</updated><title type='text'>What is a Coalgebra?</title><content type='html'>&lt;a href="http://haskellformaths.blogspot.com/2011/04/what-is-algebra.html"&gt;Last 
time&lt;/a&gt; we saw how to define an algebra structure on a vector space, in terms of category theory. I think perhaps some readers wondered what we gained by using category theory. The answer may be: not much, yet. However, in due course, we would like to understand the connection between quantum algebra and knot theory, and for that, category theory is essential.&lt;br /&gt;&lt;br /&gt;This week, I want to look at coalgebras. We already saw, in the case of &lt;a href="http://haskellformaths.blogspot.com/2011/02/products-of-lists-and-vector-spaces.html"&gt;products and coproducts&lt;/a&gt;, how given a structure in category theory, you can define a dual structure by reversing the directions of all the arrows. So it is with algebras and coalgebras.&lt;br /&gt;&lt;br /&gt;Recall that an algebra consisted of a k-vector space A together with linear functions&lt;br /&gt;unit :: k -&amp;gt; A&lt;br /&gt;mult :: A⊗A -&amp;gt; A&lt;br /&gt;satisfying two commutative diagrams, associativity and unit.&lt;br /&gt;&lt;br /&gt;Well, a coalgebra consists of a k-vector space C together with two linear functions:&lt;br /&gt;counit :: C -&amp;gt; k&lt;br /&gt;comult :: C -&amp;gt; C⊗C&lt;br /&gt;satisfying the following two commutative diagrams:&lt;br /&gt;&lt;br /&gt;Coassociativity:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-USwBOcKB6sY/TbM1GUBYxKI/AAAAAAAAAIs/nLhWSj9YB0U/s1600/Coassoc.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="234" src="http://1.bp.blogspot.com/-USwBOcKB6sY/TbM1GUBYxKI/AAAAAAAAAIs/nLhWSj9YB0U/s320/Coassoc.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This diagram is actually shorthand for the following diagram:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/-z55STZY8KW0/TbM1G6huW0I/AAAAAAAAAIw/3QPjm8OPAGY/s1600/Coassoc2.png" imageanchor="1" 
style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="208" src="http://4.bp.blogspot.com/-z55STZY8KW0/TbM1G6huW0I/AAAAAAAAAIw/3QPjm8OPAGY/s320/Coassoc2.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;The isomorphisms at the top are the assocL and assocR isomorphisms that we defined &lt;a href="http://haskellformaths.blogspot.com/2011/03/tensor-products-part-2-monoids-and.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Counit:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-7pVoe5pn29U/TbM1HKUlnhI/AAAAAAAAAI0/8aan1SHxjfQ/s1600/Counit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="149" src="http://3.bp.blogspot.com/-7pVoe5pn29U/TbM1HKUlnhI/AAAAAAAAAI0/8aan1SHxjfQ/s320/Counit.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;These are just the associativity and unit diagrams, but with arrows reversed and relabeled.&lt;br /&gt;&lt;br /&gt;Recall that when we say that a diagram commutes, we mean that it doesn't matter which way you follow the arrows, you end up with the same result. In other words, what these diagrams are saying is:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;comult⊗id . comult == id⊗comult . comult &amp;nbsp;(Coassoc)&lt;br /&gt;counit⊗id . comult == unitInL &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (Left counit)&lt;br /&gt;id⊗counit . 
comult == unitInR &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (Right counit)&lt;/code&gt;&lt;br /&gt;(where unitInL, unitInR are the isomorphisms that we defined &lt;a href="http://haskellformaths.blogspot.com/2011/03/tensor-products-part-2-monoids-and.html"&gt;here&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;In HaskellForMaths, recall that we work with free vector spaces over a basis type, so the definition comes out slightly differently:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;module Math.Algebras.Structures where&lt;br /&gt;&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;class Coalgebra k c where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;counit :: Vect k c -&amp;gt; k&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;comult :: Vect k c -&amp;gt; Vect k (Tensor c c)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;What this definition really says is that c is the &lt;i&gt;basis&lt;/i&gt; for a coalgebra. As before, we could try using type families to make this look more like the mathematical definition:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;type TensorProd k u v =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(u ~ Vect k a, v ~ Vect k b) =&amp;gt; Vect k (Tensor a b)&lt;br /&gt;&lt;br /&gt;class Coalgebra2 k c where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;counit2 :: c -&amp;gt; k&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;comult2 :: c -&amp;gt; TensorProd k c c&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;In this definition, c is the coalgebra itself. However, I'm not going to pursue this approach for now.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;What do coalgebras look like? 
Well, they look a bit like algebras would, if you looked at them through the wrong end of the telescope.&lt;br /&gt;&lt;br /&gt;More specifically, given any basis b, define a dual basis as follows:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;newtype Dual basis = Dual basis deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show basis =&amp;gt; Show (Dual basis) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show (Dual b) = show b ++ "'"&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(For those who know what a dual vector space is - this is it. For those who don't, I'll explain in a minute.)&lt;br /&gt;&lt;br /&gt;Then, given an Algebra instance on some &lt;i&gt;finite-dimensional&lt;/i&gt; basis b, we can define a Coalgebra instance on Dual b as follows:&lt;br /&gt;&lt;br /&gt;1. Write out the unit and mult operations in the algebra as matrices.&lt;br /&gt;&lt;br /&gt;For example, in the case of the quaternions, we have&lt;br /&gt;unit:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; 1 i j k&lt;br /&gt;&lt;br /&gt;1 -&amp;gt; 1 0 0 0&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;mult:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1 &amp;nbsp;i &amp;nbsp;j &amp;nbsp;k&lt;br /&gt;&lt;br /&gt;1⊗1 -&amp;gt; &amp;nbsp;1 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;1⊗i -&amp;gt; &amp;nbsp;0 &amp;nbsp;1 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;1⊗j -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1 &amp;nbsp;0&lt;br /&gt;1⊗k -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1&lt;br /&gt;i⊗1 -&amp;gt; &amp;nbsp;0 &amp;nbsp;1 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;i⊗i -&amp;gt; -1 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;i⊗j -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1&lt;br /&gt;i⊗k -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 -1 &amp;nbsp;0&lt;br /&gt;j⊗1 -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1 
&amp;nbsp;0&lt;br /&gt;j⊗i -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0 -1&lt;br /&gt;j⊗j -&amp;gt; -1 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;j⊗k -&amp;gt; &amp;nbsp;0 &amp;nbsp;1 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;k⊗1 -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1&lt;br /&gt;k⊗i -&amp;gt; &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;1 &amp;nbsp;0&lt;br /&gt;k⊗j -&amp;gt; &amp;nbsp;0 -1 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;k⊗k -&amp;gt; -1 &amp;nbsp;0 &amp;nbsp;0 &amp;nbsp;0&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;2. Then transpose these two matrices, and use them as the definitions for counit and comult, but replacing each basis element by its dual.&lt;br /&gt;&lt;br /&gt;So, in the case of the quaternions, we would get&lt;br /&gt;counit:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;1'&lt;br /&gt;&lt;br /&gt;1' -&amp;gt; 1&lt;br /&gt;i' -&amp;gt; 0&lt;br /&gt;j' -&amp;gt; 0&lt;br /&gt;k' -&amp;gt; 0&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;comult:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; 1'⊗1' 1'⊗i' 1'⊗j' 1'⊗k' i'⊗1' i'⊗i' i'⊗j' i'⊗k' j'⊗1' j'⊗i' j'⊗j' j'⊗k' k'⊗1' k'⊗i' k'⊗j' k'⊗k'&lt;br /&gt;&lt;br /&gt;1' -&amp;gt; &amp;nbsp;1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp;-1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp;-1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp;-1&lt;br /&gt;i' -&amp;gt; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp;1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; 
&amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp;-1 &amp;nbsp; &amp;nbsp; 0&lt;br /&gt;j' -&amp;gt; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp;-1 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0&lt;br /&gt;k' -&amp;gt; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; &amp;nbsp;0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp;-1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 1 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; 0&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;In code, we get:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Num k =&amp;gt; Coalgebra k (Dual HBasis) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;counit = unwrap . 
linear counit'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where counit' (Dual One) = return ()&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;counit' _ = zero&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;comult = linear comult'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where comult' (Dual One) = return (Dual One, Dual One) &amp;lt;+&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(-1) *&amp;gt; ( return (Dual I, Dual I) &amp;lt;+&amp;gt; return (Dual J, Dual J) &amp;lt;+&amp;gt; return (Dual K, Dual K) )&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' (Dual I) = return (Dual One, Dual I) &amp;lt;+&amp;gt; return (Dual I, Dual One) &amp;lt;+&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;return (Dual J, Dual K) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (Dual K, Dual J)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' (Dual J) = return (Dual One, Dual J) &amp;lt;+&amp;gt; return (Dual J, Dual One) &amp;lt;+&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;return (Dual K, Dual I) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (Dual I, Dual K)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' (Dual K) = return (Dual One, Dual K) &amp;lt;+&amp;gt; return (Dual K, Dual One) &amp;lt;+&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;return (Dual I, Dual J) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (Dual J, Dual I)&lt;br /&gt;&lt;br /&gt;unwrap :: Num k =&amp;gt; Vect k () -&amp;gt; k&lt;br /&gt;unwrap (V []) = 0&lt;br 
/&gt;unwrap (V [( (),x)]) = x&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(Recall that when we want to think of k as a vector space, we have to represent it as Vect k ().)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;We should check that this does indeed define a coalgebra. It's clear, by definition, that counit and comult are linear, as required. Here's a quickcheck property for coassociativity and counit:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Coalgebra x =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;((comult `tf` id) . comult) x == (assocL . (id `tf` comult) . comult) x &amp;nbsp;&amp;amp;&amp;amp; -- coassociativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;((counit' `tf` id) . comult) x == unitInL x &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;amp;&amp;amp; -- left counit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;((id `tf` counit') . comult) x == unitInR x &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- right counit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where counit' = wrap . counit&lt;br /&gt;&lt;br /&gt;wrap :: Num k =&amp;gt; k -&amp;gt; Vect k ()&lt;br /&gt;wrap 0 = zero&lt;br /&gt;wrap x = V [( (),x)]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(It's a bit awkward that we have to keep wrapping and unwrapping between k and Vect k (). I think we could have avoided this if we had defined counit to have signature Vect k c -&amp;gt; Vect k () instead of Vect k c -&amp;gt; k. To be honest, I think that is probably the right thing to do, so perhaps I'll change it in some future release of HaskellForMaths. 
What does anyone else think?)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Anyway, here's a quickCheck property to test that the dual quaternions are a coalgebra:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Arbitrary b =&amp;gt; Arbitrary (Dual b) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = fmap Dual arbitrary&lt;br /&gt;&lt;br /&gt;prop_Coalgebra_DualQuaternion x = prop_Coalgebra x&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = x :: Vect Q (Dual HBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Coalgebra_DualQuaternion&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So, given an algebra structure on some vector space (basis), we can define a coalgebra structure on the dual vector space (dual basis). But why did we use the dual? Why didn't we just use the above construction to define a coalgebra structure on the vector space itself? For example, for the quaternions, that would look like this:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Num k =&amp;gt; Coalgebra k HBasis where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;counit = unwrap . 
linear counit'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where counit' One = return ()&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;counit' _ = zero&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;comult = linear comult'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where comult' One = return (One,One) &amp;lt;+&amp;gt; (-1) *&amp;gt; ( return (I,I) &amp;lt;+&amp;gt; return (J,J) &amp;lt;+&amp;gt; return (K,K) )&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' I = return (One,I) &amp;lt;+&amp;gt; return (I,One) &amp;lt;+&amp;gt; return (J,K) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (K,J)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' J = return (One,J) &amp;lt;+&amp;gt; return (J,One) &amp;lt;+&amp;gt; return (K,I) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (I,K)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;comult' K = return (One,K) &amp;lt;+&amp;gt; return (K,One) &amp;lt;+&amp;gt; return (I,J) &amp;lt;+&amp;gt; (-1) *&amp;gt; return (J,I)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Well, we could indeed do that. However, it obscures the underlying mathematics. The point is that an algebra structure on a finite-dimensional vector space gives rise "naturally" to a coalgebra structure on the dual space. It is also true that a finite-dimensional vector space is isomorphic to its dual (though not naturally: the isomorphism depends on a choice of basis), which is, I guess, why the second construction works.&lt;br /&gt;&lt;br /&gt;Okay, so why is the coalgebra structure on the dual vector space "natural"?&lt;br /&gt;&lt;br /&gt;Well first, what is a dual vector space anyway? Given a vector space V, the dual space V* is Hom(V,k), the space of linear maps from V to k. 
If V is finite-dimensional with basis b, then as we have seen, we can define a basis Dual b for V* which is in one-to-one correspondence with b. Given a basis element ei, then the dual basis element Dual ei represents the linear map that sends ei to 1, and any other basis element ej, j /= i, to 0.&lt;br /&gt;&lt;br /&gt;It is convenient to define a linear map called the evaluation map, ev: V*⊗V -&amp;gt; k, with ev(a⊗x) = a(x):&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;ev :: (Num k, Ord b) =&amp;gt; Vect k (Tensor (Dual b) b) -&amp;gt; k&lt;br /&gt;ev = unwrap . linear (\(Dual bi, bj) -&amp;gt; delta bi bj *&amp;gt; return ())&lt;br /&gt;&lt;br /&gt;-- where delta i j = if i == j then 1 else 0&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then given an element a in V* = Vect k (Dual b), and an element x in V = Vect k b, we can evaluate a(x) by calling ev (a `te` x).&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;dual = fmap Dual&lt;br /&gt;&lt;br /&gt;&amp;gt; ev $ dual e1 `te` (4 *&amp;gt; e1 &amp;lt;+&amp;gt; 5 *&amp;gt; e2)&lt;br /&gt;4&lt;br /&gt;&amp;gt; ev $ dual e2 `te` (4 *&amp;gt; e1 &amp;lt;+&amp;gt; 5 *&amp;gt; e2)&lt;br /&gt;5&lt;br /&gt;&amp;gt; ev $ dual (e1 &amp;lt;+&amp;gt; 2 *&amp;gt; e2) `te` (4 *&amp;gt; e1 &amp;lt;+&amp;gt; 5 *&amp;gt; e2)&lt;br /&gt;14&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Provided V is finite-dimensional, then every element of V* can be expressed in the form dual v, for some v in V.&lt;br /&gt;&lt;br /&gt;If we want to turn an element of Vect k (Dual b) into a real Haskell function Vect k b -&amp;gt; k, we can use the following code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;reify :: (Num k, Ord b) =&amp;gt; Vect k (Dual b) -&amp;gt; (Vect k b -&amp;gt; k)&lt;br /&gt;reify a x = ev (a `te` x)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let f = reify (dual e2)&lt;br /&gt;&amp;gt; f 
(4 *&amp;gt; e1 &amp;lt;+&amp;gt; 5 *&amp;gt; e2)&lt;br /&gt;5&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now, suppose that we have a linear map f: U -&amp;gt; V between vector spaces. This gives rise to a linear map f*: V* -&amp;gt; U*, by defining:&lt;br /&gt;ev (f* a ⊗ x) = ev (a ⊗ f x)&lt;br /&gt;It turns out that the matrix for f* will be the transpose of the matrix for f.&lt;br /&gt;&lt;br /&gt;For, suppose f(ui) = sum [mij vj | j &amp;lt;- ...] and f*(vi*) = sum [m*ij uj* | j &amp;lt;- ...]. Then&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;ev (f* vk* ⊗ ui) = ev (vk* ⊗ f ui) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- definition of f*&lt;br /&gt;=&amp;gt; ev (sum [m*kl ul* | l &amp;lt;- ...] ⊗ ui) = ev (vk* ⊗ sum [mij vj | j &amp;lt;- ...]) &amp;nbsp;-- expanding f and f*&lt;br /&gt;=&amp;gt; sum [m*kl ev(ul* ⊗ ui) | l &amp;lt;- ...] = sum [mij ev (vk* ⊗ vj) | j &amp;lt;- ...] &amp;nbsp; -- linearity of ev&lt;br /&gt;=&amp;gt; sum [m*kl (delta l i) | l &amp;lt;- ...] = sum [mij (delta k j) | j &amp;lt;- ...] &amp;nbsp; &amp;nbsp; &amp;nbsp; -- definition of ev&lt;br /&gt;=&amp;gt; m*ki = mik &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- definition of delta&lt;/code&gt;&lt;br /&gt;So m* is the transpose of m.&lt;br /&gt;&lt;br /&gt;Hence we have a contravariant functor * from the category of finite-dimensional vector spaces to itself, which takes a space V to the dual space V*, and a linear map f: U -&amp;gt; V to the dual map f*: V* -&amp;gt; U*. 
("Contravariant" just means that it reverses the directions of arrows.)&lt;br /&gt;&lt;br /&gt;Hopefully this explains why it is natural to think of an algebra structure on V as giving rise to a coalgebra structure on V*: a coalgebra is basically an algebra but with arrows reversed - and in going from vector spaces to their duals, we reverse arrows.&lt;br /&gt;&lt;br /&gt;By the way, the converse is also true: A coalgebra structure on V gives rise to an algebra structure on V*, in the same way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5812047000238997018?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5812047000238997018/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/04/what-is-coalgebra.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5812047000238997018'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5812047000238997018'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/04/what-is-coalgebra.html' title='What is a Coalgebra?'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-USwBOcKB6sY/TbM1GUBYxKI/AAAAAAAAAIs/nLhWSj9YB0U/s72-c/Coassoc.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-8463564113343387585</id><published>2011-04-16T20:21:00.001+01:00</published><updated>2011-04-16T21:51:30.957+01:00</updated><title type='text'>What is an Algebra?</title><content type='html'>Over the last few months, we've spent somewhat longer than I originally expected looking at vector spaces, direct sums and tensor products. I hope you haven't forgotten that the reason we were doing this is because we want to look at quantum algebra, and "quantum groups". What are quantum groups? Well, one thing they are is algebras - so the next thing we need to do is define algebras.&lt;br /&gt;&lt;br /&gt;Informally, an algebra is just a vector space which is also a ring (with unit) - or to put it another way, a ring (with unit) which is also a vector space. So a straightforward definition would be, A is an algebra if&lt;br /&gt;(i) A is an additive group (this is required by both vector spaces and rings)&lt;br /&gt;(ii) There is a scalar multiplication smult :: k×A -&amp;gt; A (satisfying some laws, as discussed in a previous post)&lt;br /&gt;(iii) There is a multiplication mult :: A×A -&amp;gt; A, satisfying some laws:&lt;br /&gt;- mult is associative: a(bc) = (ab)c&lt;br /&gt;- mult distributes over addition: a(b+c) = (ab)+(ac), (a+b)c = (ac)+(bc)&lt;br /&gt;(iv) There is a unit :: A, which is an identity for mult: 1a = a = a1&lt;br /&gt;&lt;br /&gt;Some examples:&lt;br /&gt;- C is an algebra over R (2-dimensional as a vector space)&lt;br /&gt;- 2×2 matrices over a field k form a k-algebra (4-dimensional)&lt;br /&gt;- polynomials over a field k form a k-algebra (infinite-dimensional)&lt;br /&gt;&lt;br /&gt;It would be fairly straightforward to translate these definitions into a Haskell type class as they are. 
However, we're going to do things slightly differently, for two reasons.&lt;br /&gt;&lt;br /&gt;Firstly, we would like to use the language of category theory, and define multiplication and unit in terms of linear maps (arrows in the category of vector spaces). Specifically, we define linear maps:&lt;br /&gt;unit :: k -&amp;gt; A&lt;br /&gt;mult :: A⊗A -&amp;gt; A&lt;br /&gt;&lt;br /&gt;We then require that the following diagrams commute:&lt;br /&gt;&lt;br /&gt;Associativity:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-HLIyGh7Gvcg/TaoA4VUiKgI/AAAAAAAAAIk/yc4fM809r0w/s1600/Associativity.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="243" src="http://1.bp.blogspot.com/-HLIyGh7Gvcg/TaoA4VUiKgI/AAAAAAAAAIk/yc4fM809r0w/s320/Associativity.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;When we say that this diagram commutes, what it means is that it doesn't matter which way you decide to follow the arrows, the result is the same. Specifically:&lt;br /&gt;mult . mult⊗id == mult . id⊗mult&lt;br /&gt;&lt;br /&gt;Unit:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-83eZQCdAErM/TaoA6Y1KHrI/AAAAAAAAAIo/LNyIlAlY0T8/s1600/Unit.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="150" src="http://3.bp.blogspot.com/-83eZQCdAErM/TaoA6Y1KHrI/AAAAAAAAAIo/LNyIlAlY0T8/s320/Unit.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In this case there are two commuting triangles, so we are saying:&lt;br /&gt;mult . unit⊗id == unitOutL&lt;br /&gt;mult . 
id⊗unit == unitOutR&lt;br /&gt;(where unitOutL, unitOutR are the relevant isomorphisms, which we defined &lt;a href="http://haskellformaths.blogspot.com/2011/03/tensor-products-part-2-monoids-and.html"&gt;last time&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;But hold on, haven't we forgotten one of the requirements? What about distributivity? Well, notice that the signature of mult was A⊗A -&amp;gt; A, not A×A -&amp;gt; A. A linear map from the tensor product A⊗A corresponds to a map from A×A that is linear in each component, i.e. bilinear (by definition of tensor product) - and bilinearity implies distributivity. So in this version, distributivity is built into the definition of mult. Neat, eh?&lt;br /&gt;&lt;br /&gt;The second reason our definition will be different is that in HaskellForMaths, all our vector spaces are free vector spaces over some basis type b: V = Vect k b. Consequently, our algebras will be A = Vect k a, where a is a k-basis for the algebra. Because of this, it will turn out to be more natural to express some things in terms of a, rather than A.&lt;br /&gt;&lt;br /&gt;With those forewarnings, here's the HaskellForMaths definition of an algebra:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;module Math.Algebras.Structures where&lt;br /&gt;&lt;br /&gt;import Math.Algebras.VectorSpace&lt;br /&gt;import Math.Algebras.TensorProduct&lt;br /&gt;&lt;br /&gt;class Algebra k a where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unit :: k -&amp;gt; Vect k a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mult :: Vect k (Tensor a a) -&amp;gt; Vect k a&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;In this definition, a represents the basis of the algebra, not the algebra itself, which is A = Vect k a. 
Recall that we defined a type Tensor a b, which is a basis for Vect k a ⊗ Vect k b.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If we wanted to stay a bit closer to the category theory definition, we could try continuing with the type family approach that we looked at last time:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;type TensorProd k u v =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(u ~ Vect k a, v ~ Vect k b) =&amp;gt; Vect k (Tensor a b)&lt;br /&gt;&lt;br /&gt;class Algebra2 k a where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unit2 :: k -&amp;gt; a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mult2 :: TensorProd k a a -&amp;gt; a&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;In this definition a is the algebra itself. I'm not going to pursue that approach any further here, but if anyone fancies giving it a go, I'd be interested to hear how they get on.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Anyway, as discussed, unit and mult are required to be linear maps. We could check this using QuickCheck, but in practice we will always define unit and mult in such a way that they are clearly linear.&lt;br /&gt;&lt;br /&gt;However, we can write a QuickCheck property to check the other requirements:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Algebra (k,x,y,z) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(mult . (id `tf` mult)) (x `te` (y `te` z)) ==&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (mult . (mult `tf` id)) ((x `te` y) `te` z) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;amp;&amp;amp; -- associativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unitOutL (k' `te` x) == (mult . (unit' `tf` id)) (k' `te` x) &amp;nbsp;&amp;amp;&amp;amp; -- left unit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unitOutR (x `te` k') == (mult . 
(id `tf` unit')) (x `te` k') &amp;nbsp; &amp;nbsp; -- right unit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where k' = k *&amp;gt; return ()&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(Recall that when we wish to consider k as a vector space, we represent it as the free vector space Vect k ().)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;When we have an algebra, then we have a ring, so we can define a Num instance:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance (Num k, Eq b, Ord b, Show b, Algebra k b) =&amp;gt; Num (Vect k b) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x+y = x &amp;lt;+&amp;gt; y&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;negate x = neg x&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x*y = mult (x `te` y)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fromInteger n = unit (fromInteger n)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;abs _ = error "Prelude.Num.abs: inappropriate abstraction"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;signum _ = error "Prelude.Num.signum: inappropriate abstraction"&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This means that when we have an algebra, we'll be able to write expressions using the usual arithmetic operators.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so how about some examples of algebras? Well, we mentioned the complex numbers as an algebra over the reals, but let's go one better and define the quaternion algebra. This is a four-dimensional algebra over any field k, with basis {1, i, j, k}, whose multiplication satisfies the relations i^2 = j^2 = k^2 = ijk = -1. (It follows, for example, that (ijk)k = (-1)k, so ij(k^2) = -k, so -ij = -k, so ij = k.)&lt;br /&gt;&lt;br /&gt;Here's the code. First, we define our basis. 
The quaternions are traditionally denoted H, after Hamilton, who discovered them:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;data HBasis = One | I | J | K deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;type Quaternion k = Vect k HBasis&lt;br /&gt;&lt;br /&gt;i,j,k :: Num k =&amp;gt; Quaternion k&lt;br /&gt;i = return I&lt;br /&gt;j = return J&lt;br /&gt;k = return K&lt;br /&gt;&lt;br /&gt;instance Show HBasis where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show One = "1"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show I = "i"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show J = "j"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show K = "k"&lt;br /&gt;&lt;br /&gt;instance (Num k) =&amp;gt; Algebra k HBasis where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unit x = x *&amp;gt; return One&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mult = linear m&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; where m (One,b) = return b&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (b,One) = return b&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (I,I) = unit (-1)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (J,J) = unit (-1)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (K,K) = unit (-1)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (I,J) = return K&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (J,I) = -1 *&amp;gt; return K&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (J,K) = return I&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (K,J) = -1 *&amp;gt; return I&lt;br /&gt;&amp;nbsp;&amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (K,I) = return J&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; m (I,K) = -1 *&amp;gt; return J&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Note that unit and mult are both linear by definition.&lt;br /&gt;&lt;br /&gt;Let's just check that the code works as expected:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.Algebras.Quaternions&lt;br /&gt;&amp;gt; i^2&lt;br /&gt;-1&lt;br /&gt;&amp;gt; j^2&lt;br /&gt;-1&lt;br /&gt;&amp;gt; i*j&lt;br /&gt;k&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Now, are we sure that the quaternions are an algebra? Well, it's clear from the definition that the left and right unit conditions hold - see the lines m (One,b) = m (b,One) = return b. But it's not obvious that the associativity condition holds, so perhaps we should quickCheck:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Arbitrary HBasis where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = elements [One,I,J,K]&lt;br /&gt;&lt;br /&gt;prop_Algebra_Quaternion (k,x,y,z) = prop_Algebra (k,x,y,z)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (k,x,y,z) :: (Q, Quaternion Q, Quaternion Q, Quaternion Q)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Algebra_Quaternion&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Ok, how about 2*2 matrices. 
These form a four dimensional algebra generated by the elementary matrices {e11, e12, e21, e22}, where eij is the matrix with a 1 in the (i,j) position, and 0s elsewhere:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;data Mat2 = E2 Int Int deriving (Eq,Ord,Show)&lt;br /&gt;&lt;br /&gt;instance Num k =&amp;gt; Algebra k Mat2 where&lt;br /&gt;&amp;nbsp;&amp;nbsp; unit x = x *&amp;gt; V [(E2 i i, 1) | i &amp;lt;- [1..2] ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; mult = linear mult' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;mult' (E2 i j, E2 k l) = delta j k *&amp;gt; return (E2 i l)&lt;br /&gt;&lt;br /&gt;delta i j | i == j &amp;nbsp; &amp;nbsp;= 1&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;| otherwise = 0&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Notice the way that we only have to define multiplication on our basis elements, the elementary matrices eij, and the rest follows by bilinearity. Notice also that unit and mult are linear by definition. Let's just sanity check that this works as expected:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :l Math.Algebras.Matrix&lt;br /&gt;&amp;gt; unit 1 :: Vect Q Mat2&lt;br /&gt;E2 1 1+E2 2 2&lt;br /&gt;&amp;gt; let a = 2 *&amp;gt; return (E2 1 2) + 3 *&amp;gt; return (E2 2 1)&lt;br /&gt;&amp;gt; a^2&lt;br /&gt;6E2 1 1+6E2 2 2&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It's straightforward to define an Arbitrary instance for Mat2, and quickCheck that it satisfies the algebra conditions.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Finally, what about polynomials? Let's go one better, and define polynomials in more than one variable. An obvious basis for these polynomials as a vector space is the set of monomials. For example, polynomials in x,y,z are a vector space on the basis { x^i y^j z^k | i &amp;lt;- [0..], j &amp;lt;- [0..], k &amp;lt;- [0..] 
}.&lt;br /&gt;&lt;br /&gt;Recall that our vector space code requires that the basis be an Ord instance, so we need to define an ordering on monomials. There are many ways to do this. We'll use the graded lex or glex ordering, which says that monomials of higher degree sort before those of lower degree, and among those of equal degree, lexicographic (alphabetical) ordering applies.&lt;br /&gt;&lt;br /&gt;Given any set X of variables, we can construct the polynomial algebra over X as the vector space with basis the monomials in X. In the example above, we had X = {x,y,z}. For this reason, we'll allow our glex monomials to be polymorphic in the type of the variables. (In practice though, as you will see shortly, we will often just use String as the type of our variables.)&lt;br /&gt;&lt;br /&gt;So a glex monomial over variables v is basically just a list of powers of elements of v:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;data GlexMonomial v = Glex Int [(v,Int)] deriving (Eq)&lt;br /&gt;-- The initial Int is the degree of the monomial. 
Storing it speeds up equality tests and comparisons&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example x^3 y^2 would be represented as Glex 5 [("x",3),("y",2)].&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Ord v =&amp;gt; Ord (GlexMonomial v) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;compare (Glex si xis) (Glex sj yjs) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;compare (-si, [(x,-i) | (x,i) &amp;lt;- xis]) (-sj, [(y,-j) | (y,j) &amp;lt;- yjs])&lt;br /&gt;-- all the minus signs are to make things sort in the right order&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;[There's a bug in the HaskellForMaths v0.3.2 version of this Ord instance - the above code, which fixes it, will be in the next release.]&lt;br /&gt;&lt;br /&gt;I won't bore you with the Show instance - it's a bit fiddly.&lt;br /&gt;&lt;br /&gt;The Algebra instance uses the fact that monomials form a monoid under multiplication, with unit 1. Given any monoid, we can form the free vector space having the monoid elements as basis, and then lift the unit and multiplication in the monoid into the vector space, thus forming an algebra called the monoid algebra. 
Here's the construction:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance (Num k, Ord v) =&amp;gt; Algebra k (GlexMonomial v) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;unit x = x *&amp;gt; return munit&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where munit = Glex 0 []&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;mult xy = nf $ fmap (\(a,b) -&amp;gt; a `mmult` b) xy&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where mmult (Glex si xis) (Glex sj yjs) = Glex (si+sj) $ addmerge xis yjs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(In the addmerge function, we ensure that provided the variables were listed in ascending order in both inputs, then they are still so in the output.)&lt;br /&gt;&lt;br /&gt;Finally, here's a convenience function for injecting a variable into the polynomial algebra:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;glexVar v = V [(Glex 1 [(v,1)], 1)]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then for example, we can do the following:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;type GlexPoly k v = Vect k (GlexMonomial v)&lt;br /&gt;&lt;br /&gt;&amp;gt; let x = glexVar "x" :: GlexPoly Q String&lt;br /&gt;&amp;gt; let y = glexVar "y" :: GlexPoly Q String&lt;br /&gt;&amp;gt; let z = glexVar "z" :: GlexPoly Q String&lt;br /&gt;&amp;gt; (x+y+z)^3&lt;br /&gt;x^3+3x^2y+3x^2z+3xy^2+6xyz+3xz^2+y^3+3y^2z+3yz^2+z^3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;I hope you were impressed at how easy that all was. The foundation of free vector spaces, tensor products and algebras is only around a hundred lines of code. 
It then takes just another dozen lines or so to define an algebra: just define a basis, and define how the basis elements multiply - the rest follows by linearity.</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/8463564113343387585/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/04/what-is-algebra.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8463564113343387585'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8463564113343387585'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/04/what-is-algebra.html' title='What is an Algebra?'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-HLIyGh7Gvcg/TaoA4VUiKgI/AAAAAAAAAIk/yc4fM809r0w/s72-c/Associativity.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-667145862310179957</id><published>2011-03-18T10:23:00.000Z</published><updated>2011-03-18T10:23:35.042Z</updated><title type='text'>Tensor Products, part 2: Monoids and Arrows</title><content type='html'>[New release, HaskellForMaths v0.3.2, available on &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;Hackage&lt;/a&gt;]&lt;br 
/&gt;&lt;br /&gt;&lt;a href="http://haskellformaths.blogspot.com/2011/02/tensor-products-of-vector-spaces-part-1.html"&gt;Last time&lt;/a&gt; we looked at the tensor product of free vector spaces. Given A = Vect k a, B = Vect k b, then the tensor product A⊗B can be represented as Vect k (a,b). As we saw, the tensor product is the "mother of all bilinear functions".&lt;br /&gt;&lt;br /&gt;In the HaskellForMaths library, I have defined a couple of type synonyms for direct sum and tensor product:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;type DSum a b = Either a b&lt;br /&gt;type Tensor a b = (a,b)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;This means that in type signatures we can write the type of a direct sum as Vect k (DSum a b), and of a tensor product as Vect k (Tensor a b). The idea is that this will remind us what we're dealing with, and make things clearer.&lt;br /&gt;&lt;br /&gt;During development, I initially called the tensor type TensorBasis. In maths, tensor product is thought of as an operation on vector spaces - A⊗B - rather than on their bases. 
It would be nicer if we could define direct sum and tensor product as operators on the vector spaces themselves, rather than their bases.&lt;br /&gt;&lt;br /&gt;Well, we can have a go, something like this:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;{-# LANGUAGE TypeFamilies, RankNTypes #-}&lt;br /&gt;&lt;br /&gt;type DirectSum k u v =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(u ~ Vect k a, v ~ Vect k b) =&amp;gt; Vect k (DSum a b)&lt;br /&gt;&lt;br /&gt;type TensorProd k u v =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(u ~ Vect k a, v ~ Vect k b) =&amp;gt; Vect k (Tensor a b)&lt;br /&gt;&lt;br /&gt;type En = Vect Q EBasis&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This appears to work:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;$ ghci -XTypeFamilies&lt;br /&gt;...&lt;br /&gt;&amp;gt; :l Math.Test.TAlgebras.TTensorProduct&lt;br /&gt;...&lt;br /&gt;&amp;gt; e1 `te` e2 :: TensorProd Q En En&lt;br /&gt;(e1,e2)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;I'll reserve judgement. (Earlier in the development of the quantum algebra code for HaskellForMaths, I tried something similar to this, and ran into problems later on - but I can't now remember exactly what I did, so perhaps this will work.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so what can we do with tensor products. Well first, given vectors u in A = Vect k a and v in B = Vect k b, we can form their tensor product, u⊗v, an element of A⊗B = Vect k (Tensor a b). 
To calculate u⊗v, we use the bilinearity of tensor product to reduce the tensor product of arbitrary vectors to a linear combination of tensor products of basis elements:&lt;br /&gt;(x1 a1 + x2 a2)⊗(y1 b1 + y2 b2) = x1 y1 a1⊗b1 + x1 y2 a1⊗b2 + x2 y1 a2⊗b1 + x2 y2 a2⊗b2&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;te :: Num k =&amp;gt; Vect k a -&amp;gt; Vect k b -&amp;gt; Vect k (Tensor a b)&lt;br /&gt;te (V us) (V vs) = V [((a,b), x*y) | (a,x) &amp;lt;- us, (b,y) &amp;lt;- vs]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;This is in essence just the "tensor" function from last time, but rewritten to take its two inputs separately rather than in a direct sum. Mnemonic: "te" stands for "&lt;b&gt;t&lt;/b&gt;ensor product of &lt;b&gt;e&lt;/b&gt;lements". Note that the definition respects normal form: provided the inputs are in normal form (the as and bs are in order, and the xs and ys are non-zero), then so is the output.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Associativity&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We can form tensor products of tensor products, such as A⊗(B⊗C) = Vect k (a,(b,c)), and likewise (A⊗B)⊗C = Vect k ((a,b),c). These two are isomorphic as vector spaces. This is obvious if you think about it in the right way. Recall from last week that we can think of elements of A⊗B = Vect k (a,b) as 2-dimensional matrices with rows indexed by a, columns indexed by b, and entries in k. 
Well A⊗B⊗C (we can drop the parentheses as it makes no difference) is the space of three-dimensional matrices, with one dimension indexed by a, one by b, and one by c.&lt;br /&gt;&lt;br /&gt;We can define isomorphisms either way with the following Haskell code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;assocL :: Vect k (Tensor u (Tensor v w)) -&amp;gt; Vect k (Tensor (Tensor u v) w)&lt;br /&gt;assocL = fmap ( \(a,(b,c)) -&amp;gt; ((a,b),c) )&lt;br /&gt;&lt;br /&gt;assocR :: Vect k (Tensor (Tensor u v) w) -&amp;gt; Vect k (Tensor u (Tensor v w))&lt;br /&gt;assocR = fmap ( \((a,b),c) -&amp;gt; (a,(b,c)) )&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It's clear that these functions are linear, since they're defined using fmap. It's also clear that they are bijections, since they are mutually inverse. Hence they are the required isomorphisms.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Unit&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Last time we saw that the field k is itself a vector space, which can be represented as the free vector space Vect k (). What happens if we take the tensor product k⊗A of the field with some other vector space A = Vect k a? Well, if you think about it in terms of matrices, Vect k () is a one-dimensional vector space, so Vect k ((),a) will be a 1*n matrix (where n is the number of basis elements in a). But a 1*n matrix looks just the same as an n-vector:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;a1 a2 ... &amp;nbsp; &amp;nbsp; &amp;nbsp;a1 a2 ...&lt;br /&gt;() ( . &amp;nbsp;. &amp;nbsp; &amp;nbsp;) ~= ( . &amp;nbsp;. &amp;nbsp; &amp;nbsp;)&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;So we should expect that k⊗A = Vect k ((),a) is isomorphic to A = Vect k a. 
And indeed it is - here are the relevant isomorphisms:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;unitInL = fmap ( \a -&amp;gt; ((),a) )&lt;br /&gt;&lt;br /&gt;unitOutL = fmap ( \((),a) -&amp;gt; a )&lt;br /&gt;&lt;br /&gt;unitInR = fmap ( \a -&amp;gt; (a,()) )&lt;br /&gt;&lt;br /&gt;unitOutR = fmap ( \(a,()) -&amp;gt; a )&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So tensor product is associative, and has a unit. In other words, vector spaces form a monoid under tensor product.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Tensor product of functions&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Given linear functions f: A -&amp;gt; A', g: B -&amp;gt; B', we can define a linear function f⊗g: A⊗B -&amp;gt; A'⊗B' by&lt;br /&gt;(f⊗g)(a⊗b) = f(a)⊗g(b)&lt;br /&gt;&lt;br /&gt;Exercise: Prove that f⊗g is linear&lt;br /&gt;&lt;br /&gt;Here's the Haskell code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;tf :: (Num k, Ord a', Ord b') =&amp;gt; (Vect k a -&amp;gt; Vect k a') -&amp;gt; (Vect k b -&amp;gt; Vect k b')&lt;br /&gt;&amp;nbsp;&amp;nbsp; -&amp;gt; Vect k (Tensor a b) -&amp;gt; Vect k (Tensor a' b')&lt;br /&gt;tf f g (V ts) = sum [x *&amp;gt; te (f $ return a) (g $ return b) | ((a,b), x) &amp;lt;- ts]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where sum = foldl add zero&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(Mnemonic: "tf" stands for "&lt;b&gt;t&lt;/b&gt;ensor product of &lt;b&gt;f&lt;/b&gt;unctions".)&lt;br /&gt;&lt;br /&gt;Let's just check that this is linear:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Linear_tf ((f,g),k,(a1,a2,b1,b2)) = prop_Linear (linfun f `tf` linfun g) (k, a1 `te` b1, a2 `te` b2)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (f,g,k,a1,a2,b1,b2) :: (LinFun Q ABasis SBasis, LinFun Q BBasis TBasis, Q,&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Vect Q ABasis, Vect Q ABasis, Vect Q BBasis, Vect Q BBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_Linear_tf&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So we now have tensor product operations on objects and on arrows. In each case, tensor product takes a pair of objects/arrows, and returns a new object/arrow.&lt;br /&gt;&lt;br /&gt;There is a product category k-Vect×k-Vect, consisting of pairs of objects and pairs of arrows from k-Vect. The identity arrow is defined to be (id,id), and composition is defined by (f,g) . (f',g') = (f . f', g . g'). Given these definitions, it turns out that tensor product is a functor from k-Vect×k-Vect to k-Vect. (Another way to say this is that tensor product is a bifunctor in the category of vector spaces.)&lt;br /&gt;&lt;br /&gt;Recall that a functor is just a map that "commutes" with the category operations, id and . (composition).&lt;br /&gt;So the conditions for tensor product to be a functor are:&lt;br /&gt;id⊗id = id&lt;br /&gt;(f' . f)⊗(g' . g) = (f'⊗g') . (f⊗g)&lt;br /&gt;&lt;br /&gt;Both of these follow immediately from the definition of f⊗g that was given above. However, just in case you don't believe me, here's a quickCheck property to test it:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_TensorFunctor ((f1,f2,g1,g2),(a,b)) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(id `tf` id) (a `te` b) == id (a `te` b) &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;((f' . f) `tf` (g' . g)) (a `te` b) == ((f' `tf` g') . 
(f `tf` g)) (a `te` b)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f = linfun f1&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;f' = linfun f2&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g = linfun g1&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g' = linfun g2&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;types = (f1,f2,g1,g2,a,b) :: (LinFun Q ABasis ABasis, LinFun Q ABasis ABasis,&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;LinFun Q BBasis BBasis, LinFun Q BBasis BBasis,&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Vect Q ABasis, Vect Q BBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_TensorFunctor&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;We can think of composition as doing things in series, and tensor as doing things in parallel.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh3.googleusercontent.com/-wNnfMjoPFlw/TYMvykI5OhI/AAAAAAAAAIU/qUYFrMhwJiQ/s1600/ArrowComposition.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="93" src="https://lh3.googleusercontent.com/-wNnfMjoPFlw/TYMvykI5OhI/AAAAAAAAAIU/qUYFrMhwJiQ/s200/ArrowComposition.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh5.googleusercontent.com/-31cOu-meR0Y/TYMvy56GlSI/AAAAAAAAAIY/xWLUH0S_Nx0/s1600/ArrowTensor.png" imageanchor="1" 
style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="200" src="https://lh5.googleusercontent.com/-31cOu-meR0Y/TYMvy56GlSI/AAAAAAAAAIY/xWLUH0S_Nx0/s200/ArrowTensor.png" width="182" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Then the second bifunctor condition can be paraphrased as "Doing things in parallel, in series, is the same as doing things in series, in parallel", as represented by the following diagram.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh3.googleusercontent.com/-_YPjcDf51DY/TYMvyPBHKhI/AAAAAAAAAIM/v7xIn8juQuk/s1600/ArrowBifunctor.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="146" src="https://lh3.googleusercontent.com/-_YPjcDf51DY/TYMvyPBHKhI/AAAAAAAAAIM/v7xIn8juQuk/s400/ArrowBifunctor.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;You might recall that there are a couple of Haskell type classes for that kind of thing. The Category typeclass from Control.Category is about doing things in series. Here is the definition:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class Category cat where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;id :: cat a a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(.) 
:: cat b c -&amp;gt; cat a b -&amp;gt; cat a c&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;The Arrow typeclass from Control.Arrow is about doing things in parallel:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class Category arr =&amp;gt; Arrow arr where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arr :: (a -&amp;gt; b) -&amp;gt; arr a b&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;first :: arr a b -&amp;gt; arr (a,c) (b,c)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;second :: arr a b -&amp;gt; arr (c,a) (c,b)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(***) :: arr a b -&amp;gt; arr a' b' -&amp;gt; arr (a,a') (b,b')&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(&amp;amp;&amp;amp;&amp;amp;) :: arr a b -&amp;gt; arr a b' -&amp;gt; arr a (b,b')&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Intuitively, linear functions (Vect k a -&amp;gt; Vect k b) are arrows, via the definitions:&lt;br /&gt;id = id&lt;br /&gt;(.) = (.)&lt;br /&gt;arr = fmap&lt;br /&gt;first f = f `tf` id&lt;br /&gt;second f = id `tf` f&lt;br /&gt;f *** g = f `tf` g&lt;br /&gt;(f &amp;amp;&amp;amp;&amp;amp; g) a = (f `tf` g) (a `te` a)&lt;br /&gt;&lt;br /&gt;However, in order to define an Arrow instance we'll have to wrap the functions in a newtype.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;import Prelude as P&lt;br /&gt;import Control.Category as C&lt;br /&gt;import Control.Arrow&lt;br /&gt;&lt;br /&gt;newtype Linear k a b = Linear (Vect k a -&amp;gt; Vect k b)&lt;br /&gt;&lt;br /&gt;instance Category (Linear k) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;id = Linear P.id&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(Linear f) . (Linear g) = Linear (f P.. 
g)&lt;br /&gt;&lt;br /&gt;instance Num k =&amp;gt; Arrow (Linear k) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arr f = Linear (fmap f)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;first (Linear f) = Linear f *** Linear P.id&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;second (Linear f) = Linear P.id *** Linear f&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Linear f *** Linear g = Linear (f `tf2` g)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;where tf2 f g (V ts) = V $ concat&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;[let V us = x *&amp;gt; te (f $ return a) (g $ return b) in us | ((a,b), x) &amp;lt;- ts]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Linear f &amp;amp;&amp;amp;&amp;amp; Linear g = (Linear f *** Linear g) C.. Linear (\a -&amp;gt; a `te` a)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Note that we can't use tf directly, because it requires Ord instances for the output basis types (a' and b' in the signature of tf), and the Arrow class gives us no way to require these. For this reason we define a tf2 function, which is equivalent except that it doesn't guarantee that results are in normal form.&lt;br /&gt;&lt;br /&gt;There are loads of other things I could talk about:&lt;br /&gt;Exercise: Show that direct sum is also a monoid, with the zero vector space as its identity. (Write Haskell functions for the necessary isomorphisms.)&lt;br /&gt;Exercise: Show that tensor product distributes over direct sum - A⊗(B⊕C) is isomorphic to (A⊗B)⊕(A⊗C). (Write the isomorphisms.)&lt;br /&gt;Exercise: Show that given f: A-&amp;gt;A', g: B-&amp;gt;B', it is possible to define a linear function f⊕g: A⊕B-&amp;gt;A'⊕B' by (f⊕g)(a⊕b) = f(a)⊕g(b). (Write a dsumf function analogous to tf.)&lt;br /&gt;&lt;br /&gt;There is another arrow-related typeclass called ArrowChoice. 
It represents arrows where you have a choice of doing either one thing or another thing:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class Arrow arr =&amp;gt; ArrowChoice arr where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;left :: arr a b -&amp;gt; arr (Either a c) (Either b c)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;right :: arr a b -&amp;gt; arr (Either c a) (Either c b)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(+++) :: arr a a' -&amp;gt; arr b b' -&amp;gt; arr (Either a b) (Either a' b')&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(|||) :: arr a c -&amp;gt; arr b c -&amp;gt; arr (Either a b) c&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh3.googleusercontent.com/-gnO5Y_5Ae4I/TYMvyfE-rTI/AAAAAAAAAIQ/fLwtCSVHOX0/s1600/ArrowChoice.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="175" src="https://lh3.googleusercontent.com/-gnO5Y_5Ae4I/TYMvyfE-rTI/AAAAAAAAAIQ/fLwtCSVHOX0/s200/ArrowChoice.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Exercise: Show that the dsumf function can be used to give an ArrowChoice instance for linear functions, where the left summand goes down one path and the right summand down another.</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/667145862310179957/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/03/tensor-products-part-2-monoids-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/5195188167565410449/posts/default/667145862310179957'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/667145862310179957'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/03/tensor-products-part-2-monoids-and.html' title='Tensor Products, part 2: Monoids and Arrows'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh3.googleusercontent.com/-wNnfMjoPFlw/TYMvykI5OhI/AAAAAAAAAIU/qUYFrMhwJiQ/s72-c/ArrowComposition.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-8500486448701966655</id><published>2011-02-21T20:13:00.001Z</published><updated>2011-02-22T20:26:06.275Z</updated><title type='text'>Tensor products of vector spaces, part 1</title><content type='html'>A little while back on this blog, we defined the &lt;a href="http://haskellformaths.blogspot.com/2010/12/free-vector-space-on-type-part-1.html"&gt;free k-vector space over a type b&lt;/a&gt;:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;newtype Vect k b = V [(b,k)] deriving (Eq,Ord)&lt;/code&gt;&lt;br /&gt;Elements of Vect k b are k-linear combinations of elements of b.&lt;br /&gt;&lt;br /&gt;Whenever we have a mathematical structure like this, we want to know about building blocks and new-from-old constructions.&lt;br /&gt;&lt;br /&gt;We already looked at one new-from-old construction: given free k-vector spaces A = Vect k a and B = Vect k b, we can construct their direct sum A⊕B = Vect k (Either a b).&lt;br /&gt;&lt;br /&gt;We saw that the direct sum is both the &lt;a 
href="http://haskellformaths.blogspot.com/2011/02/products-of-lists-and-vector-spaces.html"&gt;product&lt;/a&gt; and the &lt;a href="http://haskellformaths.blogspot.com/2011/01/coproducts-of-lists-and-free-vector.html"&gt;coproduct&lt;/a&gt; in the category of free vector spaces - which means that it is the object which satisfies the universal properties implied by the following two diagrams:&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-7wIViKlEwSo/TWLBKRnpMBI/AAAAAAAAAH8/q1EsnivgWvU/s1600/Vect_product.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="117" src="http://3.bp.blogspot.com/-7wIViKlEwSo/TWLBKRnpMBI/AAAAAAAAAH8/q1EsnivgWvU/s320/Vect_product.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-DSAB9Cos0LQ/TWLBSgvjKrI/AAAAAAAAAIA/CYsko9S-X2I/s1600/Vect_coproduct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="117" src="http://2.bp.blogspot.com/-DSAB9Cos0LQ/TWLBSgvjKrI/AAAAAAAAAIA/CYsko9S-X2I/s320/Vect_coproduct.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;So we have injections i1, i2 : Vect k a, Vect k b -&amp;gt; Vect k (Either a b), to put elements of A and B into the direct sum A⊕B, and projections p1, p2 : Vect k (Either a b) -&amp;gt; Vect k a, Vect k b to take them back out again.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;However, there is another obvious new-from-old construction: Vect k (a,b). What does this represent?&lt;br /&gt;&lt;br /&gt;In order to answer that question, we need to look at bilinear functions. 
The basic idea of a bilinear function is that it is a function of two arguments, which is linear in each argument. So we might start by looking at functions f :: Vect k a -&amp;gt; Vect k b -&amp;gt; Vect k t.&lt;br /&gt;&lt;br /&gt;However, functions of two arguments don't really sit very well in category theory, where arrows are meant to have a single source. (We can handle functions of two arguments in multicategories, but I don't want to go there just yet.) In order to stay within category theory, we need to combine the two arguments into a single argument, using the direct sum construction. So instead of looking at functions f :: Vect k a -&amp;gt; Vect k b -&amp;gt; Vect k t, we will look at functions f :: Vect k (Either a b) -&amp;gt; Vect k t.&lt;br /&gt;&lt;br /&gt;To see that they are equivalent, recall from last time that Vect k (Either a b) is isomorphic to (Vect k a, Vect k b), via the isomorphisms:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;to :: (Vect k a, Vect k b) -&amp;gt; Vect k (Either a b)&lt;br /&gt;to = \(u,v) -&amp;gt; i1 u &amp;lt;+&amp;gt; i2 v&lt;br /&gt;from :: Vect k (Either a b) -&amp;gt; (Vect k a, Vect k b)&lt;br /&gt;from = \uv -&amp;gt; (p1 uv, p2 uv)&lt;/code&gt;&lt;br /&gt;So in going from f :: Vect k a -&amp;gt; Vect k b -&amp;gt; Vect k t to f :: Vect k (Either a b) -&amp;gt; Vect k t, we're really just uncurrying.&lt;br /&gt;&lt;br /&gt;Ok, so suppose we are given f :: Vect k (Either a b) -&amp;gt; Vect k t. It helps to still think of this as a function of two arguments, even though we've wrapped them up together in either side of a direct sum. Then we say that f is bilinear, if it is linear in each side of the direct sum. 
That is:&lt;br /&gt;- for any fixed a0 in A, the function f_a0 :: Vect k b -&amp;gt; Vect k t, f_a0 = \b -&amp;gt; f (i1 a0 &amp;lt;+&amp;gt; i2 b) is linear&lt;br /&gt;- for any fixed b0 in B, the function f_b0 :: Vect k a -&amp;gt; Vect k t, f_b0 = \a -&amp;gt; f (i1 a &amp;lt;+&amp;gt; i2 b0) is linear&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Here's a QuickCheck property to test whether a function is bilinear:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Bilinear :: (Num k, Ord a, Ord b, Ord t) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; (Vect k (Either a b) -&amp;gt; Vect k t) -&amp;gt; (k, Vect k a, Vect k a, Vect k b, Vect k b) -&amp;gt; Bool&lt;br /&gt;prop_Bilinear f (k,a1,a2,b1,b2) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;prop_Linear (\b -&amp;gt; f (i1 a1 &amp;lt;+&amp;gt; i2 b)) (k,b1,b2) &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;prop_Linear (\a -&amp;gt; f (i1 a &amp;lt;+&amp;gt; i2 b1)) (k,a1,a2)&lt;br /&gt;&lt;br /&gt;prop_BilinearQn f (a,u1,u2,v1,v2) = prop_Bilinear f (a,u1,u2,v1,v2)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (a,u1,u2,v1,v2) :: (Q, Vect Q EBasis, Vect Q EBasis, Vect Q EBasis, Vect Q EBasis)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;What are some examples of bilinear functions?&lt;br /&gt;&lt;br /&gt;Well, perhaps the most straightforward is the dot product of vectors. If our vector spaces A and B are the same, then we can define the dot product:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;dot0 uv = sum [ if a == b then x*y else 0 | (a,x) &amp;lt;- u, (b,y) &amp;lt;- v]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where V u = p1 uv&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;V v = p2 uv&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;However, as it stands, this won't pass our QuickCheck property - because it has the wrong type! 
This has the type dot0 :: Vect k (Either a b) -&amp;gt; k, whereas we need something of type Vect k (Either a b) -&amp;gt; Vect k t.&lt;br /&gt;&lt;br /&gt;Now, it is of course true that k is a k-vector space. However, it is not presented as a free k-vector space over some basis type t. Luckily, this is only a technicality, which is easily fixed. When we want to consider k as itself a (free) vector space, we will take t = (), the unit type, and equate k with Vect k (). Since the type () has only a single inhabitant, the value (), Vect k () consists of scalar multiples of () - so it is basically just a single copy of k itself. The isomorphism between k and Vect k () is \x -&amp;gt; x *&amp;gt; return ().&lt;br /&gt;&lt;br /&gt;Okay, so now that we know how to represent k as a free k-vector space, we can define dot product again:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;dot1 uv = nf $ V [( (), if a == b then x*y else 0) | (a,x) &amp;lt;- u, (b,y) &amp;lt;- v]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where V u = p1 uv&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;V v = p2 uv&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This now has the type dot1 :: Vect k (Either a b) -&amp;gt; Vect k (). 
Here's how you use it:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; dot1 ( i1 (e1 &amp;lt;+&amp;gt; 2 *&amp;gt; e2) &amp;lt;+&amp;gt; i2 (3 *&amp;gt; e1 &amp;lt;+&amp;gt; e2) )&lt;br /&gt;5()&lt;/code&gt;&lt;br /&gt;(So thinking of our function as a function of two arguments, what we do is use i1 to inject the first argument into the left hand side of the direct sum, and i2 to inject the second argument into the right hand side.)&lt;br /&gt;&lt;br /&gt;So we can now use the QuickCheck property:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_BilinearQn dot1)&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Another example of a bilinear function is polynomial multiplication. Polynomials of course form a vector space, with basis {x^i | i &amp;lt;- [0..] }. So we could define a type to represent the monomials x^i, and then form the polynomials as the free vector space in the monomials. In a few weeks we will do that, but for the moment, to save time, let's just use our existing EBasis type, and take E i to represent x^i. 
Then polynomial multiplication is the following function:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;polymult1 uv = nf $ V [(E (i+j) , x*y) | (E i,x) &amp;lt;- u, (E j,y) &amp;lt;- v]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where V u = p1 uv&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;V v = p2 uv&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Let's just convince ourselves that this is polynomial multiplication:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; polymult1 (i1 (e 0 &amp;lt;+&amp;gt; e 1) &amp;lt;+&amp;gt; i2 (e 0 &amp;lt;+&amp;gt; e 1))&lt;br /&gt;e0+2e1+e2&lt;/code&gt;&lt;br /&gt;So this is just our way of saying that (1+x)*(1+x) = 1+2x+x^2.&lt;br /&gt;&lt;br /&gt;Again, let's verify that this is bilinear:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_BilinearQn polymult1)&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So what's all this got to do with Vect k (a,b)? Well, here's another bilinear function:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;tensor :: (Num k, Ord a, Ord b) =&amp;gt; Vect k (Either a b) -&amp;gt; Vect k (a, b)&lt;br /&gt;tensor uv = nf $ V [( (a,b), x*y) | (a,x) &amp;lt;- u, (b,y) &amp;lt;- v]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where V u = p1 uv; V v = p2 uv&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck (prop_BilinearQn tensor)&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So this "tensor" function takes each pair of basis elements a, b in the input to a basis element (a,b) in the output. The thing that is interesting about this bilinear function is that it is in some sense "the mother of all bilinear functions". Specifically, you can specify a bilinear function completely by specifying what happens to each pair (a,b) of basis elements. 
It follows that any bilinear function f :: Vect k (Either a b) -&amp;gt; Vect k t can be factored as f = f' . tensor, where f' :: Vect k (a,b) -&amp;gt; Vect k t is the linear function having the required action on the basis elements (a,b) of Vect k (a,b).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-APOApycJjVw/TWLDhMB8VEI/AAAAAAAAAIE/Q1aso4KoyJA/s1600/Tensor_vect.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="111" src="http://3.bp.blogspot.com/-APOApycJjVw/TWLDhMB8VEI/AAAAAAAAAIE/Q1aso4KoyJA/s320/Tensor_vect.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;bilinear :: (Num k, Ord a, Ord b, Ord c) =&amp;gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;((a, b) -&amp;gt; Vect k c) -&amp;gt; Vect k (Either a b) -&amp;gt; Vect k c&lt;br /&gt;bilinear f = linear f . tensor&lt;br /&gt;&lt;br /&gt;dot = bilinear (\(a,b) -&amp;gt; if a == b then return () else zero)&lt;br /&gt;&lt;br /&gt;polymult = bilinear (\(E i, E j) -&amp;gt; return (E (i+j)))&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We can check that these are indeed the same functions as we were looking at before:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (\x -&amp;gt; dot1 x == dot x)&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&amp;gt; quickCheck (\x -&amp;gt; polymult1 x == polymult x)&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;So Vect k (a,b) has a special role in the theory of bilinear functions. 
If A = Vect k a, B = Vect k b, then we write A⊗B = Vect k (a,b) (pronounced "A tensor B").&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/-0-GnsueVhuA/TWLDzsUuWpI/AAAAAAAAAII/gq8-MsjD8GA/s1600/Tensor_product.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="174" src="http://3.bp.blogspot.com/-0-GnsueVhuA/TWLDzsUuWpI/AAAAAAAAAII/gq8-MsjD8GA/s320/Tensor_product.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;[By the way, it's possible that this diagram might upset category theorists - because the arrows in the diagram are not all arrows in the category of vector spaces. Specifically, note that bilinear maps are not, in general, linear. We'll come back to this in a moment.]&lt;br /&gt;&lt;br /&gt;So a bilinear map can be specified by its action on the tensor basis (a,b). This corresponds to writing out matrices. To specify any bilinear map Vect k (Either a b) -&amp;gt; Vect k t, you write out a matrix with rows indexed by a, columns indexed by b, and entries in Vect k t.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;b1 &amp;nbsp;b2 ...&lt;br /&gt;&amp;nbsp;a1 (t11 t12 ...)&lt;br /&gt;&amp;nbsp;a2 (t21 t22 ...)&lt;br /&gt;... (... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So this says that (ai,bj) is taken to tij. Then given an element of A⊕B = Vect k (Either a b), which we can think of as a vector (x1 a1 + x2 a2 + ...) in A = Vect k a together with a vector (y1 b1 + y2 b2 + ...) in B = Vect k b, then we can calculate its image under the bilinear map by doing matrix multiplication as follows:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;a1 a2 ... 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;b1 &amp;nbsp;b2 ...&lt;br /&gt;(x1 x2 ...) &amp;nbsp;a1 (t11 t12 ...) &amp;nbsp;b1 (y1)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; a2 (t21 t22 ...) &amp;nbsp;b2 (y2)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;... (... &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;) ... (...)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(Sorry, this diagram might be a bit confusing. The ai, bj are labeling the rows and columns. The xi are the entries in a row vector in A, the yj are the entries in a column vector in B, and the tij are the entries in the matrix.)&lt;br /&gt;&lt;br /&gt;So xi ai &amp;lt;+&amp;gt; yj bj goes to xi yj tij.&lt;br /&gt;&lt;br /&gt;For example, dot product corresponds to the matrix:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;(1 0 0)&lt;br /&gt;(0 1 0)&lt;br /&gt;(0 0 1)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Polynomial multiplication corresponds to the matrix:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;e0 e1 e2 ...&lt;br /&gt;e0 (e0 e1 e2 ...)&lt;br /&gt;e1 (e1 e2 e3 ...)&lt;br /&gt;e2 (e2 e3 e4 ...)&lt;br /&gt;...&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;A matrix with entries in T = Vect k t is just a convenient way of specifying a linear map from A⊗B = Vect k (a,b) to T.&lt;br /&gt;&lt;br /&gt;Indeed, any matrix, provided that all the entries are in the same T, defines a bilinear function. So bilinear functions are ten-a-penny.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now, I stated above that bilinear functions are not in general linear. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_Linear polymult)&lt;br /&gt;*** Failed! 
Falsifiable (after 2 tests and 2 shrinks): &lt;br /&gt;(0,Right e1,Left e1)&lt;/code&gt;&lt;br /&gt;What went wrong? Well:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; polymult (Right e1)&lt;br /&gt;0&lt;br /&gt;&amp;gt; polymult (Left e1)&lt;br /&gt;0&lt;br /&gt;&amp;gt; polymult (Left e1 &amp;lt;+&amp;gt; Right e1)&lt;br /&gt;e2&lt;/code&gt;&lt;br /&gt;So we fail to have f (a &amp;lt;+&amp;gt; b) = f a &amp;lt;+&amp;gt; f b, which is one of the requirements of a linear function.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Conversely, it's also important to realise that linear functions (on Vect k (Either a b)) are not in general bilinear. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_BilinearQn id)&lt;br /&gt;*** Failed! Falsifiable (after 2 tests): &lt;br /&gt;(1,0,0,e1,0)&lt;/code&gt;&lt;br /&gt;The problem here is:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; id $ i1 (zero &amp;lt;+&amp;gt; zero) &amp;lt;+&amp;gt; i2 e1&lt;br /&gt;Right e1&lt;br /&gt;&amp;gt; id $ (i1 zero &amp;lt;+&amp;gt; i2 e1) &amp;lt;+&amp;gt; (i1 zero &amp;lt;+&amp;gt; i2 e1)&lt;br /&gt;2Right e1&lt;/code&gt;&lt;br /&gt;So we fail to have linearity in the left hand side (or the right for that matter).&lt;br /&gt;&lt;br /&gt;Indeed we can kind of see that linearity and bilinearity are in conflict.&lt;br /&gt;- Linearity requires that f (a1 &amp;lt;+&amp;gt; a2 &amp;lt;+&amp;gt; b) = f a1 &amp;lt;+&amp;gt; f a2 &amp;lt;+&amp;gt; f b&lt;br /&gt;- Bilinearity requires that f (a1 &amp;lt;+&amp;gt; a2 &amp;lt;+&amp;gt; b) = f (a1 &amp;lt;+&amp;gt; b) &amp;lt;+&amp;gt; f (a2 &amp;lt;+&amp;gt; b)&lt;br /&gt;&lt;br /&gt;Exercise: Find a function which is both linear and bilinear.</content><link 
rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/8500486448701966655/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/02/tensor-products-of-vector-spaces-part-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8500486448701966655'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8500486448701966655'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/02/tensor-products-of-vector-spaces-part-1.html' title='Tensor products of vector spaces, part 1'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/-7wIViKlEwSo/TWLBKRnpMBI/AAAAAAAAAH8/q1EsnivgWvU/s72-c/Vect_product.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5221975016319242338</id><published>2011-02-01T21:47:00.000Z</published><updated>2011-02-01T21:47:32.121Z</updated><title type='text'>Products of lists and vector spaces</title><content type='html'>&lt;a href="http://haskellformaths.blogspot.com/2011/01/coproducts-of-lists-and-free-vector.html"&gt;Last time&lt;/a&gt;, we looked at coproducts - of sets/types, of lists, and of free vector spaces. 
I realised afterwards that there were a couple more things I should have said, but forgot.&lt;br /&gt;&lt;br /&gt;Recall that the coproduct of A and B is an object A+B, together with injections i1: A -&amp;gt; A+B, i2: B-&amp;gt; A+B, with the property that whenever we have arrows f: A -&amp;gt; T, g: B -&amp;gt; T, they can be factored through A+B to give an arrow f+g, satisfying f+g . i1 = f, f+g . i2 = g.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s1600/Coproduct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s320/Coproduct.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Firstly then, I forgot to say why we called the coproduct A+B with a plus sign. Well, it's because, via the injections i1 and i2, it contains (a copy of) A and (a copy of) B. So it's a bit like a sum of A and B.&lt;br /&gt;&lt;br /&gt;Second, I forgot to say that in the case of vector spaces, the coproduct is called the &lt;i&gt;direct sum&lt;/i&gt;, and has its own special symbol A⊕B.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so this time I want to look at products.&lt;br /&gt;&lt;br /&gt;Suppose we have objects A and B in some category. Then their product (if it exists) is an object A×B, together with projections p1: A×B -&amp;gt; A, p2: A×B -&amp;gt; B, with the following universal property: whenever we have arrows f: S -&amp;gt; A and g: S -&amp;gt; B, then they can be factored through A×B to give an arrow f×g: S -&amp;gt; A×B, such that f = p1 . f×g, g = p2 . 
f×g.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TUh5Ss3vvXI/AAAAAAAAAHo/DUn5vgfTRos/s1600/Product_Set.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="185" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TUh5Ss3vvXI/AAAAAAAAAHo/DUn5vgfTRos/s320/Product_Set.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;(The definitions of product and coproduct are dual to one another - the diagrams are the same but with the directions of the arrows reversed.)&lt;br /&gt;&lt;br /&gt;In the category Set, the product of sets A and B is their Cartesian product A×B. In the category Hask, of course, the product of types a and b is written (a,b), p1 is called fst, and p2 is called snd.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh5dvytFMI/AAAAAAAAAHs/5S0F2GqmsIE/s1600/Product_Hask.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="169" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh5dvytFMI/AAAAAAAAAHs/5S0F2GqmsIE/s320/Product_Hask.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;We can then define the required product map as:&lt;br /&gt;&lt;code&gt;(f .*. g) x = (f x, g x)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then it should be clear that fst . (f .*. g) = f, and snd . (f .*. g) = g, as required.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so what do products look like in the category of lists (free monoids)? (Recall that in this category, the arrows are required to be monoid homomorphisms, meaning that f [] = [] and f (xs++ys) = f xs ++ f ys. It follows that we can express f = concatMap f', for some f'.)&lt;br /&gt;&lt;br /&gt;Well, the obvious thing to try as the product is the Cartesian product ([a],[b]). Is the Cartesian product of two monoids a monoid? 
Well yes it is actually. We could give it a monoid structure as follows:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(as1, bs1) ++ (as2, bs2) = (as1++as2, bs1++bs2)&lt;br /&gt;[] = ([],[])&lt;/code&gt;&lt;br /&gt;This isn't valid Haskell code of course. It's just my shorthand way of expressing the following code from Data.Monoid:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Monoid [a] where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;mempty &amp;nbsp;= []&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;mappend = (++)&lt;br /&gt;&lt;br /&gt;instance (Monoid a, Monoid b) =&amp;gt; Monoid (a,b) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;mempty = (mempty, mempty)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(a1,b1) `mappend` (a2,b2) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(a1 `mappend` a2, b1 `mappend` b2)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;From these two instances, it follows that ([a],[b]) is a monoid, with monoid operations equivalent to those I gave above. (In particular, it's clear that the construction satisfies the monoid laws: associativity of ++, identity of [].)&lt;br /&gt;&lt;br /&gt;But it feels like there's something unsatisfactory about this. Wouldn't it be better for the product of list types [a] and [b] to be another list type [x], for some type x?&lt;br /&gt;&lt;br /&gt;Our first thought might be to try [(a,b)]. The product map would then need to be something like \ss -&amp;gt; zip (f ss) (g ss). However, we quickly see that this won't work: what if f ss and g ss are not the same length?&lt;br /&gt;&lt;br /&gt;What else might work? Well, if you think of ([a],[b]) as some a's on the left and some b's on the right, then the answer should spring to mind. Let's try [Either a b]. 
We can then define:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;p1 xs = [x | Left x &amp;lt;- xs] -- this is doing a filter and a map at the same time&lt;br /&gt;p2 xs = [x | Right x &amp;lt;- xs]&lt;/code&gt;&lt;br /&gt;with the product map&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;f×g = \ss -&amp;gt; map Left (f ss) ++ map Right (g ss)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TUh6bPr8-jI/AAAAAAAAAHw/GievKDbbwNc/s1600/Product_ListEither.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="179" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TUh6bPr8-jI/AAAAAAAAAHw/GievKDbbwNc/s320/Product_ListEither.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Then it is clear that p1 . f×g = f and p2 . f×g = g, as required.&lt;br /&gt;&lt;br /&gt;What is the relationship between ([a],[b]) and [Either a b]? Well, ([a],[b]) looks a bit like a subset of [Either a b], via the injection i (as,bs) = map Left as ++ map Right bs. However, this injection is not a monoid homomorphism, since&lt;br /&gt;&lt;code&gt;i ( ([],[b1]) ++ ([a1],[]) ) /= i ([],[b1]) ++ i ([a1],[])&lt;/code&gt;&lt;br /&gt;So ([a],[b]) is not a submonoid of [Either a b].&lt;br /&gt;&lt;br /&gt;On the other hand, there is a projection p :: [Either a b] -&amp;gt; ([a],[b]), p xs = (p1 xs, p2 xs). This is a surjective monoid homomorphism, so ([a],[b]) is a quotient of [Either a b].&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So which is the right answer? Which of ([a],[b]) and [Either a b] is really the product of [a] and [b]?&lt;br /&gt;&lt;br /&gt;Well, it depends. It depends which category we think we're working in. If we're working in the category of monoids, then it is ([a],[b]). 
However, if we're working in the category of free monoids (lists), then it is [Either a b].&lt;br /&gt;&lt;br /&gt;You see, ([a],[b]) is not a &lt;i&gt;free&lt;/i&gt; monoid. What does this mean? Well, it basically means it's not a list. But how do we know that ([a],[b]) isn't equivalent to some list? And anyway, what does "free" mean in free monoid?&lt;br /&gt;&lt;br /&gt;"Free" is a concept that can be applied to many algebraic theories, not just monoids. There is more than one way to define it.&lt;br /&gt;&lt;br /&gt;An algebraic theory defines various constants and operations. In the case of monoids, there is one constant - which we may variously call [] or mempty or 0 or 1 - and one operation - ++ or mappend or + or *. Now, a given monoid may turn out to be generated by some subset of its elements - meaning that every element of the monoid can be equated with some expression in the generators, constants, and operations.&lt;br /&gt;&lt;br /&gt;For example, the monoid of positive natural numbers under multiplication is generated by the prime numbers: every positive natural number is equal to some expression in 1, *, and the prime numbers. The monoid [x] is generated by the singleton lists: every element of [x] is equal to some expression in [], ++, and the singleton lists. By a slight abuse of notation, we can say that [x] is generated by x - by identifying the singleton lists with the image of x under \x -&amp;gt; [x].&lt;br /&gt;&lt;br /&gt;Then we say that a monoid is free on its generators if there are no relations among its elements other than those implied by the monoid laws. That is, no two expressions in the generators, constants, and operations are equal to one another, unless it is as a consequence of the monoid laws.&lt;br /&gt;&lt;br /&gt;For example, suppose it happens that&lt;br /&gt;&lt;code&gt;(x ++ y) ++ z = x ++ (y ++ z)&lt;/code&gt;&lt;br /&gt;That's okay, because it follows from the monoid laws. 
On the other hand, suppose that&lt;br /&gt;&lt;code&gt;x ++ y = y ++ x&lt;/code&gt;&lt;br /&gt;This does not follow from the monoid laws (unless x = [] or y = []), so is a non-trivial relation. (Thus the natural numbers under multiplication are not a free monoid - because they're commutative.)&lt;br /&gt;&lt;br /&gt;What about our type ([a],[b]) then? Well consider the following relations:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(as,[]) ++ ([],bs) = ([],bs) ++ (as,[])&lt;br /&gt;(as1++as2,bs1) ++ ([],bs2) = (as1,bs1) ++ (as2,bs2) = (as1,[]) ++ (as2,bs1++bs2)&lt;/code&gt;&lt;br /&gt;We have commutativity relations between the [a] and [b] parts of the product. Crucially, these relations are not implied by the monoid structure alone. So intuitively, we can see that ([a],[b]) is not free.&lt;br /&gt;&lt;br /&gt;The "no relations" definition of free is the algebraic way to think about it. However, there is also a category theory way to define it. The basic idea is that if a monoid is free on its generators, then given any other monoid with the same generators, we can construct it as a homomorphic image of our free monoid, by "adding" the appropriate relations.&lt;br /&gt;&lt;br /&gt;In order to express this properly, we're going to need to use some category theory, and specifically the concept of the forgetful functor. Recall that given any algebraic category, such as Mon (monoids), there is a forgetful functor U: Mon -&amp;gt; Set, which consists in simply forgetting the algebraic structure. U takes objects to their underlying sets, and arrows to the underlying functions. In Haskell, U: Mon -&amp;gt; Hask consists in forgetting that our objects (types) are monoids, and forgetting that our arrows (functions) are monoid homomorphisms. (As a consequence, U is syntactically invisible in Haskell. 
However, to properly understand the definition of free, we have to remember that it's there.)&lt;br /&gt;&lt;br /&gt;Then, given an object x (the generators), a free monoid on x is a monoid y, together with a function i: x -&amp;gt; U y, such that whenever we have an object z in Mon and a function f': x -&amp;gt; U z, then we can lift it to a unique arrow f: y -&amp;gt; z, such that f' = Uf . i.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh7ULyTe1I/AAAAAAAAAH0/ajdu5dLSRpI/s1600/Free+monoid.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh7ULyTe1I/AAAAAAAAAH0/ajdu5dLSRpI/s320/Free+monoid.png" width="311" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;When we say that lists are free monoids, we mean specifically that (the type) [x] is free on (the type) x, via the function i = \x -&amp;gt; [x] (on values). This is free, because given any other monoid z, and function f' :: x -&amp;gt; z, then we can lift to a monoid homomorphism f :: [x] -&amp;gt; z, with f' = f . i. How? Well, the basic idea is to use concatMap. The type of concatMap is:&lt;br /&gt;&lt;code&gt;concatMap :: (a -&amp;gt; [b]) -&amp;gt; [a] -&amp;gt; [b]&lt;/code&gt;&lt;br /&gt;So it's doing the lifting we want. However this isn't quite right, because this assumes that the target monoid z is a list. So we need this slight variant:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;mconcatmap :: (Monoid z) =&amp;gt; (x -&amp;gt; z) -&amp;gt; [x] -&amp;gt; z&lt;br /&gt;mconcatmap f xs = mconcat (map f xs)&lt;/code&gt;&lt;br /&gt;If we set f = mconcatmap f', then we will have&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(f . 
i) x&lt;br /&gt;= f (i x)&lt;br /&gt;= f [x]&lt;br /&gt;= mconcatmap f' [x]&lt;br /&gt;= mconcat (map f' [x])&lt;br /&gt;= mconcat [f' x]&lt;br /&gt;= foldr mappend mempty [f' x] &amp;nbsp;-- definition of mconcat&lt;br /&gt;= mappend mempty (f' x) &amp;nbsp;-- definition of foldr&lt;br /&gt;= f' x &amp;nbsp;-- identity of mempty&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Now, what would it mean for ([a],[b]) to be free? Well, first, what is it going to be free on? To be free on a and b is the same as being free on Either a b (the disjoint union of a and b). Then our function i is going to be&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;i (Left a) = ([a],[])&lt;br /&gt;i (Right b) = ([],[b])&lt;/code&gt;&lt;br /&gt;Then for ([a],[b]) to be free would mean that whenever we have a function f' :: Either a b -&amp;gt; z, with z a monoid, then we can lift it to a monoid homomorphism f : ([a],[b]) -&amp;gt; z, such that f' = f . i.&lt;br /&gt;&lt;br /&gt;So can we?&lt;br /&gt;&lt;br /&gt;Well, what if our target monoid z doesn't satisfy the a-b commutativity relations that we saw. That is, what if:&lt;br /&gt;&lt;code&gt;f' a1 `mappend` f' b1 /= f' b1 `mappend` f' a1 -- (A)&lt;/code&gt;&lt;br /&gt;That would be a problem.&lt;br /&gt;&lt;br /&gt;We are required to find an f such that f' = f . i.&lt;br /&gt;We know that i a1 = ([a1],[]), i b1 = ([],[b1]). So we know that i a1 `mappend` i b1 = i b1 `mappend` i a1.&lt;br /&gt;f is required to be a monoid homomorphism, so by definition:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;f (i a1 `mappend` i b1) = f (i a1) `mappend` f (i b1)&lt;br /&gt;f (i b1 `mappend` i a1) = f (i b1) `mappend` f (i a1)&lt;/code&gt;&lt;br /&gt;But then since the two left hand sides are equal, then so are the two right hand sides, giving:&lt;br /&gt;&lt;code&gt;f (i a1) `mappend` f (i b1) = f (i b1) `mappend` f (i a1) -- (B)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;But now we have a contradiction between (A) and (B), since f' = f . 
i.&lt;br /&gt;&lt;br /&gt;So for a concrete counterexample, showing that ([a],[b]) is not free, all we need is a monoid z in which the a-b commutativity relations don't hold. Well that's easy: [Either a b]. Just take f' :: Either a b -&amp;gt; [Either a b], f' = \x -&amp;gt; [x]. Now try to find an f :: ([a],[b]) -&amp;gt; [Either a b], with f' = f . i.&lt;br /&gt;&lt;br /&gt;The obvious f is f (as,bs) = map Left as ++ map Right bs.&lt;br /&gt;But the problem is that this f isn't a monoid homomorphism:&lt;br /&gt;&lt;code&gt;f ( ([],[b1]) `mappend` ([a1],[]) ) /= f ([],[b1]) `mappend` f ([a1],[])&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Notice the connection between the two definitions of free. It was because ([a],[b]) had non-trivial relations that we couldn't lift a function to a monoid homomorphism in some cases. The cases where we couldn't were where the target monoid z didn't satisfy the relations.&lt;br /&gt;&lt;br /&gt;Okay, so sorry, that got a bit technical. To summarise, the product of [a], [b] in the category of lists / free monoids is [Either a b].&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;What about vector spaces? What is the product of Vect k a and Vect k b?&lt;br /&gt;&lt;br /&gt;Well, similarly to lists, we can make (Vect k a, Vect k b) into a vector space, by defining&lt;br /&gt;0 = (0,0)&lt;br /&gt;(a1,b1) + (a2,b2) = (a1+a2,b1+b2)&lt;br /&gt;k(a,b) = (ka,kb)&lt;br /&gt;&lt;br /&gt;Exercise: Show that with these definitions, fst, snd and f .*. g are vector space morphisms (linear maps).&lt;br /&gt;&lt;br /&gt;Alternatively, Vect k (Either a b) is of course a vector space. 
We can define:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;p1 = linear p1' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;p1' (Left a) = return a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;p1' (Right b) = zero&lt;br /&gt;&lt;br /&gt;p2 = linear p2' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;p2' (Left a) = zero&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;p2' (Right b) = return b&lt;br /&gt;&lt;br /&gt;prodf f g = linear fg' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fg' b = fmap Left (f (return b)) &amp;lt;+&amp;gt; fmap Right (g (return b))&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh8jfbnGhI/AAAAAAAAAH4/C1dfdvwkTcw/s1600/Product_VectEither.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="119" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TUh8jfbnGhI/AAAAAAAAAH4/C1dfdvwkTcw/s320/Product_VectEither.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In this case p1, p2, f×g are vector space morphisms by definition, since they were constructed using "linear". How do we know that they satisfy the product property? Well, this looks like a job for QuickCheck. The following code builds on the code we developed last time:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Product (f',g',x) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f x == (p1 . fg) x &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;g x == (p2 . 
fg) x&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f = linfun f'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g = linfun g'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;fg = prodf f g&lt;br /&gt;&lt;br /&gt;newtype SBasis = S Int deriving (Eq,Ord,Show,Arbitrary)&lt;br /&gt;&lt;br /&gt;prop_ProductQn (f,g,x) = prop_Product (f,g,x)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (f,g,x) :: (LinFun Q SBasis ABasis, LinFun Q SBasis BBasis, Vect Q SBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_ProductQn&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;As we did with lists, we can ask again, which is the correct definition of product, (Vect k a, Vect k b), or Vect k (Either a b)?&lt;br /&gt;&lt;br /&gt;Well, in this case it turns out that they are equivalent to one another, via the mutually inverse isomorphisms&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;\(va,vb) -&amp;gt; fmap Left va &amp;lt;+&amp;gt; fmap Right vb&lt;br /&gt;\v -&amp;gt; (p1 v, p2 v)&lt;/code&gt;&lt;br /&gt;Unlike in the list case, these are both vector space morphisms (linear functions).&lt;br /&gt;&lt;br /&gt;Why the difference? Why does it work out for vector spaces whereas it didn't for lists? Well, I think it's basically because vector addition is commutative.&lt;br /&gt;&lt;br /&gt;(It is also the case that vector spaces are always free on a basis. So since we have an obvious bijection between the bases of (Vect k a, Vect k b) and Vect k (Either a b), then we must have an isomorphism between the vector spaces.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now, we're left with a little puzzle. We have found that the product and the coproduct of two vector spaces are both Vect k (Either a b). 
So we still haven't figured out what Vect k (a,b) represents.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5221975016319242338?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5221975016319242338/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/02/products-of-lists-and-vector-spaces.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5221975016319242338'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5221975016319242338'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/02/products-of-lists-and-vector-spaces.html' title='Products of lists and vector spaces'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s72-c/Coproduct.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-9216149184343758597</id><published>2011-01-21T16:35:00.001Z</published><updated>2011-01-21T16:38:30.043Z</updated><title type='text'>Coproducts of lists and free vector spaces</title><content type='html'>&lt;a href="http://haskellformaths.blogspot.com/2011/01/free-vector-space-on-type-part-2.html"&gt;Recently&lt;/a&gt; we've been looking at vector 
spaces. We defined a type Vect k b, representing the free k-vector space over a type b - meaning, the vector space consisting of k-linear combinations of the inhabitants of b - so b is the basis. Like any good mathematical structure, vector spaces admit various new-from-old constructions. Last time I posed the puzzle, what do Vect k (Either a b) and Vect k (a,b) represent? As we're aiming for quantum algebra, I'm going to frame the answers in the language of category theory.&lt;br /&gt;&lt;br /&gt;Suppose we have objects A and B in some category. Then their coproduct (if it exists) is an object A+B, together with injections i1: A -&amp;gt; A+B, i2: B -&amp;gt; A+B, with the following universal property: whenever we have arrows f: A -&amp;gt; T and g: B -&amp;gt; T, they can be factored through A+B to give an arrow f+g: A+B -&amp;gt; T, such that f = f+g . i1, g = f+g . i2.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s1600/Coproduct.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s320/Coproduct.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Notice that this definition does not give us a &lt;i&gt;construction&lt;/i&gt; for the coproduct. In any given category, it doesn't tell us how to construct the coproduct, or even if there is one. Even if we have a construction for the coproduct in one category, there is no guarantee that it, or something similar, will work in another related category.&lt;br /&gt;&lt;br /&gt;In the category Set, the coproduct of sets A and B is their disjoint union. In order to see this, we can work in the category Hask of Haskell types. We can regard Hask as a subcategory of Set, by identifying a type with its set of inhabitants. 
If a and b are Haskell types / sets of inhabitants, then their disjoint union is Either a b. The elements of Either a b can be from either a or b (hence, from their union), and they are kept disjoint in the left and right parts of the union (so that for example Either a a contains two copies of a, not just one). The injections i1 and i2 are then the value constructors Left and Right. Given f :: a -&amp;gt; t, g :: b -&amp;gt; t, we define:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(f .+. g) (Left a) = f a&lt;br /&gt;(f .+. g) (Right b) = g b&lt;/code&gt;&lt;br /&gt;Then it should be clear that (f .+. g) . Left = f, and (f .+. g) . Right = g, as required.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TTmx2nMkRkI/AAAAAAAAAHU/1NXMJpw8_b8/s1600/Coproduct_Either.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TTmx2nMkRkI/AAAAAAAAAHU/1NXMJpw8_b8/s320/Coproduct_Either.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In a moment, we'll look at coproducts of vector spaces, but first, as a warmup, let's think about the coproducts in a simpler category: lists / free monoids. Recall that a monoid is an algebraic structure having an associative operation ++, and an identity for ++ called []. (That is [] ++ x = x = x ++ [].)&lt;br /&gt;&lt;br /&gt;A monoid homomorphism is a function f :: [a] -&amp;gt; [b] such that f [] = [] and f (a1 ++ a2) = f a1 ++ f a2. With a little thought, you should be able to convince yourself that all monoid homomorphisms are of the form concatMap f', where f' :: a -&amp;gt; [b]. (Which is, incidentally, the same as saying that they are of the form (&amp;gt;&amp;gt;= f').) 
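&lt;br /&gt;&lt;br /&gt;As a quick sanity check of that claim, here is a minimal sketch showing that a map of the form concatMap f' preserves [] and ++, and agrees with (&amp;gt;&amp;gt;= f'). The function f' below is a made-up example chosen purely for illustration, and checking one pair of inputs illustrates the homomorphism laws rather than proving them:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;import Data.Char (toUpper)&lt;br /&gt;&lt;br /&gt;-- Hypothetical function on generators: each character goes to a short list.&lt;br /&gt;f' :: Char -&amp;gt; String&lt;br /&gt;f' c = [c, toUpper c]&lt;br /&gt;&lt;br /&gt;-- The induced monoid homomorphism on lists.&lt;br /&gt;h :: String -&amp;gt; String&lt;br /&gt;h = concatMap f'&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    print (h [] == [])                            -- the identity is preserved&lt;br /&gt;    print (h ("ab" ++ "cd") == h "ab" ++ h "cd")  -- ++ is preserved&lt;br /&gt;    print (h "ab" == ("ab" &amp;gt;&amp;gt;= f'))               -- same as (&amp;gt;&amp;gt;= f')&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;All three lines should print True.&lt;br /&gt;&lt;br /&gt;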
In the category of free monoids, the arrows are constrained to be monoid homomorphisms.&lt;br /&gt;&lt;br /&gt;So for our coproduct, we are looking for an object satisfying the universal property shown in the following diagram:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/TTmyCCzMNCI/AAAAAAAAAHY/2ax929JPXSE/s1600/Coproduct_ListQuestion.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/TTmyCCzMNCI/AAAAAAAAAHY/2ax929JPXSE/s320/Coproduct_ListQuestion.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Perhaps the first thing to try is the disjoint union: Either [a] [b]. This is the coproduct of [a] and [b] as sets, but is it also their coproduct as monoids? Well, let's see.&lt;br /&gt;&lt;br /&gt;Hmm, firstly, it's not a list (doh!): so you can't apply ++ to it, and it doesn't have a []. However, before we give up on that, let's consider whether we're asking the right question. Perhaps we should only be requiring that Either [a] [b] is (or can be made to be) a Monoid instance. Is the disjoint union of two monoids a monoid? Suppose we try to define ++ for it:&lt;br /&gt;Left a1 ++ Left a2 = Left (a1++a2)&lt;br /&gt;Right b1 ++ Right b2 = Right (b1++b2)&lt;br /&gt;But now we begin to see the problem. What are we going to do for Left as ++ Right bs? There's nothing sensible we can do, because our disjoint union Either [a] [b] does not allow mixed lists of as and bs.&lt;br /&gt;&lt;br /&gt;However, this immediately suggests that we would be better off looking at [Either a b] - the free monoid over the disjoint union of a and b. This is a list - and it does allow us to form mixed lists of as and bs.&lt;br /&gt;&lt;br /&gt;We can then set i1 = map Left, i2 = map Right, and these are list homomorphisms (they interact with [] and ++ in the required way). 
Then we can define:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;h = concatMap h' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;h' (Left a) = f' a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;h' (Right b) = g' b&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So our suspicion is that [Either a b] is the coproduct, with h the required coproduct map.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmyVzlXgOI/AAAAAAAAAHc/h0fPEl-ivp0/s1600/Coproduct_ListEither.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="148" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmyVzlXgOI/AAAAAAAAAHc/h0fPEl-ivp0/s320/Coproduct_ListEither.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Let's just check the coproduct conditions:&lt;br /&gt;h . i1&lt;br /&gt;= concatMap h' . map Left&lt;br /&gt;= concatMap f'&lt;br /&gt;= f&lt;br /&gt;and similarly, h . i2 = g, as required.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Notice that the disjoint union Either [a] [b] is (isomorphic to) a subset of the free monoid [Either a b], via Left as -&amp;gt; map Left as; Right bs -&amp;gt; map Right bs. So we were thinking along the right lines in suggesting Either [a] [b]. The problem was that Either [a] [b] isn't a monoid, it's only a set. We can regard [Either a b] as the &lt;i&gt;closure&lt;/i&gt; of Either [a] [b] under the monoid operations. [Either a b] is the smallest free monoid containing the disjoint union Either [a] [b] (modulo isomorphism of Haskell types).&lt;br /&gt;&lt;br /&gt;(This is a bit hand-wavy. This idea of closure under algebraic operations makes sense in maths / set theory, but I'm not quite sure how best to express it in Haskell / type theory. 
If anyone has any suggestions, I'd be pleased to hear them.)&lt;br /&gt;&lt;br /&gt;Okay, so what about a coproduct in the category of k-vector spaces? First, recall that the arrows in this category are linear maps f satisfying f (a+b) = f a + f b, f (k*a) = k * f a. Again, it should be obvious that a linear map is fully determined by its action on basis elements - so every linear map f :: Vect k a -&amp;gt; Vect k b can be expressed as linear f' where f' :: a -&amp;gt; Vect k b.&lt;br /&gt;&lt;br /&gt;Recall that we defined linear f' last time - it's really just (&amp;gt;&amp;gt;= f'), but followed by reduction to normal form:&lt;br /&gt;&lt;code&gt;linear f v = nf $ v &amp;gt;&amp;gt;= f&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Okay, so vector spaces of course have underlying sets, so we will expect the coproduct of Vect k a and Vect k b to contain the disjoint union Either (Vect k a) (Vect k b). As with lists though, we will have the problem that this is not closed under vector addition - we can't add an element of Vect k a to an element of Vect k b within this type.&lt;br /&gt;&lt;br /&gt;So as before, let's try Vect k (Either a b). Then we can set i1 = fmap Left, i2 = fmap Right, and they are both linear maps by construction. 
(We don't need to call nf afterwards, since Left and Right are order-preserving.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;i1 = fmap Left&lt;br /&gt;i2 = fmap Right&lt;/code&gt;&lt;br /&gt;Then we can define the coproduct map (f+g) as follows:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;coprodf f g = linear fg' where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fg' (Left a) = f (return a)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fg' (Right b) = g (return b)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TTm2O7Lg4FI/AAAAAAAAAHg/n9XoBxGxe5U/s1600/Coproduct_VectEither.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="114" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TTm2O7Lg4FI/AAAAAAAAAHg/n9XoBxGxe5U/s320/Coproduct_VectEither.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;We need to verify that this satisfies the coproduct conditions: f+g . i1 = f and f+g . i2 = g. It would be nice to test this using a QuickCheck property. In order to do that, we need a way to construct arbitrary &lt;i&gt;linear&lt;/i&gt; maps. (This is not the same thing as arbitrary &lt;i&gt;functions&lt;/i&gt; Vect k a -&amp;gt; Vect k b, so I don't think that I can use QuickCheck's Coarbitrary class - but the experts may know better.) 
Luckily, that is fairly straightforward: we can construct arbitrary lists [(a, Vect k b)], and then each pair (a,vb) can be interpreted as saying that the basis element a is taken to the vector vb.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;type LinFun k a b = [(a, Vect k b)]&lt;br /&gt;&lt;br /&gt;linfun :: (Eq a, Ord b, Num k) =&amp;gt; LinFun k a b -&amp;gt; Vect k a -&amp;gt; Vect k b&lt;br /&gt;linfun avbs = linear f where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f a = case lookup a avbs of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Just vb -&amp;gt; vb&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Nothing -&amp;gt; zero&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;With that preparation, here is a QuickCheck property that expresses the coproduct condition.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Coproduct (f',g',a,b) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f a == (fg . i1) a &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;g b == (fg . i2) b&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where f = linfun f'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;g = linfun g'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;fg = coprodf f g&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;That property can be used for any vector spaces. 
Let's define some particular vector spaces to do the test on.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;newtype ABasis = A Int deriving (Eq,Ord,Show,Arbitrary) -- GeneralizedNewtypeDeriving&lt;br /&gt;newtype BBasis = B Int deriving (Eq,Ord,Show,Arbitrary)&lt;br /&gt;newtype TBasis = T Int deriving (Eq,Ord,Show,Arbitrary)&lt;br /&gt;&lt;br /&gt;instance (Num k, Ord b, Arbitrary k, Arbitrary b) =&amp;gt; Arbitrary (Vect k b) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = do ts &amp;lt;- arbitrary :: Gen [(b, k)] -- ScopedTypeVariables&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return $ nf $ V ts&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(I should emphasize that not all vector space bases are newtypes around Int - we can have finite bases, or bases with other interesting internal structure, as we will see in later installments. For the purposes of this test however, I think this is sufficient.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_CoproductQn (f,g,a,b) = prop_Coproduct (f,g,a,b)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (f,g,a,b) :: (LinFun Q ABasis TBasis, LinFun Q BBasis TBasis, Vect Q ABasis, Vect Q BBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_CoproductQn&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So we do indeed have a coproduct on vector spaces. 
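&lt;br /&gt;&lt;br /&gt;To make the coproduct conditions concrete on specific vectors, here is a small self-contained sketch. It uses simplified stand-ins for Vect, nf and linear (these are assumptions for illustration, not the library's actual definitions), and fEx and gEx are made-up linear maps:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;import qualified Data.Map as M&lt;br /&gt;&lt;br /&gt;-- Simplified stand-in for the Vect type from this series (an assumption for this sketch).&lt;br /&gt;newtype Vect k b = V [(b,k)] deriving (Eq,Show)&lt;br /&gt;&lt;br /&gt;-- Normal form: basis elements in order, no repeats, no zero coefficients.&lt;br /&gt;nf :: (Ord b, Eq k, Num k) =&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;nf (V ts) = V (filter ((/= 0) . snd) (M.toList (M.fromListWith (+) ts)))&lt;br /&gt;&lt;br /&gt;-- Extend a function on basis elements to vectors by linearity.&lt;br /&gt;linear :: (Ord b, Eq k, Num k) =&amp;gt; (a -&amp;gt; Vect k b) -&amp;gt; Vect k a -&amp;gt; Vect k b&lt;br /&gt;linear f (V ts) = nf (V [(b,y*x) | (a,x) &amp;lt;- ts, let V us = f a, (b,y) &amp;lt;- us])&lt;br /&gt;&lt;br /&gt;i1 :: Vect k a -&amp;gt; Vect k (Either a b)&lt;br /&gt;i1 (V ts) = V [(Left a, x) | (a,x) &amp;lt;- ts]&lt;br /&gt;&lt;br /&gt;i2 :: Vect k b -&amp;gt; Vect k (Either a b)&lt;br /&gt;i2 (V ts) = V [(Right b, x) | (b,x) &amp;lt;- ts]&lt;br /&gt;&lt;br /&gt;coprodf :: (Ord t, Eq k, Num k) =&amp;gt; (Vect k a -&amp;gt; Vect k t) -&amp;gt; (Vect k b -&amp;gt; Vect k t) -&amp;gt; Vect k (Either a b) -&amp;gt; Vect k t&lt;br /&gt;coprodf f g = linear fg' where&lt;br /&gt;    fg' (Left a) = f (V [(a,1)])&lt;br /&gt;    fg' (Right b) = g (V [(b,1)])&lt;br /&gt;&lt;br /&gt;-- Two made-up linear maps into a common target space.&lt;br /&gt;fEx :: Vect Integer Int -&amp;gt; Vect Integer Char&lt;br /&gt;fEx = linear (\n -&amp;gt; V [('x', fromIntegral n)])&lt;br /&gt;&lt;br /&gt;gEx :: Vect Integer Bool -&amp;gt; Vect Integer Char&lt;br /&gt;gEx = linear (\b -&amp;gt; if b then V [('x',1),('y',1)] else V [('y',2)])&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    let va = V [(1,2),(3,1)] :: Vect Integer Int&lt;br /&gt;    let vb = V [(False,1),(True,3)] :: Vect Integer Bool&lt;br /&gt;    print ((coprodf fEx gEx . i1) va == fEx va)  -- f+g . i1 = f&lt;br /&gt;    print ((coprodf fEx gEx . i2) vb == gEx vb)  -- f+g . i2 = g&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Both checks should print True.&lt;br /&gt;&lt;br /&gt;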
To summarise: The coproduct of free vector spaces is the free vector space on the coproduct (of the bases).&lt;br /&gt;&lt;br /&gt;Next time, we'll look at products - where there might be a small surprise.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-9216149184343758597?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/9216149184343758597/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/01/coproducts-of-lists-and-free-vector.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/9216149184343758597'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/9216149184343758597'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/01/coproducts-of-lists-and-free-vector.html' title='Coproducts of lists and free vector spaces'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_XQ7FznWBAYE/TTmxj8viJhI/AAAAAAAAAHQ/MAV4M0oczNQ/s72-c/Coproduct.png' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-1294793580226717866</id><published>2011-01-10T21:26:00.000Z</published><updated>2011-01-10T21:26:41.201Z</updated><title type='text'>The free vector space on a type, part 2</title><content type='html'>&lt;a 
href="http://haskellformaths.blogspot.com/2010/12/free-vector-space-on-type-part-1.html"&gt;Last time&lt;/a&gt;, I defined the free k-vector space over a type b:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;data Vect k b = V [(b,k)]&lt;/code&gt;&lt;br /&gt;Elements of Vect k b represent formal sums of scalar multiples of elements of b, where the scalars are taken from the field k. (For example, V [(E1,5),(E3,2)] represents the formal sum 5E1+2E3.) Thus b is the basis for the vector space.&lt;br /&gt;&lt;br /&gt;We saw that there is a functor (in the mathematical sense) from the category &lt;b&gt;Hask&lt;/b&gt; (of Haskell types and functions) to the category &lt;b&gt;k-Vect&lt;/b&gt; (of k-vector spaces and k-linear maps). In maths, we would usually represent a functor by a capital letter, eg F, and apply it to objects and arrows by prefixing. For example, if a, b are objects in the source category, then the image objects in the target category would be called F a and F b. If f :: a -&amp;gt; b is an arrow in the source category, then the image arrow in the target category would be called F f.&lt;br /&gt;&lt;br /&gt;Haskell allows us to declare a type constructor as a Functor instance, and give it an implementation of fmap. This corresponds to describing a functor (in the mathematical sense), but with a different naming convention. In our case, we declared the type constructor (Vect k) as a Functor instance. So the functor's action on objects is called (Vect k) - given any object b in &lt;b&gt;Hask&lt;/b&gt; (ie a Haskell type), we can apply (Vect k) to get an object Vect k b in &lt;b&gt;k-Vect&lt;/b&gt; (ie a k-vector space). 
However, the functor's action on arrows is called fmap - given any arrow f :: a -&amp;gt; b in &lt;b&gt;Hask&lt;/b&gt; (ie a Haskell function), we can apply fmap to get an arrow fmap f :: Vect k a -&amp;gt; Vect k b in &lt;b&gt;k-Vect&lt;/b&gt; (ie a k-linear map).&lt;br /&gt;&lt;br /&gt;Haskell allows us to declare &lt;i&gt;only&lt;/i&gt; type constructors as functors. In maths, there are many functors which are not of this form. For example, the simplest is the &lt;i&gt;forgetful&lt;/i&gt; functor. Given any algebraic category &lt;b&gt;A&lt;/b&gt;, we have the forgetful functor &lt;b&gt;A&lt;/b&gt; -&amp;gt; &lt;b&gt;Set&lt;/b&gt;, which simply forgets the algebraic structure. The forgetful functor takes the objects of &lt;b&gt;A&lt;/b&gt; to their underlying sets, and the arrows of &lt;b&gt;A&lt;/b&gt; to the underlying functions.&lt;br /&gt;&lt;br /&gt;For example, in our case, the forgetful functor &lt;b&gt;k-Vect&lt;/b&gt; -&amp;gt; &lt;b&gt;Hask&lt;/b&gt; consists in forgetting that the objects Vect k b are vector spaces (with addition, scalar multiplication etc defined), and considering them just as Haskell types; and forgetting that the arrows Vect k a -&amp;gt; Vect k b are linear maps, and considering them just as Haskell functions.&lt;br /&gt;&lt;br /&gt;(Notice that when working in Haskell, the category &lt;b&gt;Hask&lt;/b&gt; acts as a kind of stand-in for the category &lt;b&gt;Set&lt;/b&gt;. If we identify a type with its inhabitants (which form a set), then &lt;b&gt;Hask&lt;/b&gt; is something like the computable subcategory of &lt;b&gt;Set&lt;/b&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;So we actually have functors going both ways. We have the &lt;i&gt;free&lt;/i&gt; functor F: &lt;b&gt;Hask&lt;/b&gt; -&amp;gt; &lt;b&gt;k-Vect&lt;/b&gt; - which takes types and functions to free k-vector spaces and linear maps. 
And we have the forgetful functor G: &lt;b&gt;k-Vect&lt;/b&gt; -&amp;gt; &lt;b&gt;Hask&lt;/b&gt; - which takes free k-vector spaces and linear maps, and just forgets their algebraic structure, so that they're just types and functions again.&lt;br /&gt;&lt;br /&gt;Note that these two functors are &lt;i&gt;not&lt;/i&gt; mutual inverses. (G . F) is not the identity on Hask - indeed it takes b to Vect k b, both considered as objects in &lt;b&gt;Hask&lt;/b&gt; (and similarly with arrows). The two functors are however &lt;i&gt;adjoint&lt;/i&gt;. (I'm not going to explain what this means, but see &lt;a href="http://en.wikipedia.org/wiki/Adjoint_functors"&gt;Wikipedia&lt;/a&gt;, or most books on category theory.)&lt;br /&gt;&lt;br /&gt;Whenever we have an adjunction, then in fact we also have a monad. Here's the definition for our case:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Num k =&amp;gt; Monad (Vect k) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;return a = V [(a,1)]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;V ts &amp;gt;&amp;gt;= f = V $ concat [ [(b,y*x) | let V us = f a, (b,y) &amp;lt;- us] | (a,x) &amp;lt;- ts]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This monad is most easily understood using the &lt;a href="http://www.haskell.org/haskellwiki/Monads_as_containers"&gt;monad as container&lt;/a&gt; analogy. A free k-vector space over b is just a container of elements of b. Okay, so it's a slightly funny sort of container. It most resembles a multiset or bag - an unordered container in which you can have more than one of each element. However, free k-vector spaces go further. A free Q-vector space is a container in which we're allowed to have fractional or negative amounts of each basis element, such as 1/2 e1 - 3 e2. 
In a free C-vector space, we're even allowed imaginary amounts of each element, such as i e3.&lt;br /&gt;&lt;br /&gt;These oddities aside though, free k-vector spaces are monads in much the same way as any other container. For example, let's compare them to the list monad.&lt;br /&gt;- For a container monad, &lt;code&gt;return&lt;/code&gt; means "put into the container". For List, return a is [a]. For k-Vect, return a is 1 a (where 1 is the scalar 1 in our field k).&lt;br /&gt;- For container monads, it's most natural to look next at the &lt;code&gt;join&lt;/code&gt; operation, which combines a container of containers into a single container. For List, join = concat. For k-Vect, join combines a linear combination of linear combinations into a single linear combination (in the obvious way).&lt;br /&gt;- Finally there is bind or &lt;code&gt;(&amp;gt;&amp;gt;=)&lt;/code&gt;. bind can be defined in terms of join and fmap:&lt;br /&gt;&lt;code&gt;x &amp;gt;&amp;gt;= f = join ((fmap f) x)&lt;/code&gt;&lt;br /&gt;For lists, bind is basically concatMap. For k-Vect, bind corresponds to extending a function on basis elements to a function on vectors "by linearity". That is, if f :: a -&amp;gt; Vect k b, then (&amp;gt;&amp;gt;= f) :: Vect k a -&amp;gt; Vect k b is defined (in effect, by structural induction) so as to be linear, by saying that (&amp;gt;&amp;gt;= f) 0 = 0, (&amp;gt;&amp;gt;= f) (k a) = k (f a), (&amp;gt;&amp;gt;= f) (a + b) = f a + f b.&lt;br /&gt;&lt;br /&gt;So k-Vect is just like a strange sort of list. Think of them as distant cousins. Incidentally, this is not only because they're both containers (if you believed my story about k-Vect being a container). There's another reason: Both the List and k-Vect monads arise from free-forgetful adjunctions. In the case of lists, the list datatype is the free monoid, and the list monad arises from the free-forgetful adjunction for monoids.&lt;br /&gt;&lt;br /&gt;That's most of what I wanted to say for now. 
However, there's one small detail to add. Last time we defined a normal form for elements of Vect k b, in which the basis elements are in order, without repeats, and none of the scalars are zero. In order to calculate this normal form, we require an Ord instance for b. Unfortunately, Haskell doesn't let us specify that when defining the Monad instance for (Vect k). So whenever we use (&amp;gt;&amp;gt;=), we should call nf afterwards, to put the result in normal form.&lt;br /&gt;&lt;br /&gt;For this reason, we define a convenience function:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;linear :: (Ord b, Num k) =&amp;gt; (a -&amp;gt; Vect k b) -&amp;gt; Vect k a -&amp;gt; Vect k b&lt;br /&gt;linear f v = nf $ v &amp;gt;&amp;gt;= f&lt;/code&gt;&lt;br /&gt;Given f :: a -&amp;gt; Vect k b, linear f :: Vect k a -&amp;gt; Vect k b is the extension of f from basis elements to vectors "by linearity". Hence, linear f is guaranteed to be linear, by construction. We can confirm this on an example using the QuickCheck properties we defined last time.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let f (E 1) = e1; f (E 2) = e1 &amp;lt;+&amp;gt; e2; f _ = zero&lt;br /&gt;&amp;gt; (linear f) (e1 &amp;lt;+&amp;gt; e2)&lt;br /&gt;2e1+e2&lt;br /&gt;&amp;gt; quickCheck (prop_LinearQn (linear f))&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Acknowledgement: I'm partially retracing Dan Piponi's steps here. He first described the free vector space monad &lt;a href="http://blog.sigfpe.com/2007/02/monads-for-vector-spaces-probability.html"&gt;here&lt;/a&gt;. When I come to discuss the connection between quantum algebra and knot theory, I'll be revisiting some more material that Dan sketched out &lt;a href="http://blog.sigfpe.com/2008/10/untangling-with-continued-fractions.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Exercise for next time: What does Vect k (Either a b) represent? 
What does Vect k (a,b) represent?&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-1294793580226717866?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/1294793580226717866/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2011/01/free-vector-space-on-type-part-2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1294793580226717866'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1294793580226717866'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2011/01/free-vector-space-on-type-part-2.html' title='The free vector space on a type, part 2'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-4672069097854647283</id><published>2010-12-13T19:53:00.000Z</published><updated>2010-12-13T19:53:55.048Z</updated><title type='text'>The free vector space on a type, part 1</title><content type='html'>As I mentioned last time, I want to spend the next few posts talking about quantum algebra. Well, we've got to start somewhere, so let's start with vector spaces.&lt;br /&gt;&lt;br /&gt;You probably know what a vector space is. 
It's what it sounds like: a space of vectors, that you can add together, or multiply by scalars (that is, real numbers, or more generally, elements of some field k). Here's the official definition:&lt;br /&gt;An additive (or Abelian) group is a set with a binary operation called addition, such that&lt;br /&gt;- addition is associative: x+(y+z) = (x+y)+z&lt;br /&gt;- addition is commutative: x+y = y+x&lt;br /&gt;- there is an additive identity: x+0 = x = 0+x&lt;br /&gt;- there are additive inverses: x+(-x) = 0 = (-x)+x&lt;br /&gt;A vector space over a field k is an additive group V, together with an operation k * V -&amp;gt; V called scalar multiplication, such that:&lt;br /&gt;- scalar multiplication distributes over vector addition: a(x+y) = ax+ay&lt;br /&gt;- scalar multiplication distributes over scalar addition: (a+b)x = ax+bx&lt;br /&gt;- associativity: (ab)x = a(bx)&lt;br /&gt;- unit: 1x = x&lt;br /&gt;&lt;br /&gt;There are some obvious examples:&lt;br /&gt;- R^2 is a 2-dimensional vector space over the reals R, R^3 is a 3-dimensional vector space. (I'm not going to define dimension quite yet, but hopefully it's intuitively obvious what it means.)&lt;br /&gt;- R^n is an R-vector space for any n.&lt;br /&gt;- Indeed, k^n is a k-vector space for any field k.&lt;br /&gt;&lt;br /&gt;Some slightly more interesting examples:&lt;br /&gt;- C is a 2-dimensional vector space over R. 
(The reason it's more interesting is that of course C possesses additional algebraic structure, beyond the vector space structure.)&lt;br /&gt;- 2*2 matrices over k form a 4-dimensional k-vector space.&lt;br /&gt;- Polynomials in X with coefficients in k form an (infinite dimensional) k-vector space.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;If we wanted to code the above definition into Haskell, probably the first idea that would come to mind would be to use type classes:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;class AddGrp a where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;add :: a -&amp;gt; a -&amp;gt; a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;zero :: a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;neg :: a -&amp;gt; a -- additive inverse&lt;br /&gt;&lt;br /&gt;class (Field k, AddGrp v) =&amp;gt; VecSp k v where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;smult :: k -&amp;gt; v -&amp;gt; v -- scalar multiplication&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(Type classes similar to these are defined in the &lt;a href="http://hackage.haskell.org/package/vector-space"&gt;vector-space&lt;/a&gt; package.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For most vector spaces that one encounters in "real life", there is some set of elements, usually obvious, which form a "basis" for the vector space, meaning that all elements can be expressed as linear combinations of basis elements. For example, in R^3, the obvious basis is {(1,0,0), (0,1,0), (0,0,1)}. Any element (x,y,z) of R^3 can be expressed as the linear combination x(1,0,0)+y(0,1,0)+z(0,0,1).&lt;br /&gt;&lt;br /&gt;(Mathematicians would want to stress that there are other bases for R^3 that would serve equally well, and indeed, that a significant part of the theory of vector spaces can go through without even talking about bases. 
However, for our purposes - we want to write code to calculate in vector spaces - working with a basis is natural.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so we want a way to build a vector space from a basis. (More specifically, a k-vector space, for some given field k.) What sorts of things shall we allow as our basis? Well, why not just allow any type, whatsoever:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;module Math.Algebras.VectorSpace where&lt;br /&gt;&lt;br /&gt;import qualified Data.List as L&lt;br /&gt;&lt;br /&gt;data Vect k b = V [(b,k)] deriving (Eq,Ord)&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;This says that an element of the k-vector space over basis b is a linear combination of elements of b. (So the [(b,k)] is to be thought of as a sum, with each (b,k) pair representing a basis element in b multiplied by a scalar coefficient in k.)&lt;br /&gt;&lt;br /&gt;For example, we can define the "boring basis" type, which just consists of numbered basis elements:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;newtype EBasis = E Int deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show EBasis where show (E i) = "e" ++ show i&lt;br /&gt;&lt;br /&gt;e i = return (E i) -- don't worry about what "return" is doing here for the moment&lt;br /&gt;e1 = e 1&lt;br /&gt;e2 = e 2&lt;br /&gt;e3 = e 3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then a typical element of Vect Double EBasis is:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebras.VectorSpace&lt;br /&gt;&amp;gt; V [(E 1, 0.5), (E 3, 0.7)]&lt;br /&gt;0.5e1+0.7e3&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So of course, the Show instances for EBasis (see above), and Vect k b (not shown) are coming into play here.&lt;br /&gt;&lt;br /&gt;How do we know that this &lt;i&gt;is&lt;/i&gt; a vector space? Well actually, it's not yet, because we haven't defined the addition and scalar multiplication operations on it. 
So, without further ado:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;infixr 7 *&amp;gt;&lt;br /&gt;infixl 6 &amp;lt;+&amp;gt;&lt;br /&gt;&lt;br /&gt;-- |The zero vector&lt;br /&gt;zero :: Vect k b&lt;br /&gt;zero = V []&lt;br /&gt;&lt;br /&gt;-- |Addition of vectors&lt;br /&gt;add :: (Ord b, Num k) =&amp;gt; Vect k b -&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;add (V ts) (V us) = V $ addmerge ts us&lt;br /&gt;&lt;br /&gt;-- |Addition of vectors (same as add)&lt;br /&gt;(&amp;lt;+&amp;gt;) :: (Ord b, Num k) =&amp;gt; Vect k b -&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;(&amp;lt;+&amp;gt;) = add&lt;br /&gt;&lt;br /&gt;addmerge ((a,x):ts) ((b,y):us) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;case compare a b of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;LT -&amp;gt; (a,x) : addmerge ts ((b,y):us)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;EQ -&amp;gt; if x+y == 0 then addmerge ts us else (a,x+y) : addmerge ts us&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;GT -&amp;gt; (b,y) : addmerge ((a,x):ts) us&lt;br /&gt;addmerge ts [] = ts&lt;br /&gt;addmerge [] us = us&lt;br /&gt;&lt;br /&gt;-- |Negation of vector&lt;br /&gt;neg :: (Num k) =&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;neg (V ts) = V $ map (\(b,x) -&amp;gt; (b,-x)) ts&lt;br /&gt;&lt;br /&gt;-- |Scalar multiplication (on the left)&lt;br /&gt;smultL :: (Num k) =&amp;gt; k -&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;smultL 0 _ = zero -- V []&lt;br /&gt;smultL k (V ts) = V [(ei,k*xi) | (ei,xi) &amp;lt;- ts]&lt;br /&gt;&lt;br /&gt;-- |Same as smultL. Mnemonic is "multiply through (from the left)"&lt;br /&gt;(*&amp;gt;) :: (Num k) =&amp;gt; k -&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;(*&amp;gt;) = smultL&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;A few things to mention:&lt;br /&gt;- First, note that we required a Num instance for k. Strictly speaking, as we stated that k is a field, then we should have required a Fractional instance. 
However, on occasion we are going to break the rules slightly.&lt;br /&gt;- Second, note that for addition, we required an Ord instance for b. We could have defined addition using (++) to concatenate linear combinations - however, the problem with that is that it wouldn't then easily follow that e1+e3 = e3+e1, or that e1+e1 = 2e1. By requiring an Ord instance, we can guarantee that there is a unique normal form in which to express any vector - namely, list the basis elements in order, combine duplicates, remove zero coefficients.&lt;br /&gt;- Finally, note that I didn't define Vect k b as an instance of a vector space type class. That's just because I didn't yet see a reason to.&lt;br /&gt;&lt;br /&gt;It will turn out to be useful to have a function that can take an arbitrary element of Vect k b and return a vector in normal form:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;-- |Convert an element of Vect k b into normal form. Normal form consists in having the basis elements in ascending order,&lt;br /&gt;-- with no duplicates, and all coefficients non-zero&lt;br /&gt;nf :: (Ord b, Num k) =&amp;gt; Vect k b -&amp;gt; Vect k b&lt;br /&gt;nf (V ts) = V $ nf' $ L.sortBy compareFst ts where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nf' ((b1,x1):(b2,x2):ts) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;case compare b1 b2 of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;LT -&amp;gt; if x1 == 0 then nf' ((b2,x2):ts) else (b1,x1) : nf' ((b2,x2):ts)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;EQ -&amp;gt; if x1+x2 == 0 then nf' ts else nf' ((b1,x1+x2):ts)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;GT -&amp;gt; error "nf': not pre-sorted"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nf' [(b,x)] = if x == 0 then [] else [(b,x)]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nf' [] = []&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;compareFst (b1,x1) (b2,x2) = compare b1 b2&lt;br 
/&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Okay, so we ought to check that the Vect k b is a vector space. Let's write some QuickCheck properties:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_AddGrp (x,y,z) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x &amp;lt;+&amp;gt; (y &amp;lt;+&amp;gt; z) == (x &amp;lt;+&amp;gt; y) &amp;lt;+&amp;gt; z &amp;amp;&amp;amp; -- associativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x &amp;lt;+&amp;gt; y == y &amp;lt;+&amp;gt; x &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;amp;&amp;amp; -- commutativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x &amp;lt;+&amp;gt; zero == x &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;amp;&amp;amp; -- identity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;x &amp;lt;+&amp;gt; neg x == zero &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- inverse&lt;br /&gt;&lt;br /&gt;prop_VecSp (a,b,x,y,z) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;prop_AddGrp (x,y,z) &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;a *&amp;gt; (x &amp;lt;+&amp;gt; y) == a *&amp;gt; x &amp;lt;+&amp;gt; a *&amp;gt; y &amp;amp;&amp;amp; -- distributivity through vectors&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(a+b) *&amp;gt; x == a *&amp;gt; x &amp;lt;+&amp;gt; b *&amp;gt; x &amp;nbsp; &amp;nbsp; &amp;amp;&amp;amp; -- distributivity through scalars&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;(a*b) *&amp;gt; x == a *&amp;gt; (b *&amp;gt; x) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;amp;&amp;amp; -- associativity&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;1 *&amp;gt; x == x &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;-- unit&lt;br /&gt;&lt;br /&gt;instance Arbitrary EBasis where&lt;br /&gt;&amp;nbsp;&amp;nbsp; 
&amp;nbsp;arbitrary = do n &amp;lt;- arbitrary :: Gen Int&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return (E n)&lt;br /&gt;&lt;br /&gt;instance Arbitrary Q where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = do n &amp;lt;- arbitrary :: Gen Integer&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; d &amp;lt;- arbitrary :: Gen Integer&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return (if d == 0 then fromInteger n else fromInteger n / fromInteger d)&lt;br /&gt;&lt;br /&gt;instance Arbitrary (Vect Q EBasis) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;arbitrary = do ts &amp;lt;- arbitrary :: Gen [(EBasis, Q)]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; return $ nf $ V ts&lt;br /&gt;&lt;br /&gt;prop_VecSpQn (a,b,x,y,z) = prop_VecSp (a,b,x,y,z)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (a,b,x,y,z) :: (Q, Q, Vect Q EBasis, Vect Q EBasis, Vect Q EBasis)&lt;br /&gt;&lt;br /&gt;&amp;gt; quickCheck prop_VecSpQn&lt;br /&gt;+++ OK, passed 100 tests.&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;(I'm using Q instead of R as my field in order to avoid false negatives caused by the fact that arithmetic in Double is not exact.)&lt;br /&gt;&lt;br /&gt;So it looks like Vect k b is indeed a vector space. In category theory, it is called the &lt;i&gt;free&lt;/i&gt; k-vector space over b. "Free" here means that there are no relations among the basis elements: it will never turn out, for example, that e1 = e2+e3.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Vector spaces of course form a category, specifically an algebraic category (there are other types, as we'll see in due course). The objects in the category are the vector spaces. 
The arrows or morphisms in the category are the functions between vector spaces which "commute" with the algebraic structure. Specifically, they are the functions f such that:&lt;br /&gt;- f(x+y) = f(x)+f(y)&lt;br /&gt;- f(0) = 0&lt;br /&gt;- f(-x) = -f(x)&lt;br /&gt;- f(a.x) = a.f(x)&lt;br /&gt;&lt;br /&gt;Such a function is called &lt;i&gt;linear&lt;/i&gt;, the idea being that it preserves lines. This is because it follows from the conditions that f(a.x+b.y) = a.f(x)+b.f(y).&lt;br /&gt;&lt;br /&gt;We can write a QuickCheck property to check whether a given function is linear:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;prop_Linear f (a,x,y) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f (x &amp;lt;+&amp;gt; y) == f x &amp;lt;+&amp;gt; f y &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f zero == zero &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f (neg x) == neg (f x) &amp;nbsp; &amp;nbsp; &amp;amp;&amp;amp;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;f (a *&amp;gt; x) == a *&amp;gt; f x&lt;br /&gt;&lt;br /&gt;prop_LinearQn f (a,x,y) = prop_Linear f (a,x,y)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where types = (a,x,y) :: (Q, Vect Q EBasis, Vect Q EBasis)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_LinearQn (2 *&amp;gt;))&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We won't need to use this quite yet, but it's handy to have around.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Now, in category theory we have the concept of a functor, which is a map from one category to another, which commutes with the category structure. 
Specifically, a functor F consists of a map from the objects of one category to the objects of the other, and from the arrows of one category to the arrows of the other, satisfying:&lt;br /&gt;- F(id_A) = id_F(A)&lt;br /&gt;- F(f . g) = F(f) . F(g) (where dot denotes function composition)&lt;br /&gt;&lt;br /&gt;How does this relate to the Functor type class in Haskell? Well, the Haskell type class enables us to declare that a &lt;i&gt;type constructor&lt;/i&gt; is a functor. For example, (Vect k) is a type constructor, which acts on a type b to construct another type Vect k b. (Vect k) is indeed a functor, witness the following declaration:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;instance Functor (Vect k) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;fmap f (V ts) = V [(f b, x) | (b,x) &amp;lt;- ts]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;This says that if we have a function f on our basis elements, then we can lift it to a function on linear combinations of basis elements in the obvious way.&lt;br /&gt;&lt;br /&gt;In mathematics, we would think of the free vector space construction as a functor from &lt;b&gt;Set&lt;/b&gt; (the category of sets) to &lt;b&gt;k-Vect&lt;/b&gt; (the category of k-vector spaces). In Haskell, we need to think of the (Vect k) construction slightly differently. It operates on types, rather than sets, so the source category is &lt;b&gt;Hask&lt;/b&gt;, the category of Haskell types.&lt;br /&gt;&lt;br /&gt;What is the relationship between &lt;b&gt;Hask&lt;/b&gt; and &lt;b&gt;Set&lt;/b&gt;? Well, if we identify a type with the set of values which inhabit it, then we can regard &lt;b&gt;Hask&lt;/b&gt; as a subcategory of &lt;b&gt;Set&lt;/b&gt;, consisting of those sets and functions which can be represented in Haskell. 
(That would imply for example that we are restricted to &lt;i&gt;computable&lt;/i&gt; functions.)&lt;br /&gt;&lt;br /&gt;So (Vect k) is a functor from &lt;b&gt;Hask&lt;/b&gt; to the subcategory of &lt;b&gt;k-Vect&lt;/b&gt; consisting of vector spaces over sets/types in &lt;b&gt;Hask&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;So just to spell it out, the (Vect k) functor:&lt;br /&gt;- Takes an object b in &lt;b&gt;Hask&lt;/b&gt;/&lt;b&gt;Set&lt;/b&gt; - ie a type, or its set of inhabitants - to an object Vect k b in &lt;b&gt;k-Vect&lt;/b&gt;&lt;br /&gt;- Takes an arrow f in &lt;b&gt;Hask&lt;/b&gt; (ie a function f :: a -&amp;gt; b), to an arrow (fmap f) :: Vect k a -&amp;gt; Vect k b in &lt;b&gt;k-Vect&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;Now, there's just one small fly in the ointment. In order to get equality of vectors to work out right, we wanted to insist that they were expressed in normal form, which meant we needed an Ord instance for b. However, in the Functor instance for (Vect k), Haskell doesn't let us express this constraint, and our fmap is unable to use the Ord instance for b. What this means is that fmap f might return a vector which is not in normal form - so we need to remember to call nf afterwards. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;newtype FBasis = F Int deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show FBasis where show (F i) = "f" ++ show i&lt;br /&gt;&lt;br /&gt;&amp;gt; let f = \(E i) -&amp;gt; F (10 - div i 2)&lt;br /&gt;&amp;gt; let f' = fmap f :: Vect Q EBasis -&amp;gt; Vect Q FBasis&lt;br /&gt;&amp;gt; f' (e1 &amp;lt;+&amp;gt; 2 *&amp;gt; e2 &amp;lt;+&amp;gt; e3)&lt;br /&gt;f10+2f9+f9&lt;br /&gt;&amp;gt; let f'' = nf . 
fmap f :: Vect Q EBasis -&amp;gt; Vect Q FBasis&lt;br /&gt;&amp;gt; f'' (e1 &amp;lt;+&amp;gt; 2 *&amp;gt; e2 &amp;lt;+&amp;gt; e3)&lt;br /&gt;3f9+f10&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So it might be fairer to say that it is the combination of nf and fmap that forms the functor on arrows.&lt;br /&gt;&lt;br /&gt;The definition of a functor requires that the target arrow is an arrow in the target category. In this case, the requirement is that it is a linear function, rather than just any function between vector spaces. So let's just check:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; quickCheck (prop_LinearQn f'')&lt;br /&gt;+++ OK, passed 100 tests.&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;That's enough for now - more next time.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-4672069097854647283?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/4672069097854647283/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/12/free-vector-space-on-type-part-1.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4672069097854647283'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4672069097854647283'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/12/free-vector-space-on-type-part-1.html' title='The free vector space on a type, part 1'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-3265607331051791135</id><published>2010-11-07T18:16:00.000Z</published><updated>2010-11-07T18:16:52.165Z</updated><title type='text'>New modules - Quantum Algebra</title><content type='html'>I've put up a new version of HaskellForMaths on &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;Hackage&lt;/a&gt;, v0.3.1. It's quite a significant update, with more than a dozen new modules, plus improved documentation of several existing modules. I wrote the new modules in the course of reading Kassel's Quantum Groups. The modules are about algebras, coalgebras, bialgebras, Hopf algebras, tensor categories and quantum algebra.&lt;br /&gt;&lt;br /&gt;The new modules fall into two groups:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Math.Algebras.* - Modules about algebras (and co-, bi- and Hopf algebras) in general&lt;/li&gt;&lt;li&gt;Math.QuantumAlgebra.* - Modules specifically about quantum algebra&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;In (slightly) more detail, here are the modules:&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Math.Algebras.VectorSpace - defines a type for the free k-vector space over a basis set b&amp;nbsp;&lt;/li&gt;&lt;li&gt;Math.Algebras.TensorProduct &amp;nbsp;- defines tensor product of two vector spaces&lt;/li&gt;&lt;li&gt;Math.Algebras.Structures - defines a number of additional algebraic structures that can be given to vector spaces: algebra, coalgebra, bialgebra, Hopf algebra, module, comodule&lt;/li&gt;&lt;li&gt;Math.Algebras.Quaternions - a simple example of an algebra&lt;/li&gt;&lt;li&gt;Math.Algebras.Matrix - the 2*2 matrices - another simple example of an algebra&lt;/li&gt;&lt;li&gt;Math.Algebras.Commutative - commutative polynomials (such as x^2+3yz) - another algebra&lt;/li&gt;&lt;li&gt;Math.Algebras.NonCommutative - non-commutative polynomials (where xy /= yx) - another 
algebra&lt;/li&gt;&lt;li&gt;Math.Algebras.GroupAlgebra - a key example of a Hopf algebra&lt;/li&gt;&lt;li&gt;Math.Algebras.AffinePlane - the affine plane and its symmetries - more Hopf algebras, preparing for the quantum plane&lt;/li&gt;&lt;li&gt;Math.Algebras.TensorAlgebra&lt;/li&gt;&lt;li&gt;Math.Algebras.LaurentPoly - we use Laurent polynomials in q (that is, polynomials in q and q^-1) as our quantum scalars&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Math.QuantumAlgebra.QuantumPlane - the quantum plane and its symmetries, as examples of non-commutative, non-cocommutative Hopf algebras&lt;/li&gt;&lt;li&gt;Math.QuantumAlgebra.TensorCategory&lt;/li&gt;&lt;li&gt;Math.QuantumAlgebra.Tangle - The tangle category (which includes knots, links and braids as subcategories), and some representations (from which we derive knot invariants)&lt;/li&gt;&lt;li&gt;Math.QuantumAlgebra.OrientedTangle&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;The following diagram is something like a dependency diagram, with "above" meaning "depends on".&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TNbrgDNoIPI/AAAAAAAAAHI/-QY8b9soqns/s1600/QA+modules.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="305" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TNbrgDNoIPI/AAAAAAAAAHI/-QY8b9soqns/s640/QA+modules.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Each layer depends on the layers beneath it. There are also a few dependencies within layers, that I've hinted at by proximity.&lt;br /&gt;&lt;br /&gt;Some of these modules overlap somewhat in content with other modules that already exist in HaskellForMaths. In particular, there are already modules for commutative algebra, non-commutative algebra, and knot theory. 
For the moment, those existing modules still offer some features not offered by the new modules (for example, calculation of Groebner bases).&lt;br /&gt;&lt;br /&gt;For the next little while in this blog, I want to start going through these new modules, and investigating quantum algebra. I should emphasize that this is still work in progress. For example, I'm intending to add modules for quantum enveloping algebras - but I have some reading to do first.&lt;br /&gt;&lt;br /&gt;(Oh, and I know that I did previously promise to look at finite simple groups, and Coxeter groups, in this blog. I'll probably still come back to those at some point.)&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-3265607331051791135?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/3265607331051791135/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/11/new-modules-quantum-algebra.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3265607331051791135'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3265607331051791135'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/11/new-modules-quantum-algebra.html' title='New modules - Quantum Algebra'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_XQ7FznWBAYE/TNbrgDNoIPI/AAAAAAAAAHI/-QY8b9soqns/s72-c/QA+modules.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-26919414128897063</id><published>2010-10-14T20:03:00.000+01:00</published><updated>2010-10-14T20:03:26.038+01:00</updated><title type='text'>Word length in the Symmetric group</title><content type='html'>Previously on this blog, we saw how to think about groups abstractly via group presentations, where a group is given as a set of generators satisfying specified relations. Last time, we saw that questions about the length of reduced words in such a presentation can be visualised as questions about the length of paths in the Cayley graph of the group (relative to the generators).&lt;br /&gt;&lt;br /&gt;This time, I want to focus on just one family of groups - the symmetric groups Sn, as generated by the adjacent transpositions {si = (i i+1)}. Here's the Haskell code defining this presentation of Sn:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;newtype SGen = S Int deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show SGen where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show (S i) = "s" ++ show i&lt;br /&gt;&lt;br /&gt;_S n = (gs, r ++ s ++ t) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;gs = map S [1..n-1]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;r = [([S i, S i],[]) | i &amp;lt;- [1..n-1]]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;s = [(concat $ replicate 3 [S i, S (i+1)],[]) | i &amp;lt;- [1..n-2]]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;t = [([S i, S j, S i, S j],[]) | i &amp;lt;- [1..n-1], j &amp;lt;- [i+2..n-1]]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;The three sets of relations r, s and t say, respectively: each generator si squares to the identity; if i, j are adjacent, then (si*sj)^3 is the identity; if i, j are not adjacent, then (si*sj)^2 is the identity, which (as each generator is its own inverse) is the same as saying that si and sj commute.&lt;br /&gt;&lt;br /&gt;Here is the Cayley graph for S4 under this 
presentation:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TLdQa1RiUJI/AAAAAAAAAHE/89shnGJO6rA/s1600/CayleyGraphS4.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TLdQa1RiUJI/AAAAAAAAAHE/89shnGJO6rA/s1600/CayleyGraphS4.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The vertices are labeled with the reduced words in the generators si. How can we find out which permutations these correspond to? Well, that's easy:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;fromTranspositions ts = product $ map (\(S i) -&amp;gt; p [[i,i+1]]) ts&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.CayleyGraph&lt;br /&gt;&amp;gt; fromTranspositions [S 1, S 2, S 1, S 3, S 2, S 1]&lt;br /&gt;[[1,4],[2,3]]&lt;/code&gt;&lt;br /&gt;This is the permutation that reverses the list [1..4]&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; map (.^ it) [1..4]&lt;br /&gt;[4,3,2,1]&lt;/code&gt;&lt;br /&gt;What about the other way round? Suppose we are given a permutation in Sn. How do we find its expression as a product of the transpositions si? 
Well the answer is (roughly): use &lt;a href="http://en.wikipedia.org/wiki/Bubblesort"&gt;bubblesort&lt;/a&gt;!&lt;br /&gt;&lt;br /&gt;Here's bubblesort in Haskell:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;bubblesort [] = []&lt;br /&gt;bubblesort xs = bubblesort' [] xs where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;bubblesort' ls (r1:r2:rs) = if r1 &amp;lt;= r2 then bubblesort' (r1:ls) (r2:rs) else bubblesort' (r2:ls) (r1:rs)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;bubblesort' ls [r] = bubblesort (reverse ls) ++ [r]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;So we sweep through the list from front to back, swapping any pairs&amp;nbsp;that we find out of order - and then repeat. At the end of each sweep, we're guaranteed that the last element (in sort order) has reached the end of the list - so for the next sweep, we can leave it at the end and only sweep through the earlier elements. Hence the list we're sweeping through is one element shorter each time, so we're guaranteed to terminate. (We could terminate early, the first time a sweep makes no swaps - but I haven't coded that.)&lt;br /&gt;&lt;br /&gt;Just to prove that the code works:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; bubblesort [2,3,1]&lt;br /&gt;[1,2,3]&lt;/code&gt;&lt;br /&gt;How does this help for turning a permutation into a sequence of transpositions? Well it's simple - every time we swap two elements, we are performing a transposition - so just record which swaps we perform. 
So here's a modified version of the above code, which records the swaps:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;-- given a permutation of [1..n] (as a list), return the transpositions which led to it&lt;br /&gt;toTrans [] = []&lt;br /&gt;toTrans xs = toTrans' 1 [] [] xs where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;toTrans' i ts ls (r1:r2:rs) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;if r1 &amp;lt;= r2&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;then toTrans' (i+1) ts (r1:ls) (r2:rs) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- no swap needed&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;else toTrans' (i+1) (S i : ts) (r2:ls) (r1:rs) -- swap needed&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;toTrans' i ts ls [r] = toTrans (reverse ls) ++ ts&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Notice that the ts are returned in reverse to the order that they were used. This is because we are using them to &lt;i&gt;undo&lt;/i&gt; the permutation - so we are performing the &lt;i&gt;inverse&lt;/i&gt; of the permutation we are trying to find. Since each generator is its own inverse, we can recover the permutation we are after simply by reversing. In the code, we reverse as we go along.&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; toTrans [2,3,1]&lt;br /&gt;[s1,s2]&lt;br /&gt;&amp;gt; toTrans [4,3,2,1]&lt;br /&gt;[s1,s2,s1,s3,s2,s1]&lt;/code&gt;&lt;br /&gt;Now, there's only one problem. As you can see, this code takes as input a rearrangement of [1..n]. This is a permutation, yes, but considered passively. Whereas in this blog we have been more accustomed to thinking of permutations actively, as something a bit like a function, which has an action on a graph, or other combinatorial structure, or if you like, just on the set [1..n]. 
In other words, our Permutation type represents the action itself, not the outcome of the action. (Recall that the implementation uses a Data.Map of (from,to) pairs.)&lt;br /&gt;&lt;br /&gt;But of course it's easy to convert from one viewpoint to the other. So here's the code to take a permutation in cycle notation and turn it into a sequence of transpositions:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;-- given a permutation action on [1..n], factor it into transpositions&lt;br /&gt;toTranspositions 1 = []&lt;br /&gt;toTranspositions g = toTrans [i .^ (g^-1) | i &amp;lt;- [1..n] ] where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;n = maximum $ supp g&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; toTranspositions $ p [[1,4],[2,3]]&lt;br /&gt;[s1,s2,s1,s3,s2,s1]&lt;/code&gt;&lt;br /&gt;Why does the code have [i .^ (g^-1) | i &amp;lt;- [1..n]], rather than [i .^ g | i &amp;lt;- [1..n]]?&lt;br /&gt;Well, suppose i .^ g = j. This says that g moves i to the j position. But we want to know what ends up in the i position. Suppose that j .^ g = i, for some j. Applying g^-1 to both sides, we see that j = i .^ (g^-1).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so given a permutation, in either form, we can reconstruct it as a reduced word in the generators.&lt;br /&gt;&lt;br /&gt;We saw last time that the length of a reduced word is also the length of the shortest path from 1 to the element in the Cayley graph. Distance in the Cayley graph is a metric on the group, so the length of a reduced word tells us "how far" the element is from being the identity.&lt;br /&gt;&lt;br /&gt;If it's only this distance that we're interested in, then there is a more direct way to work it out. Given a permutation g of [1..n], then an &lt;i&gt;inversion&lt;/i&gt; is a pair (i,j) with i &amp;lt; j but i .^ g &amp;gt; j .^ g. 
In Haskell:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;inversions g = [(i,j) | i &amp;lt;- [1..n], j &amp;lt;- [i+1..n], i &amp;lt; j, i .^ g &amp;gt; j .^ g]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where n = maximum $ supp g&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; inversions $ fromList [1,4,3,2]&lt;br /&gt;[(2,3),(2,4),(3,4)]&lt;/code&gt;&lt;br /&gt;With a little thought, you should be able to convince yourself that the number of inversions is equal to the length of the reduced word for g - because each swap that we perform during bubblesort corrects exactly one inversion.&lt;br /&gt;&lt;br /&gt;Okay, so this is all very nice, but what use is it? Well, of course, maths doesn't have to be useful, any more than any other aesthetic pursuit. However, as it happens, in this case it is.&lt;br /&gt;&lt;br /&gt;In statistics, the &lt;a href="http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient"&gt;Kendall tau test&lt;/a&gt; gives an indicator of the correlation between two measured quantities (for example, the height and weight of the test subjects). Suppose that we are given a list of pairs (eg (height,weight) pairs), and we want to know how strongly correlated the first and second quantities are.&lt;br /&gt;&lt;br /&gt;Ok, so what we do is, we rank the first quantities from lowest to highest, and replace each quantity by its rank (a number from 1 to n). We do the same for the second quantities. So we end up with a list of pairs of numbers from 1 to n. Now, we sort the list on the first element, and then count the number of inversions in the second element.&lt;br /&gt;&lt;br /&gt;For example, suppose our original list was [(1.55m, 60kg), (1.8m, 80kg), (1.5m, 70kg), (1.6m, 72kg)]. Converting to ranks, we get [(2nd,1st),(4th,4th),(1st,2nd),(3rd,3rd)]. Sorting on fst, we get [(1,2),(2,1),(3,3),(4,4)]. 
Looking at snd, we see that we have just one inversion. The idea is that the fewer inversions we have, the better correlated the two quantities. (Of course in reality there's a bit more to it than that - to convert the number of inversions into a probability, we need to know the distribution of word lengths for Sn, where n is the number of pairs of test data that we have.)&lt;br /&gt;&lt;br /&gt;So you can think of the Kendall tau test as saying: What permutation has been applied in moving from the first quantities (the heights) to the second quantities (the weights)? How far is that permutation from the identity (on the Cayley graph)? What proportion of all permutations (in Sn) lie at that distance or less from the identity? (Even more concretely, we can imagine colouring in successive shells of the Cayley graph, working out from the identity, until we hit the given permutation, and then asking what proportion of the "surface" of the graph has been coloured.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-26919414128897063?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/26919414128897063/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/10/word-length-in-symmetric-group.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/26919414128897063'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/26919414128897063'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/10/word-length-in-symmetric-group.html' title='Word length in the Symmetric 
group'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_XQ7FznWBAYE/TLdQa1RiUJI/AAAAAAAAAHE/89shnGJO6rA/s72-c/CayleyGraphS4.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-556064135994311522</id><published>2010-09-20T21:06:00.000+01:00</published><updated>2010-09-20T21:06:22.353+01:00</updated><title type='text'>Cayley graphs of groups</title><content type='html'>[New version HaskellForMaths 0.2.2 released &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;here&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Recently, we've been looking at group presentations, where a group is presented as a set of generators together with a set of relations that hold between those generators. Group elements are then represented as words in the generators.&lt;br /&gt;&lt;br /&gt;One can then ask questions about these words, such as: What is the longest (reduced) word in the group? How many (reduced) words are there of each length?&lt;br /&gt;&lt;br /&gt;This week I want to look at Cayley graphs, which are a way of visualising groups. 
Questions about word length translate to questions about path distance in the Cayley graph.&lt;br /&gt;&lt;br /&gt;So, the Cayley graph of a group, relative to a generating set gs, is the graph&lt;br /&gt;- with a vertex for each element of the group&lt;br /&gt;- with an edge from x to y just whenever x*g = y for some generator g in gs&lt;br /&gt;&lt;br /&gt;Notice that as we have defined it, the edges are &lt;i&gt;directed&lt;/i&gt; (from x to y), so this is a directed graph, or digraph.&lt;br /&gt;&lt;br /&gt;Here's the Haskell code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;module Math.Algebra.Group.CayleyGraph where&lt;br /&gt;&lt;br /&gt;import Math.Algebra.Group.PermutationGroup as P&lt;br /&gt;import Math.Algebra.Group.StringRewriting as SR&lt;br /&gt;import Math.Combinatorics.Graph&lt;br /&gt;&lt;br /&gt;import qualified Data.List as L&lt;br /&gt;import qualified Data.Set as S&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;data Digraph a = DG [a] [(a,a)] deriving (Eq,Ord,Show)&lt;br /&gt;&lt;br /&gt;-- Cayley digraph given a group presentation of generators and relations&lt;br /&gt;cayleyDigraphS (gs,rs) = DG vs es where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;rs' = knuthBendix rs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;vs = L.sort $ nfs (gs,rs') -- reduced words&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;es = [(v,v') | v &amp;lt;- vs, v' &amp;lt;- nbrs v ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nbrs v = L.sort [rewrite rs' (v ++ [g]) | g &amp;lt;- gs]&lt;br /&gt;&lt;br /&gt;-- Cayley digraph given group generators as permutations&lt;br /&gt;cayleyDigraphP gs = DG vs es where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;vs = P.elts gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;es = [(v,v') | v &amp;lt;- vs, v' &amp;lt;- nbrs v ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nbrs v = L.sort [v * g | g &amp;lt;- gs]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;As an example, let's look at the Cayley digraph of the dihedral group D8 (the 
symmetries of a square), generated by a rotation and a reflection:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TJe6DWcIGmI/AAAAAAAAAGc/2DFTDor8I_U/s1600/Square.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TJe6DWcIGmI/AAAAAAAAAGc/2DFTDor8I_U/s320/Square.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.CayleyGraph&lt;br /&gt;&amp;gt; let a = p [[1,2,3,4]]&lt;br /&gt;&amp;gt; let b = p [[1,2],[3,4]]&lt;br /&gt;&amp;gt; a^3*b == b*a&lt;br /&gt;True&lt;br /&gt;&amp;gt; cayleyDigraphS (['a','b'],[("aaaa",""),("bb",""),("aaab","ba")])&lt;br /&gt;DG ["","a","aa","aaa","aab","ab","b","ba"] [("","a"),("","b"),("a","aa"),("a","ab"),("aa","aaa"),("aa","aab"),("aaa",""),("aaa","ba"),("aab","aa"),("aab","ab"),("ab","a"),("ab","b"),("b",""),("b","ba"),("ba","aaa"),("ba","aab")]&lt;br /&gt;&amp;gt; cayleyDigraphP [a,b]&lt;br /&gt;DG [[],[[1,2],[3,4]],[[1,2,3,4]],[[1,3],[2,4]],[[1,3]],[[1,4,3,2]],[[1,4],[2,3]],[[2,4]]] [([],[[1,2],[3,4]]),([],[[1,2,3,4]]),([[1,2],[3,4]],[]),([[1,2],[3,4]],[[1,3]]),([[1,2,3,4]],[[1,3],[2,4]]),([[1,2,3,4]],[[2,4]]),([[1,3],[2,4]],[[1,4,3,2]]),([[1,3],[2,4]],[[1,4],[2,3]]),([[1,3]],[[1,4,3,2]]),([[1,3]],[[1,4],[2,3]]),([[1,4,3,2]],[]),([[1,4,3,2]],[[1,3]]),([[1,4],[2,3]],[[1,3],[2,4]]),([[1,4],[2,3]],[[2,4]]),([[2,4]],[[1,2],[3,4]]),([[2,4]],[[1,2,3,4]])]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;These are of course the same Cayley digraph, just with different vertex labels. 
Here's a picture:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TJe6X667ROI/AAAAAAAAAGk/xJQtuxc-j84/s1600/CayleyDigraphD8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TJe6X667ROI/AAAAAAAAAGk/xJQtuxc-j84/s320/CayleyDigraphD8.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;The picture might remind you of something. You can think of a Cayley digraph as a state transition diagram, where the states are the group elements, and the transitions are multiplication (on the right) by g, for each generator g. It might help to think of each edge as being labelled by the generator that caused it.&lt;br /&gt;&lt;br /&gt;A few things to notice.&lt;br /&gt;&lt;br /&gt;First, Cayley digraphs are always regular: the out-degree of each vertex, the number of edges leading out of it, will always equal the number of generators; and similarly for the in-degree, the number of edges leading into each vertex. (Exercise: Prove this.) In fact, we can say more - the graph "looks the same" from any vertex - this follows from the group properties. (Exercise: Explain.)&lt;br /&gt;&lt;br /&gt;Second, notice how some of the edges come in pairs going in opposite directions. Why is that? In this case, it's because one of our generators is its own inverse (which one?) - so if it can take you from x to y, then it can take you back again. In general, whenever our set of generators contains a g such that g^-1 is also in the set, then the edges corresponding to g, g^-1 will come in opposing pairs.&lt;br /&gt;&lt;br /&gt;Given this, we can omit the arrows on the edges if we adopt the convention that whenever we are given a set of generators, their inverses are also implied. In this way, we obtain an undirected or simple graph. Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;toSet = S.toList . 
S.fromList&lt;br /&gt;&lt;br /&gt;-- The Cayley graph on the generators gs *and their inverses*, given relations rs&lt;br /&gt;cayleyGraphS (gs,rs) = graph (vs,es) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;rs' = knuthBendix rs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;vs = L.sort $ nfs (gs,rs') -- all reduced words&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;es = toSet [ L.sort [v,v'] | v &amp;lt;- vs, v' &amp;lt;- nbrs v ] -- toSet orders and removes duplicates&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nbrs v = [rewrite rs' (v ++ [g]) | g &amp;lt;- gs]&lt;br /&gt;&lt;br /&gt;cayleyGraphP gs = graph (vs,es) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;vs = P.elts gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;es = toSet [ L.sort [v,v'] | v &amp;lt;- vs, v' &amp;lt;- nbrs v ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nbrs v = [v * g | g &amp;lt;- gs]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; cayleyGraphS (['a','b'],[("aaaa",""),("bb",""),("aaab","ba")])&lt;br /&gt;G ["","a","aa","aaa","aab","ab","b","ba"] [["","a"],["","aaa"],["","b"],["a","aa"],["a","ab"],["aa","aaa"],["aa","aab"],["aaa","ba"],["aab","ab"],["aab","ba"],["ab","b"],["b","ba"]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TJe6sbmyA9I/AAAAAAAAAGs/Ag01Mhf0ggY/s1600/CayleyGraphD8.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TJe6sbmyA9I/AAAAAAAAAGs/Ag01Mhf0ggY/s320/CayleyGraphD8.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;One important point to note is that the Cayley graph of a group is relative to the generators. For example, we saw last time that the dihedral groups can also be generated by two reflections. 
In the case of D8, we can set r = (1 2)(3 4), s = (1 3).&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/TJe63ckvTEI/AAAAAAAAAG0/BredDjXIG5E/s1600/SquareReflectionAxes.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/TJe63ckvTEI/AAAAAAAAAG0/BredDjXIG5E/s320/SquareReflectionAxes.png" /&gt;&lt;/a&gt;&lt;/div&gt;Before scrolling down, see if you can guess what the Cayley graph looks like. I'll give you a clue: Cayley graphs are always regular - what is the valency of each vertex in this case?&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let r = p [[1,2],[3,4]]&lt;br /&gt;&amp;gt; let s = p [[1,3]]&lt;br /&gt;&amp;gt; (r*s)^4&lt;br /&gt;[]&lt;br /&gt;&amp;gt; cayleyGraphS (['r','s'],[("rr",""),("ss",""),("rsrsrsrs","")])&lt;br /&gt;G ["","r","rs","rsr","rsrs","s","sr","srs"] [["","r"],["","s"],["r","rs"],["rs","rsr"],["rsr","rsrs"],["rsrs","srs"],["s","sr"],["sr","srs"]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/TJe7Ax34VjI/AAAAAAAAAG8/9F490Zq1vzs/s1600/CayleyGraphD8rs.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/TJe7Ax34VjI/AAAAAAAAAG8/9F490Zq1vzs/s320/CayleyGraphD8rs.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;So the point to emphasize is that this graph and the one shown previously are Cayley graphs of the same group. The vertices represent the same elements (considered as permutations). However, because we have taken different sets of generators, we get different edges, and hence different graphs.&lt;br /&gt;&lt;br /&gt;Ok, so what does the Cayley graph tell us about the group? 
Well, as an example, consider the Cayley graph of the Rubik cube group, as generated by the face rotations f, b, l, r, u, d (as defined &lt;a href="http://haskellformaths.blogspot.com/2009/08/how-to-count-number-of-positions-of.html"&gt;previously&lt;/a&gt;). The vertices of the graph can be identified with the possible positions or states of the cube. The group element 1 corresponds to the solved cube. The edges correspond to single moves that can be made on the cube. If someone gives you a scrambled cube to solve, they are asking you to find a path from that vertex of the Cayley graph back to 1.&lt;br /&gt;&lt;br /&gt;Given any graph, and vertices x and y, the distance from x to y is defined as the length of the shortest path from x to y. On the Rubik graph (ie, the Cayley graph of the Rubik cube), the distance from x to 1 is the minimum number of moves needed to solve position x. The HaskellForMaths library provides a distance function on graphs. Thus for example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let graphD8rs = cayleyGraphS (['r','s'],[("rr",""),("ss",""),("rsrsrsrs","")])&lt;br /&gt;&amp;gt; distance graphD8rs "" "rsr"&lt;br /&gt;3&lt;/code&gt;&lt;br /&gt;The distance from 1 to an element g is of course the length of the reduced word for g.&lt;br /&gt;&lt;br /&gt;The diameter of a graph is defined as the maximum distance between vertices. So the diameter of the Rubik graph is the maximum number of moves that are required to solve a scrambled position. It has &lt;a href="http://www.bbc.co.uk/news/technology-10929159"&gt;recently&lt;/a&gt; been shown that this number is twenty, provided a half turn of a face is counted as a single move (that is, provided the half turns are included among the generators).&lt;br /&gt;&lt;br /&gt;For the D8 graph above:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; diameter graphD8rs&lt;br /&gt;4&lt;/code&gt;&lt;br /&gt;The diameter of a Cayley graph is the length of the longest reduced word.&lt;br /&gt;&lt;br /&gt;That's really all I wanted to say for the moment. 
Next time, we'll take some of these ideas further.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-556064135994311522?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/556064135994311522/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/09/cayley-graphs-of-groups.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/556064135994311522'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/556064135994311522'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/09/cayley-graphs-of-groups.html' title='Cayley graphs of groups'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_XQ7FznWBAYE/TJe6DWcIGmI/AAAAAAAAAGc/2DFTDor8I_U/s72-c/Square.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5426688105890175338</id><published>2010-07-18T20:56:00.001+01:00</published><updated>2010-07-18T21:10:10.284+01:00</updated><title type='text'>Group presentations</title><content type='html'>&lt;a href="http://haskellformaths.blogspot.com/2010/05/string-rewriting-and-knuth-bendix.html"&gt;Last time&lt;/a&gt; we looked at string rewriting systems and the Knuth-Bendix completion algorithm. 
The motivation for doing that was to enable us to think about groups in a more abstract way than before.&lt;br /&gt;&lt;br /&gt;The example we looked at last time was the symmetry group of the square. We found that this group could be generated by two elements a, b, satisfying the relations:&lt;br /&gt;a^4 = 1&lt;br /&gt;b^2 = 1&lt;br /&gt;a^3 b = b a&lt;br /&gt;&lt;br /&gt;This way of thinking about the group is called a group presentation, and the usual notation would be:&lt;br /&gt;&amp;lt;a,b | a^4=1, b^2=1, a^3 b = b a&amp;gt;&lt;br /&gt;&lt;br /&gt;In our Haskell code, we represent this as:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;( ['a','b'], [("aaaa",""),("bb",""),("aaab","ba")] )&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We saw how to use the Knuth-Bendix algorithm to turn the relations into a confluent rewrite system:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.StringRewriting&lt;br /&gt;&amp;gt; mapM_ print $ knuthBendix [("aaaa",""),("bb",""),("aaab","ba")]&lt;br /&gt;("bb","")&lt;br /&gt;("bab","aaa")&lt;br /&gt;("baa","aab")&lt;br /&gt;("aba","b")&lt;br /&gt;("aaab","ba")&lt;br /&gt;("aaaa","")&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;The rewrite system itself isn't particularly informative. Its importance lies in what it enables us to do. Given any word in the generators, we reduce it as follows: wherever we can find the left hand side of one of the rules as a subword, we replace it by the right hand side of that rule. If we keep doing this until there are no more matches, then we end up with a normal form for the element - that is, another word in the generators, which represents the same group element, and is the smallest such word relative to the shortlex ordering. Several things follow from this.&lt;br /&gt;&lt;br /&gt;First, the ability to find the shortest word is sometimes useful in itself. 
If we could do this for the Rubik cube group (taking the six face rotations as generators), then we would be able to code "God's algorithm" to find the shortest solution to any given cube position.&lt;br /&gt;&lt;br /&gt;Second, any two words that represent the same group element will reduce to the same normal form. Hence, given any two words in the generators, we can tell whether they represent the same element. This is called "solving the word problem" for the group.&lt;br /&gt;&lt;br /&gt;Third, this enables us to list (the normal forms of) all the elements of the group - and hence, among other things, to count them.&lt;br /&gt;&lt;br /&gt;Fourth, it enables us to do arithmetic in the group:&lt;br /&gt;- To multiply two elements, represented as words w1, w2, just concatenate them to w1++w2, then reduce using the rewrite system&lt;br /&gt;- The identity element of the group is of course the empty word ""&lt;br /&gt;- But what about inverses?&lt;br /&gt;&lt;br /&gt;Strings (lists) under concatenation form a monoid, not a group. So what do we do about inverses?&lt;br /&gt;&lt;br /&gt;Well, one possibility is to include them as additional symbols. So, suppose that our generators are a,b. Then we should introduce additional symbols a&lt;sup&gt;-1&lt;/sup&gt;, b&lt;sup&gt;-1&lt;/sup&gt;, and consider words over the four symbols {a,b,a&lt;sup&gt;-1&lt;/sup&gt;,b&lt;sup&gt;-1&lt;/sup&gt;}. (For brevity, it is customary to use the symbols A, B for a&lt;sup&gt;-1&lt;/sup&gt;, b&lt;sup&gt;-1&lt;/sup&gt;.)&lt;br /&gt;&lt;br /&gt;If we take this approach, then we will need to add some new rules too. We will need the rules a a&lt;sup&gt;-1&lt;/sup&gt; = 1, etc. We will probably also need the "inverses" of the relations in our presentation. For example, if we have a^4 = 1, then we should also have a rule (a&lt;sup&gt;-1&lt;/sup&gt;)^4 = 1.&lt;br /&gt;&lt;br /&gt;It's going to be a bit of a pain. 
(And it's probably going to cause Knuth-Bendix to get indigestion, in some cases at least.) Luckily, for finite groups, we don't really need this. In a finite group, each generator must have finite order: in our example a^4 = 1, b^2 = 1. So the inverse of each generator is itself a power of that generator - a&lt;sup&gt;-1&lt;/sup&gt; = a^3, b&lt;sup&gt;-1&lt;/sup&gt; = b. So for a finite group - or in fact any group where the generators are all of finite order - the inverses are already there, expressible as words in the generators.&lt;br /&gt;&lt;br /&gt;So for most purposes, we have no need to introduce the inverses as new symbols. For example, if we want to list the elements of a finite group, or tell whether two words in the generators represent the same element, then we are fine. Where it does matter is when we are specifically interested in the length of the words. For example, if we want God's algorithm for solving Rubik's cube, we are interested in the length of words in the generators &lt;i&gt;and&lt;/i&gt; their inverses - the clockwise and anti-clockwise face rotations.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: auto;"&gt;&lt;br /&gt;&lt;/div&gt;There is one situation when even this won't matter - and that is if the generators are their own inverses. If we have a generator g such that g^2 = 1, then it follows that g&lt;sup&gt;-1&lt;/sup&gt; = g. Such an element is called an &lt;i&gt;involution&lt;/i&gt;.&lt;br /&gt;&lt;br /&gt;Are there groups which can be generated by involutions alone? Yes, there are. Let's have a look at a couple.&lt;br /&gt;&lt;br /&gt;Consider the symmetry group of a regular polygon, say a pentagon. Consider the two reflections shown below. 'a' is the reflection in an axis through a vertex, and 'b' is the reflection in an axis through the midpoint of an adjacent edge. 
Hence the angle between the axes is pi/5 (or for an n-gon, pi/n).&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TENZM9Ri4wI/AAAAAAAAAF8/9DkYD_7gU8k/s1600/pentagonreflections.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TENZM9Ri4wI/AAAAAAAAAF8/9DkYD_7gU8k/s320/pentagonreflections.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It should be clear that ab is a 1/5 rotation of the pentagon. It follows that a,b generate the symmetry group of the pentagon, with a^2 = b^2 = (ab)^5 =1.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; elts (['a','b'], [("aa",""), ("bb",""), ("ababababab","")])&lt;br /&gt;["","a","b","ab","ba","aba","bab","abab","baba","ababa"]&lt;br /&gt;&amp;gt; length it&lt;br /&gt;10&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Next, consider the symmetric group S4 (or in general, Sn). It can be generated by the transpositions s1 = (1 2), s2 = (2 3), and s3 = (3 4), which correspond to the diagrams below:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/TENZehvkKCI/AAAAAAAAAGE/VtXtfJgojjA/s1600/s4a.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="46" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/TENZehvkKCI/AAAAAAAAAGE/VtXtfJgojjA/s400/s4a.png" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Now, multiplication in the group corresponds to concatenation of the diagrams, going down the page. 
For example:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/TENZzs__-eI/AAAAAAAAAGM/JlCoJSe8pzI/s1600/s4b.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="171" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/TENZzs__-eI/AAAAAAAAAGM/JlCoJSe8pzI/s640/s4b.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;In fact, each of those diagrams represents the identity element - as you can check by following each point along the lines down the page, and checking that it ends up where it started. Hence each diagram represents a relation for S4. The diagrams show that s1^2 = 1, (s1s2)^3=1, and (s1s3)^2 = 1.&lt;br /&gt;&lt;br /&gt;In the general case, it's clear that Sn can be generated by n-1 transpositions s&lt;sub&gt;i&lt;/sub&gt; of the form (i i+1), and that they satisfy the following relations:&lt;br /&gt;si^2 = 1&lt;br /&gt;(si sj)^3 = 1 if |i-j| = 1&lt;br /&gt;(si sj)^2 = 1 if |i-j| &amp;gt; 1&lt;br /&gt;&lt;br /&gt;Here's some Haskell code to construct these presentations of Sn. 
(Did I mention that all of the string rewriting code works on arbitrary lists, not just strings?)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;newtype S = S Int deriving (Eq,Ord)&lt;br /&gt;&lt;br /&gt;instance Show S where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;show (S i) = "s" ++ show i&lt;br /&gt;&lt;br /&gt;_S n = (gs, r ++ s ++ t) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;gs = map S [1..n-1]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;r = [([S i, S i],[]) | i &amp;lt;- [1..n-1]]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;s = [(concat $ replicate 3 [S i, S (i+1)],[]) | i &amp;lt;- [1..n-2]]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;t = [([S i, S j, S i, S j],[]) | i &amp;lt;- [1..n-1], j &amp;lt;- [i+2..n-1]]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;And just to check:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; _S 4&lt;br /&gt;([s1,s2,s3],[([s1,s1],[]),([s2,s2],[]),([s3,s3],[]),([s1,s2,s1,s2,s1,s2],[]),([s2,s3,s2,s3,s2,s3],[]),([s1,s3,s1,s3],[])])&lt;br /&gt;&amp;gt; elts $ _S 4&lt;br /&gt;[[],[s1],[s2],[s3],[s1,s2],[s1,s3],[s2,s1],[s2,s3],[s3,s2],[s1,s2,s1],[s1,s2,s3],[s1,s3,s2],[s2,s1,s3],[s2,s3,s2],[s3,s2,s1],[s1,s2,s1,s3],[s1,s2,s3,s2],[s1,s3,s2,s1],[s2,s1,s3,s2],[s2,s3,s2,s1],[s1,s2,s1,s3,s2],[s1,s2,s3,s2,s1],[s2,s1,s3,s2,s1],[s1,s2,s1,s3,s2,s1]]&lt;br /&gt;&amp;gt; length it&lt;br /&gt;24&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Anyway, that's it for now. 
Where I'm heading with this stuff is finite reflection groups and Coxeter groups, but I might take a couple of detours along the way.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5426688105890175338?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5426688105890175338/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/07/group-presentations.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5426688105890175338'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5426688105890175338'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/07/group-presentations.html' title='Group presentations'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_XQ7FznWBAYE/TENZM9Ri4wI/AAAAAAAAAF8/9DkYD_7gU8k/s72-c/pentagonreflections.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-3444459781535837747</id><published>2010-05-28T22:08:00.001+01:00</published><updated>2010-05-29T08:38:53.850+01:00</updated><title type='text'>String rewriting and Knuth-Bendix completion</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: auto;"&gt;Previously in this blog we have been looking at symmetry groups of combinatorial 
structures. We have represented these symmetries concretely as permutations - for example, symmetries of graphs as permutations of their vertices. However, mathematicians tend to think about groups more abstractly.&lt;/div&gt;&lt;br /&gt;Consider the symmetry group of the square (the cyclic graph C4). It can be generated by two permutations:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.PermutationGroup&lt;br /&gt;&amp;gt; let a = p [[1,2,3,4]]&lt;br /&gt;&amp;gt; let b = p [[1,2],[3,4]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/TAAvwBoHKVI/AAAAAAAAAF0/AhwVdfdCjtM/s1600/squaresyms.GIF" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="197" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/TAAvwBoHKVI/AAAAAAAAAF0/AhwVdfdCjtM/s400/squaresyms.GIF" width="400" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We can list various relations that are satisfied by these generators:&lt;br /&gt;a^4 = 1&lt;br /&gt;b^2 = 1&lt;br /&gt;a^3 b = b a&lt;br /&gt;&lt;br /&gt;Of course, there are other relations that hold between the generators. However, the relations above are in fact sufficient to uniquely identify the group (up to isomorphism).&lt;br /&gt;&lt;br /&gt;Since a and b generate the group, any element in the group can be expressed as a product of a's and b's (and also their inverses, but we'll ignore that for now). However, there are of course an infinite number of such expressions, but only a finite number of group elements, so many of these expressions must represent the same element. For example, since b^2=1, then abba represents the same element as aa.&lt;br /&gt;&lt;br /&gt;Given two expressions, it would obviously be helpful to have a method for telling whether they represent the same group element. What we need is a string rewriting system. We can think of expressions in the generators as words in the symbols 'a' and 'b'. 
And we can reinterpret the relations above as rewrite rules:&lt;br /&gt;"aaaa" -&amp;gt; ""&lt;br /&gt;"bb" -&amp;gt; ""&lt;br /&gt;"aaab" -&amp;gt; "ba"&lt;br /&gt;&lt;br /&gt;Each of these rules consists of a left hand side and a right hand side. Given any word in the generator symbols, if we find the left hand side anywhere in the word, we can replace it by the right hand side. For example, in the word "abba", we can apply the rule "bb" -&amp;gt; "", giving "aa".&lt;br /&gt;&lt;br /&gt;So, the idea is that given any word in the generator symbols, we repeatedly apply the rewrite rules until we can go no further. The hope is that if two words represent the same group element, then we will end up with the same word after rewriting. We'll see later that there's a bit more to do before that will work, but for the moment, let's at least write some Haskell code to do the string rewriting.&lt;br /&gt;&lt;br /&gt;So the first thing we are going to need to do is try to find the left hand side of a rule as a subword within a word. 
Actually, we want to do a bit more than that - if X is our word, and Y the subword, then we want to find the A and B such that X = AYB.&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;import qualified Data.List as L&lt;br /&gt;&lt;br /&gt;splitSubstring xs ys = splitSubstring' [] xs where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;splitSubstring' ls [] = Nothing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;splitSubstring' ls (r:rs) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;if ys `L.isPrefixOf` (r:rs)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;then Just (reverse ls, drop (length ys) (r:rs))&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;else splitSubstring' (r:ls) rs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then if our rewrite rule is L -&amp;gt; R, then a single application of the rule consists in replacing L by R within the word:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;rewrite1 (l,r) xs =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;case xs `splitSubstring` l of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Nothing -&amp;gt; Nothing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Just (a,b) -&amp;gt; Just (a++r++b)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Okay, so suppose we have a rewrite system (that is, a collection of rewrite rules), and a word. 
Then we want to repeatedly apply the rules until we find that no rule applies:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;rewrite rules word = rewrite' rules word where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;rewrite' (r:rs) xs =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;case rewrite1 r xs of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Nothing -&amp;gt; rewrite' rs xs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Just ys -&amp;gt; rewrite' rules ys&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;rewrite' [] xs = xs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.StringRewriting&lt;br /&gt;&amp;gt; rewrite [("aaaa",""),("bb",""),("aaab","ba")] "abba"&lt;br /&gt;"aa"&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So far, so good. However, there are some problems with the rewrite system that we constructed above. Suppose that the word we wanted to reduce was "aaabb".&lt;br /&gt;If we apply the rule "aaab" -&amp;gt; "ba", then we have "aaabb" -&amp;gt; "bab".&lt;br /&gt;However, if we apply the rule "bb" -&amp;gt; "", then we have "aaabb" -&amp;gt; "aaa".&lt;br /&gt;Neither "bab" nor "aaa" reduces any further. So we have two problems:&lt;br /&gt;- The same starting word can end up at different end words, depending on the order in which we apply the rules&lt;br /&gt;- We can see from the example that the words "bab" and "aaa" actually represent the same element in our group, but our rewrite system can't rewrite either of them&lt;br /&gt;&lt;br /&gt;What can we do about this? Well here's an idea. Let's just add "bab" -&amp;gt; "aaa" as a new rewrite rule to our system. 
We know that they are equal as elements of the group, so this is a valid thing to do.&lt;br /&gt;&lt;br /&gt;That's good, but we still have problems. What about the word "aaaab"?&lt;br /&gt;If we apply the rule "aaaa" -&amp;gt; "", then "aaaab" -&amp;gt; "b"&lt;br /&gt;On the other hand, if we apply the rule "aaab" -&amp;gt; "ba", then "aaaab" -&amp;gt; "aba"&lt;br /&gt;&lt;br /&gt;So let's do the same again, and add a new rule "aba" -&amp;gt; "b".&lt;br /&gt;&lt;br /&gt;What we're doing here is called the Knuth-Bendix algorithm. Let's take a step back. So in each case, I found a word that could be reduced in two different ways. How did I do that? Well, what I was looking for is two rules with overlapping left hand sides. That is, I was looking for rules L1 -&amp;gt; R1, L2 -&amp;gt; R2, with&lt;br /&gt;L1 = AB&lt;br /&gt;L2 = BC&lt;br /&gt;A pair of rules like this is called a critical pair. If we can find a critical pair, then by looking at the word ABC, we see that&lt;br /&gt;ABC = (AB)C = L1 C -&amp;gt; R1 C&lt;br /&gt;ABC = A(BC) = A L2 -&amp;gt; A R2&lt;br /&gt;So we are justified in adding a new rule R1 C -&amp;gt; A R2&lt;br /&gt;&lt;br /&gt;So the Knuth-Bendix algorithm basically says, for each critical pair, introduce a new rule, until there are no more critical pairs. There's a little bit more to it than that:&lt;br /&gt;- We want the rewrite system to &lt;i&gt;reduce&lt;/i&gt; the word. That means that we want an ordering on words, and given a pair, we want to make them into a rule that takes the greater to the lesser, rather than vice versa. The most obvious ordering to use is called shortlex: take longer words to shorter words, and if the lengths are equal, use alphabetical ordering.&lt;br /&gt;- Whenever we introduce a new rule, it might be that the left hand side of some existing rule becomes reducible. 
In that case, the existing rule becomes redundant, since any word that it would reduce can now be reduced by the new rule.&lt;br /&gt;&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;-- given two strings x,y, find if possible a,b,c with x=ab y=bc&lt;br /&gt;findOverlap xs ys = findOverlap' [] xs ys where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;findOverlap' as [] cs = Nothing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;findOverlap' as (b:bs) cs =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;if (b:bs) `L.isPrefixOf` cs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;then Just (reverse as, b:bs, drop (length (b:bs)) cs)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;else findOverlap' (b:as) bs cs&lt;br /&gt;&lt;br /&gt;shortlex x y = compare (length x, x) (length y, y)&lt;br /&gt;&lt;br /&gt;ordpair x y =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;case shortlex x y of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;LT -&amp;gt; Just (y,x)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;EQ -&amp;gt; Nothing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;GT -&amp;gt; Just (x,y)&lt;br /&gt;&lt;br /&gt;knuthBendix1 rules = knuthBendix' rules pairs where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;pairs = [(lri,lrj) | lri &amp;lt;- rules, lrj &amp;lt;- rules, lri /= lrj]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;knuthBendix' rules [] = rules&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;knuthBendix' rules ( ((li,ri),(lj,rj)) : ps) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;case findOverlap li lj of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Nothing -&amp;gt; knuthBendix' rules ps&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Just (a,b,c) -&amp;gt; case ordpair (rewrite rules (ri++c)) (rewrite rules (a++rj)) of&lt;br /&gt;&amp;nbsp;&amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Nothing -&amp;gt; knuthBendix' rules ps -- they both reduce to the same thing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;Just rule' -&amp;gt; let rules' = reduce rule' rules&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;ps' = ps ++&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;[(rule',rule) | rule &amp;lt;- rules'] ++&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;[(rule,rule') | rule &amp;lt;- rules']&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in knuthBendix' (rule':rules') ps'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;reduce rule@(l,r) rules = filter (\(l',r') -&amp;gt; not (L.isInfixOf l l')) rules&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; knuthBendix1 [("aaaa",""), ("bb",""), ("aaab","ba")]&lt;br 
/&gt;[("baa","aab"),("bab","aaa"),("aba","b"),("aaaa",""),("bb",""),("aaab","ba")]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;A few words about the Knuth-Bendix algorithm:&lt;br /&gt;- It is not guaranteed to terminate. Every time we introduce a new rule, we have the potential to create new critical pairs, and there are pathological examples where this goes on forever.&lt;br /&gt;- The algorithm can be made slightly more efficient, by doing things like choosing to process shorter critical pairs first. In the HaskellForMaths library, a more efficient version is given, called simply "knuthBendix".&lt;br /&gt;&lt;br /&gt;Back to the example. So Knuth-Bendix has found three new rules. In the full system, with these new rules added, every critical pair resolves: whichever of the two overlapping rules we apply first, we can carry on reducing to the same word. As a consequence, it is a confluent rewrite system - meaning that if you start at some given word, and reduce it using the rules, then it doesn't matter in what order you apply the rules, you will always end up at the same word. This word that you end up with can therefore be used as a normal form.&lt;br /&gt;&lt;br /&gt;This allows us to "solve the word problem" for this group. That is, given any two words in the generator symbols, we can find out whether they represent the same group element by rewriting them both, and seeing if they end up at the same normal form. For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let rules = knuthBendix [("aaaa",""), ("bb",""), ("aaab","ba")]&lt;br /&gt;&amp;gt; rewrite rules "aaaba"&lt;br /&gt;"aab"&lt;br /&gt;&amp;gt; rewrite rules "baabb"&lt;br /&gt;"aab"&lt;br /&gt;&amp;gt; rewrite rules "babab"&lt;br /&gt;"b"&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So we see that "aaaba" and "baabb" represent the same group element, whereas "babab" represents a different one. (If you want, you could go back and check this using the original permutations.)&lt;br /&gt;&lt;br /&gt;We can even list (the normal forms of) all elements of the group. 
What we do is start with the empty word (which represents the identity element of the group), and then incrementally build longer and longer words. At each stage, we look at all combinations that can be formed by prepending a generator symbol to a word from the preceding stage. However, if we ever come across a word which can be reduced, then we know that it - and any word that could be formed from it at a later stage - is not a normal form, and so can be discarded. (Since the words from the preceding stage are already irreducible, a left hand side can only occur in the new word starting at its first symbol, which is why the code only needs to test prefixes.) Here's the code:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;nfs (gs,rs) = nfs' [[]] where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nfs' [] = [] -- we have run out of words&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;nfs' ws = let ws' = [g:w | g &amp;lt;- gs, w &amp;lt;- ws, (not . isNewlyReducible) (g:w)]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in ws ++ nfs' ws'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;isNewlyReducible w = any (`L.isPrefixOf` w) (map fst rs)&lt;br /&gt;&lt;br /&gt;elts (gs,rs) = nfs (gs, knuthBendix rs)&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; elts (['a','b'], [("aaaa",""), ("bb",""), ("aaab","ba")])&lt;br /&gt;["","a","b","aa","ab","ba","aaa","aab"]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;As expected, we have eight elements.&lt;br /&gt;&lt;br /&gt;That's enough for now. 
Next time (hopefully) I'll look at some more examples.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/3444459781535837747/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/05/string-rewriting-and-knuth-bendix.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3444459781535837747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3444459781535837747'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/05/string-rewriting-and-knuth-bendix.html' title='String rewriting and Knuth-Bendix completion'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_XQ7FznWBAYE/TAAvwBoHKVI/AAAAAAAAAF0/AhwVdfdCjtM/s72-c/squaresyms.GIF' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5516831584764881753</id><published>2010-04-25T20:55:00.000+01:00</published><updated>2010-04-25T20:55:58.516+01:00</updated><title type='text'>Block systems and block homomorphism</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: 
both; text-align: auto;"&gt;Recently in this blog, we looked at the strong generating set (SGS) algorithm for permutation groups, and how we can use it to investigate the structure of groups. Last time, we saw how to partially "factor" intransitive groups, using the transitive constituent homomorphism. (Recall that by "factoring" a group G, we mean finding a proper normal subgroup K, and consequently also a quotient group G/K - which is equivalent to finding a proper homomorphism from G.) This time, I want to do the same for imprimitive groups. So, what is an imprimitive group?&lt;/div&gt;&lt;br /&gt;Well, given a permutation group acting on a set X, it can happen that X consists of "blocks" Y1, Y2, ... of points which always "move together". That is, a subset Y of X is a block if for all g in G, Y^g (the image of Y under the action of g) is either equal to Y or disjoint from it. A full set of blocks (that is, blocks Y1, Y2, ... which are disjoint, and whose union is the whole of X) is called a block system.&lt;br /&gt;&lt;br /&gt;For example, suppose that X is the vertices of the hexagon. The symmetry group of the hexagon is the dihedral group D12, generated by a rotation and a reflection:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.Subquotients&lt;br /&gt;&amp;gt; mapM_ print $ _D 12&lt;br /&gt;[[1,2,3,4,5,6]]&lt;br /&gt;[[1,6],[2,5],[3,4]]&lt;/code&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9SbaTA-AqI/AAAAAAAAAFU/s4qYvoXmDZs/s1600/hexagon.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9SbaTA-AqI/AAAAAAAAAFU/s4qYvoXmDZs/s320/hexagon.png" /&gt;&lt;/a&gt;&lt;br /&gt;A block system for the hexagon is shown below. The blocks are the pairs of opposite vertices. 
You can verify that they satisfy the definition of blocks: any symmetry must take a pair of opposite points either to itself, or to another pair disjoint from it.&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/S9Sbh1EjrNI/AAAAAAAAAFc/2XyS8SkJKWc/s1600/hexagon14.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/S9Sbh1EjrNI/AAAAAAAAAFc/2XyS8SkJKWc/s320/hexagon14.png" /&gt;&lt;/a&gt;&lt;br /&gt;A given group can have more than one block system. Here is another block system for the hexagon. The blocks are the two equilateral triangles formed by the vertices.&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9Sbl3gM5eI/AAAAAAAAAFk/sNxzgnmKXn8/s1600/hexagon135.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9Sbl3gM5eI/AAAAAAAAAFk/sNxzgnmKXn8/s320/hexagon135.png" /&gt;&lt;/a&gt;&lt;br /&gt;There are also the trivial block systems, consisting of either just one block containing all the points, or a block for each point. From now on, we will silently exclude these.&lt;br /&gt;&lt;br /&gt;So, I was meant to be telling you what an imprimitive group is. Well, it's just a group which has a non-trivial block system. Conversely, a primitive group is one which has no non-trivial block system.&lt;br /&gt;&lt;br /&gt;When we have an imprimitive group, we will be able to form a homomorphism - and hence factor the group - by considering the induced action of the group on the blocks. But I'm jumping ahead. First we need to write some Haskell code - to find block systems.&lt;br /&gt;&lt;br /&gt;The idea is to write a function that, given a pair of points Y = {y1,y2} in X (or indeed any subset Y of X), can find the smallest block containing Y. The way it works is as follows. We start by supposing that each point is in a block of its own, except for the points in Y. 
We initialise a map, with the points in X as keys, and the blocks as values, where we represent a block by its least element.&lt;br /&gt;&lt;br /&gt;Now, suppose that we currently think that the minimal block is Y = {y1,y2,...}. What we're going to do is work through the elements of Y, and work through the generators of G, trying to find a problem. So suppose that we have got as far as some element y of Y, and some generator g of G. We know that y is in the same block as y1, and what we have to check is that y^g is in the same block as y1^g. So we look up their representatives in the map, and check that they're the same. If they're not, then we need to merge the two classes. Here's the code (it's a little opaque, but it's basically doing what I just described).&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;minimalBlock gs ys@(y1:yt) = minimalBlock' p yt gs where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;xs = foldl union [] $ map supp gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;p = M.fromList $ [(yi,y1) | yi &amp;lt;- ys] ++ [(x,x) | x &amp;lt;- xs \\ ys]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;minimalBlock' p (q:qs) (h:hs) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;let r = p M.! q &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; -- representative of class containing q&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;k = p M.! (q .^ h) &amp;nbsp;-- rep of class (q^h)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;l = p M.! 
(r .^ h) &amp;nbsp;-- rep of class (r^h)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in if k /= l -- then we need to merge the classes&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; then let p' = M.map (\x -&amp;gt; if x == l then k else x) p&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;qs' = qs ++ [l]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in minimalBlock' p' (q:qs') hs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; else minimalBlock' p (q:qs) hs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;minimalBlock' p (q:qs) [] = minimalBlock' p qs gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;minimalBlock' p [] _ =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;let reps = toListSet $ M.elems p&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in L.sort [ filter (\x -&amp;gt; p M.! x == r) xs | r &amp;lt;- reps ]&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Once we have this function, then finding the block systems is simple - just take each pair {x1,xi} from X, and find the minimal block containing it.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;blockSystems gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| isTransitive gs = toListSet $ filter (/= [x:xs]) $ map (minimalBlock gs) [ [x,x'] | x' &amp;lt;- xs ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| otherwise = error "blockSystems: not transitive"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where x:xs = foldl union [] $ map supp gs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;If we have an SGS for G, then we can do slightly better. For suppose that within the stabiliser G&lt;sub&gt;x1&lt;/sub&gt;, there is an element taking xi to xj. Then clearly xi and xj must be in the same minimal block. 
So in fact, we need only consider pairs {x1,xi}, with xi the minimal element of each orbit in G&lt;sub&gt;x1&lt;/sub&gt;. (Of course, the point is that if we have an SGS for G, then we can trivially list a set of generators for G&lt;sub&gt;x1&lt;/sub&gt;.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;blockSystemsSGS gs = toListSet $ filter (/= [x:xs]) $ map (minimalBlock gs) [ [x,x'] | x' &amp;lt;- rs ]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where x:xs = foldl union [] $ map supp gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;hs = filter (\g -&amp;gt; x &amp;lt; minsupp g) gs -- sgs for stabiliser Gx&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;os = orbits hs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;rs = map head os ++ (xs \\ L.sort (concat os)) -- orbit representatives, including singleton cycles&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Let's test it:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ blockSystems $ _D 12&lt;br /&gt;[[1,3,5],[2,4,6]]&lt;br /&gt;[[1,4],[2,5],[3,6]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Okay, so given a group, we can find its non-trivial block systems, if any. What next? Well, as I hinted earlier, this enables us to factor the group. For if there is a non-trivial block system, then the action of the group on the points induces a well-defined action on the blocks. This induced action gives us a homomorphism from our original group G, a subgroup of Sym(X), to another group H, a subgroup of Sym(B), where B is the set of blocks.&lt;br /&gt;&lt;br /&gt;So as we did &lt;a href="http://haskellformaths.blogspot.com/2010/03/transitive-constituent-homomorphism.html"&gt;last time&lt;/a&gt;, we can find the kernel and image of the homomorphism, and thus factor the group. How do we do that?&lt;br /&gt;&lt;br /&gt;Well, it's simple. 
In the following code, the function lr takes a group element acting on the points, and returns a group element acting on both the blocks (tagged Left) and the points (tagged Right), combined in a single Either type. If we do this to all the group generators and then find an SGS, then, because the Left blocks sort before the Right points, the SGS will split neatly into two parts:&lt;br /&gt;- The initial segment of the SGS will consist of elements which move the Left blocks. If we restrict their action to just the blocks, we will have an SGS for the image of the homomorphism, acting on the blocks.&lt;br /&gt;- The final segment of the SGS will consist of elements which fix all the Left blocks. These elements move points but not blocks, so they form an SGS for the kernel of the homomorphism.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;blockHomomorphism' gs bs = (ker,im) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;gs' = sgs $ map lr gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;lr g = fromPairs $ [(Left b, Left $ b -^ g) | b &amp;lt;- bs] ++ [(Right x, Right y) | (x,y) &amp;lt;- toPairs g]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;ker = map unRight $ dropWhile (isLeft . minsupp) gs' -- stabiliser of the blocks&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;im = map restrictLeft $ takeWhile (isLeft . 
minsupp) gs' -- restriction to the action on blocks&lt;br /&gt;&lt;br /&gt;blockHomomorphism gs bs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| bs == closure bs [(-^ g) | g &amp;lt;- gs] -- validity check on bs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;= blockHomomorphism' gs bs&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Let's try it out on our two block systems for the hexagon:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; blockHomomorphism (_D 12) [[1,4],[2,5],[3,6]]&lt;br /&gt;([[[1,4],[2,5],[3,6]]],&lt;br /&gt;&amp;nbsp;[[[[1,4],[2,5],[3,6]]],[[[2,5],[3,6]]]])&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;I've formatted the output for clarity. The first line is (an SGS for) the kernel, consisting of elements of D12 which permute points within the blocks, without permuting the blocks. In this case, the kernel is generated by the 180 degree rotation, which swaps the points within each pair. The second line is (an SGS for) the image, consisting of the induced action of D12 on the blocks. In this case, we have the full permutation group S3 acting on the three pairs of points.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; blockHomomorphism (_D 12) [[1,3,5],[2,4,6]]&lt;br /&gt;([[[1,5,3],[2,6,4]],[[2,6],[3,5]]],&lt;br /&gt;&amp;nbsp;[[[[1,3,5],[2,4,6]]]])&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;In this case the kernel is generated by a 120 degree rotation and a reflection, and consists of all group elements which send odd points to odd and even points to even, thus preserving the blocks. The image has only one non-trivial element, which just swaps the two blocks.&lt;br /&gt;&lt;br /&gt;Armed with this new tool, let's have another look at Rubik's cube. 
Recall that we labelled the faces of the cube as follows:&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9Sbric1KNI/AAAAAAAAAFs/wGAb_LcSOwo/s1600/rubik.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/S9Sbric1KNI/AAAAAAAAAFs/wGAb_LcSOwo/s320/rubik.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Last time, we split the Rubik cube group into two homomorphic images - a group acting on just the corner faces, and a group acting on just the edge faces. Let's look for block systems in these groups:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Projects.Rubik&lt;br /&gt;&amp;gt; let [cornerBlocks] = blockSystems imCornerFaces&lt;br /&gt;&amp;gt; let [edgeBlocks] = blockSystems imEdgeFaces&lt;br /&gt;&amp;gt; cornerBlocks&lt;br /&gt;[[1,17,23],[3,19,41],[7,29,31],[9,33,47],[11,21,53],[13,43,51],[27,37,59],[39,49,57]]&lt;br /&gt;&amp;gt; edgeBlocks&lt;br /&gt;[[2,18],[4,26],[6,44],[8,32],[12,52],[14,22],[16,42],[24,56],[28,34],[36,48],[38,58],[46,54]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It's obvious really - in the corner group, we have a block system with blocks consisting of the three corner faces that belong to the same corner piece, and in the edge group, we have a block system with blocks consisting of the two edge faces that belong to the same edge piece. 
Furthermore, these are the only block systems.&lt;br /&gt;&lt;br /&gt;So we can form the kernel and image under the block homomorphism:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let (kerCornerBlocks,imCornerBlocks) = blockHomomorphism imCornerFaces cornerBlocks&lt;br /&gt;&amp;gt; let (kerEdgeBlocks,imEdgeBlocks) = blockHomomorphism imEdgeFaces edgeBlocks&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;If we look at the sizes of these groups, the structure will be obvious:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; orderSGS kerCornerBlocks&lt;br /&gt;2187&lt;br /&gt;&amp;gt; orderSGS imCornerBlocks&lt;br /&gt;40320&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;These are 3^7 and 8! respectively. The kernel consists of the permutations of the corner faces which leave the corner blocks where they are. It turns out that whenever you twist one corner block, you must untwist another. So when you have decided what to do with seven corners, the eighth is determined - hence 3^7. For the image, we have eight blocks, and 8! permutations of them, so this must be the full symmetric group S8 - meaning that we can perform any rearrangement of the corner blocks that is desired.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; orderSGS kerEdgeBlocks&lt;br /&gt;2048&lt;br /&gt;&amp;gt; orderSGS imEdgeBlocks&lt;br /&gt;479001600&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;These are 2^11 and 12! respectively. For the kernel, whenever we flip one edge piece we must also flip another. So when we have decided what to do with eleven edges, the twelfth is determined - hence 2^11. For the image, we have twelve pieces, and 12! 
permutations of them, so we have the full symmetry group S12 on edge blocks.&lt;br /&gt;&lt;br /&gt;That's it.&lt;br /&gt;&lt;br /&gt;Incidentally, my references for this material are:&lt;br /&gt;- Holt, Handbook of Computational Group Theory&lt;br /&gt;- Seress, Permutation Group Algorithms&lt;br /&gt;both of which are very good - but expensive.&lt;br /&gt;&lt;br /&gt;These books, particularly the latter, go on to describe further algorithms that can be used to factor even transitive primitive groups, enabling us to arrive at a full decomposition of a group into simple groups. Unfortunately, the algorithms get a bit more complicated after this, and I haven't yet implemented the rest in HaskellForMaths.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5516831584764881753?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5516831584764881753/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/04/block-systems-and-block-homomorphism.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5516831584764881753'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5516831584764881753'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/04/block-systems-and-block-homomorphism.html' title='Block systems and block homomorphism'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_XQ7FznWBAYE/S9SbaTA-AqI/AAAAAAAAAFU/s4qYvoXmDZs/s72-c/hexagon.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-8118862860927630860</id><published>2010-03-23T20:44:00.000Z</published><updated>2010-03-23T20:44:17.984Z</updated><title type='text'>Transitive constituent homomorphism</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: auto;"&gt;[New version HaskellForMaths 0.2.1 available &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;here&lt;/a&gt;]&lt;/div&gt;&lt;br /&gt;Last time, we looked at what it means to be a strong generating set (SGS) for a permutation group, and how to find one. We saw that with an SGS we can easily calculate the number of elements in the group, and test whether an arbitrary permutation is a member of the group.&lt;br /&gt;&lt;br /&gt;This time, I want to start showing how strong generating sets can be used as the basis for further algorithms to investigate the structure of groups. In a &lt;a href="http://haskellformaths.blogspot.com/2009/10/simple-groups-atoms-of-symmetry.html"&gt;previous post&lt;/a&gt;, I explained about normal subgroups and quotient groups. Over the next couple of posts, I want to show how, for groups having the appropriate form, we can find some of the easier normal subgroups and quotient groups. Specifically, in this post I want to look at intransitive groups, and in the next at imprimitive groups.&lt;br /&gt;&lt;br /&gt;Many of the groups we have looked at so far - such as the symmetries of the pentagon, cube, and so on - have been "transitive" - meaning that given any two points, there is a group element which takes one to the other. 
(Or to put it another way, all the points "look the same".)&lt;br /&gt;&lt;br /&gt;However, it is easy to find groups that are not transitive on their points. For example, if we take any graph which is not regular (that is, it has vertices of different valencies), then the symmetry group is not transitive (because no symmetry can take a vertex of one valency to a vertex of another valency).&lt;br /&gt;&lt;br /&gt;In order to demonstrate the code that I want to look at this week, I'm going to need an example - so here are the generators of a group G:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.Subquotients&lt;br /&gt;&amp;gt; let gs = map p [ [[1,2,3]], [[4,5,6]], [[1,2],[4,5]] ]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We can think of this group as the symmetry group of some kind of plastic puzzle, as shown below. The puzzle consists of a blue and a red triangle. Valid moves consist of rotating the blue triangle, rotating the red triangle, or performing a double flip, which swaps a pair of blue points and at the same time a pair of red points.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/_XQ7FznWBAYE/S6kicceiFgI/AAAAAAAAAFE/lXbSLLO3JOE/s1600-h/oddtriangles.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/S6kicceiFgI/AAAAAAAAAFE/lXbSLLO3JOE/s320/oddtriangles.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It's a small group, so let's have a look at the elements:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ elts gs&lt;br /&gt;[]&lt;br /&gt;[[1,2],[4,5]]&lt;br /&gt;[[1,2],[4,6]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,2,3]]&lt;br /&gt;[[1,2,3],[4,5,6]]&lt;br /&gt;[[1,2,3],[4,6,5]]&lt;br /&gt;[[1,3,2]]&lt;br /&gt;[[1,3,2],[4,5,6]]&lt;br /&gt;[[1,3,2],[4,6,5]]&lt;br /&gt;[[1,3],[4,5]]&lt;br /&gt;[[1,3],[4,6]]&lt;br /&gt;[[1,3],[5,6]]&lt;br /&gt;[[2,3],[4,5]]&lt;br /&gt;[[2,3],[4,6]]&lt;br 
/&gt;[[2,3],[5,6]]&lt;br /&gt;[[4,5,6]]&lt;br /&gt;[[4,6,5]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Notice that the group is not transitive: there is no move that takes a blue point to a red point, or vice versa.&lt;br /&gt;&lt;br /&gt;So we have a group G, acting on a set X. If G is not transitive on X, then we can write X as a disjoint union of subsets A, B, C, ..., such that G is transitive on each of the subsets. In our example, we can take A = [1,2,3], B = [4,5,6], and we see that G is transitive on A, and on B. A and B are called the transitive constituents (or the orbits) of G.&lt;br /&gt;&lt;br /&gt;It's fairly straightforward to write Haskell code to find the orbits of a permutation group. This allows us to find:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; orbits gs&lt;br /&gt;[[1,2,3],[4,5,6]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;[The implementation is left as an exercise.]&lt;br /&gt;&lt;br /&gt;Okay, so what's so interesting about intransitive groups? Well, an intransitive group always has a non-trivial normal subgroup, so we can always "factor" it into smaller groups. How?&lt;br /&gt;&lt;br /&gt;Well, take one of the transitive constituents. In our example, let's take [1,2,3]. Then we can consider the restriction of the action of the group to just this constituent. For our generators, the restriction gives us:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[[1,2,3]] -&amp;gt; [[1,2,3]]&lt;br /&gt;[[4,5,6]] -&amp;gt; []&lt;br /&gt;[[1,2],[4,5]] -&amp;gt; [[1,2]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It might look as if the first generator got taken to itself - but it is important to realise that these are elements of different groups. 
This would have been clearer if we had written out the permutations in row notation:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;1 2 3 4 5 6 -&amp;gt; 1 2 3&lt;br /&gt;2 3 1 4 5 6 &amp;nbsp; &amp;nbsp;2 3 1&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Or if we had included singleton cycles in the cycle notation:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[[1,2,3],[4],[5],[6]] -&amp;gt; [[1,2,3]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So this restriction of the action is a map from our group - a subgroup of Sym([1..6]) - to another group - a subgroup of Sym([1..3]).&lt;br /&gt;&lt;br /&gt;If we call this map f, then it should be obvious that f has the following properties:&lt;br /&gt;- f(1) = 1 (where 1 here means the identity element in the respective groups, not the point 1)&lt;br /&gt;- f(g*h) = f(g)*f(h) (consider their action on a point x)&lt;br /&gt;- f(g^-1) = f(g)^-1&lt;br /&gt;&lt;br /&gt;Hence f is a homomorphism - a function between groups that preserves the group structure.&lt;br /&gt;&lt;br /&gt;The kernel of a homomorphism is the set {g &amp;lt;- G | f(g) = 1}. The kernel is always a normal subgroup:&lt;br /&gt;It's obviously a subgroup. (Why?)&lt;br /&gt;And f(h^-1 g h) = f(h)^-1 f(g) f(h) (by the homomorphism properties).&lt;br /&gt;So if f(g) = 1, then f(h^-1 g h) = 1.&lt;br /&gt;In other words, if g is in the kernel, then so is h^-1 g h - which is just the condition for the kernel to be a normal subgroup.&lt;br /&gt;&lt;br /&gt;The image of a homomorphism is the set {f(g) | g &amp;lt;- G}. 
This is naturally isomorphic to the quotient group G/K, where K is the kernel: for k &amp;lt;- K, we have f(gk) = f(g) f(k) = f(g) (since f(k) = 1), and conversely if f(g) = f(h) then f(g^-1 h) = 1, so h lies in the coset gK. So elements of the image are in one-to-one correspondence with cosets of the kernel, and it is easy to check that the group operations correspond too.&lt;br /&gt;&lt;br /&gt;Hence any (non-trivial) homomorphism induces a "factorisation" of the group into smaller groups K and G/K - the kernel and the image.&lt;br /&gt;&lt;br /&gt;In our example, the kernel is &amp;lt; [[4,5,6]] &amp;gt; &amp;lt;= Sym([1..6]), and the image is &amp;lt; [[1,2,3]], [[1,2]] &amp;gt; = Sym([1..3]).&lt;br /&gt;&lt;br /&gt;We would like a Haskell function that can find the kernel and image of a transitive constituent homomorphism for us, like so:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; transitiveConstituentHomomorphism gs [1,2,3]&lt;br /&gt;([[[4,5,6]]],[[[1,2,3]],[[1,2]]])&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;So how do we do that? Well, the basic idea is that we will convert all our permutations on X into permutations on Either A B, where A is the transitive constituent we are restricting to, and B is the rest. In our group, we have X = [1..6], A = [1,2,3], B = [4,5,6], so for example we will convert [[1,2],[4,5]] into [[Left 1, Left 2], [Right 4, Right 5]].&lt;br /&gt;&lt;br /&gt;Next, we construct a strong generating set for the new group. Since the SGS algorithm looks for point stabilisers /in order/, it will first find point stabilisers for all the Left points, and then for all the Right points. So our SGS will split into two parts:&lt;br /&gt;- A first part consisting of elements that move some of the Left points. If we restrict this part to its action on the Left points, we will have an SGS for the image.&lt;br /&gt;- A second part consisting of elements that move only the Right points. 
This part is an SGS for the kernel.&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ sgs $ map p [ [[Left 1, Left 2, Left 3]], [[Right 4, Right 5, Right 6]], [[Left 1, Left 2],[Right 4, Right 5]] ]&lt;br /&gt;[[Left 1,Left 2,Left 3],[Right 4,Right 6,Right 5]]&lt;br /&gt;[[Left 2,Left 3],[Right 4,Right 5]]&lt;br /&gt;[[Right 4,Right 6,Right 5]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;[Note that as our SGS algorithm uses randomness, we might get a different set of generators each time we call it.]&lt;br /&gt;&lt;br /&gt;Okay, so we know what to do. Time for some code. We need first of all a few utility functions for packing and unpacking our group elements into the Left / Right parts of the Either type.&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;isLeft (Left _) = True&lt;br /&gt;isLeft (Right _) = False&lt;br /&gt;&lt;br /&gt;isRight (Right _) = True&lt;br /&gt;isRight (Left _) = False&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;unRight = fromPairs . map (\(Right a, Right b) -&amp;gt; (a,b)) . toPairs&lt;br /&gt;&lt;br /&gt;restrictLeft g = fromPairs [(a,b) | (Left a, Left b) &amp;lt;- toPairs g]&lt;br /&gt;-- note that this is doing a filter - taking only the left part of the action - and a map, unLefting&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Then the code is simple:&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;transitiveConstituentHomomorphism gs delta&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| delta == closure delta [(.^ g) | g &amp;lt;- gs] -- delta is a transitive constituent&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; = transitiveConstituentHomomorphism' gs delta&lt;br /&gt;&lt;br /&gt;transitiveConstituentHomomorphism' gs delta = (ker, im) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;gs' = sgs $ map (fromPairs . map (\(a,b) -&amp;gt; (lr a, lr b)) . 
toPairs) gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;-- as delta is a transitive constituent, we will always have a and b either both Left or both Right&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;lr x = if x `elem` delta then Left x else Right x&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;ker = map unRight $ dropWhile (isLeft . minsupp) gs' -- pointwise stabiliser of delta&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;im = map restrictLeft $ takeWhile (isLeft . minsupp) gs' -- restriction of the action to delta&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;That's it.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so the transitive constituent homomorphism gives us a way to "take apart" intransitive groups. Let's use it to take another look at the Rubik cube group.&lt;br /&gt;&lt;br /&gt;Remember how we labelled the faces of the Rubik cube:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/S6kl9aO9aYI/AAAAAAAAAFM/1gIIe-qX8h4/s1600-h/rubik.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/S6kl9aO9aYI/AAAAAAAAAFM/1gIIe-qX8h4/s320/rubik.png" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Then the generators of the Rubik cube group are:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;f = p [[ 1, 3, 9, 7],[ 2, 6, 8, 4],[17,41,33,29],[18,44,32,26],[19,47,31,23]]&lt;br /&gt;b = p [[51,53,59,57],[52,56,58,54],[11,27,39,43],[12,24,38,46],[13,21,37,49]]&lt;br /&gt;l = p [[21,23,29,27],[22,26,28,24],[ 1,31,59,11],[ 4,34,56,14],[ 7,37,53,17]]&lt;br /&gt;r = p [[41,43,49,47],[42,46,48,44],[ 3,13,57,33],[ 6,16,54,36],[ 9,19,51,39]]&lt;br /&gt;u = p [[11,13,19,17],[12,16,18,14],[ 1,21,51,41],[ 2,22,52,42],[ 3,23,53,43]]&lt;br /&gt;d = p [[31,33,39,37],[32,36,38,34],[ 7,47,57,27],[ 8,48,58,28],[ 9,49,59,29]]&lt;br /&gt;&lt;br /&gt;rubikCube = [f,b,l,r,u,d]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;What orbits does it have?&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; 
:load Math.Projects.Rubik&lt;br /&gt;&amp;gt; orbits rubikCube&lt;br /&gt;[[1,3,7,9,11,13,17,19,21,23,27,29,31,33,37,39,41,43,47,49,51,53,57,59],[2,4,6,8,12,14,16,18,22,24,26,28,32,34,36,38,42,44,46,48,52,54,56,58]]&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It's obvious really: the odd-numbered points are corner faces, and the even-numbered points are edge faces - and there is no valid move that can take a corner face to an edge face, or vice versa. So the group is intransitive, and we can define:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;[cornerFaces,edgeFaces] = orbits rubikCube&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We then have two different ways to split the group, depending on which transitive constituent we take:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;(kerCornerFaces,imCornerFaces) = transitiveConstituentHomomorphism rubikCube cornerFaces&lt;br /&gt;&lt;br /&gt;(kerEdgeFaces,imEdgeFaces) = transitiveConstituentHomomorphism rubikCube edgeFaces&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Note that:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; orderSGS imCornerFaces * orderSGS imEdgeFaces&lt;br /&gt;86504006548979712000&lt;br /&gt;&amp;gt; let rubikSGS = sgs rubikCube&lt;br /&gt;&amp;gt; orderSGS rubikSGS&lt;br /&gt;43252003274489856000&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Why are they not equal? Because we can't operate on corners and edges totally independently. 
The following examples show that we can't swap a pair of corners without also swapping a pair of edges, and vice versa:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; isMemberSGS rubikSGS (p [[1,3],[17,41],[19,23],[2,6],[18,44]])&lt;br /&gt;True&lt;br /&gt;&amp;gt; isMemberSGS rubikSGS (p [[1,3],[17,41],[19,23]])&lt;br /&gt;False&lt;br /&gt;&amp;gt; isMemberSGS rubikSGS (p [[2,6],[18,44]])&lt;br /&gt;False&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(This is similar to the way that in our original example, we couldn't swap a pair of blue points without also swapping a pair of red points.)&lt;br /&gt;&lt;br /&gt;That's it for now. Next time, I'll show how we can factor the corner and edge groups still further, to arrive at a fairly complete understanding of the structure of the Rubik cube group.&lt;br /&gt;&lt;br /&gt;(By the way, I would like to acknowledge a debt to the following example from the GAP system: &lt;a href="http://www.gap-system.org/Doc/Examples/rubik.html"&gt;http://www.gap-system.org/Doc/Examples/rubik.html&lt;/a&gt; )&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-8118862860927630860?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/8118862860927630860/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/03/transitive-constituent-homomorphism.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8118862860927630860'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/8118862860927630860'/><link rel='alternate' type='text/html' 
href='http://haskellformaths.blogspot.com/2010/03/transitive-constituent-homomorphism.html' title='Transitive constituent homomorphism'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_XQ7FznWBAYE/S6kicceiFgI/AAAAAAAAAFE/lXbSLLO3JOE/s72-c/oddtriangles.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-4961294898391549693</id><published>2010-02-14T20:10:00.001Z</published><updated>2010-02-14T20:13:45.654Z</updated><title type='text'>How to find a strong generating set</title><content type='html'>Nearly three months since my last post - sorry! I have an excuse - we've been moving house - but the truth is I've had a bit of writer's block over this post. Anyway...&lt;br /&gt;&lt;br /&gt;Previously in this blog we have been looking at permutation groups, and especially those which arise as the symmetry groups of graphs or other combinatorial objects. We developed a "graphAuts" function, which finds generators for the group of symmetries of a graph. In fact, "graphAuts" finds a &lt;i&gt;strong generating set&lt;/i&gt; or SGS, which is a generating set of a special form, making various calculations in permutation groups relatively easy. What I want to look at this week, is how we can find a strong generating set for an arbitrary permutation group, given generators for the group.&lt;br /&gt;&lt;br /&gt;Let's remind ourselves what a strong generating set looks like. 
Here's&amp;nbsp;q 3, the graph of the cube.&lt;br /&gt;&lt;a href="http://2.bp.blogspot.com/_XQ7FznWBAYE/S3hXJWheV7I/AAAAAAAAAE8/a6NVNQZbYB4/s1600-h/cube.GIF" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/S3hXJWheV7I/AAAAAAAAAE8/a6NVNQZbYB4/s320/cube.GIF" /&gt;&lt;/a&gt;&lt;br /&gt;The graphAuts function returns us a strong generating set for the symmetry group of the cube:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Combinatorics.GraphAuts&lt;br /&gt;&amp;gt; mapM_ print $ graphAuts $ q 3&lt;br /&gt;[[0,1],[2,3],[4,5],[6,7]]&lt;br /&gt;[[0,2,3,1],[4,6,7,5]]&lt;br /&gt;[[0,4,6,7,3,1],[2,5]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,4,2],[3,5,6]]&lt;br /&gt;[[2,4],[3,5]]&lt;/code&gt;&lt;br /&gt;So what is special about this set of symmetries? Well, first of all, the strong generating set &lt;i&gt;is&lt;/i&gt; a generating set for the group - all symmetries of the cube can be expressed as products of these generators (and their inverses). But notice how the SGS is composed of a number of "levels". The first level consists of some elements that move 0. The second level consists of some elements that fix 0 but move 1. The third level consists of an element that fixes 0 and 1 but moves 2. So we can think of there being a 0-level, a 1-level, and a 2-level. The sequence [0,1,2] is called the base for the SGS.&lt;br /&gt;&lt;br /&gt;Let's call our group G. Now, if we discard the first level of the SGS (the 0-level), the elements that remain form an SGS for the subgroup of elements that fix 0, which is called the point stabiliser of 0, denoted G&lt;sub&gt;0&lt;/sub&gt;. If we also discard the second level (the 1-level), the element that remains is an SGS for the subgroup of elements that fix both 0 and 1, which is called the pointwise stabiliser of {0,1}, denoted G&lt;sub&gt;{0,1}&lt;/sub&gt;. 
So an SGS defines a sequence of subgroups called a point stabiliser chain for the group:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;G_0 = G &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;gt; &amp;nbsp;G_1 = G_{0} &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;gt; &amp;nbsp;G_2 = G_{0,1} &amp;nbsp;&amp;gt; &amp;nbsp;G_{0,1,2} = 1&lt;br /&gt;[[0,1],[2,3],[4,5],[6,7]] &amp;nbsp; &amp;nbsp; [[1,2],[5,6]] &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; [[2,4],[3,5]]&lt;br /&gt;[[0,2,3,1],[4,6,7,5]] &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; [[1,4,2],[3,5,6]]&lt;br /&gt;[[0,4,6,7,3,1],[2,5]] &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; [[2,4],[3,5]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,4,2],[3,5,6]]&lt;br /&gt;[[2,4],[3,5]]&lt;/code&gt;&lt;br /&gt;Now, consider a pair of successive subgroups within this chain, for example, G_{0} and G_{0,1}. G_{0} consists of some elements that fix 1, some elements that send 1 to 2, and some elements that send 1 to 4. The elements that fix 1 are just the subgroup G_{0,1}, the point stabiliser of 1 within G_{0}. The elements that send 1 to 2, and those that send 1 to 4, are &lt;i&gt;cosets&lt;/i&gt; of G_{0,1} within G_{0}. We look for a representative of each coset - that is, an element that sends 1 to 2 - [[1,2],[5,6]], an element that sends 1 to 4 - [[1,4,2],[3,5,6]], and for an element that sends 1 to 1 (fixes 1), we can take the identity. 
Once we have done that, every element of G_{0} can be expressed as the product of an element of G_{0,1} with one of these representatives.&lt;br /&gt;&lt;br /&gt;Just to show you what I mean, here are some representatives for each link in the chain for the cube group:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;U_0 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;U_1 &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;U_2&lt;br /&gt;(0,[]) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (1,[]) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (2,[])&lt;br /&gt;(1,[[0,1],[2,3],[4,5],[6,7]]) &amp;nbsp; &amp;nbsp;(2,[[1,2],[5,6]]) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(4,[[2,4],[3,5]])&lt;br /&gt;(2,[[0,2,3,1],[4,6,7,5]]) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;(4,[[1,4,2],[3,5,6]])&lt;br /&gt;(3,[[0,3],[1,2],[4,7],[5,6]])&lt;br /&gt;(4,[[0,4,6,7,3,1],[2,5]])&lt;br /&gt;(5,[[0,5,6,3],[1,4,7,2]])&lt;br /&gt;(6,[[0,6,3],[1,4,7]])&lt;br /&gt;(7,[[0,7],[1,6],[2,5],[3,4]])&lt;/code&gt;&lt;br /&gt;At each link in the point stabiliser chain, we have a base element b that we are going to stabilise. For each point within the orbit of b, we find a group element that takes b to that point. This set of group elements, U_b, is a set of coset representatives - a transversal for the cosets.&lt;br /&gt;&lt;br /&gt;The Haskell code to find these transversals, given an SGS, is fairly straightforward:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;baseTransversalsSGS gs = [let hs = filter ( (b &amp;lt;=) . 
minsupp ) gs in (b, cosetRepsGx hs b) | b &amp;lt;- bs]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where bs = toListSet $ map minsupp gs&lt;br /&gt;&lt;br /&gt;cosetRepsGx gs x = cosetRepsGx' gs M.empty (M.singleton x 1) where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;cosetRepsGx' gs interior boundary&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;| M.null boundary = interior&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;| otherwise =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;let interior' = M.union interior boundary&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;boundary' = M.fromList [(p .^ g, h*g) | g &amp;lt;- gs, (p,h) &amp;lt;- M.toList boundary] M.\\ interior'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;in cosetRepsGx' gs interior' boundary'&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Here it is in action:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.RandomSchreierSims&lt;br /&gt;&amp;gt; let gs = map p [ [[0,1],[2,3],[4,5],[6,7]], [[0,2,3,1],[4,6,7,5]], [[0,4,6,7,3,1],[2,5]], [[1,2],[5,6]], [[1,4,2],[3,5,6]], [[2,4],[3,5]] ]&lt;br /&gt;&amp;gt; mapM_ print $ baseTransversalsSGS gs&lt;br /&gt;(0, fromList [(0,[]), (1,[[0,1],[2,3],[4,5],[6,7]]), (2,[[0,2,3,1],[4,6,7,5]]), (3,[[0,3],[1,2],[4,7],[5,6]]), (4,[[0,4,6,7,3,1],[2,5]]), (5,[[0,5,6,3],[1,4,7,2]]), (6,[[0,6,3],[1,4,7]]), (7,[[0,7],[1,6],[2,5],[3,4]])])&lt;br /&gt;(1, fromList [(1,[]), (2,[[1,2],[5,6]]), (4,[[1,4,2],[3,5,6]])])&lt;br /&gt;(2, fromList [(2,[]), (4,[[2,4],[3,5]])])&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Okay, so what do these transversals give us? Well, hopefully it is obvious (by induction) that every element in our original group G can be expressed as a product u2*u1*u0, taking an element ui from each transversal. 
Moreover, this expression is unique.&lt;br /&gt;&lt;br /&gt;This is the key to the usefulness of strong generating sets. For example, it means we can easily calculate the order of the group (the number of elements). Just multiply the sizes of the transversals. For the cube group, this tells us that we have 8 * 3 * 2 = 48 elements. We can also use the transversals to create a search tree for the group (more on this some other time). Of particular importance, we can test an arbitrary permutation for membership in the group, by "sifting" it through the transversals, as follows.&lt;br /&gt;&lt;br /&gt;Given a permutation g, we look at its action on the first base, b1, and see if we can find an element in the b1-transversal with the same action. If we can, say u1, then we replace g by g*u1^-1, and proceed to the second base. We now look at the action of g*u1^-1 on b2, and see if there is an element in the b2-transversal with the same action. If g is in our group G, then g = un*...*u1, so at the end of this process, we will get 1. 
Conversely, if g is not in our group, then at some stage we will fail to find a transversal element matching the action of our g (including possibly the case where we go through all the transversals, but are still left at the end with a g /= 1).&lt;br /&gt;&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;isMemberSGS gs h = sift bts h where&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;bts = baseTransversalsSGS gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;sift _ 1 = True&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;sift ((b,t):bts) g = case M.lookup (b .^ g) t of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Nothing -&amp;gt; False&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Just h -&amp;gt; sift bts (g * inverse h)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;sift [] _ = False&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;Okay, so we've seen what a strong generating set looks like, and why they're useful. Now, suppose someone gives us a set of generators for a group. How do we find a strong generating set for the group?&lt;br /&gt;&lt;br /&gt;Well, it's actually fairly simple. We start with a set of empty transversals, corresponding to an empty SGS. What we're going to try to do is add elements to the SGS (and hence to the transversals), until we end up with a full SGS for the group. 
First of all, we need a slightly modified version of the sift function from above:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;sift _ 1 = Nothing&lt;br /&gt;sift ((b,t):bts) g = case M.lookup (b .^ g) t of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Nothing -&amp;gt; Just g&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Just h -&amp;gt; sift bts (g * inverse h)&lt;br /&gt;sift [] g = Just g -- g == 1 case already caught above&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;In this version, instead of returning either True or False, we return either Nothing, indicating success, or Just g, indicating failure, where g is the part-sifted element that we had at the point of failure.&lt;br /&gt;&lt;br /&gt;Okay, so to find an SGS, what we do is feed random elements of the group through the transversals, using the sifting procedure. (How do we generate random elements of the group? - well roughly, we just form random products of the generators - but see below.) If our random element sifts through, then we move on to the next random element. If our random element doesn't sift through, then at some transversal, we have a group element whose action on a base is not matched by any element in the transversal. If this happens, what we have to do is add this group element as a new element to the SGS, and recalculate the transversals. In that way, we will gradually grow the SGS. What we're hoping is that if we look at enough random elements, we'll end up with an SGS for the whole group.&lt;br /&gt;&lt;br /&gt;This algorithm is called random Schreier-Sims. It's a Monte-Carlo algorithm - meaning it can return the wrong answer: We might stop too soon, and return an SGS which only generates a subgroup, not the whole group. 
However, we can make this unlikely by choosing enough random elements.&lt;br /&gt;&lt;br /&gt;Okay, so first of all, given some generators for a group, we need to be able to generate random elements of the group. One way to do this is called the "product replacement algorithm". What we will do is create an array containing 10 random elements of the group. Then every time we're asked for another random element, we will randomly replace one element of the array, by multiplying it either on the left or right, by either another element or its inverse. The replacement element will thus be a new random element of the group. Okay, but how do we get 10 random elements in the array in the first place? Well, what we do is start off by putting the generators into the array, repeated as many times as necessary - eg a,b,c,a,b,c,a,b,c,a. Then we just run the product replacement a few times to "mix up" the array.&lt;br /&gt;&lt;br /&gt;Here's the code. (Note, in this version, we maintain an eleventh element (the zeroth) in the array as an accumulator for results we are going to return - this has been found empirically to give better results.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;initProdRepl :: (Ord a, Show a) =&amp;gt; [Permutation a] -&amp;gt; IO (Int, IOArray Int (Permutation a))&lt;br /&gt;initProdRepl gs =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;let n = length gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;r = max 10 n -- if we have more than 10 generators, we have to use a larger array&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;xs = (1:) $ take r $ concat $ repeat gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;in do xs' &amp;lt;- newListArray (0,r) xs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;replicateM_ 60 $ nextProdRepl (r,xs') -- perform initial mixing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;return 
(r,xs')&lt;br /&gt;&lt;br /&gt;nextProdRepl :: (Ord a, Show a) =&amp;gt; (Int, IOArray Int (Permutation a)) -&amp;gt; IO (Maybe (Permutation a))&lt;br /&gt;nextProdRepl (r,xs) = -- r will usually be 10&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;do s &amp;lt;- randomRIO (1,r)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; t &amp;lt;- randomRIO (1,r)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; u &amp;lt;- randomRIO (0,3 :: Int)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; out &amp;lt;- updateArray xs s t u&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; return out&lt;br /&gt;&lt;br /&gt;updateArray xs s t u =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;let (swap,invert) = quotRem u 2 in&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;if s == t&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;then return Nothing&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;else do&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;x_0 &amp;lt;- readArray xs 0&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;x_s &amp;lt;- readArray xs s&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;x_t &amp;lt;- readArray xs t&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;let x_s' = mult (swap,invert) x_s x_t&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;x_0' = mult (swap,0) x_0 x_s'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;writeArray xs 0 x_0'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;writeArray xs s x_s'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;return (Just x_0')&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where mult (swap,invert) a b = case (swap,invert) of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (0,0) -&amp;gt; a * b&lt;br /&gt;&amp;nbsp;&amp;nbsp; 
&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (0,1) -&amp;gt; a * b^-1&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (1,0) -&amp;gt; b * a&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; (1,1) -&amp;gt; b^-1 * a&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;One thing to point out: We have to select a random element to replace - x_s - and another random element to multiply by - x_t. We're not allowed to have s=t. The way I get round this is to wrap Maybe around the return type, and return Nothing if I happen to pick s=t. With a little bit of work I could probably do better.&lt;br /&gt;&lt;br /&gt;Okay, so we can generate random elements from the generators. 
Now, to find a strong generating set, we start out with an empty set of transversals, and keep sifting random elements through the transversals, updating the transversals when we find elements that don't sift:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&lt;pre&gt;sgs :: (Ord a, Show a) =&amp;gt; [Permutation a] -&amp;gt; [Permutation a]&lt;br /&gt;sgs gs = toListSet $ concatMap snd $ rss gs&lt;br /&gt;&lt;br /&gt;rss gs = unsafePerformIO $&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;do (r,xs) &amp;lt;- initProdRepl gs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; rss' (r,xs) (initLevels gs) 0&lt;br /&gt;&lt;br /&gt;rss' (r,xs) levels i&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| i == 25 = return levels -- stop if we've had 25 successful sifts in a row&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;| otherwise = do g &amp;lt;- nextProdRepl (r,xs)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let (changed,levels') = updateLevels levels g&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; rss' (r,xs) levels' (if changed then 0 else i+1)&lt;br /&gt;-- if we currently have an sgs for a subgroup of the group, then it must have index &amp;gt;= 2&lt;br /&gt;-- so the chance of a random elt sifting to identity is &amp;lt;= 1/2&lt;br /&gt;&lt;br /&gt;initLevels gs = [((b,M.singleton b 1),[]) | b &amp;lt;- bs]&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;where bs = toListSet $ concatMap supp gs&lt;br /&gt;&lt;br /&gt;updateLevels levels Nothing = (False,levels)&lt;br /&gt;updateLevels levels (Just g) =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;case sift (map fst levels) g of&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Nothing -&amp;gt; (False, levels)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Just g' -&amp;gt; (True, updateLevels' [] levels g' (minsupp g'))&lt;br 
/&gt;&lt;br /&gt;updateLevels' ls (r@((b,t),s):rs) h b' =&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;if b == b'&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;then reverse ls ++ ((b, cosetRepsGx (h:s) b), h:s) : rs&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;else updateLevels' (r:ls) rs h b'&lt;br /&gt;&lt;/pre&gt;&lt;/code&gt;&lt;br /&gt;That's it.&lt;br /&gt;&lt;br /&gt;This algorithm makes it possible to work with much larger permutation groups. We already saw it in use in an &lt;a href="http://haskellformaths.blogspot.com/2009/08/how-to-count-number-of-positions-of.html"&gt;earlier post&lt;/a&gt;, where I found an SGS for the Rubik's cube group, in order to calculate the number of possible positions (approximately 4*10^19). Over the next few weeks I want to look at how we can use strong generating sets to investigate the structure of groups.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-4961294898391549693?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/4961294898391549693/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2010/02/how-to-find-strong-generating-set.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4961294898391549693'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4961294898391549693'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2010/02/how-to-find-strong-generating-set.html' title='How to find a strong generating 
set'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_XQ7FznWBAYE/S3hXJWheV7I/AAAAAAAAAE8/a6NVNQZbYB4/s72-c/cube.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-4291100236673057171</id><published>2009-11-18T21:40:00.000Z</published><updated>2009-11-18T21:40:58.039Z</updated><title type='text'>Three new modules in HaskellForMaths</title><content type='html'>It's been a little while since I posted here - and the reason is that I've been too busy writing code. The fruits of my labour are three new modules for &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt;, which I've uploaded in version 0.2.0. In due course, each of these deserves a blog post in its own right, but for the moment, I thought I'd give a quick overview.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Algebra.Group.RandomSchreierSims&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In our investigation of graph automorphisms, we've come across the concept of a &lt;a href="http://haskellformaths.blogspot.com/2009/07/strong-generating-sets-for-graph.html"&gt;strong generating set&lt;/a&gt; (SGS) for a permutation group. This is a generating set of a special form, which makes certain calculations particularly easy - for example, calculating the order of the group (the number of elements it contains). Our graphAuts algorithm naturally produced a strong generating set for the group. 
But what if we are just given any old set of generators for a permutation group?&lt;br /&gt;&lt;br /&gt;The random Schreier-Sims algorithm is a fast Monte-Carlo algorithm for finding a strong generating set, given a set of generators. Monte-Carlo means that there is a small probability that it will return the wrong answer - in this case, it might return an SGS for a subgroup of the group we are interested in, instead of the whole group. In practice, this isn't an issue. First, we can make the probability of error as small as we like. Second, we usually know the order of the group, so we can check whether the SGS is right.&lt;br /&gt;&lt;br /&gt;The main function provided in Math.Algebra.Group.RandomSchreierSims is sgs, which takes a set of generators and returns a strong generating set.&lt;br /&gt;&lt;br /&gt;(We already have an sgs function in Math.Algebra.Group.SchreierSims. This is guaranteed to give the right answer - but can be significantly slower. So, you have a choice which one to use.)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Projects.MiniquaternionGeometry&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A projective plane is an incidence structure of points and lines satisfying the following rules:&lt;br /&gt;- Given any two distinct points, there is exactly one line which passes through both&lt;br /&gt;- Given any two distinct lines, there is exactly one point which lies on both&lt;br /&gt;- There exists a quadrangle: a set of four points, no three of which are collinear&lt;br /&gt;&lt;br /&gt;(There is another definition of projective plane, in terms of designs, but we haven't covered those yet.)&lt;br /&gt;&lt;br /&gt;For any prime power q = p^n, the projective geometry PG(2,Fq) over the finite field Fq is a projective plane. 
However, it turns out that there are projective planes which are not of this form, associated with algebraic systems called near-fields, which are nearly fields but not quite: multiplication need not be commutative, and only one of the two distributive laws is required to hold.&lt;br /&gt;&lt;br /&gt;In Math.Projects.MiniquaternionGeometry I construct PG(2,F9), together with three "non-Desarguesian" planes of order 9, based on the near-field J9 of order 9. The name Miniquaternion Geometry comes from the book of that name by Room and Kirkpatrick, because the multiplicative structure of the non-zero elements of J9 is the same as that of the eight unit quaternions 1, -1, i, -i, j, -j, k, -k (the quaternion group).&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Math.Combinatorics.LatinSquares&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;A latin square is an n*n grid, with each cell containing a letter from an alphabet of n letters, such that in each row and in each column, each letter appears exactly once. For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;1 2 3 4&lt;br /&gt;2 1 4 3&lt;br /&gt;3 4 1 2&lt;br /&gt;4 3 2 1&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Two latin squares are orthogonal if, when you superimpose them, each of the n^2 possible letter-pairs appears exactly once in the grid. For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;1a 2b 3c 4d&lt;br /&gt;2d 1c 4b 3a&lt;br /&gt;3b 4a 1d 2c&lt;br /&gt;4c 3d 2a 1b&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;It turns out that the existence of mutually orthogonal n*n latin squares is related to the existence of projective planes of order n: a projective plane of order n exists if and only if there is a set of n-1 mutually orthogonal n*n latin squares. Euler's &lt;a href="http://en.wikipedia.org/wiki/Thirty-six_officers_problem"&gt;thirty six officers problem&lt;/a&gt;&amp;nbsp;asks for a pair of orthogonal latin squares of order 6. It turns out that there is no solution, and this is related to the fact that there is no projective plane of order 6.&lt;br /&gt;&lt;br /&gt;(There are projective planes of order q for any prime power q = p^n, and these give rise to mutually orthogonal q*q latin squares. However, 6 is not a prime power. 
It is not known whether there exist projective planes whose order is not a prime power, but it is known that there is no projective plane of order 6 - this follows from the Bruck-Ryser theorem - and that there is no pair of mutually orthogonal latin squares of order 6, which Tarry showed in 1900 by exhaustive case analysis.)&lt;br /&gt;&lt;br /&gt;This module lets you construct mutually orthogonal latin squares from projective planes, and other nice stuff.&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-4291100236673057171?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/4291100236673057171/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/11/three-new-modules-in-haskellformaths.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4291100236673057171'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4291100236673057171'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/11/three-new-modules-in-haskellformaths.html' title='Three new modules in HaskellForMaths'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5318230480452365652</id><published>2009-10-26T20:36:00.001Z</published><updated>2009-10-26T20:52:09.835Z</updated><title type='text'>Simple groups, the atoms of symmetry</title><content type='html'>&lt;div 
class="separator" style="clear: both; text-align: center;"&gt;&lt;span style="font-family: 'Lucida Grande'; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 11px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: auto;"&gt;[New release, HaskellForMaths-0.1.9, available &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;here&lt;/a&gt;. Contains documentation improvements, new version of graphAuts using equitable partitions, and the code used in this blog post.]&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Over the last few months on this blog, we have been looking at symmetry. We started out by looking at symmetries of graphs, then more recently we've been looking at symmetries of finite geometries. Last time, I said that there was more to say about finite geometries. There is, but first of all I realised that I need to say a bit more about groups.&lt;br /&gt;&lt;br /&gt;We've come across groups already many times. Whenever we have an object, then the collection of all symmetries of that object is called a group. (Recall that a symmetry is a change that leaves the object looking the same.) Suppose that we call this group G, and arbitrary elements in the group g and h. Then G satisfies the following properties:&lt;br /&gt;- There is a binary operation, *, defined on G. (g*h is the symmetry that you get if you do g then h.)&lt;br /&gt;- If g and h are in G, then so is g*h. (Doing one symmetry followed by another is again a symmetry.)&lt;br /&gt;- There is an element 1 in G, the identity, such that 1*g = g = g*1. (1 is the symmetry "do nothing".)&lt;br /&gt;- If g is in G, then there is an element g^-1 in G, called the inverse of g, such that g*g^-1 = g^-1*g = 1. 
(Doing g^-1 is the same as undoing g.)&lt;br /&gt;- (g*h)*k = g*(h*k). (The operation is associative: it doesn't matter how we bracket a sequence of symmetries.)&lt;br /&gt;&lt;br /&gt;Mathematicians sometimes (usually) turn things the other way round: They define a group by the properties above, and then observe that the collection of symmetries of an object fits the definition. The problem with this is that it obscures the original intuitions behind the concept.&lt;br /&gt;&lt;br /&gt;For us, the key point to take away from the definition is that groups are "closed". We've been looking at symmetries, represented by permutations. But symmetries and permutations just are the sort of thing that you can multiply (do one then another), that have an identity (do nothing), and that have inverses (undo). Those all come for free. So when we say that a collection &lt;i&gt;of symmetries&lt;/i&gt;, or &lt;i&gt;of permutations&lt;/i&gt;, is a group, the only new thing that we're claiming is that the collection is &lt;i&gt;closed&lt;/i&gt; - for g, h in G, g*h, g^-1, and 1 are also in G.&lt;br /&gt;&lt;br /&gt;Okay, so now we know what a group is. Now you know what mathematicians are like - as soon as they've discovered some new type of structure, they always ask themselves questions like the following:&lt;br /&gt;- can we classify all structures of this type?&lt;br /&gt;- can smaller structures of this type be combined into larger structures of the same type?&lt;br /&gt;- can larger structures of this type be broken down into smaller structures?&lt;br /&gt;- can we classify all "atomic" structures of this type, that can't be broken down any further?&lt;br /&gt;&lt;br /&gt;What about groups? Well, the answer is, for &lt;i&gt;finite&lt;/i&gt; groups, mostly yes. Specifically, finite groups are made out of "atoms" - but there isn't a complete understanding of all the different ways they can be built from the atoms. 
Anyway, for the moment I'm going to skip the first two questions (about building groups up), and concentrate on the last two (breaking them down).&lt;br /&gt;&lt;br /&gt;Okay, so can groups be broken down into smaller groups? Well, groups can have smaller groups contained within them. For example, let's consider the group D10 - the symmetries of the pentagon. (Somewhat confusingly, it's called D10 rather than D5, because it has 10 elements.)&lt;br /&gt;&lt;br /&gt;&lt;a href="http://1.bp.blogspot.com/_XQ7FznWBAYE/SuYB5z7JqWI/AAAAAAAAAEs/WPv9oP0HCEU/s1600-h/c5.GIF" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_XQ7FznWBAYE/SuYB5z7JqWI/AAAAAAAAAEs/WPv9oP0HCEU/s320/c5.GIF" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; :load Math.Algebra.Group.PermutationGroup&lt;br /&gt;&amp;gt; mapM_ print $ elts $ _D 10&lt;br /&gt;[]&lt;br /&gt;[[1,2],[3,5]]&lt;br /&gt;[[1,2,3,4,5]]&lt;br /&gt;[[1,3,5,2,4]]&lt;br /&gt;[[1,3],[4,5]]&lt;br /&gt;[[1,4],[2,3]]&lt;br /&gt;[[1,4,2,5,3]]&lt;br /&gt;[[1,5,4,3,2]]&lt;br /&gt;[[1,5],[2,4]]&lt;br /&gt;[[2,5],[3,4]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Now, any subset of these elements, which is &lt;i&gt;closed&lt;/i&gt;, is again a group. For example, the set { [], [[1,2],[3,5]] } is closed - 1, and all products and inverses of elements in the set, are in the set. This is called a &lt;i&gt;subgroup&lt;/i&gt; of D10.&lt;br /&gt;&lt;br /&gt;Here's Haskell code to find all subgroups of a group. 
(The group is passed in as a list of generators, and the subgroups are returned as lists of generators.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; subgps gs = [] : subgps' S.empty [] (map (:[]) hs) where&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; hs = filter isMinimal $ elts gs&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; subgps' found ls (r:rs) =&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; let ks = elts r in&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; if ks `S.member` found&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; then subgps' found ls rs&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; else r : subgps' (S.insert ks found) (r:ls) rs&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; subgps' found [] [] = []&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; subgps' found ls [] = subgps' found [] [l ++ [h] | l &amp;lt;- reverse ls, h &amp;lt;- hs, last l &amp;lt; h]&lt;br /&gt;&lt;br /&gt;&amp;gt; -- g is the minimal elt in the cyclic subgp it generates&lt;br /&gt;&amp;gt; isMinimal 1 = False&lt;br /&gt;&amp;gt; isMinimal g = all (g &amp;lt;=) primitives -- g == minimum primitives&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; where powers = takeWhile (/=1) $ tail $ iterate (*g) 1&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; n = orderElt g -- == length powers + 1&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; primitives = filter (\h -&amp;gt; orderElt h == n) powers&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;By the way, most of the code we'll be looking at this week, including the above, should only be used with small groups.&lt;br /&gt;&lt;br /&gt;Okay, so let's try it out:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ subgps $ _D 10&lt;br /&gt;[]&lt;br /&gt;[[[1,2],[3,5]]]&lt;br /&gt;[[[1,2,3,4,5]]]&lt;br /&gt;[[[1,3],[4,5]]]&lt;br /&gt;[[[1,4],[2,3]]]&lt;br /&gt;[[[1,5],[2,4]]]&lt;br 
/&gt;[[[2,5],[3,4]]]&lt;br /&gt;[[[1,2],[3,5]],[[1,2,3,4,5]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The last in the list is D10 itself, generated by a reflection and a rotation. Then there are five subgroups generated by five different reflections. (For example, recall that [[2,5],[3,4]] is the reflection that swaps 2 with 5, and 3 with 4 - in other words, reflection in the vertical axis.) Then there is also a subgroup generated by the rotation [[1,2,3,4,5]] - the clockwise rotation that moves 1 to 2, 2 to 3, 3 to 4, 4 to 5, and 5 to 1. Finally, [] is the subgroup with no generators, which by convention is the group consisting of just the single element 1.&lt;br /&gt;&lt;br /&gt;So can we break a group down into its subgroups? Well, not quite. When we break something down, we should end up with two (or more) parts. Subgroups are parts. But we may not always be able to find another matching part to make up the group.&lt;br /&gt;&lt;br /&gt;I'm probably being a bit cryptic. Let's consider an analogy. We know that 15 = 3*5. We also have 5 = 15/3. So 15 = 3 * (15/3). Given a group G, and a subgroup H, we would like to be able to build G as G = H * (G/H). I'm not going to dwell on the * in this statement. 
What about the / ?&lt;br /&gt;&lt;br /&gt;Well, it turns out that we can form a quotient G/H of a group by a subgroup - but only if the subgroup satisfies some extra conditions.&lt;br /&gt;&lt;br /&gt;Given any subsets X and Y of G, we can define a product XY:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; xs -*- ys = toListSet [x*y | x &amp;lt;- xs, y &amp;lt;- ys]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;For X or Y consisting of just a single element, we can form the products xY = {x}Y, or Xy = X{y}:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; xs -* &amp;nbsp;y &amp;nbsp;= L.sort [x*y | x &amp;lt;- xs] -- == xs -*- [y]&lt;br /&gt;&amp;gt; x &amp;nbsp; *- ys = L.sort [x*y | y &amp;lt;- ys] -- == [x] -*- ys&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Now it turns out that we can use this multiplication to define a quotient G/H. We let the elements of G/H be the (right) "cosets" Hg, for g in G. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; let hs = [1, p [[1,2],[3,5]]]&lt;br /&gt;&amp;gt; let x = p [[1,2,3,4,5]]&lt;br /&gt;&amp;gt; let y = p [[2,5],[3,4]]&lt;br /&gt;&amp;gt; hs -* x&lt;br /&gt;[[[1,2,3,4,5]],[[1,3],[4,5]]]&lt;br /&gt;&amp;gt; hs -* y&lt;br /&gt;[[[1,5,4,3,2]],[[2,5],[3,4]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Here we have taken a subgroup H in D10, and then found the cosets Hx, Hy, for a couple of elements x, y in D10.&lt;br /&gt;&lt;br /&gt;Now, to get our quotient group working, what we'd like is to have (Hx) (Hy) = H(xy). Unfortunately, this isn't guaranteed to be true. For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; (hs -* x) -*- (hs -* y)&lt;br /&gt;[[],[[1,2],[3,5]],[[1,4,2,5,3]],[[1,5],[2,4]]]&lt;br /&gt;&amp;gt; hs -* (x*y)&lt;br /&gt;[[[1,4,2,5,3]],[[1,5],[2,4]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Oh dear. But there are some subgroups for which this construction does work. 
Suppose that we had a subgroup K, such that for all g in G, g^-1 K g = K. Then, replacing the second K by x^-1 K x, we get Kx Ky = Kx (x^-1Kx)y = KKxy = Kxy. (KK=K follows from the fact that K is a subgroup, hence closed.)&lt;br /&gt;&lt;br /&gt;A subgroup K of G such that g^-1 K g = K for all g in G is called a normal subgroup. For a normal subgroup, we can form the quotient G/K, and so we have broken G down into two parts, K and G/K.&lt;br /&gt;&lt;br /&gt;Here's the code:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; -- isNormal gs ks: is the subgroup generated by ks normal in the group generated by gs?&lt;br /&gt;&amp;gt; isNormal gs ks = all (== ks') [ (g^-1) *- ks' -* g | g &amp;lt;- gs]&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; where ks' = elts ks&lt;br /&gt;&lt;br /&gt;&amp;gt; normalSubgps gs = filter (isNormal gs) (subgps gs)&lt;br /&gt;&lt;br /&gt;&amp;gt; quotientGp gs ks&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; | isNormal gs ks = gens $ toSn [action cosetsK (-* g) | g &amp;lt;- gs]&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; | otherwise = error "quotientGp: not well defined unless ks normal in gs"&lt;br /&gt;&amp;gt; &amp;nbsp; &amp;nbsp; where cosetsK = cosets gs ks&lt;br /&gt;&lt;br /&gt;&amp;gt; gs // ks = quotientGp gs ks&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ normalSubgps $ _D 10&lt;br /&gt;[]&lt;br /&gt;[[[1,2,3,4,5]]]&lt;br /&gt;[[[1,2],[3,5]],[[1,2,3,4,5]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Two of these are "trivial". [] is the subgroup {1}, which will always be normal. The last subgroup listed is D10 itself. A group will always be a normal subgroup of itself. 
So the only "proper" normal subgroup is [ [[1,2,3,4,5]] ] - the subgroup of rotations of the pentagon.&lt;br /&gt;&lt;br /&gt;In this case, there are only two cosets:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ cosets (_D 10) [p [[1,2,3,4,5]]]&lt;br /&gt;[[],[[1,2,3,4,5]],[[1,3,5,2,4]],[[1,4,2,5,3]],[[1,5,4,3,2]]]&lt;br /&gt;[[[1,2],[3,5]],[[1,3],[4,5]],[[1,4],[2,3]],[[1,5],[2,4]],[[2,5],[3,4]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Not surprisingly then, the quotient group is rather simple:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; _D 10 // [p [[1,2,3,4,5]]]&lt;br /&gt;[[[1,2]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Just to explain a little: In quotientGp gs ks, we form the cosets of ks, and then we look at the action of the gs on the cosets by right multiplication - g sends Kh to Khg. So the elements of the quotient group are permutations of the cosets. However, if we printed this out, it would be hard to see what was going on. So we call "toSn", which labels the cosets 1,2,..., and rewrites the elements as permutations of the numbers. In the case above, we see that the quotient group can be generated by a single element, which simply swaps the two cosets.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/_XQ7FznWBAYE/SuYB_uZJSwI/AAAAAAAAAE0/QYTQ1H4hhLs/s1600-h/tetrahedron.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/SuYB_uZJSwI/AAAAAAAAAE0/QYTQ1H4hhLs/s320/tetrahedron.png" /&gt;&lt;/a&gt;&lt;br /&gt;Let's look at another example. S 4 is the group of all permutations of [1..4] - which also happens to be the symmetry group of the tetrahedron. 
S 4 has lots of subgroups:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ subgps $ _S 4&lt;br /&gt;[]&lt;br /&gt;[[[1,2]]]&lt;br /&gt;[[[1,2],[3,4]]]&lt;br /&gt;[[[1,2,3]]]&lt;br /&gt;[[[1,2,3,4]]]&lt;br /&gt;[[[1,2,4,3]]]&lt;br /&gt;[[[1,2,4]]]&lt;br /&gt;[[[1,3],[2,4]]]&lt;br /&gt;[[[1,3,2,4]]]&lt;br /&gt;[[[1,3]]]&lt;br /&gt;[[[1,3,4]]]&lt;br /&gt;[[[1,4],[2,3]]]&lt;br /&gt;[[[1,4]]]&lt;br /&gt;[[[2,3]]]&lt;br /&gt;[[[2,3,4]]]&lt;br /&gt;[[[2,4]]]&lt;br /&gt;[[[3,4]]]&lt;br /&gt;[[[1,2]],[[1,2],[3,4]]]&lt;br /&gt;[[[1,2]],[[1,2,3]]]&lt;br /&gt;[[[1,2]],[[1,2,3,4]]]&lt;br /&gt;[[[1,2]],[[1,2,4]]]&lt;br /&gt;[[[1,2]],[[1,3],[2,4]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,2,3]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,2,3,4]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,2,4,3]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,3],[2,4]]]&lt;br /&gt;[[[1,3],[2,4]],[[1,3]]]&lt;br /&gt;[[[1,3]],[[1,3,4]]]&lt;br /&gt;[[[1,4],[2,3]],[[1,4]]]&lt;br /&gt;[[[2,3]],[[2,3,4]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;However, only a few of them are normal:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; mapM_ print $ normalSubgps $ _S 4&lt;br /&gt;[]&lt;br /&gt;[[[1,2]],[[1,2,3,4]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,2,3]]]&lt;br /&gt;[[[1,2],[3,4]],[[1,3],[2,4]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The first two are 1 and S4 itself. The third is the group of rotations of the tetrahedron. The fourth consists of the identity together with the rotations which move all the points.&lt;br /&gt;&lt;br /&gt;Why are these subgroups normal, but the others aren't? Well, let's have a look at some of the ones that aren't.&lt;br /&gt;&lt;br /&gt;We have a subgroup [ [[1,2]] ], and another [ [[1,3]] ], and several others that "look the same" - they just swap two points. We also have a subgroup [ [[1,2,3]] ], a subgroup [ [[1,2,4]] ], and several others that "look the same" - they just rotate three points while leaving the fourth fixed. In fact, in both cases we have a set of conjugate subgroups. 
What I mean is, for a subgroup to be normal, we required that g^-1 K g = K. If a subgroup is not normal, then we can find a g such that g^-1 H g /= H. Then g^-1 H g will also be a subgroup (exercise: why?), and it will "look the same" as H.&lt;br /&gt;&lt;br /&gt;By contrast, a normal subgroup is often the only subgroup that "looks like that".&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Okay, so we've seen how groups can be broken down into smaller groups. Now, what about "atoms"? Well, a group always has itself and 1 as normal subgroups. If it has other normal subgroups, then we can break it down as K, G/K. We can keep going, and see if either of these in their turn has non-trivial normal subgroups. Eventually, we will end up with groups which have no non-trivial normal subgroups. Such a group is called a &lt;i&gt;simple group&lt;/i&gt;. This is a bit like a prime number - which has no other factors besides itself and 1. Mark Ronan, in Symmetry and the Monster, calls simple groups the "atoms of symmetry". They are the pieces out of which all symmetry groups are made.&lt;br /&gt;&lt;br /&gt;(Note that starting from a group G, with normal subgroups K1, K2, ..., we have more than one choice of how to break it down. 
Luckily, it turns out that if we keep going, then at the end we will have the same collection of "atoms", no matter which order we chose.)&lt;br /&gt;&lt;br /&gt;So, what do simple groups look like?&lt;br /&gt;&lt;br /&gt;Well, first, here's some code to detect them:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; isSimple gs = length (normalSubgps gs) == 2&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Then the finite simple groups are as follows:&lt;br /&gt;&lt;br /&gt;- The cyclic groups of order p, p prime (the rotation group of a regular p-gon):&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; _C n | n &amp;gt;= 2 = [p [[1..n]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; isSimple $ _C 5&lt;br /&gt;True&lt;br /&gt;&amp;gt; isSimple $ _C 6&lt;br /&gt;False&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Exercise: Why is C n simple when n is prime, and not when it is not?&lt;br /&gt;&lt;br /&gt;- The alternating groups A n, n &amp;gt;= 5. (A n is the subgroup of S n consisting of those elements of S n which are "even" - see &lt;a href="http://en.wikipedia.org/wiki/Even_permutation"&gt;wikipedia&lt;/a&gt;.)&lt;br /&gt;&lt;code&gt;&lt;/code&gt;&lt;br /&gt;&lt;code&gt;&amp;gt; isSimple $ _A 4&lt;br /&gt;False&lt;br /&gt;&amp;gt; isSimple $ _A 5&lt;br /&gt;True&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;(As it happens, A4 is the group of rotations of the tetrahedron, and A5 is the group of rotations of the dodecahedron. Unfortunately, An, n&amp;gt;5, doesn't have any such intuitive interpretation.)&lt;br /&gt;&lt;br /&gt;- The projective special linear groups PSL(n,Fq) (except for PSL(2,F2) and PSL(2,F3), which are not simple). PSL(n,Fq) is the subgroup of PGL(n,Fq) consisting of projective transformations with determinant 1. 
(Recall that &lt;a href="http://haskellformaths.blogspot.com/2009/10/symmetries-of-pgnfq.html"&gt;last time&lt;/a&gt; we looked at PΓL(n,Fq), the group of symmetries of the projective geometry PG(n-1,Fq), which includes both projective transformations and field automorphisms.)&lt;br /&gt;&lt;br /&gt;- Several more infinite families of subgroups of PΓL(n,Fq), with geometric significance.&lt;br /&gt;&lt;br /&gt;- Finally, there are 26 "sporadic" simple groups, which don't belong to any of the infinite families described above.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;I should just say a thing or two about the &lt;a href="http://en.wikipedia.org/wiki/Classification_of_finite_simple_groups"&gt;classification of finite simple groups&lt;/a&gt;:&lt;br /&gt;- The proof of the classification was a monster effort by a whole generation of group theorists. The proof runs to 10000 pages.&lt;br /&gt;- The part that mathematicians find most interesting is the sporadic simple groups. In many mathematical classifications, one only finds infinite families (for example, the primes), so the sporadic groups feel like little miracles that have dropped out of the sky. 
Why are they there?&lt;br /&gt;&lt;br /&gt;Two of the things I'm hoping to do in future blog posts are to describe and construct some of the other "simple groups of Lie type" (subgroups of PΓL(n,Fq)), and to describe and construct some of the sporadic simple groups.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5318230480452365652?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5318230480452365652/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/10/simple-groups-atoms-of-symmetry.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5318230480452365652'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5318230480452365652'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/10/simple-groups-atoms-of-symmetry.html' title='Simple groups, the atoms of symmetry'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_XQ7FznWBAYE/SuYB5z7JqWI/AAAAAAAAAEs/WPv9oP0HCEU/s72-c/c5.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5701363515709149075</id><published>2009-10-08T19:55:00.000+01:00</published><updated>2009-10-08T20:21:17.497+01:00</updated><title type='text'>Symmetries of PG(n,Fq)</title><content type='html'>&lt;div&gt;Previously in 
this blog we looked at the affine geometries AG(n,Fq), and their symmetries. Following that, we looked at the points and lines in the projective geometries PG(n,Fq). This week I want to look at the symmetries of PG(n,Fq).&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, first, something I forgot to mention last week. When we looked at the points in PG(n,Fq), we saw that they can be thought of as the points of AG(n,Fq), plus some additional points "at infinity". What I forgot to do last week was to discuss how the lines in PG(n,Fq) relate to the lines in AG(n,Fq).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's PG(2,F3) again:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_XQ7FznWBAYE/Ss42alXwUrI/AAAAAAAAAEM/bTxJx6Vy0eg/s1600-h/ptspg2f3txt.gif"&gt;&lt;img src="http://3.bp.blogspot.com/_XQ7FznWBAYE/Ss42alXwUrI/AAAAAAAAAEM/bTxJx6Vy0eg/s320/ptspg2f3txt.gif" border="0" alt="" id="BLOGGER_PHOTO_ID_5390305634345308850" style="cursor: pointer; width: 320px; height: 268px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Recall that the points in PG(2,F3) are (by definition) the lines through the origin in F3^3, each of which can be represented, as here, by one of the non-zero points on it. (Note that for technical reasons, the coordinates are shown as zyx instead of xyz.) We see that these "points" of PG(2,F3) consist of something looking very much like AG(2,F3) - the blue points - together with some additional points, which are said to be "at infinity".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then, the lines of PG(2,F3) are (by definition) the planes through 0 in F3^3. What is the relationship between lines in AG(2,F3) and "lines" in PG(2,F3)? Well, any line in (the copy of) AG(2,F3) (embedded in PG(2,F3)) is contained in a plane through 0 in F3^3. 
So each line in AG(2,F3) gives rise to a line in PG(2,F3). In PG(2,F3), however, the line will have an additional point "at infinity".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, there is a line in AG(2,F3) consisting of the points {(0,0),(1,0),(2,0)}. These embed into PG(2,F3) as {(1:0:0), (1:1:0), (1:2:0)}. We can ask what is the closure in PG(2,F3) of these points - the least "flat" (point, line, plane, etc) that contains them:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;&amp;gt; closurePG [ [1,0,0],[1,1,0],[1,2,0] ] :: [[F3]]&lt;/div&gt;&lt;div&gt;[[0,1,0],[1,0,0],[1,1,0],[1,2,0]]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So in PG(2,F3), the line gains a point at infinity, (0:1:0).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Similarly, there is a line in AG(2,F3) consisting of the points {(0,1),(1,1),(2,1)}. These embed into PG(2,F3) as {(1:0:1), (1:1:1), (1:2:1)}.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;&amp;gt; closurePG [ [1,0,1],[1,1,1],[1,2,1] ] :: [[F3]]&lt;/div&gt;&lt;div&gt;[[0,1,0],[1,0,1],[1,1,1],[1,2,1]]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Notice that both lines from AG(2,F3) gained the same point at infinity. This is because the two lines were parallel. In PG(2,F3), "parallel lines meet at infinity".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So each line in AG(2,F3) extends to a line in PG(2,F3), gaining an extra point at infinity. 
In addition to these lines, there is one more line in PG(2,F3) - a line at infinity consisting of the points {(0:0:1), (0:1:0), (0:1:1), (0:1:2)}, corresponding to the plane z = 0 in F3^3 (recall that the coordinates are written zyx).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, that was all stuff that I should really have talked about last time, but I forgot.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;                  --*--&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay, so back to the symmetries of PG(n,Fq). We consider PG(n,Fq) as an incidence structure between points and lines, and we ask which permutations of the points preserve collinearity. That is, given a permutation g of the points of PG(n,Fq), we say that it is a symmetry (or automorphism, or collineation) of PG(n,Fq) if, whenever a, b and c are collinear points, then so are a^g, b^g and c^g, and vice versa.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We can find the symmetries of PG(n,Fq) by forming its incidence graph. This is the bipartite graph having as vertices: the points of PG(n,Fq) on the left (or in blue), the lines of PG(n,Fq) on the right (or in red), with an edge joining a point vertex to a line vertex if the point is incident with the line.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the code:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;pre&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;incidenceGraphPG n fq = G vs es where&lt;/div&gt;&lt;div&gt;     points = ptsPG n fq&lt;/div&gt;&lt;div&gt;     lines = linesPG n fq&lt;/div&gt;&lt;div&gt;     vs = L.sort $ map Left points ++ map Right lines&lt;/div&gt;&lt;div&gt;     es = L.sort [ [Left x, Right b] | b &amp;lt;- lines, x &amp;lt;- closurePG b]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's look at an example. 
Here's PG(2,F2):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_XQ7FznWBAYE/Ss42bM9e7rI/AAAAAAAAAEU/rO_aus8srpk/s1600-h/ptspg2f2txt.png"&gt;&lt;img src="http://3.bp.blogspot.com/_XQ7FznWBAYE/Ss42bM9e7rI/AAAAAAAAAEU/rO_aus8srpk/s320/ptspg2f2txt.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5390305644972535474" style="cursor: pointer; width: 289px; height: 233px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The lines in PG(2,F2), corresponding to planes through 0 in F2^3, are:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;&amp;gt; mapM_ (print . closurePG) $ linesPG 2 f2&lt;/div&gt;&lt;div&gt;[[0,1,0],[1,0,0],[1,1,0]]&lt;/div&gt;&lt;div&gt;[[0,1,1],[1,0,0],[1,1,1]]&lt;/div&gt;&lt;div&gt;[[0,1,0],[1,0,1],[1,1,1]]&lt;/div&gt;&lt;div&gt;[[0,1,1],[1,0,1],[1,1,0]]&lt;/div&gt;&lt;div&gt;[[0,0,1],[1,0,0],[1,0,1]]&lt;/div&gt;&lt;div&gt;[[0,0,1],[1,1,0],[1,1,1]]&lt;/div&gt;&lt;div&gt;[[0,0,1],[0,1,0],[0,1,1]]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;PG(2,F2) is called the Fano plane, and is often represented by the picture below:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_XQ7FznWBAYE/Ss42bUS_pzI/AAAAAAAAAEc/kCGO6J9F1E4/s1600-h/fanoplanetxt.png"&gt;&lt;img src="http://1.bp.blogspot.com/_XQ7FznWBAYE/Ss42bUS_pzI/AAAAAAAAAEc/kCGO6J9F1E4/s320/fanoplanetxt.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5390305646941808434" style="cursor: pointer; width: 306px; height: 225px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;You can easily check that this is right. 
Notice the embedded AG(2,F2) in the bottom left, with parallel lines meeting "at infinity" as expected.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The incidence graph of the Fano plane is called the Heawood graph, and it is often represented like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_XQ7FznWBAYE/Ss42bvV_bOI/AAAAAAAAAEk/ItTPaNEaECw/s1600-h/heawoodlabelled.png"&gt;&lt;img src="http://2.bp.blogspot.com/_XQ7FznWBAYE/Ss42bvV_bOI/AAAAAAAAAEk/ItTPaNEaECw/s320/heawoodlabelled.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5390305654202133730" style="cursor: pointer; width: 320px; height: 286px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The blue vertices are the points of the Fano plane, the red vertices are the lines. For each red vertex, you can check that the three blue vertices it is connected to are indeed a line of the Fano plane.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then we can find the automorphisms of the Fano plane as follows:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;&amp;gt; let heawood = incidenceGraphPG 2 f2&lt;/div&gt;&lt;div&gt;&amp;gt; let auts = incidenceAuts heawood&lt;/div&gt;&lt;div&gt;&amp;gt; mapM_ print auts&lt;/div&gt;&lt;div&gt;[[[0,0,1],[0,1,0]],[[1,0,1],[1,1,0]]]&lt;/div&gt;&lt;div&gt;[[[0,0,1],[0,1,1],[0,1,0]],[[1,0,1],[1,1,1],[1,1,0]]]&lt;/div&gt;&lt;div&gt;[[[0,0,1],[1,0,0],[0,1,0]],[[0,1,1],[1,0,1],[1,1,0]]]&lt;/div&gt;&lt;div&gt;[[[0,1,0],[0,1,1]],[[1,1,0],[1,1,1]]]&lt;/div&gt;&lt;div&gt;[[[0,1,0],[1,0,0]],[[0,1,1],[1,0,1]]]&lt;/div&gt;&lt;div&gt;[[[0,1,0],[1,1,0],[1,0,0]],[[0,1,1],[1,1,1],[1,0,1]]]&lt;/div&gt;&lt;div&gt;[[[1,0,0],[1,0,1]],[[1,1,0],[1,1,1]]]&lt;/div&gt;&lt;div&gt;[[[1,0,0],[1,1,0]],[[1,0,1],[1,1,1]]]&lt;/div&gt;&lt;div&gt;&amp;gt; orderSGS 
auts&lt;/div&gt;&lt;div&gt;168&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now the symmetries of PG(n,Fq) actually have a very simple description in terms of matrices. Consider the invertible matrices of degree n+1 over Fq - the general linear group GL(n+1,Fq). These matrices act on vectors in Fq^(n+1) by multiplication, and in doing so, they permute the lines and planes through 0 in Fq^(n+1), whilst preserving containment of lines within planes. In other words, they permute the points and lines of PG(n,Fq) and preserve collinearity. However, any scalar matrix within GL(n+1,Fq) - that is, kI, where k is in Fq\{0} and I is the identity - acts as the identity on PG(n,Fq) - since if (x0:x1:...:xn) is a "point" of PG(n,Fq), then (kx0:kx1:...:kxn) represents the same point. So the group of projective transformations of PG(n,Fq) is actually GL(n+1,Fq) factored out by scalar multiplications. This group is called the projective general linear group, PGL(n+1,Fq).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we are given a symmetry of PG(n,Fq), expressed as a permutation, then we can express it as a matrix by looking at what it does to the basis vectors. For example, consider the symmetry [[[0,0,1],[0,1,0]],[[1,0,1],[1,1,0]]] of PG(2,F2) from above. It sends [1,0,0] to [1,0,0], [0,1,0] to [0,0,1], and [0,0,1] to [0,1,0]. So it can be represented by the matrix&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;[1 0 0]&lt;/div&gt;&lt;div&gt;[0 0 1]&lt;/div&gt;&lt;div&gt;[0 1 0]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We already worked out the order of GL(n,Fq) in the orderGL function. There are q-1 non-zero scalars to factor out by. 
So the order of PGL(n,Fq) will be given by the following:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;orderPGL n q = orderGL n q `div` (q-1)&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's test it (remembering that we want PGL(n+1,Fq) for PG(n,Fq)):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;&amp;gt; orderPGL 3 2&lt;/div&gt;&lt;div&gt;168&lt;/div&gt;&lt;div&gt;&amp;gt; orderSGS $ incidenceAuts $ incidenceGraphPG 2 f3&lt;/div&gt;&lt;div&gt;5616&lt;/div&gt;&lt;div&gt;&amp;gt; orderPGL 3 3&lt;/div&gt;&lt;div&gt;5616&lt;/div&gt;&lt;div&gt;&amp;gt; orderSGS $ incidenceAuts $ incidenceGraphPG 2 f4&lt;/div&gt;&lt;div&gt;120960&lt;/div&gt;&lt;div&gt;&amp;gt; orderPGL 3 4&lt;/div&gt;&lt;div&gt;60480&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;With F4, we get twice as many automorphisms as we might have expected. If you've been paying attention, you'll remember that this is because F4 has a field automorphism (the Frobenius automorphism) of order 2. The extended group, PGL(n,Fq) plus field automorphisms, is called PGammaL(n,Fq) (or better - but I'm not sure this will work for everybody - P&amp;#x0393;L(n,Fq)).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, this is kind of a trivial point, but I think it's significant: PG(n,Fq) has &lt;i&gt;more&lt;/i&gt; symmetries than AG(n,Fq). Somehow, by adding the points at infinity, PG(n,Fq) has made AG(n,Fq) "whole", making it more symmetrical.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Anyway, that'll do for now. 
Next week, just a few more odds and ends about finite geometries.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5701363515709149075?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5701363515709149075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/10/symmetries-of-pgnfq.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5701363515709149075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5701363515709149075'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/10/symmetries-of-pgnfq.html' title='Symmetries of PG(n,Fq)'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_XQ7FznWBAYE/Ss42alXwUrI/AAAAAAAAAEM/bTxJx6Vy0eg/s72-c/ptspg2f3txt.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-765379194246422155</id><published>2009-09-30T20:15:00.000+01:00</published><updated>2009-10-01T08:39:00.322+01:00</updated><title type='text'>Finite geometries, part 4: Lines in PG(n,Fq)</title><content type='html'>&lt;div&gt;[Apologies about the formatting - blogger seems to have won the struggle this time.]&lt;br /&gt;&lt;br /&gt;Last time we saw that the points in the 
projective geometry PG(n,Fq) are defined to be the lines (through the origin) in the vector space Fq^(n+1). What about lines in PG(n,Fq)? Well, it's simple - they are defined as the planes (through 0) in Fq^(n+1).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Let's look at an example, to try to clarify our intuitions. Here are the points of PG(2,F3) again.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_XQ7FznWBAYE/SsEdWB4wrNI/AAAAAAAAAEE/PDoy1TbFIfs/s1600-h/ptspg2f3txt.gif"&gt;&lt;img src="http://4.bp.blogspot.com/_XQ7FznWBAYE/SsEdWB4wrNI/AAAAAAAAAEE/PDoy1TbFIfs/s320/ptspg2f3txt.gif" alt="" id="BLOGGER_PHOTO_ID_5386618893612657874" style="cursor: pointer; width: 320px; height: 268px;" border="0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;/code&gt;&lt;div&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&gt; :load Math.Combinatorics.FiniteGeometry&lt;/div&gt;&lt;div&gt;&gt; ptsPG 2 f3&lt;/div&gt;&lt;div&gt;[[0,0,1],[0,1,0],[0,1,1],[0,1,2],[1,0,0],[1,0,1],[1,0,2],[1,1,0],[1,1,1],[1,1,2],[1,2,0],[1,2,1],[1,2,2]]&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(Recall that each point in PG(2,F3) represents a line through 0 in F&lt;sub&gt;3&lt;/sub&gt;&lt;sup&gt;3&lt;/sup&gt;. We have chosen the representative which has 1 as its first non-zero coordinate. This has the consequence that the coordinates in the diagram are best read back-to-front - they're zyx instead of xyz.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the lines in PG(2,F3) correspond to the planes through 0 in F&lt;sub&gt;3&lt;/sub&gt;&lt;sup&gt;3&lt;/sup&gt;. For example, the plane x = 0 (the left hand face of the cube) gives rise to a line in PG(2,F3) containing the points 010, 100, 110, 120. 
The plane x = z (a diagonal cut through the cube from front left to back right) gives rise to the line containing the points 010, 101, 111, 121. Just to be clear, we are only looking at the planes in F&lt;sub&gt;3&lt;/sub&gt;&lt;sup&gt;3&lt;/sup&gt; that contain 0. So for example, 012, 101, 111, 121 - the plane x+z=2 - is &lt;i&gt;not&lt;/i&gt; a line in PG(2,F3). Also, remember that F&lt;sub&gt;3&lt;/sub&gt; wraps around - so for example 010, 102, 112, 122 is a line in PG(2,F3).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay, onwards. As we did with AG(n,Fq), we can represent a line in PG(n,Fq) by any pair of points on it. We would then like to be able to do the following:&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Given a line of PG(n,Fq), list the points of PG(n,Fq) which are on it&lt;/li&gt;&lt;li&gt;Given a line of PG(n,Fq), find a &lt;i&gt;canonical&lt;/i&gt; representation for it&lt;/li&gt;&lt;li&gt;Find all lines of PG(n,Fq)&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt;(These are basically just the same things that we did when we looked at AG(n,Fq) a couple of weeks back.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Okay, so suppose that we are given two distinct points of PG(n,Fq), representing a line, and we are asked to list all the points of PG(n,Fq) that are on the line. Well, the two points are also points in F&lt;sub&gt;q&lt;/sub&gt;&lt;sup&gt;n+1&lt;/sup&gt;, and as they are distinct in PG(n,Fq), they are linearly independent in F&lt;sub&gt;q&lt;/sub&gt;&lt;sup&gt;n+1&lt;/sup&gt; - hence they generate a plane through 0. We are being asked to find all lines through 0 that are contained in the plane. 
Well, that's easy:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Generate all linear combinations of the two points - all points in the plane in F&lt;sub&gt;q&lt;/sub&gt;&lt;sup&gt;n+1&lt;/sup&gt;.&lt;/li&gt;&lt;li&gt;Each non-zero point in the plane represents a line through 0, and hence a point of PG(n,Fq)&lt;/li&gt;&lt;li&gt;However, we can filter out all except those that are in the canonical form for pts of PG(n,Fq) - having a 1 as their first non-zero coordinate - in order to avoid double counting the lines.&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div&gt; So we need an auxiliary function to detect whether a point is in canonical form - what I call "projective normal form" or PNF:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;ispnf (0:xs) = ispnf xs&lt;/div&gt;&lt;div&gt;ispnf (1:xs) = True&lt;/div&gt;&lt;div&gt;ispnf _ = False&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then, given two distinct points, to list all points on the line they generate:&lt;span style="font-family:monospace;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;linePG [p1,p2] = toListSet $ filter ispnf [(a *&gt; p1) &amp;lt;+&gt; (b *&gt; p2) | a &amp;lt;- fq, b &amp;lt;- fq]&lt;br /&gt;    where fq = eltsFq undefined&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;(Recall that c *&gt; v is scalar multiplication of a vector, and u &amp;lt;+&gt; v is addition of vectors. 
Note that in HaskellForMaths &lt;= 0.1.9, you must use the more general "closurePG" function instead, which has the same behaviour.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, here are the two lines of PG(2,F3) that we discussed earlier:&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style=";font-family:monospace,serif;font-size:85%;"  &gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:monospace,serif;"&gt;&gt; linePG [[0,1,0],[1,0,0]] :: [[F3]]&lt;/span&gt;&lt;/div&gt;&lt;code&gt;&lt;/code&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;div&gt;[[0,1,0],[1,0,0],[1,1,0],[1,2,0]]&lt;/div&gt;&lt;div&gt;&gt; linePG [[0,1,0],[1,0,1]] :: [[F3]]&lt;/div&gt;&lt;div&gt;[[0,1,0],[1,0,1],[1,1,1],[1,2,1]]&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;span class="Apple-style-span"  style="font-family:Georgia,serif;"&gt;&lt;span class="Apple-style-span"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;span class="Apple-style-span"  style="font-family:Georgia,serif;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A line in PG(n,Fq) can be represented by any pair of distinct points on the line. Is there a canonical pair to pick? Yes, there is. Given any pair of points of PG(n,Fq), we can consider them as the rows of a matrix. Any matrix which can be obtained from this matrix by elementary row operations represents the same plane in F&lt;sub&gt;q&lt;/sub&gt;&lt;sup&gt;n+1&lt;/sup&gt;.  For our canonical form, we can take the reduced row echelon form for the matrix (see &lt;a href="http://en.wikipedia.org/wiki/Reduced_row_echelon_form#Reduced_row_echelon_form"&gt;wikipedia&lt;/a&gt;). However, I'm not going to dwell on this - use the "reducedRowEchelonForm" function if you'd like to experiment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What I'd like to consider instead is how we can list all the lines in PG(n,Fq). 
The answer is simply to list all the reduced row echelon forms. We can do this in two steps:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;List all the reduced row echelon "shapes"&lt;/li&gt;&lt;li&gt;Fill in all possible combinations of values from Fq, to get all reduced row echelon forms&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;An example will make this clearer. Suppose we want to find the lines in PG(3,Fq). Thus we are looking for the 2-dimensional subspaces of Fq^4. The reduced row echelon shapes are the following:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;[1 0 * *]                   [1 * 0 *]                   [1 * * 0]&lt;br /&gt;[0 1 * *]                   [0 0 1 *]                   [0 0 0 1]&lt;br /&gt;&lt;br /&gt;[0 1 0 *]                   [0 1 * 0]                   [0 0 1 0]&lt;br /&gt;[0 0 1 *]                   [0 0 0 1]                   [0 0 0 1]&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;To get the reduced row echelon forms, we take each shape in turn, and fill in the stars with the values from Fq in all possible ways.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The following Haskell code finds the reduced row echelon shapes for k-dimensional subspaces of Fq^n. 
(For lines, set k = 2):&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;span style="font-family:monospace;"&gt;&lt;/span&gt;&lt;br /&gt;data ZeroOneStar = Zero | One | Star deriving (Eq)&lt;br /&gt;&lt;br /&gt;instance Show ZeroOneStar where&lt;br /&gt;   show Zero = "0"&lt;br /&gt;   show One  = "1"&lt;br /&gt;   show Star = "*"&lt;br /&gt;&lt;br /&gt;rrefs n k = map (rref 1 1) (combinationsOf k [1..n]) where&lt;br /&gt;   rref r c (x:xs) =&lt;br /&gt;       if c == x&lt;br /&gt;       then zipWith (:) (oneColumn r) (rref (r+1) (c+1) xs)&lt;br /&gt;       else zipWith (:) (starColumn r) (rref r (c+1) (x:xs))&lt;br /&gt;   rref _ c [] = replicate k (replicate (n+1-c) Star)&lt;br /&gt;   oneColumn r = replicate (r-1) Zero ++ One : replicate (k-r) Zero&lt;br /&gt;   starColumn r = replicate (r-1) Star ++ replicate (k+1-r) Zero&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Thus we can calculate the shapes we saw above as follows:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ rrefs 4 2&lt;br /&gt;[[1,0,*,*],[0,1,*,*]]&lt;br /&gt;[[1,*,0,*],[0,0,1,*]]&lt;br /&gt;[[1,*,*,0],[0,0,0,1]]&lt;br /&gt;[[0,1,0,*],[0,0,1,*]]&lt;br /&gt;[[0,1,*,0],[0,0,0,1]]&lt;br /&gt;[[0,0,1,0],[0,0,0,1]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Next, to find all lines in PG(n,Fq), we need to substitute values from Fq for the stars. 
The following code does the trick, although I suspect that there might be a better way to write it:&lt;span style="font-family:monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;flatsPG n fq k = concatMap substStars $ rrefs (n+1) (k+1) where&lt;br /&gt;   substStars (r:rs) = [r':rs' | r' &lt;- substStars' r, rs' &lt;- substStars rs]&lt;br /&gt;   substStars [] = [[]]&lt;br /&gt;   substStars' (Star:xs) = [x':xs' | x' &lt;- fq, xs' &lt;- substStars' xs]&lt;br /&gt;   substStars' (Zero:xs) = map (0:) $ substStars' xs&lt;br /&gt;   substStars' (One:xs) = map (1:) $ substStars' xs&lt;br /&gt;   substStars' [] = [[]]&lt;br /&gt;&lt;br /&gt;linesPG n fq = flatsPG n fq 1 &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;div&gt;For example:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ linesPG 2 f3&lt;br /&gt;[[1,0,0],[0,1,0]]&lt;br /&gt;[[1,0,0],[0,1,1]]&lt;br /&gt;[[1,0,0],[0,1,2]]&lt;br /&gt;[[1,0,1],[0,1,0]]&lt;br /&gt;[[1,0,1],[0,1,1]]&lt;br /&gt;[[1,0,1],[0,1,2]]&lt;br /&gt;[[1,0,2],[0,1,0]]&lt;br /&gt;[[1,0,2],[0,1,1]]&lt;br /&gt;[[1,0,2],[0,1,2]]&lt;br /&gt;[[1,0,0],[0,0,1]]&lt;br /&gt;[[1,1,0],[0,0,1]]&lt;br /&gt;[[1,2,0],[0,0,1]]&lt;br /&gt;[[0,1,0],[0,0,1]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It occurs to me that I should explain why PG(n,Fq) is worth looking at. Well, I mentioned before that the symmetry groups of PG(n,Fq) are some of the "atoms of symmetry", from which all symmetry groups are composed. In addition, projective geometry is in fact more fundamental than affine geometry (at least from the point of view of algebraic geometry, although I'm not sure how you would justify that statement in general). 
Well, maybe you'll just have to take my word for it that PG(n,Fq) is a beautiful thing, for the moment.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Next time, symmetries of PG(n,Fq).&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-765379194246422155?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/765379194246422155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-4-lines-in-pgnfq.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/765379194246422155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/765379194246422155'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-4-lines-in-pgnfq.html' title='Finite geometries, part 4: Lines in PG(n,Fq)'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_XQ7FznWBAYE/SsEdWB4wrNI/AAAAAAAAAEE/PDoy1TbFIfs/s72-c/ptspg2f3txt.gif' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-7886475740804781034</id><published>2009-09-25T09:00:00.000+01:00</published><updated>2009-09-25T09:24:24.365+01:00</updated><title type='text'>Finite geometries, part 3: Points in PG(n,Fq)</title><content 
type='html'>&lt;div&gt;[New release &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths-0.1.8&lt;/a&gt; - contains bugfix to Fractional instance for ExtensionField, plus documentation improvements.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Over the last two weeks, we have looked at the finite affine geometries AG(n,Fq). Hopefully, this has all been fairly straightforward - affine geometry is just the familiar Euclidean geometry of points and lines, but without angles or distances, and all we have done is replace the reals R by the finite fields Fq.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This week we're going to look at finite projective geometries. Now, I know from my own experience that it can take a little while to "get" projective geometry. There are probably several reasons for this, but one of them is that mathematicians define projective geometry one way, but then most of the time think about it in a different way. So, I'll do my best to explain it, but don't worry if you don't get it at first, one day it will all click into place.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay, so let's start with the points. The points of PG(n,Fq) are defined to be the lines through the origin (ie the one-dimensional subspaces) in Fq^(n+1). That should sound a bit strange at first reading - how can the points (in PG(n,Fq)) be lines (in Fq^(n+1))? Bear with me - we'll see why it makes sense to think of them as points.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, the points of PG(2,F3) are the lines through 0 in F3^3. So the line {(0,0,0), (0,1,0), (0,2,0)} is a point of PG(2,F3), as is the line {(0,0,0), (1,0,2), (2,0,1)}. (Recall that arithmetic in F3 is modulo 3.) As a line in F3^3 is a one-dimensional subspace, it is generated by any non-zero point on the line, and consists of all scalar multiples of such a point. 
So we could represent the two lines we just mentioned as &amp;lt;(0,1,0)&amp;gt; and &amp;lt;(1,0,2)&amp;gt;, with angle brackets meaning "the line generated by". Any non-zero point on the line will do to represent the line, but it would be good to have a way of choosing a canonical representative. For the moment, let's say that we will choose the point whose last non-zero coordinate is 1. Here, then, are the points of PG(2,F3):&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_XQ7FznWBAYE/SrYshfL-gaI/AAAAAAAAAD8/lLsQGtGsPsU/s1600-h/ptspg2f3.GIF"&gt;&lt;img src="http://4.bp.blogspot.com/_XQ7FznWBAYE/SrYshfL-gaI/AAAAAAAAAD8/lLsQGtGsPsU/s320/ptspg2f3.GIF" border="0" alt="" id="BLOGGER_PHOTO_ID_5383539358387044770" style="cursor: pointer; width: 320px; height: 299px; " /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;So the blue points are the points of the form (x,y,1). Every point of the form (x,y,1) represents a line through 0 in F3^3. However, this is not all the lines through 0, for it misses out those lines which are in the plane z = 0. The green points are the points of the form (x,1,0). Each point of the form (x,1,0) represents a line through 0 in the plane z = 0. However, this is still not all, for we are missing the line y = z = 0. This is represented by the red point (1,0,0). And that's it - these are all the "points" of PG(2,F3). (Confirm for yourself that every line through zero in F3^3 is represented by one of these "points".)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Since any scalar multiple of a point in F3^3 represents the same line through 0, and hence the same point of PG(2,F3), mathematicians often write the representatives as (x:y:z), with the colons indicating that it is only the ratios we are interested in. Thus (0:1:0) and (0:2:0) represent the same line in F3^3, and hence the &lt;i&gt;same&lt;/i&gt; point of PG(2,F3). 
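&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you'd like to see that in code, here is a minimal sketch of picking the canonical representative (the one whose last non-zero coordinate is 1). This is not the HaskellForMaths code: canonicalRep and invMod are names I've made up for illustration, and the sketch assumes the field is Fp for a prime p, with coordinates held as plain Ints modulo p.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;-- Sketch only: hypothetical helpers, coordinates are Ints modulo a prime p&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-- multiplicative inverse of a non-zero element mod p, found by search&lt;/div&gt;&lt;div&gt;invMod :: Int -&gt; Int -&gt; Int&lt;/div&gt;&lt;div&gt;invMod p a = head (filter (\b -&gt; (a * b) `mod` p == 1) [1 .. p - 1])&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-- scale a vector so that its last non-zero coordinate is 1&lt;/div&gt;&lt;div&gt;-- the zero vector generates no line, so it gives Nothing&lt;/div&gt;&lt;div&gt;canonicalRep :: Int -&gt; [Int] -&gt; Maybe [Int]&lt;/div&gt;&lt;div&gt;canonicalRep p v =&lt;/div&gt;&lt;div&gt;  case filter (/= 0) (reverse v') of&lt;/div&gt;&lt;div&gt;    []      -&gt; Nothing&lt;/div&gt;&lt;div&gt;    (a : _) -&gt; Just (map (\x -&gt; (invMod p a * x) `mod` p) v')&lt;/div&gt;&lt;div&gt;  where v' = map (`mod` p) v&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, canonicalRep 3 [0,2,0] gives Just [0,1,0], and canonicalRep 3 [1,0,2] gives Just [2,0,1], while the zero vector gives Nothing.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;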
(Note that (0:0:0) is &lt;i&gt;not&lt;/i&gt; a point of PG(2,F3).)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay, so I mentioned that mathematicians define PG(n,Fq) one way, but then think about it another way. So how do mathematicians think about it?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, the blue points in PG(2,F3) look just like a copy of AG(2,F3), don't they? Then the green points are called "the line at infinity". Finally, the red point is called "the point at infinity". So mathematicians think of PG(2,F3) as being like AG(2,F3), but with some additional points "at infinity". In particular, although PG(2,F3) was defined in terms of lines through 0 in F3^3, mathematicians actually think of it as consisting of points (not lines).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The idea that the additional points are "at infinity" comes from perspective drawing. Imagine an artist sitting in front of a canvas, looking along the z-axis. Their eye is the origin. The canvas is the plane {(x,y,1)}. To draw what they see, the artist needs to project a line from their eye, through the canvas, until it hits something. Given a large enough canvas, the artist can draw anything that is in front. However, things that are directly above, directly below or directly to the side of the artist (that is, in the plane z = 0) lie on lines that never meet the canvas, so they would have to be drawn "at infinity".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;One thing I should emphasize is that, contrary to appearances, the line and point at infinity are no different from the other points. Indeed, I could have had my artist looking along the x-axis instead of the z-axis. In that case, the embedded affine plane would have been the points {(1,y,z)}, the line at infinity would have been {(0,1,z)}, and the point at infinity would have been (0,0,1). If we revert to thinking about lines in F3^3 for a moment, it should be clear that the points at infinity are not distinguished in any way from other lines in F3^3. 
Their appearance of being distinguished has arisen solely from our choice of coordinate system.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Okay, time for some code. In the above, we said that the canonical representative for a line through 0 would be the point with 1 as its last non-zero coordinate - corresponding to looking along the z-axis. That made things easier to explain. However, it turns out that it's usually more convenient to choose the point with 1 as its &lt;i&gt;first&lt;/i&gt; non-zero coordinate (ie, looking along the x-axis). Here is the code to list the points of PG(n,Fq):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;ptsPG 0 _ = [[1]]&lt;/div&gt;&lt;div&gt;ptsPG n fq = map (0:) (ptsPG (n-1) fq) ++ map (1:) (ptsAG n fq)&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&gt; ptsPG 2 f3&lt;/div&gt;&lt;div&gt;[[0,0,1],[0,1,0],[0,1,1],[0,1,2],[1,0,0],[1,0,1],[1,0,2],[1,1,0],[1,1,1],[1,1,2],[1,2,0],[1,2,1],[1,2,2]]&lt;/div&gt;&lt;div&gt;&gt; map reverse it&lt;/div&gt;&lt;div&gt;[[1,0,0],[0,1,0],[1,1,0],[2,1,0],[0,0,1],[1,0,1],[2,0,1],[0,1,1],[1,1,1],[2,1,1],[0,2,1],[1,2,1],[2,2,1]]&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If we reverse the coordinates, you can see that we have the point at infinity, the line at infinity, and the embedded copy of AG(2,F3), as before.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's probably enough to take in at one sitting. 
Next time, we'll look at lines in PG(n,Fq).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-7886475740804781034?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/7886475740804781034/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-3-points-in.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7886475740804781034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7886475740804781034'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-3-points-in.html' title='Finite geometries, part 3: Points in PG(n,Fq)'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_XQ7FznWBAYE/SrYshfL-gaI/AAAAAAAAAD8/lLsQGtGsPsU/s72-c/ptspg2f3.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-7958819987416747163</id><published>2009-09-18T13:00:00.000+01:00</published><updated>2009-09-18T20:02:39.400+01:00</updated><title type='text'>Finite geometries, part 2: Symmetries of AG(n,Fq)</title><content type='html'>&lt;div&gt;Last time, we met the finite affine geometries AG(n,Fq), which are analogous to n-dimensional Euclidean geometry, but defined over 
the finite fields Fq instead of the reals R. This time I want to talk about the symmetries of AG(n,Fq).&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Recall that when we were &lt;a href="http://haskellformaths.blogspot.com/2009/06/graph-symmetries.html"&gt;looking at graphs&lt;/a&gt;, we defined a symmetry of a graph as a permutation of the vertices which left the edges (collectively) in the same places. For the moment, we will think of a finite geometry as a configuration of points and lines, and define a symmetry as a permutation of the points which leaves the lines (collectively) in the same places.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's AG(2,F3) again:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_XQ7FznWBAYE/Sq_xw0tZa8I/AAAAAAAAAD0/goO8O4OdIZE/s1600-h/ag2f3.GIF"&gt;&lt;img src="http://2.bp.blogspot.com/_XQ7FznWBAYE/Sq_xw0tZa8I/AAAAAAAAAD0/goO8O4OdIZE/s320/ag2f3.GIF" border="0" alt="" id="BLOGGER_PHOTO_ID_5381785900816100290" style="cursor: pointer; width: 320px; height: 312px; " /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;There's an obvious four-fold rotational symmetry, which moves all the points except the middle point, moves all the lines, but leaves the lines collectively in the same places. Less obvious perhaps is that reflection in the middle line, or translating everything one point to the right (with wraparound), are also symmetries. Remember that the straightness or otherwise of the lines doesn't matter - the only thing we're interested in is which points they are incident with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So how can we find all the symmetries? Well, when we were looking at graphs, we used depth-first search, trying to create a pairing of source and target vertices, and backtracking whenever the adjacency between the source vertices didn't match the adjacency between the target vertices. 
Unfortunately, this method doesn't easily generalize to finite geometries. Exercise: Why not?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Instead, we will change the problem into a problem about graphs. Given a collection of points and lines, we can create an "incidence graph". This is the bipartite graph constructed as follows. On the left, we have a vertex for every point. On the right, we have a vertex for every line. We have an edge joining a point on the left to a line on the right just in the case that the point is incident with the line in our geometry.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's the code to create the incidence graph of AG(n,Fq):&lt;/div&gt;&lt;div&gt;&lt;pre&gt;&lt;code&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;incidenceGraphAG n fq = G vs es where&lt;/div&gt;&lt;div&gt;    points = ptsAG n fq&lt;/div&gt;&lt;div&gt;    lines = linesAG n fq&lt;/div&gt;&lt;div&gt;    vs = L.sort $ map Left points ++ map Right lines&lt;/div&gt;&lt;div&gt;    es = L.sort [ [Left x, Right b] | b &lt;- lines, x &lt;- closureAG b]&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(Recall that a line is represented by two distinct points on it. "closureAG" calculates all the points on the line.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now, the trick is, that we can find the symmetries of the finite geometry by finding the symmetries of its incidence graph. For clearly, any symmetry of the finite geometry permutes the points among themselves, permutes the lines among themselves, and preserves incidence - hence it gives rise to a symmetry of the incidence graph. 
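&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To make "preserves incidence" concrete, here is a minimal sketch of checking whether a given permutation of the points is a symmetry. isSymmetry is a hypothetical helper of my own, not part of HaskellForMaths. It assumes the permutation is given as a list of (source, target) pairs, with any unlisted point left fixed, and that each line is given as the full list of the points on it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;import qualified Data.List as L&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;-- Sketch only: isSymmetry is a hypothetical helper, not a HaskellForMaths function&lt;/div&gt;&lt;div&gt;-- perm: (source, target) pairs for the points that move&lt;/div&gt;&lt;div&gt;-- ls: every line, as the list of all points on it&lt;/div&gt;&lt;div&gt;-- the images of the lines must be exactly the lines we started with&lt;/div&gt;&lt;div&gt;isSymmetry :: Ord p =&gt; [(p, p)] -&gt; [[p]] -&gt; Bool&lt;/div&gt;&lt;div&gt;isSymmetry perm ls = L.sort (map image ls) == L.sort (map L.sort ls)&lt;/div&gt;&lt;div&gt;  where image = L.sort . map apply&lt;/div&gt;&lt;div&gt;        apply x = maybe x id (lookup x perm)&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Assuming the lines of AG(2,F3) are supplied as map closureAG (linesAG 2 f3), a translation should pass this test, whereas swapping just two points and fixing all the others should fail it. Of course, this only checks one candidate at a time, which is no way to find them all.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;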
So in principle, we can find the symmetries of the finite geometry by doing the following:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&gt; mapM_ print $ graphAuts $ incidenceGraphAG 2 f3&lt;/div&gt;&lt;div&gt;&lt;div&gt;[[Left [0,0],Left [0,1]],[Left [1,0],Left [2,2],Left [1,1],Left [2,1],Left [1,2],Left [2,0]],[Right [[0,0],[1,0]],Right [[0,1],[1,0]],Right [[0,0],[1,1]],Right [[0,1],[1,1]],Right [[0,0],[1,2]],Right [[0,1],[1,2]]],[Right [[0,2],[1,0]],Right [[0,2],[1,2]],Right [[0,2],[1,1]]],[Right [[1,0],[1,1]],Right [[2,0],[2,1]]]]&lt;/div&gt;&lt;div&gt;[[Left [0,0],Left [0,2],Left [0,1]],[Left [2,0],Left [2,1],Left [2,2]],[Right [[0,0],[1,0]],Right [[0,2],[1,0]],Right [[0,1],[1,0]]],[Right [[0,0],[1,1]],Right [[0,2],[1,1]],Right [[0,1],[1,1]]],[Right [[0,0],[1,2]],Right [[0,2],[1,2]],Right [[0,1],[1,2]]]]&lt;/div&gt;&lt;div&gt;...&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;However, there are a couple of minor problems with this:&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;We only really want to know what happens to the points. What happens to the lines follows from that. So we only want to know about the Left parts, not the Right parts. (Furthermore, if we were to show only the Left part, we then wouldn't need all those Lefts and Rights which are cluttering up the output.)&lt;/li&gt;&lt;li&gt;It is just possible that the incidence graph might have some symmetries which interchange points and lines. This is a very interesting situation, which we'll discuss in due course. 
But it is a symmetry of the graph which does not correspond to a symmetry of the geometry.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;For these reasons, &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt; provides an "incidenceAuts" function, which works just like graphAuts, except that it knows that the input is an incidence graph, so it avoids wasting time looking for point-line crossover symmetries, and outputs only the permutations of the points.&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&gt; mapM_ print $ incidenceAuts $ incidenceGraphAG 2 f3&lt;/div&gt;&lt;div&gt;[[[0,0],[0,1]],[[1,0],[2,2],[1,1],[2,1],[1,2],[2,0]]]&lt;/div&gt;&lt;div&gt;[[[0,0],[0,2],[0,1]],[[2,0],[2,1],[2,2]]]&lt;/div&gt;&lt;div&gt;[[[0,0],[1,0],[0,1]],[[0,2],[2,0],[2,2]],[[1,1],[2,1],[1,2]]]&lt;/div&gt;&lt;div&gt;[[[0,1],[0,2]],[[1,1],[1,2]],[[2,1],[2,2]]]&lt;/div&gt;&lt;div&gt;[[[0,1],[1,0]],[[0,2],[2,0]],[[1,2],[2,1]]]&lt;/div&gt;&lt;div&gt;[[[0,1],[1,1],[1,2],[2,0],[0,2],[2,2],[2,1],[1,0]]]&lt;/div&gt;&lt;div&gt;[[[1,0],[1,1],[1,2]],[[2,0],[2,2],[2,1]]]&lt;/div&gt;&lt;div&gt;[[[1,0],[2,0]],[[1,1],[2,1]],[[1,2],[2,2]]]&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is a &lt;a href="http://haskellformaths.blogspot.com/2009/07/strong-generating-sets-for-graph.html"&gt;strong generating set&lt;/a&gt; for the symmetries of AG(2,F3). We could now go on to investigate further. 
We could ask what different types of symmetries there are, using the &lt;code&gt;conjClassReps&lt;/code&gt; function, or how many symmetries there are, using the &lt;code&gt;orderSGS&lt;/code&gt; function.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;For example, among the elements of the SGS, some have obvious interpretations:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[[[1,0],[2,0]],[[1,1],[2,1]],[[1,2],[2,2]]] - remember that this means that it swaps [1,0] and [2,0], swaps [1,1] and [2,1], and swaps [1,2] and [2,2]. So it can be thought of as a reflection in the line x = 0, or as a 2* stretch of the x-coordinate. It is given by the matrix [[2,0],[0,1]]. (In F3, 2 = -1, which is why this can be thought of as either a reflection or a stretch.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[[[1,0],[1,1],[1,2]],[[2,0],[2,2],[2,1]]] is the shear given by the matrix [[1,1],[0,1]] (acting on row vectors, so that (x,y) goes to (x,x+y)).&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[Later: You might question whether a shear is a symmetry. We have defined a symmetry informally as a change that leaves things looking the same. But, for example, a shear in R^2 doesn't leave the unit square looking the same. Is it really a symmetry? Well, when we're studying geometries within combinatorics, we're only interested in the incidence between points and lines, and not, for example, distances. From this point of view, a shear is indeed a symmetry, because it preserves incidence. We will later look at what happens when we require symmetries to preserve more than just incidence.]&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[[[0,0],[1,0],[0,1]],[[0,2],[2,0],[2,2]],[[1,1],[2,1],[1,2]]] cannot be represented by a 2*2 matrix, since it moves the origin. 
However, it can be represented by a 3*3 matrix, as follows:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;[x']   [2 2 1] [x]&lt;br /&gt;&lt;/div&gt;&lt;div&gt;[y'] = [1 0 0] [y]&lt;/div&gt;&lt;div&gt;[ 1]   [0 0 1] [1]&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;This is the same as saying:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/div&gt;&lt;div&gt;[x'] = [2 2] [x] + [1]&lt;/div&gt;&lt;div&gt;[y']   [1 0] [y]   [0]&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/div&gt;&lt;div&gt;So we see that it is a composite of some sort of shear, followed by a translation.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Indeed, we could conjecture that all symmetries of AG(n,Fq) are of this form - a linear transformation followed by a translation. Let's see whether we can confirm this conjecture by counting symmetries.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We have seen that some symmetries can be represented by a 2*2 matrix. In fact, the matrix must be non-singular, otherwise we won't have a permutation. The group of 2*2 non-singular matrices over Fq is the general linear group GL(2,Fq). How many elements does it have? Well, for the first row, we can choose any non-zero vector in Fq^2, of which there are q^2-1. For the second row, we can choose any vector in Fq^2 which is not linearly dependent on the one we already chose, of which there are q^2-q. Generalising to GL(n,Fq), the following code calculates the number of elements of GL(n,Fq):&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;/code&gt;&lt;div&gt;&lt;code&gt;&lt;div&gt;orderGL n q = product [q^n - q^i | i &lt;- [0..n-1] ]&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Then, in addition to these linear transformations, all of which leave the origin fixed, we can do a translation. 
The number of translations of Fq^n is of course q^n.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A transformation consisting of a translation and a linear transformation is called an affine transformation. The group of all such is called the affine group, Aff(n,Fq). It is straightforward to calculate the number of elements of this group:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;orderAff n q = q^n * orderGL n q&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;(Note that by multiplying the number of translations by the number of linear transformations, I have made an assumption that they are "semi-independent" of one another. This assumption is valid, but at this stage I don't want to spell out exactly what it means.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;Okay, so our conjecture is that all symmetries of AG(n,Fq) are affine transformations. Let's test our conjecture:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&gt; orderSGS $ incidenceAuts $ incidenceGraphAG 2 f3&lt;/div&gt;&lt;div&gt;432&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&gt; orderAff 2 3&lt;/div&gt;&lt;div&gt;432&lt;/div&gt;&lt;div&gt;&gt; orderSGS $ incidenceAuts $ incidenceGraphAG 2 f4&lt;/div&gt;&lt;div&gt;5760&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&gt; orderAff 2 4&lt;/div&gt;&lt;div&gt;2880&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;Uh-oh - what's going on here? We seem to have twice as many symmetries as we expected.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It took me quite a while to figure this one out actually. What's going on?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Well, remember that when we looked at extension fields, I mentioned that there can be field automorphisms. (For example, in the complex numbers, we have complex conjugation.) 
When we looked at the finite fields Fq, for q = p^k a prime power, I mentioned the Frobenius automorphism x -&gt; x^p. Okay, well what's going on is that the field automorphisms of Fq give rise to further symmetries of AG(n,Fq). If q = p^k, then the number of such automorphisms is k, so the total number of symmetries of AG(n,Fq) will be k * orderAff n q. (For F4 = F(2^2) we have k = 2, which accounts for the factor of two above.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And now that really does account for all symmetries of AG(n,Fq). Apart from the following exceptions:&lt;/div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;&gt; orderSGS $ incidenceAuts $ incidenceGraphAG 3 f2&lt;/div&gt;&lt;div&gt;40320&lt;/div&gt;&lt;div&gt;&gt; orderAff 3 2&lt;/div&gt;&lt;div&gt;1344&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;What's going on here? Well, the problem is that AG(n,F2) is degenerate. A line in AG(n,F2) has just two points - and every pair of points forms a line. So in fact, AG(n,F2) is just the complete graph on its 2^n points, and hence has (2^n)! symmetries. (For AG(3,F2), that is 8! = 40320.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That's it for now. 
Next time, finite projective geometries.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-7958819987416747163?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/7958819987416747163/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-2-symmetries-of.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7958819987416747163'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7958819987416747163'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-2-symmetries-of.html' title='Finite geometries, part 2: Symmetries of AG(n,Fq)'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_XQ7FznWBAYE/Sq_xw0tZa8I/AAAAAAAAAD0/goO8O4OdIZE/s72-c/ag2f3.GIF' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-230882234981950341</id><published>2009-09-13T14:00:00.000+01:00</published><updated>2009-09-13T21:12:36.458+01:00</updated><title type='text'>Finite geometries, part 1: AG(n,Fq)</title><content type='html'>Time to recap a little. Through June and July, we looked at graphs and their symmetries. Then through August, we looked at finite fields. Now I want to look at finite geometries. 
Why?&lt;br /&gt;&lt;br /&gt;Well, it's a continuation of my earlier investigation of symmetry. I chose graphs as the starting point because they're easy to understand. However, graphs are not the only combinatorial structure having symmetry - far from it. Finite geometries are important for a couple of reasons:&lt;br /&gt;1. We'll see that all finite symmetry groups turn out to be composed out of "atoms of symmetry", called "simple groups". There are several infinite families of finite simple groups - and a few "sporadic" ones that don't fit into any family. One of the infinite families consists of symmetries of finite projective geometries, which we will look at in due course. (I got the term "atoms of symmetry" from Ronan, Symmetry and the Monster, which is a nice easy read on this stuff.)&lt;br /&gt;2. The symmetry groups of infinite geometries are very important in physics (especially particle physics). So if you like, you can pretend that what we're looking at here will help you understand physics.&lt;br /&gt;&lt;br /&gt;This time, I want to look at affine geometries, but what we're aiming towards is projective geometries.&lt;br /&gt;&lt;br /&gt;Within mathematics, there are probably many different definitions of geometry. Within combinatorics, we're just interested in the combinatorial structure, so roughly speaking, a geometry is just a collection of points, lines, planes and so on, together with a relation, called "incidence", saying which points are on which lines, and so on. Most of the time it's natural to think of a line as just being a set of points, in which case we can think of the incidence relation as just being the membership relation. However, this isn't always the best way to represent things in the code (or in the maths), as we shall see.&lt;br /&gt;&lt;br /&gt;The most familiar geometry is the Euclidean plane, R^2. In this geometry, every pair of distinct lines meets in at most one point. Then of course, there are Euclidean spaces of three or more dimensions. 
There are also non-Euclidean geometries. For example, spherical geometry is geometry on the surface of a sphere, with the lines being great circles. In this case, every pair of lines meets in exactly two points.&lt;br /&gt;&lt;br /&gt;If you know about vector spaces, then you might think that Euclidean geometry is basically about the vector space R^n, with points, lines, planes being the zero-, one-, two-dimensional subspaces, and so on. However, this isn't quite right, because a vector space has a distinguished point, the origin, and every subspace of a vector space contains the origin. In Euclidean geometry there is no such distinguished point - we can have points, lines, planes which don't contain the origin. So we need to take the vector space R^n, and then "forget" the origin. If we do this, we get what is called an affine space. The subspaces of an affine space are the subspaces of the underlying vector space &lt;span style="font-style:italic;"&gt;and&lt;/span&gt; their translates. The collection of points, lines, planes, and so on in an affine space, together with the incidence relation, is called an affine geometry.&lt;br /&gt;&lt;br /&gt;We can construct finite affine geometries simply by replacing R by a finite field Fq in the above. In this way, we obtain the affine geometries AG(n,Fq). So in AG(n,Fq), we will have points, lines, planes and so on, but there will be only a finite number of each. Let's start with the points. Well, that's easy. The points of AG(n,Fq) are just the points Fq^n. 
The &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt; code goes like this:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;ptsAG 0 fq = [[]]&lt;br /&gt;ptsAG n fq = [x:xs | x &lt;- fq, xs &lt;- ptsAG (n-1) fq] &lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Actually, I think the stylish way to do this in Haskell would be:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;ptsAG n fq = sequence $ replicate n fq&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;In any case, here's an example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; :load Math.Combinatorics.FiniteGeometry&lt;br /&gt;&gt; ptsAG 2 f3&lt;br /&gt;[[0,0],[0,1],[0,2],[1,0],[1,1],[1,2],[2,0],[2,1],[2,2]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;In other words, the points of AG(2,F3) form a 3*3 grid.&lt;br /&gt;&lt;br /&gt;What about the lines? Well, the points of AG(2,F3) are the vector space F3^2. The lines are just the one-dimensional subspaces of this vector space, and their translates. Here's a picture:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_XQ7FznWBAYE/SqlnXDDiJpI/AAAAAAAAADs/CV4RS4-xK0g/s1600-h/ag2f3.GIF"&gt;&lt;img style="cursor: pointer; width: 320px; height: 312px;" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/SqlnXDDiJpI/AAAAAAAAADs/CV4RS4-xK0g/s320/ag2f3.GIF" alt="" id="BLOGGER_PHOTO_ID_5379944875525547666" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Don't worry about the fact that the lines aren't straight. All we're concerned about is which points are incident with which lines. You might just want to convince yourself that every one-dimensional subspace of F3^2 is shown, as well as every translate.&lt;br /&gt;&lt;br /&gt;Hopefully this picture makes the analogy between finite geometries and graphs pretty obvious. Points are analogous to vertices; lines are analogous to edges. The main difference is that unlike edges, lines can have more than two points on them. 
Indeed, if we consider just the points and lines (and not planes, hyperplanes, etc), then a finite geometry gives rise to a "hypergraph" - a graph, but where the "hyperedges" are allowed to contain more than two points. However, finite geometries have some additional structure, since they can also contain planes, hyperplanes etc.&lt;br /&gt;&lt;br /&gt;Anyway, we want to calculate the symmetries of finite geometries, and in order to do that, we're going to need to work out what the lines are. First, let's think about how to represent the lines in code. There are several ways we could do it, including at least the following:&lt;br /&gt;- We could represent a line as a list/set of the points on it. We will occasionally want to list all the points on a line, but as a way to represent a line, it's not very efficient.&lt;br /&gt;- A line in the plane has an equation ax+by=c. We could represent a line by the triple (a,b,c). However, that would only work in two dimensions.&lt;br /&gt;- We could represent a line by any pair of distinct points on the line. This will work in any number of dimensions, and will generalise to planes (represented by three points), and so on.&lt;br /&gt;&lt;br /&gt;This last approach is the one we will take. So we would like two functions:&lt;br /&gt;- given a pair of points in AG(n,Fq), return all points on the line that they generate&lt;br /&gt;- given n, Fq, return all lines in AG(n,Fq) (represented as a pair of points)&lt;br /&gt;&lt;br /&gt;The first is straightforward:&lt;br /&gt;&lt;code&gt;&lt;pre&gt;&lt;br /&gt;lineAG [p1,p2] = L.sort [ p1 &amp;lt;+&amp;gt; (c *&amp;gt; (p2 &amp;lt;-&amp;gt; p1)) | c &lt;- fq ] where&lt;/code&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;div&gt;&lt;code&gt;    fq = eltsFq undefined &lt;/code&gt;&lt;br /&gt;&lt;/pre&gt;&lt;br /&gt;There are a couple of unfamiliar things here. 
&amp;lt;+&amp;gt;, &amp;lt;-&amp;gt;, *&amp;gt; are the HaskellForMaths functions for adding two vectors, subtracting, and multiplying by a scalar. The second line is just a little bit of phantom type trickery, using type inference to magic up the elements of Fq. Note that the points are returned sorted.&lt;br /&gt;&lt;br /&gt;Now to find all the lines. The naive way to do this is as follows:&lt;br /&gt;- list all pairs of points in AG(n,Fq)&lt;br /&gt;- but any pair of points on a line generates the same line, so each line will be listed many times&lt;br /&gt;- so keep only those pairs of points which are the first two points on the line&lt;br /&gt;&lt;br /&gt;Here's the code:&lt;div&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;&lt;div&gt;&lt;div&gt;&lt;code&gt;&lt;/code&gt;&lt;/div&gt;&lt;code&gt;&lt;div&gt;linesAG1 n fq = [ [x,y] | [x,y] &lt;- combinationsOf 2 (ptsAG n fq),&lt;/div&gt;&lt;div&gt;                          [x,y] == take 2 (lineAG [x,y]) ]&lt;/div&gt;&lt;/code&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;(This is not a very efficient way to find the lines, but we don't yet have the background to do it more efficiently.)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ linesAG1 2 f3&lt;br /&gt;[[0,0],[0,1]]&lt;br /&gt;[[0,0],[1,0]]&lt;br /&gt;[[0,0],[1,1]]&lt;br /&gt;[[0,0],[1,2]]&lt;br /&gt;[[0,1],[1,0]]&lt;br /&gt;[[0,1],[1,1]]&lt;br /&gt;[[0,1],[1,2]]&lt;br /&gt;[[0,2],[1,0]]&lt;br /&gt;[[0,2],[1,1]]&lt;br /&gt;[[0,2],[1,2]]&lt;br /&gt;[[1,0],[1,1]]&lt;br /&gt;[[2,0],[2,1]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Or we can list all the points on all the lines:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ (print . 
lineAG) $ linesAG1 2 f3&lt;br /&gt;[[0,0],[0,1],[0,2]]&lt;br /&gt;[[0,0],[1,0],[2,0]]&lt;br /&gt;[[0,0],[1,1],[2,2]]&lt;br /&gt;[[0,0],[1,2],[2,1]]&lt;br /&gt;[[0,1],[1,0],[2,2]]&lt;br /&gt;[[0,1],[1,1],[2,1]]&lt;br /&gt;[[0,1],[1,2],[2,0]]&lt;br /&gt;[[0,2],[1,0],[2,1]]&lt;br /&gt;[[0,2],[1,1],[2,0]]&lt;br /&gt;[[0,2],[1,2],[2,2]]&lt;br /&gt;[[1,0],[1,1],[1,2]]&lt;br /&gt;[[2,0],[2,1],[2,2]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;You can verify for yourself that these are the lines shown in the picture above.&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That'll do for now. Next time we'll look at how to work out the symmetries of AG(n,Fq).&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-230882234981950341?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/230882234981950341/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-1-agnfq.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/230882234981950341'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/230882234981950341'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-geometries-part-1-agnfq.html' title='Finite geometries, part 1: AG(n,Fq)'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://2.bp.blogspot.com/_XQ7FznWBAYE/SqlnXDDiJpI/AAAAAAAAADs/CV4RS4-xK0g/s72-c/ag2f3.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-6058806744569358090</id><published>2009-09-02T20:00:00.000+01:00</published><updated>2009-09-02T22:34:01.144+01:00</updated><title type='text'>Finite fields, part 2</title><content type='html'>Okay, so last time, we saw how to extend Q, the field of rational numbers, by adjoining a new element which is the zero of a polynomial in Q[X]. For example, we can adjoin sqrt 2, a zero of X^2-2, to obtain the field Q(sqrt2).&lt;br /&gt;&lt;br /&gt;The time before that, we saw that for each prime p, there is a finite field Fp, consisting of the set {0,1,...,p-1}, with the arithmetic operations carried out modulo p.&lt;br /&gt;&lt;br /&gt;This week we're going to bring these two ideas together, and look at algebraic extensions of Fp.&lt;br /&gt;&lt;br /&gt;Algebraic extensions of Fp work just the same way as algebraic extensions of Q. First, find an irreducible polynomial over the base field K - that means, a polynomial f(x) in K[x], which can't be expressed as f(x) = g(x) * h(x), for g(x), h(x) in K[x] of lower degree. For example, whenever d is an element of K without a square root in K, then x^2-d is irreducible. Then, form the quotient ring K[x]/&amp;lt;f&amp;gt; - that means, work in K[x], but use the division with remainder algorithm to set f=0, which has the effect of making x act like a zero of f. For f(x)=x^2-d, this means x ends up acting like sqrt d.&lt;br /&gt;&lt;br /&gt;For example, in F3, and also in F5, there is no square root of 2. So we can form the fields F3(sqrt2) or F5(sqrt2). We need to be a bit careful about terminology. 2 in F3, 2 in F5, and 2 in Q, are all different things - consequently sqrt2 in F3, in F5, and in Q are all different things. 
We mustn't make the mistake of thinking they're the same thing.&lt;br /&gt;&lt;br /&gt;We saw last time that Q(sqrt2) consists of { a + b sqrt2 | a, b in Q }. Similarly, F5(sqrt2) consists of { a + b sqrt2 | a, b in F5 }. In particular, it follows that F5(sqrt2) has 25 elements. In general, a simple algebraic extension of Fp will have p^n elements, where n is the degree of the irreducible polynomial.&lt;br /&gt;&lt;br /&gt;Over Q, we were able to form the fields Q(sqrt2), Q(sqrt3), Q(i), and so on. Somewhat surprisingly, over Fp, it turns out that any algebraic extension of degree n is isomorphic to any other. For example, F5(sqrt2) is isomorphic to F5(sqrt3). For this reason, we typically just talk about F25 (meaning, the field with 25 elements) - or in general, Fq, where q = p^n is a prime power.&lt;br /&gt;&lt;br /&gt;However, in order to do arithmetic in F25 or in Fq, we clearly need to have some particular irreducible polynomial in mind. Otherwise, we won't know whether to reduce x^2 to 2, or to 3, for example. So for each p, n, we need to agree on an irreducible polynomial of degree n over Fp. 
There doesn't appear to be any natural way to choose, so an artificial way to choose has been devised, called the Conway polynomials.&lt;br /&gt;&lt;br /&gt;So, using a little phantom type trickery as before, here are the first few finite fields of prime power order, as defined in the HaskellForMaths library:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;data ConwayF4&lt;br /&gt;instance PolynomialAsType F2 ConwayF4 where pvalue _ = convert $ x^2+x+1&lt;br /&gt;type F4 = ExtensionField F2 ConwayF4&lt;br /&gt;f4 = map Ext (polys 2 f2) :: [F4]&lt;br /&gt;a4 = embed x :: F4&lt;br /&gt;&lt;br /&gt;data ConwayF8&lt;br /&gt;instance PolynomialAsType F2 ConwayF8 where pvalue _ = convert $ x^3+x+1&lt;br /&gt;type F8 = ExtensionField F2 ConwayF8&lt;br /&gt;f8 = map Ext (polys 3 f2) :: [F8]&lt;br /&gt;a8 = embed x :: F8&lt;br /&gt;&lt;br /&gt;data ConwayF9&lt;br /&gt;instance PolynomialAsType F3 ConwayF9 where pvalue _ = convert $ x^2+2*x+2&lt;br /&gt;type F9 = ExtensionField F3 ConwayF9&lt;br /&gt;f9 = map Ext (polys 2 f3) :: [F9]&lt;br /&gt;a9 = embed x :: F9&lt;br /&gt;&lt;br /&gt;data ConwayF16&lt;br /&gt;instance PolynomialAsType F2 ConwayF16 where pvalue _ = convert $ x^4+x+1&lt;br /&gt;type F16 = ExtensionField F2 ConwayF16&lt;br /&gt;f16 = map Ext (polys 4 f2) :: [F16]&lt;br /&gt;a16 = embed x :: F16&lt;br /&gt;&lt;br /&gt;data ConwayF25&lt;br /&gt;instance PolynomialAsType F5 ConwayF25 where pvalue _ = convert $ x^2+4*x+2&lt;br /&gt;type F25 = ExtensionField F5 ConwayF25&lt;br /&gt;f25 = map Ext (polys 2 f5) :: [F25]&lt;br /&gt;a25 = embed x :: F25&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;As we did with the fields Fp, we provide the functions f4, f8, f9, etc, which just return a list of the elements of the corresponding fields. 
a4, a8, a9 etc are the elements that have been adjoined to the underlying fields to make the extension fields.&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; f8&lt;br /&gt;[0,a^2,a,a+a^2,1,1+a^2,1+a,1+a+a^2] :: [F8]&lt;br /&gt;&gt; a8 ^ 3&lt;br /&gt;1+a :: F8&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;We can partially verify the claim that F25 = F5(sqrt2) = F5(sqrt3), by exhibiting square roots of 2 and 3 in F25:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; (2+a25)^2&lt;br /&gt;2 :: F25&lt;br /&gt;&gt; (1+3*a25)^2&lt;br /&gt;3 :: F25&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Last time, I mentioned that extension fields can have automorphisms, without really spelling out what that means. Well, if L is an extension of K, then f is an automorphism of the extension if f is a bijection from L to itself such that, for a,b in L and c in K, f(a+b) = f(a)+f(b), f(a*b) = f(a)*f(b), etc, and f(c*a) = c*f(a). In other words, f preserves the arithmetic of L and leaves every element of K fixed, so that f(L) looks the same as L, viewed from K.&lt;br /&gt;&lt;br /&gt;The finite fields Fq (q = p^n) have an automorphism called the Frobenius automorphism, defined by f(x)=x^p. For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; f9&lt;br /&gt;[0,a,2a,1,1+a,1+2a,2,2+a,2+2a]&lt;br /&gt;&gt; map frobeniusAut f9&lt;br /&gt;[0,1+2a,2+a,1,2+2a,a,2,2a,1+a]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Finally, a word about efficiency. The HaskellForMaths implementation of fields Fq is very natural from a mathematical point of view, but it isn't very efficient. &lt;br /&gt;&lt;br /&gt;Firstly, we are working with polynomials with coefficients in Fp. To multiply two degree n polynomials together, we have to do O(n^2) additions and multiplications of coefficients. If the coefficients are in Fp, each one of these involves a (`mod` p) operation. It would be better to do the polynomial arithmetic over the integers, and only do the `mod` p at the end. 
We would then only have to do O(n) (`mod` p) operations.&lt;br /&gt;&lt;br /&gt;Secondly, we could consider implementing the fields Fq using Zech logarithms instead.&lt;br /&gt;&lt;br /&gt;Anyway, it doesn't matter too much for the moment, as I only want to use the finite fields to construct combinatorial structures. But sometime I might try to write a more efficient implementation.&lt;br /&gt;&lt;br /&gt;Next time: finite geometries.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-6058806744569358090?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/6058806744569358090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-fields-part-2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6058806744569358090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6058806744569358090'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/09/finite-fields-part-2.html' title='Finite fields, part 2'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-7447389483205642313</id><published>2009-08-27T20:00:00.000+01:00</published><updated>2009-08-27T21:04:28.334+01:00</updated><title type='text'>Extension fields</title><content type='html'>[New version HaskellForMaths 0.1.7 uploaded &lt;a 
href="http://hackage.haskell.org/package/HaskellForMaths"&gt;here&lt;/a&gt;.]&lt;br /&gt;&lt;br /&gt;Last week, we looked at the finite fields of prime order, Fp. A field, remember, is a set in which you can do addition, subtraction, multiplication and division. For a given prime p, the field Fp consists of the set {0, 1, ..., p-1}, with arithmetic operations done modulo p. Why does p have to be a prime? We can do addition, subtraction, and multiplication modulo n for any n, prime or not. However, for division, we require p to be prime. (Why?)&lt;br /&gt;&lt;br /&gt;You might think, therefore, that the fields Fp of prime order are the only finite fields. However, you would be wrong. We shall see that there are in fact fields Fq of order q (ie, with q elements) for every prime power q = p^n. However, before we can do that we need to understand algebraic extensions of a field.&lt;br /&gt;&lt;br /&gt;It can happen that a field is contained within another field. For example, we have Q contained in R, and R contained in C. Given a field such as Q, and an element a of some larger field, we can construct Q(a), the smallest field containing both Q and a, in steps:&lt;ul&gt;&lt;li&gt;At step 0, start with Q&lt;/li&gt;&lt;li&gt;At step 1, add a&lt;/li&gt;&lt;li&gt;At step n, add all elements that can be obtained by combining two elements from step n-1 using one of the arithmetic operators.&lt;/li&gt;&lt;/ul&gt;So for example, at step 2, we will have 1+a, 2*a, a*a, 1/a, etc. Anything which appears in the list at step n for some n is in Q(a).&lt;br /&gt;&lt;br /&gt;The field Q(a), obtained by adjoining a single element, is called a simple extension of Q. (We could then go on to adjoin another element b, to get Q(a,b), and so on.) Among simple extensions, there are two ways it can go.&lt;br /&gt;&lt;br /&gt;Suppose that a is a zero of some polynomial with coefficients in Q - for example x^2-2. In that case, we will construct the whole of Q(a) after only a finite number of steps. For example, in the case where a = sqrt 2, every element of Q(sqrt 2) can be expressed as c + d sqrt 2, for some c, d in Q. 
For example:&lt;br /&gt;(sqrt 2)^2 = 2&lt;br /&gt;1/(sqrt 2) = (sqrt 2) / 2&lt;br /&gt;1/(1 + sqrt 2) = (1 - sqrt 2) / (1 - 2) = -1 + sqrt 2&lt;br /&gt;This is called an algebraic extension.&lt;br /&gt;&lt;br /&gt;Alternatively, if a is not the zero of a polynomial over Q (for example: e, pi), then the construction of Q(a) in steps will never finish. This is called a transcendental extension.&lt;br /&gt;&lt;br /&gt;Algebraic extensions of Q are fascinating things. Initially, the main motivation for studying them came from number theory. Some other time, I'd like to take a closer look at them. However, for the moment, I just want to show how to construct them in Haskell, as what we're really aiming for is algebraic extensions of the finite fields Fp.&lt;br /&gt;&lt;br /&gt;Let's suppose that we're trying to construct Q(sqrt2). The polynomial we're interested in is x^2-2. Then the basic idea is: &lt;ol&gt;&lt;li&gt;Form the polynomial ring Q[x], consisting of polynomials in x with coefficients in Q&lt;/li&gt;&lt;li&gt;Represent elements of Q(a) as polynomials in Q[x]&lt;/li&gt;&lt;li&gt;Do addition, subtraction, multiplication in Q(a) using the underlying operations in Q[x]&lt;/li&gt;&lt;li&gt;After any arithmetic operation, replace the result by its remainder on division by x^2-2.&lt;/li&gt;&lt;/ol&gt;Step 4 is the key step. Suppose that we end up with a polynomial f. We use division with remainder to write f = q(x^2-2)+r, and we replace f by r. In effect, this means that we're setting x^2-2 = 0, which is the same as setting x = sqrt 2. If we do this consistently, then the x ends up acting as sqrt2, and we end up with Q(sqrt2).&lt;br /&gt;&lt;br /&gt;To construct Q(sqrt3), Q(i), etc, just replace x^2-2 by the appropriate polynomial.&lt;br /&gt;&lt;br /&gt;Okay, rather belatedly, time for some code. 
First, we need to define a type for (univariate) polynomials:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;newtype UPoly a = UP [a] deriving (Eq,Ord)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;UP [c0,c1,...cn] is to be interpreted as the polynomial c0 + c1 x + ... + cn x^n.&lt;br /&gt;&lt;br /&gt;Exercise: Define a Num instance for UPoly a.&lt;br /&gt;&lt;br /&gt;If we then define&lt;br /&gt;&lt;code&gt;&lt;br /&gt;x = UP [0,1] :: UPoly Integer&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;together with a suitable Show instance, then we can do things like the following:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; (1+x)^3&lt;br /&gt;1+3x+3x^2+x^3&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;Exercise: Write quotRemUP, to perform division with remainder in UPoly k, on the assumption that k is a field (that is, a Fractional instance).&lt;br /&gt;&lt;br /&gt;So we now have a type, UPoly Q, representing Q[x]. Next, we want to wrap this in another type to represent extension fields Q(a). Rather than have to do this over again each time for Q(sqrt2), Q(sqrt3), Q(i), and so on, we use a little bit of phantom type trickery again. 
Last time, we used phantom types to represent integers; this time, we're going to use phantom types to represent the polynomials that define the fields (x^2-2, x^2-3, x^2+1, etc).&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;class PolynomialAsType k poly where&lt;br /&gt;    pvalue :: (k,poly) -&gt; UPoly k&lt;br /&gt;&lt;br /&gt;data ExtensionField k poly = Ext (UPoly k) deriving (Eq,Ord)&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;Here, k represents the field we are extending - Q, to begin with - and if a is the element that we want to adjoin, then poly represents the polynomial over Q of which a is a zero.&lt;br /&gt;&lt;br /&gt;From here, it's a short step to define some extension fields:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;data Sqrt a = Sqrt a&lt;br /&gt;&lt;br /&gt;-- n should be square-free&lt;br /&gt;instance IntegerAsType n =&gt; PolynomialAsType Q (Sqrt n) where&lt;br /&gt;    pvalue _ = convert $ x^2 - fromInteger (value (undefined :: n))&lt;br /&gt;&lt;br /&gt;type QSqrt2 = ExtensionField Q (Sqrt T2)&lt;br /&gt;sqrt2 = embed x :: QSqrt2&lt;br /&gt;&lt;br /&gt;type QSqrt3 = ExtensionField Q (Sqrt T3)&lt;br /&gt;sqrt3 = embed x :: QSqrt3&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;And now we can do arithmetic in Q(sqrt2), Q(sqrt3), etc. For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; :set +t&lt;br /&gt;&gt; (1+sqrt2)^2&lt;br /&gt;3+2a :: QSqrt2&lt;br /&gt;&gt; (1+sqrt3)^2&lt;br /&gt;4+2a :: QSqrt3&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;As you see, the show function of ExtensionField k poly shows the adjoined element as "a", regardless of which field we're working in. With a little more type hackery, we could have made that a parameter to the type too, so that it would show "sqrt2", "sqrt3", "i", depending which field we were in. However, that would have obscured the code even more. 
In practice, when we use this code we're only going to be working over one field at a time, so it's fine as it is.&lt;br /&gt;&lt;br /&gt;Another limitation of this code is that it's going to be a bit unwieldy to construct Q(a,b) as the type ExtensionField (ExtensionField k poly1) poly2. Luckily, this doesn't matter, as any algebraic extension Q(a,b) is equal to Q(c) for some c.&lt;br /&gt;&lt;br /&gt;Exercise: Show that Q(sqrt2, sqrt3) = Q(sqrt2 + sqrt3), and find the polynomial of which sqrt2 + sqrt3 is a zero.&lt;br /&gt;&lt;br /&gt;I apologise that I've only really explained the bare bones of how this works. Hopefully you can fill in the gaps. (They're all in the HaskellForMaths source - see link at top of page.) This post has already taken me far too long to write, but there is just one other thing I ought to mention.&lt;br /&gt;&lt;br /&gt;Field extensions can have automorphisms (symmetries). Recall that a symmetry is a change that leaves something looking the same. In the case of Q(sqrt2), there is a non-trivial symmetry that sends c + d sqrt 2 to c - d sqrt2 (conjugation). 
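&lt;br /&gt;&lt;br /&gt;To make this concrete, here is a minimal standalone sketch which models c + d sqrt2 as a pair of rationals, and checks for given sample values that conjugation respects addition and multiplication. This is not the HaskellForMaths code: the names QS2, conj and respectsOps are made up purely for illustration.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;
-- Hypothetical standalone model of Q(sqrt2): QS2 c d stands for c + d sqrt2
data QS2 = QS2 Rational Rational deriving (Eq, Show)

instance Num QS2 where
    QS2 a b + QS2 c d = QS2 (a+c) (b+d)
    -- (a + b sqrt2)(c + d sqrt2) = (ac + 2bd) + (ad + bc) sqrt2
    QS2 a b * QS2 c d = QS2 (a*c + 2*b*d) (a*d + b*c)
    negate (QS2 a b)  = QS2 (negate a) (negate b)
    fromInteger n     = QS2 (fromInteger n) 0
    abs _    = error "QS2.abs: not defined"
    signum _ = error "QS2.signum: not defined"

-- conjugation sends c + d sqrt2 to c - d sqrt2
conj :: QS2 -&gt; QS2
conj (QS2 c d) = QS2 c (negate d)

-- check that conj respects + and * for one particular pair of values
respectsOps :: QS2 -&gt; QS2 -&gt; Bool
respectsOps u v = and [ conj (u+v) == conj u + conj v
                      , conj (u*v) == conj u * conj v ]
&lt;/code&gt;&lt;/pre&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; respectsOps (QS2 1 2) (QS2 3 (-1))&lt;br /&gt;True&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Of course, a check on sample values is not a proof, but in this case the general identity is a one-line calculation.&lt;br /&gt;&lt;br /&gt;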
Field automorphisms have a very important place in the history of mathematics: &lt;a href="http://en.wikipedia.org/wiki/Galois"&gt;Galois&lt;/a&gt; invented group theory in order to study these symmetries, during his investigations into the unsolvability (in general) of the &lt;a href="http://en.wikipedia.org/wiki/Quintic"&gt;quintic&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-7447389483205642313?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/7447389483205642313/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/extension-fields.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7447389483205642313'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/7447389483205642313'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/extension-fields.html' title='Extension fields'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-3160846306433363350</id><published>2009-08-17T21:24:00.000+01:00</published><updated>2009-08-18T22:17:14.727+01:00</updated><title type='text'>Finite fields, part 1</title><content type='html'>Okay, so you could think of this as the beginning of chapter two of the imaginary book that I'm writing in this blog. 
Chapter one was about graphs, groups and symmetry. In chapter two I want to talk about some other highly symmetric combinatorial objects: finite geometries and (probably) designs. But first, I need to talk about finite fields.&lt;br /&gt;&lt;br /&gt;A field is roughly a set where you can add, subtract, multiply and divide. You're probably familiar with the following fields:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Q, the rational numbers&lt;/li&gt;&lt;li&gt;R, the real numbers&lt;/li&gt;&lt;li&gt;C, the complex numbers&lt;/li&gt;&lt;/ul&gt;And also with the following non-fields:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;N, the natural numbers. Not a field, because it doesn't contain additive inverses (the negative numbers), so you can't always subtract.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Z, the integers. Not a field, because it doesn't contain multiplicative inverses, so you can't always divide.&lt;/li&gt;&lt;/ul&gt;I can't resist mentioning a little curiosity. To a number theorist / discrete mathematician like myself, the most interesting of the above are Q and Z. I like to think that this is why they were given the most interesting letters. Indeed, the interestingness of the mathematical objects appears to be correlated with the Scrabble value of the letters: in Scrabble, Q and Z are worth 10, indicating highly interesting; R and N are worth 1, indicating a bit boring; C is worth 3 - moderately interesting. Going further, H, the quaternions, is worth 4 - slightly more interesting than C. However, O, the octonions, is only worth 1, which looks like an anomaly.&lt;br /&gt;&lt;br /&gt;Anyway, back to fields. 
Before we go any further, here's the HaskellForMaths version of Q, the rationals:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;import Data.Ratio&lt;br /&gt;&lt;br /&gt;newtype Q = Q Rational deriving (Eq,Ord,Num,Fractional)&lt;br /&gt;&lt;br /&gt;instance Show Q where&lt;br /&gt;   show (Q x) | b == 1    = show a&lt;br /&gt;              | otherwise = show a ++ "/" ++ show b&lt;br /&gt;              where a = numerator x&lt;br /&gt;                    b = denominator x&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;Silly, I know, but I just got a bit annoyed with those percentage signs all over the place when using Haskell's built-in rationals.&lt;br /&gt;&lt;br /&gt;Anyway, in addition to the fields Q, R, and C, there are many other types of field. The ones I want to look at here are the finite fields of prime order. These are the fields you get if you do arithmetic modulo p, where p is a prime. For each different prime p, we have a field Fp, consisting of the set {0,1, ... p-1}, with arithmetic performed modulo p. For example, in the field F5, we have 3*3 = 4; while in the field F7, we have 3*3 = 2.&lt;br /&gt;&lt;br /&gt;Okay, so how to represent these fields in Haskell?&lt;br /&gt;&lt;br /&gt;Well, in an earlier version of the HaskellForMaths code, I tried this:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;data Fp = F Integer Integer&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The idea is that the first integer is p, the prime we are working over, and the second is the value. For example, F 5 3 would represent the value 3 in the field F5.&lt;br /&gt;&lt;br /&gt;Next we either derive or define instances of Eq, Ord, Show, Num, Fractional. 
The Num instance looks something like this:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;instance Num Fp where&lt;br /&gt;    F p x + F q y | p == q  = F p $ (x+y) `mod` p&lt;br /&gt;    F p x * F q y | p == q  = F p $ (x*y) `mod` p&lt;br /&gt;    fromInteger _ = error "Fp.fromInteger: not well defined"&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;The drawbacks of this approach are probably pretty obvious:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Whenever I have two Fp values, F p x and F q y, I first have to check that p = q. This is a run-time check of something that would be better handled at compile time by the type system&lt;/li&gt;&lt;li&gt;When defining numerical types in Haskell, the fromInteger function gives us implicit type conversion from integers to our type, so that we can just write code like 3 * 3, and have the type system convert them into the correct type. However, in this case, we can't give a sensible definition of fromInteger, because we don't know which prime p the user wants to work over. This means that everything is going to have to be written longhand as F 5 3 * F 5 3, and so on.&lt;/li&gt;&lt;li&gt;It feels like a waste of memory to be carrying that first Integer p around all the time, when if I'm using the code correctly, all the ps in a given calculation will be the same.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;For these reasons, I decided to try another approach. 
The aim is to make F2, F3, F5, etc be different types, so that adding 3 :: F5 to 3 :: F7 is a type error, which can be spotted at compile time, rather than a runtime error.&lt;br /&gt;&lt;br /&gt;It's fairly simple really.&lt;br /&gt;&lt;br /&gt;First, define a whole bunch of phantom types representing the primes:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;data T2&lt;br /&gt;data T3&lt;br /&gt;data T5&lt;br /&gt;&lt;code&gt;...&lt;br /&gt;&lt;/code&gt;&lt;/code&gt;&lt;br /&gt;Now define a mechanism for retrieving the values from the types:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;class IntegerAsType a where&lt;br /&gt;   value :: a -&gt; Integer&lt;br /&gt;&lt;br /&gt;instance IntegerAsType T2 where value _ = 2&lt;br /&gt;instance IntegerAsType T3 where value _ = 3&lt;br /&gt;instance IntegerAsType T5 where value _ = 5&lt;br /&gt;...&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;This enables us to do the following:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;&gt; value (undefined :: T2)&lt;br /&gt;2&lt;br /&gt;&gt; value (undefined :: T3)&lt;br /&gt;3&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;/code&gt;Now, we use the phantom types to parameterise a finite field data type:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;newtype Fp n = Fp Integer&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Here, Fp (the one on the left of the equals sign) is a type constructor which takes a type parameter n. This type parameter will be one of the phantom types we just defined. For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;type F2 = Fp T2&lt;br /&gt;type F3 = Fp T3&lt;br /&gt;type F5 = Fp T5&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;As before, we need to derive or define Eq, Ord, Show, Num, and Fractional instances. 
The Num instance looks like this:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;instance IntegerAsType n =&gt; Num (Fp n) where&lt;br /&gt;    Fp x + Fp y = Fp $ (x+y) `mod` p where p = value (undefined :: n)&lt;br /&gt;    Fp x * Fp y = Fp $ (x*y) `mod` p where p = value (undefined :: n)&lt;br /&gt;    fromInteger m = Fp $ m `mod` p where p = value (undefined :: n)&lt;br /&gt;    ...&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;This is all fairly standard phantom type trickery, but it now enables us to do things like the following:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; :load Math.Algebra.Field.Base&lt;br /&gt;&gt; 3*3 :: F5&lt;br /&gt;4&lt;br /&gt;&gt; 3*3 :: F7&lt;br /&gt;2&lt;br /&gt;&gt; (3 :: F5) * (3 :: F7)&lt;br /&gt;#type error#&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;We still have to tell Haskell which prime p we're working over, but now we do it via a type annotation, rather than an argument to the constructor. This means that Haskell can catch errors where we mix values from different fields at compile time, rather than run time.&lt;br /&gt;&lt;br /&gt;The module Math.Algebra.Field.Base provides types for the finite fields F2, F3, F5, ..., F97. The module also provides functions to list the elements of each field:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;f2 = map fromInteger [0..1] :: [F2]&lt;br /&gt;f3 = map fromInteger [0..2] :: [F3]&lt;br /&gt;f5 = map fromInteger [0..4] :: [F5]&lt;br /&gt;...&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;And that's about it.&lt;br /&gt;&lt;br /&gt;However, we must recognise, in all honesty, that there are also drawbacks to this implementation of the fields Fp.&lt;br /&gt;&lt;br /&gt;First, whenever we want to work over a new prime p, we have to define a new type. 
For example, if we want to work over F101, we are going to have to define a new type for it.&lt;br /&gt;&lt;br /&gt;(For the avoidance of doubt: No, type level Peano arithmetic is not what I'm after.)&lt;br /&gt;&lt;br /&gt;Second, I can't decide at run time what field to work over. For certain applications it would be very useful to choose my p at run time. I could do that with the first implementation, but I can't do it with the second.&lt;br /&gt;&lt;br /&gt;What I really want then, is for Fp and Fq to be different types when p /= q, but to be able to decide at run-time which p to work over. For example, I might have calculated the p I want to work over as the result of a previous calculation. This is impossible in Haskell.&lt;br /&gt;&lt;br /&gt;However, there is a technology called "dependent types", in which what I want is possible. Dependent types means roughly that types can depend upon values. So in a Haskell-like language that supported dependent types, I might be able to write code like the following (not valid Haskell):&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; 3 * 3 :: F 5&lt;br /&gt;4&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Here, the value 5 (not the type T5) is being passed as a type parameter to the type constructor F.&lt;br /&gt;&lt;br /&gt;For a while, I thought this was the holy grail. I even considered defecting from Haskell. I did look into languages that support dependent types, such as Coq. However, I concluded that, sadly, this is not the answer. 
The problem is that - I think - if you have types dependent on values, then you lose the distinction between compile time and run time, which means - I think - that you can't compile the language any more.&lt;br /&gt;&lt;br /&gt;Or does someone else know better?&lt;code&gt;&lt;code&gt;&lt;br /&gt;&lt;/code&gt;&lt;/code&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-3160846306433363350?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/3160846306433363350/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/finite-fields-part-1.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3160846306433363350'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/3160846306433363350'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/finite-fields-part-1.html' title='Finite fields, part 1'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-1586703868034532492</id><published>2009-08-05T21:42:00.000+01:00</published><updated>2009-08-05T21:57:59.241+01:00</updated><title type='text'>Where we've been, and where we're going</title><content type='html'>It feels like about time to take a step back and survey where we've been and where we're going in this blog.&lt;br /&gt;&lt;br /&gt;What I'm trying to do in 
this blog is a kind of guided tour of my &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths library&lt;/a&gt;. In turn, the library itself, although it contains efficient implementations of fundamental algorithms in computer algebra, is really an educational tool. I wrote it to help me understand maths I wanted to learn about, and especially, to help develop my intuitions about the objects of study. My hope is that by retracing my steps in this blog, I can help others to do the same.&lt;br /&gt;&lt;br /&gt;Up to now, we've been looking at graphs and their symmetries, considered as permutation groups. Group theory is sometimes described as the study of symmetry, and I hope that by introducing groups through symmetries of graphs, I've made it clear why.&lt;br /&gt;&lt;br /&gt;We've really only looked at quite a small amount of code. The main functions we've seen (forgetting a few stepping stones along the way) are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;graph (vs,es) - for constructing a graph with given vertices and edges&lt;/li&gt;&lt;li&gt;graphAuts g - returns a strong generating set for the group of symmetries / automorphisms of the graph&lt;/li&gt;&lt;li&gt;orderSGS sgs - given a strong generating set, return the order of the group - for example, the total number of symmetries of the graph&lt;/li&gt;&lt;li&gt;conjClassReps - given a generating set for a group, return representatives and sizes for the conjugacy classes - the classes of elements which are "the same, just viewed from a different angle"&lt;/li&gt;&lt;li&gt;p [[1,2,3],[4,5]] - given a list of cycles, return the corresponding permutation&lt;/li&gt;&lt;li&gt;g*h, 1, g^-1 - multiplication, identity and inverses for group elements&lt;br /&gt;&lt;/li&gt;&lt;li&gt;v .^ g - the action of a permutation on a vertex&lt;/li&gt;&lt;li&gt;e -^ g - the action of a permutation on an edge&lt;/li&gt;&lt;li&gt;sgs gs - return a strong generating set, given a set of 
generators&lt;/li&gt;&lt;/ul&gt;This is the core of the HaskellForMaths code for graphs, graph automorphisms, and permutation groups. Incidentally, I think this code really shows off Haskell to advantage: code to do the same things would be much more complicated in C++, say. I'm especially pleased with the ability to define graphs and groups over arbitrary types, which enables very natural constructions of various classes of graphs and groups. (In this respect, though not in others, I think the library probably beats even dedicated packages such as GAP and MAGMA.)&lt;br /&gt;&lt;br /&gt;However, the graph and permutation group code represents, I would say, only about one sixth of what's in the HaskellForMaths library. (Not to mention that I have quite a lot more code in various stages of development back in the lab.) So what else is there still to look at?&lt;br /&gt;&lt;br /&gt;Well, what I want to do next is stick with permutation groups a little longer, but widen the set of combinatorial objects that we consider, to include finite geometries, designs, and perhaps other incidence structures.&lt;br /&gt;&lt;br /&gt;One of my aims is to take a look at a few of the sporadic finite simple groups. Finite simple groups are the "atoms of symmetry". Most of them fall in one of several infinite families (the cyclic groups of prime order, the alternating groups, and the finite groups of Lie type, aka Chevalley groups). However, it turns out that there are also 26 "sporadic" finite simple groups, which don't belong to any of these families. The HaskellForMaths library contains code for investigating both the infinite families, and some of the smaller sporadic groups.&lt;br /&gt;&lt;br /&gt;Then there are root systems and Coxeter groups, including string rewriting and the Knuth-Bendix algorithm. 
Root systems have some connection to Lie algebras, and are therefore important in physics, though I don't claim to fully understand all that stuff.&lt;br /&gt;&lt;br /&gt;Another major part of the library, which I'll get round to eventually, is commutative algebra and Groebner bases. Groebner bases provide a way of calculating with ideals in polynomial rings, which is equivalent to doing algebraic geometry.&lt;br /&gt;&lt;br /&gt;Finally, there's some code for working in non-commutative algebra. As an example, there's some code which uses Temperley-Lieb and Hecke algebras to calculate polynomial invariants of knots (the Jones polynomial and the HOMFLY polynomial). In due course, I'm hoping to expand this part of the library to do group algebras, representation theory, and other stuff.&lt;br /&gt;&lt;br /&gt;In case I've managed to interest anyone, I should probably mention a few of the books that I used for inspiration:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Ronan, Symmetry and the Monster&lt;/li&gt;&lt;li&gt;Godsil and Royle, Algebraic Graph Theory&lt;/li&gt;&lt;li&gt;Cameron, Combinatorics&lt;/li&gt;&lt;li&gt;Seress, Permutation Group Algorithms&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Holt, Handbook of Computational Group Theory&lt;/li&gt;&lt;/ul&gt;Next time - back to the maths.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-1586703868034532492?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/1586703868034532492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/where-weve-been-and-where-were-going.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1586703868034532492'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1586703868034532492'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/where-weve-been-and-where-were-going.html' title='Where we&apos;ve been, and where we&apos;re going'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-5408584066017567741</id><published>2009-08-01T19:00:00.000+01:00</published><updated>2009-08-01T19:57:17.264+01:00</updated><title type='text'>How to count the number of positions of Rubik's cube</title><content type='html'>Previously on this blog, we have been looking at the symmetries of graphs, which we defined as permutations of the vertices which leave the edges (collectively) in the same places. We think of permutations as active rearrangements of the vertices, rather than as static sequences. For example, we think of the static permutation [2,1,4,3] as the active rearrangement [[1,2],[3,4]], which swaps the 1 and 2 positions, and swaps the 3 and 4 positions. Given two permutations g and h, the active viewpoint enables us to ask what happens when you do g then h, which we think of as multiplication g*h; and how to undo g, which we think of as an inverse g^-1.&lt;br /&gt;&lt;br /&gt;The symmetries of a graph form a &lt;span style="font-style: italic;"&gt;group&lt;/span&gt;. This means that if g and h are symmetries, then so are g*h and g^-1. So a group is a set of permutations which is &lt;span style="font-style: italic;"&gt;closed&lt;/span&gt; under multiplication and inverses.&lt;br /&gt;&lt;br /&gt;We can use permutation groups to study all sorts of things besides graph symmetries. 
As an example, this week I'm going to look at Rubik's cube.&lt;br /&gt;&lt;br /&gt;First, we need to work out a way to represent cube moves as permutations. That's easy - we just number the little squares as follows:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_XQ7FznWBAYE/SmtzDINGV6I/AAAAAAAAADk/ErAbAQG0wGw/s1600-h/rubik.GIF"&gt;&lt;img style="cursor: pointer; width: 320px; height: 241px;" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/SmtzDINGV6I/AAAAAAAAADk/ErAbAQG0wGw/s320/rubik.GIF" alt="" id="BLOGGER_PHOTO_ID_5362506278893934498" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Then a clockwise rotation of the front face (the blue face, with an F in the centre) can be represented as the following permutation:&lt;br /&gt;&lt;br /&gt;&lt;code&gt;f = p [[1,3,9,7],[2,6,8,4],[17,41,33,29],[18,44,32,26],[19,47,31,23]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;This says that f sends the 1 square to the 3 position, 3 to 9, 9 to 7, 7 to 1 (the corners of the blue face); 2 to 6, 6 to 8, 8 to 4, 4 to 2 (the edges of the blue face); together with three other cycles (the corners and edges adjacent to the blue face).&lt;br /&gt;&lt;br /&gt;It's then an easy matter to write down permutations corresponding to the clockwise rotations of the other faces:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;b = p [[51,53,59,57],[52,56,58,54],[11,27,39,43],[12,24,38,46],[13,21,37,49]]&lt;br /&gt;l = p [[21,23,29,27],[22,26,28,24],[ 1,31,59,11],[ 4,34,56,14],[ 7,37,53,17]]&lt;br /&gt;r = p [[41,43,49,47],[42,46,48,44],[ 3,13,57,33],[ 6,16,54,36],[ 9,19,51,39]]&lt;br /&gt;u = p [[11,13,19,17],[12,16,18,14],[ 1,21,51,41],[ 2,22,52,42],[ 3,23,53,43]]&lt;br /&gt;d = p [[31,33,39,37],[32,36,38,34],[ 7,47,57,27],[ 8,48,58,28],[ 9,49,59,29]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Note that I didn't bother to number the centre square in each face, or write down any permutations that move a middle slice. 
For example, we could have written down:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;x = p [[4,44,54,24],[5,45,55,25],[6,46,56,26]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;However, doing x is the same as doing u^-1 * d^-1 and then turning the whole cube round. What we're interested in is the possible positions of the 54 squares relative to one another, so we don't really want to count rotations of the whole cube. For this reason, we leave out rotations of the middle slices.&lt;br /&gt;&lt;br /&gt;Okay, so we have written down six permutations, corresponding to clockwise rotations of the six faces. Hopefully it's clear that these six elements generate the Rubik's cube group. So we are now in a position to perform calculations within the Rubik's cube group.&lt;br /&gt;&lt;br /&gt;Within a group, the order of an element g is defined as the least n&gt;0 such that g^n = 1. In other words, the order of g is the number of times you have to do g to get back where you started. (Recall that the order of a &lt;span style="font-style: italic;"&gt;group&lt;/span&gt; is the number of elements it contains. The order of an element is equal to the order of the group generated by the element, since this is {g, g^2, ... , g^(n-1), g^n = 1}, having n elements.)&lt;br /&gt;&lt;br /&gt;For permutations written in cycle notation, it is easy to see the order by inspection. For example:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The order of g = [[1,2]] is 2 - you have to do g twice to get back where you started.&lt;/li&gt;&lt;li&gt;The order of h = [[1,2,3]] is 3 - you have to do h three times to get back where you started.&lt;/li&gt;&lt;li&gt;The order of k = [[1,2,3],[4,5]] is 6. At k^2, we have [[1,3,2]] - the 4 and 5 are in the correct positions, but the 1, 2, 3 are not. At k^3 we have [[4,5]] - now the 1, 2, 3 are in the correct positions, but the 4 and 5 are not. 
Only at k^6 does everything get back to the starting position.&lt;/li&gt;&lt;/ul&gt;Hopefully it's clear that the order of an element will always be the least common multiple of the lengths of the cycles:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;orderElt g = foldl lcm 1 $ map length $ toCycles g&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;br /&gt;For example, in the Rubik's cube group we have:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; f*l&lt;br /&gt;[[1,3,9,37,53,17,41,33,27,21,23,19,47,59,11],[2,6,8,34,56,14,4],[7,31,29],[18,44,32,28,24,22,26]]&lt;br /&gt;&gt; orderElt it&lt;br /&gt;105&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;That is, if you do a front turn followed by a left turn, you will have to repeat 105 times before you get back to the starting position.&lt;br /&gt;&lt;br /&gt;When we were looking at graph symmetries, we found that it was particularly useful to have our generators in the form of a &lt;span style="font-style: italic;"&gt;strong generating set&lt;/span&gt;, since it was then straightforward, for example, to calculate the order of the group.&lt;br /&gt;&lt;br /&gt;Rubik's cube isn't a graph. 
Luckily, the &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths library&lt;/a&gt; has a function to calculate a strong generating set from any set of generators:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ sgs [f,b,l,r,u,d]&lt;br /&gt;[[1,3,9,7],[2,6,8,4],[17,41,33,29],[18,44,32,26],[19,47,31,23]]&lt;br /&gt;[[1,21,51,41],[2,22,52,42],[3,23,53,43],[11,13,19,17],[12,16,18,14]]&lt;br /&gt;[[1,31,59,11],[4,34,56,14],[7,37,53,17],[21,23,29,27],[22,26,28,24]]&lt;br /&gt;[[2,34,22,26,32,44,16,12,46,38,24],[3,31,47,13,49,37,21],[4,8,6,42,52,54,58,56,18,28,14],[7,9,43,39,27,11,41],[19,29,33,51,57,59,53]]&lt;br /&gt;[[3,13,57,33],[6,16,54,36],[9,19,51,39],[41,43,49,47],[42,46,48,44]]&lt;br /&gt;[[4,38,32,46,16,44,34,24,36,14,12],[6,28,56,48,22,52,26,58,8,54,42],[7,31,29],[9,59,57,51,53,33,37,49,13,21,47,27,39,43,11]]&lt;br /&gt;[[6,52,56,54],[9,53,57,51],[11,39,43,47],[12,24,46,44],[13,33,21,49]]&lt;br /&gt;[[7,47,57,27],[8,48,58,28],[9,49,59,29],[31,33,39,37],[32,36,38,34]]&lt;br /&gt;[[8,56,16,52,54],[9,33,47],[12,46,32,24,42],[13,51,43]]&lt;br /&gt;[[9,59],[12,38],[13,49],[24,46],[27,47],[33,37],[39,43],[51,57],[52,58],[54,56]]&lt;br /&gt;[[11,27,39,43],[12,24,38,46],[13,21,37,49],[51,53,59,57],[52,56,58,54]]&lt;br /&gt;[[11,37,39,43],[13,21,59,49],[16,56,54,58],[24,46,38,42],[27,57,51,53]]&lt;br /&gt;[[12,38,46,42,24],[16,56,52,58,54],[27,37,59],[39,57,49]]&lt;br /&gt;[[12,56,58,36],[16,54],[24,38,48,52],[27,37,59],[39,57,49],[42,46]]&lt;br /&gt;[[13,39,43,57,51,49],[16,24,46,38,42,56,54,58],[27,37,59],[36,48]]&lt;br /&gt;[[13,43,51],[27,37,59]]&lt;br /&gt;[[14,28,38,48,54,16,56],[22,34,58,36,46,42,24],[27,59,37],[39,49,57]]&lt;br /&gt;[[16,56,58],[24,38,42],[27,37,59],[39,57,49]]&lt;br /&gt;[[24,36,56,48],[27,37,59],[38,54,58,46],[39,57,49]]&lt;br /&gt;[[24,46,38],[27,37,59],[39,57,49],[54,58,56]]&lt;br /&gt;[[27,39],[36,48],[37,49],[38,46,58,54],[57,59]]&lt;br /&gt;[[27,59,37],[39,49,57]]&lt;br 
/&gt;[[28,54,58],[34,46,38]]&lt;br /&gt;[[36,38,54],[46,48,58]]&lt;br /&gt;[[36,58,54],[38,46,48]]&lt;br /&gt;[[38,58],[46,54]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Okay, so strong generating sets aren't all that interesting in themselves - although the last few entries in this one are a little bit interesting, as they show that there are moves which just rotate a pair of corners, or a pair of edges.&lt;br /&gt;&lt;br /&gt;Remember though, that SGS allow us to easily calculate the order of a group. In this case:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; orderSGS $ sgs [f,b,l,r,u,d]&lt;br /&gt;43252003274489856000&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;That's about 4 * 10^19 different positions.&lt;br /&gt;&lt;br /&gt;The sgs function, to calculate a strong generating set, uses the Schreier-Sims algorithm. I'll probably explain how it works some time, but not just yet.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-5408584066017567741?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/5408584066017567741/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/how-to-count-number-of-positions-of.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5408584066017567741'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/5408584066017567741'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/08/how-to-count-number-of-positions-of.html' title='How to count the number of positions of Rubik&apos;s 
cube'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_XQ7FznWBAYE/SmtzDINGV6I/AAAAAAAAADk/ErAbAQG0wGw/s72-c/rubik.GIF' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-4167352833128780238</id><published>2009-07-25T20:00:00.000+01:00</published><updated>2009-07-25T21:34:58.947+01:00</updated><title type='text'>Strong generating sets for graph symmetries</title><content type='html'>[New release: HaskellForMaths 0.1.6 &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;here&lt;/a&gt;]&lt;br /&gt;&lt;br /&gt;Previously in this blog, we've been using two functions for finding generators for the group of symmetries of a graph. Both graphAuts2 and graphAuts3 use depth first search to find a transversal generating sequence - the only difference is that graphAuts3 does more pruning of the search tree, and so is faster.&lt;br /&gt;&lt;br /&gt;A transversal generating sequence, remember, means this: If our vertices are labelled 1 to n, then we first try to find a symmetry that takes 1 to 2, then another that takes 1 to 3, and so on up to n; then looking only at those that leave 1 fixed, another which takes 2 to 3, then another which takes 2 to 4, and so on; then looking only at those that fix 1 and 2, another which takes 3 to 4, and so on; and so on. So the answer we get always takes the form of a series of levels - first, a set of symmetries taking 1 to some of [2..n], then a set of symmetries that fix 1 and take 2 to some of [3..n], and so on.&lt;br /&gt;&lt;br /&gt;Okay, so now suppose that we're doing our depth first search. Suppose that we have already found [[1,2],[3,4]], and [[1,3,2]]. 
Then in fact we don't need to search for a symmetry taking 1 to 4, because the two symmetries we already know generate one. Specifically:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; p [[1,3,2]] * p [[1,2],[3,4]]&lt;br /&gt;[[1,4,3]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;So the point is that, as it happens, it was already possible to take 1 to 4 by repeated application of the symmetries we had already found to take 1 to 2 and 3.&lt;br /&gt;&lt;br /&gt;Given a set of group elements, the &lt;span style="font-style: italic;"&gt;orbit&lt;/span&gt; of a vertex is defined as those vertices that we can send it to by repeated application of the elements:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;orbitV gs x = closure [x] [ (.^ g) | g &lt;- gs ] &lt;/code&gt;&lt;br /&gt;&lt;br /&gt;(This is using the closure algorithm that we defined previously. Recall that v .^ g means the image of v after action by g.)&lt;br /&gt;&lt;br /&gt;This idea enables us to write an even faster graphAuts function. As before, we start by looking for symmetries that send 1 to 2, 1 to 3, and so on. However, this time, we don't bother to look for symmetries sending 1 to a vertex which is already in the orbit of the symmetries we have found.&lt;br /&gt;&lt;br /&gt;An example will probably make this clearer. First of all, here is the full transversal generating sequence of symmetries of the cube:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ graphAuts3 $ q 3&lt;br /&gt;[[0,1],[2,3],[4,5],[6,7]]&lt;br /&gt;[[0,2,3,1],[4,6,7,5]]&lt;br /&gt;[[0,3],[4,7]]&lt;br /&gt;[[0,4,6,7,3,1],[2,5]]&lt;br /&gt;[[0,5,3],[2,4,7]]&lt;br /&gt;[[0,6,5,3],[1,2,4,7]]&lt;br /&gt;[[0,7],[1,3],[2,5],[4,6]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,4,2],[3,5,6]]&lt;br /&gt;[[2,4],[3,5]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;The list consists of three levels: symmetries moving 0 to 1, 2, 3, 4, 5, 6, 7; symmetries fixing 0 and moving 1 to 2, 4; and a symmetry fixing 0 and 1 and moving 2 to 4. 
The sequence [0,1,2] is called the &lt;span style="font-style: italic;"&gt;base&lt;/span&gt; for the TGS.&lt;br /&gt;&lt;br /&gt;Now, here's what happens if we skip the search for vertices that are already in the orbit, using a new graphAuts function:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ graphAuts $ q 3&lt;br /&gt;[[0,1],[2,3],[4,5],[6,7]]&lt;br /&gt;[[0,2,3,1],[4,6,7,5]]&lt;br /&gt;[[0,4,6,7,3,1],[2,5]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,4,2],[3,5,6]]&lt;br /&gt;[[2,4],[3,5]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;We still have three levels, with the same base, [0,1,2]. But in the first level, we haven't needed to find as many symmetries. After we found the 0 to 1 and 0 to 2 symmetries, we didn't need to find a 0 to 3 symmetry, because 3 was already in the orbit of the symmetries we had found. Then, after we had found the 0 to 4 symmetry, we didn't need to find the 0 to 5, 6, 7 symmetries.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; orbitV [ p [[0,1],[2,3],[4,5],[6,7]], p [[0,2,3,1],[4,6,7,5]], p [[0,4,6,7,3,1],[2,5]] ] 0&lt;br /&gt;[0,1,2,3,4,5,6,7]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;This new graphAuts function (I won't give the code, as it's really only a minor variation on the graphAuts3 function) is faster still, because we're now pruning the search tree even more.&lt;br /&gt;&lt;br /&gt;However, it has lost the nice property of the graphAuts2 and graphAuts3 functions, that the returned list was a transversal generating sequence. Recall that a TGS made it particularly easy to work out the order of the group, or list its elements.&lt;br /&gt;&lt;br /&gt;However all is not lost. We can easily reconstruct a TGS from the output of the graphAuts function. Consider that first level in the cube symmetries again. In the TGS, we had symmetries taking 0 to 1, 2, 3, 4, 5, 6, 7. The graphAuts function only returned symmetries taking 0 to 1, 2, 4. That was because it turned out that 3, 5, 6, 7 were already in the orbit of 0 under these symmetries. 
To reconstruct the TGS, we need to calculate the orbit of 0 under the 0 to 1, 2, 4 symmetries, but this time keeping track of the group elements as we go. For example, the reason 3 is in the orbit is that:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; p [[0,2,3,1],[4,6,7,5]] * p [[0,1],[2,3],[4,5],[6,7]]&lt;br /&gt;[[0,3],[4,7]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Here's the code. We find the base "bs" by looking at the minimum supports (the least vertex that is moved) of the inputs. We then sort the inputs into levels, using this base. Finally, for each level, we use a modified version of the closure algorithm, that tracks not only where we've got to in the orbit, but also how we got there.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;tgsFromSgs sgs = concatMap transversal bs where&lt;br /&gt;    bs = toListSet $ map minsupp sgs&lt;br /&gt;    transversal b = closure b $ filter ( (b &lt;=) . minsupp ) sgs&lt;br /&gt;    closure b gs = closure' M.empty (M.fromList [(b, 1)]) where&lt;br /&gt;        closure' interior boundary&lt;br /&gt;            | M.null boundary = filter (/=1) $ M.elems interior&lt;br /&gt;            | otherwise =&lt;br /&gt;                 let interior' = M.union interior boundary&lt;br /&gt;                     boundary' = M.fromList [(x .^ g, h*g) | (x,h) &lt;- M.toList boundary, g &lt;- gs] M.\\ interior'&lt;br /&gt;                 in closure' interior' boundary' &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;A set of generators from which we can reconstruct a TGS in this way is called a &lt;span style="font-style: italic;"&gt;strong generating set&lt;/span&gt;, or SGS. (Strictly speaking, an SGS is relative to a base - we've been using the base that is implied by the Ord instance and the minimum supports of the elements.)&lt;br /&gt;&lt;br /&gt;In practice, we prefer to work with SGSs rather than TGSs, because they're shorter. 
Since you can so easily reconstruct a TGS from them, they're just as useful.&lt;br /&gt;&lt;br /&gt;I should admit that I'm sidestepping a few subtleties here. The take-home message is:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;A strong generating set is a set of generators for a group of a particularly useful form&lt;/li&gt;&lt;li&gt;The graphAuts function gives us a strong generating set by construction&lt;/li&gt;&lt;/ul&gt;That means that for symmetries of graphs, we are doing pretty well. We have an efficient algorithm (graphAuts) for finding a strong generating set for the symmetry group.&lt;br /&gt;&lt;br /&gt;However, what we would like is, given just any set of generators for a group, to be able to construct a strong generating set. For that, we will need the Schreier-Sims algorithm. Next time, I'll show why this would be so useful, by looking at Rubik's cube.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5195188167565410449-4167352833128780238?l=haskellformaths.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/4167352833128780238/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/strong-generating-sets-for-graph.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4167352833128780238'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4167352833128780238'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/strong-generating-sets-for-graph.html' title='Strong generating sets for graph 
symmetries'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-6844018435761400454</id><published>2009-07-20T21:25:00.000+01:00</published><updated>2009-07-20T22:07:10.489+01:00</updated><title type='text'>Faster graph symmetries using distance partitions</title><content type='html'>Up to now, we've been using the graphAuts2 function to find us a generating set (in fact a transversal generating sequence) for the symmetries of a graph. Unfortunately, graphAuts2 simply isn't fast enough for some of the larger graphs we would like to investigate. (I'll give an example at the end.) This week therefore, we're going to look at a more efficient way to find graph automorphisms, based on distance partitions.&lt;br /&gt;&lt;br /&gt;Given a graph, the distance between vertices x and y is defined as the length of the shortest path between them - that is, the number of edges on the path. For example, in the cube shown below, the distance between the 0 and 7 vertices is 3. There are several routes, but each route involves passing along at least 3 edges. 
In &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt;, we can use the "distance" function to find this out:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; :load Math.Combinatorics.GraphAuts&lt;br /&gt;&gt; distance (q 3) 0 7&lt;br /&gt;3&lt;br /&gt;&lt;/code&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_XQ7FznWBAYE/SmOEXR-VD5I/AAAAAAAAADU/dMG899_GDaI/s1600-h/cubedistance.GIF"&gt;&lt;img style="cursor: pointer; width: 298px; height: 320px;" src="http://4.bp.blogspot.com/_XQ7FznWBAYE/SmOEXR-VD5I/AAAAAAAAADU/dMG899_GDaI/s320/cubedistance.GIF" alt="" id="BLOGGER_PHOTO_ID_5360273516997709714" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;In the picture, we see that the vertices fall into four levels, depending on whether their distance from 0 is 0, 1, 2, or 3. Thus, distance from a given vertex can be used to partition the vertices in a graph. This is called the distance partition:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; distancePartition (q 3) 0&lt;br /&gt;[[0],[1,2,4],[3,5,6],[7]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;(The implementation of distancePartition is a variant on the closure algorithm that we have seen a couple of times before, and is left as an exercise.)&lt;br /&gt;&lt;br /&gt;Okay, so how do distance partitions help us with finding graph symmetries?&lt;br /&gt;&lt;br /&gt;Well, recall how graphAuts2 works: using depth-first search, it tries to find a symmetry that sends 0 to 1, another that sends 0 to 2, another that sends 0 to 3, and so on; then, looking only at those that fix 0, another that sends 1 to 2, another that sends 1 to 3, and so on; then, looking only at those that fix 0 and 1, another that sends 2 to 3, another that sends 2 to 4, and so on; and so on.&lt;br /&gt;&lt;br /&gt;The first step is to realise that graph symmetries must preserve distance. That is, if g is a symmetry, then we must have: distance x y = distance (x .^ g) (y .^ g). 
So, suppose that we are looking for a graph symmetry that sends 0 to 1, and we're wondering where to send x. Well, we need only consider those y such that distance 0 x = distance 1 y.&lt;br /&gt;&lt;br /&gt;In terms of distance partitions, this means that, once we have decided to map 0 to 1, then we must map each cell in the distance partition of 0 to the corresponding cell in the distance partition of 1.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; distancePartition (q 3) 0&lt;br /&gt;[[0],[1,2,4],[3,5,6],[7]]&lt;br /&gt;&gt; distancePartition (q 3) 1&lt;br /&gt;[[1],[0,3,5],[2,4,7],[6]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Specifically, we must map {1,2,4} to {0,3,5} (though not necessarily in that order), and {3,5,6} to {2,4,7}, and 7 to 6.&lt;br /&gt;&lt;br /&gt;This is already going to cut down our search space considerably, perhaps especially on graphs with many vertices. However, we can go further.&lt;br /&gt;&lt;br /&gt;Suppose now that we are looking for symmetries that send 0 to 0 and 1 to 1. Because they send 0 to 0, they must preserve the cells in the distance partition from 0. So they must send {1,2,4} to {1,2,4} (but not necessarily in that order), {3,5,6} to {3,5,6}, and 7 to 7. Also, because they send 1 to 1, they must preserve the cells in the distance partition from 1. So they must send {0,3,5} to {0,3,5}, {2,4,7} to {2,4,7}, and 6 to 6.&lt;br /&gt;&lt;br /&gt;I think of this as being a bit like triangulation. Suppose we're wondering where to send 2 to. Well, looking at the distance partition from 0, we see that it must go to one of {1,2,4}. On the other hand, looking at the distance partition from 1, we see that it must go to one of {2,4,7}. So that actually narrows it down to {2,4}. (Yes, I know, in this case we could already see that, but you get the idea.) 
If we take another fix from a third point, that will narrow it down even further, and so on.&lt;br /&gt;&lt;br /&gt;So the idea of our new graphAuts3 function will be as follows.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;We will still start by trying to send 0 to 1, 2, 3, 4, 5, 6, 7, and finally 0.&lt;/li&gt;&lt;li&gt;If we're trying to send 0 to x, then when we come to wonder where to send 1, we'll consider only those y with distance x y = distance 0 1. In terms of the distance partition, this means that if 1 falls into cell d of the distance partition from 0, then y must fall into cell d of the distance partition from x.&lt;/li&gt;&lt;li&gt;As we successively decide where to send 1, 2, etc, we will "refine" the cells of the partition by triangulation with the new point.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;The code for refining partitions is very simple:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;refine p1 p2 = concat [ [c1 `intersect` c2 | c2 &lt;- p2] | c1 &lt;- p1] &lt;/code&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; distancePartition (q 3) 0 `refine` distancePartition (q 3) 1&lt;br /&gt;[[],[0],[],[],[1],[],[2,4],[],[],[3,5],[],[6],[],[],[7],[]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;We get quite a few empty lists in the refinement - they're a necessary evil, but we will remove them as we go along.&lt;br /&gt;&lt;br /&gt;Okay, so here's our graphAuts3 function:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;graphAuts3 g@(G vs es) = graphAuts' [] [vs] where&lt;br /&gt;   graphAuts' us ((x:ys):pt) =&lt;br /&gt;       let px = refine (ys : pt) (dps M.! x)&lt;br /&gt;           p y = refine ((x : L.delete y ys) : pt) (dps M.! y)&lt;br /&gt;           uus = zip us us&lt;br /&gt;           p' = L.sort $ filter (not . 
null) $ px&lt;br /&gt;       in concat [take 1 $ dfs ((x,y):uus) px (p y) | y &lt;- ys]&lt;br /&gt;        ++ graphAuts' (x:us) p'&lt;br /&gt;   graphAuts' us ([]:pt) = graphAuts' us pt&lt;br /&gt;   graphAuts' _ [] = []&lt;br /&gt;   dfs xys p1 p2&lt;br /&gt;       | map length p1 /= map length p2 = []&lt;br /&gt;       | otherwise =&lt;br /&gt;            let p1' = filter (not . null) p1&lt;br /&gt;                p2' = filter (not . null) p2&lt;br /&gt;            in if all isSingleton p1'&lt;br /&gt;               then let xys' = xys ++ zip (concat p1') (concat p2')&lt;br /&gt;                    in if isCompatible xys' then [fromPairs' xys'] else []&lt;br /&gt;               else let (x:xs):p1'' = p1'&lt;br /&gt;                        ys:p2'' = p2'&lt;br /&gt;                    in concat [dfs ((x,y):xys)&lt;br /&gt;                                   (refine (xs : p1'') (dps M.! x))&lt;br /&gt;                                   (refine ((L.delete y ys):p2'') (dps M.! y))&lt;br /&gt;                              | y &lt;- ys]&lt;br /&gt;   isCompatible xys = and [([x,x'] `S.member` es') == (L.sort [y,y'] `S.member` es') | (x,y) &lt;- xys, (x',y') &lt;- xys, x &lt; x']&lt;br /&gt;   dps = M.fromList [(v, distancePartition g v) | v &lt;- vs]&lt;br /&gt;   es' = S.fromList es &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;It looks complicated (and perhaps I could tidy it up a little), but what it's doing is really fairly straightforward. Basically, we're still doing depth first search, in levels, as before. However, we're now maintaining two partitions as we go, the source partition p1 and the target partition p2, and we're constrained to map cells in the source partition to the corresponding cells in the target partition. If we ever find that the "shape" (map length) of the source and target partitions are different, then we know we have gone wrong and need to backtrack. Having satisfied ourselves that the shapes are the same, we can remove those pesky empty lists. 
Finally, as soon as we find that every cell is a singleton, we can shortcut any further search - although we still need to check that the implied mapping is a valid symmetry, to avoid false positives.&lt;br /&gt;&lt;br /&gt;Okay, so how about a brief demonstration of its power? Time for a confession - on most of the smallish graphs that we've considered so far, graphAuts3 is actually slightly slower than graphAuts2. However, it's not too difficult to find larger graphs where it wins out.&lt;br /&gt;&lt;br /&gt;The Kneser graphs are defined as follows:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;kneser n k | 2*k &lt;= n = graph (vs,es) where&lt;br /&gt;   vs = combinationsOf k [1..n]&lt;br /&gt;   es = [ [v1,v2] | [v1,v2] &lt;- combinationsOf 2 vs, disjoint v1 v2] &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;So the Kneser graph has as vertices the k-subsets of [1..n], with edges joining subsets which are disjoint.&lt;br /&gt;&lt;br /&gt;We've already met one of them - the Petersen graph is kneser 5 2. Kneser 7 3 is shown below - it has 35 vertices and 70 edges:&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_XQ7FznWBAYE/SmOEXm3SE_I/AAAAAAAAADc/iXGWvXdPxLw/s1600-h/kneser73.gif"&gt;&lt;img style="cursor: pointer; width: 320px; height: 313px;" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/SmOEXm3SE_I/AAAAAAAAADc/iXGWvXdPxLw/s320/kneser73.gif" alt="" id="BLOGGER_PHOTO_ID_5360273522605298674" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;So here's an experiment you can try at home. Compare the running times of graphAuts2 and graphAuts3 on kneser 7 3. On my laptop, graphAuts3 manages to find a transversal generating set of 49 symmetries in less than a second. 
On the other hand, graphAuts2 had only managed to find 1 symmetry after 10 minutes, at which point I gave up.&lt;br /&gt;&lt;br /&gt;Incidentally:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; orderTGS $ graphAuts3 $ kneser 7 3&lt;br /&gt;5040&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;... which is 7 factorial. That's because the action of S 7 on [1..7] induces an action on kneser 7 3, and it turns out that every symmetry of kneser 7 3 arises from an underlying permutation of [1..7].&lt;br /&gt;&lt;br /&gt;Anyway, with graphAuts3, and the orderTGS function from last time (for calculating the number of symmetries, given a transversal generating sequence), we are beginning to have the tools to investigate some larger graphs. However, there is another improvement we can make, which will lead us to strong generating sets and the Schreier-Sims algorithm - next time.</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/6844018435761400454/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/faster-graph-symmetries-using-distance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6844018435761400454'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/6844018435761400454'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/faster-graph-symmetries-using-distance.html' title='Faster graph symmetries using distance 
partitions'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_XQ7FznWBAYE/SmOEXR-VD5I/AAAAAAAAADU/dMG899_GDaI/s72-c/cubedistance.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-1478136751988056645</id><published>2009-07-15T08:01:00.000+01:00</published><updated>2009-07-15T08:17:20.015+01:00</updated><title type='text'>Counting symmetries using transversals</title><content type='html'>Previously, we've been using the &lt;a href="http://hackage.haskell.org/package/HaskellForMaths"&gt;HaskellForMaths&lt;/a&gt; library's graphAuts2 function to find a generating set for the symmetries of a graph. The generating set returned by graphAuts2 has a particularly useful form, which I want to explore this week. 
Let's just remind ourselves how it works.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_XQ7FznWBAYE/SlpAoLzqRLI/AAAAAAAAAC8/9_NyJo1Gat0/s1600-h/cube.GIF"&gt;&lt;img style="cursor: pointer; width: 221px; height: 252px;" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/SlpAoLzqRLI/AAAAAAAAAC8/9_NyJo1Gat0/s320/cube.GIF" alt="" id="BLOGGER_PHOTO_ID_5357665765819040946" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&gt; :load Math.Combinatorics.GraphAuts&lt;br /&gt;&gt; mapM_ print $ graphAuts2 $ q 3&lt;br /&gt;[[0,1],[2,3],[4,5],[6,7]]&lt;br /&gt;[[0,2,3,1],[4,6,7,5]]&lt;br /&gt;[[0,3],[4,7]]&lt;br /&gt;[[0,4,6,7,3,1],[2,5]]&lt;br /&gt;[[0,5,3],[2,4,7]]&lt;br /&gt;[[0,6,5,3],[1,2,4,7]]&lt;br /&gt;[[0,7],[1,3],[2,5],[4,6]]&lt;br /&gt;[[1,2],[5,6]]&lt;br /&gt;[[1,4,2],[3,5,6]]&lt;br /&gt;[[2,4],[3,5]]&lt;/code&gt;&lt;/pre&gt;What graphAuts2 does is this. First, it tries to find a symmetry which sends 0 to 1, another which sends 0 to 2, another which sends 0 to 3, and so on. Then, looking only at symmetries which leave 0 where it is, it tries to find one which sends 1 to 2, another which sends 1 to 3, and so on. Then, leaving 0 and 1 where they are, it tries to find one which sends 2 to 3, another which sends 2 to 4, and so on through the remaining vertices. Most of the time, it won't find &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; the symmetries it's looking for - as in this case.&lt;br /&gt;&lt;br /&gt;Now, a few weeks back I set a puzzle: Given just the list returned by graphAuts2, how can I instantly tell how many symmetries there are in total?&lt;br /&gt;&lt;br /&gt;Well, think of it like this.&lt;br /&gt;&lt;br /&gt;First of all, I'm trying to find all the different places that I can move the 0 vertex to. Hopefully it's obvious that I can move the 0 vertex to any of the eight vertices. 
The first seven elements returned by graphAuts2 are symmetries which move 0 to 1,2,3,4,5,6,7 respectively - plus of course I can always just leave the cube as it is and leave 0 at 0.&lt;br /&gt;&lt;br /&gt;Now, suppose that I have decided to leave 0 where it is. Next, I try to find all the different places that I can move 1 to, but leaving 0 where it is. If you look at the picture, I hope you can see that having fixed 0, our only choices for 1 are 1, 2, and 4. The next two elements returned by graphAuts2 are symmetries which move 1 to 2 and 4 respectively - plus of course, the identity element leaves the cube as it is, so leaves 0 at 0 and 1 at 1.&lt;br /&gt;&lt;br /&gt;Now, suppose that I have decided to leave 0 and 1 where they are. Next I try to find all the different places that I can move 2 to. If you look at the picture, you'll see that if we fix 0 and 1, our only choices for 2 are 2 and 4. The identity leaves 2 where it is, and the last element returned by graphAuts2 moves 2 to 4.&lt;br /&gt;&lt;br /&gt;Now, suppose that we decide to leave 0, 1 and 2 where they are. Next I try to find all the different places I can move 3 to. graphAuts2 didn't return any more elements. What it is telling us is that once I have decided to fix 0, 1, and 2, I have no choice but to fix 3 and all the other vertices too.&lt;br /&gt;&lt;br /&gt;Okay, so how many symmetries are there in total? Well, I had 8 choices for the first vertex, then 3 choices for the second vertex, then 2 choices for the third vertex, then no more choices. So that gives us 8*3*2 = 48. 
Let's just check:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; length $ elts $ graphAuts2 $ q 3&lt;br /&gt;48&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;We can write code to do this for us.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;orderTGS tgs =&lt;br /&gt;    let transversals = map (1:) $ L.groupBy (\g h -&gt; minsupp g == minsupp h) tgs&lt;br /&gt;    in product $ map L.genericLength transversals&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;This code needs a bit of explaining:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;The &lt;span style="font-style: italic;"&gt;order&lt;/span&gt; of a group means the number of elements. That's what we're trying to calculate.&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;TGS&lt;/span&gt; stands for transversal generating set. This is my name for a generating set of the type returned by graphAuts2. (I'll explain the name in a little bit.)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Given a TGS, the first thing we need to do is group the elements into levels based on what I call the minimum support - the least vertex that they move. In the case of the cube, we have three levels - seven elements that move 0, two elements that fix 0 and move 1, and one element that fixes 0 and 1 but moves 2&lt;/li&gt;&lt;li&gt;Next we add the identity element (called 1) to each level, to show that we can also just leave the vertex where it is&lt;/li&gt;&lt;li&gt;Finally, to calculate the order, we multiply the number of elements in each level.&lt;/li&gt;&lt;/ul&gt;&lt;code&gt;&lt;br /&gt;&gt; orderTGS $ graphAuts2 $ q 3&lt;br /&gt;48&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Let's think a little more about how this works. What are transversals, and what's so special about "transversal generating sets"? 
Well, consider the following picture:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_XQ7FznWBAYE/Slzdd_94itI/AAAAAAAAADM/m_AIqjyOBmc/s1600-h/transversals2.JPG"&gt;&lt;img style="cursor: pointer; width: 320px; height: 133px;" src="http://2.bp.blogspot.com/_XQ7FznWBAYE/Slzdd_94itI/AAAAAAAAADM/m_AIqjyOBmc/s320/transversals2.JPG" alt="" id="BLOGGER_PHOTO_ID_5358401164120984274" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;On the left, the 48 dots represent the 48 symmetries of the cube. The eight columns divide the elements according to whether they send 0 to 0, 1, 2, 3, 4, 5, 6, or 7. The red dot in the top left is the identity, which sends 0 to 0. Then graphAuts2 finds us one dot (the blue one) in each column. So we have a representative from each column. This set of representatives is called a transversal.&lt;br /&gt;&lt;br /&gt;Next, on the right, we confine our attention to only the first column, the elements that send 0 to 0. The three blocks divide the elements by whether they send 1 to 1, 2, or 4. The red dot in the top block is again the identity. Then graphAuts2 finds us a blue dot in each of the other two blocks. So we have another transversal.&lt;br /&gt;&lt;br /&gt;Finally (not shown), we divide the top left block by whether the elements send 2 to 2 or 4. In this case we have just two divisions, consisting of a single element each - the identity, and the last element found by graphAuts2. Taken together, these two elements constitute our third transversal.&lt;br /&gt;&lt;br /&gt;So a transversal generating set - it should really be called a transversal generating sequence, because the order matters - is a sequence of transversals (but omitting the identity), starting at the outermost layer and going successively inwards, that generates the group.&lt;br /&gt;&lt;br /&gt;And now it should be totally clear why this method of calculating the order of the group works. 
The only thing which perhaps isn't clear is: how do we know that there is the same number of elements in each column, and in each block, at each stage? (This is clearly required, if the method is to work.) Well, the answer is simply that multiplication by the blue dot in any column gives a one-to-one correspondence between the elements in the left-hand column and the elements in that column (we can undo it by multiplying by the inverse of the blue dot), so the number of elements must be the same.&lt;br /&gt;&lt;br /&gt;As this last remark perhaps suggests, we can also use these transversals to list the elements of a group:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;eltsTGS tgs =&lt;br /&gt;    let transversals = map (1:) $ L.groupBy (\g h -&gt; minsupp g == minsupp h) tgs&lt;br /&gt;    in map product $ sequence transversals&lt;br /&gt;&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;So once again, we construct some transversals by grouping the TGS into levels, and then adding the identity to each level. Then what the last line is saying is, to get an element of the group, just pick an element from each of the transversals (levels), and multiply them together. To get all the elements of the group, consider all possible such choices.&lt;br /&gt;&lt;br /&gt;The reason that transversal generating sets and the orderTGS function are important is that they are the first step towards being able to work with large or very large groups. Previously, if we wanted to know how many elements there are in a group, we would have had to use the "elts" function to generate them all, and then count them. This is okay for small groups, but for graphs with thousands or millions of symmetries, it is not going to be practical. With the orderTGS function, we will be able to find out how many symmetries a graph has, even if it is in the thousands or millions, so long as we can find a TGS.&lt;br /&gt;&lt;br /&gt;Our next stumbling block is that the graphAuts2 function itself is not efficient enough to handle these larger graphs. 
So next time, we'll look at a more efficient way to find a TGS for a graph. After that, we'll look at something called a strong generating set, and an algorithm called the Schreier-Sims algorithm, that can always find us a strong generating set, given only some generators for the group.</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/1478136751988056645/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/counting-symmetries-using-transversals.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1478136751988056645'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/1478136751988056645'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/counting-symmetries-using-transversals.html' title='Counting symmetries using transversals'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_XQ7FznWBAYE/SlpAoLzqRLI/AAAAAAAAAC8/9_NyJo1Gat0/s72-c/cube.GIF' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-4305409184519031168</id><published>2009-07-08T08:03:00.000+01:00</published><updated>2009-07-08T20:45:38.292+01:00</updated><title type='text'>Conjugacy classes, part 
2</title><content type='html'>Last time we saw that the symmetries of a graph can be divided into conjugacy classes, where the elements of each class are somehow the same symmetry, just viewed from a different angle. This time we're going to look at the conjugacy classes of symmetries of a few simple graphs, to get more of a feel for how this works.&lt;br /&gt;&lt;br /&gt;Let's start with the graph of the cube, otherwise known (in graph theory) as q 3. q n has as vertices all points in {0,1}&lt;sup&gt;n&lt;/sup&gt;, with edges between pairs of vertices which differ in only one position:&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;q' k = graph (vs,es) where&lt;br /&gt;    vs = sequence $ replicate k [0,1]&lt;br /&gt;    es = [ [u,v] | [u,v] &lt;- combinationsOf 2 vs, hammingDistance u v == 1 ]&lt;br /&gt;    hammingDistance as bs = length $ filter id $ zipWith (/=) as bs &lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;For example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; :load Math.Combinatorics.GraphAuts&lt;br /&gt;&gt; q' 3&lt;br /&gt;G [[0,0,0],[0,0,1],[0,1,0],[0,1,1],[1,0,0],[1,0,1],[1,1,0],[1,1,1]]&lt;br /&gt;[[[0,0,0],[0,0,1]],[[0,0,0],[0,1,0]],[[0,0,0],[1,0,0]],[[0,0,1],[0,1,1]],&lt;br /&gt;[[0,0,1],[1,0,1]],[[0,1,0],[0,1,1]],[[0,1,0],[1,1,0]],[[0,1,1],[1,1,1]],&lt;br /&gt;[[1,0,0],[1,0,1]],[[1,0,0],[1,1,0]],[[1,0,1],[1,1,1]],[[1,1,0],[1,1,1]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;However, this is a bit hard to read, a jumble of 0s, 1s and square brackets, so we normally work over integers instead, by considering the sequences of 0s and 1s as the bits in the binary representation of a number:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; q 3&lt;br /&gt;G [0,1,2,3,4,5,6,7]&lt;br /&gt;[[0,1],[0,2],[0,4],[1,3],[1,5],[2,3],[2,6],[3,7],[4,5],[4,6],[5,7],[6,7]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_XQ7FznWBAYE/SkvDfny_UII/AAAAAAAAACs/MEniJ0X_RQE/s1600-h/cube.GIF"&gt;&lt;img 
style="cursor: pointer; width: 221px; height: 252px;" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/SkvDfny_UII/AAAAAAAAACs/MEniJ0X_RQE/s320/cube.GIF" alt="" id="BLOGGER_PHOTO_ID_5353587530086174850" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;Okay, so we would like to investigate the symmetries of the cube, and as we saw last time, what we're really interested in is just the different &lt;span style="font-style: italic;"&gt;classes&lt;/span&gt; of symmetry, rather than listing every single symmetry.&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ conjClassReps $ graphAuts2 $ q 3&lt;br /&gt;([],1)&lt;br /&gt;([[0,1],[2,3],[4,5],[6,7]],3)&lt;br /&gt;([[0,1],[2,5],[3,4],[6,7]],6)&lt;br /&gt;([[0,1,3,2],[4,5,7,6]],6)&lt;br /&gt;([[0,1,3,7,6,4],[2,5]],8)&lt;br /&gt;([[0,3],[1,2],[4,7],[5,6]],3)&lt;br /&gt;([[0,3,6,5],[1,2,7,4]],6)&lt;br /&gt;([[0,3,6],[1,7,4]],8)&lt;br /&gt;([[0,3],[4,7]],6)&lt;br /&gt;([[0,7],[1,6],[2,5],[3,4]],1)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;What we need to do is look at the representative of each class, and try to understand what it is doing, so as to be able to give a textual description:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;[] is the identity permutation, which leaves everything where it is.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;[[0,1],[2,3],[4,5],[6,7]] is a reflection in a plane midway between opposing faces. There are three such elements, because there are three pairs of opposing faces.&lt;/li&gt;&lt;li&gt;[[0,1],[2,5],[3,4],[6,7]] is a 180 degree rotation about an axis joining the midpoints of opposite edges. There are six such elements, because there are six pairs of opposite edges.&lt;/li&gt;&lt;li&gt;[[0,1,3,2],[4,5,7,6]] is a 90 degree rotation about an axis through the centres of opposing faces. There are six such elements: for each of the three pairs of opposing faces, a clockwise and an anti-clockwise quarter turn.&lt;/li&gt;&lt;li&gt;[[0,1,3,7,6,4],[2,5]] - this is an interesting one. There is a 6-cycle, so there's some sort of 60 degree rotation going on. 
But there is also a 2-cycle, so some sort of reflection or 180 degree rotation. If you think about it, you'll see that what is happening is that you're holding the cube by opposite corners (2 and 5 here), and spinning it by 60 degrees - but at the same time, reflecting it in the plane midway between.&lt;/li&gt;&lt;li&gt;Exercise: Finish off the list&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Previously we saw that the automorphism group of the complete graph k n is the symmetric group S n.&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_XQ7FznWBAYE/SkvFIrgrcZI/AAAAAAAAAC0/GRcvCquAbc8/s1600-h/k5labelled.GIF"&gt;&lt;img style="cursor: pointer; width: 240px; height: 230px;" src="http://3.bp.blogspot.com/_XQ7FznWBAYE/SkvFIrgrcZI/AAAAAAAAAC0/GRcvCquAbc8/s320/k5labelled.GIF" alt="" id="BLOGGER_PHOTO_ID_5353589334969381266" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;The conjugacy classes of S n are particularly easy to understand. Let's look at an example:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; mapM_ print $ conjClassReps $ _S 5&lt;br /&gt;([],1)&lt;br /&gt;([[1,2]],10)&lt;br /&gt;([[1,2],[3,4]],15)&lt;br /&gt;([[1,2],[3,4,5]],20)&lt;br /&gt;([[1,2,3]],20)&lt;br /&gt;([[1,2,3,4]],30)&lt;br /&gt;([[1,2,3,4,5]],24)&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;k 5 doesn't have an obvious spatial interpretation (unlike the pentagon and the cube that we have considered previously), so we're going to need a more abstract description of the classes.&lt;br /&gt;&lt;br /&gt;Let's have a look at one of the classes:&lt;br /&gt;&lt;code&gt;&gt; conjClass (_S 5) (p [[1,2]])&lt;br /&gt;[[[1,2]],[[1,3]],[[1,4]],[[1,5]],[[2,3]],[[2,4]],[[2,5]],[[3,4]],[[3,5]],[[4,5]]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;There might be a slight surprise here. Surely [[1,2]] and [[1,3]] don't belong in the same class, you might think, because 1 and 2 are next to each other, and 1 and 3 aren't. 
Or to take another example, we have [[1,2,3,4,5]] and [[1,3,5,2,4]] in the same class - surely that can't be right - we saw when looking at c 5 that one is a 1/5 rotation, and the other is a 2/5 rotation.&lt;br /&gt;&lt;br /&gt;Well, the first thing to point out is that we can be misled by the spatial representation of a graph. In the picture of k 5, it might look like the relationship between 1 and 2 is not the same as the relationship between 1 and 3. However, from the point of view of the graph, it is. An alien who saw in graph space rather than in 2-d or 3-d space wouldn't be able to tell the difference.&lt;br /&gt;&lt;br /&gt;Another thing to emphasize is that conjugacy classes are relative to the group you're working in. For example, the elements [[1,2,3,4,5]] and [[1,3,5,2,4]] occur in both S 5 and D 10 (the symmetry group of c 5). In S 5, they are in the same conjugacy class, but in D 10, they are not. Specifically, we have:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; p [[1,2,3,4,5]] ~^ p [[2,3,5,4]]&lt;br /&gt;[[1,3,5,2,4]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;[[2,3,5,4]] is an element of S 5, so [[1,2,3,4,5]] and [[1,3,5,2,4]] are conjugate in S 5. However, [[2,3,5,4]] is not an element of D 10, and nor is any other permutation that conjugates one into the other, so they are not conjugate in D 10.&lt;br /&gt;&lt;br /&gt;We can translate this back to our intuition about conjugate elements "looking the same but from a different angle". We have our alien who sees in graph space. When our alien is sitting in the graph space k 5, then [[2,3,5,4]], being a symmetry of the graph, is a transformation that moves the graph to a different angle. 
On the other hand, in the graph space c 5, [[2,3,5,4]] is not a symmetry - instead of moving the graph to a different angle, it just crumples the graph up.&lt;br /&gt;&lt;br /&gt;Let's look a little more closely at that conjugation operation again:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;&gt; p [[1,2,3,4,5]] ~^ p [[2,3,5,4]]&lt;br /&gt;[[1,3,5,2,4]]&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;Let's compare where we started - [[1,2,3,4,5]] - with where we ended - [[1,3,5,2,4]] - what has changed? Well, the 2 turned into a 3, the 3 turned into a 5, the 5 turned into a 4, and the 4 turned into a 2. So conjugating by [[2,3,5,4]] has the same effect as actually applying [[2,3,5,4]] to each of the numbers in the cycle notation for [[1,2,3,4,5]].&lt;br /&gt;&lt;br /&gt;Why is this? Well, remember that:&lt;br /&gt;&lt;code&gt;&lt;br /&gt;g ~^ h = h^-1 * g * h&lt;br /&gt;&lt;/code&gt;&lt;br /&gt;So if h is [[2,3,5,4]], what this says is:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;First, undo [[2,3,5,4]]. That is, put 3 into the 2 position, 5 into the 3 position, and so on.&lt;/li&gt;&lt;li&gt;Next, do g. But if g says to do something to 2, we'll actually be doing it to 3, which is in the 2 position, and so on.&lt;/li&gt;&lt;li&gt;Finally, do [[2,3,5,4]]. So put whatever's in the 2 position back into the 3 position, put what's in the 3 position back in the 5 position, and so on.&lt;/li&gt;&lt;/ul&gt;For example, if g = [[1,2,3,4,5]], then the overall effect on 3 is: put into the 2 position (h^-1), then put into the 3 position (g), then put into the 5 position (h). Notice the way that 3 gets off the merry-go-round at one point, but gets on again at a different point, because someone rearranged the chairs in the middle.&lt;br /&gt;&lt;br /&gt;Well anyway, I'm sure if you think about it hard enough, you'll get it.&lt;br /&gt;&lt;br /&gt;Getting back to the conjugacy classes of S n, it turns out that a conjugacy class in S n just consists of all elements having the same "cycle shape". 
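&lt;br /&gt;&lt;br /&gt;Here is a rough standalone sketch of these two ideas - relabelling by a permutation, and cycle shape. It doesn't use the HaskellForMaths types: permutations are just lists of cycles, and the names image, relabel and cycleShape are made up for illustration, they are not library functions.&lt;br /&gt;&lt;pre&gt;&lt;code&gt;&lt;br /&gt;import Data.List (sort)&lt;br /&gt;&lt;br /&gt;-- A permutation in cycle notation, fixed points omitted (an assumption of this sketch).&lt;br /&gt;type Cycles = [[Int]]&lt;br /&gt;&lt;br /&gt;-- Where a permutation sends x: the next entry in x's cycle, wrapping round.&lt;br /&gt;-- A point that appears in no cycle stays where it is.&lt;br /&gt;image :: Cycles -&gt; Int -&gt; Int&lt;br /&gt;image cs x = head ([y | c &lt;- cs, (a,y) &lt;- zip c (drop 1 c ++ take 1 c), a == x] ++ [x])&lt;br /&gt;&lt;br /&gt;-- Apply h to each of the numbers in the cycle notation for g.&lt;br /&gt;relabel :: Cycles -&gt; Cycles -&gt; Cycles&lt;br /&gt;relabel h = map (map (image h))&lt;br /&gt;&lt;br /&gt;-- The cycle shape: the sorted list of cycle lengths.&lt;br /&gt;cycleShape :: Cycles -&gt; [Int]&lt;br /&gt;cycleShape = sort . map length&lt;br /&gt;&lt;br /&gt;main :: IO ()&lt;br /&gt;main = do&lt;br /&gt;    let g = [[1,2,3,4,5]]&lt;br /&gt;        h = [[2,3,5,4]]&lt;br /&gt;    print (relabel h g)&lt;br /&gt;    print (cycleShape g == cycleShape (relabel h g)) &lt;/code&gt;&lt;/pre&gt;Running main should print [[1,3,5,2,4]] (the same answer as the conjugation above) and then True, since relabelling the numbers in a cycle can't change the cycle lengths.&lt;br /&gt;&lt;br /&gt;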
For example, the class of [[1,2]] consists of all elements having a single 2-cycle. The class of [[1,2],[3,4]] consists of all elements having two 2-cycles. The class of [[1,2],[3,4,5]] consists of all elements having a 2-cycle and a 3-cycle.&lt;br /&gt;&lt;br /&gt;The reason for this should now be obvious. To get from [[1,2],[3,4,5]] to [[a,b],[c,d,e]], you just need to find the permutation that sends 1 to a, 2 to b, 3 to c, 4 to d, and 5 to e. Since S 5 contains &lt;span style="font-style: italic;"&gt;all&lt;/span&gt; permutations of [1..5], this permutation must be among them.&lt;br /&gt;&lt;br /&gt;Hope that all made sense.</content><link rel='replies' type='application/atom+xml' href='http://haskellformaths.blogspot.com/feeds/4305409184519031168/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/conjugacy-classes-part-2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4305409184519031168'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5195188167565410449/posts/default/4305409184519031168'/><link rel='alternate' type='text/html' href='http://haskellformaths.blogspot.com/2009/07/conjugacy-classes-part-2.html' title='Conjugacy classes, part 2'/><author><name>DavidA</name><uri>http://www.blogger.com/profile/16359932006803389458</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' 
url='http://3.bp.blogspot.com/_XQ7FznWBAYE/SkvDfny_UII/AAAAAAAAACs/MEniJ0X_RQE/s72-c/cube.GIF' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5195188167565410449.post-3016527389892647955</id><published>2009-07-03T08:02:00.000+01:00</published><updated>2009-07-03T15:30:12.199+01:00</updated><title type='text'>Conjugacy classes, part 1</title><content type='html'>(New release: HaskellForMaths 0.1.5 is available &lt;a href="http://hackage.haskell.org/package/HaskellForMaths-0.1.5"&gt;here&lt;/a&gt;, or &lt;a href="http://www.polyomino.f2s.com/haskellformathsv2/HaskellForMathsv2.html"&gt;here&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;Over the last few weeks, we've been looking at the HaskellForMaths code for specifying grap
