## Rotating a matrix cannot be done with matrix multiplication

Note that this is different from rotation matrices in our previous discussion on transformation matrices. We were rotating (3D) objects and vertices (points) then. We’re talking about rotating a matrix here.

I read this article by Raymond Chen discussing the rotation of a two-dimensional array (which is equivalent to a matrix in our case). In it, he stated:

> The punch line for people who actually know matrix algebra: Matrix multiplication doesn’t solve the problem anyway.

Yeah, I’m one of those people.

I’ve never quite thought about it before, so I decided to explore it further. Why can’t matrix multiplication be used?

Before we go into that, let’s look at a reference link in the above article, from which this whole topic came about. In it, Chris Williams (the author) gave some code for rotating a matrix. I’m not sure what he meant by a “left” and a “right” turn; the terms feel a bit ambiguous to me.

Anyway, the code for the left turn is wrong. This is what’s given:

```
' For LEFT turns
For Y = 0 to 3
    For X = 0 to 3
        Destination(Y,X) = Source(X,Y)
    Next
Next
```

That is the algorithm for transposing a matrix.

He also gave code for the “right” turn, which is correct. I prefer to have “messy” indices on the right side of the assignment. To each his own…

Anyway, here’s what I came up with:

```
const int cnSize = 4;
int[,] Source = new int[cnSize, cnSize];
int[,] Destination = new int[cnSize, cnSize];
int i, j;

Console.WriteLine("Source matrix:");
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Source[i, j] = i * cnSize + (j + 1);
        Console.Write("{0:d2} ", Source[i, j]);
        Destination[i, j] = -1;
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'clockwise turn' formula");
// given left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using corrected 'clockwise turn' formula");
// corrected left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, cnSize - 1 - i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'anticlockwise turn' formula");
// given right turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[cnSize - 1 - j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("End of program");
```

I said you'd have to get used to nested for loops, didn't I? *smile* The output looks like this:

Ok, back to the issue at hand. Let me phrase the question as "Is there a general transformation matrix that rotates a square matrix with size N (N > 1) clockwise?" I'm going to try answering that question using proof by contradiction.

Suppose there is such a transformation matrix. Without loss of generality, we'll assume N to be 2. So there is a 2 by 2 matrix A such that

```
[ A(0,0)  A(0,1) ]  [ a  b ]  =  [ c  a ]
[ A(1,0)  A(1,1) ]  [ c  d ]     [ d  b ]
```

Let's look at the top left and top right entries of the resulting matrix, which gives us two simultaneous equations:
A(0,0)a + A(0,1)c = c
A(0,0)b + A(0,1)d = a

Taking the 1st equation, we have
A(0,0)a = c - A(0,1)c

Dividing both sides by a, we have
A(0,0) = (c/a) * (1 - A(0,1))

You might find this ok, but take a look at the (c/a) part. This assumes that a is non-zero. Think about that. Our “general” transformation matrix assumes that the top left entry a of the matrix to be rotated is non-zero. Hmm… Let’s continue for a bit.

Substituting the value of A(0,0) into the 2nd equation, we have
b*(c/a)*(1 - A(0,1)) + A(0,1)d = a

Do the algebraic simplifications, and we'll get this
A(0,1) = (a^2 - bc) / (da - bc)

Take a look at the denominator. This assumes that (da - bc) is non-zero. If you have some knowledge of matrices, this is the determinant of the matrix.

So, our general transformation matrix assumes that the top left entry is non-zero and the determinant of the 2 by 2 matrix to be rotated is non-zero. Do you see problems yet? And we're not even looking at the other 2 simultaneous equations yet...

We have arrived at a contradiction. Our “general” transformation matrix isn’t general at all: its entries depend on the very matrix it’s supposed to rotate, and it breaks down whenever a is zero or the determinant is zero. This means there’s no such general transformation matrix for rotating a matrix.

Q.E.D.

I feel my proof given above is kinda weak. Maybe you can come up with a stronger proof?
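
Here's one sketch of a tighter argument, checked numerically (the matrices below are my own examples): if a single matrix A worked for every source matrix, then A multiplied by the identity must give A itself, which forces A to be the clockwise rotation of the identity, i.e. the anti-identity [0 1; 1 0]. But that candidate merely swaps the rows of whatever it multiplies, which is not a clockwise rotation in general.

```csharp
using System;

// If A * M = rotate(M) held for every M, then taking M = I gives
// A * I = A, so A must be rotate(I) = [0 1; 1 0] (the anti-identity).
int[,] A = { { 0, 1 }, { 1, 0 } };

// Test the forced candidate on M = [1 2; 3 4].
int[,] M = { { 1, 2 }, { 3, 4 } };

// A * M just swaps the rows of M, giving [3 4; 1 2]
int[,] product = new int[2, 2];
for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
        for (int k = 0; k < 2; ++k)
            product[i, j] += A[i, k] * M[k, j];

// The clockwise rotation of M is [3 1; 4 2]
int[,] rotated = new int[2, 2];
for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
        rotated[j, 2 - 1 - i] = M[i, j];

Console.WriteLine("A*M top-right: {0}, rotate(M) top-right: {1}",
    product[0, 1], rotated[0, 1]);
```

The two top-right entries disagree (4 versus 1), so the only possible candidate for A fails, and no single matrix can do the job.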

## Matrix multiplication code

The following code illustrates the matrix multiplication discussed previously. For simplicity’s sake, I’m limiting the size of the matrices to 3.

```
const int cnSize = 3;
int[,] A = new int[cnSize, cnSize];
int[,] B = new int[cnSize, cnSize];
int[,] C = new int[cnSize, cnSize];
int[] x = new int[cnSize];
int[] y = new int[cnSize];
Random rand = new Random();
int i, j, k;

// fill matrices and vector with random values
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        A[i, j] = rand.Next(1, 10);
        B[i, j] = rand.Next(1, 10);
    }
    x[i] = rand.Next(1, 10);
}

// matrix-vector multiplication
for (i = 0; i < cnSize; ++i)
{
    y[i] = 0;
    for (k = 0; k < cnSize; ++k)
    {
        y[i] += A[i, k] * x[k];
    }
}

// matrix-matrix multiplication
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        C[i, j] = 0;
        for (k = 0; k < cnSize; ++k)
        {
            C[i, j] += A[i, k] * B[k, j];
        }
    }
}

Console.WriteLine("Matrix-vector multiplication");
for (i = 0; i < cnSize; ++i)
{
    Console.Write("[");
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0} ", A[i, j].ToString().PadLeft(3));
    }
    Console.WriteLine("][{0}] {1} [{2}]", x[i].ToString().PadLeft(3), ((cnSize / 2) == i ? "=" : " "), y[i].ToString().PadLeft(3));
}
Console.WriteLine();

Console.WriteLine("Matrix-matrix multiplication");
for (i = 0; i < cnSize; ++i)
{
    Console.Write("[");
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0} ", A[i, j].ToString().PadLeft(3));
    }
    Console.Write("][");
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0} ", B[i, j].ToString().PadLeft(3));
    }
    Console.Write("] {0} [", ((cnSize / 2) == i ? "=" : " "));
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0} ", C[i, j].ToString().PadLeft(3));
    }
    Console.WriteLine("]");
}
Console.WriteLine();
```

You will notice there are a lot of nested `for` loops. Get used to it. Here's a screenshot of the output:

Exercise: Explain what this does. (A ternary operator refresher might help.)

```
((cnSize / 2) == i ? "=" : " ")
```

## Matrices for programmers

Following the fine tradition of the colour theory post, you are getting another crash course. This time, a lesson in matrices. You’re going to be fine. And yes, I’ll hold your hand while you do this. *smile*

For those who are mathematically inclined, we’ll be working in the realm of real numbers (which I talked about briefly when discussing floating points). Let’s start with…

### Scalars

Scalars are simply numbers. For example, 2 is a scalar. So are 3.14159 and 1.618. And so is -273.15. Bonus points if you can figure out why those numbers are special.

Scalars are stored as normal variables in code. Your `int`s, `float`s, `double`s come in handy.

Scalars are typically denoted by a lowercase letter, such as a, b or c.

### Vectors

Vectors are series of scalars. For example, [1 3 5 7 9] is a vector.

You typically store vectors as an array. For example,

```
int[] v = new int[] { 1, 3, 5, 7, 9 };
```

Vectors are typically denoted by a lowercase letter in bold, such as **v**.

### Matrices

Matrices are series of series of scalars, or series of vectors. In code, they are typically stored as two-dimensional arrays.

```
int[,] A = new int[3, 3];
```

In that sense, matrices are multidimensional (here, two-dimensional) arrays. The dimension of a matrix is m-by-n, where m is the number of rows and n is the number of columns.

When either m or n is 1, we get a vector. So a vector is a special case of a matrix. And because of this, we have to define…

### Row and column vectors

It’s easier to just show you how they look.

A row vector is a matrix where the number of rows is 1. A column vector is a matrix where the number of columns is 1. While we’re at it, a scalar can be thought of as a matrix where the number of rows and columns are both 1.

For our purposes of working towards 3D programming, we’ll be focusing on the column vector. It doesn’t matter which one we use when coding, but in terms of notation, we’ll be using column vectors. You will see why later on.

### Matrix entries

Individual entries are referred to with the notation A[i,j] (or a_(i,j)), where A is the matrix, i is the i-th row and j is the j-th column. Typically, we have
1 <= i <= m and 1 <= j <= n, where m is the number of rows and n is the number of columns. Take note, because you'll be using them in code. Know how your programming language indexes arrays: if it starts at the 0-th element, shift everything down by one, so 0 <= i <= m-1 and 0 <= j <= n-1. The 0 index has tripped many a programmer, so be careful.

### Square matrices

This is a special case where both the number of rows and number of columns are equal. For example, a 3 by 3 matrix, or a 4 by 4 matrix.

In a stroke of coincidence, we will also be focusing on 3 by 3 and 4 by 4 matrices. Hint: It’s because we’re working in 3D.

### Identity matrix

In math, there is a number such that when you multiply anything by it, you get back the same thing. It’s the number 1. For example, 8 * 1 = 1 * 8 = 8.

We have the same concept for matrices. There is a matrix such that when you multiply any matrix by it, you get the same original matrix back. It’s called the identity matrix, typically denoted by an uppercase “I”.
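
As a quick sketch in code (the loop below is my own illustration), the identity matrix has 1s down the diagonal and 0s everywhere else:

```csharp
using System;

const int n = 3;
int[,] I = new int[n, n];
for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < n; ++j)
    {
        // 1 on the diagonal, 0 everywhere else
        I[i, j] = (i == j) ? 1 : 0;
        Console.Write("{0} ", I[i, j]);
    }
    Console.WriteLine();
}
```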

We’ll look at matrix operations soon.

### Zero matrix

You know the multiplicative identity described above when defining the identity matrix? Guess what, there’s also a number such that when you add anything to it, you get back the same thing. It’s the number 0. For example, 8 + 0 = 0 + 8 = 8.

Similarly, we have the zero matrix. It’s simply a matrix with zeroes in all its entries. It’s denoted by a big gigantic 0. Probably not quite useful to you, but nevertheless, you now know something more.

### Symmetrical matrices

Symmetrical matrices are symmetrical about the diagonal. Where’s the diagonal? Look at this:

For a 3 by 3 matrix with values:
a b c
d e f
g h i

Entries a, e and i form the diagonal. Notation-wise, A[i,i] are the diagonal entries.

If a matrix has zeroes in entries below the diagonal, it’s known as an upper triangular matrix. In our case, d = g = h = 0.

Similarly, if a matrix has zeroes in entries above the diagonal, it’s known as a lower triangular matrix. In our case, b = c = f = 0.

What symmetry means in this case is b=d, c=g and f=h. The general formula is
A[i,j] = A[j,i]

To speed up computations when checking symmetry, some algorithms use
A[i,j] = A[j,i], where i < j
The extra condition leaves out the diagonal and the entries below it. No point double-checking values, right?
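
Here’s a minimal sketch of that check (the sample matrix is my own):

```csharp
using System;

int[,] A = {
    { 1, 2, 3 },
    { 2, 5, 6 },
    { 3, 6, 9 }
};

bool symmetric = true;
int n = A.GetLength(0);
// Only check entries above the diagonal (i < j); the diagonal
// trivially equals itself, and the lower triangle mirrors the upper.
for (int i = 0; i < n && symmetric; ++i)
{
    for (int j = i + 1; j < n; ++j)
    {
        if (A[i, j] != A[j, i])
        {
            symmetric = false;
            break;
        }
    }
}
Console.WriteLine(symmetric ? "Symmetric" : "Not symmetric");
```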

### Transpose of a matrix

Now that we know what a symmetrical matrix and its diagonal are, we can define the transpose of a matrix. What you do is simply flip the matrix about its diagonal.

For a matrix A whose values are:
a b c
d e f
g h i

Its transpose is:
a d g
b e h
c f i

The transpose of a matrix A is denoted by A^T. So if A = A^T, then A is a symmetrical matrix.

Yes, we’re dealing with square matrices. Rectangular matrices aren’t useful for our purposes in 3D programming, and you’re welcome to research their practical uses (try “operations research”).
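
In code, flipping about the diagonal is a pair of nested loops (again, the sample matrix is mine):

```csharp
using System;

const int n = 3;
int[,] A = {
    { 1, 2, 3 },
    { 4, 5, 6 },
    { 7, 8, 9 }
};
int[,] T = new int[n, n];

// Row i, column j of A becomes row j, column i of T
for (int i = 0; i < n; ++i)
    for (int j = 0; j < n; ++j)
        T[j, i] = A[i, j];

for (int i = 0; i < n; ++i)
{
    for (int j = 0; j < n; ++j)
        Console.Write("{0} ", T[i, j]);
    Console.WriteLine();
}
```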

### The inverse of a matrix

The inverse of a square matrix A is denoted by A^-1, where
AA^-1 = A^-1A = I

Yes, I know I still haven’t covered matrix multiplication. Just go with it a little longer…

For a matrix product AB, its inverse is
(AB)^-1 = B^-1A^-1

Then this looks beautiful:
(AB)^-1AB
= B^-1A^-1AB
= B^-1IB
= B^-1B
= I

Don’t you think that looks beautiful? *smile*

### Matrix equality

Matrices A and B are said to be equal if every corresponding entry of the two matrices is equal. In notation, A[i,j] = B[i,j] for all i and j.

A matrix C is said to be the sum of matrices A and B if
C[i,j] = A[i,j] + B[i,j] for all i and j.

Subtraction is similar. Let me show you a scalar example:
8 - 5 = 8 + (-5)
The same thing goes for matrices. The negative sign is “pushed in” to the individual entries.
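
The sum formula translates directly into code (a minimal sketch with my own sample matrices):

```csharp
using System;

int[,] A = { { 1, 2 }, { 3, 4 } };
int[,] B = { { 5, 6 }, { 7, 8 } };
int[,] C = new int[2, 2];

// C[i,j] = A[i,j] + B[i,j] for all i and j
for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
        C[i, j] = A[i, j] + B[i, j];

Console.WriteLine("{0} {1}", C[0, 0], C[0, 1]); // 6 8
Console.WriteLine("{0} {1}", C[1, 0], C[1, 1]); // 10 12
```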

### Matrix multiplication by scalar

Let’s multiply matrices by scalars first. It’s easy.

Just multiply the scalar with all the entries in the matrix.
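
In code, that’s one nested loop (a sketch with my own sample values):

```csharp
using System;

int s = 3;
int[,] A = { { 1, 2 }, { 3, 4 } };

// Multiply every entry of the matrix by the scalar
for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
        A[i, j] *= s;

Console.WriteLine("{0} {1}", A[0, 0], A[0, 1]); // 3 6
Console.WriteLine("{0} {1}", A[1, 0], A[1, 1]); // 9 12
```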

### Matrix multiplication by vector

This one’s a little more complicated. For our purposes, we are concerned with multiplying a matrix A by a column vector v. Yes, the order and the type of vector matters. Let’s look at a diagram.

The result is a column vector. Suppose we multiply matrix A by column vector x to get a column vector y. With 0-based indices and n columns, the general formula is
y[i] = A[i,0] * x[0] + A[i,1] * x[1] + … + A[i,n-1] * x[n-1]

It would look much more concise if I could use summation notation… BUT, I’m trying to simplify things for you. Hopefully, you can visualise how it works with the diagram. I’ll write another post with code to explain this.

You can’t multiply a matrix by a row vector though. Hopefully through the diagram, you’ll see why it doesn’t work. What happens is, you multiply each row of the matrix by the values down the column vector. Since a row vector only has one value “down the column”, it doesn’t make sense to multiply matrices by row vectors.

You can multiply a row vector by a matrix to get a row vector. But it’s not useful for our purposes. If you understand a little about 3D transformations, then A is a transformation matrix, and x is a vertex. For example, A could be a translation matrix and moves x to point y. If you don’t understand any of this, relax, we’ll get there together soon.

### Matrix and matrix multiplication

This is complicated to show and explain, but once you get the idea, it’s actually easy to code.

I’ll leave it to you to figure out the general formula… It’s similar to the one for matrix-by-vector multiplication, only with more vectors. *smile* This is what I do at university: write out a’s and subscripts, and summation notations, in my lecture notes and tutorial questions…

I’ll write another post explaining this (together with the matrix by vector multiplication) with code to illustrate the use.

In terms of 3D transformations, you could have a bunch of transformations done, say you rotate something, then translate (move) it. So you have something like TRx, where R is the rotation matrix, T is the translation matrix and x is the vertex.

Note the order. The earlier (in order) a transformation is done, the closer it is to the vertex in question. Basically you reverse the order of transformations when implementing.

Since we’re at it, matrix multiplication is not commutative. What that means is that, in general,
AB != BA
The order is important.
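
A quick numerical check with my own sample matrices:

```csharp
using System;

int[,] A = { { 1, 2 }, { 3, 4 } };
int[,] B = { { 0, 1 }, { 1, 0 } };
int[,] AB = new int[2, 2];
int[,] BA = new int[2, 2];

for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
        for (int k = 0; k < 2; ++k)
        {
            AB[i, j] += A[i, k] * B[k, j];
            BA[i, j] += B[i, k] * A[k, j];
        }

// AB swaps the columns of A; BA swaps the rows of A
Console.WriteLine("AB top-left: {0}, BA top-left: {1}", AB[0, 0], BA[0, 0]);
```

Multiplying by B on the right swaps columns, while multiplying by B on the left swaps rows, so the two products differ.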

As an exercise, visualise the difference between moving something then rotate, and rotate then move.

### End of crash course

Whew… *wipe sweat* How’re you doing? Still with me?

Good. This sets the foundation you need for understanding 3D programming. Yay! Review what you’ve read, do some research if needed, and I’ll see you next time.