Optimal width and height after image rotation

A while ago, a blog reader Fabien sent me some code (you can read it here. Thanks Fabien!). The PHP code is a modification of my image rotation code with some upgrades.

I was looking through his code (French variable names!) and was puzzled by the initial section. I believe he based his code on my code where the resulting image wasn’t clipped after rotation, meaning the whole image was still in the picture/bitmap (though rotated).

In that piece of code, I just used the diagonal length of the image (from top-left corner to bottom-right corner) as the final length and breadth of the resulting image. This gave the simplest resulting image dimension without doing complicated maths calculations (a square in this case).

However, what if you want to know the optimal width and height of the resulting image after rotation? Meaning the best-fit width and height that just manages to contain the resulting rotated image. For that, I need to tell you some basic trigonometry and geometry.

Image rotation, optimal width and height

Suppose you have a rectangle with L as the length and H as the height. It is rotated t angles. I’m not going to explain the maths behind it. It involves complementary angles, supplementary angles, rotation symmetry and trigonometry with sines and cosines. Convince yourself that the diagram is true.

So after rotating t angles, the optimal width is L * cos(t) + H * cos(90 – t)

The optimal height is L * sin(t) + H * sin(90 – t)

Short digression: You might notice that any lengths that lie parallel to the x-axis usually involve cosines, and lengths that lie parallel to the y-axis usually involve sines. It’s just the way trigonometry works.

Now, although the image rotation is carried out with respect to the image’s centre, rotating by the top-left corner will result in the same optimal width and height. Again, this is basic maths so you’ll just have to convince yourself it’s true (and that I don’t really want to explain it…).

But that’s if t is an acute angle. What about other angles?

Image rotation, optimal width and height

For those angles, we just need to calculate the acute angle based on the initial rotation angle. After that, just substitute that calculated acute angle into our formula above. I have absolute confidence in your ability to check which quadrant in the Cartesian coordinate system does your rotation angle lie in.

UPDATE: In case you are unable to view images, if your rotation angle is in the 2nd quadrant, the calculated angle is (180 – t). If in the 3rd quadrant, it’s (t – 180). And if in the 4th quadrant, it’s (360 – t).

In practice, you might still want to pad a couple of pixels around. But that should give you the smallest image dimension which can still contain your rotated image.

Image rotation with bilinear interpolation and no clipping

I wrote an article on image rotation with bilinear interpolation to calculate the pixel information. Commenter Jahn wanted to know if the resulting image could be obtained without clipping. I replied with the required code changes (it was fairly minimal), then I thought it might be useful to you too.

The code was cleaned up a bit, and here it is:

const int cnSizeBuffer = 20;
// 30 deg = PI/6 rad
// rotating clockwise, so it's negative relative to Cartesian quadrants
const double cnAngle = -0.52359877559829887307710723054658;
// use whatever image you fancy
Bitmap bm = new Bitmap("slantedtreetopsky.jpg");
// general iterators
int i, j;
// calculated indices in Cartesian coordinates
int x, y;
double fDistance, fPolarAngle;
// for use in neighbouring indices in Cartesian coordinates
int iFloorX, iCeilingX, iFloorY, iCeilingY;
// calculated indices in Cartesian coordinates with trailing decimals
double fTrueX, fTrueY;
// for interpolation
double fDeltaX, fDeltaY;
// pixel colours
Color clrTopLeft, clrTopRight, clrBottomLeft, clrBottomRight;
// interpolated "top" pixels
double fTopRed, fTopGreen, fTopBlue;
// interpolated "bottom" pixels
double fBottomRed, fBottomGreen, fBottomBlue;
// final interpolated colour components
int iRed, iGreen, iBlue;
int iCentreX, iCentreY;
int iDestCentre;
int iWidth, iHeight;
int iDiagonal;
iWidth = bm.Width;
iHeight = bm.Height;
iDiagonal = (int)(Math.Ceiling(Math.Sqrt((double)(bm.Width * bm.Width + bm.Height * bm.Height)))) + cnSizeBuffer;

iCentreX = iWidth / 2;
iCentreY = iHeight / 2;
iDestCentre = iDiagonal / 2;

Bitmap bmBilinearInterpolation = new Bitmap(iDiagonal, iDiagonal);

for (i = 0; i < iDiagonal; ++i)
{
	for (j = 0; j < iDiagonal; ++j)
	{
		bmBilinearInterpolation.SetPixel(j, i, Color.Black);
	}
}

// assigning pixels of destination image from source image
// with bilinear interpolation
for (i = 0; i < iDiagonal; ++i)
{
	for (j = 0; j < iDiagonal; ++j)
	{
		// convert raster to Cartesian
		x = j - iDestCentre;
		y = iDestCentre - i;

		// convert Cartesian to polar
		fDistance = Math.Sqrt(x * x + y * y);
		fPolarAngle = 0.0;
		if (x == 0)
		{
			if (y == 0)
			{
				// centre of image, no rotation needed
				bmBilinearInterpolation.SetPixel(j, i, bm.GetPixel(iCentreX, iCentreY));
				continue;
			}
			else if (y < 0)
			{
				fPolarAngle = 1.5 * Math.PI;
			}
			else
			{
				fPolarAngle = 0.5 * Math.PI;
			}
		}
		else
		{
			fPolarAngle = Math.Atan2((double)y, (double)x);
		}

		// the crucial rotation part
		// "reverse" rotate, so minus instead of plus
		fPolarAngle -= cnAngle;

		// convert polar to Cartesian
		fTrueX = fDistance * Math.Cos(fPolarAngle);
		fTrueY = fDistance * Math.Sin(fPolarAngle);

		// convert Cartesian to raster
		fTrueX = fTrueX + (double)iCentreX;
		fTrueY = (double)iCentreY - fTrueY;

		iFloorX = (int)(Math.Floor(fTrueX));
		iFloorY = (int)(Math.Floor(fTrueY));
		iCeilingX = (int)(Math.Ceiling(fTrueX));
		iCeilingY = (int)(Math.Ceiling(fTrueY));

		// check bounds
		if (iFloorX < 0 || iCeilingX < 0 || iFloorX >= iWidth || iCeilingX >= iWidth || iFloorY < 0 || iCeilingY < 0 || iFloorY >= iHeight || iCeilingY >= iHeight) continue;

		fDeltaX = fTrueX - (double)iFloorX;
		fDeltaY = fTrueY - (double)iFloorY;

		clrTopLeft = bm.GetPixel(iFloorX, iFloorY);
		clrTopRight = bm.GetPixel(iCeilingX, iFloorY);
		clrBottomLeft = bm.GetPixel(iFloorX, iCeilingY);
		clrBottomRight = bm.GetPixel(iCeilingX, iCeilingY);

		// linearly interpolate horizontally between top neighbours
		fTopRed = (1 - fDeltaX) * clrTopLeft.R + fDeltaX * clrTopRight.R;
		fTopGreen = (1 - fDeltaX) * clrTopLeft.G + fDeltaX * clrTopRight.G;
		fTopBlue = (1 - fDeltaX) * clrTopLeft.B + fDeltaX * clrTopRight.B;

		// linearly interpolate horizontally between bottom neighbours
		fBottomRed = (1 - fDeltaX) * clrBottomLeft.R + fDeltaX * clrBottomRight.R;
		fBottomGreen = (1 - fDeltaX) * clrBottomLeft.G + fDeltaX * clrBottomRight.G;
		fBottomBlue = (1 - fDeltaX) * clrBottomLeft.B + fDeltaX * clrBottomRight.B;

		// linearly interpolate vertically between top and bottom interpolated results
		iRed = (int)(Math.Round((1 - fDeltaY) * fTopRed + fDeltaY * fBottomRed));
		iGreen = (int)(Math.Round((1 - fDeltaY) * fTopGreen + fDeltaY * fBottomGreen));
		iBlue = (int)(Math.Round((1 - fDeltaY) * fTopBlue + fDeltaY * fBottomBlue));

		// make sure colour values are valid
		if (iRed < 0) iRed = 0;
		if (iRed > 255) iRed = 255;
		if (iGreen < 0) iGreen = 0;
		if (iGreen > 255) iGreen = 255;
		if (iBlue < 0) iBlue = 0;
		if (iBlue > 255) iBlue = 255;

		bmBilinearInterpolation.SetPixel(j, i, Color.FromArgb(iRed, iGreen, iBlue));
	}
}
bmBilinearInterpolation.Save("rotatedslantedtreetopsky.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);

In the original article, the resulting image had the same dimensions as the source image. This meant that after rotation, some pixel information would be lost. To counter that, we make the dimensions of the resulting image “large enough”.

Considering that this is a rotation operation with an arbitrary angle, when all possible rotations of the source image are placed on top of one another, you get a smudgy circle. Not sure if you can visualise that. So the resulting image must be able to contain a circle, and because our images are stored in rectangular format, we need a square. That’s why in the code, I didn’t use a separate X and Y component of the dimensions, because they’re the same.

The width of the square has to be, at the minimum, the diagonal of the source image (which you can use Pythagoras’ Theorem to calculate). I added a buffer (of 20 pixels) to ensure that our square can contain the resulting image.

Here’s the source image:
Slanted treetop and sky

And here’s the resulting image:
Rotated slanted treetop with sky

Alright, have fun with the code.

Stationary camera, moving scene

Previously, we talked about revolving the entire 3D scene about the camera, and also the problem of the camera looking directly downwards. Today, we’ll look at the mechanics of implementing that stationary camera (it ain’t pretty).

There are 2 transformations to take care of: translation and rotation. Translation takes care of the distance between the camera and the point it’s looking at. Rotation takes care of simulating the camera turning around to look at objects, roughly speaking. Let me use a 2D version to illustrate the concept.

Reverse translation and rotation of 2D scene

Suppose the camera is at some arbitrary position looking at an object. Based on the positions of the camera and the object, you can find the distance between them. You know, with this:
d = sqrt( (cx-ox)^2 + (cy-oy)^2 )
where cx and cy are the x-coordinate and y-coordinate of the camera respectively, and ox and oy are the x-coordinate and y-coordinate of the object respectively.

The camera is looking at the object, so the angle (theta) of its line of sight with respect to the (for example) x-axis can be calculated.

Suppose we want the stationary camera to look in the direction of the positive y-axis, and be positioned at the origin (0,0). To make the scene viewed through a stationary camera the same as that in the original version (the default by the 3D engine), we would rotate the entire scene (90 – theta) degrees, then translate the result of that d units along the positive y-axis.

Remember that order of transformations is important. Rotating first then translating, is (generally) different from translating then rotating.

So that’s the general idea of making a stationary camera work, by moving and rotating the entire scene. The fun part comes because it’s in 3D.

The distance calculation still holds true:
d = sqrt(x^2 + y^2 + z^2)

The angle… not so much. Because it’s in 3D, I adopted spherical coordinates. The radius would simply be the distance calculated previously. But there are now 2 angles to calculate, theta and phi.

Spherical coordinate angles

Suppose the camera is at (a,b,c) and the viewed object is at (p,q,r). We make the viewed object the centre of our attention, so we start our calculations with the object at the origin. Therefore, the camera is at (a-p, b-q, c-r).

We can calculate the distance between them as
d = sqrt( (a-p)^2 + (b-q)^2 + (c-r)^2 )

Then we also solve for the following set of simultaneous equations (note I’m using y-axis as the “upward” axis)
x = r * sin(theta) * sin(phi)
y = r * cos(phi)
z = r * cos(theta) * sin(phi)

==>

a-p = d * sin(theta) * sin(phi)
b-q = d * cos(phi)
c-r = d * cos(theta) * sin(phi)

to look for the angles theta and phi, where
0 <= theta <= 2*PI 0 <= phi < PI Once found, the rendering occurs by rotating the entire scene phi degrees about the positive z-axis (starting from negative y-axis as 0 degrees), then rotate about the positive y-axis (starting from the positive z-axis as 0 degrees), then translate by (-a,-b,-c) (this moves the entire scene away from the camera positioned at the origin). Well, that was a lot of trouble. What was I trying to solve again? Oh yeah, that looking down and losing the "up" vector problem. Notice anything wrong in this implementation? The "up" vector of the camera was never considered. But figuring out all the math was fun... if only it solved something too... *sigh* [Note: all use of "degrees" in this article can be substituted with "radians", depending on your situation. Use accordingly.]

Rotate backwards to stay level

It was a university programming assignment. I was to write an OpenGL program to render a Ferris wheel. The requirements were simple. There had to be 7 spokes emanating from the centre, each at an equal angle from each other. At the end of each spoke, there was to be a carriage. No outer rim was required. All 7 spokes and 7 carriages were to be simple cuboids. The wheel was to turn slowly. Colour aesthetics up to the individual.

I’ve already had lessons on simple rotation and translation operations in OpenGL. Ambient colouring, materials and shading were also taught. And simple cuboids were like basic rendering stuff.

The hard part that my fellow students found was in keeping the carriages level, while rotating the Ferris wheel.

My professor, being the evil mind that he was, chose 7 spokes, so the angle between each spoke was “weird” (no nice number). Believe it or not, that confused a heck of a lot of students… Rendering the spokes were easy. Render a long cuboid with one end at the origin, and rotate multiples of 360/7 degrees. The carriages on the other hand, needed some work…

A simple way of orienting the Ferris wheel is to align it with the XY plane, with the centre of the wheel at the origin. There are then 2 methods to render the carriages. The first is to calculate the XY coordinates of centres of all the carriages, and simply translate them there. Yes, there will be sines and cosines in the calculation. I’ll leave it to you as an exercise. If you were able to follow the article on bilinear interpolation in image rotation, you can do this.

The second method is to just use the rendering engine’s in-built functions. For example, you render a vertical spoke with one end at the origin, and the other end along the positive Y-axis. Then you render a carriage at the latter end of the spoke.

Then what do you do? Render the 2nd spoke-carriage combination exactly the same as the 1st, but rotate the whole thing 360/7 degrees clockwise. Here’s where the problem comes. Since the carriage is “tied” to the spoke, the rotation operation affects the carriage as well.

To keep the carriage level, you have to undo the rotation operation. How do you do that? Rotate the carriage in the other direction with the same angle.

Let’s leave the spokes out of the picture. To render a carriage in the correct position, at the correct angle, this is the series of steps to take (assuming the carriage is at the origin):

  • Rotate -i * (360/7) degrees (anti-clockwise)
  • Translate len units in positive Y direction
  • Rotate i * (360/7) degrees (clockwise)

where i is the number of multiples required, and len is the length of the spoke.

If you’re following this with OpenGL or DirectX, take care. That series of steps have to be reversed, because the 2 rendering engines apply the transformations in reverse order.

Hmm… that was a long story…

The one about chicken heads

Did you know chickens have this ability to keep their heads stable, even if their bodies are moving? Check this video out.

I believe chickens use a similar principle as discussed in the Ferris wheel. For example, if a chicken’s body was moved forwards (in the direction of its head), to keep its head in the same position, it has to move its head backwards.

A fluttering thought

To stay the same in the face of change, one must replicate and execute the change in the opposite manner.

Sort of like Newton’s First and Third Laws combined.

I was just thinking, in the face of changes in these times, staying the same is actually more tiring than going with the flow. I mean, you expend the exact same amount of effort to stay the same, and you have to expend more to improve (on business, on technology and so on). That doesn’t quite make sense…

It’s a cliche, I know, but the only constant in life is change. Expect it. Embrace it. Besides, staying the same is boring at some level…

A simple experiment

To convince yourself of the “backwards” principle discussed in the rendering of a Ferris wheel, try the following.

  • Stand up straight, face and body forwards
  • Turn your body, from the shoulders down, clockwise
  • You must keep your head still facing the same direction as at the start

Did you have to turn your head anti-clockwise to keep facing the same original direction?

Rotating a matrix cannot be done with matrix multiplication

Note that this is different from rotation matrices in our previous discussion on transformation matrices. We were rotating (3D) objects and vertices (points) then. We’re talking about rotating a matrix here.

I read this article by Raymond Chen discussing the rotation of a 2 dimensional array (which is equivalent to a matrix in our case). In it, he stated:

The punch line for people who actually know matrix algebra: Matrix multiplication doesn’t solve the problem anyway.

Yeah, I’m one of those people.

I’ve never quite thought about it before, so I decided to explore it further. Why can’t matrix multiplication be used?

Before we go into that, let’s look at a reference link in the above article from which this whole topic came about. In it, Chris Williams (the author) gave some code for rotating a matrix. I’m not sure what he referred to by “left” and “right” turn because I feel it’s a bit ambiguous.

Duty calls by XKCD

Anyway, the code on the left turn is wrong. This is what’s given:

' For LEFT turns

For Y = 0 to 3

    For X = 0 to 3

        Destination(Y,X) = Source(X,Y)

    Next

Next

That is the algorithm for transposing a matrix.

He also gave code for the “right” turns, which is correct. I prefer to have “messy” indices on the right side of the assignment. To each his own…

Anyway, here’s what I came up with:

const int cnSize = 4;
int[,] Source = new int[cnSize, cnSize];
int[,] Destination = new int[cnSize, cnSize];
int i, j;

Console.WriteLine("Source matrix:");
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Source[i, j] = i * cnSize + (j + 1);
        Console.Write("{0:d2} ", Source[i, j]);
        Destination[i, j] = -1;
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'clockwise turn' formula");
// given left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using corrected 'clockwise turn' formula");
// correct given left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, cnSize - 1 - i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'anticlockwise turn' formula");
// given right turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[cnSize - 1 - j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("End of program");
Console.ReadLine();

I said you'd have to get used to nested for loops, didn't I? *smile* The output looks like this:

Rotating a matrix

Ok, back to the issue at hand. Let me phrase the question as "Is there a general transformation matrix that rotates a square matrix with size N (N > 1) clockwise?" I'm going to try answering that question using proof by contradiction.

Suppose there is such a transformation matrix. Without loss of generality, we'll assume N to be 2. So there is a 2 by 2 matrix A such that

[ A(0,0)  A(0,1) ]  [ a  b ]  =  [ c  a ]
[ A(1,0)  A(1,1) ]  [ c  d ]     [ d  b ]

Let's look at the top left and top right entries of the resulting matrix, which gives us two simultaneous equations:
A(0,0)a + A(0,1)c = c
A(0,0)b + A(0,1)d = a

Taking the 1st equation, we have
A(0,0)a = c - A(0,1)c

Dividing both sides by a, we have
A(0,0) = (c/a) * (1 - A(0,1))

You might find this ok, but take a look at the (c/a) part. This assumes that a is non-zero. Think about that. Our general transformation matrix assumes that the top left entry "a" to be rotated is non-zero. Hmm... Let's continue for a bit.

Substituting the value of A(0,0) into the 2nd equation, we have
b*(c/a)*(1 - A(0,1)) + A(0,1)d = a

Do the algebraic simplifications, and we'll get this
A(0,1) = (a^2 - bc) / (da - bc)

Take a look at the denominator. This assumes that (da - bc) is non-zero. If you have some knowledge of matrices, this is the determinant of the matrix.

So, our general transformation matrix assumes that the top left entry is non-zero and the determinant of the 2 by 2 matrix to be rotated is non-zero. Do you see problems yet? And we're not even looking at the other 2 simultaneous equations yet...

We have arrived at a contradiction. Our "general" transformation matrix isn't general at all. There are hidden assumptions. This means there's no such general transformation matrix for rotating a matrix.

Q.E.D.

I feel my proof given above is kinda weak. Maybe you can come up with a stronger proof?