Bezier curves prefer tea

My maths professor was hammering on the fact that Citroen used Bezier curves to make sure their cars have aesthetically pleasing curves. Again. (This is not a sponsored post from the automaker).

While I appreciate his effort in trying to make what I was learning relevant to the real world, I had kinda gotten the idea that Citroen used Bezier curves in their design process. Right about the 3rd tutorial lesson.

My professor then went on to give us homework. “Us” meaning 5 of us. It was an honours degree course. It wasn’t like there was a stampede to take post-graduate advanced maths lessons, you know.

Oh yes, homework. My professor, with wisdom acquired over years of teaching, gave a blend of theoretical and calculation-based questions. Any question containing the words “prove”, “justify” or “show” was probably a theoretical question. Calculation-based questions are like “What is 1 + 1?”. Everyone, at least theoretically (haha!), should be able to do the calculation-based questions. The theoretical questions required more thinking (“Prove that such and such Bezier curve is actually equal to such and such.”).

My friend, who took the course with me, loved calculation-based questions. She’d sit patiently and hammer at the numbers and the calculator. I can’t say I love them. My professor once gave a question that amounted to solving a system of 5 linear equations with 5 unknowns, which amounted to solving a 5 by 5 matrix. By hand. (It involves 15 divisions, 50 multiplications and 50 subtractions. There’s a reason why linear algebra and numerical methods were pre-requisites) I wanted to scream in frustration, throw my foolscap paper at him, and strangle him. Not necessarily in that order.

This coming from someone who is fine with writing a C program that does memory allocations (using the malloc function, and then manually freeing the allocated pointers. We didn’t have garbage collection, ok?) to simulate an N-sized matrix, and then performs Gauss-Jordan elimination on that matrix. I used that program to solve a 100 by 100 matrix. But I dreaded solving a 5 by 5 matrix by hand.

It probably explains why I remember Bezier curves so much.

Anyway, a while ago, someone sent me a question (through Facebook, of all channels). He asked, for a given “y” value of a Bezier curve, how do you find the “x” value?

That is a question without a simple answer. The answer is, there’s no guarantee there’s only one “x” value. A cubic Bezier curve has a possibility of having 1, 2 or 3 “x” values (given a “y”). Here’s the “worst” case scenario:

Multi x values

So you can have at most 3 “x” values. In the case of the person who asked the question, assuming a single “x” value is not just wrong, but actually dangerous. He was an engineer, working on software that cuts metal (or wood). The software had a Bezier curve in it, which it used to calculate (x,y) coordinate values to direct the laser beam (or whatever cutting tool) to the next point (and thus cut the material).

If a “y” value has multiple “x” values, the software won’t know which “x” value you want. And thus cut the material wrongly.

The only way a Bezier curve gives a single “x” value for each “y” is if it’s monotonically increasing or decreasing. That means for all values a and b such that a <= b, we have f(a) <= f(b) [increasing] or f(a) >= f(b) [decreasing].

Bezier curves don’t work well in the Cartesian plane. They work fine once you’ve used them to calculate values and then transferred those values onto the Cartesian plane. Bezier curves prefer to work with values of t.
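To make that concrete, here’s a minimal sketch (in Python, with made-up control points) of evaluating a cubic Bezier curve in its Bernstein form. Notice the curve is driven by t, not by x or y:

```python
# A minimal sketch (not from the article) of evaluating a cubic Bezier
# curve in its Bernstein form. The control points are made up.

def cubic_bezier(p0, p1, p2, p3, t):
    """Return the (x, y) point on the curve at parameter t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return x, y

# The curve is driven by t; sampling t gives us (x, y) pairs.
points = [cubic_bezier((0, 0), (2, 3), (-1, 3), (1, 0), t / 100) for t in range(101)]
```

Sampling t from 0 to 1 gives you the (x, y) points directly. Asking “what is x for this y” means inverting a cubic, which is exactly where the 1, 2 or 3 answers come from.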

Unwrapping flowers

Kyle McDonald and Golan Levin used some magic software on images of some unsuspecting flowers. They sliced a flower from the centre to its petals. Then grabbing the cut ends, they bunched the petals together and then dragged them out like playing an accordion.

Basically, they were doing some polar and Cartesian conversion on the images. I can’t show you their images, but you can go look at them at their article.

But wait a minute, I realised I had also done something similar… it involved converting fisheye images.

So I copied my own code (I don’t have that conversion code anywhere on my computer but my blog. Oh the weirdness…), and cast my own brand of magic on this unsuspecting jasmine here:
Jasmine
[original image by Thai Jasmine]

And I got this result:
Jasmine unwrapped

I think the unwrapping done by Golan and Kyle looks better.

Image rotation with bilinear interpolation

In this article, I’ll show you how to rotate an image about its centre. 3 assignment methods will be shown:

  • assign source pixels to destination pixels
  • assign destination pixels from source pixels
  • assign destination pixels from source pixels with bilinear interpolation

I’ll show you the code, then the results for comparison. So what’s bilinear interpolation?

Bilinear interpolation

Read up on linear interpolation first if you haven’t done so. “Bilinear” means there are 2 directions to interpolate. Let me illustrate.

Bilinear interpolation

In our case, we’re interpolating between 4 pixels. Visualise each pixel as a single point. Linearly interpolate between the top 2 pixels. Linearly interpolate between the bottom 2 pixels. Then linearly interpolate between the calculated results of the previous two.
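The three lerps can be sketched like this (a Python illustration of the idea, with scalar values standing in for pixel colours):

```python
# A sketch of bilinear interpolation; dx and dy are the fractional
# distances from the top left pixel, each between 0 and 1.

def lerp(a, b, t):
    return (1 - t) * a + t * b

def bilinear(top_left, top_right, bottom_left, bottom_right, dx, dy):
    top = lerp(top_left, top_right, dx)           # interpolate the top pair
    bottom = lerp(bottom_left, bottom_right, dx)  # interpolate the bottom pair
    return lerp(top, bottom, dy)                  # interpolate between those results

print(bilinear(10, 20, 30, 40, 0.5, 0.5))  # → 25.0
```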

You can expand on this concept to get trilinear interpolation.

Trilinear interpolation

LERP is short for linear interpolation. When would trilinear interpolation be useful? Voxels, which are out of scope for this article.

Defining the centre of an image

I’m going to be fuzzy about this. I’m going to just take one pixel in the image and define it as the centre. This pixel is defined as having a horizontal index equal to half of its width (rounded down), and a vertical index equal to half its height (rounded down).

This means the image isn’t rotated about its “true” centre, but with a relatively large size, it won’t matter anyway. It’s not like you’re rotating an image of 5 pixel width and 3 pixel height, right?

The preparation part

The actual code is quite long, so I’m separating it into 4 parts.

  • Initialisation and variable declaration
  • Assigning source pixels to destination pixels
  • Assigning destination pixels from source pixels
  • Assigning destination pixels from source pixels with bilinear interpolation

It’s hard-coded with -30 degrees as the angle of rotation, but you can easily write it into a function.

// 30 deg = PI/6 rad
// rotating clockwise, so it's negative relative to Cartesian quadrants
const double cnAngle = -0.52359877559829887307710723054658;
// use whatever image you fancy
Bitmap bm = new Bitmap("rotationsource.jpg");
// general iterators
int i, j;
// calculated indices in Cartesian coordinates
int x, y;
double fDistance, fPolarAngle;
// for use in neighbouring indices in Cartesian coordinates
int iFloorX, iCeilingX, iFloorY, iCeilingY;
// calculated indices in Cartesian coordinates with trailing decimals
double fTrueX, fTrueY;
// for interpolation
double fDeltaX, fDeltaY;
// pixel colours
Color clrTopLeft, clrTopRight, clrBottomLeft, clrBottomRight;
// interpolated "top" pixels
double fTopRed, fTopGreen, fTopBlue;
// interpolated "bottom" pixels
double fBottomRed, fBottomGreen, fBottomBlue;
// final interpolated colour components
int iRed, iGreen, iBlue;
int iCentreX, iCentreY;
int iWidth, iHeight;
iWidth = bm.Width;
iHeight = bm.Height;

iCentreX = iWidth / 2;
iCentreY = iHeight / 2;

Bitmap bmSourceToDestination = new Bitmap(iWidth, iHeight);
Bitmap bmDestinationFromSource = new Bitmap(iWidth, iHeight);
Bitmap bmBilinearInterpolation = new Bitmap(iWidth, iHeight);

for (i = 0; i < iHeight; ++i)
{
    for (j = 0; j < iWidth; ++j)
    {
        // initialise when "throwing" values
        bmSourceToDestination.SetPixel(j, i, Color.Black);
        // since we're looping, we might as well do for the others
        bmDestinationFromSource.SetPixel(j, i, Color.Black);
        bmBilinearInterpolation.SetPixel(j, i, Color.Black);
    }
}

Some of it might not mean anything to you yet. Just wait for the rest of the code. You might want to read up on converting between raster, Cartesian and polar coordinates first before moving on.

Throwing values from source to destination

// assigning pixels from source image to destination image
for (i = 0; i < iHeight; ++i)
{
    for (j = 0; j < iWidth; ++j)
    {
        // convert raster to Cartesian
        x = j - iCentreX;
        y = iCentreY - i;

        // convert Cartesian to polar
        fDistance = Math.Sqrt(x * x + y * y);
        fPolarAngle = 0.0;
        if (x == 0)
        {
            if (y == 0)
            {
                // centre of image, no rotation needed
                bmSourceToDestination.SetPixel(j, i, bm.GetPixel(j, i));
                continue;
            }
            else if (y < 0)
            {
                fPolarAngle = 1.5 * Math.PI;
            }
            else
            {
                fPolarAngle = 0.5 * Math.PI;
            }
        }
        else
        {
            fPolarAngle = Math.Atan2((double)y, (double)x);
        }

        // the crucial rotation part
        fPolarAngle += cnAngle;

        // convert polar to Cartesian
        x = (int)(Math.Round(fDistance * Math.Cos(fPolarAngle)));
        y = (int)(Math.Round(fDistance * Math.Sin(fPolarAngle)));

        // convert Cartesian to raster
        x = x + iCentreX;
        y = iCentreY - y;

        // check bounds
        if (x < 0 || x >= iWidth || y < 0 || y >= iHeight) continue;

        bmSourceToDestination.SetPixel(x, y, bm.GetPixel(j, i));
    }
}
bmSourceToDestination.Save("rotationsrctodest.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);

It should be fairly easy to read. Note the part about checking for the central pixel of the image. No rotation calculation necessary, so we assign and move to the next pixel. Note also the part about checking boundaries.

Finding values from the source

// assigning pixels of destination image from source image
for (i = 0; i < iHeight; ++i)
{
    for (j = 0; j < iWidth; ++j)
    {
        // convert raster to Cartesian
        x = j - iCentreX;
        y = iCentreY - i;

        // convert Cartesian to polar
        fDistance = Math.Sqrt(x * x + y * y);
        fPolarAngle = 0.0;
        if (x == 0)
        {
            if (y == 0)
            {
                // centre of image, no rotation needed
                bmDestinationFromSource.SetPixel(j, i, bm.GetPixel(j, i));
                continue;
            }
            else if (y < 0)
            {
                fPolarAngle = 1.5 * Math.PI;
            }
            else
            {
                fPolarAngle = 0.5 * Math.PI;
            }
        }
        else
        {
            fPolarAngle = Math.Atan2((double)y, (double)x);
        }

        // the crucial rotation part
        // "reverse" rotate, so minus instead of plus
        fPolarAngle -= cnAngle;

        // convert polar to Cartesian
        x = (int)(Math.Round(fDistance * Math.Cos(fPolarAngle)));
        y = (int)(Math.Round(fDistance * Math.Sin(fPolarAngle)));

        // convert Cartesian to raster
        x = x + iCentreX;
        y = iCentreY - y;

        // check bounds
        if (x < 0 || x >= iWidth || y < 0 || y >= iHeight) continue;

        bmDestinationFromSource.SetPixel(j, i, bm.GetPixel(x, y));
    }
}
bmDestinationFromSource.Save("rotationdestfromsrc.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);

The key difference here is the use of the rotation angle. Instead of adding it, we subtract it. The reason is, in the first method we rotate source pixels 30 degrees clockwise and assign them to destination pixels. Here, for each destination pixel, we fetch the source pixel that’s rotated 30 degrees anticlockwise from it. Either way, we get a destination image that's the source image rotated 30 degrees clockwise.

Rotating the other way

Also compare the assignment, noting the indices:

bmSourceToDestination.SetPixel(x, y, bm.GetPixel(j, i));
bmDestinationFromSource.SetPixel(j, i, bm.GetPixel(x, y));

The x and y variables are calculated, and thus "messy". I prefer my messy indices on the right. There's a practical reason for it too, which will be evident when I show you the rotation results.

Image rotation code with bilinear interpolation

// assigning pixels of destination image from source image
// with bilinear interpolation
for (i = 0; i < iHeight; ++i)
{
    for (j = 0; j < iWidth; ++j)
    {
        // convert raster to Cartesian
        x = j - iCentreX;
        y = iCentreY - i;

        // convert Cartesian to polar
        fDistance = Math.Sqrt(x * x + y * y);
        fPolarAngle = 0.0;
        if (x == 0)
        {
            if (y == 0)
            {
                // centre of image, no rotation needed
                bmBilinearInterpolation.SetPixel(j, i, bm.GetPixel(j, i));
                continue;
            }
            else if (y < 0)
            {
                fPolarAngle = 1.5 * Math.PI;
            }
            else
            {
                fPolarAngle = 0.5 * Math.PI;
            }
        }
        else
        {
            fPolarAngle = Math.Atan2((double)y, (double)x);
        }

        // the crucial rotation part
        // "reverse" rotate, so minus instead of plus
        fPolarAngle -= cnAngle;

        // convert polar to Cartesian
        fTrueX = fDistance * Math.Cos(fPolarAngle);
        fTrueY = fDistance * Math.Sin(fPolarAngle);

        // convert Cartesian to raster
        fTrueX = fTrueX + (double)iCentreX;
        fTrueY = (double)iCentreY - fTrueY;

        iFloorX = (int)(Math.Floor(fTrueX));
        iFloorY = (int)(Math.Floor(fTrueY));
        iCeilingX = (int)(Math.Ceiling(fTrueX));
        iCeilingY = (int)(Math.Ceiling(fTrueY));

        // check bounds
        if (iFloorX < 0 || iCeilingX < 0 || iFloorX >= iWidth || iCeilingX >= iWidth || iFloorY < 0 || iCeilingY < 0 || iFloorY >= iHeight || iCeilingY >= iHeight) continue;

        fDeltaX = fTrueX - (double)iFloorX;
        fDeltaY = fTrueY - (double)iFloorY;

        clrTopLeft = bm.GetPixel(iFloorX, iFloorY);
        clrTopRight = bm.GetPixel(iCeilingX, iFloorY);
        clrBottomLeft = bm.GetPixel(iFloorX, iCeilingY);
        clrBottomRight = bm.GetPixel(iCeilingX, iCeilingY);

        // linearly interpolate horizontally between top neighbours
        fTopRed = (1 - fDeltaX) * clrTopLeft.R + fDeltaX * clrTopRight.R;
        fTopGreen = (1 - fDeltaX) * clrTopLeft.G + fDeltaX * clrTopRight.G;
        fTopBlue = (1 - fDeltaX) * clrTopLeft.B + fDeltaX * clrTopRight.B;

        // linearly interpolate horizontally between bottom neighbours
        fBottomRed = (1 - fDeltaX) * clrBottomLeft.R + fDeltaX * clrBottomRight.R;
        fBottomGreen = (1 - fDeltaX) * clrBottomLeft.G + fDeltaX * clrBottomRight.G;
        fBottomBlue = (1 - fDeltaX) * clrBottomLeft.B + fDeltaX * clrBottomRight.B;

        // linearly interpolate vertically between top and bottom interpolated results
        iRed = (int)(Math.Round((1 - fDeltaY) * fTopRed + fDeltaY * fBottomRed));
        iGreen = (int)(Math.Round((1 - fDeltaY) * fTopGreen + fDeltaY * fBottomGreen));
        iBlue = (int)(Math.Round((1 - fDeltaY) * fTopBlue + fDeltaY * fBottomBlue));

        // make sure colour values are valid
        if (iRed < 0) iRed = 0;
        if (iRed > 255) iRed = 255;
        if (iGreen < 0) iGreen = 0;
        if (iGreen > 255) iGreen = 255;
        if (iBlue < 0) iBlue = 0;
        if (iBlue > 255) iBlue = 255;

        bmBilinearInterpolation.SetPixel(j, i, Color.FromArgb(iRed, iGreen, iBlue));
    }
}
bmBilinearInterpolation.Save("rotationbilinearinterpolation.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);

This part is similar to the destination-from-source part, with a lot more calculations. We have to find the 4 pixels that surround our calculated "true" position. Then we perform linear interpolation on those 4 neighbouring pixels.

We need to interpolate for the red, green and blue components individually. Refer to my article on colour theory for a refresher.

Pictures, pictures!

After doing all that, we're finally done. Let me show you my source image first.

Rotation source

I added the marble cylinder to emphasise image quality. I needed something that's straight (vertically or horizontally) in the source image.

Here's what we get after rotating with the source-to-destination method:

Rotation - source to destination

Note the speckled black pixels dotted all over. This is because some of the destination pixels (which are within the image bounds) were unassigned.

Note also that the specks form patterns. This is due to the sine and cosine functions, and the regularity of pixel width and height. Sine and cosine are periodic functions, and since pixel indices are regular, the calculated results are regular too. Hence, the calculations fail to assign pixel values in a regular pattern.

There might be source pixels that have the same calculated destination pixel (due to sine and cosine and rounding). This also implies that there might be anti-gravity destination pixels that no source pixel can ever be matched to! I haven't verified this, but it seems a possibility.

Anti-gravity destination pixel

Still think you should iterate over the source (image/array) instead of over the destination?
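You can check the hole-punching yourself with a quick sketch (my own, not from the article) that mimics the source-to-destination loop on a hypothetical 100 by 100 grid, and counts the in-bounds destination cells that never get hit:

```python
# My own quick check: rotate every source pixel by -30 degrees the way
# the source-to-destination loop does, and record which destination
# cells actually receive a value.
import math

w, h = 100, 100
angle = -math.pi / 6
cx, cy = w // 2, h // 2

hit = set()
for i in range(h):
    for j in range(w):
        x, y = j - cx, cy - i                 # raster -> Cartesian
        r = math.hypot(x, y)                  # Cartesian -> polar
        theta = math.atan2(y, x) + angle      # the rotation part
        nx = round(r * math.cos(theta)) + cx  # polar -> Cartesian -> raster
        ny = cy - round(r * math.sin(theta))
        if 0 <= nx < w and 0 <= ny < h:
            hit.add((nx, ny))

misses = w * h - len(hit)
print(misses > 0)  # → True; some in-bounds destination cells are never assigned
```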

Next, we have the image result of the destination-from-source method:

Rotation - destination from source

Compare the quality with the source-to-destination part. No missing pixels. It's still sort of grainy though. This is because some of the destination pixels get their values from the same source pixel, so there might be 2 side-by-side destination pixels with the same colour. This gives mini blocks of identical colour in the result, which on the whole, gives an unpolished look.

Now, we have the bilinear interpolation incorporated version.

Rotation - destination from source with bilinear interpolation

It looks smoother, right? Note the straight edge of the marble cylinder. Compare with the image result without bilinear interpolation.

I might do a version with bicubic interpolation for even smoother results, but I feel the bilinear version is good enough for now. Have fun!

[UPDATE: Now, there's a version without the resulting image being clipped.]

If you enjoyed this article and found it useful, please share it with your friends. You should also subscribe to get the latest articles (it's free).

Converting between raster, Cartesian and polar coordinates

As with the article on linear interpolation, this article is in preparation for the upcoming discussion on image rotation. I just realised that if I don’t write this, I’m going to take a long time explaining the image rotation code.

We’ve already covered Cartesian coordinates. So what are raster and polar coordinates?

Raster coordinates

I did some research, and to my surprise, there’s no such thing as a raster coordinate system! So where did this term enter my memory? Hmm…

That’s fine, we’ll define it here then. Raster coordinates are actually very simple.

Raster coordinates

The horizontal axis is similar to the usual x-axis, just only with non-negative values. The vertical axis is an inverted y-axis (so values increase downwards instead of upwards), and also has only non-negative values. I’ve only used raster coordinates with images, so negative indices don’t make sense.

For illustration, suppose w is the width of the image and h is the height of the image (in pixels). Then (0,0) is the top left corner of the image. (w,0) is the top right corner of the image. (0,h) is the bottom left corner of the image. And (w,h) is the bottom right corner of the image.

You’ll encounter raster coordinates when you deal with texture mapping. But that’s another story for another day…

Polar coordinates

Polar coordinates have two components, a length and an angle. 2D Cartesian coordinates also have two components, an x and a y. So what’s the difference?

Polar coordinates

Note that angles are measured from the positive x-axis and increase anti-clockwise. Remember the quadrants of the 2D Cartesian plane? I’ve included an example in the illustration with the line in the 3rd quadrant.

So how are polar coordinates related to Cartesian coordinates?

Polar coordinates and trigonometry

You’ll have to revise your trigonometry lessons. I’ll leave it to you to find out the derivation.

Converting from raster to Cartesian to polar coordinates. And back.

Why would anyone convert from raster to Cartesian to polar, only to convert back from polar to Cartesian to raster? Ahh… let’s look at a diagram.

Coordinate conversion diagram

We can only do proper rotation at the polar coordinate stage. But we start with an image, with raster coordinates. So we convert from (image) raster coordinates to Cartesian, then to polar, do the rotation part, convert back to Cartesian, and then back to raster coordinates.

To determine the formula for raster-Cartesian conversion, let’s look at the four corners of the image. We want
raster (0,0) -> Cartesian (-w/2,h/2)
raster (w,0) -> Cartesian (w/2,h/2)
raster (0,h) -> Cartesian (-w/2,-h/2)
raster (w,h) -> Cartesian (w/2,-h/2)

Based on that, the formula is
x = rasterX - w/2
y = h/2 - rasterY

For Cartesian-polar conversion, we have
r = sqrt(x*x + y*y)
theta = PI/2 if x=0 and y>0
theta = 3*PI/2 if x=0 and y<0
theta = arctan(y/x) otherwise

I don't need to restrict theta to be within the [0,2*PI) interval, even though it's mentioned here.

[short digression]
The square bracket [ means 0 is included in the interval. The round bracket ) means 2*PI is not included. The sine and cosine functions take in any real value, so they automatically wrap 2*PI + 1 radians to 1 radian.
[end digression]

I don’t really need to explain why there are 3 conditional assignments for theta, right?

Alright, fine. x appears as a denominator in the arctan argument, so you need to check for x equal to zero. And if x=0, the angle can either be 90 degrees (positive y-axis) or 270 degrees (negative y-axis). Hence the other 2 conditions.

For polar-Cartesian conversion, we have
x = r * cos(theta)
y = r * sin(theta)

For Cartesian-raster conversion, we have
rasterX = x – w/2
rasterY = h/2 – x
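Here’s the whole conversion chain as a Python sketch (my own function names; w and h are the image dimensions). Note that atan2 takes care of the x = 0 cases for us:

```python
# A sketch of the raster <-> Cartesian <-> polar conversion chain.
import math

def raster_to_cartesian(rx, ry, w, h):
    return rx - w / 2, h / 2 - ry

def cartesian_to_polar(x, y):
    return math.hypot(x, y), math.atan2(y, x)  # atan2 handles x = 0 and the quadrants

def polar_to_cartesian(r, theta):
    return r * math.cos(theta), r * math.sin(theta)

def cartesian_to_raster(x, y, w, h):
    return x + w / 2, h / 2 - y

# Round trip: raster -> Cartesian -> polar -> Cartesian -> raster
x, y = raster_to_cartesian(10, 20, 100, 100)
r, theta = cartesian_to_polar(x, y)
x2, y2 = polar_to_cartesian(r, theta)
rx, ry = cartesian_to_raster(x2, y2, 100, 100)
print(round(rx), round(ry))  # → 10 20
```

A rotation would happen between the polar and polar-to-Cartesian steps, by adding the rotation angle to theta.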

As for some of the edge cases, we’ll look at them when we get to the code. Oh yes, there’s code. Lots of it. Stay tuned.

UPDATE: “I’m dying to look at the code for rotating images with bilinear interpolation! Bring me there!”

Cartesian coordinates and transformation matrices

If you’re doing any work in 3D, you will need to know about the Cartesian coordinate system and transformation matrices. Cartesian coordinates are typically used to represent the world in 3D programming. Transformation matrices are matrices representing operations on 3D points and objects. The typical operations are translation, rotation, scaling.

2 dimensional Cartesian coordinates

You should have seen something like this in your math class:

2D Cartesian coordinates
[original image]

The Roman numerals I, II, III, and IV represent the quadrants of the Cartesian plane. For example, III represents the third quadrant. Not a lot to say here, so moving on…

3 dimensional Cartesian coordinates

And for 3 dimensions, we have this:

3D Cartesian coordinates
[original image]

I don’t quite like the way the z-axis points upward. The idea probably stems from having a piece of paper representing the 2D plane formed by the x and y axes. The paper is placed on a flat horizontal table, and the z-axis sticks right up.

Mathematically speaking, there’s no difference.

However, I find it easier to look at it this way:

Another 3D Cartesian representation

The XY Cartesian plane is upright, representing the screen. The z-axis simply protrudes out of the screen. The viewport can cover all four quadrants of the XY plane. The illustration only covered the first quadrant so I don’t poke your eye out with the z-axis *smile*

There is also something called the right-hand rule, and correspondingly the left-hand rule. The right-hand rule has the z-axis pointing out of the screen, as illustrated above. The left-hand rule has the z-axis pointing into the screen. Observe the right-hand rule:

Right-hand rule

The thumb represents the x-axis, the index finger represents the y-axis and the middle finger represents the z-axis. As for the left-hand rule, we have:

Left-hand rule

We’re looking at the other side of the XY plane, but it’s the same thing. The z-axis points in the other direction. And yes, I have long fingers. My hand span can cover an entire octave on a piano.

What’s the big deal? Your 3D graphics engine might use a certain rule by default, and you must follow it. Otherwise, you could be hunting down errors like why an object doesn’t appear on the screen, only to find the object was behind the camera when you thought it was in front. Your graphics engine should also let you use the other rule if you so choose.

In case you’re wondering, here’s the right-hand rule illustration with the z-axis pointing upwards:

Right-hand rule with z-axis upwards

I still don’t like a skyward-pointing z-axis. It irks me for some reason…

Scaling (or making something larger or smaller)

So how do you enlarge or shrink something in 3D? You apply the scaling matrix. Let’s look at the 2D version:

Scaling a circle in 2D

If your scaling factor is greater than 1, you’re enlarging an object. If your scaling factor is less than 1, you’re shrinking an object. What do you think happens when your scaling factor is 1? Or when your scaling factor is negative?

So what does the scaling factor look like in a scaling matrix?

Scaling matrix 2D

If you don’t know what that means, or don’t know what the result should be like, review the lesson on matrices and the corresponding program code.

You will notice there are separate scaling factors for x- and y- axes. This means you can scale them independently. For example, we have this:

Sphere above water

And we only enlarge along the x-axis:

Enlarge sphere along x-axis

We can also only enlarge along the y-axis:

Enlarge sphere along y-axis

Yeah, I got tired of drawing 2D pictures, so I decided to render some 3D ones. Speaking of which, you should now be able to come up with the 3D version of the scaling matrix. Hint: just add a scaling factor for the z-axis.
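As a sketch of that hint (my own helper functions, with plain Python lists standing in for a real matrix type):

```python
# A 3D scaling matrix with independent factors for the x-, y- and z-axes.

def scale_matrix(sx, sy, sz):
    return [[sx, 0, 0],
            [0, sy, 0],
            [0, 0, sz]]

def apply(m, v):
    # matrix-vector multiplication
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

# Enlarge along the x-axis only, like the sphere pictures above
print(apply(scale_matrix(2, 1, 1), [3, 4, 5]))  # → [6, 4, 5]
```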

Rotating (or spinning till you puke)

This is what a rotation matrix for 2 dimensions looks like:

Rotation matrix 2D

That symbol that looks like an O with a slit in the middle? That’s theta (pronounced th-ay-tuh), a Greek letter. It’s commonly used to represent unknown angles.

I’ll spare you the mathematical derivation of the formula. Just use it.

You can convince yourself with a simple experiment. Use the vector (1,0), or unit vector lying on the x-axis. Plug in 90 degrees for theta and you should get (0,1), the unit vector lying on the y-axis.

That’s anti-clockwise rotation. To rotate clockwise, just negate the angle. So you’ll have -90 degrees.

Depending on the underlying math libraries you use, you might need to use radians instead of degrees (which is typical in most math libraries). I’m sure you’re clever enough to find out the conversion formula for degree-to-radian yourself…
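Here’s that (1,0) experiment as a Python sketch (my own function), including the degree-to-radian conversion:

```python
# 2D rotation about the origin, anti-clockwise for positive angles.
import math

def rotate2d(x, y, degrees):
    theta = math.radians(degrees)  # degree-to-radian conversion
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

nx, ny = rotate2d(1, 0, 90)        # the unit vector on the x-axis, rotated 90 degrees
print(round(nx, 9), round(ny, 9))  # → 0.0 1.0
```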

Now for the hard part. The 3D version of rotation is … a little difficult. You see, what you’ve read above actually rotates about the implied z-axis. Wait, that means you can rotate about the x-axis! And the y-axis! Sacrebleu! You can rotate about any arbitrary axis!

I’ll write another article on this. If you’re into this, then you might want to take a look at this article on 3D rotation. I’ll also touch on a concept where you rotate about one axis and then rotate about another axis. Be prepared for lots of sine’s and cosine’s in the formula. Stop weeping; it’s unseemly of you.

Translating (nothing linguistic here)

What it means is you’re moving points and objects from one position to another. Let’s look at a 1 dimensional example:

Translation in 1 dimension

The squiggly, unstable-looking d-wannabe? It’s the Greek letter delta. Delta-x is the standard notation for “change in x”. In this case “x” refers to distance along the x-axis. We’ll use an easier-to-type version called “dx” for the rest of our discussion.

Translation in 2 dimensions

In 2 dimensions, we have the corresponding dy, for “change in y”. Note that there’s no stopping you from using negative values for dx or dy. In the illustration above, dx and dy are negative.

You’ll have to imagine the case for 3D because the diagram is likely to be messy. But it’s easy to visualise. You just do the same for z-axis.

So what’s the transformation matrix for translation? First, you need to extend your matrix size and vector size by one dimension. The exact reasons are due to affine transformations and homogeneous coordinates (I’ve mentioned them briefly earlier).

Consider this: You have a point (x,y,z) and you want it to be at the new position (x + dx, y + dy, z + dz). The matrix will then look like this:

Translation matrix

Notice that for scaling, the important entries are the diagonal entries. For rotation, there are sine’s and cosine’s and they’re all over the place. But for translation, the “main body” of the matrix is actually an identity matrix. The fun stuff happens in the alleyway column on the extreme right of the matrix.
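Here’s a sketch of that structure (my own helpers): a 4 by 4 matrix acting on a homogeneous point (x, y, z, 1), with the translation living in the rightmost column.

```python
# A 4x4 translation matrix: identity "main body", deltas in the last column.

def translation_matrix(dx, dy, dz):
    return [[1, 0, 0, dx],
            [0, 1, 0, dy],
            [0, 0, 1, dz],
            [0, 0, 0, 1]]

def apply(m, v):
    # matrix-vector multiplication; v is the homogeneous point (x, y, z, 1)
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

# Move the point (1, 1, 1) by (5, -2, 3)
print(apply(translation_matrix(5, -2, 3), [1, 1, 1, 1]))  # → [6, -1, 4, 1]
```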

That reminds me. Because you’ll be using all the transformation matrices together, all matrices must be of the same size. So scaling and rotation matrices need to be 4 by 4 too. Just extend them with zero entries except the bottom right entry, which is 1.

Conclusion

We talked about 2D and 3D Cartesian coordinates. I’ve also shown you the right-hand and left-hand rules. This forms the basis for learning basic transformations such as scaling, rotation and translation.

There are two other interesting transformation matrices: shearing and reflection. I thought what we have is good enough for now. You are free to do your own research. When the situation arises, I’ll talk about those two transformations.


You might find this article on converting between raster, Cartesian and polar coordinates useful.

You might find these books on “coordinate transformation” useful.