Converting between raster, Cartesian and polar coordinates

As with the article on linear interpolation, this article is in preparation for the upcoming discussion on image rotation. I just realised that if I don’t write this, I’m going to take a long time explaining the image rotation code.

We’ve already covered Cartesian coordinates. So what are raster and polar coordinates?

Raster coordinates

I did some research, and to my surprise, there’s no such thing as a raster coordinate system! So where did this term enter my memory? Hmm…

That’s fine, we’ll define it here then. Raster coordinates are actually very simple.

Raster coordinates

The horizontal axis is similar to the usual x-axis, just only with non-negative values. The vertical axis is an inverted y-axis (so values increase downwards instead of upwards), and also has only non-negative values. I’ve only used raster coordinates with images, so negative indices don’t make sense.

For illustration, suppose w is the width of the image and h is the height of the image (in pixels). Then (0,0) is the top left corner of the image. (w,0) is the top right corner of the image. (0,h) is the bottom left corner of the image. And (w,h) is the bottom right corner of the image.

You’ll encounter raster coordinates when you deal with texture mapping. But that’s another story for another day…

Polar coordinates

Polar coordinates have two components, a length and an angle. 2D Cartesian coordinates also have two components, an x and a y. So what’s the difference?

Polar coordinates

Note that angles are measured from the positive x-axis and goes anti-clockwise. Remember the quadrants of the 2D Cartesian plane? I’ve included an example in the illustration with the line in the 3rd quadrant.

So how are polar coordinates related to Cartesian coordinates?

Polar coordinates and trigonometry

You’ll have to revise your trigonometry lessons. I’ll leave it to you to find out the derivation.

Converting from raster to Cartesian to polar coordinates. And back.

Why would anyone convert from raster to Cartesian to polar, only to convert back from polar to Cartesian to raster? Ahh… let’s look at a diagram.

Coordinate conversion diagram

We can only do proper rotation at the polar coordinate stage. But we start with an image, with raster coordinates. So we convert from (image) raster coordinates to Cartesian, then to polar, do the rotation part, convert back to Cartesian, and then back to raster coordinates.

To determine the formula for raster-Cartesian conversion, let’s look at the four corners of the image. We want
raster (0,0) -> Cartesian (-w/2,h/2)
raster (w,0) -> Cartesian (w/2,h/2)
raster (0,h) -> Cartesian (-w/2,-h/2)
raster (w,h) -> Cartesian (w/2,-h/2)

Based on that, the formula is
x = rasterX – w/2
y = h/2 – rasterY

For Cartesian-polar conversion, we have
r = sqrt(x*x + y*y)
theta = PI/2 if x=0 and y>0
theta = 3*PI/2 if x=0 and y<0 theta = arctan(y/x) otherwise I don't need to restrict theta to be within [0,2*PI) interval, even though it's mentioned here.

[short digression]
The square bracket [ means 0 is included in the interval. The round bracket ) means 2*PI is not included. The sine and cosine functions take in any real values. 2*PI + 1 radians automatically wraps to 1 radian by sine and cosine.
[end digression]

I don’t really need to explain why there are 3 conditional assignments for theta, right?

Alright, fine. x is a denominator as a parameter in the arctan function. That means you need to check for x equal to zero. Next, if x=0, the angle can either be 90 degrees (positive y-axis) or 270 degrees (negative y-axis). Hence spawning the other 2 conditions.

For polar-Cartesian conversion, we have
x = r * cos(theta)
y = r * sin(theta)

For Cartesian-raster conversion, we have
rasterX = x – w/2
rasterY = h/2 – x

As for some of the edge cases, we’ll look at them when we get to the code. Oh yes, there’s code. Lot’s of it. Stay tuned.

UPDATE: “I’m dying to look at the code for rotating images with bilinear interpolation! Bring me there!”

Linear and cubic interpolation

Interpolation is a method of calculating a value from a set of given values. We’ll be looking at interpolation with a bias towards image processing, but the theory can be generalised for other purposes. You’ve probably already solved some interpolation problems without knowing it. Let me give you an example.

A distance problem

Suppose there are 3 towns A, B, C and they happen to lie on a straight line, in that order. B is 5 kilometres away from A, and C is 15 kilometres away from A. If you travel one quarter of the way from town B to town C, how far are you from town A?

Distance problem

To solve it, you can figure out the distance between B and C, which is 15 – 5 = 10 km. One quarter of the way means 1/4 * 10 = 2.5 km. Then add the distance between A and B to this and you have 5 + 2.5 = 7.5 km.

Linear interpolation

If you visualise the problem as interpolating between 2 points, then B becomes the point p0 with a value of 5 (km) and C becomes the point p1 with a value of 15 (km). The usual variable used is t, so the generic formula is:
f(t) = (1 – t) * p0 + t * p1, where t lies between 0 and 1 inclusive.

Using this, we have
f(1/4) = (1 – 1/4) * 5 + 1/4 * 15
= 3/4 * 5 + 1/4 * 15
= 7.5

This is linear interpolation. Linearity refers to the power of the variable t, which is 1. Note that there’s no stopping you from using negative values of t or values greater than 1.

Suppose you travelled from B to A one quarter of the distance between B and C. How far are you from town A?
f(-1/4) = (1 – (-1/4)) * 5 + (-1/4) * 15
= 5/4 * 5 – 1/4 * 15
= 2.5

Suppose you travelled from B to C and went past C by a quarter of the way. How far are you from town A?
f(5/4) = (1 – 5/4) * 5 + 5/4 * 15
= -1/4 * 5 + 5/4 * 15
= 17.5

What happens if you get a negative result?
f(-1) = (1 – (-1)) * 5 + (-1) * 15
= 2 * 5 – 15
= -5

It means you’re 5 kilometres away from town A. You’re just in the opposite direction from towns B and C. The calculation result is correct. It’s how you interpret the value.

Applications in image processing

A common operation in image processing is manipulating height maps. Height maps are usually greyscale bitmap files where a white pixel (RGB values are 255 for all 3) is the highest point, and a black pixel (RGB values are 0 for all 3) is the lowest point.

Terrain editor in Bryce

You know enlarging photographs can give you some weird results. What happens is you’re trying to fill in the blanks in a larger image using values from the original image. Where do you think the image editing software comes up with values? Interpolation.

If you think of the red, green and blue values of image pixels as 3 different “height maps”, then you’re just performing interpolation on 3 values. Suppose we’re talking about linear interpolation between two pixels. You’ll interpolate between the red component of the 2 pixels and get a value. Similarly you do it for the green and blue components. The calculated results of the red, green and blue become the interpolated colour.

Cubic Bezier interpolation

There are all kinds of cubic curves available. The Catmull–Rom spline, the non-uniform rational B-spline (NURBS) and I didn’t really want to write anything on the subject after I remember my Hermite splines… I love Bezier curves though, so I thought maybe I can write something with that.

Instead of 2 points used in linear interpolation, cubic interpolation uses 4 points. To illustrate, suppose you’re on an undulating plain with small hills undulating in their usual carefree manner. You’re in between two such (undulating) hills and you want to find out how high you are.

Instead of linear interpolating your way through these two (undulating) hills, the better way will be to interpolate with another 2 (undulating) hills! Ok, I’m stopping with the undulating thing…

Undulating hill interpolation

The Bezier curve equation looks like this:
B(t) = (1-t)^3 * p0 + 3*(1-t)^2 * t * p1 + 3*(1-t)* t^2 * p2 + t^3 * p3
where p0, p1, p2, p3 are the (height) values, and t lies between 0 and 1 inclusive.

You will be between p1 and p2. Let’s also assume that the hills are equidistant from each other. Like the pixels on an image, the hills shall be of equal distance from its neighbour.

Because of this equidistant property, p1 is 0.33 (roughly 1/3) units away from p0, p2 is 0.67 (roughly 2/3) units away from p0 and p3 is 1 unit away from p0.

How do you know what’s the value of t to use? You might be able to calculate the t if you do linear interpolation between p1 and p2. But that t value is different from the t value in the Bezier curve.

Ahhh… once you get the t-linear value, you interpolate with 0.33 and 0.67 to get the t-Bezier value. Confused? Suppose you’re one quarter way from p1 to p2. Your t-linear value is 1/4. Interpolate that with 0.33 and 0.67 to get
f(1/4) = (1 – 1/4) * 0.33 + 1/4 * 0.67
= 0.415

And 0.415 is your t-Bezier value. Voila!

You skipped the quadratic power!

I know. It’s logical to think that there’s a power 2 somewhere. But there isn’t. There is one fundamental flaw with quadratic interpolation. Which segment do you use?

Quadratic interpolation has issues...

In closing

Interpolation is just a method of creating data values using a set of existing data. What those created values mean is up to you to interpret.

In image processing, interpolation can be used to fill in blanks when enlarging an image. It doesn’t guarantee that the enlarged image looks good. Image processing is very much an aesthetic-based operation. I’ll talk a bit more on this when I get to writing code to rotate images.

Featured demo – the.popular.demo

Today’s featured demo is the Popular demo (video link), by Farbrausch. This is one of my favourites, and is one of the few demos I’ve seen that has vocals in it. It’s about 4 minutes in length and 8.2 MB in size (download page at Pouet).

Why is called the popular demo? Because there’s disco. *smile* The central object of attention is a 3D figure model textured with disco ball bling. The music track is upbeat, and I found myself moving to the groove, if you know what I’m saying. You should watch this demo to listen to the music if nothing else. It’s that awesome.

Here are the lyrics to the vocals:

Tonight tonight, it’s all in motion,
Can’t feel the gravity.
Tonight tonight, what is this potion
That makes a fool of me?
Tonight tonight, I’m seeing stars –
I’m blind with love.
Tonight tonight, we’re Venus and Mars
Winking from above.

Tonight tonight, I feel like moving;
Like flying endlessly.
Tonight tonight, the city’s grooving –
They’re dancing all for me.
Tonight tonight, the perfect night,
The only one.
Tonight tonight, I’m holding you tight;
Tomorrow you will be gone…

I was surprised that I understood the lyrics differently. Yes I have my very own Mondegreen! Actually thought I was correct on at least the first verse…

Anyway, remember the 3D figure model? Animating the figure requires some logic. You know, where the feet go, where the elbows bend and so on.

For example, at the start of the demo, the figure is revealed and comes out of the lift/cylinder and walks down the stairs. That should be one “action sequence”. Then the demo smoothly transitions to show the “fr-025” title and the figure is moving again, this time on another action sequence.

Question: In the entire demo, how many distinct action sequences do you count?

Cartesian coordinates and transformation matrices

If you’re doing any work in 3D, you will need to know about the Cartesian coordinate system and transformation matrices. Cartesian coordinates are typically used to represent the world in 3D programming. Transformation matrices are matrices representing operations on 3D points and objects. The typical operations are translation, rotation, scaling.

2 dimensional Cartesian coordinates

You should have seen something like this in your math class:

2D Cartesian coordinates
[original image]

The Roman letters I, II, III, and IV represent the quadrants of the Cartesian plane. For example, III represents the third quadrant. Not a lot to say here, so moving on…

3 dimensional Cartesian coordinates

And for 3 dimensions, we have this:

3D Cartesian coordinates
[original image]

I don’t quite like the way the z-axis points upward. The idea probably stems from having a piece of paper representing the 2D plane formed by the x and y axes. The paper is placed on a flat horizontal table, and the z-axis sticks right up.

Mathematically speaking, there’s no difference.

However, I find it easier to look at it this way:

Another 3D Cartesian representation

The XY Cartesian plane is upright, representing the screen. The z-axis simply protrudes out of the screen. The viewport can cover all four quadrants of the XY plane. The illustration only covered the first quadrant so I don’t poke your eye out with the z-axis *smile*

There is also something called the right-hand rule, and correspondingly the left-hand rule. The right-hand rule has the z-axis pointing out of the screen, as illustrated above. The left-hand rule has the z-axis pointing into the screen. Observe the right-hand rule:

Right-hand rule

The thumb represents the x-axis, the index finger represents the y-axis and the middle finger represents the z-axis. As for the left-hand rule, we have:

Left-hand rule

We’re looking at the other side of the XY plane, but it’s the same thing. The z-axis points in the other direction. And yes, I have long fingers. My hand span can cover an entire octave on a piano.

What’s the big deal? Because your 3D graphics engine might use a certain rule by default, and you must follow. Otherwise, you could be hunting down errors like why an object doesn’t appear on the screen. Because the object was behind the camera when you thought it’s in front. Your selected graphics engine should also allow you to use the other rule if you so choose.

In case you’re wondering, here’s the right-hand rule illustration with the z-axis pointing upwards:

Right-hand rule with z-axis upwards

I still don’t like a skyward-pointing z-axis. It irks me for some reason…

Scaling (or making something larger or smaller)

So how do you enlarge or shrink something in 3D? You apply the scaling matrix. Let’s look at the 2D version:

Scaling a circle in 2D

If your scaling factor is greater than 1, you’re enlarging an object. If your scaling factor is less than 1, you’re shrinking an object. What do you think happens when your scaling factor is 1? Or when your scaling factor is negative?

So how does the scaling factor look like in a scaling matrix?

Scaling matrix 2D

If you don’t know what that means, or don’t know what the result should be like, review the lesson on matrices and the corresponding program code.

You will notice there are separate scaling factors for x- and y- axes. This means you can scale them independently. For example, we have this:

Sphere above water

And we only enlarge along the x-axis:

Enlarge sphere along x-axis

We can also only enlarge along the y-axis:

Enlarge sphere along y-axis

Yeah, I got tired of drawing 2D pictures, so I decided to render some 3D ones. Speaking of which, you should now be able to come up with the 3D version of the scaling matrix. Hint: just add a scaling factor for the z-axis.

Rotating (or spinning till you puke)

This is what a rotation matrix for 2 dimensions looks like:

Rotation matrix 2D

That symbol that looks like an O with a slit in the middle? That’s theta (pronounced th-ay-tuh), a Greek alphabet. It’s commonly used to represent unknown angles.

I’ll spare you the mathematical derivation of the formula. Just use it.

You can convince yourself with a simple experiment. Use the vector (1,0), or unit vector lying on the x-axis. Plug in 90 degrees for theta and you should get (0,1), the unit vector lying on the y-axis.

That’s anti-clockwise rotation. To rotate clockwise, just use the value with a change of sign. So you’ll have -90 degrees.

Depending on the underlying math libraries you use, you might need to use radians instead of degrees (which is typical in most math libraries). I’m sure you’re clever enough to find out the conversion formula for degree-to-radian yourself…

Now for the hard part. The 3D version of rotation is … a little difficult. You see, what you’ve read above actually rotates about the implied z-axis. Wait, that means you can rotate about the x-axis! And the y-axis! Sacrebleu! You can rotate about any arbitrary axis!

I’ll write another article on this. If you’re into this, then you might want to take a look at this article on 3D rotation. I’ll also touch on a concept where you rotate about one axis and then rotate about another axis. Be prepared for lots of sine’s and cosine’s in the formula. Stop weeping; it’s unseemly of you.

Translating (nothing linguistic here)

What it means is you’re moving points and objects from one position to another. Let’s look at a 1 dimensional example:

Translation in 1 dimension

The squiggly unstable looking d-wannabe? It’s the Greek alphabet delta. Delta-x is a standard notation for “change in x”. In this case “x” refers to distance along the x-axis. We’ll use an easier-to-type version called “dx” for our remaining discussion.

Translation in 2 dimensions

In 2 dimensions, we have the corresponding dy, for “change in y”. Note that there’s no stopping you from using negative values for dx or dy. In the illustration above, dx and dy are negative.

You’ll have to imagine the case for 3D because the diagram is likely to be messy. But it’s easy to visualise. You just do the same for z-axis.

So what’s the transformation matrix for translation? First, you need to extend your matrix size and vector size by one dimension. The exact reasons are due to affine transformations and homogeneous coordinates (I’ve mentioned them briefly earlier).

Consider this: You have a point (x,y,z) and you want it to be at the new position (x + dx, y + dy, z + dz). The matrix will then look like this:

Translation matrix

Notice that for scaling, the important entries are the diagonal entries. For rotation, there are sine’s and cosine’s and they’re all over the place. But for translation, the “main body” of the matrix is actually an identity matrix. The fun stuff happens in the alleyway column on the extreme right of the matrix.

That reminds me. Because you’ll be using all the transformation matrices together, all matrices must be of the same size. So scaling and rotation matrices need to be 4 by 4 too. Just extend them with zero entries except the bottom right entry, which is 1.

Conclusion

We talked about 2D and 3D Cartesian coordinates. I’ve also shown you the right-hand and left-hand rules. This forms the basis for learning basic transformations such as scaling, rotation and translation.

There are two other interesting transformation matrices: shearing and reflection. I thought what we have is good enough for now. You are free to do your own research. When the situation arise, I’ll talk about those two transformations.

If you enjoyed this article and found it useful, please share it with your friends. You should also subscribe to get the latest articles (it’s free).

You might find this article on converting between raster, Cartesian and polar coordinates useful.

You might find these books on “coordinate transformation” useful.

Matrices for programmers

Following the fine tradition of the colour theory post, you are getting another crash course. This time, a lesson in matrices. You’re going to be fine. And yes, I’ll hold your hand while you do this. *smile*

For those who are mathematically inclined, we’ll be working in the realm of real numbers (which I talked about briefly when discussing floating points). Let’s start with…

Scalars

Scalars are simply numbers. For example, 2 is a scalar. So is 3.14159 and 1.618. And so is -273.15. Bonus points if you can figure out what those numbers are special for.

Scalars are stored as normal variables in code. Your ints, floats, doubles come in handy.

Scalars are typically denoted by a lowercase alphabet, such as a or b or c.

Vectors

Vectors are series of scalars. For example, [1 3 5 7 9] is a vector.

You typically store vectors as an array. For example,

int[] v = new int[] { 1, 3, 5, 7, 9 };

Vectors are typically denoted by a lowercase alphabet in bold, such as v.

Matrices

Matrices are series of series of scalars, or series of vectors. In code, they are typically stored as an array of arrays.

int[,] A = new int[3, 3];

3 by 3 matrix

Matrices are also known as multidimensional arrays. The dimension of a matrix is m-by-n, where m is the number of rows and n is the number of columns.

When either m or n is 1, we get a vector. So a vector is a special case of a matrix. And because of this, we have to define…

Row and column vectors

It’s easier to just show you how they look like.

Row and column vectors

A row vector is a matrix where the number of rows is 1. A column vector is a matrix where the number of columns is 1. While we’re at it, a scalar can be thought of as a matrix where the number of rows and columns are both 1.

For our purposes of working towards 3D programming, we’ll be focusing on the column vector. It doesn’t matter which one we use when coding, but in terms of notation, we’ll be using column vectors. You will see why later on.

Matrix entries

Individual entries are referred to with the notation A[i,j] (or ai,j) where A is the matrix, i is the i-th row and j is the j-th column. Typically, we have
1 <= i <= m and 1 <= j <= n, where m is the number of rows and n is the number of columns. Take note, because you'll be using them in code. So know how your programming language does indices of arrays. If your language starts with the 0-th element, make sure to shift positions by one less. Then you have 0 <= i <= m-1 and 0 <= j <= n-1. The 0 index has tripped many a programmer, so be careful.

Square matrices

This is a special case where both the number of rows and number of columns are equal. For example, a 3 by 3 matrix, or a 4 by 4 matrix.

In a stroke of coincidence, we will also be focusing on 3 by 3 and 4 by 4 matrices. Hint: It’s because we’re working in 3D.

Identity matrix

In math, there is a number such that when you multiply anything by it, you get back the same thing. It’s the number 1. For example, 8 * 1 = 1 * 8 = 8.

We have the same concept for matrices. There is a matrix such that when you multiply any matrix by it, you get the same original matrix back. It’s called the identity matrix, typically denoted by an uppercase “I”.

Identity matrix

We’ll look at matrix operations soon.

Zero matrix

You know that multiplication unity described above when defining the identity matrix? Guess what, there’s a number such that when you add anything to it, you get back the same thing. It’s the number 0. For example, 8 + 0 = 0 + 8 = 8.

Similarly, we have the zero matrix. It’s simply a matrix with zeroes in all its entries. It’s denoted by a big gigantic 0. Probably not quite useful to you, but nevertheless, you now know something more.

Symmetrical matrices

Symmetrical matrices are symmetrical about the diagonal. Where’s the diagonal? Look at this:

Matrix diagonal

For a 3 by 3 matrix with values:
a b c
d e f
g h i

Entries a, e and i form the diagonal. Notation wise, A[i,i] are the diagonal entries.

If a matrix has zeroes in entries below the diagonal, it’s known as an upper triangular matrix. In our case, d = g = h = 0.

Similarly, if a matrix has zeroes in entries above the diagonal, it’s known as a lower triangular matrix. In our case, b = c = f = 0.

What symmetry means in this case is b=d, c=g and f=h. The general formula is
A[i,j] = A[j,i]

To speed up computations when checking symmetry, some algorithms use
A[i,j] = A[j,i], where i < j The extra condition leaves out the diagonal and entries below the diagonal. No point double checking values, right?

Transpose of a matrix

Now that we know what a symmetrical matrix and its diagonal, we can define the transpose of a matrix. What you do is simply flip the matrix about its diagonal.

For a matrix A whose values are:
a b c
d e f
g h i

Its transpose is:
a d g
b e h
c f i

The transpose of a matrix A is denoted by AT. So if A = AT, A is a symmetrical matrix.

Yes, we’re dealing with square matrices. Rectangular matrices aren’t useful for our purposes in 3D programming, and you’re welcome to research on its practical uses (try “operations research“).

The inverse of a matrix

The inverse of a square matrix A is denoted by A-1, where
AA-1 = A-1A = I

Yes, I know I still haven’t covered matrix multiplication. Just go with it a little longer…

For a matrix product AB, it’s inverse is
(AB)-1 = B-1A-1

Then this looks beautiful:
(AB)-1AB
= B-1A-1AB
= B-1IB
= B-1B
= I

Don’t you think that looks beautiful? *smile*

Matrix equality

Matrices A and B are said to be equal if every corresponding entry of both matrices are equal. In notation, A[i,j] = B[i,j] for all i and j.

Matrix addition (and subtraction)

A matrix C is said to be the sum of matrices A and B if
C[i,j] = A[i,j] + B[i,j] for all i and j.

Subtraction is similar. Let me show you a scalar example.
8 – 5 = 8 + (-5)
Same thing for matrices. The negative sign is “pushed in” to individual entries.

Matrix multiplication by scalar

Let’s multiply matrices by scalars first. It’s easy.

Scalar matrix multiplication

Just multiply the scalar with all the entries in the matrix.

Matrix multiplication by vector

This one’s a little more complicated. For our purposes, we are concerned with multiplying a matrix A by a column vector v. Yes, the order and the type of vector matters. Let’s look at a diagram.

Matrix vector multiplication

The result is a column vector. Suppose we multiply matrix A by column vector x and we get a column vector y, the general formula is
y[i] = A[i,0] * x[0] + A[i,1] * x[1] + … + A[i,n] * x[n]

It actually looks much more concise if I can use the summation notation… BUT, I’m trying to simplify things for you. Hopefully, you can visualise how it works with the diagram. I’ll write another post with code to explain this.

You can’t multiply a matrix by a row vector though. Hopefully through the diagram, you’ll see why it doesn’t work. What happens is, you multiply each row of the matrix by the values down the column vector. Since a row vector only has one value “down the column”, it doesn’t make sense to multiply matrices by row vectors.

You can multiply a row vector by a matrix to get a row vector. But it’s not useful for our purposes. If you understand a little about 3D transformations, then A is a transformation matrix, and x is a vertex. For example, A could be a translation matrix and moves x to point y. If you don’t understand any of this, relax, we’ll get there together soon.

Matrix and matrix multiplication

This is complicated to show and explain, but once you get the idea, it’s actually easy to code.

Matrix by matrix multiplication

I’ll leave it to you to figure out the general formula… It’s similar to the one with matrix by vector multiplication, only with more vectors. *smile* This is what I do in university, write out a’s and subscripts, and summation notations in my lecture notes and tutorial questions…

I’ll write another post explaining this (together with the matrix by vector multiplication) with code to illustrate the use.

In terms of 3D transformations, you could have a bunch of transformations done, say you rotate something, then translate (move) it. So you have something like TRx, where R is the rotation matrix, T is the translation matrix and x is the vertex.

Note the order. The earlier (in order) a transformation is done, the closer it is to the vertex in question. Basically you reverse the order of transformations when implementing.

Since we’re at it, matrix multiplications are not commutative. What it means is that
AB != BA
The order is important.

As an exercise, visualise the difference between moving something then rotate, and rotate then move.

End of crash course

Whew… *wipe sweat* How’re you doing? Still with me?

Good. This sets the foundation you need for understanding 3D programming. Yay! Review what you’ve read, do some research if needed, and I’ll see you next time.

Puzzle of 7 points and 6 straight lines – 2nd solution

This is the second solution to this puzzle: Construct a geometric shape with 7 points such that there are 6 straight lines, and each line must pass through 3 points. The first solution was already discussed last week.

And here’s the construction:

2nd solution to 7 point and 6 line puzzle

The 7 points are labelled A to G. The 6 lines are ADB, AFC, AGE, DGF, DEC and FEB. No, I didn’t label the points so I’ll have AGE and the short forms of the months December and February. It just happened that way…

The construction starts with point A, and you draw two lines down to get points B and C. The lines AB and AC must be of the same length. Then from point B, draw a perpendicular line to meet line AC. That meeting point is F.

From this construction, angles AFB, AFE, CFB and CFE are right angles (90 degrees).

Do the same thing from point C and draw a perpendicular line to meet line AB, and you’ll get point D. Similarly, angles ADC, ADE, BDC and BDE are right angles.

The point E is formed from the cross point of lines DC and FB that was just formed.

Draw a line joining D and F. Draw another line joining A and E. The cross point of lines DF and AE is point G.

The first solution focused on getting the points right, and then forming lines to fit them. This solution focused on constructing the lines, and the required points magically appear.

On hindsight, we didn’t need the right angles to be there. As long as D and F meet the lines AB and AC respectively in the same ratio, the solution is still valid. There are 2 criteria to meet:

  • Lines AB and AC must be of the same length. This allows symmetry.
  • The length ratios AD:AB and AF:AC must be equal. This is dependent on the previous criteria.

[Update] Yes, that is one heck of a correction. 3 criteria:

  • Points B, A, and C don’t form a straight line (they’re not collinear)
  • D is somewhere on the line AB (and D not equal to A nor B)
  • F is somewhere on the line AC (and F not equal to A nor C)

Then follow similar construction steps for points E and G and as Eric puts it, the rest just happens. Thank you Eric for pointing this out.

Ok, 2 solutions were presented. I hope you had fun reading and thinking about the puzzle.

Puzzle of 7 points and 6 straight lines

My friend presented me with a puzzle: Construct a geometric shape with 7 points such that there are 6 straight lines, and each line must pass through 3 points.

My friend came up with a solution, but he wasn’t sure, so he consulted me. Actually there was a “model” answer, but he wanted to find another solution. So we had 2 solutions. I’ll present that model answer here so you can understand the puzzle better.

Equilateral triangle solution

Points A, B and C form an equilateral triangle. Points D, E and F are the midpoints of lines AB, AC and BC respectively. Point G is the centre of the triangle.

Note: 3 points are collinear if they lie on a straight line.

The 7 points are A to G. The 6 lines are formed by ADB, AEC, BFC, AGF, BGE and CGD.

[I’m not going to go too much into math theories and proofs. Besides, I’m not even sure if I can recall them after losing track for so long…]

The first three lines should be easy to understand. D is the midpoint of AB, so ADB is a straight line. So it is for AEC and BFC.

E is the midpoint of AC, and BE is the perpendicular bisector (remember ABC is an equilateral triangle). And G lies on the line BE, since G is the centre of the triangle. So BGE is a straight line.

Using this reasoning, AGF and CGD are also straight lines.

I know, the explanation I gave isn’t quite backed up with math proofs, more with intuitive reasoning. But I’m sure they are (I just don’t know exactly which theorems to cite…).

So I have 2 favours:

  • Send me a more proper explanation of the solution given above
  • Or send me another solution (remember I have another one?)

I know there are at least 2 solutions. Just wondering if there are more. I will post my other solution, together with any submissions from you next Monday. Or earlier, depending on the number of submissions. You’re welcome to do both. *smile*

Since this is a geometric construction, a picture together with some explanations is appreciated. You can host the image on a site like Flickr, then add a comment to this post with a link to that image and a brief explanation.

Or you can send me your solution via email to vincent at polymathprogrammer dot com. Let me know if you want to remain anonymous (why?), and I’ll just post your solution only.

Your construction doesn’t have to be exact in proportions. Meaning if there’s a right angle, it doesn’t have to be 90 degrees exactly, so long as it looks like it’s 90 degrees. Just add some explanation to supplement the drawing. Try Paint.NET if you don’t have any image editing software.

Have fun!

The barber paradox

Suppose there’s a barber who shaves only those who do not shave themselves. The question is, does the barber shave himself?

That was the question I posed in a previous article. The hint to the answer was actually given in the article: “What if your original assumptions were wrong?”

So let’s begin, with two fictitious men, John and Ricky. John has this unfortunate streak of accidental cuts, so he doesn’t shave himself if he could help it. Ricky has this inexplicable fear of people wielding sharp objects around his face (ever watched Sweeney Todd?), so he shaves his own beard.

It’s actually a mathematical question on logic. So forming the statements, we have:

If John does not shave himself, the barber shaves John.
If Ricky shaves himself, the barber does not shave Ricky.

Here’s the interesting part. Let’s substitute “John” and “Ricky” with “the barber”.

If the barber does not shave himself, the barber shaves the barber.
If the barber shaves himself, the barber does not shave the barber.

Either way, the statements don’t make logical sense, each contradicting itself and creating a paradox. So what went wrong?

The statements were correctly formed. It’s our original assumption that’s wrong. What’s our original assumption? That

there’s a barber who shaves only those who do not shave themselves

So the correct answer is, there’s no such barber.

Fibonacci sequence and Golden Ratio

Baby rabbits by Wee Gan Peng @iStockphoto

I haven’t written a math-related article in a while, so in this article, I’ll tell you about the Fibonacci sequence, the golden ratio and some associated program code.

Say you’re given this math formula, and told to find what the nth term is.
F(n) = F(n-1) + F(n-2)
where F(1) = F(2) = 1

Wow, initial conditions are given, a formula is given. Functions! In fact, use recursive functions! That was easy to code, wasn’t it?

What if you’re given this sequence of numbers and told to find what the nth term is?
1,1,2,3,5,8,13,21 …

You would get exactly the same answer for both questions. Notice that each subsequent number is a sum of the previous two numbers (except the first two in the sequence). Once you find the pattern, you can make use of simple loops instead of recursive functions. Generally speaking, recursive functions are slower, and you would not have known the simpler loop solution if you hadn’t analysed the problem first.

Of course, if you knew the numbers form the Fibonacci sequence, you wouldn’t have much of a problem in the first place, would you? You might have heard of it in Da Vinci Code, where the sequence was used to decipher a password number.

Now, the interesting thing about this sequence is that Nature uses it. I haven’t personally verified this, but sunflowers, pineapples and daisies exhibit the use. If you count the number of bumps on them going to the left, and the number of bumps to the right, you’ll find that the numbers are right next to each other in the Fibonacci sequence. For example, sunflowers have 34 spirals to the left and 55 to the right.

The rabbits, man do they breed!

There’s actually a story about how the Fibonacci sequence came about. You might have guessed that Fibonacci is the name of a mathematician. Anyway, the story goes like this. Suppose there are 2 rabbits, one male and one female. Each month, they produce two rabbits, again, one male and one female. The baby rabbits take one month to grow, and become sexually reproductive in their second month. In this manner, how many rabbits are there at the end of 12 months?

In the first month, there are 2.
At the end of the second month, there’s 4 (2 original, 2 babies).
At the end of the third month, there’s 6 (2 original, 2 from 2nd month, 2 babies from original).

At the end of the fourth month, there’s the original 2, the 2 from 2nd month, the 2 from 3rd month, 2 babies from the original 2, and 2 babies from the 2 of the 2nd month. Total rabbits: 10.

Just typing this out is confusing, but you can see a pattern emerging.
2, 4, 6, 10 …
Each subsequent term is a sum of the previous two terms.

Oh yes, the number of rabbits in the 12th month is 466.

Some obvious assumptions are

  • Rabbits never die
  • Babies are always produced in a pair, one male and one female
  • Moral values are not considered (incest shmincest!)

The golden ratio

There is another interesting thing. The ratio of F(n+1)/F(n) approaches a limit as n goes to infinity, which is approximately 1.618. This number is known as the golden ratio, with other names such as golden mean, divine proportion and others. Other than its mathematical relevance, I know it to be important as an aesthetic factor.

A rectangle with its sides in this ratio is thought of as aesthetically pleasing. This probably influenced the manufacture of computer screens. You have your standard 4:3s (1.333:1), the 800×600, 1024×768 and others. There are also some who tried the square root 2 ratio, 1.414:1 in computer applications. Then there’s the 16:9 (1.777:1) aspect ratios.

What’s the purpose of these aspect ratios? To approach the golden ratio 1.618:1. Of course, with widescreens, this point is probably moot nowadays. Still, it makes for interesting contemplation.

Ok, let’s do some coding, calculating the nth term using a loop, which I’ve hardcoded as 20. I’ll leave it to you as an exercise to write it as a function. The golden ratio is calculated as well. The first 2 terms are skipped, because they are equal to 1 (initial conditions).

int i;
int current, previous, next;
double phi = 0;
current = previous = 1;
for (i = 2; i < 20; ++i)
{
    next = current + previous;
    previous = current;
    current = next;
    phi = (double)current/(double)previous;
    Console.WriteLine("Term {0,0:d2} = {1,0:d5}, Phi = {2,0:f8}", i + 1, current, phi);
}

Fairly simple. The point is to convert a problem into a program solution. The difficult part isn’t the original problem, nor is it about coding skills. It’s about translating a problem into a program that’s difficult.

There you have it. Math theory and supporting program code. Hope you enjoyed it.

Update: Commenter Patrick pointed out the existence of the closed-form expression of the Fibonacci numbers. Thanks Patrick!

Two trains and a bumblebee problem

Bumblebee on flower @iStockPhoto/Eric DelmarI was intrigued by this math problem a while ago. It took me a couple of pages of calculations to get the solution. Then I kicked myself because the solution was actually staring in my face.

I can’t find the original problem now, although there seem to be other variations of it. So I’m going to give you my (highly embellished) version. Then see if you can get the answer. Here goes the story of Blitz the bumblebee and the fateful encounter of two trains…

Blitz the bumblebee and the meeting of two caterpillars
Blitz the bumblebee was an adventurous kind of guy. He’d wander further out in the fields than his fellow bumblebees. Now he’d seen these huge black caterpillars (trains) spewing out lots of grey and black clouds passing his fields before. Now Blitz, despite his big size, could fly very fast. But these caterpillars zoomed past him, leaving him in a whirlwind spiral as he tried to get back into control.

So one sunny morning, he decided to test his limits. He waited for one of these caterpillars, and when he heard their telltale “choo choo” from a distance, he started flying as fast as he could in the direction where the caterpillar would move. When the caterpillar moved close to his level, Blitz simply flew to the head of the caterpillar and hung on.

What a thrill! The wind was howling around him and the scenery was speeding past him. Then Blitz wondered where the caterpillar was going. So, using the caterpillar’s speed as initial momentum, Blitz took off.

To his surprise, he flew even faster than the caterpillar! So Blitz shot straight ahead. After some time, Blitz was starting to worry where he’s going, when he heard the “choo choo” coming from in front of him. Before he realised it, he bounced off another caterpillar. Somehow Blitz survived the bounce and flew, out of instinct, back in the direction where he came from.

This is fun!” Blitz thought. And wondered if he’d meet the first caterpillar again. Sure enough, the first caterpillar appeared, and Blitz bounced off it, and flew towards the second caterpillar. Through this bouncing and flying, it never occurred to Blitz what would happen when the two caterpillars meet.

When the two caterpillars were within sight of each other, Blitz realised that his bounces were getting more frequent. By the time this revelation came, he found he couldn’t stop nor get out of the bouncing cycle anymore.

They’re gonna crash! I’m too young to die!

And Blitz promptly fainted, while the two caterpillars amazingly brushed past each other and continued on their merry way…

The end.

The real math problem
Alright, maybe the original problem wasn’t phrased like that, so I’ll give you the short version.
Two trains and a bumblebee problem diagram
There are two trains running at 45km/h in opposite directions towards each other on two separate (straight) tracks. Assume for the discussion that the difference caused by the separate tracks is negligible. A bumblebee is positioned at the head of the first train. When the trains are 90km apart, the bumblebee flies at 60km/h towards the second train. When the bumblebee meets the second train, it returns and flies back towards the first train. It continues to fly back and forth in this manner until the two trains meet. What is the distance covered by the bumblebee?

I wrote down tons of calculations and numbers. I scrutinised my work and finally found a relationship between the numbers. Then I used mathematical induction to convince myself that the relationship can be written in the form of a geometric progression. The ratio of the geometric progression turned out to be less than 1, so that simplified the equation further. And the answer was … 60km.

Immediately after I got this answer, I found that since the time taken for the two trains to meet is 1 hour (time = distance / speed = 90km / (45km/h + 45km/h)), it also means the bumblebee only flew for 1 hour. With the speed of 60km/h, flying for 1 hour means a distance of 60km covered. I spent about an hour or so and a couple of pieces of paper to figure this out…

Moral of the story? Sometimes, it’s better to think through a problem instead of jumping on the first solution that pops up in your head.