Stationary camera, moving scene

Previously, we talked about revolving the entire 3D scene about the camera, and also the problem of the camera looking directly downwards. Today, we’ll look at the mechanics of implementing that stationary camera (it ain’t pretty).

There are 2 transformations to take care of: translation and rotation. Translation takes care of the distance between the camera and the point it’s looking at. Rotation takes care of simulating the camera turning around to look at objects, roughly speaking. Let me use a 2D version to illustrate the concept.

Reverse translation and rotation of 2D scene

Suppose the camera is at some arbitrary position looking at an object. Based on the positions of the camera and the object, you can find the distance between them. You know, with this:
d = sqrt( (cx-ox)^2 + (cy-oy)^2 )
where cx and cy are the x-coordinate and y-coordinate of the camera respectively, and ox and oy are the x-coordinate and y-coordinate of the object respectively.

The camera is looking at the object, so the angle (theta) of its line of sight with respect to, for example, the x-axis can be calculated.

Suppose we want the stationary camera to look in the direction of the positive y-axis, and be positioned at the origin (0,0). To make the scene viewed through a stationary camera the same as that in the original version (the default by the 3D engine), we would rotate the entire scene (90 – theta) degrees, then translate the result of that d units along the positive y-axis.
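Here’s a minimal sketch of that 2D idea in Python. It assumes the rotation is done about the viewed object (so the object is treated as the origin first), and that theta is measured from the positive x-axis using atan2. The function name to_stationary_view is made up for illustration.

```python
import math

def to_stationary_view(point, camera, target):
    """Map a scene point so the camera sits at (0, 0) looking along +y.

    Assumption: the scene is rotated about the viewed object (target).
    """
    cx, cy = camera
    ox, oy = target
    # Distance between the camera and the object.
    d = math.hypot(cx - ox, cy - oy)
    # Angle of the line of sight (camera -> object) from the positive x-axis.
    theta = math.atan2(oy - cy, ox - cx)
    # Rotate by (90 degrees - theta), expressed in radians here.
    angle = math.pi / 2 - theta
    x, y = point[0] - ox, point[1] - oy
    rx = x * math.cos(angle) - y * math.sin(angle)
    ry = x * math.sin(angle) + y * math.cos(angle)
    # Translate d units along the positive y-axis.
    return (rx, ry + d)

camera, target = (1.0, 1.0), (4.0, 5.0)
print(to_stationary_view(camera, camera, target))  # approximately (0, 0)
print(to_stationary_view(target, camera, target))  # approximately (0, 5)
```

The camera ends up at the origin and the object sits d units straight ahead on the positive y-axis, which is what the stationary camera expects.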

Remember that the order of transformations is important. Rotating and then translating is (generally) different from translating and then rotating.
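A quick way to convince yourself is to try both orders on one point. This is only a sketch; the helper names rotate and translate are made up for illustration, and the rotation is anti-clockwise about the origin.

```python
import math

def rotate(point, degrees):
    """Rotate a 2D point anti-clockwise about the origin."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t))

def translate(point, dx, dy):
    """Move a 2D point by (dx, dy)."""
    x, y = point
    return (x + dx, y + dy)

p = (1.0, 0.0)
rotate_then_translate = translate(rotate(p, 90), 1, 0)  # lands near (1, 1)
translate_then_rotate = rotate(translate(p, 1, 0), 90)  # lands near (0, 2)
print(rotate_then_translate, translate_then_rotate)
```

Same point, same two operations, two different results.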

So that’s the general idea of making a stationary camera work, by moving and rotating the entire scene. The fun part comes because it’s in 3D.

The distance calculation still holds true:
d = sqrt(x^2 + y^2 + z^2)

The angle… not so much. Because it’s in 3D, I adopted spherical coordinates. The radius would simply be the distance calculated previously. But there are now 2 angles to calculate, theta and phi.

Spherical coordinate angles

Suppose the camera is at (a,b,c) and the viewed object is at (p,q,r). We make the viewed object the centre of our attention, so we start our calculations with the object at the origin. Therefore, the camera is at (a-p, b-q, c-r).

We can calculate the distance between them as
d = sqrt( (a-p)^2 + (b-q)^2 + (c-r)^2 )

Then we solve the following set of simultaneous equations (note I’m using the y-axis as the “upward” axis, and r here is the radius, not the z-coordinate of the viewed object)
x = r * sin(theta) * sin(phi)
y = r * cos(phi)
z = r * cos(theta) * sin(phi)

==>

a-p = d * sin(theta) * sin(phi)
b-q = d * cos(phi)
c-r = d * cos(theta) * sin(phi)

to look for the angles theta and phi, where
0 <= theta < 2*PI
0 <= phi <= PI

Once found, the rendering occurs by rotating the entire scene phi degrees about the positive z-axis (starting from the negative y-axis as 0 degrees), then rotating theta degrees about the positive y-axis (starting from the positive z-axis as 0 degrees), then translating by (-a,-b,-c) (this moves the entire scene away from the camera positioned at the origin).

Well, that was a lot of trouble. What was I trying to solve again? Oh yeah, that looking down and losing the “up” vector problem.

Notice anything wrong in this implementation? The “up” vector of the camera was never considered. But figuring out all the math was fun… if only it solved something too… *sigh*

[Note: all use of “degrees” in this article can be substituted with “radians”, depending on your situation. Use accordingly.]
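For the curious, the simultaneous equations above can be solved directly with acos and atan2. This is a sketch under a few assumptions: the function name spherical_angles is made up, the angles come back in radians, theta falls in [0, 2*PI) and phi in [0, PI], and a camera sitting exactly on the object is treated as an error because the angles are undefined there.

```python
import math

def spherical_angles(camera, target):
    """Return (d, theta, phi) for a camera at (a,b,c) viewing (p,q,r).

    Uses y as the "upward" axis:
        a-p = d * sin(theta) * sin(phi)
        b-q = d * cos(phi)
        c-r = d * cos(theta) * sin(phi)
    """
    a, b, c = camera
    p, q, r = target
    x, y, z = a - p, b - q, c - r
    d = math.sqrt(x * x + y * y + z * z)
    if d == 0:
        raise ValueError("camera and target coincide; angles are undefined")
    # Clamp to guard against tiny floating point overshoot outside [-1, 1].
    phi = math.acos(max(-1.0, min(1.0, y / d)))
    # sin(phi) >= 0 on [0, PI], so atan2(x, z) recovers theta.
    theta = math.atan2(x, z) % (2 * math.pi)
    return d, theta, phi

d, theta, phi = spherical_angles((1.0, 2.0, 3.0), (0.0, 0.0, 0.0))
# Plugging the angles back in should give the camera offset again.
print(d * math.sin(theta) * math.sin(phi),
      d * math.cos(phi),
      d * math.cos(theta) * math.sin(phi))
```

When the camera is directly above or below the object, sin(phi) is 0 and theta can be anything; atan2(0, 0) simply returns 0 in that case.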

Cartesian coordinates and transformation matrices

If you’re doing any work in 3D, you will need to know about the Cartesian coordinate system and transformation matrices. Cartesian coordinates are typically used to represent the world in 3D programming. Transformation matrices are matrices representing operations on 3D points and objects. The typical operations are translation, rotation, scaling.

2 dimensional Cartesian coordinates

You should have seen something like this in your math class:

2D Cartesian coordinates

The Roman letters I, II, III, and IV represent the quadrants of the Cartesian plane. For example, III represents the third quadrant. Not a lot to say here, so moving on…

3 dimensional Cartesian coordinates

And for 3 dimensions, we have this:

3D Cartesian coordinates

I don’t quite like the way the z-axis points upward. The idea probably stems from having a piece of paper representing the 2D plane formed by the x and y axes. The paper is placed on a flat horizontal table, and the z-axis sticks right up.

Mathematically speaking, there’s no difference.

However, I find it easier to look at it this way:

Another 3D Cartesian representation

The XY Cartesian plane is upright, representing the screen. The z-axis simply protrudes out of the screen. The viewport can cover all four quadrants of the XY plane. The illustration only covered the first quadrant so I don’t poke your eye out with the z-axis *smile*

There is also something called the right-hand rule, and correspondingly the left-hand rule. The right-hand rule has the z-axis pointing out of the screen, as illustrated above. The left-hand rule has the z-axis pointing into the screen. Observe the right-hand rule:

Right-hand rule

The thumb represents the x-axis, the index finger represents the y-axis and the middle finger represents the z-axis. As for the left-hand rule, we have:

Left-hand rule

We’re looking at the other side of the XY plane, but it’s the same thing. The z-axis points in the other direction. And yes, I have long fingers. My hand span can cover an entire octave on a piano.

What’s the big deal? Your 3D graphics engine might use a certain rule by default, and you must follow it. Otherwise, you could be hunting down errors like why an object doesn’t appear on the screen: the object was behind the camera when you thought it was in front. Your selected graphics engine should also allow you to use the other rule if you so choose.

In case you’re wondering, here’s the right-hand rule illustration with the z-axis pointing upwards:

Right-hand rule with z-axis upwards

I still don’t like a skyward-pointing z-axis. It irks me for some reason…

Scaling (or making something larger or smaller)

So how do you enlarge or shrink something in 3D? You apply the scaling matrix. Let’s look at the 2D version:

Scaling a circle in 2D

If your scaling factor is greater than 1, you’re enlarging an object. If your scaling factor is less than 1, you’re shrinking an object. What do you think happens when your scaling factor is 1? Or when your scaling factor is negative?

So what does the scaling factor look like in a scaling matrix?

Scaling matrix 2D
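In case the picture doesn’t show up for you, here is the usual form of the 2D scaling matrix written out, assuming points are column vectors and using s_x and s_y as the names of the two scaling factors (the names are my own choice here):

```latex
\begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} s_x \, x \\ s_y \, y \end{bmatrix}
```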

If you don’t know what that means, or don’t know what the result should be like, review the lesson on matrices and the corresponding program code.

You will notice there are separate scaling factors for the x- and y-axes. This means you can scale them independently. For example, we have this:

Sphere above water

And we only enlarge along the x-axis:

Enlarge sphere along x-axis

We can also only enlarge along the y-axis:

Enlarge sphere along y-axis

Yeah, I got tired of drawing 2D pictures, so I decided to render some 3D ones. Speaking of which, you should now be able to come up with the 3D version of the scaling matrix. Hint: just add a scaling factor for the z-axis.
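If you want to check your answer, here’s a small sketch of the 3D version in Python. The helper names scaling_matrix and apply are made up for illustration, and points are treated as column vectors.

```python
def scaling_matrix(sx, sy, sz):
    """3x3 scaling matrix with one independent factor per axis."""
    return [[sx, 0.0, 0.0],
            [0.0, sy, 0.0],
            [0.0, 0.0, sz]]

def apply(matrix, point):
    """Multiply a 3x3 matrix by a point treated as a column vector."""
    return tuple(sum(m * v for m, v in zip(row, point)) for row in matrix)

# Enlarge only along the x-axis, leaving y and z untouched.
stretched = apply(scaling_matrix(2.0, 1.0, 1.0), (1.0, 2.0, 3.0))
print(stretched)  # (2.0, 2.0, 3.0)
```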

Rotating (or spinning till you puke)

This is what a rotation matrix for 2 dimensions looks like:

Rotation matrix 2D
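In case the picture doesn’t show up for you, this is the standard anti-clockwise 2D rotation matrix written out, assuming points are column vectors:

```latex
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix}
```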

That symbol that looks like an O with a slit in the middle? That’s theta (pronounced th-ay-tuh), a Greek letter. It’s commonly used to represent unknown angles.

I’ll spare you the mathematical derivation of the formula. Just use it.

You can convince yourself with a simple experiment. Use the vector (1,0), or unit vector lying on the x-axis. Plug in 90 degrees for theta and you should get (0,1), the unit vector lying on the y-axis.

That’s anti-clockwise rotation. To rotate clockwise, just use the value with a change of sign. So you’ll have -90 degrees.

Depending on the underlying math libraries you use, you might need to use radians instead of degrees (radians are typical in most math libraries). I’m sure you’re clever enough to find out the conversion formula for degree-to-radian yourself…
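Here’s the experiment from above as a short Python sketch. Python’s math functions work in radians, so the helper (rotate_2d, a name I made up) converts from degrees first using math.radians, which multiplies by PI/180.

```python
import math

def rotate_2d(point, degrees):
    """Rotate a 2D point about the origin; positive angles are anti-clockwise."""
    theta = math.radians(degrees)  # same as degrees * math.pi / 180
    x, y = point
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

anticlockwise = rotate_2d((1.0, 0.0), 90)   # close to (0, 1)
clockwise = rotate_2d((1.0, 0.0), -90)      # close to (0, -1)
print(anticlockwise, clockwise)
```

You will see values like 6.1e-17 instead of an exact 0. That is ordinary floating point rounding, not a mistake in the formula.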

Now for the hard part. The 3D version of rotation is … a little difficult. You see, what you’ve read above actually rotates about the implied z-axis. Wait, that means you can rotate about the x-axis! And the y-axis! Sacrebleu! You can rotate about any arbitrary axis!

I’ll write another article on this. If you’re into this, then you might want to take a look at this article on 3D rotation. I’ll also touch on a concept where you rotate about one axis and then rotate about another axis. Be prepared for lots of sines and cosines in the formula. Stop weeping; it’s unseemly of you.

Translating (nothing linguistic here)

What it means is you’re moving points and objects from one position to another. Let’s look at a 1 dimensional example:

Translation in 1 dimension

The squiggly unstable looking d-wannabe? It’s the Greek letter delta. Delta-x is a standard notation for “change in x”. In this case “x” refers to distance along the x-axis. We’ll use an easier-to-type version called “dx” for our remaining discussion.

Translation in 2 dimensions

In 2 dimensions, we have the corresponding dy, for “change in y”. Note that there’s no stopping you from using negative values for dx or dy. In the illustration above, dx and dy are negative.

You’ll have to imagine the case for 3D because the diagram is likely to be messy. But it’s easy to visualise. You just do the same for the z-axis.

So what’s the transformation matrix for translation? First, you need to extend your matrix size and vector size by one dimension. The exact reasons are due to affine transformations and homogeneous coordinates (I’ve mentioned them briefly earlier).

Consider this: You have a point (x,y,z) and you want it to be at the new position (x + dx, y + dy, z + dz). The matrix will then look like this:

Translation matrix
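In case the picture doesn’t show up for you, this is the usual 4 by 4 translation matrix written out, assuming points are column vectors extended with a 1 as the fourth entry:

```latex
\begin{bmatrix}
1 & 0 & 0 & dx \\
0 & 1 & 0 & dy \\
0 & 0 & 1 & dz \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x + dx \\ y + dy \\ z + dz \\ 1 \end{bmatrix}
```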

Notice that for scaling, the important entries are the diagonal entries. For rotation, there are sines and cosines and they’re all over the place. But for translation, the “main body” of the matrix is actually an identity matrix. The fun stuff happens in the alleyway column on the extreme right of the matrix.

That reminds me. Because you’ll be using all the transformation matrices together, all matrices must be of the same size. So scaling and rotation matrices need to be 4 by 4 too. Just extend them with zero entries except the bottom right entry, which is 1.
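To see the three matrices working together, here’s a rough sketch in plain Python. All the function names are made up for illustration, points are treated as column vectors with a 1 tacked on, and the rotation shown is only the one about the z-axis (the 2D rotation extended to 4 by 4).

```python
import math

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]

def rotation_z(degrees):
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0, 0],
            [s,  c, 0, 0],
            [0,  0, 1, 0],
            [0,  0, 0, 1]]

def translation(dx, dy, dz):
    return [[1, 0, 0, dx],
            [0, 1, 0, dy],
            [0, 0, 1, dz],
            [0, 0, 0, 1]]

def multiply(a, b):
    """4x4 matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transform(matrix, point):
    """Apply a 4x4 matrix to (x, y, z) using homogeneous coordinates."""
    v = (point[0], point[1], point[2], 1)
    out = [sum(m * c for m, c in zip(row, v)) for row in matrix]
    return tuple(out[:3])

# With column vectors, the matrix nearest the point is applied first:
# scale by 2, then rotate 90 degrees about z, then translate by (1, 2, 3).
combined = multiply(translation(1, 2, 3),
                    multiply(rotation_z(90), scaling(2, 2, 2)))
moved = transform(combined, (1, 0, 0))
print(moved)  # close to (1, 4, 3)
```

Because everything is 4 by 4, the three operations collapse into one matrix that can be reused for every point in the scene.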

Conclusion

We talked about 2D and 3D Cartesian coordinates. I’ve also shown you the right-hand and left-hand rules. This forms the basis for learning basic transformations such as scaling, rotation and translation.

There are two other interesting transformation matrices: shearing and reflection. I thought what we have is good enough for now. You are free to do your own research. When the situation arises, I’ll talk about those two transformations.


You might find this article on converting between raster, Cartesian and polar coordinates useful.


Translating database column names for globalisation

I was working at a software development house where one of those enterprisey .NET web applications was created for a large company. It was fairly standard. There were workflow processes, inventory tracking business logic, user management and the like.

The web application also had to display information in both English and Japanese (the customer was, you know, a Japanese company).

While the database column names were in English, there had to be an option to display the Japanese equivalent. Suppose we had a table scripted like:

create table staff
(
STF_ID nchar(10) not null,
STF_NAME nvarchar(100) not null,
EFF_DT datetime not null
)

When I was creating datasets, I used Pascal case as recommended, and typed out words in full. So I had something like

ds.StaffID = "FIRSTID";
ds.StaffName = "Some name";
ds.EffectiveDate = DateTime.UtcNow;

The lead developer gently asked me why I coded that way. I explained the recommended practices for variable naming. I was actually quite puffed up with pride because of that knowledge.

The lead developer said he understood (though I didn’t think he did). Then he gave me an explanation why my method was a bad idea. He said translations for English and Japanese were based on resource files, and the contents of those files were based on the names used in the database tables.

So in the resource file, there would be an entry for STF_ID to be translated into “Staff ID” in English and “sutafu ID” in Japanese.

Staff ID translation (English and Japanese)

If I used “ds.StaffID”, it would be confusing for other programmers because I did some “translating” on my own. Someone else might translate differently and soon, the whole project would go up in flames. A standard way of referring to the database column name was established, and that was whatever the column name was originally. Even if it doesn’t look quite right in code.

So there would be

ds.STF_ID = "ID09183";
ds.STF_NAME = "another robot";
ds.EFF_DT = DateTime.UtcNow;

After some thinking, I had to concede. Using the database column name in code did make sense. There were full-time people hired to enter a code, word or phrase with the English and Japanese equivalent into the resource files. There were many developers working on that project. It’s just more efficient to use the column names as a standard.

That said, the variable naming skills of the database administrators leave much to be desired…