Changing Perspectives

You can change the whole world or you can change your own perspective. In a computer, changing a virtual world is just as easy as changing the camera view point.

In computer graphics, with the right set of values, you can move and rotate the virtual world in such a way that it looks as though you’re moving the virtual camera. When you get down to it, it’s just pushing matrices into the pipeline in OpenGL or DirectX or whatever 3D graphics engine you’re using.

Stationary camera, moving scene

Previously, we talked about revolving the entire 3D scene about the camera, and also the problem of the camera looking directly downwards. Today, we’ll look at the mechanics of implementing that stationary camera (it ain’t pretty).

There are 2 transformations to take care of: translation and rotation. Translation takes care of the distance between the camera and the point it’s looking at. Rotation takes care of simulating the camera turning around to look at objects, roughly speaking. Let me use a 2D version to illustrate the concept.

Reverse translation and rotation of 2D scene

Suppose the camera is at some arbitrary position looking at an object. Based on the positions of the camera and the object, you can find the distance between them. You know, with this:
d = sqrt( (cx-ox)^2 + (cy-oy)^2 )
where cx and cy are the x-coordinate and y-coordinate of the camera respectively, and ox and oy are the x-coordinate and y-coordinate of the object respectively.

The camera is looking at the object, so the angle (theta) of its line of sight with respect to the (for example) x-axis can be calculated.

Suppose we want the stationary camera to look in the direction of the positive y-axis, and be positioned at the origin (0,0). To make the scene viewed through the stationary camera the same as the original version (the default in the 3D engine), we rotate the entire scene (90 – theta) degrees, then translate the result d units along the positive y-axis.
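Here’s a minimal Python sketch of that 2D idea. The function name `stationary_camera_2d` is made up for illustration, and it assumes the viewed object is first moved to the origin so that the rotation pivots about the object (the text above doesn’t spell out the pivot, so treat that as my assumption).

```python
import math

def stationary_camera_2d(camera, target, points):
    """Re-express 2D points for a camera fixed at (0,0) looking along +y."""
    cx, cy = camera
    ox, oy = target
    # Distance between the camera and the object it is looking at.
    d = math.hypot(cx - ox, cy - oy)
    # Angle of the line of sight (camera -> object) from the positive x-axis.
    theta = math.atan2(oy - cy, ox - cx)
    # Rotating by (90 degrees - theta) turns the line of sight onto +y.
    angle = math.pi / 2 - theta
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    result = []
    for x, y in points:
        # Assumption: treat the viewed object as the origin.
        x, y = x - ox, y - oy
        # Rotate about the origin.
        x, y = x * cos_a - y * sin_a, x * sin_a + y * cos_a
        # Then translate d units along the positive y-axis.
        result.append((x, y + d))
    return result
```

Passing the camera’s and the object’s own positions in as `points` is a quick sanity check: the camera should land on (0,0) and the object on (0,d).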

Remember that the order of transformations is important. Rotating first then translating is (generally) different from translating then rotating.

So that’s the general idea of making a stationary camera work, by moving and rotating the entire scene. The fun part comes because it’s in 3D.

The distance calculation still holds true:
d = sqrt(x^2 + y^2 + z^2)

The angle… not so much. Because it’s in 3D, I adopted spherical coordinates. The radius would simply be the distance calculated previously. But there are now 2 angles to calculate, theta and phi.

Spherical coordinate angles

Suppose the camera is at (a,b,c) and the viewed object is at (p,q,r). We make the viewed object the centre of our attention, so we start our calculations with the object at the origin. Therefore, the camera is at (a-p, b-q, c-r).

We can calculate the distance between them as
d = sqrt( (a-p)^2 + (b-q)^2 + (c-r)^2 )

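Then we also solve the following set of simultaneous equations (note I’m using the y-axis as the “upward” axis, and the r in the first set is the spherical radius, not the object’s z-coordinate). The second set is the first with our camera offsets and the distance d substituted in.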
x = r * sin(theta) * sin(phi)
y = r * cos(phi)
z = r * cos(theta) * sin(phi)


a-p = d * sin(theta) * sin(phi)
b-q = d * cos(phi)
c-r = d * cos(theta) * sin(phi)

to look for the angles theta and phi, where
0 <= theta < 2*PI
0 <= phi <= PI

Once found, the rendering occurs by rotating the entire scene phi degrees about the positive z-axis (starting from the negative y-axis as 0 degrees), then rotating theta degrees about the positive y-axis (starting from the positive z-axis as 0 degrees), then translating by (-a,-b,-c) (this moves the entire scene away from the camera positioned at the origin).

Well, that was a lot of trouble. What was I trying to solve again? Oh yeah, that looking down and losing the “up” vector problem.

Notice anything wrong in this implementation? The “up” vector of the camera was never considered. But figuring out all the math was fun… if only it solved something too… *sigh*

[Note: all use of “degrees” in this article can be substituted with “radians”, depending on your situation. Use accordingly.]
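Going back to the simultaneous equations in this section, here’s a rough Python sketch of one way to solve for d, theta and phi. The function name `spherical_angles` is hypothetical, and using `acos` and `atan2` is just my choice of method; the angles come out in radians.

```python
import math

def spherical_angles(camera, target):
    """Solve a-p = d*sin(theta)*sin(phi), b-q = d*cos(phi),
    c-r = d*cos(theta)*sin(phi) for d, theta and phi (y-axis is up)."""
    a, b, c = camera
    p, q, r = target
    x, y, z = a - p, b - q, c - r
    d = math.sqrt(x * x + y * y + z * z)
    if d == 0.0:
        raise ValueError("camera and object are at the same position")
    # Clamp to guard against tiny floating-point overshoot before acos.
    phi = math.acos(max(-1.0, min(1.0, y / d)))
    # atan2 picks the correct quadrant; the modulo makes theta non-negative.
    theta = math.atan2(x, z) % (2 * math.pi)
    return d, theta, phi
```

Try a camera directly above the object, say (0,5,0) looking at (0,0,0). You get phi = 0, and theta comes out as 0 only because `atan2(0, 0)` returns 0. Any theta would satisfy the equations there, which is the looking-down problem showing up in the math.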

The problem of looking down

So in film, the camera usually moves a lot, together with the actors and props. The scene doesn’t move. In the virtual 3D world, we can move and revolve the world around the camera, which I talked about previously.

Let’s talk about the virtual camera first before launching into the problem I was trying to solve. There are 3 positional vectors for a virtual camera: its position, its “up” vector, and the point it’s looking at. The 1st and 3rd positional vectors should be easily understood. The “up” vector refers to where the “head” of the camera is pointing.

You’re looking at something in front of you. Now tilt your head left. Your eyes are still at the same position (1st vector), and you’re still looking at the same thing (3rd vector). But the top of your head is pointing in a different direction (2nd vector). This changes your view. Refer to my article on lights and virtual cameras for more information (and pictures…).

So far, I haven’t used the 2nd vector to do much, so I’ve kept it at the default (0,1,0), which means pointing to the sky. Now for the problem…

Suppose you’re looking at something in front of you, say an (absolutely symmetrical) apple, and you move up while keeping the apple at the same position. You’re now looking down at it, aren’t you? Say you’re not the fidgety type, so your head is kept straight. Your head’s not pointed directly up to the sky, but it’s straight. You don’t know how, but you know it’s straight. This is important.

Now, slowly move towards the space directly above the apple. Your head is still kept “straight”. As in, if you tilted your head, the view changes substantially, and the apple appears “lopsided”. Here’s the problem (finally!). What happens when you look at the apple directly from above it?

Moving eye position while looking down

You can now tilt your head in any direction, and the apple still looks the same (as in, you’re looking at it from directly above). Now the 2nd positional vector matters, because the view changes substantially (the scene around the apple changes). The default “up” direction fails, because it now points along the line of sight and no longer tells the camera which way its “head” is turned.
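One way to see the failure in numbers is to compute the camera’s “right” direction as the cross product of the viewing direction and the “up” vector, which is a common way of building a camera’s axes (I’m assuming that construction here; the helper names are made up).

```python
def cross(u, v):
    """Cross product of two 3D vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def camera_right(eye, target, up=(0.0, 1.0, 0.0)):
    """Un-normalised 'right' direction of a camera at eye looking at target."""
    forward = tuple(t - e for e, t in zip(eye, target))
    return cross(forward, up)
```

With the eye at (0,2,5) looking at the origin, `camera_right` points along the positive x-axis. With the eye at (0,5,0), directly above, it collapses to the zero vector: the default “up” no longer pins down the camera’s orientation.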

I can’t remember the name of this problem (or even if there was one). And I can’t find my OpenGL textbook that describes this, so I’m writing this from memory. If you can find me a reference to this, let me know.

So my young naive mind back then thought, “How about I don’t move the camera at all!” *sigh* So naive… As you can tell, my solution of moving the entire world instead of moving the camera failed. It failed in that it didn’t solve the original problem I was trying to solve. But it worked flawlessly in that it behaved exactly as if the camera had moved through the 3D world.

And I’ll tell you how I built that camera … next time.

Revolve the world around you

Sit or stand with your head pointing straight up. Tilt your head to your left. Note the view, the slant of the horizon, the movement (if any) of surrounding objects.

Tilt your head back to pointing straight up. Now imagine the view in front of you tilting to your right. Can you imagine the scene as having the same view as if you tilted your head left?

That was exactly what I was trying to achieve when I implemented a custom camera object to overcome a particular viewing problem in 3D. Well, you’ll have to wait for another article for the description of the problem. I’m going to just describe the function of that custom camera in this article.

So in 3D scenes, you have the scene objects and a virtual camera. Using the position and orientation of the camera, you can view the scene (by rendering the scene onto the viewing plane). This is analogous to the physical world.

Now, in the physical world, the scene, the set, and the props typically don’t move. Only the camera moves (we’ll leave out the human actors). I’m referring to the movement where an object goes from one place to another. Movement such as water flowing or explosions isn’t included (as far as the discussion goes).

For a physical camera, there are limits. You can’t quite fly a camera through an explosion. You need special cameras to go through tiny openings. You’ve got to be careful when working with mirrors, because the camera (and cameraman) can be inadvertently captured (unless that was the effect). And you definitely can’t pass through walls.

A virtual camera in a 3D scene has none of those limitations. As far as the renderer is concerned, a camera is just a point, unless it’s modelled and treated as a 3D object. It can film the hottest of volcanic films, or be submerged in the depths of the seas, and remain undamaged. Now, the virtual camera might be limitless, but that’s not the point. Due to the transformations such as translations, rotations and scaling, the 3D scene itself can be modified at will.

I was inspired by a remark made by my university lecturer. He said that moving the camera towards a stationary object is the same as moving the object towards the stationary camera. This also implied that rotating the camera clockwise around a stationary object is the same as rotating the object anti-clockwise around the stationary camera.

This opened my eyes to another corollary. You don’t need to move the camera at all! You can move the entire scene instead.

So I set out to design a camera object where the entire 3D scene depended on it. What I mean is, instead of setting camera properties and having them work nicely with the 3D scene, the 3D scene itself conforms to the camera properties.

For example, if I set the camera position at (1,0,0), in actuality, the camera is still at (0,0,0). But the entire 3D scene is translated (-1,0,0).
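That equivalence is easy to check with a tiny Python sketch. Both helper names are hypothetical, and only translation is handled (no rotation).

```python
def view_from_camera(camera_position, point):
    """Position of a point relative to a camera that has only been translated."""
    return tuple(p - c for p, c in zip(point, camera_position))

def translate(point, offset):
    """Move a scene point by the given offset."""
    return tuple(p + o for p, o in zip(point, offset))

point = (3.0, 2.0, -4.0)

# Camera moved to (1,0,0), scene untouched.
moved_camera = view_from_camera((1.0, 0.0, 0.0), point)

# Camera left at (0,0,0), scene translated by (-1,0,0).
moved_scene = view_from_camera((0.0, 0.0, 0.0), translate(point, (-1.0, 0.0, 0.0)))
```

Both `moved_camera` and `moved_scene` come out as (2.0, 2.0, -4.0), so the renderer can’t tell the two situations apart.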

What I did was set the camera at a default position, say (0,0,5) (I’m using the upright Y-axis), and set the camera’s “up” vector to (0,1,0) (meaning its head is pointing upwards, so it’s level with the ground). Then everything else is done with respect to this default camera orientation.

So why am I doing all this? I was bored, I had time then, and I wanted to solve a particular problem. I’ll tell you more about the mechanics of the camera, and the problem some other time…

Lights, camera, action!

You’ve learned a bit about viewports already. Now, we’ll look at how lights and cameras are used in a 3D environment.

Natural light (or sunlight)

Natural light comes into play when you’re working with outdoor scenes. Practically everything is lit up. Outdoor scenes are also usually filled with objects. Trees, houses, grass.

[digression]
You never really think about grass until you realise you have to model and render every single blade of grass. Unless you create an illusion that there’s a sea of grass out there… Yes, there are ways.
[end digression]

Light rays in Sofia Cathedral

[image by -lvinst-]

There are also indoor scenes where natural light streams in through a window, and you get a rectangular block of light in the scene. That’s … a little out of scope for now. You can look up volumetric lighting for more information.

Right now, we’ll work on light that we can’t see. I know, it sounds contradictory, but think of it this way: You see an object because light fell on it, not because you see the light. Remember the short science lesson when we discussed wavelengths of different colour reflecting off surfaces?

The most prominent source of natural light is the sun (leave out the moon and the stars). As far as I know, we’ve only got one sun, so we only have one source of light to model.

For the purposes of modelling light rays, natural light rays are parallel. They are not, strictly speaking. But by the time they reach Earth, they are almost parallel. This makes it easier to model, because there’s only one angle to consider.

What angle am I talking about? At noon, light rays hit where you are at about 90 degrees with respect to the ground. At dawn (I know you might not wake up at dawn, just humour me), light rays hit where you are at say 10 degrees. At dusk, maybe 170 degrees. Yes, this is the angle.

With a point source as far as 150 million kilometres away, calculations with a difference of fractions of a degree are wasteful (and unnecessary). Parallel rays simplify calculations. We’ll look at the detailed calculations in another article. Maybe. I don’t like manual math calculations any more than you do…

Point sources of light

For simplicity, light sources are assumed to be point sources. Your desk lamp, your flashlight, your television, a candle flame, fireflies. And they emit light in all directions (yes, they’re very generous).

Different light angles on object vertices

I’ve made the light source visible so you can see its position relative to the cube, our main object in the scene. At close range, the areas around the (6) points on the cube are rendered and lit differently because light hits the cube at different angles.

Every single point on the cube is hit by our light source at a different angle. Imagine the calculations involved. This is the reason why natural light rays are assumed parallel.
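To make the difference concrete, here’s a small Python sketch comparing the direction towards a point light with the single direction used for parallel rays. The function names are made up, and the `diffuse` helper assumes a simple Lambert-style cosine term, which is a common choice but not something this lesson has covered.

```python
import math

def normalise(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def point_light_direction(light_position, vertex):
    """Direction from a vertex towards a point light: differs per vertex."""
    return normalise(tuple(l - p for l, p in zip(light_position, vertex)))

def directional_light_direction(direction_to_light, vertex):
    """Parallel rays: the same direction no matter where the vertex is."""
    return normalise(direction_to_light)

def diffuse(normal, to_light):
    """Assumed Lambert-style term: cosine of the angle, never below zero."""
    return max(0.0, sum(n * l for n, l in zip(normal, to_light)))
```

For a light at (0,2,0), the vertices (1,1,1) and (-1,1,1) get different directions from `point_light_direction`, so each needs its own calculation. With `directional_light_direction`, every vertex shares one direction.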

You might also notice that points closer to the light source are more brightly lit. How is that modelled? Let’s look at attenuation.

Attenuation (or falloff)

There are typically 2 types of attenuation: linear and quadratic attenuation. Basically they’re just functions of distance between the position of the light source and the object vertex (the point on the object) in question.

For illustration, let the “strength” of light, L, be full “power” 1.0 at the light source. The further the distance, the lower L becomes. So L could be
L = 1.0 / (c1 * d + c0)
where c1 and c0 are some constants and d is not equal to zero (if d is zero, L is to be 1.0, remember?)

That’s linear attenuation. Then just use L as part of the lighting calculations.

What about quadratic attenuation?
L = 1.0 / (c2 * d^2 + c1 * d + c0)
where c2, c1 and c0 are some constants and d is not equal to zero
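Both formulas fit in one small Python function. This is only a sketch: the name `attenuation` and the default constants are my assumptions, with c0 defaulting to 1.0 so that L is 1.0 at the light source itself.

```python
def attenuation(d, c0=1.0, c1=0.0, c2=0.0):
    """L = 1.0 / (c2 * d^2 + c1 * d + c0) for a distance d >= 0.

    Leave c2 at 0.0 for linear attenuation; set it for quadratic.
    """
    return 1.0 / (c2 * d * d + c1 * d + c0)

# Linear attenuation at distance 2 with c1 = 0.5: 1 / (0.5 * 2 + 1)
linear = attenuation(2.0, c1=0.5)

# Quadratic attenuation at distance 3 with c2 = 1.0: 1 / (1 * 9 + 1)
quadratic = attenuation(3.0, c2=1.0)
```

Here `linear` works out to 0.5 and `quadratic` to 0.1. With c0 = 1.0 the d = 0 case needs no special handling, because the formula already gives 1.0 there.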

Relax, the graphics API you use probably does this (almost) automatically. This is just to let you know what’s going on behind the scenes. I know OpenGL has inbuilt functions, so you can set your choice with just a function call.

There’s another type which is ranged attenuation. Basically, L becomes zero at a certain distance. With the 2 attenuation models above, you never really get zero. I’m not sure if it’s supported by popular engines, so you might want to just keep this in mind. Maybe you’d like to implement your own to simplify calculations or produce a certain effect.
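If you do roll your own, one possible (and by no means standard) version is a straight-line falloff that reaches zero at a chosen range. The name `ranged_attenuation` and the `max_range` parameter are made up for this sketch.

```python
def ranged_attenuation(d, max_range):
    """Falls linearly from 1.0 at the light source to 0.0 at max_range."""
    if max_range <= 0.0:
        raise ValueError("max_range must be positive")
    return max(0.0, 1.0 - d / max_range)
```

With `max_range` set to 10, a vertex 5 units away gets L = 0.5, and anything 10 or more units away gets exactly 0, so you can skip the lighting calculation for it entirely.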


Different types of spotlights

The different spotlights are used for different effects. You’re probably familiar with cylindrical or conical spotlights. They’re used to highlight people when they’re on stage.

The square spotlight and parallel lights are, well, … because we can. *smile* Remember, in the virtual world, it’s sometimes easier to create certain effects. You can create a heart-shaped spotlight if you really want to.

Ambient light

Sometimes, objects are lit brighter than expected. You’ve taken into consideration the natural light and the light sources you specified. Yet the objects still look a bit brighter.

This is ambient light in effect. Maybe your calculations aren’t as precise or as close to “reality” as you thought. Things like light from the other side of the planet bouncing up to the sky, bouncing off clouds, bouncing off the sea, bounce, bounce, bounce and ending up at your scene. Add to that light from digital watches, fireflies, neon signs, candle flames, office buildings and what-not, and you get a low amount of light that’s ever present.

To model this, we simply brighten everything up a little bit, say 5% more. We don’t care what the source is. See, that was easy.
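As a sketch, here’s one way to read “5% more” in Python: add a flat 0.05 to each colour channel and cap the result at 1.0. Treating ambient light as an addition (rather than a multiplication) is my assumption, and `add_ambient` is a made-up name.

```python
def add_ambient(colour, ambient=0.05):
    """Brighten an (r, g, b) colour, channels in 0.0..1.0, by a flat amount."""
    return tuple(min(1.0, channel + ambient) for channel in colour)
```

An unlit black surface (0,0,0) becomes (0.05, 0.05, 0.05) instead of staying pitch black, which is the whole point of ambient light.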


Rounding this discussion on light, we have radiosity. Basically it’s light bounced off another (diffuse) surface. It’s computationally intensive, so it’s not appropriate for real-time renderings and games.

I don’t have any specific pictures of radiosity to show you (because I haven’t installed my other rendering software with this feature). Try searching on Flickr for some examples.

Where’s the camera?

The virtual camera acts very much like real-world cameras. You are just able to do more stunts with them. *smile*

Camera positions are represented using a vector with X-, Y- and Z-coordinates. Depending on the graphics API you use, you might also have a W-coordinate, which is usually 1.0. For our discussion here, it’s not important. You can find more information by searching for homogeneous coordinates and affine transformations.

What are you looking at?

Having a camera position isn’t enough. You need to know what the camera is looking at. Imagine a camera moving from one point to another, yet it keeps looking at the same object.

For example, you change the camera from one position

Camera position 1

to another position, but still looking at the same object

Camera position 2

Are you upright?

A final component of cameras is the upright position. It’s easier just to show you. First, we have the camera in upright position.

Camera in upright position

Then we tilt it left a little.

Camera tilted left

Then we tilt it right a little.

Camera tilted right

We tilt it up a little.

Camera tilted up

And we tilt it down a little.

Camera tilted down

In particular, you can tilt left or right, and still look at the object in question and keep your camera position.

In practice, we usually keep the camera upright. In this case, our representation is not a point, but a direction. We still use the X-, Y-, Z-coordinates though (W as well, depending). Typically, we use (0, 1, 0, 0).

Note that you can probably use (0, 3, 0, 0) and still be fine. It’s the direction that counts, not the magnitude. So you don’t need to normalise the vector, or make the vector a unit vector (magnitude of 1 unit).

Note also that, depending on the graphics API you use, all 3 camera properties may be lumped together. So you may get a function requiring 9 parameters: 3 for the position, 3 for the look-at position, 3 for the upright direction.
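To give a feel for what such a 9-parameter function does with its inputs, here’s a pure-Python sketch modelled on the convention used by OpenGL’s `gluLookAt` (the camera ends up at the origin looking down the negative z-axis). The helper names are my own, and this is an illustration, not any API’s actual implementation.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def normalise(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)

def look_at(eye, target, up):
    """Build a 4x4 row-major view matrix from position, look-at and up."""
    f = normalise(sub(target, eye))   # viewing direction
    s = normalise(cross(f, up))       # camera's right; up's length drops out
    u = cross(s, f)                   # corrected up, perpendicular to both
    return [
        [s[0], s[1], s[2], -dot(s, eye)],
        [u[0], u[1], u[2], -dot(u, eye)],
        [-f[0], -f[1], -f[2], dot(f, eye)],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform(matrix, point):
    """Apply the view matrix to a 3D point (W assumed to be 1.0)."""
    column = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m * c for m, c in zip(row, column)) for row in matrix[:3])
```

With the camera at (0,0,5) looking at the origin, `transform` sends the origin to (0,0,-5), which is 5 units straight ahead. Passing (0,3,0) instead of (0,1,0) as the up direction gives the same matrix, since only the direction counts. If the up direction is parallel to the viewing direction, `normalise` divides by zero.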

Camera paths

Because of the representations of the camera, you can assign values to the camera positions, look-at’s and upright direction dynamically. This is how you can create spectacular views by constantly changing the camera position and what the camera (and viewer) looks at.

The easiest way is to iterate on a linear path. So at the start of time t1, the camera is at position 1. At t2, the camera is at position 2. Then you just do linear interpolation in between t1 and t2 and there you have it, a moving camera.
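The linear path can be sketched in a few lines of Python. The function name `camera_position_at` is made up, and clamping times outside t1..t2 to the end points is my own choice.

```python
def camera_position_at(t, t1, position1, t2, position2):
    """Linearly interpolate the camera position between two keyed times."""
    # Fraction of the way from t1 to t2, clamped to the range 0..1.
    u = (t - t1) / (t2 - t1)
    u = max(0.0, min(1.0, u))
    return tuple(a + (b - a) * u for a, b in zip(position1, position2))
```

For a camera at (0,0,5) at time 0 and (10,0,5) at time 2, asking for time 1 gives (5.0, 0.0, 5.0), halfway along the path. The same function works for the look-at position.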

This was the impetus behind my research in applying Bezier curves to camera positions.


So we’ve covered the basic understanding of light for 3D development. Note that sometimes, you might need to add more lights than necessary, even though in the 3D scene, those extra lights shouldn’t be there. The extra lights are to enhance (or sometimes correct) the final rendered scene.

Your focus should be on what’s finally rendered, not what’s accurately modelled. It’s about results.

We’ve also covered the basics of camera representations. Hopefully, you have an understanding of how to create better scenes through moving cameras.

That’s all for this lesson, and I hope you learned something from it.