Remember the stride

I was reading this article by Raymond Chen where he mentioned this:

Most of the time, your code won’t care (there are just pixels out there that you aren’t using), but if you use the Get­Buffered­Paint­Bits function to obtain direct access to the bits, don’t forget to take the stride into account.

The word “stride” evoked a ton of memories where I used to track (laboriously) the dimensions of a bitmap. This was back in the days when I was fiddling with writing my own computer games and computer graphics. Specifically, I was working with the screen graphics buffer.

Double buffers, 8-bit colours and screen resolutions

What happens (happened?) is that for fast visuals, we need a decent frame rate, and 30 frames per second is fast enough to fool human eyes that we’re looking at smooth uninterrupted motion. Back in those days, computer monitors weren’t that advanced, so the double buffer trick was used. (Is it still used? I don’t know…).

The double buffer trick refers to having 2 sets of screen buffers. The first one is used to “blit” the contents onto the screen. I’m using blit as a verb too, but this will take too long to explain if I told the story of blitting too… While this blitting is being done, the pixels of the next frame is drawn on the second buffer. When the drawing of pixels is completed, this second buffer is used to blit to the screen.

When you force the program to use double buffering, the next frame is held until the buffer has the drawing completed. This is what “lag” means. The frame rate is sort of ignored, because only when the next frame is completely drawn, will the buffer contents be displayed. Usually this isn’t a problem because the pixel drawing for one frame is within 0.03333 seconds (based on 30 FPS). As you go for higher resolutions and more complex objects being drawn and more complex calculations being done (such as calculating hit points and bullet trajectories), this next-frame-drawing gets slowed down.

In code, what you have are 2 bitmaps in memory and 1 pointer (yay, pointers!). You actually point the pointer to either bitmap based on which bitmap’s contents are to be blit.

  • First bitmap on screen, second bitmap drawing next frame, point to first bitmap.
  • Second bitmap done drawing.
  • Point to second bitmap.
  • Draw next frame on first bitmap.
  • Continue till application (most probably a game) is done.

“So what’s the stride got to do with this? And what’s a stride?”

Well, the thing is, when you request a bitmap from within your code, you might not get the exact dimensions you want. What you get is, as Raymond mentioned:

it only promises that the bitmap will be at least the size you requested.

So the bitmap given to you can be larger in size. And the larger size is based on the stride. I’ve not done a whole lot of research, and the following explanation is based on what I remember from those game-developing days. Let’s say the stride is 4 bytes. This means the memory size of the bitmap given to you will be in multiples of 4. … Uh, yeah, I think that’s about it.

If you ask for a bitmap with dimensions such that the projected final size memory is not in multiples of 4, you will be given a bitmap such that it is.

This “problem” is compounded by the fact that you also have to take care of the red, green and blue bytes (3 contiguous bytes for 1 pixel). Sometimes, there’s an alpha component. There’s also the 8-bit colour, where the first 2 bits are for red, next 3 bits are for green, and the final 2 bits are for blue. Not 1 byte for each colour component. (Just FYI, green has more bits because we can differentiate more shades of green than red or blue).

This is why you might find that in graphics programming, you are advised to have dimensions in multiples of 2. For example, your texture images used for mapping onto 2D planes and 3D objects should preferably have dimensions in powers of 2.

Wait, why powers of 2, and not just multiples of 2? I believe it has something to do with the texture mapping functions, because those functions don’t work well when the bitmap/image/texture argument doesn’t have dimensions with powers of 2. This is why I prefer to use square images at 128, 256 or 512 pixels. Mipmaps were used to alleviate this, but that’s another topic…

And the final complication? “What? There’s more?!?” Yes.

The bitmap you requested in code, the one where you might have to take note of the stride? That bitmap might have a different dimension than the screen dimension of the computer monitor. Computer monitors weren’t quite “standard” back then (I’m starting to feel old…). The computer monitor also has its own stride (I’m basing this on the memories of my research. Don’t just take my word for it). This means blitting pixels from a bitmap buffer to the screen isn’t quite so straightforward.

For example, if you’re working on a screen resolution of 800 by 600 pixels (yes, I know that’s like ancient. Work with me here…), and then you ask for a bitmap to represent that. Well, you might get a bitmap with dimensions 1024 by 768 pixels. Maybe it’s because it’s more memory efficient that way. 1024 = 2^10, and 768 = 2^9 + 2^8. Wait, now we have sums of powers of 2?!? I don’t know a whole lot about these things… I’m just trying to make a computer game…

So based on the above example, you have an “extra” width of 224 pixels (1024 – 800) and “extra” height of 168 pixels (768 – 600). So even if the stride is taken note of, the computer might just decide to throw you more memory space. Just for the heck of it.

In summary…

The bitmap you request in code might have a different dimension than what you wanted. The computer monitor might have a different dimension to work with. You have to remember each pixel has a red, green, blue and even alpha component (each of which uses a byte of storage. Or not, in the case of 8-bit colours). Then you have to take note of the dimensions of the textures you’re using to map onto 2D/3D objects.

And none of that has very much to do with the “fun” part of game programming. You’re just trying to work with one image.

I hope you learnt something. Because I sure as heck don’t know what’s the point I was trying to make, other than bombard you with seemingly random pieces of information…

Moving backgrounds, different speeds

Back when I was younger (which is an obtuse way of saying “I haven’t the friggin’ idea exactly when”), I dabbled a bit in game development. There was a period when I was studying side scroller games. Remember those? The classic Super Mario Brothers was one of them.

I also noticed that in some of the games, the backgrounds moved. Yes, backgrounds, plural. I could understand forming a background “tile” made up of hills, clouds, trees, grass, flowers, rocks and whatever suited the game as background. But there was something, else, moving in the (for lack of a better word) background.

There was another background layer, moving at a different speed. Wait. Oh, it’s moving at a slower speed.

When I moved that little sprite (that’s representing my sole means of interacting with the game) on the screen to the right, the flowers and trees and rocks sped past to the left. But that faraway mountain was moving to the left at a slower speed. And the overall effect was a realistic simulation of 3D, a semblance of depth in an essentially 2D game.

Now that I think about it, I have one question. How do you calculate how slow the other background should be? I searched high and low, though I found what this effect is called (parallax scrolling), I found no trace of any suggestion to the relative speeds between the 2 backgrounds.

So I did a little thinking. And drawing. I was trying to work out mathematically the slower speed, given the “distance” between the background layers (there’s practically no distance in implementation. Maybe 0.01 units…) and the speed of the background that’s “in front”.

It didn’t make sense, because no matter how I pivot the movement, the calculations don’t work out.

Parallax scrolling backgrounds

L1 and L2 are the “distances” between the respective layers. d1 and d2 are the distances from the objects in question to the perpendicular line formed by the sprite position. v1 and v2 are the velocities of the respective layers moving to the left (or right, depending on how you view this whole thing and how you define the direction… never mind).

The layers aren’t really separate. There is a tiny distance between the layers, say 0.01 units. If you’re in a fully 2D environment, then the farthest layer is drawn, then the next closest layer is drawn, subject to transparency to allow elements from the farthest layer to be shown, and then the playing layer is drawn (where our sprite and other objects are). There’s no distance (between the layers) to speak of in a true 2D rendering environment.

I started with the “don’t move the focus, move everything else” approach, keeping the sprite in place, and moving both backgrounds to the left. This meant pivoting around the sprite. The objects drawn on the other two layers are what our sprite would see in a straight line towards somewhere forward. Those objects should coincide at the “perpendicular” line together.

Since the distances d1 and d2 are obviously different, therefore the velocity (or speed. I’m just trying to be scientifically correct here) of the two objects moving to the left must be different. There lies my problem. It meant the farthest object had to travel faster, contradicting our original observation.

What if we pivoted around the object in the “front” layer? The sprite moves to the right, and the object on the “back” layer moves to the left, and all three line up in a perpendicular line (perpendicular to the layers anyway). Too troublesome. Same with pivoting around the farthest object.

I toyed with the idea of pivoting around the vanishing point. At this point (no pun intended), I decided to give up.

So I assumed that the background image(s) in the “back” layer are appropriately sized with respect to the “front” layer. I decided a simple ratio probably worked best. Thus we have
v2 = v1 / (L1 + L2)
which should give an appropriately slowed velocity.

And now, finally, I’m telling you this. It might not matter. What matters is that you test the velocities, and if the 2 background layers scroll at a pleasing velocity, then there you have it. Ultimately, we’re just trying to simulate a 3D perspective given a 2D environment. If it’s believable, then that’s the correct velocity.

Multi-sensory input in artificial intelligence

Robot playing chess Artificial intelligence has always been one of my favourite topics of interest. Artificial intelligence or AI, as it’s commonly known, is

the study and design of intelligent agents where an intelligent agent is a system that perceives its environment and takes actions which maximizes its chances of success.

I think of AI more in terms of simulation. How can a certain behaviour be simulated? How can a certain phenomena be simulated? For example, flock behaviour can be simulated using simple rules yet resulting in complex patterns. I’ve even tried my hand at computer virus simulations.

There was a time when I was into game development. I mean, like really into it. I joined an independent game development group. Can’t remember from where (I think I found the group from GameDev). Can’t remember the group name too (I think it has a “rhino” somewhere…)

Anyway, there wasn’t much documentation yet, except that it’s a “get the flag” kind of game (kinda like paintball). I was in charge of the AI component for simulating the computer controlled opposing team. Since this genre was new to me (I don’t play these kinds of games, real life or otherwise), I had to do some research and thinking.

Visual and aural range AI component I came up with using a visual and aural component for the AI. The enemy would have the standard 60 (or 45 or 90) degree field of vision, and this has a cutoff distance (or a gradual decline of visibility). Then there’s 360 degree aural sensory input, but with a small cutoff distance.

This allowed seeing long distances but with a narrow scope. Rendering fog only limits the player. Computer controlled enemies can see till infinity, hence the cutoff distance. Well, if we limit the vision, then a player can sneak up to an enemy from behind and whack him on the virtual head. Not realistic.

So I thought up a 360 degree “alarm” system by incorporating an aural component. Any sound made by the player such as moving on sand (or water or gravel) and firing would alert the enemy (from behind, sideways, anywhere).

I was going to add in some sensitivity control to the aural component, like more receptive from in front than from behind. Then I decided that long distance communication, slow responses and school work was too much. So I sent a polite email to the team leader saying I’m glad to have been part of the team and I really need to focus on my studies. Then I left.

That ended my short-lived stint as a game developer.

This just came into my mind. In a virtual environment, autonomous agents can be omnipresent and omniscient. They can be anywhere in the blink of an eye and know everything. In a virtual environment, we try to limit the capabilities of our simulations to create a believable, realistic character.

In the real world, we try to expand the capabilities of our simulations running in robots. In our physical world, robots already face lots of limitations.

I see a squeeze theorem event waiting to happen…

Rapidly calculating Bezier points

The standard cubic Bezier curve is given by
B(t) = (1-t)3p0 + 3(1-t)2tp1 + 3(1-t)t2p2 + t3p3
where p0, p1, p2, p3 are the control points and t is in the interval [0, 1].

Very elegant, but not very practical. Too many multiplications and additions to be used in a fast-paced environment such as game development. If it’s a 3D Bezier curve, 3 separate calculations are needed for the x-, y- and z-component for a given t value. What we need is a simplication of the equation.

Expanding the polynomial equation, we have
B(t)
= (1 – 3t + 3t2 – t3)p0
+ (3t -6t2 + 3t3)p1
+ (3t2 – 3t3)p2
+ t3p3

Rearranging the coefficients of tn, we have
B(t)
= t3(-p0 + 3p1 – 3p2 + p3)
+ t2(3p0 – 6p1 + 3p2)
+ t(-3p0 + 3p1)
+ p0

The coefficients can now be reduced to constants by precalculating them, and calculation of a Bezier point takes less computation.

In matrix form, the equation looks like
Coefficient matrix form of Bezier curve