Interpolation is a method of calculating a value from a set of given values. We’ll be looking at interpolation with a bias towards image processing, but the theory can be generalised for other purposes. You’ve probably already solved some interpolation problems without knowing it. Let me give you an example.
A distance problem
Suppose there are 3 towns A, B, C and they happen to lie on a straight line, in that order. B is 5 kilometres away from A, and C is 15 kilometres away from A. If you travel one quarter of the way from town B to town C, how far are you from town A?
To solve it, you can figure out the distance between B and C, which is 15 – 5 = 10 km. One quarter of the way means 1/4 * 10 = 2.5 km. Then add the distance between A and B to this and you have 5 + 2.5 = 7.5 km.
If you visualise the problem as interpolating between 2 points, then B becomes the point p0 with a value of 5 (km) and C becomes the point p1 with a value of 15 (km). The usual variable used is t, so the generic formula is:
f(t) = (1 – t) * p0 + t * p1, where t lies between 0 and 1 inclusive.
Using this, we have
f(1/4) = (1 – 1/4) * 5 + 1/4 * 15
= 3/4 * 5 + 1/4 * 15
This is linear interpolation. Linearity refers to the power of the variable t, which is 1. Note that there’s no stopping you from using negative values of t or values greater than 1.
Suppose you travelled from B to A one quarter of the distance between B and C. How far are you from town A?
f(-1/4) = (1 – (-1/4)) * 5 + (-1/4) * 15
= 5/4 * 5 – 1/4 * 15
Suppose you travelled from B to C and went past C by a quarter of the way. How far are you from town A?
f(5/4) = (1 – 5/4) * 5 + 5/4 * 15
= -1/4 * 5 + 5/4 * 15
What happens if you get a negative result?
f(-1) = (1 – (-1)) * 5 + (-1) * 15
= 2 * 5 – 15
It means you’re 5 kilometres away from town A. You’re just in the opposite direction from towns B and C. The calculation result is correct. It’s how you interpret the value.
Applications in image processing
A common operation in image processing is manipulating height maps. Height maps are usually greyscale bitmap files where a white pixel (RGB values are 255 for all 3) is the highest point, and a black pixel (RGB values are 0 for all 3) is the lowest point.
You know enlarging photographs can give you some weird results. What happens is you’re trying to fill in the blanks in a larger image using values from the original image. Where do you think the image editing software comes up with values? Interpolation.
If you think of the red, green and blue values of image pixels as 3 different “height maps”, then you’re just performing interpolation on 3 values. Suppose we’re talking about linear interpolation between two pixels. You’ll interpolate between the red component of the 2 pixels and get a value. Similarly you do it for the green and blue components. The calculated results of the red, green and blue become the interpolated colour.
Cubic Bezier interpolation
There are all kinds of cubic curves available. The Catmull–Rom spline, the non-uniform rational B-spline (NURBS) and I didn’t really want to write anything on the subject after I remember my Hermite splines… I love Bezier curves though, so I thought maybe I can write something with that.
Instead of 2 points used in linear interpolation, cubic interpolation uses 4 points. To illustrate, suppose you’re on an undulating plain with small hills undulating in their usual carefree manner. You’re in between two such (undulating) hills and you want to find out how high you are.
Instead of linear interpolating your way through these two (undulating) hills, the better way will be to interpolate with another 2 (undulating) hills! Ok, I’m stopping with the undulating thing…
The Bezier curve equation looks like this:
B(t) = (1-t)^3 * p0 + 3*(1-t)^2 * t * p1 + 3*(1-t)* t^2 * p2 + t^3 * p3
where p0, p1, p2, p3 are the (height) values, and t lies between 0 and 1 inclusive.
You will be between p1 and p2. Let’s also assume that the hills are equidistant from each other. Like the pixels on an image, the hills shall be of equal distance from its neighbour.
Because of this equidistant property, p1 is 0.33 (roughly 1/3) units away from p0, p2 is 0.67 (roughly 2/3) units away from p0 and p3 is 1 unit away from p0.
How do you know what’s the value of t to use? You might be able to calculate the t if you do linear interpolation between p1 and p2. But that t value is different from the t value in the Bezier curve.
Ahhh… once you get the t-linear value, you interpolate with 0.33 and 0.67 to get the t-Bezier value. Confused? Suppose you’re one quarter way from p1 to p2. Your t-linear value is 1/4. Interpolate that with 0.33 and 0.67 to get
f(1/4) = (1 – 1/4) * 0.33 + 1/4 * 0.67
And 0.415 is your t-Bezier value. Voila!
You skipped the quadratic power!
I know. It’s logical to think that there’s a power 2 somewhere. But there isn’t. There is one fundamental flaw with quadratic interpolation. Which segment do you use?
Interpolation is just a method of creating data values using a set of existing data. What those created values mean is up to you to interpret.
In image processing, interpolation can be used to fill in blanks when enlarging an image. It doesn’t guarantee that the enlarged image looks good. Image processing is very much an aesthetic-based operation. I’ll talk a bit more on this when I get to writing code to rotate images.