The math behind 360 degree fisheye to landscape conversion

I wrote an article on converting a 360 degree fisheye image to a landscape view some time ago. I also realised I didn’t explain the math very much, mainly because I thought it was fairly obvious. In hindsight, it doesn’t seem obvious. My apologies.

Commenter Eric pointed out a math technique called linear fractional transformation. It’s basically a function that can map lines and circles to lines or circles.

In theory, it seems applicable to our problem. But when I tried working out the solution, I failed. Two possible reasons: my math sucks, or the technique isn’t applicable to the problem. It’s probably the former…

My postulate is that the fisheye-to-landscape conversion has an edge condition that maps a line to a point. Specifically, the bottom edge of the destination image maps to a single point: the centre of the source image. Thus a linear fractional transformation is probably not suitable. I’m happy to hear from you if you’ve found a way to make it work.
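For reference, a linear fractional (Möbius) transformation has the form

f(z) = (az + b) / (cz + d)
where a, b, c, d are constants with ad - bc ≠ 0.

That non-zero condition makes the transformation invertible wherever it’s defined, so distinct points always map to distinct points. A mapping that collapses the whole bottom edge of the destination image into a single point can’t be invertible there, which is why I suspect the technique doesn’t fit.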

Let’s bring up the explanation diagram I had from before:

[Diagram: fisheye to landscape explanation]

I have assumed a source image that’s square (width equals height) and whose width is an even number of pixels. With this, let the given image have a width and height of 2l pixels. The idea is to construct an image with a width of 4l pixels and a height of l pixels, which is the landscape view of the source fisheye image.

The dimensions of the destination image were arbitrarily chosen. I just find a height of l pixels (half the height of the source image) convenient. The centre of the source image is assumed to be the centre of the small “planet”. This means the pixels along the horizontal and vertical bisectors of the source image will not be distorted (much) by the conversion.
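As a quick illustration, the setup might look like this (sourceImage is a hypothetical square bitmap; the names are mine, not from the original code):

int l = sourceImage.Width / 2;   // half the source width, i.e. the source is 2l by 2l
int destWidth = 4 * l;           // the destination is 4l pixels wide
int destHeight = l;              // and l pixels tall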

Oh right, I haven’t told you the exact details of the conversion math…

You should know that only the pixels in the inscribed circle of the source image end up in the destination image. This is due to the “uncurling” effect. The pixels outside the inscribed circle would be out of range in the destination image.
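In raster coordinates (origin at the top left of the source image), a source pixel at column u and row v (hypothetical names) lies in the inscribed circle when

(u - l)^2 + (v - l)^2 <= l^2

since the circle is centred at (l, l) with radius l.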

So, imagine the source image as a representation of the Cartesian plane. The centre of the image is the origin. The point A is the eastern point of the inscribed circle. Points B, C and D are the northern, western and southern points of the inscribed circle respectively.

I’m using the Cartesian plane because the Cartesian quadrants make the math easier. Circles mean sines and cosines, so I’d rather work with angles in the typical form than do all the funny angle modifications. I’m not masochistic, you know…

What you should understand now is this: the pixels along the top of the destination image come from the pixels along the circumference of the inscribed circle on the source image.

We’ll be iterating over the destination image (remember my messy index preference?). Let’s start at the top left corner and iterate 4l pixels across to the top right corner. This is visualised as going clockwise on the source image from point A, to D, to C, to B, and back to A.

So, 4l pixels is equivalent to 2 PI radians?

At the top left corner, we start with 2 PI radians (so to speak). As we iterate to the top right corner, the angle reduces to 0. Thus this formula:

theta = (4l - x)/(4l) * 2 PI
where x is the horizontal coordinate (the column index) of the destination image, running from 0 at the left edge to 4l at the right edge.
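A quick sanity check against the clockwise path from point A:

x = 0 gives theta = 2 PI (point A)
x = l gives theta = 3/2 PI (point D)
x = 2l gives theta = PI (point C)
x = 3l gives theta = 1/2 PI (point B)
x = 4l gives theta = 0 (point A again)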

Generally speaking, iterating from the left to right on the destination image is equivalent to going clockwise on the source image.

Now, as we iterate from the top left of the destination image to the bottom left, it’s equivalent to going from point A on the source image to the centre of the source image. Thus:

radius = l - y
where y is the vertical coordinate (the row index) of the destination image, running from 0 at the top to l at the bottom.
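With theta and radius in hand, the source pixel falls out of the usual polar-to-Cartesian conversion (on the source image’s plane, origin at the centre):

x_cartesian = radius * cos(theta)
y_cartesian = radius * sin(theta)

These still have to be converted to raster coordinates on the source image; assuming the usual shift-and-flip, that would be column = x_cartesian + l and row = l - y_cartesian, which is exactly the merging mentioned below.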

Generally speaking, iterating from top to bottom of the destination image is equivalent to going from the edge of the source image to its centre.

And once you understand that, the rest is just coding. I merged in the code for converting Cartesian to raster coordinates (more details here, with code here). The code was deliberately left unoptimised so it’s easier to read.

For example,

theta = 2.0 * Math.PI * (double)(4.0 * l - j) / (double)(4.0 * l);

could be

theta = Math.PI * (double)(4.0 * l - j) / (double)(2.0 * l);

to save on operations. But keeping the 2.0 * Math.PI spells out the intent better.

The if condition

if (x >= 0 && x < (2 * l) && y >= 0 && y < (2 * l))

could have had the (2 * l) part assigned to a variable to avoid multiplying repeatedly. You’re welcome to use a variable named, say, iSourceWidth.
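Putting everything together, here’s a minimal end-to-end sketch. It assumes a hypothetical square System.Drawing.Bitmap named sourceImage, nearest-neighbour sampling, and the shift-and-flip Cartesian-to-raster conversion described earlier; treat it as an illustration, not the original article’s code.

int l = sourceImage.Width / 2;
Bitmap destImage = new Bitmap(4 * l, l);

for (int i = 0; i < l; ++i)            // destination rows, top to bottom
{
    for (int j = 0; j < 4 * l; ++j)    // destination columns, left to right
    {
        // Left to right on the destination sweeps clockwise from 2 PI down to 0.
        double theta = 2.0 * Math.PI * (double)(4.0 * l - j) / (double)(4.0 * l);
        // Top to bottom on the destination walks from the circumference to the centre.
        double radius = (double)l - i;

        // Polar to Cartesian on the source (origin at the centre),
        // then Cartesian to raster (shift the origin, flip the y axis).
        int x = (int)(radius * Math.Cos(theta)) + l;
        int y = l - (int)(radius * Math.Sin(theta));

        // Copy only pixels that actually land inside the source image.
        if (x >= 0 && x < (2 * l) && y >= 0 && y < (2 * l))
        {
            destImage.SetPixel(j, i, sourceImage.GetPixel(x, y));
        }
    }
}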

And that's all I have. I hope you have fun with the code.