The math behind 360 degree fisheye to landscape conversion

I wrote an article on converting a 360 degree fisheye image to a landscape view some time ago. I also realised I didn’t explain the math very much, mainly because I thought it was fairly obvious. In hindsight, it doesn’t seem obvious. My apologies.

Commenter Eric pointed out a math technique called linear fractional transformation. It’s basically a function that can map lines and circles to lines or circles.

In theory, it seems applicable to our problem. When I tried working out the solution, I failed. 2 possible reasons: my math sucks, or the technique isn’t applicable to the problem. It’s probably the former…

My postulate is that the fisheye-landscape conversion has an edge condition that maps a line to a point. Specifically, the bottom of the destination image maps to one point: the centre of the source image. Thus a linear fractional transformation is probably not suitable. I’m happy to hear from you if you’ve found a way to make it work.
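
For reference, here’s the general form of a linear fractional (Möbius) transformation on the complex plane. It doesn’t appear anywhere in the conversion code; I’m only showing it to make the edge condition argument concrete:

f(z) = (az + b) / (cz + d), where ad - bc ≠ 0

Such a transformation is invertible (every output point comes from exactly one input point), so it can’t collapse a whole line onto a single point the way the bottom edge of the destination image collapses onto the centre of the fisheye.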

Let’s bring up the explanation diagram I had from before:

Fisheye to landscape explanation diagram

I have assumed a source image that’s square (width equals height) and whose width has an even number of pixels. With this, let the given image have a width and height of 2l pixels. The idea is to construct an image with a width of 4l pixels and a height of l pixels, which is the landscape view of the source fisheye image.

The dimensions of the destination image were arbitrarily chosen. I just find a height of l pixels (or half the height of the source image) to be convenient. The centre of the source image is assumed to be the centre of the small “planet”. This means the pixels along the horizontal and vertical bisectors of the source image will not be distorted (much) by the conversion.
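
In code, that dimension setup looks like this (a snippet of the full listing further down; bm is the square source bitmap):

int l = bm.Width / 2;                        // source image is 2l by 2l pixels
Bitmap bmDestination = new Bitmap(4 * l, l); // landscape view is 4l by l pixels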

Oh right, I haven’t told you about the exacts of the conversion math…

You should know that only the pixels in the inscribed circle of the source image end up in the destination image. This is due to the “uncurling” effect. The pixels outside the inscribed circle would map above the top edge of the destination image, so they’re out of range.

So, imagine the source image as a representation of the Cartesian plane. The centre of the image is the origin. The point A is the eastern point of the inscribed circle. Points B, C and D are the northern, western and southern points of the inscribed circle respectively.

I’m using the Cartesian plane because the Cartesian quadrants make the math easier. Circles mean sines and cosines, so I’d rather work with angles in the typical form than do all the funny angle modifications. I’m not masochistic, you know…

What you should understand now is this: the pixels along the top of the destination image come from the pixels along the circumference of the inscribed circle on the source image.

We’ll be iterating over the destination image (remember my messy index preference?). Let’s start at the top left corner and iterate 4l pixels to the top right corner. This is visualised as going clockwise on the source image from point A, to D, to C, to B and back to A.

So, 4l pixels is equivalent to 2 PI radians?

At the top left corner, we start with 2 PI radians (so to speak). As we iterate to the top right corner, the angle reduces to 0. Thus this formula:

theta = (4l - x)/(4l) * 2PI
where x is the horizontal position (column) on the destination image.

Generally speaking, iterating from the left to right on the destination image is equivalent to going clockwise on the source image.

Now, as we iterate from the top left of the destination image to the bottom left, it’s equivalent to going from point A on the source image to the centre of the source image. Thus:

radius = l - y
where y is the vertical position (row) on the destination image.

Generally speaking, iterating from the top to bottom of the destination image is equivalent to going from edge of source image to centre of source image.
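
Putting the two formulas together, here’s a minimal sketch of the destination-to-source mapping (x and y are destination raster coordinates; the variable names are mine, not from the full listing below):

// x goes from 0 to 4l - 1 (left to right), y goes from 0 to l - 1 (top to bottom)
double theta = 2.0 * Math.PI * (4.0 * l - x) / (4.0 * l); // 2 PI down to 0
double radius = l - y;                                    // l down to 0
// Cartesian coordinates of the source pixel, with the origin at the centre of the fisheye
double srcX = radius * Math.Cos(theta);
double srcY = radius * Math.Sin(theta);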

And once you understand that, the rest is just coding. I merged the code for converting Cartesian coordinates to raster coordinates into the index calculations (I’ve covered the details and code for that in earlier articles). The code was deliberately left unoptimised so it’s easier to read.

For example,

theta = 2.0 * Math.PI * (double)(4.0 * l - j) / (double)(4.0 * l);

could be

theta = Math.PI * (double)(4.0 * l - j) / (double)(2.0 * l);

to save on operations. But the 2.0 * Math.PI makes it more meaningful.

The if condition

if (x >= 0 && x < (2 * l) && y >= 0 && y < (2 * l))

could have had the (2 * l) part assigned to a variable to avoid repeating the multiplication. You're welcome to use a variable, perhaps named iSourceWidth.
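
For example, the check could be written with the suggested variable like this (iSourceWidth is just an illustrative name; it isn't in the listing below):

int iSourceWidth = 2 * l; // computed once, outside the loops
if (x >= 0 && x < iSourceWidth && y >= 0 && y < iSourceWidth)
	bmDestination.SetPixel(j, i, bm.GetPixel(x, y));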

And that's all I have. I hope you have fun with the code.

Convert 360 degree fisheye image to landscape mode

In this article, you will learn how to flatten a 360 degree fisheye image back to its landscape panoramic form. But first, what’s a 360 degree fisheye image?

[WARNING: graphic-intensive article ahead]

Lillestrøm in fisheye mode
[image by gcardinal]

It’s created by using a fisheye lens on your camera, taking a bunch of pictures with it, and applying a stereographic projection to get the final image. Or so I understand.

Basically, it’s like taking a normal (but panoramic is better) picture and “curling” it along its bottom or top (but much more complicated than that).

Here’s how to visualise the fisheye image. Hold your right hand, palm downwards, in front of you, thumb towards you, little finger away from you. The tips of your fingers form the “left end” of the image. Your wrist forms the “right end” of the image. Now form a fist with your right hand. There’s your fisheye image.

The conversion algorithm explanation

So given the fisheye image, we want to get that landscape image back. And the way we do that, is to “uncurl” the fisheye image. Here’s the diagram to explain the logic:

Fisheye to landscape explanation diagram

You may have noticed that the corners of the source image will not be in the resulting image. You may also notice that the centre of the source image is very “squeezed”, and the pixels around there will be repeated in the resulting image.

The first problem can be solved by using a bigger destination image, but the corners don’t matter much to me, and you’d get a jagged top with unfilled pixels. I didn’t like that, so I decided to give them up.

The second problem… I don’t know if there’s a solution, because you’re trying to “guess” the destination pixels from a part of the source image that holds less pixel information. The simplest solution seems to be to get a higher resolution source image, but that only mitigates the problem; it doesn’t solve it.

You may also notice that only the pixels within the inscribed circle of the source image are used. Well, what do you get when you curl up a line? A circle. *wink*

What happens when circles are involved? Radius and angles, that’s what.

So in the destination image, in raster coordinates, going from top to bottom is equivalent to going from the inscribed circle of the source image to the centre of the source image. Or the radius slowly reduces from l to zero.

Going from left to right is equivalent to going from 2PI radians down to 0 radians on the inscribed circle. It’s also equivalent to going from 0 radians down to -2PI radians. Sines and cosines are periodic functions.
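
To see why the two ranges are interchangeable, subtract 2PI from the first formula:

(4l - x)/(4l) * 2PI - 2PI = (-x)/(4l) * 2PI

Since cos(theta - 2PI) = cos(theta) and sin(theta - 2PI) = sin(theta), both versions pick out the same source pixel. That’s why the code below uses the shorter -j version, and keeps the 2PI version as a comment.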

Here’s another diagram to show what happens when we iterate over the destination image:

Before we get to the code, here are 2 assumptions to simplify the process:

  • The source image is square
  • The width of the source image is even

They’re not necessary, but they make the programming easier. And I’m mapping the quadrants to the standard Cartesian quadrants because they make the math easier. The centre of the image should be the centre of the “circle” (or that small planet, as it’s affectionately known).

[The original source image wasn’t square, and its centre wasn’t the centre of the planet. So I cut the image, and guessed as best as I could on the centre. More info on the centre later in the article.]

Fisheye to landscape algorithm/code

I’m plagiarising my own code from the image rotation with bilinear interpolation article for the bilinear interpolating parts. There are 2 resulting images, one with and one without bilinear interpolation. And here’s the code:

// requires a reference to System.Drawing (using System.Drawing;)
// assume the source image is square, and its width has an even number of pixels
Bitmap bm = new Bitmap("lillestromfisheye.jpg");
int l = bm.Width / 2;
Bitmap bmDestination = new Bitmap(4 * l, l);
Bitmap bmBilinear = new Bitmap(4 * l, l);
int i, j;
int x, y;
double radius, theta;

// for use in neighbouring indices in Cartesian coordinates
int iFloorX, iCeilingX, iFloorY, iCeilingY;
// calculated indices in Cartesian coordinates with trailing decimals
double fTrueX, fTrueY;
// for interpolation
double fDeltaX, fDeltaY;
// pixel colours
Color clrTopLeft, clrTopRight, clrBottomLeft, clrBottomRight;
// interpolated "top" pixels
double fTopRed, fTopGreen, fTopBlue;
// interpolated "bottom" pixels
double fBottomRed, fBottomGreen, fBottomBlue;
// final interpolated colour components
int iRed, iGreen, iBlue;

for (i = 0; i < bmDestination.Height; ++i)
{
	for (j = 0; j < bmDestination.Width; ++j)
	{
		radius = (double)(l - i);
		// theta = 2.0 * Math.PI * (double)(4.0 * l - j) / (double)(4.0 * l);
		theta = 2.0 * Math.PI * (double)(-j) / (double)(4.0 * l);

		fTrueX = radius * Math.Cos(theta);
		fTrueY = radius * Math.Sin(theta);

		// "normal" mode
		x = (int)(Math.Round(fTrueX)) + l;
		y = l - (int)(Math.Round(fTrueY));
		// check bounds
		if (x >= 0 && x < (2 * l) && y >= 0 && y < (2 * l))
			bmDestination.SetPixel(j, i, bm.GetPixel(x, y));

		// bilinear mode
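		// same Cartesian-to-raster conversion, but keeping the fractional parts for interpolation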
		fTrueX = fTrueX + (double)l;
		fTrueY = (double)l - fTrueY;

		iFloorX = (int)(Math.Floor(fTrueX));
		iFloorY = (int)(Math.Floor(fTrueY));
		iCeilingX = (int)(Math.Ceiling(fTrueX));
		iCeilingY = (int)(Math.Ceiling(fTrueY));

		// check bounds
		if (iFloorX < 0 || iCeilingX < 0 ||
			iFloorX >= (2 * l) || iCeilingX >= (2 * l) ||
			iFloorY < 0 || iCeilingY < 0 ||
			iFloorY >= (2 * l) || iCeilingY >= (2 * l)) continue;

		fDeltaX = fTrueX - (double)iFloorX;
		fDeltaY = fTrueY - (double)iFloorY;

		clrTopLeft = bm.GetPixel(iFloorX, iFloorY);
		clrTopRight = bm.GetPixel(iCeilingX, iFloorY);
		clrBottomLeft = bm.GetPixel(iFloorX, iCeilingY);
		clrBottomRight = bm.GetPixel(iCeilingX, iCeilingY);

		// linearly interpolate horizontally between top neighbours
		fTopRed = (1 - fDeltaX) * clrTopLeft.R + fDeltaX * clrTopRight.R;
		fTopGreen = (1 - fDeltaX) * clrTopLeft.G + fDeltaX * clrTopRight.G;
		fTopBlue = (1 - fDeltaX) * clrTopLeft.B + fDeltaX * clrTopRight.B;

		// linearly interpolate horizontally between bottom neighbours
		fBottomRed = (1 - fDeltaX) * clrBottomLeft.R + fDeltaX * clrBottomRight.R;
		fBottomGreen = (1 - fDeltaX) * clrBottomLeft.G + fDeltaX * clrBottomRight.G;
		fBottomBlue = (1 - fDeltaX) * clrBottomLeft.B + fDeltaX * clrBottomRight.B;

		// linearly interpolate vertically between top and bottom interpolated results
		iRed = (int)(Math.Round((1 - fDeltaY) * fTopRed + fDeltaY * fBottomRed));
		iGreen = (int)(Math.Round((1 - fDeltaY) * fTopGreen + fDeltaY * fBottomGreen));
		iBlue = (int)(Math.Round((1 - fDeltaY) * fTopBlue + fDeltaY * fBottomBlue));

		// make sure colour values are valid
		if (iRed < 0) iRed = 0;
		if (iRed > 255) iRed = 255;
		if (iGreen < 0) iGreen = 0;
		if (iGreen > 255) iGreen = 255;
		if (iBlue < 0) iBlue = 0;
		if (iBlue > 255) iBlue = 255;

		bmBilinear.SetPixel(j, i, Color.FromArgb(iRed, iGreen, iBlue));
	}
}

bmDestination.Save("fisheyelandscape.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
bmBilinear.Save("fisheyebilinearlandscape.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);

So what did we get from our fisheye image?

Let’s look at our results, shall we? First, let’s bring our source image back.

Lillestrøm in fisheye mode

Here’s the straight-up result from the algorithm:

Lillestrøm in landscape mode
[click for larger image]

Here’s the result with bilinear interpolation:

Lillestrøm in landscape mode with bilinear interpolation
[click for larger image]

And I’m fortunate that the original photographer had the original landscape version for comparison:

[image by gcardinal]

Hmm… better than I expected. The code was also easier than I expected. I think it’s the math that was harder…

The fisheye image I used is one of those where the bottom of the original landscape image is curled into the centre. The other type is where the top of the original image is curled into the centre.

In that case, using the algorithm provided results in an upside-down image. I’ll leave flipping the resulting image right-side-up as an exercise for you.
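
If you’d rather not do the exercise, one possible shortcut (using the same System.Drawing library as the rest of the code) is the built-in RotateFlip method:

// rotate the result 180 degrees if it comes out upside-down
bmDestination.RotateFlip(RotateFlipType.Rotate180FlipNone);
// or flip only the vertical direction, if that's all your source needs:
// bmDestination.RotateFlip(RotateFlipType.RotateNoneFlipY);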

One note about the centre of the source image. I found that if the centre is off, the horizon of the resulting image won’t be flat. I’m just showing you how to get the landscape image from a “perfect” source fisheye image. I’ll leave the details of fixing missing corners and undulating horizons to you.

Oh, while figuring out the algorithm and writing the code, I needed a test image. So I made this monstrosity:

Fisheye test image

with this result:

Fisheye test landscape image
[click for larger image]

It ain’t pretty, but I needed to test if the quadrants map correctly…

P.S. A reader submitted this in a recent survey. I hope I’m interpreting the question and answering it correctly.