Regular polygon equation (solved)

So I’ve finally solved this. You can read about the background and context for the question on:

This is the Wolfram Alpha friendly command:
polarplot [ cos(Pi/7)/cos( | (t mod (2Pi/7)) – (2Pi/(2*7)) | ) , {t,0,2Pi}]

That will generate a regular polygon with 7 sides, with a circumradius of 1 unit. Substitute all the 7’s with the number of sides you want and voila! And the general equation is thus:

cos(Pi/N) / cos( ABS( (t mod (2Pi/N)) – (2Pi/(2*N)) ) )

where t is in [0,2Pi], and N is the number of sides
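If you'd rather evaluate the general equation in code than in Wolfram Alpha, here's a minimal Python sketch. The function name polygon_radius is made up for illustration, and it assumes t is given in radians.

```python
import math

def polygon_radius(t, n):
    """Polar radius at angle t (radians) of a regular n-gon with circumradius 1."""
    wedge = 2 * math.pi / n                 # angle covered by one side, 2Pi/N
    s = abs((t % wedge) - wedge / 2)        # ABS( (t mod (2Pi/N)) - (2Pi/(2*N)) )
    return math.cos(math.pi / n) / math.cos(s)
```

At t = 0 you're at a corner, so the radius is 1 (up to floating point error). At t = Pi/N you're at the middle of a side, so the radius is cos(Pi/N).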

So how did I get the equation?

For a polygon with a circumradius of 1 unit, the formula for the apothem is cos(Pi/N). The apothem is the shortest distance from the centre to a side. With that, I bring your attention to this illustration. (This is a regular polygon with 4 sides. “It’s a square!”. Yes, I know.)

[Figure: Apothem and regular polygon equation explanation]

The length “A” is the apothem, and t is the angle running in the equation we stated above. “apm” is the angle the apothem (for this particular segment of t) makes with the positive X axis.

And L is what we want to find.

We will define the first segment as the segment immediately after the segment whose apothem lies on the positive X axis. (so the apothem illustrated above is for the first segment) That will uniquely identify our segments.

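Now convince yourself that the apm angles of neighbouring segments differ by 2PI/N radians, so apm steps up in multiples of 2PI/N radians as you move from one segment to the next.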

To find L, we need to find the angle s (I’m running out of colours…). And angle s = t – apm.

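So for the first segment, s = t – 2PI/2N (its apothem sits halfway through the 2PI/N radians that the segment spans).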

“But that’s not exactly right!” you say. And you’re right. Because that didn’t take care of the multiples of 2PI/N radians thing.

To get the working t angle we’re using, it should be

“working t angle” = t modulus 2PI/N

Convince yourself that’s true. Substitute N with 4 or 5 or 100.

“But that’s not exactly right!” you say. And you’re right.

Because s = “working t angle” – 2PI/2N
can be negative (suppose the red line L is on the right side of A). That’s why we have

s = ABS( (t modulus 2PI/N) – 2PI/2N )

Why do we need to find s again? Because we want to find L. And L can be found with this equation:

A/L = cos(s)

Revise your trigonometry rules. The cosine of an acute angle is equal to the adjacent side divided by the hypotenuse.

So L = A/cos(s)
= cos(PI/N) / cos(ABS( (t modulus 2PI/N) – 2PI/2N ))

So why do we need to find L again? In polar coordinates, you only need the angle and the radius (or length from origin) to uniquely determine a point. Since we have the angle, we just need the radius (or length).

That’s why the polar plotting from Wolfram Alpha works.

You can probably convert that from a polar coordinate point equation representation to a Cartesian point equation representation, but I’m done for now.
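For what it's worth, going from polar to Cartesian point by point is just x = L*cos(t) and y = L*sin(t). Here's a minimal Python sketch that samples points that way. The function name polygon_points and the samples parameter are made up for illustration. It's a sampled approximation, not a closed-form Cartesian equation.

```python
import math

def polygon_points(n, samples=360):
    """Sample (x, y) points along a regular n-gon with circumradius 1."""
    wedge = 2 * math.pi / n
    points = []
    for k in range(samples):
        t = 2 * math.pi * k / samples
        # L = cos(PI/N) / cos(ABS( (t modulus 2PI/N) - 2PI/2N ))
        length = math.cos(math.pi / n) / math.cos(abs((t % wedge) - wedge / 2))
        # Polar (t, L) to Cartesian (x, y)
        points.append((length * math.cos(t), length * math.sin(t)))
    return points
```

With N = 4 the corners land on the axes, so every sampled point should satisfy |x| + |y| = 1 (give or take floating point error). That makes for a quick sanity check.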

Regular polygon equation

A while ago, a blog reader named BJ (he seems to prefer being called BJ. He? *checks email…* Yeah, he) emailed me with his answer to this question: Is there an equation to describe regular polygons?

I’m not clever enough to do much editing and explanation, so I’ll post his email (got his permission and clarified some points) here.

*Start email quote*

An example: A polygon equation can be approximated by a single continuous implicit equation. Suppose x*y=0.1 The concept is to think of this hyperbola as almost being the product of two lines intersecting at x=0 and y=0.

Construct a (3,4,5) right triangle like this: Begin with the multiplication (x – 1)(y + 1) = 0.1. This product is the first approximated vertex. Multiply a second time using (3y – 4x – 5). This creates two more curved vertices. The multiplication expression now has three factors on the LHS. The RHS remains 0.1. The final equation will form a figure with the approximated triangle being a central “island”. Below is an equation for a (3,4,5) right triangle that morphs from intersecting lines to the triangle and then on to the circle. I sandwiched the approximated triangle between two circles, and it becomes one circle in the limit as 0.1 -> 0.

((x + 0.5)^2 + (y – 1)^2 – 6.125) – ((x + 0.5)^2 + (y – 1)^2 – 6.125)(3y – 4x – 5)(x – 1)(y + 1) = 0.1

Notice that I have multiplied by a circle in the last term. I sandwiched the approximated triangle between two circles, and it becomes one circle in the limit as .1 -> 0. The last term is subtracted from the circle in the first term. The multiplication expression goes to zero, and only the circle remains. This method is somewhat analogous to the method used in Euclidean geometry.

*End email quote*

He also sent a follow-up email:

*Start email quote*

Here is a quadratic form for the triangle [(0,0),(4,0),(4,3)].
(x – 4) y (y – 0.75x) = 0.01

Solve for y=f(x) on the range 0 <= x <= 4

y = (15x^2 - 60x + sqrt(225x^4 - 1800x^3 + 3600x^2 + 16x - 64)) / (40x - 160)
y = (15x^2 - 60x - sqrt(225x^4 - 1800x^3 + 3600x^2 + 16x - 64)) / (40x - 160)

Plot using two colors, one for each solution.

*End email quote*

You can add your thoughts on this in the comments.
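If you want to try BJ's two solutions numerically before plotting them, here's a minimal Python sketch. The function name triangle_branches is made up for illustration. It returns None where the expression under the square root is negative (for example at x = 0) or where the denominator is zero (at x = 4), since there's no real y there.

```python
import math

def triangle_branches(x):
    """Both y solutions of (x - 4) * y * (y - 0.75*x) = 0.01, or None if not real."""
    disc = 225 * x**4 - 1800 * x**3 + 3600 * x**2 + 16 * x - 64
    denom = 40 * x - 160
    if disc < 0 or denom == 0:
        return None
    root = math.sqrt(disc)
    numerator = 15 * x**2 - 60 * x
    return (numerator + root) / denom, (numerator - root) / denom
```

A quick way to check a returned pair is to substitute each y back into (x - 4) * y * (y - 0.75*x) and confirm you get 0.01.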

Story time

I don’t have much else to add to that, so I’ll just tell you a story instead.

Back when I was in university, there was a programming problem to calculate the value of PI. There were 2 methods involved.

The first method used Monte Carlo simulation. Basically you have a circle with radius 1 unit. So the diameter is 2 units. And you have a square that just contains this circle, so the square is 2 units wide by 2 units high.

The area of the square is 2*2 = 4 square units. The area of the circle is given by PI*r*r, which is PI square units (because r is 1 unit). And the ratio of the area of the circle to the area of the square is PI/4.

Using Monte Carlo simulation, I randomly selected a point within the square. I made a note of whether the point was within the circle or without. And

[Number of points within circle]/[Total number of points] = PI/4

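Since PI is the only “unknown”, there you have it. Solve for PI. With a finite number of random points the ratio is only approximately PI/4, and the more points you use, the more accurate your value of PI tends to be.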
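Here's a minimal Python sketch of the Monte Carlo method described above. The function name estimate_pi and the seed parameter are made up for illustration. The seed is only there so that a run can be repeated.

```python
import random

def estimate_pi(num_points, seed=None):
    """Estimate PI from the fraction of random points in the square that land in the circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_points):
        # Random point in the 2-by-2 square centred on the origin
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # within the circle of radius 1
            inside += 1
    # inside / num_points is roughly PI/4, so multiply by 4
    return 4.0 * inside / num_points
```

The estimate wobbles from run to run. Don't expect more than 2 or 3 correct digits even with a few hundred thousand points.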

The other method involves calculating the circumference of a circle. Suppose you have a square with a width of square root 2. This is chosen such that the diagonal length of the square is 2 units. This means the distance from the centre of the square to the corner point of the square (any of the 4 of them) is 1 unit.

Are you getting the idea yet?

The perimeter of this square is 4 * (square root of 2). Then we double the number of sides so we get an 8-sided polygon, an octagon. But still keeping the “from centre to outer-most point is 1 unit length” condition. Using some more maths, we get the length of one side of this octagon and multiply it by 8. That will be the perimeter.

As we keep doubling the number of sides, this polygon eventually approximates a circle. And so the perimeter of said (regular) polygon approaches the circumference of a circle, which is 2*PI*r, or just 2*PI (because r is 1 unit). Solve for PI.


To end this, I will leave it to you as an exercise to calculate the side length of that regular polygon in the 2nd method. Start by understanding why one side of the square is square root of 2. Then continue with calculating the side length for an octagon.
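Spoiler alert. If you want to check your answer to the exercise, here's a minimal Python sketch of the 2nd method. It assumes one particular side-doubling formula, new side = sqrt(2 - sqrt(4 - side^2)), which follows from the half-angle identity when the distance from the centre to a corner is 1 unit. It's one possible route, not necessarily the one used in the original assignment. The function name pi_by_doubling is made up for illustration.

```python
import math

def pi_by_doubling(doublings):
    """Estimate PI from the perimeter of a regular polygon with circumradius 1."""
    sides = 4
    side = math.sqrt(2.0)            # side length of the starting square
    for _ in range(doublings):
        # Side length after doubling the number of sides (assumed formula, see above)
        side = math.sqrt(2.0 - math.sqrt(4.0 - side * side))
        sides *= 2
    perimeter = sides * side
    return perimeter / 2.0           # perimeter approaches 2*PI*r with r = 1
```

With 0 doublings you get 2 * (square root of 2), which is half the perimeter of the square. Around 10 doublings already lands within 0.00001 of PI.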

Side note: The 2nd method terminates faster as an iterative process than the Monte Carlo simulation. However, it is also less “stable”. Hahaha… for extra credit, explain why it is less “stable”.

Extra side note: If people knew the formula for the area of a circle is PI*r*r, wouldn’t they have known the value of PI already? This was why I found the Monte Carlo simulation method a little on the self-fulfilling side. You’re solving for something that you sort of know the value of. That’s weird, almost like cheating.

Smooth Bezier splines

Apparently, having mathematically defined curves that pass through a set of desired points is a thing. And (cubic) Bezier splines are popular for this. Professor Dagan (mentioned previously) sent me a link.

Smooth Bézier Spline Through Prescribed Points

The article outlines a method that, given a set of points you want your Bezier curve to pass through, calculates the required control points of the Bezier curve. This is similar to what I wrote here.

The difference is that my method requires the inverse of the coefficient matrix to exist, which it does. The method in that article requires the first and second derivatives of the Bezier curve to be continuous.

Cubic polynomials and cubic Beziers

So it turns out that for cubic Bezier curves, t values of 0, 1/3, 2/3 and 1 have special meanings. A general cubic polynomial is of the form

y = a0 + a1 * x + a2 * x^2 + a3 * x^3

where ai’s are real constants.

If the variable x is limited to the interval x0 <= x <= x0 + χ (that's the Greek letter Chi), where χ > 0, then it’s equivalent to a special case of cubic Bezier curves. Namely, when the t values are 0, 1/3, 2/3 and 1.

In fact, there’s a mathematical proof of it. Thanks to Professor Samuel Dagan of Tel-Aviv University for writing in and letting me know of his work. Here’s more of his work.

Does the point lie on the Bezier curve?

Someone recently asked me how to tell if a point lies on a Bezier curve.

For the purposes of discussion, it’s a quadratic Bezier curve and all 3 control points are known (or the start and end points and the 1 control point if you prefer). You can read more about the reverse process of finding the control points here, which is the reference point of that person’s question.

The answer is actually straightforward. Substitute everything into the Bezier curve equation and solve for t. Here’s the quadratic Bezier equation:
B(t) = (1-t)^2 * p0 + 2(1-t)t * p1 + t^2 * p2

Let’s say p0 is [1,1] and p1 is [1.5,4.5] and p2 is [2,3]. We’ll keep the points in 2 dimensions to keep the maths working less cumbersome. And let’s say the point you want to check is [1.8,3.4]. We substitute all the points into the equation, and we get this:

[1.8,3.4] = (1-t)^2 * [1,1] + 2(1-t)t * [1.5,4.5] + t^2 * [2,3]

I know, it doesn’t look pretty. But hey, we’re doing this by hand. If you’re writing code to generalise the solution, the code will probably look just a little uglier, but the solution will come out faster. Like probably instantly given the current modern processors.

Because we’re dealing with 2 dimensional points, that equation splits into 2 separate equations (with scalars instead of vectors as coefficients), like so:
1.8 = (1-t)^2 * 1 + 2(1-t)t * 1.5 + t^2 * 2
3.4 = (1-t)^2 * 1 + 2(1-t)t * 4.5 + t^2 * 3

If you have 3 dimensional points, you’d have 3 equations. Note that even then, the degree of your equations remains as 2. The degree of the Bezier curve is independent of the number of dimensions you’re working with.

If you simplify
1.8 = (1-t)^2 * 1 + 2(1-t)t * 1.5 + t^2 * 2

You get t = 0.8. It so happens that in this case the t^2 terms cancel out, leaving the linear equation 1.8 = 1 + t, so there’s only one solution.

If you simplify
3.4 = (1-t)^2 * 1 + 2(1-t)t * 4.5 + t^2 * 3

You get
5*t^2 – 7*t + 2.4 = 0

and after solving for that, you get t = 0.6 or t = 0.8 (you’re a smart person, you know how to solve a quadratic equation, right?)

Now, the solution t=0.8 appears in the solution sets of both equations. Therefore, the point [1.8,3.4] lies on the Bezier curve. In fact, t=0.8 is the t value.
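Here's a minimal Python sketch of the per-coordinate solving done by hand above. It expands the quadratic Bezier equation for one coordinate into a*t^2 + b*t + c = 0 and returns the real solutions. The function name bezier_t_values and the eps threshold are made up for illustration. It doesn't deal with the degenerate case where a coordinate is the same for all 3 control points, so it just raises an error there.

```python
import math

def bezier_t_values(p0, p1, p2, q, eps=1e-12):
    """Real t values solving (1-t)^2*p0 + 2(1-t)t*p1 + t^2*p2 = q for ONE coordinate."""
    a = p0 - 2 * p1 + p2
    b = 2 * (p1 - p0)
    c = p0 - q
    if abs(a) < eps:
        if abs(b) < eps:
            raise ValueError("coordinate is constant along the curve")
        return [-c / b]                     # the t^2 terms cancelled: linear equation
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                           # no real solutions
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

x_solutions = bezier_t_values(1, 1.5, 2, 1.8)   # first coordinates of p0, p1, p2 and the point
y_solutions = bezier_t_values(1, 4.5, 3, 3.4)   # second coordinates
```

For the example points, x_solutions holds one value (0.8) and y_solutions holds two (0.6 and 0.8), give or take floating point error.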

Multiple solutions

What if you get multiple t values appearing in multiple solution sets of equations?

Consider the case where p0 is [1,1], p1 is [2,3], and p2 is [1,1]. Notice that the start and end points are the same point. Let’s say you want to know if the point [1,1] lies on the curve (yes I know it’s the same point). Substituting all the points, we get:

[1,1] = (1-t)^2 * [1,1] + 2(1-t)t * [2,3] + t^2 * [1,1]

This gives us the 2 equations:
1 = 1 – 2*t + t^2 + 4*t – 4*t^2 + t^2
1 = 1 – 2*t + t^2 + 6*t – 6*t^2 + t^2

They simplify to:
2*t^2 – 2*t = 0
4*t^2 – 4*t = 0

Hey presto! The solution set is t=0 or t=1 for both equations. Therefore, the point [1,1] lies on the curve. In fact, it lies on the curve where t=0 or t=1. And t=0 and t=1 happen to coincide with the start and end points respectively.

The whole point (haha!) is that, as long as you have at least one value of t that appears in the solution sets of all the equations, then said point you’re checking lies on the curve.

Higher degree Bezier curves

This is a toughie. If you have a cubic Bezier curve, then you’re solving a degree 3 polynomial (of t). If you have a Bezier curve of degree N, then you’re solving a degree N polynomial.

There are algorithms to solve generic degree polynomials, but they are out of scope here. Assuming the highest degree of Bezier curves you’ll ever work with is 3 (cubic), then this Wikipedia article on cubic functions will help. Remember, cubic Bezier curve equations are still cubic equations.

Higher dimensionality

The number of dimensions you’re working with determines the number of equations you need to solve. If you’re working with 5 dimensional points, then you need to solve 5 equations.

For example, if you’re working with cubic Bezier curves and using 5 dimensional points, then you need to solve 5 cubic equations. You will have up to 3 (unique) t values for each equation. Let’s say your solution sets are as follows:
t = -1, 3, 5
t = 0, 1, 3
t = -2, 2, 3
t = 3, 3, 4 (yay repeated values!)
t = 3, 6, 8

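The value t=3 appears in all 5 sets of solutions, therefore your point lies on the curve. (If your curve only runs from the start point to the end point, which is the usual case, also check that the common t value is between 0 and 1. The values here are purely for illustration.)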
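Finding the common value is a set intersection. Here's a minimal Python sketch using the 5 solution sets above. The variable names are made up for illustration. Exact matching like this only works for exact values. Computed roots are rarely that tidy.

```python
# One list of t values per equation (per dimension)
solution_sets = [
    [-1, 3, 5],
    [0, 1, 3],
    [-2, 2, 3],
    [3, 3, 4],
    [3, 6, 8],
]

# Values of t that appear in every solution set
common_t = set(solution_sets[0]).intersection(*solution_sets[1:])
```

Here common_t ends up containing just 3.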

Keep it real

In the process of solving your equations, there’s a possibility that you might get complex solutions. You know, those involving the square root of -1. Dismiss them.

Your Bezier curve is in the real world. The point you’re checking must therefore also lie in the real world.

Unless you’re working with some abstract imaginary Bezier curves on an advanced maths paper. Then good luck to you! The logic above for solving still applies.

Actual applications

When applying the above, you don’t usually get nice numbers like [1.8,3.4] lying on the curve with t=0.8. You get numbers with lots of digits after the decimal point that seem to continue forever. You don’t get exact values.

What if you get a t=0.798 for one equation, and t=0.802 for another equation?

Use your common sense. Set an error margin for what is acceptable.

My suggestion is to NOT use the values of t to check for the margin. Substitute the values of t into the equation, and then check the points if they’re within the error margin.

This means you don’t check the difference between t=0.798 and t=0.802, which is 0.004. Is 0.004 within your error margin? Maybe. But you’re not checking for this.

You substitute t=0.798 and t=0.802 into the equation, and you get 2 points: [1.798,3.40198] and [1.802,3.39798]

Then you say, “Are these points close enough that I consider them to be the same point?” Use whatever measure you think is appropriate. I think the Euclidean distance works fine. Then check if that distance is within your error margin.

If you’re checking for [1.8,3.4], then ask yourself, “Is [1.8,3.4] close enough to [1.798,3.40198]? And is [1.8,3.4] close enough to [1.802,3.39798]?”

Obviously, doing this by hand sucks big time. Good thing you’re a programmer.
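So here's a minimal Python sketch of that check for a quadratic Bezier curve, under a few stated assumptions. It solves each coordinate's equation for t, substitutes every candidate t back into the curve, and compares the resulting point with the point being checked using the Euclidean distance. The names lies_on_curve, bezier_point and margin are made up for illustration. It also assumes the curve only runs from t=0 to t=1, and it needs Python 3.8 or later for math.dist.

```python
import math

def _t_candidates(p0, p1, p2, q, eps=1e-12):
    """Real solutions t of the quadratic Bezier equation for one coordinate."""
    a, b, c = p0 - 2 * p1 + p2, 2 * (p1 - p0), p0 - q
    if abs(a) < eps:                         # the t^2 terms cancel
        return [] if abs(b) < eps else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                            # complex solutions are dismissed
    root = math.sqrt(disc)
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]

def bezier_point(t, p0, p1, p2):
    """B(t) = (1-t)^2 * p0 + 2(1-t)t * p1 + t^2 * p2, for points of any dimension."""
    return tuple((1 - t) ** 2 * a + 2 * (1 - t) * t * b + t ** 2 * c
                 for a, b, c in zip(p0, p1, p2))

def lies_on_curve(q, p0, p1, p2, margin=0.01):
    """True if some candidate t puts the curve within margin of point q."""
    for coords in zip(p0, p1, p2, q):        # one equation per dimension
        for t in _t_candidates(*coords):
            if not (-1e-9 <= t <= 1 + 1e-9): # assumption: curve is limited to 0 <= t <= 1
                continue
            if math.dist(bezier_point(t, p0, p1, p2), q) <= margin:
                return True
    return False
```

For example, lies_on_curve((1.8, 3.4), (1, 1), (1.5, 4.5), (2, 3)) is True, while checking (1.8, 3.0) against the same curve gives False. Pick the margin to suit your application.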

Optimal width and height after image rotation

A while ago, a blog reader Fabien sent me some code (you can read it here. Thanks Fabien!). The PHP code is a modification of my image rotation code with some upgrades.

I was looking through his code (French variable names!) and was puzzled by the initial section. I believe he based his code on my code where the resulting image wasn’t clipped after rotation, meaning the whole image was still in the picture/bitmap (though rotated).

In that piece of code, I just used the diagonal length of the image (from top-left corner to bottom-right corner) as the final length and breadth of the resulting image. This gave the simplest resulting image dimension without doing complicated maths calculations (a square in this case).

However, what if you want to know the optimal width and height of the resulting image after rotation? Meaning the best-fit width and height that just manages to contain the resulting rotated image. For that, I need to tell you some basic trigonometry and geometry.

[Figure: Image rotation, optimal width and height]

Suppose you have a rectangle with L as the length and H as the height. It is rotated by an angle of t degrees. I’m not going to explain the maths behind it. It involves complementary angles, supplementary angles, rotation symmetry and trigonometry with sines and cosines. Convince yourself that the diagram is true.

So after rotating by t degrees, the optimal width is L * cos(t) + H * cos(90 – t)

The optimal height is L * sin(t) + H * sin(90 – t)

Short digression: You might notice that any lengths that lie parallel to the x-axis usually involve cosines, and lengths that lie parallel to the y-axis usually involve sines. It’s just the way trigonometry works.

Now, although the image rotation is carried out with respect to the image’s centre, rotating by the top-left corner will result in the same optimal width and height. Again, this is basic maths so you’ll just have to convince yourself it’s true (and that I don’t really want to explain it…).

But that’s if t is an acute angle. What about other angles?

[Figure: Image rotation, optimal width and height]

For those angles, we just need to calculate the acute angle based on the initial rotation angle. After that, just substitute that calculated acute angle into our formula above. I have absolute confidence in your ability to check which quadrant of the Cartesian coordinate system your rotation angle lies in.

UPDATE: In case you are unable to view images, if your rotation angle is in the 2nd quadrant, the calculated angle is (180 – t). If in the 3rd quadrant, it’s (t – 180). And if in the 4th quadrant, it’s (360 – t).
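Here's a minimal Python sketch of the whole calculation, quadrant handling included. The function name rotated_size is made up for illustration, and it assumes the rotation angle is given in degrees.

```python
import math

def rotated_size(length, height, angle_deg):
    """Best-fit (width, height) of a length-by-height rectangle rotated by angle_deg."""
    t = angle_deg % 360.0
    # Reduce the rotation angle to the equivalent acute angle, quadrant by quadrant
    if t <= 90.0:
        acute = t
    elif t <= 180.0:
        acute = 180.0 - t
    elif t <= 270.0:
        acute = t - 180.0
    else:
        acute = 360.0 - t
    a = math.radians(acute)
    # cos(90 - t) is sin(t), and sin(90 - t) is cos(t)
    best_width = length * math.cos(a) + height * math.sin(a)
    best_height = length * math.sin(a) + height * math.cos(a)
    return best_width, best_height
```

Rotating by 0 degrees gives back the original L and H, and rotating by 90 degrees swaps them (give or take floating point error). You'd typically round the results up before creating the bitmap.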

In practice, you might still want to pad a couple of pixels around. But that should give you the smallest image dimension which can still contain your rotated image.

Bezier curves prefer tea

My maths professor was hammering on the fact that Citroen used Bezier curves to make sure their cars have aesthetically pleasing curves. Again. (This is not a sponsored post from the automaker).

While I appreciate his effort in trying to make what I’m learning relevant to the real world, I kinda got the idea that Citroen used Bezier curves in their design process. Right about the 3rd tutorial lesson.

My professor then went on to give us homework. “Us” meaning 5 of us. It was an honours degree course. It wasn’t like there was a stampede to take post-graduate advanced maths lessons, you know.

Oh yes, homework. My professor, with wisdom acquired over years of teaching, gave a blend of theoretical and calculation-based questions. Any question that had the words “prove”, “justify” or “show” was probably a theoretical question. Calculation-based questions are like “What is 1 + 1?”. Everyone, at least theoretically (haha!), should be able to do the calculation-based questions. The theoretical questions would require more thinking (“Prove that such and such Bezier curve is actually equal to such and such.”).

My friend, who took the course with me, loved calculation-based questions. She’d sit patiently and hammer at the numbers and the calculator. I can’t say I love them. My professor once gave a question that amounted to solving a system of 5 linear equations with 5 unknowns, which amounted to solving a 5 by 5 matrix. By hand. (It involves 15 divisions, 50 multiplications and 50 subtractions. There’s a reason why linear algebra and numerical methods were pre-requisites) I wanted to scream in frustration, throw my foolscap paper at him, and strangle him. Not necessarily in that order.

This coming from someone who is fine with writing a C program doing memory allocations (using the malloc function. And then manually freeing the pointer with the memory allocation. We didn’t have garbage collection, ok?) to simulate an N-sized matrix, and then perform Gauss-Jordan elimination on the matrix. I used that program to solve a 100 by 100 matrix. But I dreaded solving a 5 by 5 matrix by hand.

It probably explains why I remember Bezier curves so much.

Anyway, a while ago, someone sent me a question (through Facebook, of all channels). He asked, for a given “y” value of a Bezier curve, how do you find the “x” value?

That is a question without a simple answer. The answer is, there’s no guarantee there’s only one “x” value. A cubic Bezier curve can have 1, 2 or 3 “x” values (given a “y”). Here’s the “worst” case scenario:

[Figure: Multi x values]

So you can have at most 3 “x” values. In the case of the person who asked the question, this is not just wrong, but actually dangerous. The person was an engineer, working on software that cuts metal (or wood). The software had a Bezier curve in it, which it used to calculate (x,y) coordinate values to direct the laser beam (or whatever cutting tool) to the next point (and thus cut the material).

If a “y” value has multiple “x” values, the software won’t know which “x” value you want. And thus cut the material wrongly.

The only way a Bezier curve has only 1 “x” value for every “y” value is if “y” is strictly monotonically increasing/decreasing along the curve. That means for all values a and b of t such that a < b, y(a) < y(b) [or y(a) > y(b) for the decreasing case].

Bezier curves don’t work well in the Cartesian plane. They work fine after you’ve used them to calculate values, and then transfer onto the Cartesian plane. Bezier curves prefer to work with values of t.
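To see the multiple “x” values for yourself, here's a minimal Python sketch that works through t. It samples the “y” part of a cubic Bezier curve over 0 <= t <= 1, looks for sign changes of y(t) minus the wanted “y”, narrows each one down by bisection, and then calculates “x” at those t values. The function names and the samples and iterations parameters are made up for illustration. This is a numerical approximation, not an exact cubic solver, and it can miss solutions where the curve only touches the wanted “y” without crossing it.

```python
def cubic_bezier(t, c0, c1, c2, c3):
    """One coordinate of a cubic Bezier curve at parameter t."""
    u = 1 - t
    return u**3 * c0 + 3 * u * u * t * c1 + 3 * u * t * t * c2 + t**3 * c3

def xs_for_y(y, xs, ys, samples=1000, iterations=60):
    """x values where the curve hits the given y. xs and ys hold the 4 control coordinates."""
    def f(t):
        return cubic_bezier(t, *ys) - y

    t_values = []
    prev_t, prev_f = 0.0, f(0.0)
    for i in range(1, samples + 1):
        t = i / samples
        ft = f(t)
        if prev_f == 0.0:
            t_values.append(prev_t)          # landed exactly on a solution
        elif prev_f * ft < 0:                # sign change: a solution lies in between
            lo, hi, f_lo = prev_t, t, prev_f
            for _ in range(iterations):      # bisection
                mid = (lo + hi) / 2
                f_mid = f(mid)
                if f_lo * f_mid <= 0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            t_values.append((lo + hi) / 2)
        prev_t, prev_f = t, ft
    if prev_f == 0.0:
        t_values.append(prev_t)
    return [cubic_bezier(t, *xs) for t in t_values]
```

For example, with control points (0,0), (1,3), (2,-2) and (3,1), calling xs_for_y(0.5, (0, 1, 2, 3), (0, 3, -2, 1)) returns 3 different “x” values. Which one the cutting tool should use is something only the application can decide.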

Negative sales targets and percentage commissions

A while ago, I received an email from a distraught salesman. He believed his sales commissions were wrongly calculated, and asked me to shed some light.

Note that I’m not using the exact numbers he gave in his email.

The story goes that Michael (as I’ll call him) and his colleagues were given sales targets that were negative. How could sales targets be negative? Shouldn’t you be trying to sell something? The reason given was that the current economy was disastrous, and basically each sales person was trying to not lose sales.

You’re gonna bleed. It’s how much you bled.

Anyway, given Michael’s negative sales target, he managed to exceed it. He didn’t manage to bring in sales (positive sales numbers), but he didn’t lose too much money (slight negative sales numbers). But his sales commissions didn’t reflect that.

Now I’m not going to discuss how that works out. I can’t presume to understand the business logic behind the sales commission in this case, but I’ll discuss the mathematics behind the numbers.

The normal sales targets and commission

Let’s say your sales target for this month is $1000. This means you’re expected to sell about $1000 worth of products or services. We’ll ignore the condition that you will get some commission based on what you sell, regardless of how much you sold (my brother’s a sales person), as well as other types of commissions.

Let’s say the sales commission is based on how much extra you sold beyond your sales target. Makes sense, right? Let’s use simple percentages.

If you sold $1100 worth of products or services, then your percentage commission might be calculated as follows:
(Difference between Your Sales and Your Sales Target) / (Your Sales Target)

Or ($1100 – $1000) / ($1000) = 10% commission.

This is assuming that your sales amount exceeded the sales target, of course.

The case of negative sales targets

Now if the sales target is negative, as in Michael’s case, the mathematical formula still applies. But you have to note the negative sign. For some reason, “business” people (no offense to business people) tend to see -4567 as larger than 12, even though 12 > -4567. They see the magnitude first, not the value itself. (It’s also why I get emails about calculations involving negative numbers… anyway…)

Let’s say the sales target is -$1000. Everyone’s expected to lose money, but you try not to lose more than $1000. At least that’s what I’m interpreting it as.

Let’s say Michael managed to lose only $50. Or -$50 to be clear. The formula
(Difference between Your Sales and Your Sales Target) / (Your Sales Target)

has to be modified to this
(Difference between Your Sales and Your Sales Target) / (Magnitude of Your Sales Target)

In maths and programming terms, the “magnitude” part refers to the absolute function. Meaning you ignore any negative signs. Actually, the modified version works for the normal case too (which is why you should use it for the normal version anyway to take care of weird cases like this but I digress…).

So, we get (-$50 – [-$1000]) / abs(-$1000) = $950 / $1000
= 95%
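In code, the modified formula might look like this minimal Python sketch. The function name commission_rate is made up for illustration, and it returns a fraction (0.95 means 95%). A sales target of zero has no sensible percentage, so the sketch just refuses it.

```python
def commission_rate(sales, target):
    """(Difference between sales and target) / (magnitude of target), as a fraction."""
    if target == 0:
        raise ValueError("sales target must not be zero")
    return (sales - target) / abs(target)
```

commission_rate(1100, 1000) gives 0.1 for the normal case, and commission_rate(-50, -1000) gives 0.95 for Michael's case, give or take floating point error.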

Actually, you should use this:
abs( [Your Sales] – [Your Sales Target] ) / abs(Your Sales Target)

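That’s the “foolproof” version, in the sense that the result is never negative. Do note that the outer absolute function throws away whether you beat or missed the target, so check that separately. Consider it a bonus for reading this far. Frankly speaking, any competent programmer should be able to come up with that formula, even without much maths background. You just need to think about the situation a little carefully (ask “what if?” more often).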

Michael’s calculated commission

When Michael wrote to me, he said his commission was calculated as follows (given that he only lost $50):
-$50 / -$1000 = 5%

Let’s say someone else lost $900 that month (that’s -$900 in sales). With the above calculation, that person gets:
-$900 / -$1000 = 90%

Clearly it makes more sense to lose more money! This was why Michael wrote to me.
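To see the two calculations side by side, here's a minimal Python sketch. The function names naive_rate and target_relative_rate are made up for illustration. The first is the sales-divided-by-target calculation Michael described, and the second uses the difference from the target over the magnitude of the target.

```python
def naive_rate(sales, target):
    """Sales divided by sales target, as in Michael's email."""
    return sales / target

def target_relative_rate(sales, target):
    """Difference between sales and target, over the magnitude of the target."""
    return (sales - target) / abs(target)

target = -1000
for sales in (-50, -900):
    print(sales, naive_rate(sales, target), target_relative_rate(sales, target))
```

With the naive calculation, losing $900 scores higher than losing $50. With the target-relative one, the ranking flips to what you'd expect.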

I don’t claim the method I gave is correct, business-logic-wise. Michael didn’t give me any details on what he’s selling, or what his company is (or even why it’s acceptable to have negative sales targets, regardless of the economy). So I cannot give any help other than from a pure mathematical point of view. But I hope it’ll at least give Michael a fairer commission amount.


Given Michael’s situation, what do you think is an appropriate calculation formula?

Can you think of (or know of) a realistic situation where a negative sales target is acceptable? I say “acceptable”, but seriously, no company should “accept” that they lose money every month.