Regular polygon equation (solved)

So I’ve finally solved this. You can read about the background and context for the question on:

This is the Wolfram Alpha friendly command:
polarplot [ cos(Pi/7)/cos( | (t mod (2Pi/7)) – (2Pi/(2*7)) | ) , {t,0,2Pi}]

That will generate a regular polygon with 7 sides, with a circumradius of 1 unit. Substitute all the 7’s with the number of sides you want and voila! And the general equation is thus:

cos(Pi/N) / cos( ABS( (t mod (2Pi/N)) – (2Pi/(2*N)) ) )

where t is in [0,2Pi], and N is the number of sides

So how did I get the equation?

The formula for the apothem is cos(Pi/N). The apothem is the shortest distance from the centre to the side. With that, I bring your attention to this illustration. (this is a regular polygon with 4 sides. “It’s a square!”. Yes, I know)

Apothem and regular polygon equation explanation

The length “A” is the apothem, and t is the angle running in the equation we stated above. “apm” is the angle the apothem (for this particular segment of t) makes with the positive X axis.

And L is what we want to find.

We will define the first segment as the segment immediately after the segment whose apothem lies on the positive X axis. (so the apothem illustrated above is for the first segment) That will uniquely identify our segments.

Now convince yourself that the angle apm is in multiples of 2PI/N radians.

To find L, we need to find the angle s (I’m running out of colours…). And angle s = t – apm.

So s = t – 2PI/2N

“But that’s not exactly right!” you say. And you’re right. Because that didn’t take care of the multiples of 2PI/N radians thing.

To get the working t angle we’re using, it should be

“working t angle” = t modulus 2PI/N

Convince yourself that’s true. Substitute N with 4 or 5 or 100.

“But that’s not exactly right!” you say. And you’re right.

Because s = “working t angle” – 2PI/2N
can be negative (suppose the red line L is on the right side of A). That’s why we have

s = ABS( (t modulus 2PI/N) – 2PI/2N )

Why do we need to find s again? Because we want to find L. And L can be found with this equation:

A/L = cos(s)

Revise your trigonometry rules. Cosine of the acute angle is equal to adjacent side divide by the hypotenuse.

So L = A/cos(s)
= cos(PI/N) / cos(ABS( (t modulus 2PI/N) – 2PI/2N ))

So why do we need to find L again? In polar coordinates, you only need the angle and the radius (or length from origin) to uniquely determine a point. Since we have the angle, we just need the radius (or length).

That’s why the polar plotting from Wolfram Alpha works.

You can probably convert that from a polar coordinate point equation representation to a Cartesian point equation representation, but I’m done for now.

Regular polygon equation

A while ago, a blog reader named BJ (he seems to prefer being called BJ. He? *checks email…* Yeah, he) emailed me with his answer to this question: Is there an equation to describe regular polygons?

I’m not clever enough to do much editing and explanation, so I’ll post his email (got his permission and clarified some points) here.

*Start email quote*

An example: A polygon equation can be approximated by a single continuous implicit equation. Suppose x*y=0.1 The concept is to think of this hyperbola as almost being the product of two lines intersecting at x=0 and y=0.

Construct a (3,4,5) right triangle like this: Begin with the multiplication (x – 1)(y + 1) = 0.1. This product is the first approximated vertex. Multiply a second time using (3y – 4x – 5). This creates two more curved vertices. The multiplication expression now has three factors on the LHS. The RHS remains 0.1. The final equation will form a figure with the approximated triangle being a central “island”. Below is an equation for a (3,4,5) right triangle that morphs from intersecting lines to the triangle and then on to the circle. I sandwiched the approximated triangle between two circles, and it becomes one circle in the limit as 0.1 -> 0.

((x + 0.5)^2 + (y – 1)^2 – 6.125) – ((x + 0.5)^2 + (y – 1)^2 – 6.125)(3y – 4x – 5)(x – 1)(y + 1) = 0.1

Notice that I have multiplied by a circle in the last term. I sandwiched the approximated triangle between two circles, and it becomes one circle in the limit as .1 -> 0. The last term is subtracted from the circle that in the first term. The multiplication expression goes to zero, and only the circle remains. This method is somewhat analogous to the method used in Euclidean geometry.

*End email quote*

He also sent a follow-up email:

*Start email quote*

Here is a quadratic form for the triangle [(0,0),(4,0),(4,3)].
(x – 4) y (y – 0.75x) = 0.01

Solve for y=f(x) on the range 0 <= x <= 4 y = (15x^2 - 60x + sqrt(225x^4 - 1800x^3 + 3600x^2 + 16x - 64)) / (40x - 160) y = (15x^2 - 60x - sqrt(225x^4 - 1800x^3 + 3600x^2 + 16x - 64)) / (40x - 160) Plot using two colors, one for each solution. *End email quote* You can add your thoughts on this in the comments.

Story time

I don’t have much else to add to that, so I’ll just tell you a story instead.

Back when I was in university, there was a programming problem that’s to calculate the value of PI. There were 2 methods involved.

The first method used Monte Carlo simulation. Basically you have a circle with radius 1 unit. So the diameter is 2 units. And you have a square that just contains this circle, so the square is 2 units wide by 2 units high.

The area of the square is 2*2 = 4 units. The area of the circle is given by PI*r*r which is PI (because r is 1 unit). And the ratio of area of circle to area of square is PI/4.

Using Monte Carlo simulation, I randomly selected a point within the square. I made a note of whether the point was within the circle or without. And

[Number of points within circle]/[Total number of points] = PI/4

Since PI is the only “unknown”, there you have it. Solve for PI. The more points you use, the more accurate PI is.

The other method involves calculating the circumference of a circle. Suppose you have a square with a width of square root 2. This is chosen such that the diagonal length of the square is 2 units. This means the distance from the centre of the square to the corner point of the square (any of the 4 of them) is 1 unit.

Are you getting the idea yet?

The perimeter of this square is 4 * (square root of 2). Then we double the number of sides so we get an 8-sided polygon, an octagon. But still keeping the “from centre to outer-most point is 1 unit length” condition. Using some more maths, we get the length of one side of this octagon and multiply it by 8. That will be the perimeter.

As we keep doubling the number of sides, this polygon eventually approximates a circle. And so the perimeter of said (regular) polygon approaches the circumference of a circle, which is 2*PI*r, or just 2*PI (because r is 1 unit). Solve for PI.

Exercise

To end this, I will leave it to you as an exercise to calculate the length of that regular polygon in the 2nd method. Start by understanding why one side of a square is square root of 2. Then continue with calculating the side length for an octagon.

Side note: The 2nd method terminates faster as an iterative process than the Monte Carlo simulation. However, it is also less “stable”. Hahaha… for extra credit, explain why it is less “stable”.

Extra side note: If people knew the formula for a circle is PI*r*r, wouldn’t they have known the value of PI already? This was why I found the Monte Carlo simulation method a little on the self-fulfilling side. You’re solving for something that you sort of know the value of. That’s weird, almost like cheating.

Smooth Bezier splines

Apparently, having mathematically defined curves that pass through a set of desired points is a thing. And (cubic) Bezier splines are popular for this. Professor Dagan (mentioned previously) sent me a link.

Smooth Bézier Spline Through Prescribed Points

The article outlines a method that given a set of points you want your Bezier curve to pass through, calculate the required control points of the Bezier curve. This is similar to what I wrote here.

The difference is that my method requires the inverse of the coefficient matrix to exist, which it does. The method in that article requires the first and second derivatives of the Bezier curve to be continuous.

Cubic polynomials and cubic Beziers

So it turns out that for cubic Bezier curves, t values of 0, 1/3, 2/3 and 1 have special meanings. A general cubic polynomial is of the form

y = a0 + a1 * x + a2 * x^2 + a3 * x^3

where ai’s are real constants.

If the variable x is limited to the interval x0 <= x <= x0 + χ (that's the Greek letter Chi), where χ > 0, then it’s equivalent to a special case of cubic Bezier curves. Namely, when the t values are 0, 1/3, 2/3 and 1.

In fact, there’s a mathematical proof of it. Thanks to Professor Samuel Dagan of Tel-Aviv University for writing in and letting me know of his work. Here’s more of his work.

Does the point lie on the Bezier curve?

Someone recently asked me how to tell if a point lies on a Bezier curve.

For the purposes of discussion, it’s a quadratic Bezier curve and all 3 control points are known (or the start and end points and the 1 control point if you prefer). You can read more about the reverse process of finding the control points here, which is the reference point of that person’s question.

The answer is actually straightforward. Substitute everything into the Bezier curve equation and solve for t. Here’s the quadratic Bezier equation:
B(t) = (1-t)^2 * p0 + 2(1-t)t * p1 + t^2 * p2

Let’s say p0 is [1,1] and p1 is [1.5,4.5] and p2 is [2,3]. We’ll keep the points in 2 dimensions to keep the maths working less cumbersome. And let’s say the point you want to check is [1.8,3.4]. We substitute all the points into the equation, and we get this:

[1.8,3.4] = (1-t)^2 * [1,1] + 2(1-t)t * [1.5,4.5] + t^2 * [2,3]

I know, it doesn’t look pretty. But hey, we’re doing this by hand. If you’re writing code to generalise the solution, the code will probably look just a little uglier, but the solution will come out faster. Like probably instantly given the current modern processors.

Because we’re dealing with 2 dimensional points, that equation splits into 2 separate equations (with scalars instead of vectors as coefficients), like so:
1.8 = (1-t)^2 * 1 + 2(1-t)t * 1.5 + t^2 * 2
3.4 = (1-t)^2 * 1 + 2(1-t)t * 4.5 + t^2 * 3

If you have 3 dimensional points, you’d have 3 equations. Note that even then, the degree of your equations remains as 2. The degree of the Bezier curve is independent of the number of dimensions you’re working with.

If you simplify
1.8 = (1-t)^2 * 1 + 2(1-t)t * 1.5 + t^2 * 2

You get t = 0.8. It so happens that in this case, there’s only one solution.

If you simplify
3.4 = (1-t)^2 * 1 + 2(1-t)t * 4.5 + t^2 * 3

You get
5*t^2 – 7*t + 2.4 = 0

and after solving for that, you get t = 0.6 or t = 0.8 (you’re a smart person, you know how to solve a quadratic equation, right?)

Now, the solution t=0.8 appears in the solution sets of both equations. Therefore, the point [1.8,3.4] lies on the Bezier curve. In fact, t=0.8 is the t value.

Multiple solutions

What if you get multiple t values appearing in multiple solution sets of equations?

Consider the case where p0 is [1,1], p1 is [2,3], and p2 is [1,1]. Notice that the start and end points are the same point. Let’s say you want to know if the point [1,1] lies on the curve (yes I know it’s the same point). Substituting all the points, we get:

[1,1] = (1-t)^2 * [1,1] + 2(1-t)t * [2,3] + t^2 * [1,1]

This gives us the 2 equations:
1 = 1 – 2*t + t^2 + 4*t – 4*t^2 + t^2
1 = 1 – 2*t + t^2 + 6*t – 6*t^2 + t^2

They simplify to:
2*t^2 – 2*t = 0
4*t^2 – 4*t = 0

Hey presto! The solution set is t=0 or t=1 for both equations. Therefore, the point [1,1] lies on the curve. In fact, it lies on the curve where t=0 or t=1. And t=0 and t=1 happens to coincide with the start and end points respectively.

The whole point (haha!) is that, as long as you have at least one value of t that appears in the solution sets of all the equations, then said point you’re checking lies on the curve.

Higher degree Bezier curves

This is a toughie. If you have a cubic Bezier curve, then you’re solving a degree 3 polynomial (of t). If you have a Bezier curve of degree N, then you’re solving a degree N polynomial.

There are algorithms to solve generic degree polynomials, but they are out of scope here. Assuming the highest degree of Bezier curves you’ll ever work with is 3 (cubic), then this Wikipedia article on cubic functions will help. Remember, cubic Bezier curve equations are still cubic equations.

Higher dimensionality

The number of dimensions you’re working with determines the number of equations you need to solve. If you’re working with 5 dimensional points, then you need to solve for 5 equations.

For example, if you’re working with cubic Bezier curves and using 5 dimensional points, then you need to solve 5 cubic functions. You will have possibly 3 (unique) t values for each equation. Let’s say your solution sets are as follows:
t = -1, 3, 5
t = 0, 1, 3
t = -2, 2, 3
t = 3, 3, 4 (yay repeated values!)
t = 3, 6, 8

The value t=3 appears in all 5 sets of solutions, therefore your point lies on the curve.

Keep it real

In the process of solving your equations, there’s a possibility that you might get imaginary solutions. You know, those involving the square root of -1. Dismiss them.

Your Bezier curve is in the real world. The point you’re checking must therefore also lie in the real world.

Unless you’re working with some abstract imaginary Bezier curves on an advanced maths paper. Then good luck to you! The logic above for solving still applies.

Actual applications

When applying the above, you don’t usually get nice numbers like [1.8,3.4] lying on the curve with t=0.8. You get numbers with lots of numbers behind the decimal point that seems to continue forever. You don’t get exact values.

What if you get a t=0.798 for one equation, and t=0.802 for another equation?

Use your common sense. Set an error margin for what is acceptable.

My suggestion is to NOT use the values of t to check for the margin. Substitute the values of t into the equation, and then check the points if they’re within the error margin.

This means you don’t check the difference between t=0.798 and t=0.802, which is 0.04. Is 0.04 within your error margin? Maybe. But you’re not checking for this.

You substitute t=0.798 and t=0.802 into the equation, and you get 2 points: [1.798,3.40198] and [1.802,3.39798]

Then you say, “Are these points close enough that I consider them to be the same point?” Use whatever you think is appropriate. I think the Euclidean distance norm works fine. Then check if that “close enough” criteria is within your error margin.

If you’re checking for [1.8,3.4], then ask yourself, “Is [1.8,3.4] close enough to [1.798,3.40198]? And is [1.8,3.4] close enough to [1.802,3.39798]?”

Obviously, doing this by hand sucks big time. Good thing you’re a programmer.