Collision detection in merging Excel cells

You can merge cells in Excel. “Duh!” you say. So how does Excel ensure that the cells you want to merge don’t overlap any existing merged cells? How do you check programmatically that the cells you want to merge won’t overlap any existing merged cells?

Well, that was the problem I faced while writing my spreadsheet library software (check out SpreadsheetLight here!). The inkling of an idea was formed when I visited StackOverflow. Can’t remember what I was there for, but the answer page wasn’t about Excel or spreadsheets. It was about collision detection.

Yes, the type of collision detection used in games so you don’t walk through walls and stuff.

And thus I learnt a new mathematical theorem: the separating axis theorem (or hyperplane separation theorem). From what I understand, given two N-dimensional thingies, if you can find one (N-1)-dimensional hyperplane that separates those two thingies, then those two thingies are separate (or not overlapping).

Yes, “thingies” is a technical term.

For example, given two 3D objects, if you can find one 2D plane that separates them, then the 3D objects are not overlapping. Another example is when you’re given two 2D objects, and if you can find one line (which is 1D) that separates them, then the 2D objects are not overlapping.

There are some conditions to be fulfilled, such as the objects being convex and disjoint and whatever. (yes I’m a mathematician…) I’ll leave it to you to do your own reading.

But for our purposes, how do we check if that rectangle of Excel cells we want to merge won’t overlap with any existing merged cells? We have these conditions that make our lives easier:

  • The merged cells are rectangular
  • The 4 end points (top-left, top-right, bottom-left, bottom-right) of a merged cell are in whole numbers
  • The (merged) cells map strictly to a 2D rectangular grid

Since the merged cells are rectangular, they’re also convex (I’m not going to explain why, just trust me). Since the 4 end points are in whole numbers, line boundaries can be easily calculated (it’s easy to check if something is <7 or >=8 for example). And since they map strictly to a 2D rectangular grid, the separating axis is a line. And even better, you can always find a subset of solutions of those separating axes that are either horizontal or vertical.

Ok, diagrams are in order.

Collision detection in merged cells

So in our case, as long as you can find one horizontal line or one vertical line that separates the 2 merged cells we’re checking, then the 2 merged cells are separate.

Let’s try an example. Excel uses cell references that are in the A1 format, which means row 3 column 4 becomes “D3”. The column index uses alphabets and is in base-26. Read more here.

We’re going to simplify that. Let’s say our existing set of merged cells only has one entry. This merged cell has the top-left corner at row 1 column 1, and the bottom-right corner at row 3 column 4. Suppose we want to check if this merge cell overlaps: top-left corner at row 7 column 8, bottom-right corner at row 10 column 10.

The horizontal line of 5 fits the bill, or the line y=5 if you want to be mathematical about it (but y goes from negative to positive downwards instead of the usual Cartesian y). Or y=6. Or even y=7 (note that the line can “touch” one of the merged cells, but not both. This is where the “whole number” condition comes in).

The vertical lines x=5, x=6 or x=8 also fit the bill.

Thus, our 2 merged cells don’t overlap.

So what’s the programmatic way to check? You’d be surprised at the simplicity. To make it easier, the variables with “StartRowIndex” mean the top row of the merged cell, “EndRowIndex” mean the bottom row of the merged cell. And “StartColumnIndex” mean the left-most column of the merged cell, and “EndColumnIndex” mean the right-most column of the merged cell.

if (!(iEndRowIndex < mc.StartRowIndex || iStartRowIndex > mc.EndRowIndex || iEndColumnIndex < mc.StartColumnIndex || iStartColumnIndex > mc.EndColumnIndex))
{
    // there's an overlap!
}

So the merge cell we want to check has top-left corner at (iStartRowIndex, iStartColumnIndex) and bottom-right corner at (iEndRowIndex, iEndColumnIndex).

The variable “mc” refers to a custom class I use to represent merged cells. Obviously, you’d run that condition in a loop through all your existing merged cells. If you can’t find an overlap after running through that loop, then the merge cell you’re checking is good to go.

Let’s run through the individual conditions.

  • (iEndRowIndex < mc.StartRowIndex) means our merged cell is completely above the existing merged cell
  • (iStartRowIndex > mc.EndRowIndex) means our merged cell is completely below the existing merged cell
  • (iEndColumnIndex < mc.StartColumnIndex) means our merged cell is completely to the left of the existing merged cell
  • (iStartColumnIndex > mc.EndColumnIndex) means our merged cell is completely to the right of the existing merged cell

The first 2 conditions check for the existence of a horizontal separating axis. The next 2 conditions check for the existence of a vertical separating axis.

Note the negation of the entire boolean condition. Those 4 conditions check for existence of solutions. Negation means checking for overlaps.

Note also that we use strictly greater than or strictly less than checks. If the merge cells share a row or column, then they overlap, right?

Did you know my first draft of writing the code had me checking 6 different conditions? Each condition had its own if statements. This one just had one. I’m so lucky to have learnt the separating axis theorem. I was checking if a point was above, within or below the merge cell, then I had to check if the point was to the left, within or to the right of the merge cell, and … urgh, it was horrible…

Partial fractions in SQL queries

I never thought my maths training would come in handy again. I was working on a financial report, and one of the requirements was to have a particular calculated value show up. The formula didn’t make sense to me, but it was a business/financial logic requirement, so I just dealt with it.

So here’s the core of the problem (specific values had been changed):

select sum(A)/sum(B) - 0.7 from sometable

where “A” and “B” are columns of the database table “sometable”.

So what’s the problem? That select query won’t run. Or at least it didn’t run from a Sybase database (or was it an SQL Server database?). I’m not saying it ran but the value was wrong. I mean it didn’t even execute. Just in case you asked, “A” and “B” are numeric data columns so the sum function will work.

I don’t know how I came up with the idea of using partial fractions. Given that only 0.08%* of staff in the entire office building had maths background, and I probably made up the entire 0.08%, I didn’t have anyone to bounce ideas off of and be told “How about you try using partial fractions on that, Vincent?”

(* a completely made up statistic)

Anyway, I tried using partial fractions, and it worked. Now in partial fractions, you typically deal with decomposing a fraction into 2 or more fractions. Here, we’re combining fractions into 1 fraction. Let me show you.

sum(A)/sum(B) – 0.7
= sum(A)/sum(B) – 7/10
= ( 10*sum(A) – 7*sum(B) ) / 10*sum(B)

If I remember correctly, this (equivalent) SQL query will work:

select ( 10*sum(A) - 7*sum(B) ) / 10*sum(B) from sometable

I’m not a database expert. If you know why that works but not the original (and more direct) version, leave a comment.

[UPDATE: A commenter told me that complicated maths functions don’t work on aggregates. The sum(A) result is an aggregated result. Apparently sum(A)/sum(B) is too complicated. Oh well…]

As part of that same programming task, I had to deal with another similar problem:

select 50 * (sum(A)/sum(B) - 0.7) from sometable

That SQL query also didn’t run. So here’s the partial fraction combining process:

50 * (sum(A)/sum(B) – 0.7)
= 50*sum(A)/sum(B) – 35
= ( 50*sum(A) – 35*sum(B) ) / sum(B)

Alternatives

Now I know there’s another option. I could get sum(A) and sum(B) individually, and then do the required calculation in code (C# code as opposed to database SQL code. I was dealing with ASP.NET then).

After considering my options, I decided to leave all the calculations at the database side. This makes the ASP.NET code “cleaner”. Then I only have to deal with one return value (instead of 2, sum(A) and sum(B)), and I can bind it directly to my database objects for display on the web browser.

Also, there were where (and group-by? Can’t remember…) clauses in the SQL query. I didn’t know if I obtained sum(A) and sum(B) individually (even if they were in the same query) that that will affect their values. I decided to play it safe, and just get it all in one resulting value from the same query.

I didn’t check for efficiency. It wasn’t an oft-used report, so the code execution won’t be run frequently enough to matter.

But if you’re curious enough to do some tests, go ahead. If you then want to share your results, I’d very much appreciate it too.

Absolutes are relative

I came upon a realisation in my first year of university. There are very few absolutes in the universe.

I was sitting in the lecture theatre, taking notes of the professor’s lesson. I was in my first year, and in my first semester, and I haven’t any friends I knew there. “Friends” were people I saw frequently (if you could call the first 2 weeks “frequently”). After two and a half years of being conscripted in the army, my brain was raring to go (about the most mentally strenuous activity I remember was figuring out how dBase IV and FoxPro work [those are database systems by the way]. Oh and how to make the records print nicely. That’s interesting, considering that I had to send letters to over 500 military personnel, and one choke of the printer sent hours and tons of paper to waste. Lining up address information on the paper could be quite challenging. Another story for another time…).

It was a maths lecture (not surprisingly…) about fundamental logic (I think). I learnt of the term “for all” (or “for every” or something similar), which is represented by an inverted capital A. For example, for all even numbers, one of the factors is 2.

This “for all” term is not used lightly. When you say, “for all”, it means for all conditions, circumstances, situations, universes, even alternate realities, that the next statement is true.

And as the professor said, all you need is one example of the statement being false, and the whole statement collapses.

A related concept is the negative. For example, “There is no such thing as a zombicorn.” First, there’s no such thing as a zombie. Second, there’s no such thing as a unicorn. Uh, I’m not going to contest you on that. Go Google it or something…

What you mean by “There is no such thing as XYZ”, is that for all conditions, circumstances, situations, universes, even alternate realities, that the statement is true. And all you need is the existence of XYZ for that statement to be false.

Because of this, the professor also said that it’s impossible to prove a negative. We just barely found Earth-sized, possibly habitable planets at the outer reaches of our exploration of space. How can anyone possibly search the entire known (and unknown) to find that one existence of proof? Have you searched all the planets out there? Do you know for sure, that there is absolutely no way that XYZ can possibly exist?

This is why I find maths interesting. Some of its concepts can be proven. For example, the statement “There is no such thing as an even number with a factor that’s 2.” All I need to disprove that statement is find the existence of an even number with 2 as a factor. For example, 6 (with 2 and 3 as factors).

Once I realised that about the only absolute statements are in maths (and possibly physics, but I suck at physics, so, yeah…), and even then those absolute statements are held in the strictest of scrutiny, I also realised that (almost?) everything is relative. Yes, I know I’m way behind some fellow named Einstein who mentioned some theory of relativity somewhat…

The notion that nothing is absolute scares the hashbrown out of some people. Those people need to know that this is right and that is wrong. That this is black and that is white. That there are clear cut lines with which they can stand behind of. That there are “right” arguments and opinions they can back themselves with.

For example, the statement “You’re always late.” really means the person is saying you’re always late in the person’s mind. Never mind the few times that you are early, because that person will conveniently forget about those instances (because of cognitive dissonance. The person had to convince herself that she’s right about you being always late, and thus fabricates proof that substantiates her opinion). Also, it presupposes that you will continue to always be late. And all you need is to be early once, and that statement is false (but cognitive dissonance will thwart your attempts at defending yourself).

This taught me to be tolerant of other people’s views. When nothing is absolute, then nothing is certain. Like I said, this scares the cranberry out of some people.

And this coming from a maths lecture. More interesting still, it came from a discipline that’s known for its strictness.

Negative sales targets and percentage commissions

A while ago, I received an email from a distraught salesman. He believed his sales commissions were wrongly calculated, and asked me to shed some light.

Note that I’m not using the exact numbers he gave in his email.

The story goes that Michael (as I’ll call him) and his colleagues were given sales targets that were negative. How could sales targets be negative? Shouldn’t you be trying to sell something? The reason given was that the current economy was disastrous, and basically each sales person was trying to not lose sales.

You’re gonna bleed. It’s how much you bled.

Anyway, given Michael’s negative sales target, he managed to exceed it. He didn’t manage to bring in sales (positive sales numbers), but he didn’t lose too much money (slight negative sales numbers). But his sales commissions didn’t reflect that.

Now I’m not going to discuss how that works out. I can’t presume to understand the business logic behind the sales commission in this case, but I’ll discuss the mathematics behind the numbers.

The normal sales targets and commission

Let’s say your sales target for this month is $1000. This means you’re expected to sell about $1000 worth of products or services. We’ll ignore the condition that you will get some commission based on what you sell, regardless of how much you sold (my brother’s a sales person), as well as other types of commissions.

Let’s say the sales commission is based on how much extra you sold beyond your sales target. Makes sense, right? Let’s use simple percentages.

If you sold $1100 worth of products or services, then your percentage commission might be calculated as follows:
(Difference between Your Sales and Your Sales Target) / (Your Sales Target)

Or ($1100 – $1000) / ($1000) = 10% commission.

This is assuming that your sales amount exceeded the sales target, of course.

The case of negative sales targets

Now if the sales target is negative, as in Michael’s case, the mathematical formula still applies. But you have to note the negative sign. For some reason, “business” people (no offense to business people) tend to see -4567 as larger than 12, even though 12 > -4567. They see the magnitude first, not the value itself. (It’s also why I get emails about calculations involving negative numbers… anyway…)

Let’s say the sales target is -$1000. Everyone’s expected to lose money, but you try not to lose more than $1000. At least that’s what I’m interpreting it as.

Let’s say Michael managed to lose only $50. Or -$50 to be clear. The formula
(Difference between Your Sales and Your Sales Target) / (Your Sales Target)

have to be modified to this
(Difference between Your Sales and Your Sales Target) / (Magnitude of Your Sales Target)

In maths and programming terms, the “magnitude” part refers to the absolute function. Meaning you ignore any negative signs. Actually, the modified version works for the normal case too (which is why you should use it for the normal version anyway to take care of weird cases like this but I digress…).

So, we get (-$50 – [-$1000]) / abs(-$1000) = $950 / $1000
= 95%

Actually, you should use this:
abs( [Your Sales] – [Your Sales Target] ) / abs(Your Sales Target)

That’s the “foolproof” version. Consider it a bonus for reading this far. Frankly speaking, any competent programmer should be able to come up with that formula, even without much maths background. You just need to think about the situation a little carefully (ask “what if?” more often).

Michael’s calculated commission

When Michael wrote to me, he said his commission was calculated as follows (given that he only lost $50):
-$50 / -$1000 = 5%

Let’s say someone else lost -$900 that month. With the above calculation, that person gets:
-$900 / -$1000 = 90%

Clearly it makes more sense to lose more money! This was why Michael wrote to me.

I don’t propose the method I gave is correct, business-logic-wise. Michael didn’t give me any details on what he’s selling, or what his company is (or even why it’s acceptable to have negative sales targets, regardless of the economy). So I cannot give any help other than from a pure mathematical point of view. But I hope it’ll at least give Michael a fairer commission amount.

Questions

Given Michael’s situation, what do you think is an appropriate calculation formula?

Can you think of (or know of) a realistic situation where a negative sales target is acceptable? I say “acceptable”, but seriously, no company should “accept” that they lose money every month.

Cantor sets and Invisibility cloaks

More specifically, it’s the Cantor ternary set.