Sometimes, new problems appear after solutions are made

So I had a discussion at Hackerspace (I’ve a video for you soon), and Preetam mentioned something about iPads. He said that schools are using iPads for education, and there’s an interesting problem.

Teachers using the iPads as teaching aids want to move around the classroom. With the iPad, the teacher can project information on her iPad to the screen. But the moving around was a problem, because the iPad needed some connecting wire thingy to the projector.

Well you might say that Apple should have considered making wireless projection of the iPad screen seemless.

But I want you to consider this. If tablets weren’t available, and thus teachers could carry tablets around, would the problem of using the tablet to project information wirelessly even exist?

In the pre-Industrial age, practically everyone was working on the land. If you don’t grow food, you don’t get to eat.

The thing about working on farms is that there’s always something to do. (Just ask any Farmville player…) There are cows to be milked, chickens to feed, eggs to collect, grains to harvest.

In the Industrial age, factories made everything systematic and efficient, and our lives became better. You need to be at the factory at this time. You will go for your lunch break at this time. And most important for our discussion here, you can go home at this time.

This created a problem humans never faced before. Suddenly, we had free time.

What are we going to do with it?

Revenue sharing and operations research – part 3

This is a mini-series on how revenue sharing and operations research are linked. You might want to read part 1 on the specific business problem I was solving, and part 2 for the mathematical formulation of the problem. In this final part, I’ll tell you what was the solution eventually used.

First, although I said it was basically an assignment problem in part 1, on further thought, I don’t believe it is. What I was thinking was “how to assign that $0.01 such that there are no rounding errors”. Thus the “assignment” part. I apologise for any confusion.

Second, I said the financial/maths formulation was an integer problem. But the values are money values with decimal values, right? How can it be an integer problem? Because it has to also be correct up to 2 decimal places. That means fixed arithmetic. Therefore it becomes an integer problem. Just think of the values as cents (instead of dollars correct up to 2 decimal places).

Now, if you’ve read part 2 (and I applaud you if you actually sat through that mass of maths formulations), you should have guessed that using operations research to solve the business problem was not advisable. It might not even be suitable.

However, the problem still needed to be solved. How do you get rid of any extra or missing money?

More maths…

Going back to the example I gave in part 1, there were 3 products with revenue to be split between 2 parties. So there were 6 parts. If each part generated a rounding error of +$0.01, then there was a maximum potential difference of $0.06 between the original revenue to be shared and the sum of the parts after revenue sharing calculations.

I remind you that whatever solution I came up with had to make mathematical sense and financial sense to the programmers and the users. There are actually 2 goals:

  • To have the sum of the parts be equal to the original revenue amount
  • To have each part’s amount (after rounding) be as close to the calculated value as possible

The 1st goal ensures no summation errors. After revenue sharing, there are no extra or missing money amounts. This part is actually easy to fulfill. The 2nd goal is fulfilled with a bit of adjustments. So here’s the solution.

The easy-to-understand maths solution

We split the revenue accordingly to each part first, rounding each part’s amount to 2 decimal places. Then we sort each part in ascending order. Then we sum all the part’s amounts together. If there’s a discrepancy, we correct the discrepancy by adjusting the largest amount. This calls for an example.

Here’s the original example used. Total revenue for the 3 products are:

  • ProductA: $63.13
  • ProductB: $20.75
  • ProductC: $16.12

Assuming a 30-70 percentage split, we have:

  • ProductA: $18.94 (us), $44.19 (them)
  • ProductB: $6.23 (us), $14.53 (them)
  • ProductC: $4.84 (us), $11.28 (them)

Sorting all the parts in ascending order, we have:

  • ProductC: $4.84 (us)
  • ProductB: $6.23 (us)
  • ProductC: $11.28 (them)
  • ProductB: $14.53 (them)
  • ProductA: $18.94 (us)
  • ProductA: $44.19 (them)

The sum of the parts’ amounts is $100.01, which is not equal to the original revenue being shared ($100). The discrepancy is a +$0.01. So we adjust the largest amount. Specifically, we deduct $0.01 from the largest amount (because our discrepancy is positive).

So the revenue share for the content provider for ProductA becomes $44.18, and thus the sum of the parts become $100.

This method ensures that the sum of the each part’s amounts is still equal to the original revenue, which is very important (because this is a financial operation!). This satisfies the 1st goal.

And for each part, the amount is rounded to the nearest 2 decimal place. So each part’s amount is as close to the calculated split value as possible. The only exception is the largest amount might be off a little.

Now I chose the largest amount to “absorb” any rounding discrepancy precisely because it is the largest amount. Note that the term “largest” refers to the magnitude, so if you happen to deal with negative values (it happens, even in financial situations. Consider debt as an example), use the maths absolute function to do the sorting.

Any discrepancy can be mathematically shown to be at most equal to (number of parts) multiply by $0.01 (rounding error).

D <= ± (N * $0.01) where D is the discrepancy and N equals the number of parts. Note that the discrepancy is bounded, which is the mathematical way of saying it has an upper and lower limit (or bound). Note also that in a fraction, a larger numerator means a larger fraction and a smaller numerator means a smaller fraction. A larger denominator means a smaller fraction and a smaller denominator means a larger fraction.

Now, whatever the discrepancy value is, it is bounded, it is fixed, and it is a small value. If we want any amount to “absorb” this discrepancy, then the larger the amount, the smaller the resulting error fraction or error percentage.

For example, if the discrepancy is $0.01 and the amount is $1, the resulting error percentage is 1% ($0.01 / $1.00 * 100%). If the amount is $5, the resulting error percentage becomes 0.2% ($0.01 / $5.00 * 100%).

Suppose the discrepancy is $0.02. We could spread the discrepancy error among the largest 2 amounts, each amount absorbing $0.01. But this makes the programming a little more complicated than it is. Also, it makes the algorithm a bit “dynamic”, which makes tracing any calculations by a programmer or user difficult.

Implementing it in code

All the revenue amounts were stored in the database. Because of this, I recommended that any revenue sharing calculations be done within the database environment itself. Namely, with stored procedures running SQL statements.

Benefits of using stored procedures within the database environment:

  • Can sort values easily (use the SORT BY clause)
  • Can handle dynamic number of values (with temp tables or the original tables)
  • Some calculations can be grouped into a single UPDATE statement
  • All the values are in the database!

The last benefit means there’s little switching of context from the database environment to the… whatever environment. In that project, it was scheduled Unix shell scripts combined with C programs that called the stored procedures. We didn’t want the context to switch back to a Unix environment to do calculations. Doing calculations in the Unix environment with C might be fast, but there are many content providers and many products. The context switching might eat up any performance benefits. Besides, having the calculations in a few stored procedures mean better modularity and separation of functions.

Further considerations

Because we’re in the business of … uh, doing business, we might want the customer to have a better deal. Or at least an easier report to read.

In the method above, we sorted the amounts in ascending order, regardless of whether it’s us or them. So it could well mean that the largest revenue share of the content provider be used to absorb the discrepancy.

This might mean when they read the revenue sharing report, they might question why that amount is $44.18 instead of $44.19. It might be just $0.01 to you, but it’s still money!

What we can do is sort the amounts by theirs first, then ours. And within each, sort by ascending order. So we could have this instead:

  • ProductC: $11.28 (them)
  • ProductB: $14.53 (them)
  • ProductA: $44.19 (them)
  • ProductC: $4.84 (us)
  • ProductB: $6.23 (us)
  • ProductA: $18.94 (us)

In this case, we adjust our revenue share for ProductA to be $18.93 (instead of $18.94) so the revenue sum is correct. Basically, we absorb any discrepancy, using our largest revenue share amount.

And that’s the end of the discussion of revenue sharing: the business part, the maths part and the implementing/programming part. Let me know if you have any questions.