31 March, 2008 | Written by Vincent

Solve the dual problem

You might have seen an expert programmer in action. You ask him for help in debugging some error, and he comes up with a better way of writing the code. You are amazed at how easy it seemed.

Here’s a secret. He didn’t come up with a better solution. He restated the problem and solved that instead.

The De Morgan dual

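You might not know who De Morgan is. Augustus De Morgan formally stated the first two of the logic laws below. The third, double negation, is not strictly his, but it gets used alongside them all the time: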
not (P and Q) <==> (not P) or (not Q)
not (P or Q) <==> (not P) and (not Q)
not (not P) <==> P

Ok, perhaps that was confusing to you. Let me put it in code form:

bool P = false, Q = true;
if (!(P && Q))
{
   Console.WriteLine("First version");
}
// is equivalent to
if (!P || !Q)
{
   Console.WriteLine("Second version");
}

Does it look more familiar?
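If you don't trust the algebra, a brute-force check is cheap, since there are only four combinations of P and Q. Here's a minimal sketch in Python rather than the C# above; the function name de_morgan_holds is made up for illustration.

```python
from itertools import product


def de_morgan_holds() -> bool:
    """Check the three laws for every combination of truth values."""
    for p, q in product([False, True], repeat=2):
        # not (P and Q) <==> (not P) or (not Q)
        if (not (p and q)) != ((not p) or (not q)):
            return False
        # not (P or Q) <==> (not P) and (not Q)
        if (not (p or q)) != ((not p) and (not q)):
            return False
        # not (not P) <==> P
        if (not (not p)) != p:
            return False
    return True


print(de_morgan_holds())
```

The same kind of exhaustive check works for any condition built from a handful of booleans.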

The thing with program requirements is that sometimes, they are stated in a convoluted manner, but can be simplified when written in code. Or at least easier to understand in code.

Suppose the business requirement was that if the quantity was not less than 10 and the item was not a normal category item, process with special priority. Well, you could do this

if (!(iQuantity < 10) && (iItem != ItemCategory.Normal))
{
   // process with special priority
}
else
{
   // process normally
}

That made my mind jump through too many negations. We could transform that into

if ( !( (iQuantity < 10) || (iItem == ItemCategory.Normal) ) )
{
   // process with special priority
}
else
{
   // process normally
}

I added more space because there were a lot of round brackets.

Well, what's the big deal? Because of the structure of the requirements and the if-else statement, we could rewrite it like this

if ( (iQuantity < 10) || (iItem == ItemCategory.Normal) )
{
   // process normally
}
else
{
   // process with special priority
}

I swapped the contents of the if-else, and thus removed the negation as well. Now, the condition is easier to read.
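A quick way to be sure the three versions really behave the same is to run them side by side on the boundary values. This is a rough sketch in Python, not the original C#. The Special category and the three function names are assumptions for illustration only.

```python
from enum import Enum


class ItemCategory(Enum):
    NORMAL = 1
    SPECIAL = 2  # hypothetical second category, not from the requirement


def original(quantity, item):
    # Two negations, as the requirement was worded.
    if (not (quantity < 10)) and (item != ItemCategory.NORMAL):
        return "special"
    return "normal"


def de_morgan(quantity, item):
    # One negation around an "or", after applying De Morgan's law.
    if not ((quantity < 10) or (item == ItemCategory.NORMAL)):
        return "special"
    return "normal"


def swapped(quantity, item):
    # No negation: the two branches traded places.
    if (quantity < 10) or (item == ItemCategory.NORMAL):
        return "normal"
    return "special"


# Quantities on both sides of the boundary, with every category.
cases = [(q, c) for q in (0, 9, 10, 11) for c in ItemCategory]
all_agree = all(
    original(q, c) == de_morgan(q, c) == swapped(q, c) for q, c in cases
)
print(all_agree)
```

Testing 9 and 10 matters most here, because "not less than 10" is exactly where an off-by-one slips in during a rewrite.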

Just because the requirements were stated in a certain way, doesn't mean you can't rewrite the code to solve the same problem. And my last sentence contained two negations, which made it harder to read.

A practical use would be checking for errors. Usually the error-handling branch is much shorter than the branch that runs when the check passes

if ({check for something})
{
   // do something, usually many lines of code
}
else
{
   // display error message
}

There were times when I was tracing code through an if condition, going down line by line, and by the time I hit the else part I had forgotten what the condition was. If it was written this way

if ({error condition})
{
   // display error message
}
else
{
   // do something, usually many lines of code
}

then it's easier to follow. Add some comments in the else part to state the original condition in case you need to swap back.
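As a concrete, made-up instance of the error-first layout, here's a small Python sketch. The function describe_order, the quantity check and the unit price are all hypothetical and only there to give the branches something to do.

```python
UNIT_PRICE = 1.5  # hypothetical price, for illustration only


def describe_order(quantity: int) -> str:
    # Error condition first: the short branch sits right next to its test.
    if quantity <= 0:
        return "error: quantity must be positive"
    else:
        # Original condition was: quantity > 0
        # (kept as a comment in case the branches need swapping back)
        total = quantity * UNIT_PRICE
        return f"ok: total {total:.2f}"


print(describe_order(0))
print(describe_order(2))
```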

You'll have to be sure of what you're doing when swapping. Some requirements cannot be swapped like this. Make sure you know what the contents of the if-else are doing.

Transform the problem into another equivalent

So we've gone through a simple version of changing an if condition to another equivalent if condition. Borrowing loosely from mathematics, I'll call this solving a dual problem. We stated the if condition one way and wrote the code for that. Then we stated it in another, equivalent way and wrote the code for that. Both of them solved the same problem.

What are the uses for this concept? Suppose you're stuck with a problem. You can't write the code to fully express a requirement. Or it could take a lot of effort to solve it. See if you can transform this problem into an equivalent one that's easier to solve.

For example, you could be retrieving a bunch of data from the database. Then you iterate through the data, performing calculations on each row, then inserting everything back into the database. The problem is that it's too slow, or too memory intensive. A lot of data is held in memory while you match and sort and compare and update each row.

Transform the problem from doing calculations and updates in your programming environment to doing calculations and updates in the database environment. The master records are in the database. The detail records are in the database. Why are you retrieving them into memory and doing the updates there? Use the right programming tool; do them in the database!
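To make that concrete, here's a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The OrderMaster and OrderDetail tables and their columns are invented for illustration; your schema and database will differ. The point is that one update statement lets the database sum the detail rows for every master record, with no rows pulled into application memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table OrderMaster (OrderID text primary key, Total real);
    create table OrderDetail (OrderID text, ItemID text, Amount real);
    insert into OrderMaster values ('ORDER01', 0), ('ORDER02', 0);
    insert into OrderDetail values
        ('ORDER01', 'ITEM01', 0.5),
        ('ORDER01', 'ITEM02', 3.0),
        ('ORDER02', 'ITEM03', 5.5);
""")

# One statement: the database totals the detail rows per order and
# updates every master row. Nothing is fetched into the program first.
conn.execute("""
    update OrderMaster
    set Total = (select coalesce(sum(Amount), 0)
                 from OrderDetail
                 where OrderDetail.OrderID = OrderMaster.OrderID)
""")

totals = dict(conn.execute("select OrderID, Total from OrderMaster"))
print(totals)
```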

For another example, say the requirement was to have more white space around images when displayed in a web page. If you didn't know any better, you might have gone through every single image and added a white border around each one. You have understood the problem as creating more white space around images, so you went about solving that.

If you had understood the emphasis as "when displayed in a web page", the problem would have shifted. You could have used CSS to add padding, and the equivalent problem would have been solved.

This is a powerful concept. Solving an equivalent (dual) problem solves the original (primal) problem.

As programmers, we transform requirements into code. There are many ways to write the code, so there are many equivalents to those requirements. Choose the right equivalent problem to solve.

28 March, 2008 | Written by Vincent

Reusable code not important

We’re moving faster. Technology advances. Businesses change. New services emerge. It’s not how much better you are at reusing code, but how much better you are at creating new code that’s of value.

It’s a new program, with new logic

Someone once said that his team seldom reuses code, because they almost certainly have to rewrite most of their existing code. Their new program will need new logic and a different way of optimisation.

I’m talking about demos. I can’t remember if it’s ASD, or Conspiracy who said that. Or maybe I read it in a Hugi magazine.

Once you get past the standard texture, music and 3D geometric mesh/object loading, it’s down to the new and creative code. And new and creative code doesn’t come from reusable code. Code used to create a special effect in one demo might not be transferable to another demo.

Businesses change

And so does the business logic. My users think of new ways to conduct business all the time. New services, new forms of price plans and new ways of interacting with data.

I develop .NET web applications for them. Like a good programmer, I separated web pages from business logic code. Then I realised that I’ve never used any function related to business logic more than a few times in an entire web application. That business function had a specific use, and so was inappropriate for code reuse, because you can’t reuse it!

To stay competitive, businesses have to change. They’ve got to keep innovating and keep providing more value to their customers. Programs written for one service might have to be rewritten to take advantage of better tools, or refactored to remove useless parts, or even replaced by a new program specifically written for a new service.

Innovation versus APIs, components and the like

Innovation means new ideas. So it means new code. Sure, you could come up with a new idea by combining two or more existing ideas. Existing code could still be used. Fair enough.

Then you come up with another new idea, based on that combo idea. And another based on existing ideas. Eventually, there comes a point where a new idea is better off being free from any parent idea. The new idea is so completely new that it looks nothing like existing ideas.

That’s innovation. So where’s the code going to come from? From scratch.

Application programming interfaces are meant to be reused. I’ve written custom web controls for reuse. I’ve written JavaScript functions for reuse.

But unless you’re specifically writing code for reuse purposes, write code for the intended purpose first. If it’s easy to extend it to a more general form, then go ahead. Forget generalisation if any major reconstruction is needed to make it usable by an unknown number of applications for unknown uses.

I am the sole web developer in my team. I don’t have time to think into the future where I guess what my users might possibly want to do with my applications. By the time that imagined future arrives, my users might want a different feature. So what does that get me?

So should you reuse code or write reusable code?

As with many things, it depends. I want you to think for yourself. Decide for yourself if that piece of code is going to be useful in a generic form. Decide for yourself if your time is better used on writing a feature that provides value immediately.

This decision comes with experience. And programming experience isn’t measured in years, or months. It’s measured in the number of lines of code you write that never see the light of day. (which is an excellent topic for another day…)

In these current times, there are excellent code libraries that take out the drudgery of your work. Your time should be used on creating more valuable code on top of that. Reusable code is not important. Valuable code is.

26 March, 2008 | Written by Vincent

Use the right programming tools

Some programmers never learn beyond their programming language of choice. Or they force a new programming tool into their old way of programming, the unchallenged way of thinking. They either have only one tool, or they use the wrong tool.

Dealing with groups of data

Have you ever done this?

  • Select some records from database
  • Iterate through those records
  • For each record, do an insert statement to another table

For example, let’s say we have a table named Items

ItemID   Price
ITEM01   0.45
ITEM02   1.50
ITEM03   2.70

Suppose someone bought 2 of everything, and we want to store that information in a table named Orders.

OrderID   ItemID   Quantity   Total
ORDER01   ITEM01   2          0.90
ORDER01   ITEM02   2          3.00
ORDER01   ITEM03   2          5.40

This was what happened in a C program

  • Do the select statement
  • Bind the ItemID and Price to 2 variables, sItemID and fPrice
  • Have a temporary variable fSum store (2 * fPrice)
  • Do an insert statement

Since there were 3 rows retrieved, a total of 3 insert statements were issued.

That could easily be accomplished with the following:

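-- One statement inserts a row into Orders for every row in Items
insert into Orders (OrderID, ItemID, Quantity, Total)
select 'ORDER01', ItemID, 2, 2 * Price from Items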

Ok, that wasn’t a very good example. The point is that SQL operations are meant for manipulating groups of data. You can retrieve rows of records. You can add rows of records. You can update existing chunks of records. You can even wipe out an entire database table.

What happened up there was a failure to understand what database operations are good at. The programmer was still stuck in the standard looping structure of programming languages. He could not imagine manipulating chunks of data without iterating through each record.

When dealing with databases, use SQL operations as far as possible. This cuts down on the number of client-server round trips (3 inserts down to 1). When it becomes difficult or lengthy to form the SQL statement, then use the programming language to help.
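Here's a small sketch of both approaches side by side, using Python's built-in sqlite3 module and an in-memory database instead of the original C program. The helper names (make_db, row_by_row, set_based, orders) are mine, and the insert carries an explicit column list as a precaution. Each function returns the number of insert statements it issued.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding the Items table from the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table Items (ItemID text, Price real);
        create table Orders (OrderID text, ItemID text,
                             Quantity integer, Total real);
        insert into Items values
            ('ITEM01', 0.45), ('ITEM02', 1.50), ('ITEM03', 2.70);
    """)
    return conn


def row_by_row(conn):
    """Select, loop, and insert once per record. Returns inserts issued."""
    inserts = 0
    rows = conn.execute("select ItemID, Price from Items").fetchall()
    for item_id, price in rows:
        conn.execute(
            "insert into Orders (OrderID, ItemID, Quantity, Total) "
            "values ('ORDER01', ?, 2, ?)",
            (item_id, 2 * price),
        )
        inserts += 1
    return inserts


def set_based(conn):
    """Let the database do the whole job in one insert statement."""
    conn.execute(
        "insert into Orders (OrderID, ItemID, Quantity, Total) "
        "select 'ORDER01', ItemID, 2, 2 * Price from Items"
    )
    return 1


def orders(conn):
    return conn.execute(
        "select ItemID, Quantity, Total from Orders order by ItemID"
    ).fetchall()


loop_conn, set_conn = make_db(), make_db()
loop_inserts = row_by_row(loop_conn)
set_inserts = set_based(set_conn)
print(loop_inserts, set_inserts, orders(loop_conn) == orders(set_conn))
```

With three items the difference is trivial. With thousands of rows and a database on another machine, every one of those extra statements is a network round trip.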

The programming environment is very flexible. That doesn’t mean every calculation has to be done in that environment.

Text parsing

Regular expressions are good for manipulating text. If you’ve never heard of regular expressions before, look up an introduction first.

Regular expressions describe a search pattern. For example, “\d” matches any single digit, and “\d+” matches one or more digits. So, to search for an IP address such as “12.23.34.45”, we might use this “\d+\.\d+\.\d+\.\d+”. The dot character is a special character, so we need to escape it with a backslash.

That’ll work. Until we find “1234.54.00847.2” or “6524.738294.8477645.72645”. Remember each part of an IP address ranges from 0 to 255. Ok, so we try to limit the number of digits in each part, like this: “\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}”.

A “{1,3}” means the pattern before this is repeated a minimum of 1 time, and a maximum of 3 times. So “\d{1,3}” means search for a digit at least 1 time, but up to a maximum of 3 times.

We hit a wall when we get results such as “333.444.555.666”. We try to refine our regular expressions more and more. And each time, it gets more and more convoluted and unwieldy. I’m sure someone out there wrote a regular expression that will correctly search for an IP address.
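You can see the wall for yourself with a few lines of Python and its built-in re module. This is only a sketch: the helper name looks_like_ip is made up, and it uses fullmatch so the whole string has to fit the digit-limited pattern.

```python
import re

# The digit-limited pattern from the text, unchanged.
LOOKS_LIKE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def looks_like_ip(text: str) -> bool:
    """True if the whole string fits the pattern. Says nothing about 0 to 255."""
    return LOOKS_LIKE_IP.fullmatch(text) is not None


for candidate in ("12.23.34.45", "1234.54.00847.2", "333.444.555.666"):
    print(candidate, looks_like_ip(candidate))
```

The pattern rejects the overlong parts, but it happily accepts 333.444.555.666, because counting digits is not the same as checking a numeric range.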

You know what I’m going to say. “That’s not the point”.

Regular expressions are fantastic for searching, manipulating and parsing text. A simple search pattern can easily grab something that looks like an IP address. It fails to grab an actual IP address though.

Now think about this. Can your programming language easily search through some text and find something that looks like an IP address? You’re going to read in the text, store it in a string variable, then try to run through every character and check if it looks like a number. Then you’ve got to check if a dot character follows that number. Then a number, then a dot, then a number, then a dot, then a number.

Your string manipulation code is going to be crazy.

It’s hard for regular expressions to verify a range of numbers in the IP address, but they can easily grab something that looks like an IP address. Your programming language can easily verify if a number is within a specified range, but it’s hard for it to search through text for something that looks like an IP address. Are you getting my drift here?

Use regular expressions to grab text that looks like an IP address. Then use your programming language functions to parse that small piece of string data and verify if the 4 numeric parts in the IP address are within 0 to 255.
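Here's a minimal sketch of that division of labour in Python. The function name find_ip_addresses and the sample text are made up. The \b word boundaries are my own addition, so that a match can't start or end in the middle of a longer run of digits.

```python
import re

# Step 1 tool: a pattern that grabs text that merely looks like an IP address.
# Capture groups expose the four numeric parts. The \b boundaries are an
# added assumption, not part of the pattern discussed in the text.
CANDIDATE = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")


def find_ip_addresses(text: str) -> list:
    """Return the candidates whose four parts all fall within 0 to 255."""
    found = []
    for match in CANDIDATE.finditer(text):
        # Step 2 tool: plain integer comparison does the range check.
        if all(0 <= int(part) <= 255 for part in match.groups()):
            found.append(match.group(0))
    return found


sample = (
    "ok 12.23.34.45, bad 333.444.555.666, "
    "ok 192.168.0.1, junk 1234.54.00847.2"
)
addresses = find_ip_addresses(sample)
print(addresses)
```

Neither half is clever on its own. The regular expression stays short enough to read, and the range check is a one-line comparison.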

Using the right tool

I see programmers try to force something to work in a certain way when there’s a better way to go about it. They retrieve data, do computations in memory, then push it back into the database, when it’s more effectively done within the database environment. They try to rely completely on regular expressions, when a more effective way is to combine the powers of both regular expressions and the programming language used.

Learn to use tools beyond your programming language. Then learn to use the right tools for the right job.
