Use the right programming tools

Some programmers never learn beyond their programming language of choice. Or they wrongly merge a new programming tool to their old way of programming, the unchallenged way of thinking. They either have only one tool, or they use the wrong tool.

Dealing with groups of data

Have you ever done this?

  • Select some records from database
  • Iterate through those records
  • For each record, do an insert statement to another table

For example, let’s say we have a table named Items

ItemID Price
ITEM01 0.45
ITEM02 1.50
ITEM03 2.70

Suppose someone bought 2 of everything, and we want to store that information in a table named Orders.

OrderID ItemID Quantity Total
ORDER01 ITEM01 2 0.90
ORDER01 ITEM02 2 3.00
ORDER01 ITEM03 2 5.40

This was what happened in a C program

  • Do the select statement
  • Bind the ItemID and Price to 2 variables, sItemID and fPrice
  • Have a temporary variable fSum store (2 * fPrice)
  • Do an insert statement

Since there were 3 rows retrieved, a total of 3 insert statements were issued.

That could easily be accomplished with the following:

insert into Orders
select 'ORDER01', ItemID, 2, 2 * Price from Items

Ok, that wasn’t a very good example. The point is that SQL operations are meant for manipulating groups of data. You can retrieve rows of records. You can add rows of records. You can update existing chunks of records. You can even wipe out an entire database table.

What happened up there was the inability to understand what database operations were good at. The programmer was still stuck in the standard looping structure in programming languages. He cannot imagine manipulating chunks of data without iterating through each record.

When dealing with databases, use SQL operations as far as possible. This cuts down on the number of client-server communication (3 inserts down to 1). When it becomes difficult or lengthy to form the SQL statement, then use the programming language to help.

The programming environment is very flexible. That doesn’t mean every calculation has to be done in that environment.

Text parsing

Regular expressions are good for manipulating text. If you’ve never heard of regular expressions before, have a look here for an introduction.

Regular expressions describe a search pattern. For example, “\d” searches for any digit. So, to search for an IP address such as “12.23.34.45”, we might use this “\d\.\d\.\d\.\d”. The dot character is a special character, so we need to escape it with a backslash.

That’ll work. Until we find “1234.54.00847.2” or “6524.738294.8477645.72645”. Remember each part of an IP address ranges from 0 to 255. Ok, so we try to limit by the number of digits in each part like this, “\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}”.

A “{1,3}” means the pattern before this is repeated a minimum of 1 time, and a maximum of 3 times. So “\d{1,3}” means search for a digit at least 1 time, but up to a maximum of 3 times.

We hit a wall when we get results such as “333.444.555.666”. We try to refine our regular expressions more and more. And each time, it gets more and more convoluted and unwieldy. I’m sure someone out there wrote a regular expression that will correctly search for an IP address.

You know what I’m going to say. “That’s not the point“.

Regular expressions are fantastic for searching, manipulating and parsing text. A simple search pattern can easily grab something that looks like an IP address. It fails to grab an actual IP address though.

Now think about this. Can your programming language easily search through some text and find something that looks like an IP address? You’re going to read in the text, store it in a string variable, then try to run through every character and check if it looks like a number. Then you’ve got to check if a dot character follows that number. Then a number, then a dot, then a number, then a dot, then a number.

Your string manipulation code is going to be crazy.

It’s hard for regular expressions to verify a range of numbers in the IP address, but it can easily grab something that looks like an IP address. Your programming language can easily verify if a number is within a specified range, but it’s hard for it to search through text for something that looks like an IP address. Are you getting my drift here?

Use regular expressions to grab text that looks like an IP address. Then use your programming language functions to parse that small piece of string data and verify if the 4 numeric parts in the IP address are within 0 to 255.

Using the right tool

I see programmers try to force something to work in a certain way when there’s a better way to go about it. They retrieve data, do computations in memory, then push it back into the database, when it’s more effectively done within the database environment. They try to completely rely on regular expressions, when a more effective way was to combine the powers of both regular expressions and the programming language used.

Learn to use tools beyond your programming language. Then learn to use the right tools for the right job.

  1. JD

    i think it depends a lot on your programming experience. In the early years of programming for sure some people will end up using inappropriate functions and writing unnecessary lengthy code with the wrong tools. Things tend to get better as time passes and when we start losing hair… : ) Yeah losing hair might be a good indication that you’re getting better especially after pulling it out when codes don’t work ; )

    The good thing is that there are persons like you helping others to find the right path. Keep it up.

    JD
    http://www.techsted.com

  2. Kalen Jordan

    Good points man. As a coder, its so tempting to just do it the way that you’re comfortable, but you need to constantly be learning not only new tools but new techniques, methodologies, and frameworks to approach problems in fresh ways!

  3. Vincent Tan

    @JD: Yes, it does depend on the individual’s programming experience. I help out at Dream In Code forums, and some of the questions posed are just a bit appalling. Until I realise that it’s easy for newcomers to be confused about hard-but-necessary ground work and cool-and-advanced code techniques.

    “Things tend to get better as time passes and when we start losing hair…”
    The getting better part is cool. The losing hair part, not so cool…

    Thanks for visiting!

    @Kalen: It’s about choice. The more we learn, the more options we have. Then we can decide if the old way is still useful, or if it can be modified. Or totally scrapped.

  4. Heaps of by-products | Polymath Programmer

    […] Perhaps. I don’t understand why the concern over performance and memory management, and that Unix C programs and shell scripts are better than anything done on Windows (no this isn’t a *nix/Windows comparison/war), and better than anything done by a database engine. Use the right tools. […]

Comments are closed.