Multi-use variables or multiple variables?

So I’ve been working on a software project of mine. I’ll tell you more about it soon enough, but for now, it’s enough to say that I’m writing source code that generates source code.

One thing I’ve noticed is how variables get declared. There are 2 extremes.

One variable used multiple times

This is the memory-efficient version. If you need an integer variable, you declare just one and keep reusing it. For example,

int i;
i = DoSomething() + DoSomethingElse();
DoAlpha(i);
i = DoThis() + DoThat();
DoBeta(i);

That’s just for illustrative purposes. If you’ve written a fair amount of code, I’m sure you can think of better examples, which are usually more elaborate and lengthier.

The drawback to this is that the variable is temporary. As the code continues executing, earlier values stored in that variable are treated as unimportant to whatever runs later. That’s why the value can be discarded and the variable overwritten.

Multiple variables but one-off use

Then there’s the “declare as many variables as you can (or think you need)” method. For example,

int i1 = DoSomething();
int i2 = DoSomethingElse();
int i3 = DoThis();
int i4 = DoThat();

This has the advantage of keeping the variable values “alive” through that section of code. The drawback is that you use more memory, even if the amount is seemingly trivial. I mean, that’s like 12 more bytes of memory (4 variables instead of 1, assuming integers still take up 32 bits when you’re reading this). That hardly makes a dent in the computer’s memory space.

The hybrid

The above 2 are extreme cases. What happens when you write code is probably a hybrid, somewhere in between the 2 extremes. For example,

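int iSubtotal;
int iTotal = 0; // must start at zero before accumulating
iSubtotal = DoSomething();
iTotal += iSubtotal;
iSubtotal = DoThis() + DoThat(); // reused: the previous subtotal is already in iTotal
DoSomethingElse(iSubtotal);
iTotal += iSubtotal;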

You know what you declared those variables for, so you have an idea how many “unique” variables you need. This has the benefit of using the least number of variables (sort of), balanced with keeping the least number of “live” variable values around.

So why am I talking about this?

Source code generators cannot produce hybrids

When you’re writing code, you have one very important advantage: You have context. A program that generates source code, such as a decompiler, does not have that.

When you’re writing code, you make variable decisions such as naming, naming conventions, how many you need and so on.

A decompiler has difficulty making decisions like those, so it has to choose one of the extremes. Typically the multiple variables route, because that’s the safest. All a decompiler can do is detect that a variable is needed, and so writes out the variable declaration in the resulting source code. It cannot decide on whether this part of the code can reuse one of the variables it has already declared (or at least has difficulty doing so).
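To make the “multiple variables” route concrete, here is a minimal sketch in C (chosen only because the snippets above are C-style). It is not how any real decompiler works, and EmitOneOffDeclarations and its parameters are names made up for this illustration. All it does is write one fresh declaration per expression, without ever checking whether an earlier variable could be reused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator step: emit one new int declaration per
 * expression, naming the variables i1, i2, ... in order.
 * No reuse analysis is attempted, which is what makes this the
 * "safe" route.
 * Returns the number of characters written, or -1 if out is too small. */
int EmitOneOffDeclarations(char *out, size_t cap,
                           const char *const *exprs, size_t count)
{
    assert(out != NULL && exprs != NULL);
    if (cap == 0) {
        return -1;
    }
    out[0] = '\0';

    size_t used = 0;
    for (size_t k = 0; k < count; ++k) {
        int n = snprintf(out + used, cap - used, "int i%zu = %s;\n",
                         k + 1, exprs[k]);
        if (n < 0 || (size_t)n >= cap - used) {
            return -1; /* the line would not fit in the buffer */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Given the expressions DoSomething() and DoThis(), this writes an i1 line followed by an i2 line, much like the one-off example earlier.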

Ok, so the cat’s out of the bag. I’m writing a decompiler. That’s not exactly true but will suffice for now (I promise I’ll tell you more soon!).

Anyway, that’s what I discovered while working on my software project. I have decided to go the multi-use variable route, because of a human (and programmer) behaviour: a human programmer has difficulty holding on to many separate variables in his head.
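For illustration only, the multi-use route could be sketched in C like this. EmitReusedVariable, the variable name i, and the idea of pairing each expression with a “consumer” function are all assumptions made for the example, not a description of my actual project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical generator step for the multi-use route: declare a single
 * int named i, then for every (expression, consumer) pair emit an
 * assignment to i followed by a call that uses it.
 * Returns the number of characters written, or -1 if out is too small. */
int EmitReusedVariable(char *out, size_t cap,
                       const char *const *exprs,
                       const char *const *consumers, size_t count)
{
    assert(out != NULL && exprs != NULL && consumers != NULL);
    if (cap == 0) {
        return -1;
    }

    /* The declaration is written exactly once. */
    int n = snprintf(out, cap, "int i;\n");
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    size_t used = (size_t)n;

    for (size_t k = 0; k < count; ++k) {
        n = snprintf(out + used, cap - used, "i = %s;\n%s(i);\n",
                     exprs[k], consumers[k]);
        if (n < 0 || (size_t)n >= cap - used) {
            return -1; /* the lines would not fit in the buffer */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

This only works because each value is consumed right after it is assigned. If a later line still needed an earlier value, overwriting i would be a bug.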

When a section of code requires many variables, I tend to try to limit the number of variables I remember in my head. Maybe there’s a pattern. I might remember there’s fFinancialYear1 up to fFinancialYear7. I might decide to refactor the code such that I only need one fFinancialYear floating point variable (assuming the appended numeral makes sense, and not just laziness in naming). I might separate the code section into several sections, so each section has a limited number of variables.

Maybe that’s not how most programmers work, but I find it “friendlier” than having thisIsAnAwesomeClass1 through thisIsAnAwesomeClass20, and I can’t remember which awesome class does which. I tend to work with tighter variable names (where possible and logical), and write code that’s as tight in scope as possible. So the variable values can be discarded, which means I don’t have to keep track of whether that value is still needed, even if the computer doesn’t mind having to keep track of it.
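As a rough sketch of that kind of refactoring in C: instead of fFinancialYear1 up to fFinancialYear7, the figures sit in an array and a single loop-scoped fFinancialYear holds one value at a time. SumFinancialYears and the array parameter are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Adds up per-year figures using one tightly scoped variable instead of
 * a separate numbered variable for every year. */
double SumFinancialYears(const double *figures, size_t years)
{
    assert(figures != NULL || years == 0);

    double fTotal = 0.0;
    for (size_t y = 0; y < years; ++y) {
        /* fFinancialYear exists only inside this loop body, so there is
         * nothing to keep track of once the iteration ends. */
        double fFinancialYear = figures[y];
        fTotal += fFinancialYear;
    }
    return fTotal;
}
```

The only value that outlives the loop is fTotal, which is the one I actually care about.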

So how do you write your code where variables are concerned?

Colossal computer coding

Jumping fish, lazing cat, right-handed piano keyboard playing and more than 2000 lines of source code written.

You can try to guess which music piece I’m practising. I’m almost ready… and I’ll have a recording up then.

Modularity in programming guides

I’ve read many programming guide books and tutorials. The one thing I’m looking for is, “I want to do X. What is the code I need to write to do just that?”

Many times, the author of the book or tutorial has mixed other code or concepts into that. I want to know the simplest way to print “Hello World!”. I don’t want to include any extra libraries that don’t help with that. I don’t want any custom functions that make printing a string any easier. I just want to print a string, ok?

The point is, the author already knows how to accomplish that task you want to learn. It’s when he gets, I don’t know, bored, that he adds other concepts to make it, I don’t know, interesting.

I’m not looking for the least number of code lines to write that accomplishes that task, although it’s usually that. I’m looking for the lines of code that just do what I want.

Because sometimes, I can’t differentiate the important from the extraneous. I don’t know, that’s why I’m learning, remember? This is especially important when I need to mix and match different concepts. If what I learnt has other extra stuff mixed in, then the resulting code has “more” extra stuff.

It’s like I want to mix X and Y, but got (X + dx) and (Y + dy). And I don’t know which parts are dx or dy.

Some authors make it clear which parts are the actual lines of code to accomplish X. Some authors are great at explaining stuff. I’m saying there are many others who don’t or can’t.

So when I wrote my Open XML spreadsheet programming guide, I made sure each chapter was modular. If not, I had sufficient comments and explanations so the reader knows which parts are the important parts. Each chapter was modelled after a major feature/function in the Excel software. How to style text, how to insert images, how to add more worksheets, that kind of thing. The Excel user mixes and matches those functions, so I want the programmer using the guide to be able to do a similar thing.

I got an email from a programmer who bought my guide, saying that he liked to pick apart code to figure it out. I wrote a few custom functions but only because it made the code more readable. The full source code is given, so the reader is free to pick apart those functions and write his own (to better suit his needs).

I believe this is attributed to Albert Einstein:

Make things as simple as possible, but not simpler.

Be careful of encapsulating too much into just one function call.

Spreadsheet Open XML V2

I was rushing to get this out. The updated version of my programming guide is out! I first launched it on 17 Jan this year, and I was hoping to meet the personal-and-unseen deadline of 17 May, for a nice 4-month interval. Ah well, I’m the only one who cares anyway…

I’ve added loads more content to help you with your Open XML spreadsheet needs. Version 1 was 53 pages. Version 2 is 147 pages. There are a lot of pages with screenshots, but still… 147 pages!

You can find out more here.

The last few weeks have been interesting while I rushed to get working source code and write explanations for the guide… I need to sleep… wait, I’ve got a magazine deadline! *sigh*

First programming product almost done

As I mentioned earlier, I’m preparing a guide to creating Excel files using just code and the Open XML SDK. I’m calling it “Excel Open XML From Scratch” (nice name, huh?). All the source code (C# and VB.NET) has been written and tested. It works! Yay!

So now, I’m writing the accompanying PDF to explain the code and concepts. As mentioned before, I’m targeting a “before Christmas” launch. Now that we’re nearer the date, I can confidently tell you that it’s going to be sometime within the next few days. It will be released on 17 December. This gives you a week before Christmas to get this as a present for yourself or a programmer friend.

The price will be set at USD 17 (see actual price on product page), which I believe is a fair price considering that it will save you hours of work. Think about how much you’re paid per hour. Probably more than the price of this product. Hmm… I might even have to increase the price…

This is my first programming-related product, so I’m really excited by this. If you don’t really care about C#, VB.NET, Open XML or Excel, then I apologise. Just ignore any related posts for the next couple of weeks.