Code like a rockstar

There’s a new course available from Polymath Lectures called “Code Like a Rockstar”. Here’s an excerpt from the course description:

Taught by a successful Google Software Engineer and Computer Science Ph. D., this 5-session online masterclass will teach you expert-level coding techniques and practices which will get your code noticed by companies such as Facebook, Google, Apple, and Microsoft. Acquired over years of writing amazingly stable software at unimaginable scale and complexity, the tips in this course go well beyond the techniques taught in a typical software engineering program.

I think the instructor Michael Barnathan is kinda cool. But you’re welcome to go find out more and make up your own mind. And I don’t get a penny out of this if you sign up.

The course starts on January 7, 2012, so you have some time to decide.

Multi-use variables or multiple variables?

So I’ve been working on a software project of mine. I’ll tell you more about it soon enough, but for now, it’s enough to say that I’m writing source code that generates source code.

One thing I’ve noticed is how variables get declared. There are 2 extremes.

One variable used multiple times

This is the memory-efficient version. Whenever you need an integer variable, you declare just one and reuse it. For example,

int i;
i = DoSomething() + DoSomethingElse();
DoAlpha(i);
i = DoThis() + DoThat();
DoBeta(i);

That’s just for illustrative purposes. If you’ve written a fair amount of code, I’m sure you can think of better examples, which are usually more elaborate and lengthier.

The drawback is that each value is temporary. As the code continues executing, earlier values stored in that variable are treated as unimportant to the code that follows. That’s why the value can be discarded and the variable overwritten.

Multiple variables but one-off use

Then there’s the “declare as many variables as you can (or think you need)” method. For example,

int i1 = DoSomething();
int i2 = DoSomethingElse();
int i3 = DoThis();
int i4 = DoThat();

This has the advantage of keeping the variable values “alive” through that section of code. The drawback is that you use more memory, even if seemingly trivial. I mean, that’s like 12 more bytes of memory (assuming integers still take up 32 bits when you’re reading this). That hardly makes a dent in the computer’s memory space.
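The arithmetic behind that “12 more bytes” can be sketched like this. The helper name extra_int_bytes is made up for illustration, and the result ignores whatever the compiler does on its own (keeping values in registers, reusing stack slots), so treat it as a rough estimate only.

```c
#include <stddef.h>

/* Hypothetical helper: extra bytes needed when a section of code declares
   `many` int variables instead of reusing `few` of them.
   Assumption: every variable really occupies sizeof(int) bytes, which an
   optimising compiler is free to change. */
size_t extra_int_bytes(size_t many, size_t few)
{
    if (many <= few) {
        return 0; /* no extra variables, so no extra memory */
    }
    return (many - few) * sizeof(int);
}
```

With the 4 variables above instead of 1, extra_int_bytes(4, 1) gives 3 * sizeof(int), which is 12 on a platform where an int takes 32 bits.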

The hybrid

The above 2 are extreme cases. What happens when you write code is probably a hybrid, somewhere in between the 2 extremes. For example,

int iSubtotal;
int iTotal = 0;
iSubtotal = DoSomething();
iTotal += iSubtotal;
iSubtotal = DoThis() + DoThat();
DoSomethingElse(iSubtotal);
iTotal += iSubtotal;

You know what you declared those variables for, so you have an idea how many “unique” variables you need. This has the benefit of using the least number of variables (sort of), balanced with keeping the least number of “live” variable values around.
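Here is a version of the hybrid that actually compiles, as a rough sketch. The function bodies and their return values are stand-ins I made up so there’s something to run, and iTotal starts at 0 so the additions have a defined starting point.

```c
/* Stand-in functions: the return values are invented for illustration. */
static int DoSomething(void) { return 10; }
static int DoThis(void) { return 3; }
static int DoThat(void) { return 4; }

static int last_seen = 0; /* remembers what DoSomethingElse received */
static void DoSomethingElse(int value) { last_seen = value; }

int HybridTotal(void)
{
    int iSubtotal;   /* reused: each subtotal is only needed briefly */
    int iTotal = 0;  /* kept alive: accumulates across the whole section */

    iSubtotal = DoSomething();
    iTotal += iSubtotal;

    iSubtotal = DoThis() + DoThat();
    DoSomethingElse(iSubtotal);
    iTotal += iSubtotal;

    return iTotal;
}
```

Two variables cover four function results: iSubtotal gets overwritten because its old value has already been added to iTotal.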

So why am I talking about this?

Source code generators cannot produce hybrids

When you’re writing code, you have one very important advantage: You have context. A program that generates source code, such as a decompiler, does not have that.

When you’re writing code, you make decisions about variables: their names, the naming convention, how many you need and so on.

A decompiler has difficulty making decisions like those, so it has to choose one of the extremes. Typically the multiple variables route, because that’s the safest. All a decompiler can do is detect that a variable is needed, and so writes out the variable declaration in the resulting source code. It cannot decide on whether this part of the code can reuse one of the variables it has already declared (or at least has difficulty doing so).
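To make the two choices concrete, here is a minimal sketch of both strategies. It is not how any particular decompiler works. It assumes the generator already knows the first and last statement where each value is needed, which is exactly the kind of context that’s hard to come by. The Value struct, the function names and MAX_SLOTS are all hypothetical.

```c
#include <stddef.h>

#define MAX_SLOTS 64 /* arbitrary limit for this sketch */

/* A value in the generated code, needed from statement first_use
   to statement last_use (inclusive). */
typedef struct {
    int first_use;
    int last_use;
    int slot; /* index of the variable chosen for this value */
} Value;

/* Extreme: every value gets its own variable. Always safe. */
size_t assign_one_per_value(Value *values, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        values[i].slot = (int)i;
    }
    return count; /* number of variables declared */
}

/* Reuse: a variable is handed out again once the value stored in it is
   no longer needed. Assumes values are sorted by first_use.
   Returns the number of variables declared, or 0 if more than
   MAX_SLOTS values are needed at the same time. */
size_t assign_with_reuse(Value *values, size_t count)
{
    int busy_until[MAX_SLOTS]; /* last_use of the value held in each slot */
    size_t slots = 0;

    for (size_t i = 0; i < count; i++) {
        size_t s = 0;
        /* Skip variables whose current value is still needed. */
        while (s < slots && busy_until[s] >= values[i].first_use) {
            s++;
        }
        if (s == slots) {
            if (slots == MAX_SLOTS) {
                return 0;
            }
            slots++; /* nothing free, declare a new variable */
        }
        busy_until[s] = values[i].last_use;
        values[i].slot = (int)s;
    }
    return slots;
}
```

For four values needed over statements 0 to 1, 1 to 2, 2 to 3 and 4 to 5, the first function declares 4 variables and the second gets by with 2. The reuse version is only as good as the usage information fed into it, which is why the one-per-value route is the safe default.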

Ok, so the cat’s out of the bag. I’m writing a decompiler. That’s not exactly true but will suffice for now (I promise I’ll tell you more soon!).

Anyway, that’s what I discovered while working on my software project. I have decided to go the multi-use variable route, because of how humans (and programmers) behave: a human programmer has difficulty holding many separate variables in their head.

When a section of code requires many variables, I tend to try to limit the number of variables I remember in my head. Maybe there’s a pattern. I might remember there’s fFinancialYear1 up to fFinancialYear7. I might decide to refactor the code such that I only need one fFinancialYear floating point variable (assuming the appended numeral makes sense, and not just laziness in naming). I might separate the code section into several sections, so each section has a limited number of variables.

Maybe that’s not how most programmers work, but I find it “friendlier” than having thisIsAnAwesomeClass1 through thisIsAnAwesomeClass20, and I can’t remember which awesome class does which. I tend to work with tighter variable names (where possible and logical), and write code that’s as tight in scope as possible. So the variable values can be discarded, which means I don’t have to keep track of whether that value is still needed, even if the computer doesn’t mind having to keep track of it.
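As a small sketch of what I mean by tight scope, here’s the fFinancialYear idea with one variable that only exists inside the loop. The revenue_for_year function and its numbers are invented stand-ins.

```c
/* Hypothetical stand-in for looking up one year's figure. */
static double revenue_for_year(int year)
{
    return 100.0 * year;
}

double total_revenue(int years)
{
    double total = 0.0;

    for (int year = 1; year <= years; year++) {
        /* fFinancialYear exists only inside this loop body, so there is
           nothing to remember about it once the iteration ends. */
        double fFinancialYear = revenue_for_year(year);
        total += fFinancialYear;
    }
    return total;
}
```

Instead of fFinancialYear1 up to fFinancialYear7, there’s one name to keep in your head, and its value is discarded at the end of each pass through the loop.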

So how do you write your code where variables are concerned?

Colossal computer coding

Jumping fish, lazing cat, right-handed piano keyboard playing and more than 2000 lines of source code written.

You can try to guess which piece of music I’m practising. I’m almost ready… and I’ll have a recording up then.

Podcast: Why programmers write stupid code

Need I tell you, again, how my first recording failed abysmally? Probably not. And here I am, doing another audio recording. Plus, I got a new headset and I want to play with it.

To be honest, I learned a few things. I speak quite fast normally, and when my speech gets into recorded form, my words just slur together. I just think faster than I can write or speak.

Be clear. That was the main goal this time. So I spoke slower and slightly louder. I did some post-production work and here it is:

Download mp3 [~ 2:40 minutes ~ 1.22 MB]

Noticed that I haven’t said anything about the subject at hand. Well, here’s a synopsis of the recording.

  • Murder on the Orient Express by Agatha Christie
  • Courting danger in racing simulations
  • Why programmers write stupid code

You’ll have to listen to the recording to find out how all 3 points fit together. For the impatient ones out there who just want to know the final answer, here’s a hint: First word of first four paragraphs.