Multi-use variables or multiple variables?

So I’ve been working on a software project of mine. I’ll tell you more about it soon enough, but for now, it’s enough to say that I’m writing source code that generates source code.

One thing I’ve noticed is variable declaration. There are 2 extremes.

One variable used multiple times

This is the memory-efficient version. If you need the use of an integer variable, you just declare one variable. For example,

int i;
i = DoSomething() + DoSomethingElse();
DoAlpha(i);
i = DoThis() + DoThat();
DoBeta(i);

That’s just for illustrative purposes. If you’ve written a fair amount of code, I’m sure you can think of better examples, probably more elaborate and lengthier ones.

The drawback to this is that the variable is temporary. As the code continues its execution, previous values stored in that variable are considered to be unimportant to future executions. That’s why the value can be discarded and the variable overwritten.

Multiple variables but one-off use

Then there’s the “declare as many variables as you can (or think you need)” method. For example,

int i1 = DoSomething();
int i2 = DoSomethingElse();
int i3 = DoThis();
int i4 = DoThat();

This has the advantage of keeping the variable values “alive” through that section of code. The drawback is that you use more memory, even if seemingly trivial. I mean, that’s like 12 more bytes of memory than the single-variable version: 3 extra integers at 4 bytes each (assuming integers still take up 32 bits when you’re reading this). That hardly makes a dent in the computer’s memory space.

The hybrid

The above 2 are extreme cases. What happens when you write code is probably a hybrid, somewhere in between the 2 extremes. For example,

int iSubtotal;
int iTotal = 0;
iSubtotal = DoSomething();
iTotal += iSubtotal;
iSubtotal = DoThis() + DoThat();
DoSomethingElse(iSubtotal);
iTotal += iSubtotal;

You know what you declared those variables for, so you have an idea how many “unique” variables you need. This has the benefit of using the least number of variables (sort of), balanced with keeping the least number of “live” variable values around.

So why am I talking about this?

Auto-generated source code cannot generate hybrids

When you’re writing code, you have one very important advantage: You have context. A program that generates source code, such as a decompiler, does not have that.

When you’re writing code, you make variable decisions such as naming, naming conventions, how many you need and so on.

A decompiler has difficulty making decisions like those, so it has to choose one of the extremes. Typically the multiple variables route, because that’s the safest. All a decompiler can do is detect that a variable is needed, and so writes out the variable declaration in the resulting source code. It cannot decide on whether this part of the code can reuse one of the variables it has already declared (or at least has difficulty doing so).
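To make the “multiple variables” route concrete, here’s a minimal sketch of what such a generator might look like. It’s an illustration only, not the code of any real decompiler. FreshVariableEmitter and Emit are hypothetical names I made up. Every expression simply gets the next numbered variable, because nothing in the generator knows whether an earlier variable is free to be reused.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generator: each value gets its own freshly numbered variable,
// since the generator has no context for deciding that an old one can be reused.
static class FreshVariableEmitter
{
    public static List<string> Emit(IEnumerable<string> expressions)
    {
        var lines = new List<string>();
        int counter = 0;
        foreach (string expression in expressions)
        {
            counter++;
            lines.Add("int i" + counter + " = " + expression + ";");
        }
        return lines;
    }

    static void Main()
    {
        var calls = new[] { "DoSomething()", "DoSomethingElse()", "DoThis()", "DoThat()" };
        foreach (string line in Emit(calls))
        {
            Console.WriteLine(line);
        }
    }
}
```

Fed the 4 calls above, it writes out the same 4 declarations as the “multiple variables” example earlier. Reusing a variable instead would need extra analysis to prove the old value is no longer needed.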

Ok, so the cat’s out of the bag. I’m writing a decompiler. That’s not exactly true but will suffice for now (I promise I’ll tell you more soon!).

Anyway, that’s what I discovered while working on my software project. I have decided to go the multi-use variable route, because of a human (and programmer) behaviour. A human programmer has difficulty holding on to many separate variables in his head.

When a section of code requires many variables, I tend to try to limit the number of variables I remember in my head. Maybe there’s a pattern. I might remember there’s fFinancialYear1 up to fFinancialYear7. I might decide to refactor the code such that I only need one fFinancialYear floating point variable (assuming the appended numeral makes sense, and not just laziness in naming). I might separate the code section into several sections, so each section has a limited number of variables.

Maybe that’s not how most programmers work, but I find it “friendlier” than having thisIsAnAwesomeClass1 through thisIsAnAwesomeClass20, and I can’t remember which awesome class does which. I tend to work with tighter variable names (where possible and logical), and write code that’s as tight in scope as possible. So the variable values can be discarded, which means I don’t have to keep track of whether that value is still needed, even if the computer doesn’t mind having to keep track of it.

So how do you write your code where variables are concerned?

Those variables on Bezier curve equations are not fixed

I wrote something on reverse engineering Bezier curves about… *goes to check* woah, 2 years ago! I don’t remember it being that long… (You might want to read that article before proceeding…)

Anyway, I’ve received a few comments and emails about its usefulness. Basically, what I did was to find the 4 control points of a cubic Bezier curve from 4 known points which lie on that Bezier curve. 2 of the known points are to be the end points of the Bezier curve (which automatically makes them control points too). The other 2 known points lie somewhere on the Bezier curve.

Here’s where the confusion sets in. Commenter Yonatan pointed out that there is an infinite number of Bezier curves based on how those 2 known points are defined. And he’s right.

Now, I formed that solution based on the “natural” implicit decision that the 4 known points are evenly spread out on the Bezier curve. There is no reason for them to be, and the math never requires it. The solution arose from that even-spacing assumption, but in the end, it works for the general case as well. So long as 0 < u, v < 1 and u is not equal to v (and logically speaking, u < v), everything works fine.
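For reference, the setup can be written out as below. This is a sketch in notation I’m assuming here: p_0 and p_3 are the two end points, p_1 and p_2 are the unknown control points, and f and g are the two known points sitting at parameters u and v.

```latex
\begin{aligned}
B(t) &= (1-t)^3 p_0 + 3(1-t)^2 t\, p_1 + 3(1-t)\, t^2 p_2 + t^3 p_3, \\
f &= B(u), \qquad g = B(v), \\
3(1-u)^2 u\, p_1 + 3(1-u)\, u^2 p_2 &= f - (1-u)^3 p_0 - u^3 p_3, \\
3(1-v)^2 v\, p_1 + 3(1-v)\, v^2 p_2 &= g - (1-v)^3 p_0 - v^3 p_3, \\
\det &= 9\, u\, v\, (1-u)(1-v)(v-u).
\end{aligned}
```

The determinant of that 2-by-2 system is non-zero whenever 0 < u, v < 1 and u is not equal to v, which is why those conditions are enough for p_1 and p_2 to come out uniquely.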

So the whole point of this article is this: u doesn’t have to be 1/3, and v doesn’t have to be 2/3. You are supposed to know or decide what value they take. Once you’ve decided, the other control points will be uniquely determined. Let me illustrate:

Same Bezier curve with different control points

Now the 2 Bezier curves are exactly the same (I would know, I copied and pasted them…). Suppose I define u and v such that f and g lie on certain points on one Bezier curve, and they lie on different other points on the other Bezier curve. What happens is that the control points p1 and p2 are different for the 2 Bezier curves, even though the curves are exactly the same!

Disclaimer: I haven’t worked out an example (other than the trivial case of a straight line) in which exactly the same Bezier curve is drawn with 2 different sets of control points. However, based on the math, I can say that the same 4 points can lie on Bezier curves drawn from 2 different sets of control points. The resulting curves might (actually they should) differ slightly, a twitch of a pixel here, a slight upward gradient there. But the 4 points would be exactly positioned as calculated. The illustration is meant to be a, what’s the word, sensational example. So there.

I can’t tell you what u and v are, although 1/3 and 2/3 should work fine. I gave you the theory and the solution. It’s up to you to decide how to use it. Depending on your context, you might decide on different values for u and v, which will influence how your control points are calculated.
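Here’s a minimal sketch of how the choice of u and v feeds into the control points. It works on one coordinate at a time (run it once each for x, y and z), and it assumes the standard cubic Bezier form B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3, with p0 and p3 as the end points. BezierControlPoints and Solve are hypothetical names, and the 2 equations are solved with plain Cramer’s rule.

```csharp
using System;

static class BezierControlPoints
{
    // Solves f = B(u) and g = B(v) for the inner control values p1 and p2,
    // for ONE coordinate. Returns { p1, p2 }.
    public static double[] Solve(double p0, double p3, double f, double g, double u, double v)
    {
        // Coefficients of p1 and p2, and the right-hand side, at parameter u.
        double a1 = 3 * (1 - u) * (1 - u) * u;
        double b1 = 3 * (1 - u) * u * u;
        double c1 = f - Math.Pow(1 - u, 3) * p0 - Math.Pow(u, 3) * p3;

        // The same at parameter v.
        double a2 = 3 * (1 - v) * (1 - v) * v;
        double b2 = 3 * (1 - v) * v * v;
        double c2 = g - Math.Pow(1 - v, 3) * p0 - Math.Pow(v, 3) * p3;

        // The determinant is zero when u equals v, or when u or v is 0 or 1.
        double det = a1 * b2 - a2 * b1;
        if (det == 0)
        {
            throw new ArgumentException("u and v must differ and lie strictly between 0 and 1.");
        }

        double p1 = (c1 * b2 - c2 * b1) / det;
        double p2 = (a1 * c2 - a2 * c1) / det;
        return new[] { p1, p2 };
    }

    static void Main()
    {
        // Same end points and same known values, but two different choices of (u, v).
        double[] first = Solve(0, 3, 1, 2, 1.0 / 3, 2.0 / 3);
        double[] second = Solve(0, 3, 1, 2, 0.25, 0.75);
        Console.WriteLine(first[0] + ", " + first[1]);
        Console.WriteLine(second[0] + ", " + second[1]);
    }
}
```

The 2 calls in Main use identical known values and only change u and v, yet they produce different p1 and p2. That’s the whole point: you have to decide on u and v before the control points mean anything.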

My original intent was to produce a camera path flying in 3D. I didn’t care about the “correctness” of its path, only that there is one. As such, u=1/3 and v=2/3 worked excellently for me.

You might find that 1/3 and 2/3 don’t work for you. That’s fine. u and v are variables. By definition, they’re not fixed. Choose whatever value works for you. Depending on context, you might even want to come up with a simple formula (based on your situation) to calculate u and v dynamically.

Minor irks between C# and VB.NET

It’s about the way you think about programming. This isn’t another debate on which language is better. Just noting the differences because of how I think. The first is…

Declaring variables

After I think through the logic I want, the first thing that comes to mind might be “I need an integer”. This works well:

int i;

This, not so much:

Dim i As Int32

In C#, the name of the variable is secondary, at least at the point when it’s created. I need an integer. I don’t really care what its name is (yet). Nor does the compiler.

In VB.NET, I have to come up with a name. And if my RPG days are any indication, I take a long time coming up with names. By the time I think up an appropriate name, I’ve forgotten what type it’s supposed to be.

It’s like the active-passive voice in English. “He ate the apple.” and “The apple was eaten.” Which do you want to focus on?

I might be wrong. Maybe VB views variables as containers for values, hence there’s no point in fixing the type at declaration (like JavaScript)? And VB.NET inherits the language structure.

Arrays

In C#, arrays are declared like so:

int[] ia = new int[5];

In VB.NET, they are declared like so:

Dim ia(5) As Int32

There’s a catch though. The array in C# has 5 elements. The one in VB.NET has 6.

Both languages treat arrays with zero-based indices. In VB.NET, the number used in declaring the size of the array means the last index of the array.

So if I wanted 5 elements, I should declare it as:

Dim ia(4) As Int32
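For comparison, here’s a small C# sketch of the same idea. FromUpperBound is a hypothetical helper I made up to mimic the VB.NET rule (the number is the last index, so the array gets one more element than that number).

```csharp
using System;

static class ArraySizes
{
    // Hypothetical helper mimicking VB.NET's rule: the argument is the LAST index.
    public static int[] FromUpperBound(int upperBound)
    {
        return new int[upperBound + 1];
    }

    static void Main()
    {
        int[] csharpStyle = new int[5];        // C#: 5 is the element count
        int[] vbStyle = FromUpperBound(5);     // like Dim ia(5): indices 0 to 5

        Console.WriteLine(csharpStyle.Length); // 5
        Console.WriteLine(vbStyle.Length);     // 6
    }
}
```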

Ok, I guess my frustration has run its course…

What do you think of when declaring variables?

I’m interested in what goes through your mind, even before the declaration code is written. The moment you decided you need a variable, what do you think of? Let me give an example comparing C# and VB.NET code.

Note: This is not another language war incitement. This is an article on self-discovery and self-understanding.

In C#:

int iNumberOfApples;

In VB.NET:

Dim iNumberOfApples As Int32

Personally, the moment I decided I need a variable to keep track of the number of apples, I think “I need an integer”. Sometimes, before I’ve even thought through the whole thing, this appears in my code editor:

int

Then I decide on what name to give. When I play RPGs, one of the hardest parts is giving my hero a name. I’ve been known to falter for 10 whole minutes before giving that scrawny protagonist something to call himself. This quaint custom of mine carries over into my variable naming, though I’ve had much success in cutting the time spent to seconds instead of minutes now.

VB.NET, on the other hand, makes me write this first:

Dim

Then I have to think of a name.

Dim iNumberOfApples

By this time, I can’t even remember what I wanted to use the variable for… oh yeah, it’s an integer.

In this respect, I prefer C# over VB.NET. Other than that, I write the two languages in similar ways. Yes, I am aware that JavaScript variables are typeless when declared. My point is that I can write the code for saying “I want a variable” first.

// Javascript
var iNumberOfApples;

I haven’t had much contact with other languages (note to self: go read some). For example, I don’t know what Ruby looks like… oh wait, here it is. It’s one of those languages where you can use a variable immediately without declaring it, right?

So what do you think about at the moment you decide a variable is needed?

Beginning C# – Variables and operations

So you’re sick of Hello World program code, and want to start doing something. Before you go writing your NASA-approved quantum space engine or the next blockbuster financial application, you have to know how programs store and manipulate data.

Program data (such as rocket speed or stock prices) are stored in variables, which can be thought of as boxes holding information. We remember information differently from a computer. For example, the information “15” can be a number to us. Or part of an address such as “street 15”. Or 15 dollars. We can pretty much move this data around in our heads.

Here’s the thing: computers need structure. They must know that 15 is a number, and will only store 15 as a number. If 15 is to be part of an address, it must be stored as an address. Moving data from one form to another usually requires telling the computer explicitly to do so, which means you have to write specific code to do that. It’s easier to talk to a computer in its own terms, so we’ll learn about …

Variable types
There are a few variable types, and the most commonly used are those storing numbers and text. There are 2 kinds of number variables: whole numbers and numbers with decimal points (also known as real numbers). In C#, the whole number variables are byte, short and int. The other kind has float, double and decimal. There are more, but these are the common ones.

Why is there a difference? Because computers think about, store and manipulate them differently. To the computer, a 7 and a 7.00 are two very different pieces of information. This is very important, especially if you’re doing mathematical calculations.
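If you want to see that difference for yourself, here’s a small sketch (variable declarations and the rest of the syntax are covered further down, so don’t worry about the details yet). TypeNameOf is a hypothetical helper that just asks a value what type it is.

```csharp
using System;

static class WholeVersusReal
{
    // Hypothetical helper: returns the short .NET type name of whatever is passed in.
    public static string TypeNameOf(object value)
    {
        return value.GetType().Name;
    }

    static void Main()
    {
        Console.WriteLine(TypeNameOf(7));     // Int32
        Console.WriteLine(TypeNameOf(7.00));  // Double
        Console.WriteLine(sizeof(int));       // 4 (bytes)
        Console.WriteLine(sizeof(double));    // 8 (bytes)
    }
}
```

Same “seven” to us, but the computer gives them different types and even different amounts of storage.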

So we’ll go through them a little bit:

  • byte – a whole number between 0 and 255 inclusive
  • short – a whole number between -32768 and 32767 inclusive
  • int – a whole number between -2147483648 and 2147483647 inclusive
  • float – a real number approximately between ±10^-45 and ±10^38
  • double – a real number approximately between ±10^-323 and ±10^308
  • decimal – a real number approximately between ±10^-28 and ±10^28
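You don’t have to memorise those limits. Each of these types exposes MinValue and MaxValue constants, so a short sketch like the one below can print them. Describe is a hypothetical helper added only to format the output, and I’ve left out the expected text for the real-number types because the exact digits printed can vary.

```csharp
using System;

static class TypeLimits
{
    // Hypothetical helper: formats one line of output for a type's range.
    public static string Describe(string name, object min, object max)
    {
        return name + ": " + min + " to " + max;
    }

    static void Main()
    {
        Console.WriteLine(Describe("byte", byte.MinValue, byte.MaxValue));     // byte: 0 to 255
        Console.WriteLine(Describe("short", short.MinValue, short.MaxValue)); // short: -32768 to 32767
        Console.WriteLine(Describe("int", int.MinValue, int.MaxValue));       // int: -2147483648 to 2147483647

        // Largest magnitudes of the real-number types.
        Console.WriteLine(Describe("float", float.MinValue, float.MaxValue));
        Console.WriteLine(Describe("double", double.MinValue, double.MaxValue));
        Console.WriteLine(Describe("decimal", decimal.MinValue, decimal.MaxValue));
    }
}
```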

Why are the real numbers approximations? Again it’s due to the way computers store data. The float and double variable types implement the IEEE 754 floating-point standard. For now, we’ll simply learn more about these variable types through practical use. Which brings us to the sample code.

using System;
using System.Collections.Generic;
using System.Text;

namespace VariablesAndOperations
{
    class Program
    {
        static void Main(string[] args)
        {
            int number;
            double anothernumber, yetanothernumber;

            number = 5;
            anothernumber = 3.14159;
            yetanothernumber = 2 * anothernumber;
            
            Console.WriteLine(number);
            Console.WriteLine(anothernumber);
            Console.WriteLine(yetanothernumber);

            number = 3 + 8;
            Console.WriteLine(number);

            // This indicates the end of the program

            Console.WriteLine("End of program");
            Console.ReadLine();
        }
    }
}

A closer look
The first two lines inside the Main function are

int number;
double anothernumber, yetanothernumber;

This is how we declare our variables. In this case, we tell the computer we want a variable of type int and we name it “number”. We also declare two variables of type double, named (unimaginatively) “anothernumber” and “yetanothernumber”. Variables of the same type can be declared on the same line by separating them with a comma.

The next three lines are

number = 5;
anothernumber = 3.14159;
yetanothernumber = 2 * anothernumber;

This is where we assign values to our variables. So we want the integer 5 to be stored in “number”. Then we want 3.14159 to be stored in “anothernumber”. The next line is more interesting. We want two times the value stored in “anothernumber” to be stored in “yetanothernumber”. This means we can assign any appropriate value to a variable, even if the value is stored in another variable.

Manipulating data basically involves mathematical operations (although there are others). They are addition, subtraction, multiplication and division. Their respective operators in code are +, -, * and /.
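Here’s a short sketch of the four operators at work on two double variables. The names a, b and Average are made up for illustration. Average also shows that you can combine operators and use parentheses to control the order, just like in math class.

```csharp
using System;

static class BasicOperations
{
    // Hypothetical helper: adds first (because of the parentheses), then divides.
    public static double Average(double a, double b)
    {
        return (a + b) / 2;
    }

    static void Main()
    {
        double a = 7.5;
        double b = 2.5;

        Console.WriteLine(a + b);          // addition, value is 10
        Console.WriteLine(a - b);          // subtraction, value is 5
        Console.WriteLine(a * b);          // multiplication, value is 18.75
        Console.WriteLine(a / b);          // division, value is 3
        Console.WriteLine(Average(a, b));  // value is 5
    }
}
```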

Commenting
Skipping a couple lines down, we encounter the double forward slash //. This simply tells the computer that whatever follows after the // on the same line is to be ignored, because it’s meant to be read by a human.

Additional note
If you’ve tried running the previous Hello World program straight from the directory by double clicking, you’ll find that it comes up and then disappears. Which is what the following lines will prevent:

Console.WriteLine("End of program");
Console.ReadLine();

By printing “End of program”, you’ll know that the program has indeed reached the end. The Console.ReadLine(); tells the computer to wait for input till the Enter/Return key is hit. This automatically solves the disappearing problem.

Here’s the source code for you to play with.

Homework
Try assigning

number = 1 / 4;
anothernumber = 1.0 / 4.0;

and print out the answer! Can you figure out why the values aren’t what you think they should be?