31 December, 2007 | Written by Vincent Tan

Empty strings

Alright, there’s already been some discussion on this here, and here, and here. Someone even made a chart of timings from someone else’s results. What’s the topic? Whether string.Empty or "" is better.

Generally speaking, the differences in speed and efficiency aren’t large enough to be a big deal. I haven’t had to code anything where this difference was crucial to my application’s efficiency.

The general consensus is to use the Length property to determine emptiness. I’ve finally decided to give it a whirl and run some tests. There were 5 tests suggested:

  • s == ""
  • s == string.Empty
  • s.Equals("")
  • s.Equals(string.Empty)
  • s.Length == 0

I’ll give you the code I used first:

// Fragment: place this inside a method such as Main, with "using System;" at the top of the file.
const int cnTries = 10;
const int cnIterations = 1000000000;
DateTime dtStart, dtEnd;
TimeSpan ts;
double fEmptyQuotes = 0, fShortQuotes = 0, fLongQuotes = 0;
double fEmptyEmpty = 0, fShortEmpty = 0, fLongEmpty = 0;
double fEmptyDotQuotes = 0, fShortDotQuotes = 0, fLongDotQuotes = 0;
double fEmptyDotEmpty = 0, fShortDotEmpty = 0, fLongDotEmpty = 0;
double fEmptyDotLength = 0, fShortDotLength = 0, fLongDotLength = 0;
int i, j;
string sEmpty = string.Empty;
string sShort = "This is a short string to test empty string comparison";
string sLong = "This is a long string to test the efficiency of comparing with empty strings, which means it has to be like, really long. And I'm starting to run out of useless things to say...";

for (j = 0; j < cnTries; ++j)
{
    // Test 1: s == ""
    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sEmpty == "") ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fEmptyQuotes += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sShort == "") ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fShortQuotes += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sLong == "") ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fLongQuotes += ts.TotalMilliseconds;

    // Test 2: s == string.Empty
    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sEmpty == string.Empty) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fEmptyEmpty += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sShort == string.Empty) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fShortEmpty += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sLong == string.Empty) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fLongEmpty += ts.TotalMilliseconds;

    // Test 3: s.Equals("")
    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sEmpty.Equals("")) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fEmptyDotQuotes += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sShort.Equals("")) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fShortDotQuotes += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sLong.Equals("")) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fLongDotQuotes += ts.TotalMilliseconds;

    // Test 4: s.Equals(string.Empty)
    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sEmpty.Equals(string.Empty)) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fEmptyDotEmpty += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sShort.Equals(string.Empty)) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fShortDotEmpty += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sLong.Equals(string.Empty)) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fLongDotEmpty += ts.TotalMilliseconds;

    // Test 5: s.Length == 0
    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sEmpty.Length == 0) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fEmptyDotLength += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sShort.Length == 0) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fShortDotLength += ts.TotalMilliseconds;

    dtStart = DateTime.Now;
    for (i = 0; i < cnIterations; ++i)
    {
        if (sLong.Length == 0) ;
    }
    dtEnd = DateTime.Now;
    ts = dtEnd - dtStart;
    fLongDotLength += ts.TotalMilliseconds;
}

Console.WriteLine("empty: {0}", fEmptyQuotes / (double)cnTries);
Console.WriteLine("short: {0}", fShortQuotes / (double)cnTries);
Console.WriteLine("long : {0}", fLongQuotes / (double)cnTries);
Console.WriteLine("empty: {0}", fEmptyEmpty / (double)cnTries);
Console.WriteLine("short: {0}", fShortEmpty / (double)cnTries);
Console.WriteLine("long : {0}", fLongEmpty / (double)cnTries);
Console.WriteLine("empty: {0}", fEmptyDotQuotes / (double)cnTries);
Console.WriteLine("short: {0}", fShortDotQuotes / (double)cnTries);
Console.WriteLine("long : {0}", fLongDotQuotes / (double)cnTries);
Console.WriteLine("empty: {0}", fEmptyDotEmpty / (double)cnTries);
Console.WriteLine("short: {0}", fShortDotEmpty / (double)cnTries);
Console.WriteLine("long : {0}", fLongDotEmpty / (double)cnTries);
Console.WriteLine("empty: {0}", fEmptyDotLength / (double)cnTries);
Console.WriteLine("short: {0}", fShortDotLength / (double)cnTries);
Console.WriteLine("long : {0}", fLongDotLength / (double)cnTries);

Basically I used an empty string, a short string and a long string to check against empty strings. I ran these 3 cases against the 5 tests 1 billion times for each test case. Then I ran the entire gamut of tests 10 times to get an average. The short string contained 54 characters. The long string contained 177 characters.

Running 1 billion times per test case gives a sufficiently long time period. Running 10 times and taking the average gives a reasonably stable result (DateTime.Now isn’t exactly an, uh, exact stopwatch).
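Since DateTime.Now is a coarse clock, here is a minimal sketch of the same kind of measurement using System.Diagnostics.Stopwatch instead. This is not the code that produced the results below. The class name StopwatchTiming and the helper TimeMs are made up for illustration, and calling the check through a delegate adds overhead of its own, so the absolute numbers won't be comparable with mine.

```csharp
using System;
using System.Diagnostics;

public static class StopwatchTiming
{
    // Hypothetical helper: runs `check` against `s` `iterations` times
    // and returns the elapsed time in milliseconds.
    public static double TimeMs(Func<string, bool> check, string s, int iterations)
    {
        int hits = 0; // use the result so the comparison is not trivially dead code
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
        {
            if (check(s)) ++hits;
        }
        sw.Stop();
        GC.KeepAlive(hits);
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Assumption: far fewer iterations than the 1 billion used in the post, for a quick run.
        const int iterations = 10000000;
        Console.WriteLine("s == \"\"       : {0:F4} ms", TimeMs(s => s == "", "", iterations));
        Console.WriteLine("s.Length == 0 : {0:F4} ms", TimeMs(s => s.Length == 0, "", iterations));
    }
}
```

Stopwatch uses the high-resolution performance counter where one is available, which is why it's usually preferred over subtracting two DateTime.Now values.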

Here are the results:

Checking with [s == ""] test.

  • Empty string, 10315.6250 milliseconds
  • Short string, 8307.8125 milliseconds
  • Long string, 8564.0625 milliseconds

Checking with [s == string.Empty] test.

  • Empty string, 3573.4375 milliseconds
  • Short string, 8307.8125 milliseconds
  • Long string, 8603.1250 milliseconds

Checking with [s.Equals("")] test.

  • Empty string, 9517.1875 milliseconds
  • Short string, 7537.5000 milliseconds
  • Long string, 7576.5625 milliseconds

Checking with [s.Equals(string.Empty)] test.

  • Empty string, 9540.6250 milliseconds
  • Short string, 7515.6250 milliseconds
  • Long string, 7607.8125 milliseconds

Checking with [s.Length == 0] test.

  • Empty string, 443.7500 milliseconds
  • Short string, 443.7500 milliseconds
  • Long string, 445.3125 milliseconds

The check with the Length property wins hands down.

Of course, if I stopped here, this post would be very boring. The reason usually cited for the slowness of checks (or manipulations) involving a string literal in double quotes is that an actual object of type string is created, and creating that object adds overhead.

I wouldn’t want to delve into the IL code and dismantle everything just so I could spot the exact portion that explains why the Length property is faster, or whether the compiler creates a constant representing an empty string.
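Short of reading IL, one rough way to poke at the "is a new object created" question is to ask the runtime whether the "" literal and string.Empty are the same object. This is only a sketch. The class name EmptyLiteralProbe is made up, and the answer to the reference check can depend on the compiler and runtime version, so I'm not promising any particular output for it.

```csharp
using System;

public static class EmptyLiteralProbe
{
    // True when the "" literal and string.Empty refer to the very same object.
    public static bool LiteralIsSameObjectAsEmpty()
    {
        return object.ReferenceEquals("", string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine("\"\" is the same object as string.Empty: {0}",
            LiteralIsSameObjectAsEmpty());
        // string.IsInterned returns null when the string is not in the intern pool.
        Console.WriteLine("string.Empty is interned: {0}",
            string.IsInterned(string.Empty) != null);
        // Value equality holds regardless of how the references turn out.
        Console.WriteLine("\"\" == string.Empty: {0}", "" == string.Empty);
    }
}
```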

Instead, I’ll analyse the results I got. If you look at the results, the two non-empty strings run with close timings. This tells me something: the length of the string doesn’t matter as long as it’s non-empty.

I also realised that, with the exception of the 5th test, comparing the empty string against an empty string gives timings that differ from comparing non-empty strings against an empty string. The empty string timings also vary between tests: the 1st test took 10315.6250 milliseconds while the 2nd test took 3573.4375 milliseconds. Yet in both tests, the non-empty string cases ran with similar timings (hovering around 8400 milliseconds)!

So my second realisation: empty strings are indeed treated differently from non-empty strings. This might seem obvious, but I thought it bears highlighting.

My third realisation is that the Equals function is generally faster than the == operator (the empty string case in test 2 is the exception). In the grand scheme of things, it probably doesn’t matter. It’s just another indicator that strings aren’t native types, so just because string supports the == operator doesn’t mean it’s comparable to native types such as integers.

Note that there aren’t any significant differences between tests 3 and 4, meaning tests using Equals("") and Equals(string.Empty) are practically equivalent. As to why tests 1 and 2 aren’t equally close to each other in the empty string case, I don’t know. My guess would be that the Equals function transforms the "" to string.Empty better than the == operator does.

My current coding practice is to use s.Equals("stringtocompare") for checking equality with non-empty strings, and s.Length == 0 for checking if s is an empty string. From my test results, my choice seems to be the most efficient.
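One caveat with the Length check: s.Length throws a NullReferenceException when s is null. Below is a small sketch of a null-safe version of the practice above. The class EmptyCheck and the method IsEmpty are names I made up for illustration. The framework's own string.IsNullOrEmpty is shown alongside, and note that it treats null as empty, which IsEmpty does not.

```csharp
using System;

public static class EmptyCheck
{
    // Hypothetical helper: true only for a non-null string of length 0.
    public static bool IsEmpty(string s)
    {
        return s != null && s.Length == 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsEmpty(""));                 // True
        Console.WriteLine(IsEmpty(" "));                // False (one space is not empty)
        Console.WriteLine(IsEmpty(null));               // False
        Console.WriteLine(string.IsNullOrEmpty(null));  // True
        Console.WriteLine("abc".Equals("abc"));         // True
    }
}
```

Which of the two you want depends on whether null should count as "empty" in your code.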

As for the use of the string.Empty constant, I actually have another reason, and that is maintainability. Let’s look at the following 2 cases:

string s = "";

and

string s = string.Empty;

Even though it’s more verbose, the latter makes it clearer that an empty string is assigned. I’ve debugged code where the error was because the original coder typed " " instead of "". Why subject your eyes to undue labour checking whether it’s a string with a space or an empty string?

I’m getting on in years, and my eyes aren’t what they used to be. So, have pity on me and your fellow programmers. Just use string.Empty, ok?

Have a happy New Year!

28 December, 2007 | Written by Vincent Tan

Service Oriented Architecture

I was talking with my friend about the direction of software development, and the topic of service oriented architecture (SOA) came up. Having never heard of this term before, I asked for some clarification. Our discussion led me to believe that it’s programming geared towards providing valued services, towards business bottom lines, towards promoting company products.

Boy was I wrong.

I understood it as all efforts centered around the business core services. From top management, to sales, to marketing, to customer service, to programming. Teams are built based on services, say a manager, a customer service staff and a programmer forming a customer facing unit.

When I did some research on Wikipedia to further my understanding, I found my grasp of the subject fatally wrong. I’ll leave you to read up on Wikipedia’s entry on service oriented architecture. There was a sinking gut feeling as I read the entry. I’ll tell you what it was later.

The first sentence already put me to sleep:

Service Oriented Architecture (SOA) is an architectural style that guides all aspects of creating and using business processes, packaged as services, throughout their lifecycle, as well as defining and provisioning the IT infrastructure that allows different applications to exchange data and participate in business processes loosely coupled from the operating systems and programming languages underlying those applications.

The second sentence gave a less soporific challenge:

SOA represents a model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create business applications.

After finishing the entire entry, I found it analogous to an N-tier application. There are user interface screens and business classes and possibly data access components. A new user interface screen is created by slapping on UI controls and running functions from existing business classes. If a new function or new business class is required, then it’s written.

The service oriented architecture simply creates new applications by stringing together existing applications (or services as they’re termed).
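To make "stringing together services" a bit more concrete, here is a toy sketch. Everything in it is hypothetical: the interfaces IInventoryService and IBillingService, their in-memory implementations and the OrderApplication class are made up for illustration. In a real SOA setup the services would sit behind network endpoints rather than being objects in the same process.

```csharp
using System;

// Hypothetical service contracts. The application only knows these interfaces.
public interface IInventoryService { bool Reserve(string item, int quantity); }
public interface IBillingService { decimal Charge(string customer, decimal amount); }

// In-memory stand-in for an inventory service with 5 items in stock.
public class InMemoryInventory : IInventoryService
{
    private int stock = 5;

    public bool Reserve(string item, int quantity)
    {
        if (quantity <= 0 || quantity > stock) return false;
        stock -= quantity;
        return true;
    }
}

// In-memory stand-in for a billing service that adds a flat 1.50 fee.
public class FlatFeeBilling : IBillingService
{
    public decimal Charge(string customer, decimal amount)
    {
        return amount + 1.50m;
    }
}

// The "new application": it only strings the two services together.
public class OrderApplication
{
    private readonly IInventoryService inventory;
    private readonly IBillingService billing;

    public OrderApplication(IInventoryService inventory, IBillingService billing)
    {
        this.inventory = inventory;
        this.billing = billing;
    }

    // Returns the total charged, or null when the stock cannot be reserved.
    public decimal? PlaceOrder(string customer, string item, int quantity, decimal unitPrice)
    {
        if (!inventory.Reserve(item, quantity)) return null;
        return billing.Charge(customer, unitPrice * quantity);
    }
}

public static class SoaSketch
{
    public static void Main()
    {
        var app = new OrderApplication(new InMemoryInventory(), new FlatFeeBilling());
        Console.WriteLine(app.PlaceOrder("alice", "notepad", 2, 3.00m).HasValue);  // True
        Console.WriteLine(app.PlaceOrder("bob", "notepad", 10, 3.00m).HasValue);   // False
    }
}
```

The point of the sketch is only that OrderApplication contains no inventory or billing logic of its own. Swapping in a different implementation of either interface changes the application without touching it.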

Now I have maintained and even coded entire web applications by myself. I know what it’s like to maintain a whole bunch of business classes and know every function quite intimately. And that’s just ensuring the code works cohesively within the web application. It’s hard, but with discipline, it becomes manageable.

But are we talking about inter-application functionality?

I have to admit that this model works well for certain businesses. Social media sites like Facebook and search engines certainly benefit from the publicly exposed API. They probably can survive only if their services are exposed and easily used.

So here’s my gut feeling: It sounds like a lot of work, and the whole thing seems ready to collapse. It also requires a lot of control management, as is seen by something further down the entry:

using a special software tool which contains an exhaustive list of all of the services, their characteristics, and a means to record the designer’s choices which the designer can manage and the software system can consume and use at run-time.

The concept sounds wonderful. String together a unique set of services in a specific order, and you get a brand new application. I’ve programmed for 5 years, and let me tell you, I’ve never seen an application that can be dismantled into independent and reusable parts for use in creating another application. Projects and requirements are sometimes extremely specific and often need customisation.

Of course, just because I’ve never seen it working doesn’t mean it doesn’t work (that’s a lot of negatives…). I’m just saying that if a company adopts the model, then someone at a high level has to refactor everything. And I’m not talking about code. Customer service applications, billing applications, inventory applications.

Everything has to be decomposed into independent parts and rewritten as a service. Then someone has to string them up back together to work exactly as the original applications. It’s a huge task if we’re talking about companies offering a diverse set of services and products.

Independent services require structure and well-defined limits and capabilities. It means having the courage to say no. And you know how sales people find it extremely difficult to say that word. There go the billing applications…

I think I prefer the definition my friend and I were discussing…

26 December, 2007 | Written by Vincent Tan

Office Christmas party

It’s the annual office Christmas eve party! I’ll be letting the pictures do most of the talking. First, when I reached the venue, this greeted me:
[Image: Christmas party prizes]
Helllloooo prizes! The one on the extreme right is the 1st prize, a Brother printer.

Then I went checking out the place. Cupcakes!
[Image: Merry Christmas cupcakes]
Of course, you weren’t looking at the cupcakes, right? You were eyeing the booze, right?
[Image: Wines]

Next I moved on to the turkey.
[Image: Turkey dish]
The other dishes were covered up, so I moved on to the view.
[Image: High view]

I spotted some people playing with this contraption.
[Image: Roulette ball spinner]
And no, it’s not used for the lucky draw. My guess was that it’s used to draw lots in a matching game, where they ask us for stuff and we’re supposed to come up with matching items. I realised how hard it was to find 2 mobile phones without camera functionality…

Just when I thought the only subjects willing to be photographed were inanimate, my colleague offered to be captured in bytes.
[Image: My colleagues]
He also pulled in a friend. The one on the left’s An Le. The other one’s the brave soul Ming Chun. I was given explicit instructions to capture some of the background too.

So, how’s the lucky draw done? You see those balloons?
[Image: Prize balloons]
One of them had my name in it. The organisers stuffed slips of paper with our names into the balloons. The fortunate ones would have their balloons pricked and names called out. The lucky draw would be conducted amidst exploding showers of balloon rubber, much clapping and hand shaking, and prize awarding.

And I won a notepad! Cool. I was running out of note paper at home…

24 December, 2007 | Written by Vincent Tan

Crazy sales period

For the past couple of weeks, I’ve noticed many stores slashing their prices for the Christmas period. People just flock over, regardless of whether the discount is 10 percent or 70 percent. Somehow, getting a 10 percent discount feels good, even if the absolute amount saved is little.

Once, I heard someone say that Singapore was crazy. First there are the Christmas sales, then the Chinese Lunar New Year sales, then mid-year there’s the Great Singapore Sale. And then it starts all over again with the Christmas sales.

My reasoning? It’s probably why the Singapore economy remained relatively stable. People have to buy stuff to move the economy.

Anyway, have a Merry Christmas!

21 December, 2007 | Written by Vincent Tan

Best of 2007

It’s near the end of the year, and it seems fitting to look back, to review. So I’m going to do a short recap, consolidating the best of what I’ve learnt or written.

In artificial intelligence

I wrote something about using multiple senses in developing games. It was something I tried out during my university days, and I joined an online team collaboration. The article was featured in AiGameDev. It eventually got Stumbled. Cool.

From Harry Potter

Learnt to ignore debilitating remarks from others. How else am I going to improve myself, if I keep trying to satisfy everyone else?

From embracing flat world syndrome

I tried outsourcing my blog design with disastrous results. Despite the horrendous experience, I still think outsourcing is ok. I just happened to have met the wrong service provider.

From working with dates and times

My work in .NET development often requires me to manipulate date and time information to and from databases in code. I spent a lot of time poring over the MSDN documentation about format descriptors… Get a head start by reading my article.

In Bezier curves

I did a lot of homework regarding Bezier curves and surfaces in university. When I was developing my own camera class for use in a game, I wanted to trace an easy curve in 3D. I wanted to be able to define 4 points and the camera would follow a Bezier curve defined by those 4 points.

The problem? Bezier curves don’t pass through the middle 2 points out of the defined 4 points. So I came up with an algorithm to calculate the 2 middle control points so that the curve does pass through all 4 defined points.
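For the curious, here is one common way to get such a curve, as a sketch and not necessarily the algorithm from my article. It assumes the curve should hit the 4 given points at t = 0, 1/3, 2/3 and 1 (that parameter choice is an assumption), and then solves the cubic Bezier equation for the 2 inner control points. The class and method names are made up, and the code handles one coordinate at a time, so you would run it once each for x, y and z.

```csharp
using System;

public static class BezierThroughPoints
{
    // Evaluates one coordinate of a cubic Bezier curve with control values p0..p3 at t in [0, 1].
    public static double Evaluate(double p0, double p1, double p2, double p3, double t)
    {
        double u = 1.0 - t;
        return u * u * u * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t * p3;
    }

    // Given the values q0..q3 that the curve should pass through at t = 0, 1/3, 2/3 and 1,
    // computes the inner control values p1 and p2 (the end controls are q0 and q3 themselves).
    public static void SolveInner(double q0, double q1, double q2, double q3,
                                  out double p1, out double p2)
    {
        p1 = (-5 * q0 + 18 * q1 - 9 * q2 + 2 * q3) / 6.0;
        p2 = (2 * q0 - 9 * q1 + 18 * q2 - 5 * q3) / 6.0;
    }

    public static void Main()
    {
        // Example values for a single coordinate of the 4 points to pass through.
        double q0 = 0, q1 = 1, q2 = 3, q3 = 4;
        double p1, p2;
        SolveInner(q0, q1, q2, q3, out p1, out p2);

        // The curve through controls (q0, p1, p2, q3) should reproduce q1 and q2
        // up to floating point rounding.
        Console.WriteLine(Evaluate(q0, p1, p2, q3, 1.0 / 3.0));
        Console.WriteLine(Evaluate(q0, p1, p2, q3, 2.0 / 3.0));
    }
}
```

The two formulas in SolveInner come from writing out the curve equation at t = 1/3 and t = 2/3 and solving the resulting pair of linear equations for p1 and p2.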

Ending 2007…

It is the holiday season (check out my office decorations). So go enjoy your holidays. And keep learning!
