Tax season and SpreadsheetLight deal

It’s currently tax season. Making financial and tax reports can be taxing (haha!). I know, because I’ve spent years writing software that creates Excel reports for sales, revenue, debt and other financial reports.

So from now till the end 15th of April 2012, I’m offering my spreadsheet software library at the unbelievably low cost of USD 199 150. SpreadsheetLight runs on .NET Framework and Open XML SDK, is written in C#, uses the MIT License, and is designed to be easy to integrate into existing software projects with a minimum of fuss.

UPDATE: The promotion will be until 16th April 2012, and I’m selling it at USD 150. Need a spreadsheet library software? Get it now before I raise the price.

The reason is because I want you to have an easier time writing your programs. My experience in churning out Excel reports tells me it can be frustrating at times…

So check out SpreadsheetLight. Or get your manager to take a look.

Launching SpreadsheetLight

I am excited to tell you that my spreadsheet software library is available!

SpreadsheetLight

For the initial launch version (I decided to go for version 1. Why do people launch with versions 0.8? I don’t know…), you get comprehensive support for styles, rich text formatting, 47 named cell styles, themes (either one of the 20 built-in themes or create-your-own). Well, like I said, comprehensive styling support.

There’s also the (hum-drum) support for merging cells and freezing panes. I actually explored how to split panes. I certainly wrote about it in my Open XML guide, but it turns out that there’s a tiny rectangle at the top-left corner that Microsoft Excel didn’t tell me about. The size of that tiny rectangle is dependent on the font you use, and even the screen resolution of your computer screen.

While I could add a function that allows you to just input the size in EMUs (English Metric Units), I decided that if I can’t do it well, I don’t want to do it. At least for the initial launch.

Row heights and column widths were also big time drains. It turns out that they’re also dependent on the font and screen resolution of your computer screen. I was trying to calculate the standard row heights and column widths for the fonts in the built-in themes, and I thought I had them. I wrote a program using SpreadsheetLight to generate spreadsheets with different minor fonts, and I wrote a program to read in those spreadsheets and get the “standard” row height and column width. I spent 3 hours collecting data.

Then on a whim, I switched my computer screen’s resolution from 120 DPI to 96 DPI (my eyesight’s not that good ok? I need bigger text…), and whoa! All that data doesn’t apply anymore… All in all, I think I spent 6 or 7 days trying to figure out a general calculation formula. I failed. I don’t know how Excel does it.

I also surprised myself by including extensive support for pictures. I thought you just insert a picture into the worksheet and that’s it! It turns out there’s a ton of post-insertion manipulations you can do. For example, if your picture has transparent areas, you could set a background fill, and that background colour will be visible through the picture. Also, you can rotate the picture in 3D.

SpreadsheetLight is licensed under the MIT license. I decided to use one of the available software licenses instead of making up one of my own. As far as I can tell, the MIT license allows the recipient of the software to use the software in personal or commercial products. It’s also categorised as free software, as in freedom of use, not free as in cost. I don’t want to deal with per-client, or per-server, or per-developer or per-what-not licensing restrictions.

Even if you’re not interested in spreadsheet software, have a look at SpreadsheetLight. Tell some other programmer about it. Tell your manager about it. It took me slightly over 2 months of intense coding, and I want someone in the world out there to have an easier life because of SpreadsheetLight. Thanks!

Oh, and the image art is designed by Charlie Pabst from Charfish Design. While I have a fair competence in image work, I decided to get a professional designer to help me. It’s a business and professional product. I’m not going to risk the product’s success so I could stoke my ego…

SpreadsheetLight gradient fill function

I’m fascinated by gradient fills in a spreadsheet. More specifically, why would anyone want to have a cell with gradient colours? Is a standard block colour fill not enough? Is a texture image fill not enough? I guess this comes down to the visual aspect. Humans like to look at pretty colours. Especially if you have to stare at financial figures in a spreadsheet for hours.

So, that spreadsheet library I’m working on? It can do this:
Gradient fills in SpreadsheetLight

The code to do that is

SLDocument sl = new SLDocument(SLThemeTypeValues.Flow);

SLStyle style = new SLStyle();
style.Fill.SetCustomGradient(GradientValues.Linear, 45, null, null, null, null);
style.Fill.AppendGradientStop(0, SLThemeColorIndexValues.Light2Color);
style.Fill.AppendGradientStop(0.2, System.Drawing.Color.Red);
style.Fill.AppendGradientStop(0.4, System.Drawing.Color.Green);
style.Fill.AppendGradientStop(0.6, System.Drawing.Color.Blue);
style.Fill.AppendGradientStop(0.8, System.Drawing.Color.Yellow);
style.Fill.AppendGradientStop(1, SLThemeColorIndexValues.Accent1Color, 0.5);

sl.SetCellValue(2, 3, "Custom gradient function");
sl.SetCellStyle(2, 2, style);

style = new SLStyle();
style.Fill.SetGradient(SLGradientShadingStyleValues.DiagonalDown2, SLThemeColorIndexValues.Accent2Color, SLThemeColorIndexValues.Accent6Color);

sl.SetCellValue(4, 3, "Built-in gradient function");
sl.SetCellStyle(4, 2, style);

sl.SetColumnWidth(2, 24);
sl.SetRowHeight(2, 108);
sl.SetRowHeight(4, 108);

sl.SaveAs("GradientFill.xlsx");

The gradient stops are positioned from 0.0 to 1.0. The “built-in” functions (simulating Excel) allow you to specify only 2 colours, even though you can have more.

You will notice that the library allows you to use both theme colours and System.Drawing.Color’s. You can even specify a tint modifier (as seen in the last gradient stop), which range from -1.0 to 1.0 (-1.0 being completely dark, and 1.0 being completely white).

You might also notice that you don’t need to declare many variables from the library. For most of your work, you just need to know SLDocument class (which handles most of the spreadsheet’s functions), and the SLStyle class (which handles all your styling needs). Most of the functions are overloaded, which is why the functions are squeezed into fewer classes.

Here’s my rationale: I walk into a party. I don’t really know anyone. I find one person that I recognise. Probably the host. Then I let the host introduce me to everything. Who the interesting people are. Where’s the food. Where’s the washroom. Look, I don’t mind meeting people in the party, but I’m not really into that particular party. I just want to mingle a little so I can tell my friend that yes, I was at the party. Mission accomplished…

Then I go to that other party that I really wanted to go. (no offense to the host of the first party. “None taken.” Aww, isn’t he a nice guy?).

I don’t want to burden you with yet another software library to learn. So I’ve made it easy. 2 classes for most of your needs. If you’re using one of them intelligent code editing software, you’d get auto-completion too. Exploring what else a class can do for you is just a “.” away.

Yes, I’m finishing the library. It’ll be ready soon, ok? Just a couple of features more, and some testing, and I’ll launch version 1 of the product. I don’t give a flying fishball about eternal software betas. (Just launch already, dammit! Stupid software startups…)

Working on spreadsheet software library

As a natural and logical extension from my Open XML spreadsheet guide, I’m writing a software library to create and manipulate Open XML spreadsheets. (Never mind that decompiler project I was working on… 2 months of coding… sunk cost… moving on…).

I did some research (ok, an inordinate amount of research…) on the available spreadsheet software libraries out there, both free and commercial, both supporting Open XML (or .xlsx in any case) and the old .xls (Microsoft Excel in binary). I have 2 observations.

First, there’s a plethora of classes in the library. It’s sort of expected. There’s support for a lot of functionality, and it just burgeoned into many classes. Personally, I hate it when I have to learn a new library. There’s a whole bunch of documentation and classes I have to read up on and experiment to just do a simple thing (printing a string of characters is the first thing I try). When I first encountered the .NET Framework, I was crushed. It’s redeeming feature was its extensive documentation, which made learning easier.

Second, even though there’s support for a lot of functionality, it still takes quite a bit of code to accomplish what you want done (granted, much less than if you wrote low level code). Hey I wrote a guide on Open XML spreadsheets, I know how many lines of code needed to just create an empty Excel file, ok?

But these are spreadsheet software libraries!. They’re supposed to make your life easier. In fact, much easier.

I read that when the iPhone was designed, the engineers told Steve Jobs that it needed to have 4 or 5 buttons. Steve Jobs said no. One button (to rule them all). The iPhone now only has the 1 button.

So I took inspiration from that and designed my library to have that quality. Alright, alright, here’s a code sample:

SLDocument sl = new SLDocument();
sl.Save();

That will save an empty Excel file named “Book1.xlsx”. What, not Hello World enough for you?

SLDocument sl = new SLDocument();
sl.SaveAs("HelloWorld.xlsx");

There. Now the file is named “HelloWorld.xlsx”. What, sheet name? Most (if not all) of the libraries I researched required you to add a new worksheet to an empty file. All spreadsheets have at least one worksheet. Why force the programmer to do it anyway? You don’t see Microsoft Excel forcing the user to add worksheets in a newly created spreadsheet file, right? (Excel even has 3 worksheets added by default).

Alright, fine. The first worksheet’s name is by default “Sheet1”. You can rename it.

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SaveAs("HelloWorld.xlsx");

There, happy? So, how do we set cell values?

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

The cell with row 2, column 3 will have the value of PI. The cell with row 2, column 4 will have the string “This is PI”. Yes, the library supports cell references such as “C2” and “D2”. My opinion? They make better sense to a user with visual interface to the spreadsheet. It’s much harder to use when you’re programming with a non-visual interface to the spreadsheet. Good luck iterating through rows 2 to 500,000, with columns 1 to 1000 (financial reports, I’m looking at you…).

Want to add a new worksheet?

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.AddWorksheet("SecondWorksheet");
sl.SetCellValue(5, 5, "Why am I not first?");
sl.SaveAs("HelloWorld.xlsx");

Hey, a software library is supposed to make your life easy. The second worksheet’s name is *drum roll*, “SecondWorksheet”. The string “Why am I not first?” is in row 5, column 5 of the newly added worksheet. How does the library know which worksheet to add which cell value? By magic. Ok, fine, it automatically keeps track of worksheets.

When a user enters a cell value in Excel, does the user need to know which worksheet? No, because that information is implied. The user knows which worksheet because the user chose it already. And so does this software library.

Oh yeah, I even have basic theme support!

SLDocument sl = new SLDocument(SLThemeTypeValues.Flow);
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

That gives you the Flow theme, one of the built-in themes in Microsoft Excel (note: only the fonts and font colours are supported). You can even design your own custom theme.

System.Drawing.Color[] clrs = new System.Drawing.Color[12];
clrs[0] = System.Drawing.Color.White;
clrs[1] = System.Drawing.Color.Black;
clrs[2] = System.Drawing.Color.WhiteSmoke;
clrs[3] = System.Drawing.Color.DarkSlateGray;
clrs[4] = System.Drawing.Color.DarkRed;
clrs[5] = System.Drawing.Color.OrangeRed;
clrs[6] = System.Drawing.Color.DarkGoldenrod;
clrs[7] = System.Drawing.Color.DarkOliveGreen;
clrs[8] = System.Drawing.Color.Navy;
clrs[9] = System.Drawing.Color.Indigo;
clrs[10] = System.Drawing.Color.SkyBlue;
clrs[11] = System.Drawing.Color.MediumPurple;

SLDocument sl = new SLDocument("ColourWheel", "Castellar", "Harrington", clrs);
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

There are 12 colours you need to define. These correspond to the 2 light colours, 2 dark colours, 6 accent colours, the hyperlink colour and the followed hyperlink colour. “ColourWheel” is the theme name, “Castellar” is the major Latin font and “Harrington” is the minor Latin font. The major Latin font is used when you apply the Title named cell style. The minor Latin font is basically the body font.

What named cell style? A customer suggested supporting the feature.

Named cell styles

So how do you apply it?

sl.ApplyNamedCellStyle(2, 3, SLNamedCellStyleValues.Good);
sl.ApplyNamedCellStyle(2, 4, SLNamedCellStyleValues.Accent1);

I am finishing up version 1 of the library, and it will soon be available. I’m targeting a launch in January 2012. The software library will be called SpreadsheetLight. The primary idea is for it to be simple and clean. Simple for you to use, and you write clean code when you use it.

My internal tests show that SpreadsheetLight runs faster than 2 other free libraries. I won’t tell you which 2, because it’s not relevant, and because I’m automatically biased, and because it’s just not nice to the other programmers who contributed to those 2 libraries. The point is that it runs fast and is effective, which I’m happy about.

Multi-use variables or multiple variables?

So I’ve been working on a software project of mine. I’ll tell you more about it soon enough, but for now, it’s enough to say that I’m writing source code that generates source code.

One thing I’ve noticed is variable declaration. There are 2 extremes.

One variable used multiple times

This is the memory-efficient version. If you need the use of an integer variable, you just declare one variable. For example,

int i;
i = DoSomething() + DoSomethingElse();
DoAlpha(i);
i = DoThis() + DoThat();
DoBeta(i);

That’s just for illustrative purposes. If you’ve written a fair amount of code, I’m sure you can think of better examples. Which are probably (and usually) more elaborate and lengthier.

The drawback to this is that the variable is temporary. As the code continues its execution, previous values stored in that variable are considered to be unimportant to future executions. That’s why the value can be discarded and the variable overwritten.

Multiple variables but one-off use

Then there’s the “declare as many variables as you can (or think you need)” method. For example,

int i1 = DoSomething();
int i2 = DoSomethingElse();
int i3 = DoThis();
int i4 = DoThat();

This has the advantage of keeping the variable values “alive” through that section of code. The drawback is that you use more memory, even if seemingly trivial. I mean, that’s like 12 more bytes of memory (assuming integers still take up 32 bits when you’re reading this). That hardly makes a dent in the computer’s memory space.

The hybrid

The above 2 are extreme cases. What happens when you write code is probably a hybrid, somewhere in between the 2 extremes. For example,

int iSubtotal;
int iTotal;
iSubtotal = DoSomething();
iTotal += iSubtotal;
iSubtotal = DoThis() + DoThat();
DoSomethingElse(iSubtotal);
iTotal += iSubtotal;

You know what you declared those variables for, so you have an idea how many “unique” variables you need. This have the benefits of using the least number of variables (sort of), balanced with keeping the least number of “live” variable values around.

So why am I talking about this?

Auto-generated source code cannot generate hybrids

When you’re writing code, you have one very important advantage: You have context. A program that generates source code, such as a decompiler, does not have that.

When you’re writing code, you make variable decisions such as naming, naming conventions, how many you need and so on.

A decompiler has difficulty making decisions like those, so it has to choose one of the extremes. Typically the multiple variables route, because that’s the safest. All a decompiler can do is detect that a variable is needed, and so writes out the variable declaration in the resulting source code. It cannot decide on whether this part of the code can reuse one of the variables it has already declared (or at least has difficulty doing so).

Ok, so the cat’s out of the bag. I’m writing a decompiler. That’s not exactly true but will suffice for now (I promise I’ll tell you more soon!).

Anyway, that’s what I discovered while working on my software project. I have decided to go the multi-use variable route, because of a human (and programmer) behaviour. A human programmer has difficulty holding on to many separate variables in his head.

When a section of code requires many variables, I tend to try to limit the number of variables I remember in my head. Maybe there’s a pattern. I might remember there’s fFinancialYear1 up to fFinancialYear7. I might decide to refactor the code such that I only need one fFinancialYear floating point variable (assuming the appended numeral makes sense, and not just laziness in naming). I might separate the code section into several sections, so each section has a limited number of variables.

Maybe that’s not how most programmers work, but I find it “friendlier” than having thisIsAnAwesomeClass1 through thisIsAnAwesomeClass20, and I can’t remember which awesome class does which. I tend to work with tighter variable names (where possible and logical), and write code that’s as tight in scope as possible. So the variable values can be discarded, which means I don’t have to keep track of whether that value is still needed, even if the computer doesn’t mind having to keep track of it.

So how do you write your code where variables are concerned?

Developer Stories

Paparazzi red carpet

I must be getting famous. I was contacted by a social media marketing manager at M80. His company is working with Microsoft to promote Visual Studio 2010. Ok, I think I’ve fulfilled the necessary disclosure requirements. And no, I don’t get anything out of this. And I doubt I’m really that famous, but give me a few seconds to savour my short-lived fame anyway.

Microsoft wants to get feedback, and I believe it doesn’t have to be about Visual Studio 2010 (or Visual Studio at all). They’ve created a YouTube channel called Developer Stories. And they want to know why you are a developer.

If you’re a video kind of person, please go ahead and upload a video of you telling your story. Forget about your biases and opinions about the companies involved. I believe a polymath programmer should be above that. If nothing else, I want to hear your story too.

Since I’ll stutter in front of a video camera, and thank goodness I don’t have one (which is just an excuse, since my bedroom is a lousy backdrop), I’ll have to tell you my story in blog post format.

Once upon a time…

Actually, I stumbled upon programming. Sure I joined the computer club when I was in junior college (about 17 or 18 years old), but I didn’t understand the point of all the arcane Pascal lines of code. When I got into university, I didn’t have the credits (nor background) for the computer science track. So I took up applied mathematics as a major, and computational science as a minor.

In my first semester of my freshman year, I took C programming (which was a requirement for the computational science minor). Variables, assignments, loops, algorithms. It was fun. I also couldn’t understand why some of my classmates had difficulty wrapping their heads around what I perceived as simple ideas. They couldn’t understand some programming concepts, and they had some trouble understanding how to apply and change and break down a problem into programmable, solvable parts. I mean, you calculate something, add it to a temporary variable, go to next iteration, calculate with different input, add to temporary variable, go to next iteration until done. That’s summation. What’s so hard?

It was then that I realised that some people just weren’t made for programming. I’m not saying I’m a genius at it. I may just have a knack for breaking down problems so a programmatic solution is possible. That’s what programming is, not the lines of code and algorithms and what-not. Well, I was good enough at it that I decided to upgrade my computational science minor to a major.

Programming is about solving problems

What’s so special about computational science? And what’s the difference between it and computer science? I’m not sure. Computational science is more about solving (scientific) problems using programming, rather than the programming itself. I’ll leave you to compare that with your understanding of what’s computer science.

And what do I mean by solving problems using programming? I drew Sierpinski triangles. I used Newton-Raphson method to find roots. I solved a gigantic set of 100 equations with a 100 by 100 matrix. And most of the problems were based in science or mathematics.

So my background is about breaking down problems and translating that into programmable parts. I didn’t learn about software development cycles, software management practices and all those complicated stuff. I was trained in the solving problems, not the meta stuff around it. I’m not saying those complicated stuff aren’t useful. Just be aware of what you’re doing.

Here’s a suggestion. Learn about your business processes and work flow. What does your company do? What does your company sell? Which industry? Because your value as a software developer goes up exponentially if you can solve a business problem, not that clever obfuscated one-liner of yours.

Don’t just be a programmer. Be a problem solver.

So after I graduated, even with a math background, I went for a software development career. I like solving problems, and programming is one method. This guy just about sums it up:

So that’s a summary of why I’m a software developer. And now for some free advertisement for Visual Studio. I like C# and Visual Studio. Probably because of my C background. There’s Intellisense, a well documented library of the .NET Framework’s functions, and… it feels “clean”. I’m not sure how to explain that to you. I’m a simple man. I don’t need a lot of what is called developer productivity tools. Maybe I haven’t a problem to solve that requires them.

[update]
And I only have the Visual Studio Express version, not the paid one (but I bought VS2005 way back if that counts). The professional version’s a little steep in price, you know, considering my recent foray into entrepreneurship.

So what’s your story? Tell me in a comment, a blog post, or a video response.

[image by Ad Hatcher. Videos taken from Developer Stories YouTube channel]

Fractal nature of computer programming

Have you ever felt like you are doing the same things over and over and over again? Your daily commute, your workday schedule or your programming pattern?

Fractional capacity dimension
MandelbrotFractals possess the distinctive property of self similarity. They look the same when you zoom in and when you zoom out. Take a small part of a fractal, and it looks similar to another part of the fractal.

So how does programming come into the picture?

A simple programming structure
When you write a program, there are typically 3 sections to it:

  • Declare variables
  • Perform logic
  • Obtain a result

In the program, you might write functions to help break up the program into smaller pieces. And in the function code, you

  • Declare variables
  • Perform logic
  • Obtain a result

See any pattern yet? How about programs calling other programs? The caller program declares some variables, calls another program to perform logic, and obtain a result from that callee program.

Self referential
There’s a special type of function whose programming logic involves calling itself again. They are called recursive functions. An example is the function for calculating factorials. The factorial of 6 is the product of all positive integers less than or equal to 6. Or 1*2*3*4*5*6 = 720. The recursive function will be f(n) = n * f(n-1), where f(1) = 1. In code, it looks like

public int Factorial(int x)
{
if (x == 1) return 1;
else return x * Factorial(x - 1);
}

It is programmatically inefficient, but it is concise. (Math is like that sometimes…)

Iterative process of software development
The very cycle of software development is itself self similar. You develop, test, debug, roll out and get feedback from your users. Then based on the feedback, you develop again, test again, debug again, roll out the next version and get more feedback.

Sometimes, it’s even shorter. You write some code for a small part, test it until you’re satisfied with it. Then you write some code for another part and test that until you’re satisfied.

Can you give more programming examples with fractal properties?