Blind libraries

I have discovered that the software I’m working on right now falls into a particularly interesting category.

Pre-emptive note: I mean no disrespect for blind people or people with visual problems.

There are software that are basically machine-machine. These software programs talk almost exclusively with other programs with nary a human interaction. Stock software, scheduled financial server programs, batch data upload programs.

Then there are software that’s basically made for human interaction. Facebook, Twitter, mobile map software, image/video editing software.

My software sits somewhere between them. I’m going to use Microsoft Excel as the example because that’s what my software is related to.

Microsoft Excel is essentially a visual software. Sure you can enter data into cells and the spreadsheet is basically rows and cells of data. But it’s geared for a human to understand that data, and to interact with that data. Excel merely exists to facilitate that interaction.

Short digression: I was reading up on Excel user manuals (yes, I do that) and it turns out that the earliest spreadsheet software were effectively command-line. We’ve come a long way since then, huh?

My software is a spreadsheet library/component. Basically it allows a program to create and manipulate spreadsheets without any user interaction. More specifically, without any visual interaction.

That monthly revenue report you need? Have the data uploaded first (probably the machine-machine type of software doing this). Then use a spreadsheet library to create that report, styling rows/columns/cells or add a chart.

And then the final product, the spreadsheet itself is handed over to a human.

I want you to think about this for a second. The spreadsheet library is working blind.

Teaching the blind to paint

Imagine teaching a blind person to paint a tree.

The blind person has no idea what green or brown looks like. He doesn’t even know what a tree looks like. But you tell the blind person that this container has green pigment and that container has brown pigment.

Then you teach the blind person to hold the paintbrush this way, and apply the green pigment like that, and to have the paintbrush come into contact with the canvas in this way. And with practice, leaves start to appear. Or some semblance of it. You do the same with the tree trunks and branches, using the brown pigment.

The blind person still has no idea what an actual tree looks like, so he doesn’t even know if what he painted is accurate. But he finishes his painting and hands that painting off to someone who can actually judge the painting.

That’s what I feel my software is doing. This is especially true when I have to design function interfaces that allow a developer to do visual things.

Like insert an image.

Oh I can let you insert an image. No problem. Give me a file name.

Where do I position it? Oh.

With respect to what on the screen? Oh yeah, what screen?

How do I know if it’s large enough? Oh yeah, can’t see the image.

Some things are simpler to understand. Like “I want that cell to have a purple background.” I’ll let you assign a System.Drawing.Color structure, or assign a hexadecimal value, or a theme colour.

Keep in mind that the software doesn’t care if you assign #00FFFF or #FFFF00.

So that’s my experience with my spreadsheet library. I’m working on a word processing library right now, which is even more insane.

While spreadsheets have styles, word processing documents live on styles. Word processing software like Microsoft Word or LibreOffice Writer will be useless if they don’t allow the user to bold this word or italicise that sentence or underline that heading.

Final thought

Microsoft Office is about USD 300+, which is a visual software. The “blind libraries” commercial software cost about USD 999. And that’s the low end price.

Did you know that mine is open source and free? Check out SpreadsheetLight (there we go, self-promotion).

Action by reference

The son hits his finger and the father feels the pain, regardless of where the father is (or possibly when the father was/is… uh, what?).

But that’s the image I get from the source code I’ve been reading lately. And I’ve been reading a lot of source code (it’s for research). Here’s an example:

TheFather father = new TheFather();
TheSon son = father.Son;

father.FeelsPain = false;

son.HitFinger();

// I bet you nuggets to donuts that father.FeelsPain is now true
if (father.FeelsPain) Console.WriteLine("It hurts!");
else Console.WriteLine("Nahh, no biggie!");

Is there some coding paradigm I don’t know about?

Basically you have a parent class, and you get some child property/class from it. Then you manipulate the child property/class, changing some properties perhaps, or calling some of its functions.

And then the parent class automagically has the changes from its child property/class updated.

That’s the equivalent of NASA sending Curiosity to Mars and NASA automagically gets real-time footage from its rover. The rover doesn’t need to send the updates back home at all.

In the programming realm, this is similar to passing variables by reference instead of by value. Done within known and well-defined scope, this can be useful. Otherwise, they’re equivalent to global variables.

So what’s the background?

Reading the source code of other open source spreadsheet libraries (ahem), and using these libraries, I found this to be common:

var worksheet = workbook.Worksheets["Sheet1"];
// do something with worksheet

Then the workbook automagically have the changes on the worksheet updated into itself.

Then I went to read the example source code of commercial spreadsheet and word processing libraries (ahem), and this behaviour still held true.

Perhaps this makes sense to you. Personally, I find it confusing. I sent my son to get donuts. He seemed to have bought the donuts. But he didn’t report back. Should I call him? Shouldn’t he be home by now? Where are the donuts? What’s going on?

Now I’m hungry…

ToString() and cultural insensitivity

It was a scramble going through my entire code base. We’re talking thousands and thousands of lines of code here.

Luckily there was the Find function. And that the search string was fairly unique. And that the code change was fairly contained.

So what was I getting all hyped up about? Someone submitted a bug report telling me the SetRowHeight() function of my spreadsheet library didn’t work. What?!?! I tested that thing and it worked fine!

“What do you mean by it didn’t work? Was the row height changed to a wrong value, or something else” I asked.

“It completely failed. I can’t even open the Excel file.” was the answer.

Oh cranberry. Oh flying fishball hashbrown cranberry.

So I asked the person to send me the resulting Excel file that was generated. I looked through the contents, and found this: “40,2”

Shouldn’t that be “40.2”, I wonder.

And then I looked at the person’s name real close. I guessed the person’s somewhere in Europe that uses the comma as a decimal point character and the period as the thousands separator.

I proceeded to set every numeric variable that uses the ToString() function to have ToString(System.Globalization.CultureInfo.InvariantCulture). There were other changes I made based on this knowledge, but that ToString() was the main change.

Now this isn’t a post telling you about ToString() and thinking about how your program works in different cultures and so on (The biggest one I know is the Turkish “i”. That one seems to be a nightmare for upper and lower casing changes). The lesson I want to tell you is that no matter how much you check your software, the moment an actual user (in my case, a programmer) uses your software, shiitake mushrooms hit the fan.

You will only really know how good your software performs when other people are using it. You have no idea where they’re using it (like in my case), how they’re using it, or even why they’re using it. This means the sooner you put your software out there, the sooner you know whether your software is actually useful.

It doesn’t mean put out sloppy work. It just means you put out work that’s done to the best of your abilities, regardless of how small that feature list is. If it’s a bug tracking software, make sure it can track bugs easily. If it’s a shoot-em-up game, make sure I can run around easily and shoot enemies/monsters/aliens (fairly) accurately. If it’s a calculator, make sure it can at least add, subtract, multiply and divide before doing all those scientific and accounting functions.

So what software projects are you working on right now? What can you cut out so you can devote more energy to the core of your software?

Multi-personality classes

I’ve been working on my spreadsheet library and I discovered something. There are classes where I needed them to be multiple end results at the beginning of their lives. Let me illustrate.

Suppose we have an Embryo class. However, the way we use it is that we need it to survive all the way to the end of our program, and we happen to need the Testicles and Ovaries properties depending on how our program uses it. For example:

Embryo emb = new Embryo();
if (boy)
{
    emb.PrivatePart.Testicles.MakeStuff();
}
else
{
    emb.PrivatePart.Ovaries.MakeStuff();
}

The thing is, we don’t know beforehand whether we have a boy or girl until runtime. And even if we know beforehand that we have a boy or girl, we need a generic placeholder class that can represent either a boy or girl until some later time. Using the .NET object class and doing boxing/unboxing seems unproductive (no pun intended).

So what’s the actual situation? Chart axis.

The primary horizontal axis of an Excel chart can be either a category axis, date axis or value axis. Category axes are used for most cases, where the chart data’s categories are text. But if they’re dates, then date axes are used. And if they’re scatter charts, then the axis is a value axis.

Well, the primary horizontal axis is usually those 3 types. Bar charts have them at the primary vertical axis. *sigh* Don’t even get me started…

So the primary horizontal axis has to be all 3 at the same time. Well, the way I expose the property/class requires it to be all 3 at the same time.

“Why don’t you expose them individually? That would solve the ambiguity problem.”

I could do this:

SLChart chart = new SLChart("B2", "G6");
chart.PrimaryCategoryAxis.Title.Text = "A category title";
chart.PrimaryDateAxis.Title.Text = "A date title";
chart.PrimaryValueAxis.Title.Text = "A value title";

And then based on the type of axis used, I’ll use the different axis class. The thing is, I want to expose only one property instead of 3. This means programmers using my library don’t have to differentiate which axis class/property to use.

If I expose only 1 property/class, then that underlying class has to be all 3 types of axis at the same time.

There’s probably a design pattern I don’t know about. If you know it, or have comments on how to approach this, I’d love to hear it in the comments.

Now if you’re in the business of producing (hahaha… pun… never mind) Excel spreadsheets, try my library!

Next level of web development

The other day I met up with a friend who just finished giving a talk on HTML5. Well, not exactly HTML5 but more on the current mix of technologies that’s making up the current web development skills. The basic technologies involved are HTML5 (markup/data), Javascript (action) and CSS (presentation).

My friend used HTML5 in his title because there doesn’t seem to be a term for this new “level” of web development. And because he’s afraid his audience won’t know what he meant.

I remember writing Java applets (using the applet tag). I remember web sites without an ending paragraph p tag because the web browser was extremely tolerant. I remember web pages with font tags everywhere because CSS was practically non-existent back then. I remember displaying the current time using Javascript was an extremely cool thing to do.

Then web standards were introduced. HTML markup standards were encouraged. Javascript libraries started sprouting. And CSS came to the rescue, separating the presentation layer from the code layer.

As far as cross-browser issues go, adhering to current/latest HTML standards and using good Javascript libraries and using CSS meant that users are free to choose whichever browser they want. And the web site they’re visiting is expected to behave the same way and be rendered on the screen the same way (with maybe a few pixels off the mark as an error buffer, I guess) regardless of the web browser chosen.

And my friend is worried.

He’s worried that the current web developers are so used to the current set of technologies that basic programming problems are not even considered.

The big one is Internet access. He told me this group of (mostly young) web developers assume the Internet is as available as air. The idea that there might not be Internet access never crossed their minds, and so their web applications crash in the most spectacular manner when the user has no wifi.

And the worst part of it all was that these web developers think HTML5 is the current “web developing thing”. Sort of like “I’m a C# programmer” or “I’m a Python programmer”. They’re now developing in HTML5, without understanding that HTML5 is just the markup.

It’s the “as long as it works” mentality. I’m a practical man, so I agree with this mentality. I also do so with some understanding of the underlying technologies. This “HTML5, Javascript, CSS” combo seems like a black box, but it isn’t. Each part even advances independently of each other.

It’s like self-publishing on the Kindle without understanding the role of “traditional” editors and publishing houses. It’s like auto-tune of your song without understanding some music basics. It’s like relying on the auto-focusing of modern “prosumer” cameras without understanding basic camera equipment and terms.

I’m not saying they’re bad. I’m guilty of the last one. I use a camera that does everything for me when I click “Record” because I don’t want to fiddle with aperture and focal length and filters and such.

The problem is, if you don’t know (or care) what’s the underlying technology, you won’t know what to look for when things go wrong.

Question

Is there a term for this current level of HTML5, Javascript (I can only think of jQuery as a popular library) and CSS (CSS2/CSS3?)?

Spreadsheet chart reference manual now available

After weeks of working on this, I’ve finally finished my reference manual on Open XML charts in spreadsheets. It’s called (imaginatively) Spreadsheet Open XML Charts.

It’s sort of a sequel to my other reference manual on Open XML spreadsheets.

If you do any work on Open XML spreadsheets and in charting specifically, I’m certain you’ll find this useful. Please check it out and let me know what you think.

Now if you’ll excuse me, I’ve got to go crash and do nothing. Perhaps get a cup of tea or hot chocolate.

How to properly use OpenXmlWriter to write large Excel files

Brian Jones already wrote something on using the OpenXmlWriter with the Open XML SDK to write large Excel files. There are a couple of things that weren’t mentioned though.

OpenXmlWriter works best for creating new things

For the purposes of explanation, we’ll assume the “new thing” is a new worksheet, although OpenXmlWriter can be used to write any Open XML SDK part (such as WorkbookPart).

The reason for this is that OpenXmlWriter is essentially writing out XML tags. The SDK offers classes and properties that we’re familiar with, but underneath it all, an Open XML spreadsheet’s internal structure are XML files zipped together. So instead of using the SDK classes to form the “XML” parts, we use the OpenXmlWriter to directly write out the XML.

The problem comes when we need to save. If you’re working with an existing worksheet, there’s already an existing XML structure in place. This means there’s a corresponding WorksheetPart (to your Worksheet class), and this WorksheetPart already has the XML structure (and more importantly, a DOM structure).

So you can’t just use the OpenXmlWriter to write the XML stuff into an existing WorksheetPart, because any changes you made will be ignored.

This is why in Brian’s code, he reads in the existing worksheet and writes the content into a new worksheet, thus using a new WorksheetPart that’s “attached” to the WorkbookPart. After he’s done reading, he deletes the old WorksheetPart.

In conclusion, either you

  • Work with the DOM structure using the SDK, which means snail-pace if you’re writing many cells
  • Work with new WorksheetPart

If you have something like worksheetPart.Worksheet.Save() mixed with the use of OpenXmlWriter, it’s not going to work.

XML attributes

The more subtle (and possibly more crucial) information that’s missing is of attributes. When you use the OpenXmlWriter, it works best if it’s used all the way. As in you don’t mix SDK class/property usage code with OpenXmlWriter code.

Basically, if there’s an SDK class property you want to set, don’t assign to that property in code. Use the OpenXmlWriter to write that property, because that property is eventually an XML attribute anyway. In this case, think of the SDK as getting in the way of making your code run faster.

In Brian’s code, notice that the RowIndex property of the Row class is not assigned. Which is fine since it’s optional. In this case, Excel tries to guess which row it is, and places the first Row class content into the first row on the spreadsheet. This is good if that’s what you want. And bad if your want your “first” Row class to be on the 3rd row instead.

You might also notice that the Cell class has no properties assigned. And no, CellFormula and CellValue are not properties of the Cell class, they’re actually child classes. The SDK allows you to access them as though they’re properties. This means the “standard” set of cell properties are not assigned:

  • The cell reference, such as “A1”
  • The cell data type, such as string or number
  • The cell style index

The code works because the cell content is a random number. And the default data type of a cell is a number (so it doesn’t need to be explicitly assigned). It will fail if you need strings or other data types.

And because there’s no cell reference given, Excel tries to guess where each cell is. Which is to say, place all the cells in that row flush to the left. This means you can’t have a cell in A1, and a blank cell in B1, and then a cell in C1, because Excel mooshes all the Cell classes together.

And because there’s no style index given, the default style is used. This is typically the minor font (such as Calibri) with no special formatting (such as font colour or bold or italics or borders).

So here’s sample code on how to use OpenXmlWriter. Note the class OpenXmlAttribute. And if you understand this, you will know that the OpenXmlWriter isn’t only for writing large Excel files. It just happens that it directly writes XML tags and attributes, and thus works the fastest for writing a lot of data.

using (SpreadsheetDocument xl = SpreadsheetDocument.Create("LargeFile.xlsx", SpreadsheetDocumentType.Workbook))
{
    List<OpenXmlAttribute> oxa;
    OpenXmlWriter oxw;

    xl.AddWorkbookPart();
    WorksheetPart wsp = xl.WorkbookPart.AddNewPart<WorksheetPart>();

    oxw = OpenXmlWriter.Create(wsp);
    oxw.WriteStartElement(new Worksheet());
    oxw.WriteStartElement(new SheetData());

    for (int i = 1; i <= 50000; ++i)
    {
        oxa = new List<OpenXmlAttribute>();
        // this is the row index
        oxa.Add(new OpenXmlAttribute("r", null, i.ToString()));

        oxw.WriteStartElement(new Row(), oxa);

        for (int j = 1; j <= 100; ++j)
        {
            oxa = new List<OpenXmlAttribute>();
            // this is the data type ("t"), with CellValues.String ("str")
            oxa.Add(new OpenXmlAttribute("t", null, "str"));

            // it's suggested you also have the cell reference, but
            // you'll have to calculate the correct cell reference yourself.
            // Here's an example:
            //oxa.Add(new OpenXmlAttribute("r", null, "A1"));

            oxw.WriteStartElement(new Cell(), oxa);

            oxw.WriteElement(new CellValue(string.Format("R{0}C{1}", i, j)));

            // this is for Cell
            oxw.WriteEndElement();
        }

        // this is for Row
        oxw.WriteEndElement();
    }

    // this is for SheetData
    oxw.WriteEndElement();
    // this is for Worksheet
    oxw.WriteEndElement();
    oxw.Close();

    oxw = OpenXmlWriter.Create(xl.WorkbookPart);
    oxw.WriteStartElement(new Workbook());
    oxw.WriteStartElement(new Sheets());

    // you can use object initialisers like this only when the properties
    // are actual properties. SDK classes sometimes have property-like properties
    // but are actually classes. For example, the Cell class has the CellValue
    // "property" but is actually a child class internally.
    // If the properties correspond to actual XML attributes, then you're fine.
    oxw.WriteElement(new Sheet()
    {
        Name = "Sheet1",
        SheetId = 1,
        Id = xl.WorkbookPart.GetIdOfPart(wsp)
    });

    // this is for Sheets
    oxw.WriteEndElement();
    // this is for Workbook
    oxw.WriteEndElement();
    oxw.Close();

    xl.Close();
}

This creates a spreadsheet with 100 columns and 50000 rows of data. On my machine, it ran in about 18 seconds.

I’m not going to list all the available XML attributes, but you can find out the correct attribute name by exploring the Open XML SDK documentation, or the official ECMA documentation.

How to create an Open XML spreadsheet bar chart

Today, you’re going to learn how to insert a bar chart into an Open XML spreadsheet. Frankly speaking, the basics apply to the other types of charts, so if you master this, you can figure out how to create the other charts.

First, download the source code and the resulting Excel file.

I’m not going to go through every detailed step of the code. Instead, I’ll go through broad strokes of explanation. You’re expected to know the basics of using Open XML SDK to create cells and insert images. Refresher on cells and styles here, and inserting multiple images here.

Hardcoded data

You may have noticed that the spreadsheet cell data is hardcoded into arrays. The reason is it’s used in creating the cells, and creating a copy of the data when creating the chart. Open XML charts store the actual data it’s representing.

For charts, there are 3 parts to the data: the category entries, the “legend” entries, and the cell data. “Legend” is in quotes because most charts use that data part as the legend, but it’s not called legend entries. I believe internally, it’s referred as the series name of data series.

Absolute cell references

You may also notice that in the chart parts, cell references are absolute. For example, “Sheet1!$B$3”. It means the cell references are absolute, as opposed to relative (such as “Sheet1!B3”).

The sheet name is used

You may also notice that the sheet name is used. This means you can have a chart in sheet “Sheet1” but contain data from “Sheet2”.

Axis IDs

If you’ve explored an Open XML spreadsheet chart, you may have noticed magic numbers used as axis IDs. So how does Excel calculate them? I don’t know, but you don’t have to worry about that.

Within each chart, as long as the axis IDs are unique, you’re fine (this means within each ChartSpace class). This means theoretically, you only need 6 unique axis IDs. There are 3 axis IDs (category/date axis, value axis and series axis) for the primary axis and another 3 for the secondary axis. The secondary axis comes into play when you plot a particular data series on its own (or against an existing secondary axis).

When data series are plotted on a secondary axis, they can also be plotted as a different chart type. This is what’s known as a combination chart. For example, a column chart with a line chart.

I don’t know all the valid types of combination charts, thus the theoretical limit is 6. But I don’t think 3D charts can be part of combination charts. In practice, you only need 4 unique IDs.

And you don’t even need to fret. Just use 1 through 6 as the IDs.

Data series

Excel assumes that if your data is wider than it’s tall, then rows of horizontal cells are taken as data (with first row as category entries and the first cell of each row as the “legend” entry). Otherwise, it’s columns of vertical data (with first column as category entries and the first cell of each column as the “legend” entry). You can change this if you find your chart weird-looking on the Excel UI, but programmatically, the source code uses rows as data series.

Each data series has the category entries, the associated “legend” entry, and the actual data.

Different charts have different Open XML SDK classes representing data series. For example, the bar charts and column charts use the BarChartSeries. The pie charts (and surprisingly the doughnut charts) use the PieChartSeries. The reason is that there are different properties associated with different charts, and thus each chart series needs to take care of that. For example, line charts can have markers on each data point, and pie charts can have exploded pie slices.

Charts are also images

A chart has 2 main parts: the chart content, and the drawing part.

The chart content contains the chart data, legend, title, axes and what not. The drawing part refers to how big the chart is displayed on the worksheet, and where to display it. And in this sense, a chart technically behaves like an image.

So not only do you have to create the chart content, you have to take care of its “image-y” properties. Hence the reference link I gave on inserting multiple images earlier on.

Because of this, you have to take care of unique image IDs. Each worksheet needs a unique ID assigned to each image part. And a chart is considered an image for this purpose.

Here’s the gist. A WorksheetPart class is tied to each separate worksheet. Each WorksheetPart class has a DrawingsPart class. And a DrawingsPart class can have ImageParts (the “normal” images) and ChartParts (for charts).

For more Inception-like information, a ChartPart can have ImagePart’s too (to have picture fills for the chart title or legend for example). It’s sort of complicated, so let’s not go there in this article.

The dreaded TwoCellAnchor

Position of images (and thus charts) is done with either an AbsoluteAnchor, OneCellAnchor or TwoCellAnchor class. I explained a little why TwoCellAnchor took out more of my hair than 4 years of university mathematics here.

But for charts, you don’t really need fine-tuned positioning. So a TwoCellAnchor is fine. The column/row properties are zero-based indices of the column and row indices. The offsets are in EMUs. Theoretically, the offsets are less than the column width or row height of its corresponding column or row.

Chart anatomy

The typical chart goes like this. Each chart has a title, plot area and legend. The plot area contains a chart type (or several chart types for combination charts), and relevant axes.

Each chart type might have specific properties unique to the chart type assigned (bar charts will have the bar direction assigned for example). And data series unique to the chart type assigned (the BarChartSeries for bar and column charts for example). And any relevant axis IDs assigned (some charts don’t have axes, such as doughnut charts).

Relevant axes are typically a category (or text) axis on the horizontal, and a value axis on the vertical. If the category entries are date values, Excel might have used a date axis instead of a category axis. The source code provided assumes textual category axis is used.

For data series charted on the secondary axis, the order of appending axes is as follows:

  • Primary horizontal axis (whether category or date axis) [typically at bottom]
  • Primary vertical axis (value axis) [typically on the left]
  • Secondary vertical axis (value axis) [typically on the right]
  • Secondary horizontal axis (whether category or date axis) [typically on the top]

Bonus!

If you change the bar direction in the source code to Column, you’ll get a clustered column chart. Everything else works just fine.

In fact, all the charts look surprisingly similar in terms of code structure. Change the chart class, and the chart series class used. Fiddle with a couple of properties unique to the particular chart type. BOOM! New chart born.

The really hard part is getting the chart data series down pat. Which is why I hardcoded the data for this explanation. In practice, you’re probably getting your data from a database, so you’ll be iterating through that. Or obtaining the actual cell data somehow. So I might as well have a simple data structure so you can expand it yourself instead of hardcoding the absolute cell references everywhere (which will be a pain for you to follow).

In conclusion…

Charts are hard… I’m writing a reference manual for Open XML charts in spreadsheets. So stay tuned for that. In the meantime, my software library SpreadsheetLight takes care of all the above in a few lines of code. So take a look at that (it’s free and source code is available).

Future of spreadsheets

So I was checking some site statistics for my spreadsheet library website and I found an interesting search phrase: “open xml excel future”. The person might have been checking how important spreadsheets are in the future. Or maybe just checking if Microsoft Excel is still around.

Spreadsheets are still going to be used in future. This is based on my experience working in a medium-large telecommunications company, and based on the companies who bought my Open XML spreadsheet reference. Before I go on, let me tell you about an incident.

I went on Facebook and asked my friends to tell everyone they know of my spreadsheet library (SpreadsheetLight). One person commented that his friends are all hackers and don’t use spreadsheets. I will assume “hackers” to mean “programmers with higher than average programming skills” and “don’t use spreadsheets” to mean “hardly ever thought about them”.

Software is meant to be used by people. Usually “normal” people, not “hacker” people.

PayPal allows you to download transactional data. You can download it in CSV format. What do you do to see it clearer? You paste it into Excel. Or whatever spreadsheet software you use.

I’ve worked with satellite transactional data. There are lots of data fields. Source of transmission, destination of transmission, length of transmission, type of data. Why does this matter? Because the satellite companies charge the transactions to the customers. Even the source and destination is important (think of calling from America to China, and from Singapore to China. Wait, you might be too young to know about the overseas call charges…).

And the best way to view satellite transactional data is on a grid-like display. A spreadsheet.

So here’s a short rant. If you are a highfalutin coder who thinks you’re better than anyone, even another programmer who’s using some “inferior” or “non-open source” language or platform, get off your high tower. The only people who care about your code are other programmers. The normal people don’t care a flying fishball about your code.

I’m practical. I use whatever tools I have to do what I need to do. So far, C# is great. And there are people who don’t like the .NET Framework simply on principle that it’s from Microsoft.

I’ve never really understood the tension between the open source software people and the “closed” source software (aka enterprises and large companies) people. In particular, Microsoft seems to bear the biggest brunt of the force whereas Apple (the supposed “opposite”) is enshrined, despite the various salient restrictions Apple imposes (App Store apps approval process, closed system, proprietary hardware [iPhone, iPad]).

If I didn’t know better, I’d say some programmers hate Microsoft because Microsoft makes a lot of money. Oh there are specific reasons like closed proprietary systems, or crushing small competitors by buying them out, or whatever. I’d say the underlying cause of their distress is that Microsoft makes lots of money. And if I didn’t know better, I’d say these programmers find money a repulsive thing.

Views on money is a separate topic. My short version is: money allows you to be more of who you are. Money gives you more options. Money is not inherently evil.

Now Microsoft introduced the Open XML specifications, with the creation of the Open XML SDK for .NET Framework. People have criticised the Open XML specs, whether it’s “yet another standard” or “it’s from evil Microsoft!”.

The Open XML specs have their flaws. I should know because I’ve spent hours studying them and poring over the details. From my research, the specs are created to conform to the existing Microsoft Office software (Word, Excel and PowerPoint). If it’s in the Office software, the specs reflect it.

You can’t blame Microsoft. It’s a good move. It also made the specs very complicated. The reason is that the Office software is the accumulated knowledge and code base from years of testing and marketing.

You know what’s the most common thing to ask when creating Open XML documents? “How to create a simple Word document?” or “How to create a simple Excel spreadsheet?” Because the complexity made creating even an empty file a big pain in the derriere.

So what’s the future, really?

Some kind of standard would be used. Possibly the Open XML standard. Google Docs and Apple’s Number and other spreadsheet software will adopt it, or at least support it.

Spreadsheets are going to be around because companies and businesses use them. Even the small ones. I don’t care if you’re a one-man startup. You will eventually find a need to look at data in a grid-like pattern. That’s when you’ll use spreadsheets.

Businesses who hire programmers pay those programmers. And businesses use spreadsheets. That means somewhere along the line, you as a programmer will have to work with spreadsheets. Maybe not using spreadsheets like your users, but create them for your users.

Your users need monthly sales reports. You’re tasked to dump the database data into a spreadsheet that they can actually use. You don’t expect them to write SQL statements right? (If you do, why are you being paid?)

However, I don’t expect spreadsheets to be very complicated. Tabular data, with heading text. Simple charts (although I’m revising my view on charts. Even creating a simple bar chart with Open XML SDK is giving me a headache due to the many various options and settings available on the Excel UI). Simple styling (too much styling and your spreadsheet looks untidy and won’t be useful).

This means the full Open XML specs won’t have to be implemented. My gut guess is that at least 50% of the specs won’t be used for the simpler spreadsheets.

And in the future, simpler spreadsheets are probably going to be more common. You need to see and understand the data fast. The fancy styling just makes it harder to do that.