Launching SpreadsheetLight

I am excited to tell you that my spreadsheet software library is available!

SpreadsheetLight

For the initial launch version (I decided to go for version 1. Why do people launch with versions 0.8? I don’t know…), you get comprehensive support for styles, rich text formatting, 47 named cell styles, themes (either one of the 20 built-in themes or create-your-own). Well, like I said, comprehensive styling support.

There’s also the (hum-drum) support for merging cells and freezing panes. I actually explored how to split panes. I certainly wrote about it in my Open XML guide, but it turns out that there’s a tiny rectangle at the top-left corner that Microsoft Excel didn’t tell me about. The size of that tiny rectangle is dependent on the font you use, and even the screen resolution of your computer screen.

While I could add a function that allows you to just input the size in EMUs (English Metric Units), I decided that if I can’t do it well, I don’t want to do it. At least for the initial launch.

Row heights and column widths were also big time drains. It turns out that they’re also dependent on the font and screen resolution of your computer screen. I was trying to calculate the standard row heights and column widths for the fonts in the built-in themes, and I thought I had them. I wrote a program using SpreadsheetLight to generate spreadsheets with different minor fonts, and I wrote a program to read in those spreadsheets and get the “standard” row height and column width. I spent 3 hours collecting data.

Then on a whim, I switched my computer screen’s resolution from 120 DPI to 96 DPI (my eyesight’s not that good ok? I need bigger text…), and whoa! All that data doesn’t apply anymore… All in all, I think I spent 6 or 7 days trying to figure out a general calculation formula. I failed. I don’t know how Excel does it.

I also surprised myself by including extensive support for pictures. I thought you just insert a picture into the worksheet and that’s it! It turns out there’s a ton of post-insertion manipulations you can do. For example, if your picture has transparent areas, you could set a background fill, and that background colour will be visible through the picture. Also, you can rotate the picture in 3D.

SpreadsheetLight is licensed under the MIT license. I decided to use one of the available software licenses instead of making up one of my own. As far as I can tell, the MIT license allows the recipient of the software to use the software in personal or commercial products. It’s also categorised as free software, as in freedom of use, not free as in cost. I don’t want to deal with per-client, or per-server, or per-developer or per-what-not licensing restrictions.

Even if you’re not interested in spreadsheet software, have a look at SpreadsheetLight. Tell some other programmer about it. Tell your manager about it. It took me slightly over 2 months of intense coding, and I want someone in the world out there to have an easier life because of SpreadsheetLight. Thanks!

Oh, and the image art is designed by Charlie Pabst from Charfish Design. While I have a fair competence in image work, I decided to get a professional designer to help me. It’s a business and professional product. I’m not going to risk the product’s success so I could stoke my ego…

Working on spreadsheet software library

As a natural and logical extension from my Open XML spreadsheet guide, I’m writing a software library to create and manipulate Open XML spreadsheets. (Never mind that decompiler project I was working on… 2 months of coding… sunk cost… moving on…).

I did some research (ok, an inordinate amount of research…) on the available spreadsheet software libraries out there, both free and commercial, both supporting Open XML (or .xlsx in any case) and the old .xls (Microsoft Excel in binary). I have 2 observations.

First, there’s a plethora of classes in the library. It’s sort of expected. There’s support for a lot of functionality, and it just burgeoned into many classes. Personally, I hate it when I have to learn a new library. There’s a whole bunch of documentation and classes I have to read up on and experiment to just do a simple thing (printing a string of characters is the first thing I try). When I first encountered the .NET Framework, I was crushed. It’s redeeming feature was its extensive documentation, which made learning easier.

Second, even though there’s support for a lot of functionality, it still takes quite a bit of code to accomplish what you want done (granted, much less than if you wrote low level code). Hey I wrote a guide on Open XML spreadsheets, I know how many lines of code needed to just create an empty Excel file, ok?

But these are spreadsheet software libraries!. They’re supposed to make your life easier. In fact, much easier.

I read that when the iPhone was designed, the engineers told Steve Jobs that it needed to have 4 or 5 buttons. Steve Jobs said no. One button (to rule them all). The iPhone now only has the 1 button.

So I took inspiration from that and designed my library to have that quality. Alright, alright, here’s a code sample:

SLDocument sl = new SLDocument();
sl.Save();

That will save an empty Excel file named “Book1.xlsx”. What, not Hello World enough for you?

SLDocument sl = new SLDocument();
sl.SaveAs("HelloWorld.xlsx");

There. Now the file is named “HelloWorld.xlsx”. What, sheet name? Most (if not all) of the libraries I researched required you to add a new worksheet to an empty file. All spreadsheets have at least one worksheet. Why force the programmer to do it anyway? You don’t see Microsoft Excel forcing the user to add worksheets in a newly created spreadsheet file, right? (Excel even has 3 worksheets added by default).

Alright, fine. The first worksheet’s name is by default “Sheet1”. You can rename it.

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SaveAs("HelloWorld.xlsx");

There, happy? So, how do we set cell values?

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

The cell with row 2, column 3 will have the value of PI. The cell with row 2, column 4 will have the string “This is PI”. Yes, the library supports cell references such as “C2” and “D2”. My opinion? They make better sense to a user with visual interface to the spreadsheet. It’s much harder to use when you’re programming with a non-visual interface to the spreadsheet. Good luck iterating through rows 2 to 500,000, with columns 1 to 1000 (financial reports, I’m looking at you…).

Want to add a new worksheet?

SLDocument sl = new SLDocument();
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.AddWorksheet("SecondWorksheet");
sl.SetCellValue(5, 5, "Why am I not first?");
sl.SaveAs("HelloWorld.xlsx");

Hey, a software library is supposed to make your life easy. The second worksheet’s name is *drum roll*, “SecondWorksheet”. The string “Why am I not first?” is in row 5, column 5 of the newly added worksheet. How does the library know which worksheet to add which cell value? By magic. Ok, fine, it automatically keeps track of worksheets.

When a user enters a cell value in Excel, does the user need to know which worksheet? No, because that information is implied. The user knows which worksheet because the user chose it already. And so does this software library.

Oh yeah, I even have basic theme support!

SLDocument sl = new SLDocument(SLThemeTypeValues.Flow);
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

That gives you the Flow theme, one of the built-in themes in Microsoft Excel (note: only the fonts and font colours are supported). You can even design your own custom theme.

System.Drawing.Color[] clrs = new System.Drawing.Color[12];
clrs[0] = System.Drawing.Color.White;
clrs[1] = System.Drawing.Color.Black;
clrs[2] = System.Drawing.Color.WhiteSmoke;
clrs[3] = System.Drawing.Color.DarkSlateGray;
clrs[4] = System.Drawing.Color.DarkRed;
clrs[5] = System.Drawing.Color.OrangeRed;
clrs[6] = System.Drawing.Color.DarkGoldenrod;
clrs[7] = System.Drawing.Color.DarkOliveGreen;
clrs[8] = System.Drawing.Color.Navy;
clrs[9] = System.Drawing.Color.Indigo;
clrs[10] = System.Drawing.Color.SkyBlue;
clrs[11] = System.Drawing.Color.MediumPurple;

SLDocument sl = new SLDocument("ColourWheel", "Castellar", "Harrington", clrs);
sl.RenameWorksheet(sl.DefaultFirstSheetName, "Hello");
sl.SetCellValue(2, 3, 3.14159);
sl.SetCellValue(2, 4, "This is PI");
sl.SaveAs("HelloWorld.xlsx");

There are 12 colours you need to define. These correspond to the 2 light colours, 2 dark colours, 6 accent colours, the hyperlink colour and the followed hyperlink colour. “ColourWheel” is the theme name, “Castellar” is the major Latin font and “Harrington” is the minor Latin font. The major Latin font is used when you apply the Title named cell style. The minor Latin font is basically the body font.

What named cell style? A customer suggested supporting the feature.

Named cell styles

So how do you apply it?

sl.ApplyNamedCellStyle(2, 3, SLNamedCellStyleValues.Good);
sl.ApplyNamedCellStyle(2, 4, SLNamedCellStyleValues.Accent1);

I am finishing up version 1 of the library, and it will soon be available. I’m targeting a launch in January 2012. The software library will be called SpreadsheetLight. The primary idea is for it to be simple and clean. Simple for you to use, and you write clean code when you use it.

My internal tests show that SpreadsheetLight runs faster than 2 other free libraries. I won’t tell you which 2, because it’s not relevant, and because I’m automatically biased, and because it’s just not nice to the other programmers who contributed to those 2 libraries. The point is that it runs fast and is effective, which I’m happy about.

First look at XML Studio

I’ve been working with XML files for a while (if you’ve been reading my blog for the past few months, you’re probably sick of the XML-related stuff…). Specifically with Open XML. While I don’t always read and write XML files, I do refer to the Open XML ECMA-376 documentation and the Open XML SDK help file a lot. And then, I go look at some XML files, just to check that I wrote them correctly.

I recently found out about XML Studio from Liquid Technologies. Disclaimer: I was contacted by a company representative, and given a free developer license for the software. But go check out their software if you’re doing XML-related stuff.

I blazed through the list of features and benefits, and settled on one. Oh my fishballnoodles they can generate C# source code from XML files! It uses the XML Data Binder.

So my first thought was: Can I use it to somehow generate source code that’s (sort-of) compatible with Open XML SDK?

Short answer: No. But that’s because the XML files don’t have the Open XML SDK class names in them, so you can’t really have source code working with the SDK.

However, my next thought was: Can I at least generate an XML file that would have been generated by the equivalent source code using the SDK?

First, I loaded an XML file from an Open XML spreadsheet (after renaming .xlsx to .zip and then unzipping and then get one of ’em darn sheet.xml files). Then I found out that I couldn’t generate source code from this. *sad*

But I found out I could generate an XML schema from the XML file. Ohhkayy… Then I found out that generating source code required an XSD, an XDR or a DTD file. Alright, getting there.

Then I thought I could create a worksheet with some typical data so that I could grab the resulting XML file with some of the possible data types, which I could then use to generate a corresponding XSD schema file, and then generate corresponding source code. Note the recursive problem solving ability of my programmer mind.

And then it hit me that I could just use the correct schema file from ECMA-376. So I went to the second edition of ECMA-376 (latest is third edition as of this writing but not currently super-supported yet), and went to folder of part 1 (there are parts 1 to 4). Which has this very descriptive name of “ECMA-376, Second Edition, Part 1 – Fundamentals And Markup Language Reference”. Under this folder, there’s a zip file called “OfficeOpenXML-XMLSchema-Strict.zip”. And in that zip file is the motherlode of your schema dreams.

And so I opened up the schema file related to the Worksheet class of the SDK (which is sml.xsd). And got this:
XML Studio Schema View

Click image for larger view.

I’ve expanded the node for the Cell class. That’s awesome. You see that “0..1” to the left of CT_CellFormula? That means a Cell class can contain 0 to 1 of the complex type (see “CT_” prefix) CellFormula. For nodes that take in at least 1 to an unlimited number of children nodes, you get “1..*”. This is reflected in the schema as minOccurs=”0″ for “you don’t really need it” and maxOccurs=”unbounded” for “you just have as many children as you want, ok? But make sure they’re of this type.”

As of this writing, I still haven’t managed to generate source code that does what I want (after a few hours of wheedling code around). But essentially, I’m trying to create an alternate Open XML SDK just from the schema information from ECMA-376. I’m pretty sure XML Studio wasn’t created for this in mind… I’ll keep you posted on my findings. If you have any XML editing stuff you think I should know, tell me, because I want to see if I can break, uh, I mean utilise XML Studio to its full potential.

Decompiling Open XML spreadsheets

Ok, I’m going to reveal the big secret project that I’ve been working on for the last 2 months. I’m writing a software program that will decompile Open XML spreadsheets into C# and VB.NET source code.

Now I know what you’re thinking. “But Vincent, there’s that SDK Productivity Tool that does that already!”

Frankly, when I started the project, I didn’t even think about the SDK tool. But, when I looked at the generated source code from the SDK tool, I found it… hideous. There were 2 things I found annoying:

  • New classes were created willy-nilly
  • Properties were dumped into class instantiation using object initialisers

The first point meant that most of the classes were created one-off. It didn’t matter if you needed a class of type SomeClass multiple times. The SDK tool simply created another class of type SomeClass. If that class type was used multiple times, you’ll see variables named someClass1, someClass2 all the way to someClass21. It’s why I wrote about multiple use variables versus multiple variables.

The second point meant that if a class has many properties, you might end up with something like:

CellFormat cellFormat3 = new CellFormat(){ NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)10U, FillId = (UInt32Value)9U, BorderId = (UInt32Value)0U, ApplyNumberFormat = false, ApplyBorder = false, ApplyAlignment = false, ApplyProtection = false };

That’s one line of code.

The problem I have with object initialisers is when you need to comment something in between. Commenting in C# and VB.NET means an entire line is commented, although C# offers the /* comment */ variant. There’s just no easy way to do so. Compare with this:

cellFormat = new CellFormat();
cellFormat.NumberFormatId = 0U;
cellFormat.FontId = 11U;
cellFormat.FillId = 10U;
cellFormat.BorderId = 0U;
cellFormat.ApplyNumberFormat = false;
cellFormat.ApplyBorder = false;
cellFormat.ApplyAlignment = false;
cellFormat.ApplyProtection = false;

I just find that easier to pick and choose stuff I don’t want.

Now the big advantage (my differentiation or unique selling proposition) is that I offer VB.NET too. The SDK tool doesn’t. Here’s a snippet:

run = New Run()
run.RunProperties = New RunProperties()
run.RunProperties.Append(New FontSize() With {.Val = 11R})
clr = New Color()
clr.Theme = 1UI
run.RunProperties.Append(clr)
run.RunProperties.Append(New RunFont() With {.Val = "Calibri"})
run.RunProperties.Append(New FontFamily() With {.Val = 2})
run.RunProperties.Append(New FontScheme() With {.Val = FontSchemeValues.Minor})

You will notice that I do use object initialisers. “That’s hypocritical of you!”. Perhaps, but I use them when the number of properties is small. I’ve kept it to 3 for now. Object initialisers in my case also made it easier that I don’t have to declare and instantiate new classes with actual variable names.

I understand why the SDK tool generates source code the way it does. It has to do with completely iterating through every single part and class of the root class SpreadsheetDocument. If you’ve ever written code to traverse a tree structure, you’ll know how tedious it can be.

The one thing the SDK tool lacks about the source code it generates is context. It runs through the entire Open XML document structure like a squirrel looking for every single acorn on a tree. It doesn’t stop to check any acorn for size, defects or even if it’s an acorn. Look, winter’s coming soon, and the squirrel doesn’t have all day telling you that this particular acorn is related to that particular acorn, and no it doesn’t care how big the acorn is, it’s got the teeth to eat it, ok?

Why are we talking about squirrels again?

So, after about 20 thousand lines of code, I’m just barely getting my software into beta mode. Halfway through that, my heart sank with the enormity of the task. In order to generate more readable code, I cannot iterate through the XML tree structure like the SDK tool. I had to stop and make sense of what the class was.

That made me look at the SDK help file and the ECMA-376 specification file way too much… Did you know the ECMA spec is like over 5000 pages long? And that’s part 1. Parts 2, 3 and 4 are smaller, but still heavyweights in their own right. And there are so many classes and child classes and grandchildren classes and properties and…

I’m going to at least make a valiant effort to have the software self-complete on a subset of the Excel functionality (and thus a subset of the SDK). If you’re interested, I present to you SoxDecompiler. As of this writing, I’m just trying to see if people are interested in the software, so it’s just a page to collect email addresses of the people interested in the software. I think I wrote “interested” way too many times…

For some reason, the name conjures an image of a thread slowly unravelling a sock. But I like it. It stands for “Spreadsheet Open XML Decompiler”.

DefinedNames in Cells for Open XML spreadsheets

So a while back, a customer of mine asked me if I knew how to set names (or labels) to cells in a spreadsheet, so that a cell formula just referenced those names. Frankly, I didn’t even know I could do that. I’m not an Excel wizard, as you can tell.

In case you’re in a hurry, here’s the code and the resulting Excel spreadsheet. The code runs against the Open XML SDK.

To my surprise, the name given to the spreadsheet cell isn’t defined at the Cell class level. You’d think you would name your child and make sure to slap a name tag on your progeny just so everyone knows what to call your offspring, right? After looking through the documentation, and thinking it through, I guess it makes sense to separate it. It works by having a central depository with all the available names, contained within DefinedNames and DefinedName classes. Then the names are available throughout the spreadsheet’s workbook.

It’s sort of like the SharedStringTable, where every piece of text is stored in a SharedStringItem, and referenced with an index. This has problems, in that I don’t know what is contained in the Cell class itself, but I don’t really want to go there right now…

Let’s go through the code a bit. Here’s the part where you define DefinedName(s):

defname = new DefinedName();
defname.Name = "PrimeNum1";
defname.Text = "Sheet1!$C$2";
defnames.Append(defname);

defname = new DefinedName();
defname.Name = "PrimeNum2";
defname.Text = "Sheet1!$C$3";
defnames.Append(defname);

defname = new DefinedName();
defname.Name = "PrimeNum3";
defname.Text = "Sheet1!$C$4";
defnames.Append(defname);

defname = new DefinedName();
defname.Name = "PrimeNum4";
defname.Text = "Sheet1!$C$5";
defnames.Append(defname);

In case you’re unfamiliar with Excel, “Sheet1!$C$5” means the cell C5 of the sheet named “Sheet1”. The dollar sign acts as a separator, I think. In this case, the relevant cells are C2, C3, C4 and C5 (containing the 1st, 2nd, 3rd and 4th prime numbers). And then we have this part:

r = new Row();
r.RowIndex = 6;
c = new Cell();
c.CellReference = "B6";
c.DataType = CellValues.String;
c.CellValue = new CellValue("SUM");
r.Append(c);
c = new Cell();
c.CellReference = "C6";
c.CellFormula = new CellFormula("SUM(PrimeNum1, PrimeNum2, PrimeNum3, PrimeNum4)");
c.CellValue = new CellValue("17");
r.Append(c);
sd.Append(r);

Note this particular line:

c.CellFormula = new CellFormula("SUM(PrimeNum1, PrimeNum2, PrimeNum3, PrimeNum4)");

That’s our defined names in effect. Otherwise, we would use this:

c.CellFormula = new CellFormula("SUM(C2, C3, C4, C5)");
// or even
// c.CellFormula = new CellFormula("SUM(C2:C5)");

I teach you more about the CellFormula class, as well as a whole bunch of Open XML concepts in my programming guide.

3 facts about Open XML SDK

In the course of writing my guide on Open XML and Excel spreadsheets, I discovered some things about the Open XML SDK. There’s been some flak about the SDK or even on Open XML in general. But the price of the SDK is free, so you can’t beat that. While there are some inconveniences with using the SDK, the solutions and workarounds aren’t particularly nasty.

Here we go.

Open XML SDK is not a convenience library

If you’re looking for a neat little function that will toast bread, mop your floor, press your shirt and wish you “Have a nice day!” as you’re leaving the house, Open XML SDK is not what you’re looking for. (Actually if you find a software library that does the above, I wanna know.)

The Open XML SDK is a software development kit (hence the “SDK”). It is not a conventional software library in that sense. It provides atomic functions that allow you to do all sorts of nifty things to create/manipulate Open XML spreadsheets, word processing documents and presentation slides. The keyword is “atomic”.

This isn’t necessarily a bad thing. I know some companies will prefer not to use a software library (or third-party software). For example, government agencies or financial institutions, where data security is a concern and all software is (preferably) written in-house. The furthest they would allow is probably standard libraries.

I’m not going into a discussion about whether Open XML SDK is a “standard” library, since it’s from Microsoft. There seems to be a lot of unhappiness with Microsoft. Is it because of their monopoly? Because Bill Gates is rich? I like the .NET Framework, because it makes coding easier. Open XML SDK is built on top of that, so that makes Open XML documents easier to create and manipulate.

That said, because Open XML SDK offers atomic functions to do atomic tasks, this allows you to create your own functions to encapsulate whatever you’re trying to achieve. This flexibility is sometimes more valuable than convenience. Why do you think software companies charge a lot more for the source code of their libraries? You want to know what’s going on behind that function from a third-party library, because you don’t know if that bug is due to your own code or their code. (It’s probably your own code though…)

Open XML SDK is not a data manipulation library

There are 4 main parts to the Open XML SDK:

  • Spreadsheets
  • Word processing documents
  • Presentation slides
  • Graphics

The graphics part is a common set of class objects used by the other 3. Out of the first 3, spreadsheets are typically used to hold lots of data. Financial data, fiscal summaries, call logs, database dumps and so on. Text documents and slides hold data too, but not usually as intensive as spreadsheets.

Now it might surprise you that Open XML SDK offers no functions to manipulate any kind of data. None at all.

Need to dump 100 database records with 8 fields into a spreadsheet? You need to write code to fill in a 8-column-100-row block of cells.

Need to sort data? Do it within the database environment (there’s a SORT BY clause, you know?), or write your own sorting function. Because Open XML SDK ain’t got it.

This isn’t a big problem, but it does make writing code a little bit of a hassle. Particularly when you’re creating or manipulating a data-intensive spreadsheet. It’s not like when you’re attaching the DataReader object to a DataTable, and voila! All the data is nicely formatted and shown on a web page.

But this is also where the flexibility of writing your own functions come in.

Code execution order is important

When writing code with Open XML SDK, you have to put aside how you normally work with spreadsheets, text documents and slides. I’ll lump them together as “office documents”. Small “o” so it’s not copyright, right?

As a user, you would normally just go about your task of typing text, inserting images, styling text, cut-and-copy sections of an office document. The order of how you go about doing it is irrelevant, because you get to see the final result. And the final result is all that matters.

When working with Open XML SDK, you need to keep to an ordering of code sections. Each class in the SDK has to be appended to the appropriate parent class in the correct order, or you’ll get a corrupt file. This means your code has to be executed in the correct order too.

Go to the Open XML SDK help file. Let’s say we look at the Worksheet class. There will be a section called XML schema. In that, there will be an XML tag called “sequence”. See those “element” tags? That’s the order.

This code order isn’t a big problem. In your spreadsheet, it looks like an image is at the top, with a table of cell data below it. In code, you need to code for the table of cells first, then the image. Because that’s the order.

The solution is to ignore the visual placements of the data sections in your office document, and look at what is in your office document. Then you rearrange all the data parts according to the order, and then write code for that.

Conclusion

The Open XML SDK offers you flexibility in how you write your code and a great price (it’s free). It is backed by a large software company (Microsoft), so it’s not likely to disappear with no support in the future.

Sure it’s not as easy to use, but sometimes your situation doesn’t allow you to use a third-party library (cost, licensing, security). Hey, it works well enough. And sometimes, when you’re in a tight project, that’s all you really need. That you produce software (that produces office documents) well enough.

If you’re interested, you can check out my guide. I teach you how to create a spreadsheet with the standard functions in Excel such as text styling, image insertions, multiple worksheets, cell formulas. I also show you how to use a template in case you need something really complicated. And there’s working source code in C# and VB.NET.

Spreadsheet Open XML V2

I was rushing to get this out. The updated version of my programming guide is out! I first launched it on 17 Jan this year, so I was hoping to meet the personal-and-unseen deadline of 17 May, so it’s a nice 4 months interval. Ah well, I’m the only one who cares anyway…

I’ve added loads more content to help you with your Open XML spreadsheet needs. Version 1 was 53 pages. Version 2 is 147 pages. There are a lot of pages with screenshots, but still… 147 pages!

You can find out more here.

The last few weeks had been interesting while I rushed to get working source code and write explanations for the guide… I need to sleep… wait, I’ve got a magazine deadline! *sigh*

Programming guide update

So. Quick update.

I’ve finished writing a bunch of code for my Open XML guide to create spreadsheets. They include:

  • Inserting comments
  • Using formulas
  • Setting header/footer options
  • Setting page setup options
  • Creating multiple worksheets
  • Freezing/Splitting panes
  • Using and styling tables (but not pivot tables)
  • How to write the cells if you have multiple sets of tabular data

The last one was a problem that a customer faced. Let’s say you have 2 sets of tabular data, and they’re side by side in the worksheet. Using Open XML SDK, you have to run through the 1st row of the 1st set of data, then the 1st row of the 2nd set of data. You can’t completely iterate over the 1st set of data first. This is because with Open XML SDK, you have to completely define the first row to be inserted into the worksheet. That will include the 1st row of the 2nd set of data.

So I’m (feverishly) typing the explanation part of the guide (the code is all done and tested). I’m telling you now because when I release this update, I’m going to increase the price. I calculated that with the amount of work done on writing the code and the explanation, and that I currently offer a 1-year free technical support together with the guide, I’m severely undercharging. Besides, the number of hours saved studying the guide versus learning from scratch is worth the price increase.

Since I currently give customers a free update within one year of purchase, you might want to get the guide now before the price goes up. Then you’ll get the current guide and do awesome stuff with it right now, and when I release the updated guide, you get it at no extra cost.

Depending on how fast I type (and how many cups of tea I imbibe), I might release the update in a week’s time or two. I don’t really have a deadline. But the latest release date will be around mid-May (because I’ll have to work on my magazine by then. But I digress…). But it might be sooner, because I want to get this off my todo list so I can work on something else.

Built-in styles for Excel Open XML

So a blog reader, Sebastien, once asked about built-in style numbers used in Open XML for Excel. I’m sorry to break the news to you. It’s not straightforward to use those built-in styles.

What are these built-in styles? Check these out:

Excel built-in styles

For instance, Sebastien was asking how to use the “Good” and “Bad” built-in styles. So the good news is, you can use those named styles. The bad news is, the various style effects (font colour, background colour, borders) aren’t automatically added in for you. From the Microsoft documentation of CellStyle:

This element represents the name and related formatting records for a named cell style in this workbook.

Annex H contains a listing of cellStyles whose corresponding formatting records are implied rather than explicitly saved in the file. In this case, a builtinId attribute is written on the cellStyle record, but no corresponding formatting records are written.

For all built-in cell styles, the builtinId determines the style, not the name. For all cell styles, Normal is applied by default.

Read the second paragraph again. It means even if you use the built-in style “Good”, you still need to create a style with the green background and green text font colour.

Since this is the case, I suggest you just stick to creating your own custom styles. It’s what’s going to happen anyway.

Now there is a magic number, 164. Your custom styles will start being numbered from 164 onwards. This implies there are 164 built-in styles (0-indexed). My search efforts only gave me a partial list.

Please refer to the ECMA-376 documentation for Office Open XML formats. Download the 1st edition, part 4. Inside, you’ll find a PDF. Turn to page 2135 (as of this writing. It’s page 2128 on the PDF itself though). You’ll find this:

Excel Open XML built-in style list

That list is probably that of “standard” styles. From reading the next few pages of the documentation, I believe most of those 164 built-in styles are due to internationalisation issues.

So my conclusion is, for the “standard” built-in styles you’re interested in, you still have to create the supporting style effects (font colour, background colour and the like). For the “non-standard” built-in styles, they are probably different based on the localisation of your Excel file. So you’re better off just creating your own custom styles, which I teach you how here and here. Or you can get my programming guide for detailed explanations and source code.