Clopen source

A few years ago, when I was working in a job, I attended a social media course. There was a training budget to use up, and I couldn’t find any technical courses worth attending. So I attended the social media one.

At one point in the one-day course, the trainer was showing the attendees how easy it was to create a Wikipedia entry. He showed the update history. He showed how to edit other Wikipedia entries.

Now I’m going to tell you about a book I read recently. Don’t worry, it’s related. The book is “Geekonomics” by David Rice. There’s a chapter on open source (specifically related to software development).

Rice wrote that the openness of the open source movement might also be its downfall. Because anyone can contribute to an open source project, anyone does.

He also wrote that open source project contributors are typically not paid (in the form of money). They contribute for geek cred. And if you don’t contribute anything, you don’t get any geek cred at all.

And so developers typically contribute to features the developers want themselves or that the features are cool, and not because the features are user-requested or even helpful to the project. I mean they’re not paid, so they might as well do something cool.

Remember, no contribution, no geek cred.

Now Wikipedia is successful because if any “amateur” goes in to create “useless” entries or update existing entries with wrong information, there’s someone else who’s willing to go in and change it. The long tail of contributors work here because the entries aren’t typically arcane. And that contributors are motivated enough to make Wikipedia better.

Open source software projects don’t work quite as well. Amateur developers add useless or unnecessary features. No one wants to go in and edit code. Because developers do it for geek cred, if you go in and delete their stuff, even if their stuff is unnecessary or possibly even detrimental, the original developer/contributor is going to be upset with you. Because you’re removing their geek cred.

This had put me in a bit of a quandary. For the purposes of this article, I’ll define “open source” as:

  • Source code is available
  • Licensing is such that users are able to study, change, redistribute the software and the source code

I’m not sure about the licensing part. Any license that qualifies as open-source is good enough for me.

And “closed source” will be defined as “not everything that has to do with the software is made available”. I’ll explain the meaning in a bit.

I made a spreadsheet library and made it open source. It’s open source because the source code is available for anyone to read. I’ve licensed it with the MIT License, which basically allows you to do whatever you want with very few restrictions.

It’s also closed, because I didn’t include the .csproj file and the strong name key file in the downloadable package. There are various reasons, but the main ones are that I intend to keep the branding and that there’s only one “original” SpreadsheetLight library running out there (mine). Anyone who forks my source code can compile it, but the resulting DLL won’t be the “original”, so to speak.

So I consider my project as clopen source. “Clopen” is a mathematical term used in topology to mean both open and closed. I’m not going to bore you with the details. Go Google the definition yourself.

The main point I had was that, while my project was open source, I don’t quite encourage open collaboration. Open collaboration is not a requirement for a project to be considered open source. And it is open source, because you can view all the source code.

My main motivation was that I had a specific vision for the project, which was to make the spreadsheet library as easy to use as possible for developers who are on a tight schedule and don’t have time to learn how to make spreadsheets with a third party library. This meant I had to maintain strict control over things like method signatures and even method/property/class names.

Can you imagine having just anyone coming in to change the source code? Or just adding a feature because they need it, but the intended developer audience doesn’t need it?

Not every contributor is interested in being aligned with the project’s vision and purpose.

Blind libraries

I have discovered that the software I’m working on right now falls into a particularly interesting category.

Pre-emptive note: I mean no disrespect for blind people or people with visual problems.

There are software that are basically machine-machine. These software programs talk almost exclusively with other programs with nary a human interaction. Stock software, scheduled financial server programs, batch data upload programs.

Then there are software that’s basically made for human interaction. Facebook, Twitter, mobile map software, image/video editing software.

My software sits somewhere between them. I’m going to use Microsoft Excel as the example because that’s what my software is related to.

Microsoft Excel is essentially a visual software. Sure you can enter data into cells and the spreadsheet is basically rows and cells of data. But it’s geared for a human to understand that data, and to interact with that data. Excel merely exists to facilitate that interaction.

Short digression: I was reading up on Excel user manuals (yes, I do that) and it turns out that the earliest spreadsheet software were effectively command-line. We’ve come a long way since then, huh?

My software is a spreadsheet library/component. Basically it allows a program to create and manipulate spreadsheets without any user interaction. More specifically, without any visual interaction.

That monthly revenue report you need? Have the data uploaded first (probably the machine-machine type of software doing this). Then use a spreadsheet library to create that report, styling rows/columns/cells or add a chart.

And then the final product, the spreadsheet itself is handed over to a human.

I want you to think about this for a second. The spreadsheet library is working blind.

Teaching the blind to paint

Imagine teaching a blind person to paint a tree.

The blind person has no idea what green or brown looks like. He doesn’t even know what a tree looks like. But you tell the blind person that this container has green pigment and that container has brown pigment.

Then you teach the blind person to hold the paintbrush this way, and apply the green pigment like that, and to have the paintbrush come into contact with the canvas in this way. And with practice, leaves start to appear. Or some semblance of it. You do the same with the tree trunks and branches, using the brown pigment.

The blind person still has no idea what an actual tree looks like, so he doesn’t even know if what he painted is accurate. But he finishes his painting and hands that painting off to someone who can actually judge the painting.

That’s what I feel my software is doing. This is especially true when I have to design function interfaces that allow a developer to do visual things.

Like insert an image.

Oh I can let you insert an image. No problem. Give me a file name.

Where do I position it? Oh.

With respect to what on the screen? Oh yeah, what screen?

How do I know if it’s large enough? Oh yeah, can’t see the image.

Some things are simpler to understand. Like “I want that cell to have a purple background.” I’ll let you assign a System.Drawing.Color structure, or assign a hexadecimal value, or a theme colour.

Keep in mind that the software doesn’t care if you assign #00FFFF or #FFFF00.

So that’s my experience with my spreadsheet library. I’m working on a word processing library right now, which is even more insane.

While spreadsheets have styles, word processing documents live on styles. Word processing software like Microsoft Word or LibreOffice Writer will be useless if they don’t allow the user to bold this word or italicise that sentence or underline that heading.

Final thought

Microsoft Office is about USD 300+, which is a visual software. The “blind libraries” commercial software cost about USD 999. And that’s the low end price.

Did you know that mine is open source and free? Check out SpreadsheetLight (there we go, self-promotion).

Action by reference

The son hits his finger and the father feels the pain, regardless of where the father is (or possibly when the father was/is… uh, what?).

But that’s the image I get from the source code I’ve been reading lately. And I’ve been reading a lot of source code (it’s for research). Here’s an example:

TheFather father = new TheFather();
TheSon son = father.Son;

father.FeelsPain = false;

son.HitFinger();

// I bet you nuggets to donuts that father.FeelsPain is now true
if (father.FeelsPain) Console.WriteLine("It hurts!");
else Console.WriteLine("Nahh, no biggie!");

Is there some coding paradigm I don’t know about?

Basically you have a parent class, and you get some child property/class from it. Then you manipulate the child property/class, changing some properties perhaps, or calling some of its functions.

And then the parent class automagically has the changes from its child property/class updated.

That’s the equivalent of NASA sending Curiosity to Mars and NASA automagically gets real-time footage from its rover. The rover doesn’t need to send the updates back home at all.

In the programming realm, this is similar to passing variables by reference instead of by value. Done within known and well-defined scope, this can be useful. Otherwise, they’re equivalent to global variables.

So what’s the background?

Reading the source code of other open source spreadsheet libraries (ahem), and using these libraries, I found this to be common:

var worksheet = workbook.Worksheets["Sheet1"];
// do something with worksheet

Then the workbook automagically have the changes on the worksheet updated into itself.

Then I went to read the example source code of commercial spreadsheet and word processing libraries (ahem), and this behaviour still held true.

Perhaps this makes sense to you. Personally, I find it confusing. I sent my son to get donuts. He seemed to have bought the donuts. But he didn’t report back. Should I call him? Shouldn’t he be home by now? Where are the donuts? What’s going on?

Now I’m hungry…

The sparrow is complete

It is a time of change. Some people give up and wait for changes to happen to them. Some people take action and make changes happen. Don’t be the former.

Previously, I summarised the recent developments in the online business world. Despite the opportunities available, it’s still easier to just sit back and do what you’ve always done.

There’s a quote from Twyla Tharp, a dance choreographer about creative blocks.

Do something. Anything.

The point is to get something done, even if it doesn’t specifically solve the problem you have right then. That gives you a sense of accomplishment, a feeling of progress, and you can channel that into doing more actions and eventually you’ll get to fixing your creative block.

Some people wait for the stars to align before doing anything. Some people wait for all the traffic lights to turn green before doing anything.

Sure your financial situation isn’t ideal. Sure your mother-in-law keeps breathing down you neck. Sure your work hours aren’t quite what you want. Sure you might like to have more space in your home.

But if you wait for everything to be perfect, you’ll never get to making that baby. Do you want to make babies or not?

There’s a Chinese saying that goes something like “The sparrow may be small, but it has all 5 major organs.” I don’t know what those internal organs are, probably heart, lungs, brain, kidney, liver?

The point is that the sparrow is complete, small as it is. It functions. It can fly, it can eat, and it can continue to make small baby sparrows.

In the business context, people almost always want to know more before starting. But the only real way to find out if you need to know something, is to start and then find out that you need it. Until you’ve done a task yourself and find it distasteful, you don’t really need to outsource it. Until you’ve answered customer questions, you don’t know how your customer service officers should behave.

Make a sparrow first. You can tweak it into an eagle later on.

Recent waves in online business world

By “recent”, I mean maybe up to the last couple of years or so. Let me start a little earlier than that.

When blogging became hip, there were programs (read: paid products) that teach you how to blog, how to write effectively, how to get your blog to be read.

And on the last note, website traffic became important. So there were programs (read: paid products) that teach you how to get traffic to your website. More importantly, how to get targeted traffic, because casual passers-by were next to useless for business purposes. Just look at all the traffic from Digg and StumbleUpon and Reddit and other social media sites. People come, look at your post, then leaves. That’s pretty much useless.

So creating an email list became imperative. You want to capture people’s email addresses so you can talk to them. If they sign up, then they want to hear from you. This is what Seth Godin would call permission marketing. But beware! There were some WordPress plugins that set annoying pop-ups that has a sign up box for people to put their email addresses. This pop-up happens either on finishing reading a post, or worse, on leaving a page. That would be “annoyance marketing”.

Then came teaching programs (read: pai… ok, you get the idea), that teach you how to teach a topic. The main one is Teaching Sells. The idea is that people will want to pay to learn something useful (and probably turn it into something profitable).

And on that note, videos were becoming popular, what with the increased bandwidth that most people have. And that some people like to see a person talking to them, instead of reading text or hearing audio files. So there was this product called Video Boss (I think). It teaches you (see previous paragraph) how to shoot, edit and upload a video. There were all sorts of information in that product, going so far as the minute details such as making your video visually interesting and lighting setups and so on.

Then there was the app craze, popularised by the iPhone. “Create apps. Become millionaire.” says some paid products (or to that effect anyway). If you’re a developer (which you probably are if you’re reading this blog), then be aware of what you’re creating. Create and sell apps if that’s your thing and that it’s working for you, not because someone says it’s the in thing.

Then there was the Kindle revolution, changing how people read. You can now self-publish on Amazon and push your ebooks out to millions of Kindles in the world. And make a bit of money from every ebook you sell.

The app thing and the Kindle thing have two things in common. They both relieve you of payment processing, and they both let you leverage an existing platform. Apple’s App Store for iPhone/iPad, Windows Store for Windows apps, Google Marketplace for Android devices, BlackBerry App World for Blackberry devices. And Kindle for well, Kindle devices.

Somewhere in those times, there was a need to know how to launch your product. I’m not talking about hype (or just hype anyway). I’m talking how to get sales from your product launch, how to get maximum impact. There’s this product called Product Launch Formula (by Jeff Walker) that teaches you how to do this.

I subscribe to many of these people’s email lists, so I get emails whenever whatever. Some are useful, some are interesting, some I just delete because it’s an obvious sales email (after you receive as many emails of such nature as I do, you can tell from the subject line or within a couple of sentences in).

There’s a point to all this. And I’ll tell you in the next post.

ToString() and cultural insensitivity

It was a scramble going through my entire code base. We’re talking thousands and thousands of lines of code here.

Luckily there was the Find function. And that the search string was fairly unique. And that the code change was fairly contained.

So what was I getting all hyped up about? Someone submitted a bug report telling me the SetRowHeight() function of my spreadsheet library didn’t work. What?!?! I tested that thing and it worked fine!

“What do you mean by it didn’t work? Was the row height changed to a wrong value, or something else” I asked.

“It completely failed. I can’t even open the Excel file.” was the answer.

Oh cranberry. Oh flying fishball hashbrown cranberry.

So I asked the person to send me the resulting Excel file that was generated. I looked through the contents, and found this: “40,2”

Shouldn’t that be “40.2”, I wonder.

And then I looked at the person’s name real close. I guessed the person’s somewhere in Europe that uses the comma as a decimal point character and the period as the thousands separator.

I proceeded to set every numeric variable that uses the ToString() function to have ToString(System.Globalization.CultureInfo.InvariantCulture). There were other changes I made based on this knowledge, but that ToString() was the main change.

Now this isn’t a post telling you about ToString() and thinking about how your program works in different cultures and so on (The biggest one I know is the Turkish “i”. That one seems to be a nightmare for upper and lower casing changes). The lesson I want to tell you is that no matter how much you check your software, the moment an actual user (in my case, a programmer) uses your software, shiitake mushrooms hit the fan.

You will only really know how good your software performs when other people are using it. You have no idea where they’re using it (like in my case), how they’re using it, or even why they’re using it. This means the sooner you put your software out there, the sooner you know whether your software is actually useful.

It doesn’t mean put out sloppy work. It just means you put out work that’s done to the best of your abilities, regardless of how small that feature list is. If it’s a bug tracking software, make sure it can track bugs easily. If it’s a shoot-em-up game, make sure I can run around easily and shoot enemies/monsters/aliens (fairly) accurately. If it’s a calculator, make sure it can at least add, subtract, multiply and divide before doing all those scientific and accounting functions.

So what software projects are you working on right now? What can you cut out so you can devote more energy to the core of your software?

Multi-personality classes

I’ve been working on my spreadsheet library and I discovered something. There are classes where I needed them to be multiple end results at the beginning of their lives. Let me illustrate.

Suppose we have an Embryo class. However, the way we use it is that we need it to survive all the way to the end of our program, and we happen to need the Testicles and Ovaries properties depending on how our program uses it. For example:

Embryo emb = new Embryo();
if (boy)
{
    emb.PrivatePart.Testicles.MakeStuff();
}
else
{
    emb.PrivatePart.Ovaries.MakeStuff();
}

The thing is, we don’t know beforehand whether we have a boy or girl until runtime. And even if we know beforehand that we have a boy or girl, we need a generic placeholder class that can represent either a boy or girl until some later time. Using the .NET object class and doing boxing/unboxing seems unproductive (no pun intended).

So what’s the actual situation? Chart axis.

The primary horizontal axis of an Excel chart can be either a category axis, date axis or value axis. Category axes are used for most cases, where the chart data’s categories are text. But if they’re dates, then date axes are used. And if they’re scatter charts, then the axis is a value axis.

Well, the primary horizontal axis is usually those 3 types. Bar charts have them at the primary vertical axis. *sigh* Don’t even get me started…

So the primary horizontal axis has to be all 3 at the same time. Well, the way I expose the property/class requires it to be all 3 at the same time.

“Why don’t you expose them individually? That would solve the ambiguity problem.”

I could do this:

SLChart chart = new SLChart("B2", "G6");
chart.PrimaryCategoryAxis.Title.Text = "A category title";
chart.PrimaryDateAxis.Title.Text = "A date title";
chart.PrimaryValueAxis.Title.Text = "A value title";

And then based on the type of axis used, I’ll use the different axis class. The thing is, I want to expose only one property instead of 3. This means programmers using my library don’t have to differentiate which axis class/property to use.

If I expose only 1 property/class, then that underlying class has to be all 3 types of axis at the same time.

There’s probably a design pattern I don’t know about. If you know it, or have comments on how to approach this, I’d love to hear it in the comments.

Now if you’re in the business of producing (hahaha… pun… never mind) Excel spreadsheets, try my library!