Clopen source

A few years ago, when I was working in a job, I attended a social media course. There was a training budget to use up, and I couldn’t find any technical courses worth attending. So I attended the social media one.

At one point in the one-day course, the trainer was showing the attendees how easy it was to create a Wikipedia entry. He showed the update history. He showed how to edit other Wikipedia entries.

Now I’m going to tell you about a book I read recently. Don’t worry, it’s related. The book is “Geekonomics” by David Rice. There’s a chapter on open source (specifically related to software development).

Rice wrote that the openness of the open source movement might also be its downfall. Because anyone can contribute to an open source project, anyone does.

He also wrote that open source project contributors are typically not paid (in the form of money). They contribute for geek cred. And if you don’t contribute anything, you don’t get any geek cred at all.

And so developers typically contribute to features the developers want themselves or that the features are cool, and not because the features are user-requested or even helpful to the project. I mean they’re not paid, so they might as well do something cool.

Remember, no contribution, no geek cred.

Now Wikipedia is successful because if any “amateur” goes in to create “useless” entries or update existing entries with wrong information, there’s someone else who’s willing to go in and change it. The long tail of contributors work here because the entries aren’t typically arcane. And that contributors are motivated enough to make Wikipedia better.

Open source software projects don’t work quite as well. Amateur developers add useless or unnecessary features. No one wants to go in and edit code. Because developers do it for geek cred, if you go in and delete their stuff, even if their stuff is unnecessary or possibly even detrimental, the original developer/contributor is going to be upset with you. Because you’re removing their geek cred.

This had put me in a bit of a quandary. For the purposes of this article, I’ll define “open source” as:

  • Source code is available
  • Licensing is such that users are able to study, change, redistribute the software and the source code

I’m not sure about the licensing part. Any license that qualifies as open-source is good enough for me.

And “closed source” will be defined as “not everything that has to do with the software is made available”. I’ll explain the meaning in a bit.

I made a spreadsheet library and made it open source. It’s open source because the source code is available for anyone to read. I’ve licensed it with the MIT License, which basically allows you to do whatever you want with very few restrictions.

It’s also closed, because I didn’t include the .csproj file and the strong name key file in the downloadable package. There are various reasons, but the main ones are that I intend to keep the branding and that there’s only one “original” SpreadsheetLight library running out there (mine). Anyone who forks my source code can compile it, but the resulting DLL won’t be the “original”, so to speak.

So I consider my project as clopen source. “Clopen” is a mathematical term used in topology to mean both open and closed. I’m not going to bore you with the details. Go Google the definition yourself.

The main point I had was that, while my project was open source, I don’t quite encourage open collaboration. Open collaboration is not a requirement for a project to be considered open source. And it is open source, because you can view all the source code.

My main motivation was that I had a specific vision for the project, which was to make the spreadsheet library as easy to use as possible for developers who are on a tight schedule and don’t have time to learn how to make spreadsheets with a third party library. This meant I had to maintain strict control over things like method signatures and even method/property/class names.

Can you imagine having just anyone coming in to change the source code? Or just adding a feature because they need it, but the intended developer audience doesn’t need it?

Not every contributor is interested in being aligned with the project’s vision and purpose.

Future of spreadsheets

So I was checking some site statistics for my spreadsheet library website and I found an interesting search phrase: “open xml excel future”. The person might have been checking how important spreadsheets are in the future. Or maybe just checking if Microsoft Excel is still around.

Spreadsheets are still going to be used in future. This is based on my experience working in a medium-large telecommunications company, and based on the companies who bought my Open XML spreadsheet reference. Before I go on, let me tell you about an incident.

I went on Facebook and asked my friends to tell everyone they know of my spreadsheet library (SpreadsheetLight). One person commented that his friends are all hackers and don’t use spreadsheets. I will assume “hackers” to mean “programmers with higher than average programming skills” and “don’t use spreadsheets” to mean “hardly ever thought about them”.

Software is meant to be used by people. Usually “normal” people, not “hacker” people.

PayPal allows you to download transactional data. You can download it in CSV format. What do you do to see it clearer? You paste it into Excel. Or whatever spreadsheet software you use.

I’ve worked with satellite transactional data. There are lots of data fields. Source of transmission, destination of transmission, length of transmission, type of data. Why does this matter? Because the satellite companies charge the transactions to the customers. Even the source and destination is important (think of calling from America to China, and from Singapore to China. Wait, you might be too young to know about the overseas call charges…).

And the best way to view satellite transactional data is on a grid-like display. A spreadsheet.

So here’s a short rant. If you are a highfalutin coder who thinks you’re better than anyone, even another programmer who’s using some “inferior” or “non-open source” language or platform, get off your high tower. The only people who care about your code are other programmers. The normal people don’t care a flying fishball about your code.

I’m practical. I use whatever tools I have to do what I need to do. So far, C# is great. And there are people who don’t like the .NET Framework simply on principle that it’s from Microsoft.

I’ve never really understood the tension between the open source software people and the “closed” source software (aka enterprises and large companies) people. In particular, Microsoft seems to bear the biggest brunt of the force whereas Apple (the supposed “opposite”) is enshrined, despite the various salient restrictions Apple imposes (App Store apps approval process, closed system, proprietary hardware [iPhone, iPad]).

If I didn’t know better, I’d say some programmers hate Microsoft because Microsoft makes a lot of money. Oh there are specific reasons like closed proprietary systems, or crushing small competitors by buying them out, or whatever. I’d say the underlying cause of their distress is that Microsoft makes lots of money. And if I didn’t know better, I’d say these programmers find money a repulsive thing.

Views on money is a separate topic. My short version is: money allows you to be more of who you are. Money gives you more options. Money is not inherently evil.

Now Microsoft introduced the Open XML specifications, with the creation of the Open XML SDK for .NET Framework. People have criticised the Open XML specs, whether it’s “yet another standard” or “it’s from evil Microsoft!”.

The Open XML specs have their flaws. I should know because I’ve spent hours studying them and poring over the details. From my research, the specs are created to conform to the existing Microsoft Office software (Word, Excel and PowerPoint). If it’s in the Office software, the specs reflect it.

You can’t blame Microsoft. It’s a good move. It also made the specs very complicated. The reason is that the Office software is the accumulated knowledge and code base from years of testing and marketing.

You know what’s the most common thing to ask when creating Open XML documents? “How to create a simple Word document?” or “How to create a simple Excel spreadsheet?” Because the complexity made creating even an empty file a big pain in the derriere.

So what’s the future, really?

Some kind of standard would be used. Possibly the Open XML standard. Google Docs and Apple’s Number and other spreadsheet software will adopt it, or at least support it.

Spreadsheets are going to be around because companies and businesses use them. Even the small ones. I don’t care if you’re a one-man startup. You will eventually find a need to look at data in a grid-like pattern. That’s when you’ll use spreadsheets.

Businesses who hire programmers pay those programmers. And businesses use spreadsheets. That means somewhere along the line, you as a programmer will have to work with spreadsheets. Maybe not using spreadsheets like your users, but create them for your users.

Your users need monthly sales reports. You’re tasked to dump the database data into a spreadsheet that they can actually use. You don’t expect them to write SQL statements right? (If you do, why are you being paid?)

However, I don’t expect spreadsheets to be very complicated. Tabular data, with heading text. Simple charts (although I’m revising my view on charts. Even creating a simple bar chart with Open XML SDK is giving me a headache due to the many various options and settings available on the Excel UI). Simple styling (too much styling and your spreadsheet looks untidy and won’t be useful).

This means the full Open XML specs won’t have to be implemented. My gut guess is that at least 50% of the specs won’t be used for the simpler spreadsheets.

And in the future, simpler spreadsheets are probably going to be more common. You need to see and understand the data fast. The fancy styling just makes it harder to do that.