Nobody hires a dodo hunter

My mom has a Vietnamese colleague with a law degree. Apparently, it’s more lucrative to sell cookware in Singapore than practise law in Vietnam. White collar jobs, welcome to 2011.

There’s an article in the Wall Street Journal, “China to Cancel College Majors That Don’t Pay”. China is tackling the problem of jobless graduates in her country. This is the start of the nightmare I wrote about over a year ago on education:

They [the universities] might go create more graduates who make higher salaries. What might those be? Those academic fields where the economy pays well for, for example, medicine, law, accountancy, banking, biotechnology and computer science. The arts and philosophy majors are doomed, I tell ya. The education syllabus might well be skewed towards commercially profitable disciplines.

China is at least thinking about it.

A nation-wide purge of university majors that don’t pay means you’re essentially specialising. Individually, a university might use that as a hook, such as offering excellent biotechnology classes taught by world-renowned people in those fields. Nationally, it will be a disaster.

How do you determine which majors don’t pay? The implicit assumption is you know which majors don’t pay now and in the future. The implicit assumption is that you know what’s going to happen next. You don’t.

When radio was invented, people thought nobody would pay for advertising, since it’s a broadcast medium to nobody in particular (anyone can listen in).

When the telephone was invented, people thought face-to-face communications would die. We still value face-to-face communications now. Never mind the teenage girl who texts 563 messages a day (though I’m sure she still wants to meet up with her friends. Those messages are probably “Meet where?”, “K” (the short form of OK), and “lol”).

When the television was invented, people thought it was ugly. Black and white? Who’d watch?

When the Internet was invented, nobody thought it’d be a commercially viable medium. Look at all the online stores now.

When music could be digitised, people started sharing MP3s. Music labels sued their customers. They lost money. Apple iTunes is doing fine though.

When Amazon was started, it was to be an online bookstore. The major bookstores didn’t think it would work. They’re now in financial trouble.

It takes an average of 4 years to graduate with a degree. A lot can happen in 4 years. By the time you graduate with a PhD in ornithology specialising in dodos, nobody is hiring a dodo hunter. The job is no longer relevant…

… but it doesn’t mean you’re irrelevant. Adapt your skills. Become an exotic bird care specialist.

Let’s say China purges all non-manufacturing related majors. That means most of her graduates know only manufacturing related stuff. If the economy suddenly rewards creativity-based knowledge work, China will be struggling to move. Remember, it takes 4 years to churn out graduates. You’ll be 4 years behind.

Hmm? China’s too big? The dinosaurs thought they’d live forever too. A meteor wiped them out. Doesn’t matter how big you are. A big enough meteor will still wipe you out. You may quote me. Hey, let me help you:

Doesn’t matter how big you are. A big enough meteor will still wipe you out.
– Vincent

A university shouldn’t model itself on Amazon. You should not offer long-tail majors. You can’t afford to. The proliferation of majors is probably to attract as many students as possible.

Nobody hires a dodo hunter.

Code like a rockstar

There’s a new course available from Polymath Lectures called “Code Like a Rockstar”. Here’s an excerpt from the course description:

Taught by a successful Google Software Engineer and Computer Science Ph. D., this 5-session online masterclass will teach you expert-level coding techniques and practices which will get your code noticed by companies such as Facebook, Google, Apple, and Microsoft. Acquired over years of writing amazingly stable software at unimaginable scale and complexity, the tips in this course go well beyond the techniques taught in a typical software engineering program.

I think the instructor Michael Barnathan is kinda cool. But you’re welcome to go find out more and make up your own mind. And I don’t get a penny out of this if you sign up.

The course starts on January 7 2012, so you have some time to decide.

American hare, Asian tortoise

I’ve been meaning to get a drink from the cafe within the library for a while. It’s exam period, and all the seats were taken. But I finally got a chance to sit. I got myself a “Peach Dream”, a smoothie with peach flavour I think.

I quickly sat down at one of the tables that a lady graciously shared with me. Her friend soon returned with their drinks. I was just happily sipping my smoothie, watching a man on his laptop, one girl slumped on the table with her books, and listening to a mother reading a book to her daughter.

The two ladies at my table began talking.

“Did you know her son got 58 out of 60?”
“Really?”
“Her son is already so clever. But he’s still getting tuition.”
“But he’s so clever! He might get 50 even without tuition.”
“We don’t know if it’s because he has tuition, that’s why he got 58.”

That was a primary school science test. Hey I’m not eavesdropping. I just happened to overhear their overshared conversation.

Believing you can improve by putting in effort

There was a study that divided people by their beliefs about learning and intelligence. One group believed that intelligence is fixed, therefore if they don’t know something, they’re doomed to never learn how to do it properly. The other group believed that intelligence is malleable. If they put effort into learning, eventually they’ll get the hang of it.

The first group didn’t care what the answer was, only whether they’re correct or not. They didn’t care to learn how the answer came about. The second group cared more about why an answer was so.

When the 2 groups were tested again, the researchers found that the second group improved significantly. The first group didn’t do any better or worse.

I’m going to generalise here. Asians typically believe that if you put effort into something, you can improve. Be it maths, science, English, Chinese, whatever. That’s why here in Singapore, parents hire tuition teachers for their children, even if their children have phenomenal grades in school. (Also see PISA).

I didn’t have any tuition teachers after primary four (age 10). Not because I’m smart, but because my dad couldn’t afford it. Good thing I turned out alright…

Another general trait of Asians is that we save. Money that is. We’re brought up with the concept of saving money for a rainy day.

The hare and the tortoise

I read this book by former British Prime Minister Gordon Brown called Beyond the Crash. He brought up some concepts I’ve learnt about the global economy and politics.

America and Europe lead the world in terms of consumption. It’s worked so far because they also produced as much (as in exports). Their production brought in enough money for them to consume. They’ve raced ahead and amassed much wealth.

Like the hare, they’ve grown comfortable and stopped (more or less).

Globalisation allowed the other countries to come to the fore: the BRIC countries (Brazil, Russia, India, China), Indonesia and the Philippines.

American (and European?) jobs flowed to other countries. First the Baby Boomer generation is slowly retiring, leaving a mass number of jobs for the smaller group of Generation X-ers who cannot fill them. Then globalisation killed those jobs, and the current Generation X-ers and Y-ers can’t find jobs.

The subprime housing situation created more turmoil. The recent bank crisis instilled fear and distrust. University tuition fees went up as people sought to get a Master’s degree in the tight job market. (Just for info, I’ve read there’s an “education bubble” going on).

America just averted a $14 trillion debt ceiling problem. Greece has a financial problem. Europe faces a sovereign debt problem. Their aging population doesn’t have enough people to take care of them, financially speaking (where do you think taxes go to?).

And the tortoises started to catch up.

Education

I’ve read an economist praising the education system of Singapore. I must admit, I was surprised. Then he (can’t remember who or what book I was reading. Sorry…) pointed out that in America, teachers with average graduating scores are dumped in “second-rate” schools without training. In Singapore, the Ministry of Education chooses the best teachers, and provides them with training. I think it was 2 out of 10 applicants who got in. The Singapore government takes education very seriously.

Barack Obama has stated he’s taking America’s education seriously. As far as the future is concerned, I believe maths and science to be crucial. We’re going to need engineers, mathematicians, doctors, physicists, chemists, biologists and more to tackle the health care of our aging population, create a sustainable Earth, and understand and make use of any future technologies.

Global commerce

Here’s something you should know. To get money, you have to sell something in exchange. I don’t care if it’s an apple, an iPad, television shows, movies, your body, real estate, knowledge (information). Even if it’s just a 250 by 250 pixel ad on your web page. You have to sell something.

America and Europe produced enough for domestic and international consumption. As a result, they grew. Then globalisation came. Their production dropped (because that production went to other countries as jobs). You produce less, but your consumption rate remains. You should see the problem, right? Then their domestic consumption even increased (think rampant credit card use).

Here’s the catch. China (seems to be the biggest blamee, though there are others) is exporting more stuff, and America (and Europe?) is buying. China buys up raw materials from other countries, manufactures products, and sells them.

What you should realise is that China has a small domestic consumption (remember Asians extol saving as a virtue, so we buy and consume less). Contrast that with China’s growing export business, you should see how China is growing in strength. But this depends on other countries buying their stuff *cough America cough*. China’s growth comes mainly from exports and China’s biggest worry is that people stop buying their exports.

The rebalancing

There was a time when the outsourcing/offshoring thing was a craze. Do you know how supply and demand work? As jobs went to India, China and the Philippines because it’s cheaper, those jobs started becoming more expensive as the workers wanted better pay. It might still be cheaper to outsource/offshore, but it doesn’t always make a big financial impact on the bottom line.

You know this oil thing we need? It’s getting more expensive as it becomes scarcer. We need to find alternative energy solutions soon. See education above. Where are the people we need to solve this problem? (They aren’t motivated enough to learn, and they’re watching cat videos on YouTube).

You know what more expensive energy means? Transportation is going to get costlier. Getting a product to be manufactured in China, then assembled in Mexico, then shipped to America is going to be financially inadvisable.

You know what that means? Jobs are going to start flowing back (to wherever they came from).

You know what? There are millions of jobless young people who are willing to do those jobs.

But you need to be willing to train them. Specialisation cannot be your focus. Remember, these people just graduated from school. You won’t find a person who fits the job of a managerial post with an emphasis on information technology.

Get that graduate (who has a bundle of joyful energy) with the MBA. Train him/her on your business with information technology.

Get that programmer who did a bunch of software projects. Teach him/her about your special accounting software business.

Export more bits than atoms

I read that a Singapore minister (can’t remember who. You should know by now I have a terrible memory for these things…) said that Singapore’s economic concern should still be to focus on manufacturing. I believe he’s referring to material goods.

I’m going to ask you a question. With the climate concern now, and that our landfills are starting to fill with our waste at a rate that’s slightly alarming, and that raw materials are getting costlier to shovel around, should you still export physical goods?

That’s still going to be a viable business. I mean, I still see people queuing up to buy the latest iPhone 4S, and texting on their perfectly working iPhone 4 (I still use an iPhone 3G, which Apple doesn’t even support anymore).

Remember the outsourcing/offshoring thing? There were 2 kinds of jobs: the physical creation of a product, and the intangible stuff. China does manufacturing. India does call centres.

As people become more aware of what they buy and consume, I see people having fewer material goods. The modern cell phone allows you to play games, organise your calendar, keep todo lists, take photos, capture videos, record audio, browse the Internet and make phone calls.

Even if that’s not the case, there’s a physical limit to how much you can export (and thus sell and thus make money). So sell your skills and knowledge. Teach people stuff. Offer something that’s not so easily replaceable by another person in another country.

Africa poised as untapped and trapped consumer base

Africa is like the poster child for poverty. The continent has a large population, but most of its people are struggling with where the next meal is coming from.

As China and India got more of their people out of poverty (due to globalisation in part), their people started buying stuff.

India is an interesting case. As her economy improved, so did her domestic consumption. In a sense, India is more “stable” than China in terms of growth.

The point is that Africa has a large population who most probably cannot and will not buy your products and services. They’re too busy dealing with AIDS, malaria and hunger. Not only that, it represents a huge number of people who cannot contribute to the world.

A human mind is a terrible thing to waste.

Finishing line

That was a lot to write.

So in case you skipped the whole shebang above, here’s the moral: Consume less (with more intelligence), raise education, and help other people.

First look at XML Studio

I’ve been working with XML files for a while (if you’ve been reading my blog for the past few months, you’re probably sick of the XML-related stuff…). Specifically with Open XML. While I don’t always read and write XML files, I do refer to the Open XML ECMA-376 documentation and the Open XML SDK help file a lot. And then, I go look at some XML files, just to check that I wrote them correctly.

I recently found out about XML Studio from Liquid Technologies. Disclaimer: I was contacted by a company representative, and given a free developer license for the software. But go check out their software if you’re doing XML-related stuff.

I blazed through the list of features and benefits, and settled on one. Oh my fishballnoodles they can generate C# source code from XML files! It uses the XML Data Binder.

So my first thought was: Can I use it to somehow generate source code that’s (sort-of) compatible with Open XML SDK?

Short answer: No. But that’s because the XML files don’t have the Open XML SDK class names in them, so you can’t really have source code working with the SDK.

However, my next thought was: Can I at least generate an XML file that would have been generated by the equivalent source code using the SDK?

First, I loaded an XML file from an Open XML spreadsheet (after renaming .xlsx to .zip and then unzipping and then getting one of ’em darn sheet.xml files). Then I found out that I couldn’t generate source code from this. *sad*

But I found out I could generate an XML schema from the XML file. Ohhkayy… Then I found out that generating source code required an XSD, an XDR or a DTD file. Alright, getting there.

Then I thought I could create a worksheet with some typical data so that I could grab the resulting XML file with some of the possible data types, which I could then use to generate a corresponding XSD schema file, and then generate corresponding source code. Note the recursive problem solving ability of my programmer mind.

And then it hit me that I could just use the correct schema file from ECMA-376. So I went to the second edition of ECMA-376 (latest is third edition as of this writing but not currently super-supported yet), and went to the folder for part 1 (there are parts 1 to 4). Which has this very descriptive name of “ECMA-376, Second Edition, Part 1 – Fundamentals And Markup Language Reference”. Under this folder, there’s a zip file called “OfficeOpenXML-XMLSchema-Strict.zip”. And in that zip file is the motherlode of your schema dreams.

And so I opened up the schema file related to the Worksheet class of the SDK (which is sml.xsd). And got this:
[Image: XML Studio Schema View]

I’ve expanded the node for the Cell class. That’s awesome. You see that “0..1” to the left of CT_CellFormula? That means a Cell class can contain 0 to 1 of the complex type (see “CT_” prefix) CellFormula. For nodes that take at least 1 and up to an unlimited number of child nodes, you get “1..*”. This is reflected in the schema as minOccurs="0" for “you don’t really need it” and maxOccurs="unbounded" for “you just have as many children as you want, ok? But make sure they’re of this type.”
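To make the “0..1” and “1..*” labels a bit more concrete, here’s a rough C# sketch that reads minOccurs and maxOccurs from a schema fragment and prints the same kind of label. The fragment is hand-written and simplified for illustration, not copied from sml.xsd, and the CT_Cols entry is just my assumed example of a “1..*” node. The class and method names (OccurrenceDemo, Describe) are made up too.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class OccurrenceDemo
{
    // Hand-written, simplified fragment for illustration only;
    // it is NOT copied verbatim from sml.xsd.
    private const string Fragment = @"
<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'>
  <xsd:complexType name='CT_Cell'>
    <xsd:sequence>
      <xsd:element name='f' type='CT_CellFormula' minOccurs='0' maxOccurs='1'/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name='CT_Cols'>
    <xsd:sequence>
      <xsd:element name='col' type='CT_Col' minOccurs='1' maxOccurs='unbounded'/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>";

    // Returns one "min..max type" label per xsd:element in the fragment.
    public static List<string> Describe()
    {
        XNamespace xsd = "http://www.w3.org/2001/XMLSchema";
        List<string> labels = new List<string>();
        foreach (XElement element in XDocument.Parse(Fragment).Descendants(xsd + "element"))
        {
            // In XML Schema, both attributes default to 1 when omitted.
            string min = (string)element.Attribute("minOccurs") ?? "1";
            string max = (string)element.Attribute("maxOccurs") ?? "1";
            if (max == "unbounded") max = "*";
            labels.Add(min + ".." + max + " " + (string)element.Attribute("type"));
        }
        return labels;
    }
}
```

Calling OccurrenceDemo.Describe() on this fragment gives one label for the formula element and one for the column element, in the same style as the schema view.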

As of this writing, I still haven’t managed to generate source code that does what I want (after a few hours of wheedling code around). But essentially, I’m trying to create an alternate Open XML SDK just from the schema information from ECMA-376. I’m pretty sure XML Studio wasn’t created with this in mind… I’ll keep you posted on my findings. If you have any XML editing stuff you think I should know, tell me, because I want to see if I can break, uh, I mean utilise XML Studio to its full potential.

Calculating Excel spreadsheet column names

I’ve been working with Open XML spreadsheets for the past, I don’t know how long… A year? I just realised that getting that Excel column header name is a frequent task. You know, given that it’s the 4th column, it’s “D”. I don’t work frequently with spreadsheets with lots of columns. So it was interesting that the 26th column is “Z” and the 27th column becomes “AA”. Basically, base-26 arithmetic, using the 26 letters of the English alphabet as tokens.

There are probably lots of code snippets out there showing you how to calculate a column name given the column index. Here’s mine:

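// Pre-calculate every column name once, from "A" up to the 16384th column.
string[] saExcelColumnHeaderNames = new string[16384];
string[] sa = new string[] { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
string s = string.Empty;
// k is the last letter, j the one before it, i the one before that.
// -1 means that letter isn't in use yet.
int i, j, k, l;
i = j = k = -1;
for (l = 0; l < 16384; ++l)
{
    s = string.Empty;
    // Advance the last letter, and carry to the left after "Z".
    ++k;
    if (k == 26)
    {
        k = 0;
        ++j;
        if (j == 26)
        {
            j = 0;
            ++i;
        }
    }
    // Build the name from whichever letters are in use.
    if (i >= 0) s += sa[i];
    if (j >= 0) s += sa[j];
    if (k >= 0) s += sa[k];
    saExcelColumnHeaderNames[l] = s;
}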

That gives you a zero-based indexing version. So to get the 30th column name, you use saExcelColumnHeaderNames[29].

In case you’re wondering, 16384 is the maximum number of columns supported by Excel 2010.

You will notice that it’s not a function given the column index. I find that not as useful. Look, typically when you need the column name, you probably also need to get it frequently, usually with different parameters.

What I did was to store all the calculation results in a string array. Then you reference it with an index. A function that calculates the name on demand loops once per letter in the name, so each call costs roughly O(log n) for column index n. That’s cheap, but if you call it for every cell you touch, you repeat the same work over and over.

My method does all that work once, up front, which is O(n) for n columns. After that, referencing a string array is I think an O(1), meaning it takes constant time. I’ve never been good with big O notation…
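For comparison, here’s a rough sketch of what the function version might look like. It’s not taken from any particular snippet out there. The names (ColumnNames, FromIndex) are made up, and I’m assuming a one-based column index, so the 1st column is “A”.

```csharp
using System;

public static class ColumnNames
{
    // Hypothetical helper: one-based column index -> "A", "Z", "AA", ...
    public static string FromIndex(int columnIndex)
    {
        if (columnIndex < 1) throw new ArgumentOutOfRangeException("columnIndex");
        string name = string.Empty;
        while (columnIndex > 0)
        {
            // There is no "zero" letter, so shift by one before dividing.
            int remainder = (columnIndex - 1) % 26;
            name = (char)('A' + remainder) + name;
            columnIndex = (columnIndex - 1) / 26;
        }
        return name;
    }
}
```

The loop runs once per letter in the name, so it’s at most 3 rounds for anything up to the 16384th column. Fine for a one-off lookup. If you need names all over the place, the array version saves you from redoing that loop every time.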

This style of solving the problem is called pre-calculation. Pre-calculation is especially useful in games development, where speed is important. For example, selected values of sine and cosine were pre-calculated and stored in arrays, for use in the numerous 3D/2D calculations in games. Calculating sines and cosines in real time was detrimental to a speedy game.

That’s not as useful now because you need a fuller range of floating point values as input. But the concept is still useful.
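To show the sine idea in code, here’s a minimal sketch of such a table, assuming whole-degree steps are good enough for your purposes. The names (SineTable, Sin) are made up for illustration, and real games of that era probably used fixed-point values and different step sizes.

```csharp
using System;

public static class SineTable
{
    private const int Steps = 360;

    // Filled once, the first time the class is used.
    private static readonly double[] Values = Build();

    private static double[] Build()
    {
        double[] table = new double[Steps];
        for (int degrees = 0; degrees < Steps; ++degrees)
        {
            table[degrees] = Math.Sin(degrees * Math.PI / 180.0);
        }
        return table;
    }

    // Looks up the sine of a whole number of degrees; any integer is accepted.
    public static double Sin(int degrees)
    {
        // Wrap into 0..359, handling negative angles too.
        int index = ((degrees % Steps) + Steps) % Steps;
        return Values[index];
    }
}
```

The trade-off is right there in the method signature: you only get whole degrees. That’s the “fuller range of floating point values” problem. But every call after the first is just an array lookup.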

I think I read somewhere (while I was doing hobbyist game development) this quote:

Pre-calculate everything!

Maybe computers are now much faster. I don’t care. That doesn’t give you an excuse to be sloppy. It’s an optimisation that doesn’t take much effort.

If you need to calculate it, see if you can calculate it just once.

Decompiling Open XML spreadsheets

Ok, I’m going to reveal the big secret project that I’ve been working on for the last 2 months. I’m writing a software program that will decompile Open XML spreadsheets into C# and VB.NET source code.

Now I know what you’re thinking. “But Vincent, there’s that SDK Productivity Tool that does that already!”

Frankly, when I started the project, I didn’t even think about the SDK tool. But, when I looked at the generated source code from the SDK tool, I found it… hideous. There were 2 things I found annoying:

  • New classes were created willy-nilly
  • Properties were dumped into class instantiation using object initialisers

The first point meant that most of the classes were created one-off. It didn’t matter if you needed a class of type SomeClass multiple times. The SDK tool simply created another class of type SomeClass. If that class type was used multiple times, you’ll see variables named someClass1, someClass2 all the way to someClass21. It’s why I wrote about multiple use variables versus multiple variables.

The second point meant that if a class has many properties, you might end up with something like:

CellFormat cellFormat3 = new CellFormat(){ NumberFormatId = (UInt32Value)0U, FontId = (UInt32Value)10U, FillId = (UInt32Value)9U, BorderId = (UInt32Value)0U, ApplyNumberFormat = false, ApplyBorder = false, ApplyAlignment = false, ApplyProtection = false };

That’s one line of code.

The problem I have with object initialisers is when you need to comment something out in between. A line comment in C# and VB.NET comments out the rest of the line, although C# also offers the /* comment */ variant. When everything sits on one line, there’s just no easy way to do so. Compare with this:

cellFormat = new CellFormat();
cellFormat.NumberFormatId = 0U;
cellFormat.FontId = 11U;
cellFormat.FillId = 10U;
cellFormat.BorderId = 0U;
cellFormat.ApplyNumberFormat = false;
cellFormat.ApplyBorder = false;
cellFormat.ApplyAlignment = false;
cellFormat.ApplyProtection = false;

I just find that easier to pick and choose stuff I don’t want.

Now the big advantage (my differentiation or unique selling proposition) is that I offer VB.NET too. The SDK tool doesn’t. Here’s a snippet:

run = New Run()
run.RunProperties = New RunProperties()
run.RunProperties.Append(New FontSize() With {.Val = 11R})
clr = New Color()
clr.Theme = 1UI
run.RunProperties.Append(clr)
run.RunProperties.Append(New RunFont() With {.Val = "Calibri"})
run.RunProperties.Append(New FontFamily() With {.Val = 2})
run.RunProperties.Append(New FontScheme() With {.Val = FontSchemeValues.Minor})

You will notice that I do use object initialisers. “That’s hypocritical of you!” Perhaps, but I use them when the number of properties is small. I’ve kept it to 3 for now. Object initialisers in my case also meant I don’t have to declare and instantiate new classes with actual variable names.

I understand why the SDK tool generates source code the way it does. It has to do with completely iterating through every single part and class of the root class SpreadsheetDocument. If you’ve ever written code to traverse a tree structure, you’ll know how tedious it can be.

The one thing the SDK tool lacks about the source code it generates is context. It runs through the entire Open XML document structure like a squirrel looking for every single acorn on a tree. It doesn’t stop to check any acorn for size, defects or even if it’s an acorn. Look, winter’s coming soon, and the squirrel doesn’t have all day telling you that this particular acorn is related to that particular acorn, and no it doesn’t care how big the acorn is, it’s got the teeth to eat it, ok?

Why are we talking about squirrels again?

So, after about 20 thousand lines of code, I’m just barely getting my software into beta mode. Halfway through that, my heart sank with the enormity of the task. In order to generate more readable code, I cannot iterate through the XML tree structure like the SDK tool. I had to stop and make sense of what the class was.

That made me look at the SDK help file and the ECMA-376 specification file way too much… Did you know the ECMA spec is like over 5000 pages long? And that’s part 1. Parts 2, 3 and 4 are smaller, but still heavyweights in their own right. And there are so many classes and child classes and grandchildren classes and properties and…

I’m going to at least make a valiant effort to have the software self-complete on a subset of the Excel functionality (and thus a subset of the SDK). If you’re interested, I present to you SoxDecompiler. As of this writing, I’m just trying to see if people are interested in the software, so it’s just a page to collect email addresses of the people interested in the software. I think I wrote “interested” way too many times…

For some reason, the name conjures an image of a thread slowly unravelling a sock. But I like it. It stands for “Spreadsheet Open XML Decompiler”.

Online business mentorship

In case you missed it, I’m offering to help you with starting your own small online business. You don’t have to pay for it, at least not with money.

Basically, you help me write a few articles, or contribute in whatever form you can (I’m flexible). And in return, I help you to get your own online business off the ground. You don’t have to pay me anything, and I don’t give you financial assistance.

And whatever profits your business generates, you get to keep. All of them. I don’t get a single cent.

In terms of financial reward, that’s a terrible deal for me. Why the hashbrown would I do it? Because I want people to feel more in control of their lives, particularly in these rough economic times. Because I want to create more entrepreneurs and self-starters. A small online business on the side can generate enough cash flow to help with monthly food and/or bills. That’s immensely helpful.

Because my time is limited, I cannot help too many people. So the maximum will be 3. I’ve already had one person interested. If you’re interested in working with me, contact me.