Specious spreadsheet security

If not for Chinese, it would’ve worked.

So in Excel, you can set a password for either protecting a particular worksheet, or protecting the entire workbook/spreadsheet. This password is then hashed, and the result is stored within the spreadsheet contents.

Now with the use of Open XML spreadsheets, this means the resulting hash is stored in “plain text” within XML files. Without going into too much detail, here’s the algorithm for the hash as documented in the Open XML SDK 2.0 help docs:

// Function Input:
//    szPassword: NULL-terminated C-style string
//    cchPassword: The number of characters in szPassword (not including the NULL terminator)
WORD GetPasswordHash(const CHAR *szPassword, int cchPassword) {
      WORD wPasswordHash;
      const CHAR *pch;
 
      wPasswordHash = 0;
 
      if (cchPassword > 0)
            {
            pch = &szPassword[cchPassword];
            while (pch-- != szPassword)
                  {
                  wPasswordHash = ((wPasswordHash >> 14) & 0x01) | ((wPasswordHash << 1) & 0x7fff);
                  wPasswordHash ^= *pch;
                  }
            wPasswordHash ^= (0x8000 | ('N' << 8) | 'K');
            }
      
      return(wPasswordHash);
}

This algorithm is wrong. Or at least it doesn't give the resulting hash that Excel produces.

Granted, I didn't really expect the algorithm to be correct. Because from the SDK help:

An example algorithm to hash the user input into the value stored is as follows:

It's an example algorithm, so it may or may not be the one that Excel actually use. However, for the purposes of usability, the password used by users have to be encrypted using Excel's algorithm, so we have to somehow get the resulting hash. Either we get the algorithm itself, or we simulate an algorithm such that the resulting hash matches that of Excel's.

Which is what Kohei Yoshida did. See his modified algorithm. This algorithm worked!

Given that I'm Chinese, I did what's natural: I used Chinese characters as the password.

And the modified algorithm failed. It only worked if the password consisted only of Latin alphabets. I tried Japanese characters. Failed too.

This is why I don't support password protection in SpreadsheetLight. I don't want to give the false impression that the worksheet/workbook encryption works. This is one feature where "partially worked" is unacceptable.

Granted Open XML spreadsheets also support other types of encryption, and you can store the name of the algorithm and salt value you used, and even the number of times the hash algorithm was run.

But.

This is in the context of a spreadsheet library.

What are spreadsheet libraries mostly used for? Automation.

Which means minimal (if at all) human interaction and intervention.

And so the question comes up.

Where do you store the password?

If a human is protecting the worksheet/workbook, she will provide the password herself, and then encrypts it (well Excel does it), and she just remembers the password.

If a spreadsheet library is generating the workbook, and encrypting it, the password has to be gotten from somewhere, right? So it's stored, maybe in a text file, maybe in a database, maybe hardcoded in code, or whatever.

The point being that the password is not held solely by a human being. And a computer hard drive is easier to hack into than a human brain.

And I will prefer not to be a party to facilitating insecure spreadsheet generation. Besides, it's Open XML. The data is supposed to be open and sharable. Password protection seems to be the opposite. Even Microsoft cautions the use of depending just on encrypted Excel files.

We already have social engineering used by devious people to deceive people into giving their passwords over. Storing passwords on a computer seems suicidal because computers have no common sense at all.

As a final word, I'd say using the Open XML SDK can be either verbose, or obscenely painstakingly verbose. Simple tasks need a couple of dozens of lines of code, and complex tasks take at least 2 magnitudes of work to do. For individualistic, compartmentalisable (that's not a word, right?) tasks, you can do it from scratch. Add even a smidgeon of complexity, and you'll find a library more useful. Try mine.

Sorry, comments are closed, but you can contact me directly if you like.