Debug like a CSI

It is a morning like any other. You’ve got your tunes, you’ve got no one around in the office yet, and you are cranking out code like you’re on fire. Until some industrious user calls you and drags you out of your zone. You smile and nod as you listen to her explain that there’s been a tragedy: a terrible program error has occurred, and she desperately needs your help. Time to debug like a CSI.

Document the scene
You ask her what exactly happened and you listen attentively as you jot down notes of her concerns, providing gentle affirmations and guiding her to the answers you seek. She’s practically in tears and just manages to give you the name of the program she was using and a brief description of the error itself. Then you ask her to provide you with the most powerful piece of evidence she has: the screenshot of the error.

Gather evidence
So you’ve got an image of the error, and whilst you have every confidence in her statement, sometimes users just don’t see the difference between “An error occured” and “A Error occurred”. Her statement is probably something short like “I click Save button and there’s the error!”, so you don’t really have much to go on. She also probably sent you a 3 megabyte attachment of the screenshot and practically blew up your inbox. You note down the exact error message, and from her statement and the screenshot, identify the program responsible for making her life (and yours) miserable.

Evidence analysis
You open up the program code and use the Find function of the editor to track down the places where the exact error message appears. You also retrace the steps she took to get that error. Hopefully, the error is documented in the code. If it’s a system error, then you’ll have to rely on her statement to pinpoint the exact location of the offending line of code.
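If the program is spread over more files than you care to open one by one, you can let a small script do the Find for you. Here is a minimal sketch in Python. The function name find_message and the list of file extensions are made up for illustration, so swap in whatever your own code base uses. It matches the message exactly, case and all, which is the whole reason you copied it from the screenshot instead of trusting anyone's memory.

```python
from pathlib import Path


def find_message(root, message, suffixes=(".py", ".java", ".cs")):
    """Return (path, line number, line) for every source line containing `message`.

    The match is exact and case-sensitive. `suffixes` is an assumed list of
    source-file extensions; adjust it for your own project.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything that is not a source file we care about.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps one oddly encoded file from stopping the search.
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if message in line:
                hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_message("src", "A Error occurred") would list every file and line number under a hypothetical src folder where that exact text appears. An empty list hints that the message is built at run time or comes from the system rather than from your code.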

You find out why the error occurred. If, according to the business logic behind the program, the error was actually supposed to happen, then you give the user a call or send an email to explain that everything’s fine and the program is working as intended. You might get more questions, such as the reasoning behind the business logic. But that’s another story.

Crime scene reconstruction
If you can’t find anything wrong, it’s time to reproduce the error. You have got to do exactly what the user is doing. It can be tedious when it’s production data you’ll be handling, and you have to try it out in a test environment. The test environment may not be identical to the production environment, some database tables might be missing from it, and you’re starting to get cranky.

Somehow, you reproduce the error, and finally figure out the reason. You correct the error, recompile the program and send it to the user. You sigh as you are reminded of how poorly your department fares in the software management test you’ve read about, but that’s life.

Postmortem
Taking in a deep breath, and another swig of coffee, you settle back anxiously into your rhythm, half-expecting the user to call you to say that the error came back. After a few minutes, you slip into your zone again, the error forgotten like an ephemeral whisper…
