If not using the database, please disconnect

I’m maintaining some Windows programs created by the PowerBuilder software. The original developer didn’t plan for the programs to be used by many people. So the instant one of the programs was run, a database connection to the Sybase database was opened. And left there.

As more programs were created in this manner (and added to the suite of programs my team is in charge of), the number of total users also increased. Since the connections were held in place, table locks between users became a real problem, because a user could be done with an operation, but still hold onto the table. This also meant the database became clogged up with connections, usually non-active.

The better solution is to open the connection when you’re going to do any database operation, and then close it once you’re done. But the original programs were developed like eons ago. If I understand it correctly, client programs back then assumed they had total control over the database. Contrast that with the web applications of today, and let me just say that, I have my work cut out for me…

I decided to write something on this after reading Raymond’s article on cookie licking. So if you’re not using any database functions, please disconnect.

Date and time format mistakes in .NET and SQL

I’ve written about how you can manipulate dates and times in .NET before. Here, I’m going to highlight a few avoidable mistakes when manipulating them in .NET and in SQL for database queries.

Case matters

I’m not going to list down all the format strings you can use. Refer to this list instead. You might find this article by Microsoft on best practices useful. I am going to highlight these two letters, h and m.

The small “h” gives you the hour, in 12-hour notation without a leading zero (for less than 10 values). “hh” gives you the 12-hour representation with a leading zero.

The capital “H” gives you the hour in 24-hour notation (or military time) without a leading zero, and “HH” gives you the 24-hour representation with a leading zero.

It’s good user-friendly practice to include the “t” or “tt” notation if you’re using the small “h” to represent the hour (for “A” / “P” or “AM” / “PM” respectively). This way, you know if it’s in the morning or night.

The letter H isn’t so bad. At least you’re still referring to the hour. When you get to M, oh you better watch out.

The small “m” and “mm” gives you the minute (of the time) without and with a leading zero respectively.

The capital “M” and “MM” gives you the month without and with a leading zero respectively. For example, September would be 9 and 09 respectively. See, totally different thing from its small lettered counterpart. There’s the “MMM” and the “MMMM” format string, and I’ll leave it to you to experiment with it.

Now to a common mistake I see: “dd/mm/yyyy“. See any problems?

I’m going to give you a starter custom format string: “dd/MM/yyyy HH:mm:ss“. Burn that into your brain. You can swap “dd” and “MM” if you want. Use “-” instead of “/” as the separator if you want.

The TO_CHAR() function in Oracle PL/SQL

For the equivalent SQL statement to format dates and times in Oracle, here it is:

TO_CHAR(sysdate, 'DD/MM/YYYY HH24:MI:SS')

Note that the function parameter is case insensitive.

Note the difference between “MM” for month and “MI” for minute. For a 12-hour representation, use “HH” or “HH12”.

Tricky formats in Sybase and SQL Server

I don’t know why Sybase and SQL Server don’t just give the ability to easily customise date and time formats. Instead of the string formats in .NET and PL/SQL, they use number codes. Number codes! *urgh*

I actually have a table of the codes and the resulting formats printed out and placed beside my desk, somewhere buried in the chaotic mess of papers. I really just remember 3 number codes: 103, 112 and 120.

select convert(char(10), getdate(), 103)
-- gives something like 17/09/2008 in dd/MM/yyyy format
select convert(char(8), getdate(), 112)
-- gives something like 20080917 in yyyyMMdd format
select convert(char(19), getdate(), 120)
-- gives something like 2008-09-17 05:04:03 in yyyy-MM-dd HH:mm:ss format

I’m using the .NET format string notation in the comments. Note that number code 120 is only available in SQL Server. The other 2 codes are available in both Sybase and SQL Server. There are other codes (which you can explore here). I just frequently use those 3, particularly 112.

If you’re used to the American date display, then this might be useful:

select convert(char(10), getdate(), 101)
-- gives something like 09/17/2008 in MM/dd/yyyy format

Note that you can do something like

select convert(char(6), getdate(), 112)
-- gives something like 200809 in yyyyMM format

Note the char(6) part (versus char(8) originally). I do this quite often too.

Now for the mistake in T-SQL. What do you think is wrong with this?

select convert(char(10), EFF_DATE, 103) EFF_DATE
from Customers
order by EFF_DATE desc

Yes, I’m being deliberately vague. Use your powers of deduction to fill in the blanks.

Quick and easy data migration check

What is the fastest and easiest way to check if 2 databases contain the same data?

Check that for every table, the number of records is the same in both databases.

Yes, it’s a superficial check. Just because the count is the same in both databases doesn’t mean they contain the same data. But it’s a quick assessment that you’ve done a data migration correctly.

Suppose we have 3 database tables FunLovingDepartments, AwesomeEmployees, and SuperbEmployeeTypes. We could do this:

select count(*) from FunLovingDepartments

and then run it against both databases. Then we do the same for the other 2 tables.

That’s just too tedious. What if we could include the table name and use a union?

select 'FunLovingDepartments', count(*) from FunLovingDepartments
union
select 'AwesomeEmployees', count(*) from AwesomeEmployees
union
select 'SuperbEmployeeTypes', count(*) from SuperbEmployeeTypes

which gives a result something like

'FunLovingDepartment'   3
'AwesomeEmployees'      17
'SuperbEmployeeTypes'   8

Much faster to analyse with everything together. What if you’ve got dozens and dozens of tables? You’re going to get carpal tunnel syndrome from typing all those select statements. Now, what if I told you how you can generate those select statements?

Notice the structure of the select statement you want.

select '{tablename}', count(*) from {tablename} union
select '{tablename}', count(*) from {tablename} union
...
select '{tablename}', count(*) from {tablename} union
select '{tablename}', count(*) from {tablename}

What you do is write the select statement that generates the select statement!

select 'select '''+name+''',count(*) from '+name+' union'
from sysobjects
where type='U'
order by name

There are 2 single quotes to produce 1 single quote because the escape character in a SQL string is the single quote.

This’ll work for SQL Server and Sybase databases. If you’re working with Oracle, no problem.

select 'select '''||TABLE_NAME||''',count(*) from '||TABLE_NAME||' union'
from ALL_TABLES
order by TABLE_NAME

Oracle SQL syntax uses 2 pipe characters for string concatenation.

The above 2 statements will generate the select-union statement that includes every table in your database. All you need is delete the trailing union from the last line. So for our fictional database, the generator SQL will produce this

select 'FunLovingDepartments', count(*) from FunLovingDepartments union
select 'AwesomeEmployees', count(*) from AwesomeEmployees union
select 'SuperbEmployeeTypes', count(*) from SuperbEmployeeTypes union

So just delete the last union keyword and you’re done.

What we have here is a classic case of code generating code. After you run the generated SQL in both databases, you’re going to get 2 sets of results. If you’ve a long list of tables, doing eyeball checks is going to speed up your myopia.

So what you do is run the generator SQL in one database to produce that big chunk of select-union statements. Then run that big chunk of select-union statements to get a set of results. Then copy those results into Excel. Do the same on the other database. What you’ll then have looks something like this in Excel.
Table count comparison in Excel

Columns A and B contain the result set from one database. Columns F and G contain the result set from the other database. Then in column D, you use an Excel formula to do string comparison between the columns. Let me give you the formula for comparing the first row.

=IF(A1=F1,0,99)+IF(B1=G1,0,9999)

What it means is if A1 (table name from database 1) equals to F1 (table name from database 2), then return value 0, otherwise 99. The other part is if B1 (number of records from database 1) equals to G1 (number of records from database 2), then return value 0, otherwise 9999. Then add the two return values. Copy that Excel cell and paste down the line. Excel will automatically make sure the cell rows are correct (A2, A3 and so on).

If the final value is 0, then for that particular table, the number of records is the same in both database. I used 99 and 9999 respectively to distinguish the two different if comparisons. But you can set them to other values, as long as it looks significantly different from 0. Remember, you’ll be scrolling up and down the Excel file (lots of tables), so you don’t want to have your eyes distinguish between for example 8 and 0.

I think the screenshot probably explained it better.

There you have it, a quick and easy data migration check method. This saved me significant amounts of time and effort before. It’s a deadly combination of a SQL statement generating a SQL statement, which in turn generated a result set, which was then copied to Excel for comparison.

Use the existing tools where possible. Not everything needs a custom written program.

P.S. Yay, it’s spring!