Quick and easy data migration check
What is the fastest and easiest way to check if 2 databases contain the same data?
Check that for every table, the number of records is the same in both databases.
Yes, it’s a superficial check. Just because the count is the same in both databases doesn’t mean they contain the same data. But it’s a quick assessment that you’ve done a data migration correctly.
Suppose we have 3 database tables FunLovingDepartments, AwesomeEmployees, and SuperbEmployeeTypes. We could do this:
select count(*) from FunLovingDepartments
and then run it against both databases. Then we do the same for the other 2 tables.
That’s just too tedious. What if we could include the table name and use a union?
select 'FunLovingDepartments', count(*) from FunLovingDepartments union select 'AwesomeEmployees', count(*) from AwesomeEmployees union select 'SuperbEmployeeTypes', count(*) from SuperbEmployeeTypes
which gives a result something like
'FunLovingDepartment' 3 'AwesomeEmployees' 17 'SuperbEmployeeTypes' 8
Much faster to analyse with everything together. What if you’ve got dozens and dozens of tables? You’re going to get carpal tunnel syndrome from typing all those select statements. Now, what if I told you how you can generate those select statements?
Notice the structure of the select statement you want.
select '{tablename}', count(*) from {tablename} union
select '{tablename}', count(*) from {tablename} union
...
select '{tablename}', count(*) from {tablename} union
select '{tablename}', count(*) from {tablename}
What you do is write the select statement that generates the select statement!
select 'select '''+name+''',count(*) from '+name+' union' from sysobjects where type='U' order by name
There are 2 single quotes to produce 1 single quote because the escape character in a SQL string is the single quote.
This’ll work for SQL Server and Sybase databases. If you’re working with Oracle, no problem.
select 'select '''||TABLE_NAME||''',count(*) from '||TABLE_NAME||' union' from ALL_TABLES order by TABLE_NAME
Oracle SQL syntax uses 2 pipe characters for string concatenation.
The above 2 statements will generate the select-union statement that includes every table in your database. All you need is delete the trailing union from the last line. So for our fictional database, the generator SQL will produce this
select 'FunLovingDepartments', count(*) from FunLovingDepartments union select 'AwesomeEmployees', count(*) from AwesomeEmployees union select 'SuperbEmployeeTypes', count(*) from SuperbEmployeeTypes union
So just delete the last union keyword and you’re done.
What we have here is a classic case of code generating code. After you run the generated SQL in both databases, you’re going to get 2 sets of results. If you’ve a long list of tables, doing eyeball checks is going to speed up your myopia.
So what you do is run the generator SQL in one database to produce that big chunk of select-union statements. Then run that big chunk of select-union statements to get a set of results. Then copy those results into Excel. Do the same on the other database. What you’ll then have looks something like this in Excel.

Columns A and B contain the result set from one database. Columns F and G contain the result set from the other database. Then in column D, you use an Excel formula to do string comparison between the columns. Let me give you the formula for comparing the first row.
=IF(A1=F1,0,99)+IF(B1=G1,0,9999)
What it means is if A1 (table name from database 1) equals to F1 (table name from database 2), then return value 0, otherwise 99. The other part is if B1 (number of records from database 1) equals to G1 (number of records from database 2), then return value 0, otherwise 9999. Then add the two return values. Copy that Excel cell and paste down the line. Excel will automatically make sure the cell rows are correct (A2, A3 and so on).
If the final value is 0, then for that particular table, the number of records is the same in both database. I used 99 and 9999 respectively to distinguish the two different if comparisons. But you can set them to other values, as long as it looks significantly different from 0. Remember, you’ll be scrolling up and down the Excel file (lots of tables), so you don’t want to have your eyes distinguish between for example 8 and 0.
I think the screenshot probably explained it better.
There you have it, a quick and easy data migration check method. This saved me significant amounts of time and effort before. It’s a deadly combination of a SQL statement generating a SQL statement, which in turn generated a result set, which was then copied to Excel for comparison.
Use the existing tools where possible. Not everything needs a custom written program.
P.S. Yay, it’s spring!
Tags: data migration, database, oracle, sql server, sybase
Sign up now to get your free ebook of "How to self-publish an online magazine". Your email is kept confidential, and is used only to send information about the magazine.



Hi! I write about maths and programming and other topics of esoteric interest. I'm also the editor of the online magazine Singularity, and you can get the latest issue at the top (it's free!).
