Let me start off by saying I’m not bashing the writer of this article, and I’m trying not to be super critical. I don’t want to discourage this person from writing articles about Snort rules. It’s great when people in the Snort community step up and explain some simple things out there. Mistakes happen; it comes with the territory. If you choose to be one of the people that tries to write Snort rules, you also choose to be someone who wants to learn how to do it better. That’s why I’m writing this blog post: not to bash the writer, but to teach.
I noticed this post today over at the “Tao of Signature Writing” blog, and to be honest I glanced over most of it, figuring it was a rehash of things I’ve already read or things that have already been written by countless people about “Here’s how you write Snort rules!”. I scrolled down quickly, skimming, not reading at all really, and noticed this part:
Now, let us look at the second question: “We have “aol” as the id and Import method name. Should we use “aol” along with “Import”?”. Just because we narrowed down to “clsid:” followed by CLSID number, does not mean that we have to narrow down in this case too. Just like how the Shellcode will change, the attackers might change the ID too, to just find out if they could evade the IDS/IPS. Why give them a chance? Hence, we should broaden our search to just the import method: content:”.Import(“. The reason why we have “.” and “(” around the key “Import” is to narrow the chances of triggering the signature on some term “Import” and to concentrate on the vulnerable method.
This post is about ActiveX and CLSID detection with a Snort rule, trying to detect an AOL 9.5 ActiveX 0day. Okay, fair enough, so the above paragraph is trying to find the Import command to call the javascript. So I kept reading.
Then I got to this part:
In here, I would like to position the CLSID before the method. This would help me trigger the signature specific to “AOL 9.5 ActiveX 0day Exploit (heap spray)“. I can do this ordering by using “Offset”. We cannot set the “Depth” in this case, since the position of CLSID or Method in a packet will change according to the packet size or the way in which it is sent. Hence, the content of final signature would look something like this:
content:”clsid:A105BD70-BF56-4D10-BC91-41C88321F47C”; nocase;content:”.Import(“; nocase; Offset:0;
The writer is correct in a couple things.
- First, they say they want to position the CLSID before the method, and they want to do that using offset.
- Second, they say they cannot set a “depth” because the position of the CLSID or method in the packet will change according to the packet size, which is partially correct.
However, the problem with the signature above is that the offset is placed after the second content match.
So here’s what would happen with the above signature so far. The CLSID content match is the longest, so it would be fed into the fast pattern matcher. If the fast pattern matcher came across a packet that matched the CLSID that is specified in the rule, <leaves stuff out>, the packet would then be run through the detection engine (rule) for detection. Contrary to popular belief, unless an offset/depth/distance/within modifier is specified, there is no order in which the contents have to match. So if I were to write the above as this:
content:"clsid:A105BD70-BF56-4D10-BC91-41C88321F47C"; nocase; content:".Import("; nocase;
Snort doesn’t care which order the content matches are in. As long as both the contents are in the packet, the rule will fire. So putting a content:”.Import(“; nocase; offset:0; does absolutely nothing. You can kind of think of offset:0; as being implied, but if you don’t have any relative content matches, then it really doesn’t matter unless you are trying to be specific to a position match. However, as the author already stated, you can’t add a depth statement to the rule, so it plainly just doesn’t matter. I see this kind of thing all the time, so I figured it was a common mistake. So I kept on reading:
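To make the “order doesn’t matter” point concrete, here is a rough sketch in Python. It is a toy model of two non-relative content matches, not how Snort’s detection engine is actually implemented; the function name and the sample payloads are made up for illustration.

```python
from typing import Sequence


def all_contents_present(payload: bytes, contents: Sequence[bytes]) -> bool:
    """Toy model of non-relative, nocase content matches.

    Every content only has to appear somewhere in the payload;
    the order in which they appear is never checked.
    """
    lowered = payload.lower()  # models the nocase modifier
    return all(content.lower() in lowered for content in contents)


CLSID = b"clsid:A105BD70-BF56-4D10-BC91-41C88321F47C"
METHOD = b".Import("

# Hypothetical page fragments, for illustration only.
OBJECT_TAG = b'<object classid="clsid:A105BD70-BF56-4D10-BC91-41C88321F47C" id="aol"></object>'
SCRIPT = b'<script>aol.Import("x");</script>'

clsid_first = OBJECT_TAG + SCRIPT
method_first = SCRIPT + OBJECT_TAG

for name, payload in (("CLSID first", clsid_first), ("method first", method_first)):
    print(name, all_contents_present(payload, [CLSID, METHOD]))
```

Both payloads come back as a match in this model, which is the whole point: without a relative modifier, nothing ties the second content to the position of the first.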
Now, let us look into the direction of traffic. Client-side exploits generally flow from server to client: “flow:to_client,established;“.
The author explains that “Client-side exploits generally flow from server to client”. Okay, correct in this instance, but not always, so let me explain:
Flow has four direction operators you can specify:
- to_server
- from_server
- to_client
- from_client
What I hear from people is that they think of “server” as that 2U thing back in the server room (hence the name), and client as being “you”. But that’s not how Snort thinks about it. Snort thinks about client and server in terms of “who initiated the conversation”. So, at the beginning of a TCP conversation there is a 3-way handshake: SYN, SYN-ACK, ACK.
- CLIENT -> SYN -> SERVER
- CLIENT <- SYN, ACK <- SERVER
- CLIENT -> ACK -> SERVER
The client is whoever initiated the conversation; the server is whoever is responding. So, in this case, since we are attempting to catch a web browser requesting a webpage and downloading a page which contains this CLSID, the flow would be to_client (or from_server). Correct. However, what if someone downloaded a PDF, and upon opening it, the PDF went and grabbed something off the internet? This is a client-side exploit; however, the flow would be reversed. So, the author is correct in saying that “Client-side exploits are generally…”, but I wanted to explain to make sure no one was confused. The “established” keyword means the session is established, so it applies beginning with the 3rd part of the 3-way handshake.
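If it helps, the “who initiated the conversation” idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Snort’s stream code; the Packet class, the helper names, and the addresses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    flags: frozenset  # TCP flag names, e.g. frozenset({"SYN", "ACK"})


def find_client(packets):
    """Return the address that sent the first bare SYN (SYN without ACK)."""
    for packet in packets:
        if "SYN" in packet.flags and "ACK" not in packet.flags:
            return packet.src
    return None


def direction(packet, client):
    """Label a packet the way the flow keyword talks about direction."""
    return "to_server" if packet.src == client else "to_client"


# Hypothetical addresses: a browser and a web server.
BROWSER, WEB = "10.0.0.5", "203.0.113.10"

handshake = [
    Packet(BROWSER, WEB, frozenset({"SYN"})),
    Packet(WEB, BROWSER, frozenset({"SYN", "ACK"})),
    Packet(BROWSER, WEB, frozenset({"ACK"})),
]

client = find_client(handshake)
page = Packet(WEB, BROWSER, frozenset({"ACK", "PSH"}))     # the web page coming back
request = Packet(BROWSER, WEB, frozenset({"ACK", "PSH"}))  # the HTTP request going out

print(client, direction(page, client), direction(request, client))
```

Notice that nothing in the sketch knows which box lives in the server room. The only thing that decides “client” is who sent the first SYN.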
In this case some folks might believe that CLSID is already in the “content” part of the signature, and that this is a repetition if we use it in PCRE once again. We are not using this PCRE to repeat the value in the content, but to ensure that we do not miss any possibilities of matching this exploit. Let us look into the PCRE part of this signature:
pcre:”/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*A105BD70-BF56-4D10-BC91-41C88321F47C/si”;
In here, the signature is telling the PCRE compiler that there is “< object” followed by strings and “>” with multiple-strings possibly following it followed by “classid” & “=” with the “clsid”, “:” and “{“. The true classid is then inserted into the PCRE. The PCRE ends with /i to indicate the case-insensitive nature of this regular expression.
The first paragraph is partially correct. If you check for a content match, you can use a pcre to clarify what you are looking for. This is done for a couple of reasons. One, as the author states above, is to not miss the possibilities of matching the exploit, but more accurately, it’s to catch obfuscated variants of the exploit. So for example, let’s go back and take a look at the content match before we look at the pcre portion.
content:”clsid:A105BD70-BF56-4D10-BC91-41C88321F47C”; nocase;
Problem with this content match is, well, I wouldn’t have put the specific “clsid:” in there. Reason? If I was an attacker and I wanted to bypass your rule, I would put “clsid: A105BD70-BF56-4D10-BC91-41C88321F47C”. (Notice the space after the colon.) Which completely bypasses the content match.
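Here is a quick Python sketch of that bypass. The content_match helper is a stand-in for a simple nocase content match (an assumption for illustration, not Snort code), and the two tags are made-up samples.

```python
def content_match(payload: str, content: str) -> bool:
    """Stand-in for a nocase content match: a plain substring search."""
    return content.lower() in payload.lower()


CLSID = "A105BD70-BF56-4D10-BC91-41C88321F47C"

# Hypothetical sample tags; the second one has a space after the colon.
plain = f'<object classid="clsid:{CLSID}" id="aol">'
spaced = f'<object classid="clsid: {CLSID}" id="aol">'

# Content that includes the "clsid:" prefix, as in the original rule.
print(content_match(plain, "clsid:" + CLSID))   # matches
print(content_match(spaced, "clsid:" + CLSID))  # bypassed by the extra space

# Content with only the CLSID itself survives the extra space.
print(content_match(plain, CLSID))
print(content_match(spaced, CLSID))
```

One extra space is enough to break the longer content, while the bare CLSID still matches both samples.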
So let’s come back to the pcre and take a look at it.
Now, this PCRE format was written by the VRT, and a lot of people have copied it blindly without understanding what it does. So let me explain, because what the author wrote in the second paragraph quoted above is wrong. As I said, I’m not trying to be mean or whatever, I am simply trying to teach.
So, the pcre is this:
/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*A105BD70-BF56-4D10-BC91-41C88321F47C/si
(I am going to put double quotes around the things we are trying to match that are explicit, the quotes don’t actually exist in the regular expression unless specified)
So we are looking for “<OBJECT”
Then a whitespace (\s). That’s what “\s” is. (It says ‘followed by strings’ in the above quoted paragraph.) Whitespace is a tab (0x09), a space (0x20), a line feed, also known as a new line (0x0A), a form feed (0x0C), or a carriage return (0x0D). The “+” sign after the “\s” means ‘match whatever directly precedes it as many times as possible, but there must be at least 1’. So there must be 1 or more “\s” there.
Then you see this “[^>]”, which the author says that we are positively looking for. The thing about character classes “[ ]” is, they allow you to do some nifty things. Range matching ([0-9]), multiple matches, [abc] (this will look for either an a, b, or c, for one character), and you can also do negative matches. Or “lack of” matches. The way you specify a negative match is to put a caret at the start of the character class. So “[^>]” means “the next character after any amount of positively matched ‘\s’ cannot be a ‘>’”. Directly after that is a “*” character. The “*” is similar to a “+”, but the difference is, while a “+” means you must have at least 1 match of the preceding item (in this case the negative character class), the “*” means you don’t have to have a positive match. It means “0 or more”.
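To see “\s+” and “[^>]*” doing their jobs, here is a small Python sketch that runs just the front of the expression. Python’s re module is being used as a stand-in for PCRE (the two agree on these constructs), and the sample strings are made up for illustration.

```python
import re

# Only the first part of the expression: "<OBJECT", one or more
# whitespace characters, zero or more characters that are not ">",
# then "classid".
prefix = re.compile(r"<OBJECT\s+[^>]*classid", re.IGNORECASE)

samples = {
    "one space": '<object classid=',
    "newline, tab and other attributes": '<object\n\tid="aol" classid=',
    "no whitespace after <object": '<objectclassid=',
    "classid outside the tag": '<object id="aol"> classid=',
}

for label, text in samples.items():
    print(label, bool(prefix.search(text)))
```

The first two samples match. The third fails because “\s+” needs at least one whitespace character, and the fourth fails because “[^>]*” refuses to walk past the “>” that closes the tag.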
Following that we have a “classid\s*=\s*” match. So look for classid(maybeaspacehere,it’soptional)=(maybeanotherspacehere)
Then there is a “[\x22\x27]”. In regular expressions, if you want to specify a character by its hex value you have to write “\x” before the hex digits. So, you might see a space specified like this: 0x20. You might see it URL-encoded (percent-encoded) like this: %20. In regular expressions, it would be “\x20”. There are two characters within the character class: 0x22 is the hex for a double quote (") and 0x27 is the hex for a single tick (').
Since this is a run of the mill character class match (not a range or something more complex), this means that the next character that the “[\x22\x27]” pattern match is looking for is either a ' or a ". Notice the “?” after the character class? That’s an optional quantifier (zero or one). It is not a lazy quantifier; “lazy” only applies when a “?” follows another quantifier, as in “*?”. So without going into a long book about lazy and greedy (which, by the way, if you are interested, I suggest checking out the book “Mastering Regular Expressions” by Jeffrey Friedl, it’s the bible), the “?” basically means “the item that is directly in front of the ‘?’ is optional”, which here is the whole character class. So, it essentially means, when all put together, the match is either a ' or a " or nothing at all.
Then we have (maybesomewhitespacehere)clsid(maybesomemorewhitespacehere):(maybesomemorewhitespacehere){(optionally)(maybesomemorewhitespacehere)A105BD70-BF56-4D10-BC91-41C88321F47C.
Notice that I translated “\x3a” and “\x7B” (the latter of which has the “?” behind it, so it’s optional) above.
Then the modifiers of the whole Regular Expression at the end are “/si”.
“s” means “include new lines in the dot metacharacter”. However, there are no “.” metacharacters in the regular expression, so that was probably put there by habit (and good practice), and the “i” means “anything within the regular expression treat with case insensitivity” similar to the “nocase;” keyword in Snort’s regular rule language.
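If you want to poke at the whole expression yourself, here is a sketch that runs it through Python’s re module, with re.S and re.I standing in for the “/si” modifiers. Python’s engine is not PCRE, but it treats every construct used here the same way. The sample tags are hypothetical.

```python
import re

CLSID_RE = re.compile(
    r"<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*"
    r"A105BD70-BF56-4D10-BC91-41C88321F47C",
    re.S | re.I,  # the "/si" modifiers
)

samples = {
    "plain": '<object classid="clsid:A105BD70-BF56-4D10-BC91-41C88321F47C" id="aol">',
    "spaces, single quotes, braces, mixed case":
        "<OBJECT ID='aol' CLASSID = 'CLSID : {a105bd70-bf56-4d10-bc91-41c88321f47c}'>",
    "no quotes, newline after <object":
        "<object\n  classid=clsid:A105BD70-BF56-4D10-BC91-41C88321F47C>",
    "different clsid": '<object classid="clsid:00000000-0000-0000-0000-000000000000">',
    "not an object tag": '<embed classid="clsid:A105BD70-BF56-4D10-BC91-41C88321F47C">',
}

for label, tag in samples.items():
    print(label, bool(CLSID_RE.search(tag)))
```

The first three variations match, which is exactly the obfuscation a fixed content string would miss. The last two do not match, because the CLSID or the tag name is wrong.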
So the final signature that the writer comes up with is:
alert tcp $EXTERNAL_NET $HTTP_PORTS -> $HOME_NET any (msg:”ActiveX Exploit Signature Sample”; flow:to_client,established; content:”clsid:A105BD70-BF56-4D10-BC91-41C88321F47C”; nocase; content:”.Import(“; nocase; Offset:0;pcre:”/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*A105BD70-BF56-4D10-BC91-41C88321F47C/si”; reference:url,www.exploit-db.com/exploits/11204; rev:1;)
Which I am going to rewrite:
alert tcp $EXTERNAL_NET $HTTP_PORTS -> $HOME_NET any (msg:"ActiveX Exploit Signature Sample"; flow:to_client,established; content:"A105BD70-BF56-4D10-BC91-41C88321F47C"; nocase; content:".Import("; distance:0; pcre:"/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*A105BD70-BF56-4D10-BC91-41C88321F47C/si"; reference:url,www.exploit-db.com/exploits/11204; rev:2;)
So, what did I do different? I removed the “clsid:” prefix from the content match. It won’t speed up detection, and it is checked for in the pcre anyway. So, if you are going to fire up the pcre engine to confirm the long content match, just kill two birds with one stone.
What’s with the “distance:0;” stuff? I made the content match directly preceding it (the “.Import(” one) relative to the previous content match, so it has to be found after the CLSID. Since I don’t have a within, I don’t constrain how far after the CLSID it can be.
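Here is a rough Python model of what a relative match with distance:0 and no within does. It is a simplification for illustration (the function name and payloads are made up, and Snort’s real engine does more than this), but it shows how the second content is tied to where the first one ended.

```python
def relative_match(payload: bytes, first: bytes, second: bytes) -> bool:
    """Toy model of: content:first; nocase; content:second; distance:0;

    The search for the second content starts where the first match
    ended. With no 'within', it may be found any number of bytes later.
    """
    start = payload.lower().find(first.lower())  # nocase on the first content
    if start == -1:
        return False
    cursor = start + len(first)  # where the first match ended
    return payload.find(second, cursor) != -1    # second content is case-sensitive


CLSID = b"A105BD70-BF56-4D10-BC91-41C88321F47C"
METHOD = b".Import("

# Hypothetical page fragments, for illustration only.
OBJECT_TAG = b'<object classid="clsid:A105BD70-BF56-4D10-BC91-41C88321F47C" id="aol"></object>'
SCRIPT = b'<script>aol.Import("x");</script>'

print(relative_match(OBJECT_TAG + SCRIPT, CLSID, METHOD))  # CLSID, then the method
print(relative_match(SCRIPT + OBJECT_TAG, CLSID, METHOD))  # method before the CLSID
```

In this model only the payload with the CLSID before the method matches, which is the ordering the original author was trying to get out of offset:0.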
Why did you keep the “.Import(” stuff? False positive reduction. It will do nothing to speed up the match.
So, be careful when writing rules. Unless you understand all the pieces and parts you can walk yourself right into a dark hole and do it wrong. You can do that to yourself, but take extra care that you don’t walk anyone down the hole with you.
Again, I post this, not to be mean, but to be constructive. VRT could probably write the rule better than I as well. I’m not a member of the VRT, and I can’t even pretend to be. (They’d kick my butt.)