Tuesday, April 27, 2010

Using Snort fast patterns wisely for fast rules

Anyone who has ever written their own Snort rule has wondered, at some point or another, how to make it faster. While some things are obvious - don't use a PCRE with a bunch of ".*" clauses, for example - others are less so. Today I'd like to go over one of the more subtle methods of speeding up a rule, which has been highlighted by some new features in Snort 2.8.6.

Any rule that has one or more content matches in it has a fast pattern associated with it - the string that Snort puts into its fast pattern matching engine to begin the process of detection. Chosen somewhat intelligently by Snort itself, this pattern is usually the longest string in a rule; as a general rule of thumb, the longer the string is, the faster a rule will be, with strings of four or more bytes typically being necessary to reap the benefits of the fast pattern matcher. Only if this string is found in a packet does Snort evaluate the remaining options in the rule - which means that the fewer times the fast pattern matches, the less performance drag the rule will create on Snort. Thus, the goal of a rule-writer should be to choose a fast pattern that will be as closely associated with the actual triggering conditions of the rule as possible - if you can generate an alert for most of the times you actually enter a rule, you've successfully targeted your detection, and written a rule with the minimum possible performance impact on Snort.

Up until Snort 2.8.6, unfortunately, rule writers had little control over what was chosen as a rule's fast pattern. With the introduction of the fast_pattern keyword and a new config option, however, that's all changed.

Let's start by going over the new config option, since it will provide us with the intelligence we need to properly use the fast_pattern keyword. It's really rather simple; just add:

debug-print-fast-pattern

...to your existing config detection statement. (NOTE: if you specify this on a line separate from your non-default config detection statement, you'll end up setting all other detection parameters back to their defaults.)

If you run Snort with this option enabled, you'll get output similar to the following:


1:6407
Fast pattern matcher: Content
Fast pattern set: no
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
"INVITE|20|SIP:"
Final pattern
"INVITE|20|SIP:"


For the sake of this example, we're running Snort with just the following rule enabled:

alert udp $HOME_NET any -> $EXTERNAL_NET 5060 (msg:"POLICY Gizmo register VOIP state"; content:"INVITE sip|3A|"; nocase; content:"User-Agent|3A|"; nocase; content:"Gizmo"; nocase; pcre:"/^User-Agent\x3A[^\n\r]+Gizmo/smi"; reference:url,www.gizmoproject.com; classtype:policy-violation; sid:6407; rev:1;)

As noted earlier, Snort has chosen the longest available string - "INVITE sip|3A|" - as the fast pattern for the rule. The problem, unfortunately, is that this pattern will match on all SIP invitations, whereas the rule will generate an alert on only a tiny portion of those requests. Clearly, this is sub-optimal from a performance perspective.

With the new fast_pattern keyword, however, we can fix this problem. By updating the rule to read as follows:

alert udp $HOME_NET any -> $EXTERNAL_NET 5060 (msg:"POLICY Gizmo register VOIP state"; content:"INVITE sip|3A|"; nocase; content:"User-Agent|3A|"; nocase; content:"Gizmo"; nocase; fast_pattern; pcre:"/^User-Agent\x3A[^\n\r]+Gizmo/smi"; reference:url,www.gizmoproject.com; classtype:policy-violation; sid:6407; rev:1;)

...we get the following output from Snort:

1:6407
Fast pattern matcher: Content
Fast pattern set: yes
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
"GIZMO"
Final pattern
"GIZMO"


As you can see, the fast pattern has been changed per the keyword we used, and Snort now notes that we've explicitly set the fast pattern (i.e. "Fast pattern set: yes"). Since the string "Gizmo" is likely to be orders of magnitude less common than "INVITE sip|3A|" in SIP traffic, the number of times this rule is evaluated will drop dramatically, and the rule will get a commensurate performance boost.
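
To make the effect concrete, here is a minimal Python sketch of the two-stage process described above: a cheap fast pattern check decides whether a rule is entered, and only then are the remaining options evaluated. This is a toy model, not Snort's implementation; the sample SIP payloads and the function names are invented for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical sample traffic: three INVITEs (one from Gizmo) and one REGISTER.
SAMPLE_PAYLOADS: List[bytes] = [
    b"INVITE sip:bob@example.com SIP/2.0\r\nUser-Agent: Gizmo-s2n1\r\n\r\n",
    b"INVITE sip:alice@example.com SIP/2.0\r\nUser-Agent: OtherPhone/1.0\r\n\r\n",
    b"INVITE sip:carol@example.com SIP/2.0\r\nUser-Agent: OtherPhone/1.0\r\n\r\n",
    b"REGISTER sip:example.com SIP/2.0\r\nUser-Agent: OtherPhone/1.0\r\n\r\n",
]

# Python equivalent of the rule's pcre option with the /smi flags.
GIZMO_RE = re.compile(rb"^User-Agent\x3A[^\n\r]+Gizmo", re.S | re.M | re.I)


def full_rule_matches(payload: bytes) -> bool:
    """Toy stand-in for evaluating every option of sid 6407 (all nocase)."""
    lowered = payload.lower()
    return (
        b"invite sip:" in lowered
        and b"user-agent:" in lowered
        and b"gizmo" in lowered
        and GIZMO_RE.search(payload) is not None
    )


def rule_entries_and_alerts(payloads: List[bytes], fast_pattern: bytes) -> Tuple[int, int]:
    """Return (times the rule was entered, alerts generated)."""
    needle = fast_pattern.lower()
    entered = alerts = 0
    for payload in payloads:
        if needle not in payload.lower():
            continue  # fast pattern missed: the rule is never evaluated
        entered += 1
        if full_rule_matches(payload):
            alerts += 1
    return entered, alerts


if __name__ == "__main__":
    for candidate in (b"INVITE sip:", b"Gizmo"):
        print(candidate, rule_entries_and_alerts(SAMPLE_PAYLOADS, candidate))
```

On this made-up traffic, both choices produce the same single alert, but "INVITE sip:" enters the rule three times while "Gizmo" enters it only once.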

So based on this information, what would you expect the fast pattern to be for the following rule?

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A| www.xpas2009.com"; nocase; reference:url,research.sunbelt-software.com/threatdisplay.aspx?name=XPAntiSpyware%202009&threatid=429593; reference:url,www.ca.com/us/securityadvisor/pest/pest.aspx?id=453141780; classtype:misc-activity; sid:16136; rev:2;)

If you answered "Host|3A| www.xpas2009.com", you'd be wrong - because of the way Snort picks fast patterns when you have a mix of buffers:


1:16136
Fast pattern matcher: URI content
Fast pattern set: no
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
"/BUY.HTML?"
Final pattern
"/BUY.HTML?"


As you can see, Snort chose the longest pattern out of the URI buffer. In a lot of cases, this default will make sense - after all, the URI buffer is usually smaller than the regular content buffer, and searching a smaller space will be faster. In this particular case, however, we've ended up with a fast pattern that will be fairly common in web traffic - or, at the very least, more common than a search for a particular host string. Since the goal is to enter the rule as little as possible, we want to override this behavior, and go for the more unique pattern:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A| www.xpas2009.com"; nocase; fast_pattern; reference:url,research.sunbelt-software.com/threatdisplay.aspx?name=XPAntiSpyware%202009&threatid=429593; reference:url,www.ca.com/us/securityadvisor/pest/pest.aspx?id=453141780; classtype:misc-activity; sid:16136; rev:2;)

1:16136
Fast pattern matcher: Content
Fast pattern set: yes
Fast pattern only: no
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
"HOST:|20|WWW.XPAS2009.COM"
Final pattern
"HOST:|20|WWW.XPAS2009.COM"

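The selection behavior seen in the two debug dumps for sid 16136 can be modeled roughly as follows. This Python sketch is a simplification based only on what this post describes (an explicit fast_pattern wins; otherwise the longest URI pattern is preferred over regular content); it is not Snort's actual selection code, and the Content class and choose_fast_pattern function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Content:
    pattern: bytes
    buffer: str                 # "uri" for uricontent, "raw" for plain content
    fast_pattern: bool = False  # True if the rule writer added fast_pattern;


def choose_fast_pattern(contents: List[Content]) -> Content:
    """Pick a fast pattern using the simplified heuristic described above."""
    # 1. An explicit fast_pattern; modifier overrides everything else.
    explicit = [c for c in contents if c.fast_pattern]
    if explicit:
        return explicit[0]
    # 2. Otherwise prefer URI-buffer patterns when the rule has any.
    uri_patterns = [c for c in contents if c.buffer == "uri"]
    pool = uri_patterns if uri_patterns else contents
    # 3. Within the chosen pool, take the longest pattern.
    return max(pool, key=lambda c: len(c.pattern))


def sid_16136_contents(host_is_fast_pattern: bool) -> List[Content]:
    """The four content matches from the rule, in rule order."""
    return [
        Content(b"/buy.html?", "uri"),
        Content(b"wmid=", "uri"),
        Content(b"skey=", "uri"),
        Content(b"Host: www.xpas2009.com", "raw", host_is_fast_pattern),
    ]


if __name__ == "__main__":
    print(choose_fast_pattern(sid_16136_contents(False)).pattern)
    print(choose_fast_pattern(sid_16136_contents(True)).pattern)
```

Without the modifier the sketch picks "/buy.html?", even though the Host string is longer; with it, the Host string is chosen.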

We can actually optimize even further from here. As it turns out, once a fast pattern has been matched, and a rule has been entered, Snort will spend CPU cycles looking for the content chosen as the fast pattern again, this time using the content matching engine. While this seems duplicative, in many cases, it's useful; for example, if a content clause follows the one chosen as the fast pattern content, and that second content uses distance and within to force a match only relative to the end of the fast pattern, Snort needs to find the fast pattern that second time to properly evaluate the second content clause. However, for this particular rule, that's not the case, and so there's no point in bothering to find this string a second time. With that in mind, we'll change fast_pattern; to fast_pattern:only;, and save the CPU cycles during rule evaluation. Finally, since the string we're looking for should only be found in the HTTP headers, we'll use the new http_header; keyword to restrict the search to that buffer (which is explicitly split out for the first time in Snort 2.8.6), and end up with the following rule:

alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"SPYWARE-PUT Hijacker xp antispyware 2009 runtime detection - pre-sale webpage"; flow:to_server,established; uricontent:"/buy.html?"; nocase; uricontent:"wmid="; nocase; uricontent:"skey="; nocase; content:"Host|3A| www.xpas2009.com"; nocase; fast_pattern:only; http_header; reference:url,research.sunbelt-software.com/threatdisplay.aspx?name=XPAntiSpyware%202009&threatid=429593; reference:url,www.ca.com/us/securityadvisor/pest/pest.aspx?id=453141780; classtype:misc-activity; sid:16136; rev:2;)

...and the associated debug output:


1:16136
Fast pattern matcher: URI content
Fast pattern set: yes
Fast pattern only: yes
Negated: no
Pattern offset,length: none
Pattern truncated: no
Original pattern
"HOST:|20|WWW.XPAS2009.COM"
Final pattern
"HOST:|20|WWW.XPAS2009.COM"


(Note: just because the debug output specifies "URI content" here doesn't actually mean that the pattern is being searched for in the URI buffer. I've verified through testing and talking to the development team that the HTTP header buffer is what's being searched here; the output is the way it is because the HTTP-related buffers, including the URI buffer and the header buffer, are grouped together at the point this output is printed.)
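
As for why Snort normally looks for the fast pattern a second time during rule evaluation: a following content that uses distance and within needs to know where the previous match ended, which a simple hit-or-miss prefilter doesn't provide. The Python sketch below is a toy model of relative matching under that reading; the function name and sample payload are invented, and it is not how Snort implements content matching.

```python
def match_relative(payload: bytes, first: bytes, second: bytes,
                   distance: int, within: int) -> bool:
    """Return True if `second` lies wholly inside the `within`-byte window
    that starts `distance` bytes after the end of some match of `first`."""
    start = payload.find(first)
    while start != -1:
        cursor = start + len(first)          # end of the first match
        window_start = cursor + distance     # skip `distance` bytes
        window = payload[window_start:window_start + within]
        if second in window:
            return True
        # Try the next occurrence of `first`, if any.
        start = payload.find(first, start + 1)
    return False


if __name__ == "__main__":
    sample = b"Host: a.example\r\nX: 1"
    print(match_relative(sample, b"Host:", b"a.example", distance=1, within=9))
    print(match_relative(sample, b"Host:", b"a.example", distance=1, within=8))
```

The position of the first match drives the whole check, which is why fast_pattern:only; is only a safe saving when no other option is relative to that content.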

One additional item to be aware of, for those who begin using the ac-split fast pattern method newly introduced in 2.8.6, is pattern truncation. The recommended configuration for this method includes the directive "max-pattern-len 20", which will truncate fast patterns at 20 bytes; doing so helps with the memory footprint for Snort, and 20 bytes is generally sufficient when a fast pattern is only being used to determine entry into a rule. If your Snort install is set up in this manner, and you need to specify which bytes of a long pattern are the most distinctive, you can apply the fast_pattern:x,y; modifier to the content you're operating on, where x is the offset into the content and y is the length of the portion you wish to use as the fast pattern (this is the "Pattern offset,length" field in the debug output). You can exceed the 20 byte truncation limit by doing this - Snort will take all of the specified bytes. Note that if you specify fast_pattern:only; on a pattern longer than the number of bytes specified in your configuration, the entire pattern will be used, regardless of its size.
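
The interaction between max-pattern-len, fast_pattern:x,y; and fast_pattern:only; can be summarized with a small Python sketch. It treats x,y as offset and length (matching the "Pattern offset,length" field in the debug output) and assumes truncation keeps the leading bytes of the pattern; that last point, like the function name, is an assumption for illustration rather than something taken from Snort's source.

```python
from typing import Optional


def effective_fast_pattern(pattern: bytes,
                           max_pattern_len: Optional[int] = None,
                           offset: Optional[int] = None,
                           length: Optional[int] = None,
                           only: bool = False) -> bytes:
    """Return the bytes that would be handed to the fast pattern matcher."""
    if offset is not None and length is not None:
        # fast_pattern:x,y; - use exactly the requested slice, even if it
        # is longer than max-pattern-len.
        if offset < 0 or length <= 0 or offset + length > len(pattern):
            raise ValueError("offset,length must fall inside the pattern")
        return pattern[offset:offset + length]
    if only:
        # fast_pattern:only; - the entire pattern is used, regardless of size.
        return pattern
    if max_pattern_len is not None:
        # Assumption: truncation keeps the first max_pattern_len bytes.
        return pattern[:max_pattern_len]
    return pattern


if __name__ == "__main__":
    host = b"Host: www.xpas2009.com"  # 22 bytes
    print(effective_fast_pattern(host, max_pattern_len=20))
    print(effective_fast_pattern(host, max_pattern_len=20, offset=6, length=16))
    print(effective_fast_pattern(host, max_pattern_len=20, only=True))
```

With a 20 byte limit the 22 byte Host string loses its last two bytes, whereas offset 6 and length 16 select just the hostname portion.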

With this new functionality in hand, the VRT is busy reviewing our entire ruleset, looking for places where rules can be optimized by proper tweaking of fast pattern settings. Expect to see thousands of changes to the rules over the next several weeks as we work through and implement all of these changes.

1 comment:

Will Metcalf said...

This might help some folks with custom rule sets identify non-unique patterns that are being added to fast-pattern as the final match.

http://rules.emergingthreats.net/projects/blackpattern/