Friday, January 8, 2010

VRT Guide To IDS Ruleset Tuning

Everyone who's ever used Snort, or any other IDS for that matter, for any length of time knows that in order to get the most out of their system, they need to tune it. Most people have at least a basic idea of what that means - choosing the right rules to run, placing the system at the right spot in the network, etc. - but judging from some of the questions that routinely come in to the VRT, apparently there are a lot of people out there who lack a full understanding of how to pick the right rules for their environment. I'm hoping this guide will help those people, and serve as a reminder to those who already know what they're doing.

Let me start off by saying that simply turning on all of the VRT Certified Rules - or all the rules from any published ruleset - is a Bad Idea™, especially if you're running in IPS mode and dropping packets. A number of rules are meant to be advisory in nature; for example, SID 11968 ("VOIP-SIP inbound INVITE message"), if configured with IP lists appropriate to your voice network, will tell you if you've got SIP traffic on segments of your network where it's not supposed to be. If you blindly enable that rule on a production network, you could instantly take down your phone system. Other rules can be performance-intensive, and should only be run if you really need the coverage. Thinking that the more rules you have, the better your protection will be (I'm looking at you, MSSPs) can cause you a world of hurt if you're not careful.

Your life as an IDS analyst will be much easier if you start by eliminating large chunks of the ruleset from your policy, so that you've got a manageable number of rules to individually look through. To that end, the VRT has done a pair of things to help you out. Historically, our rules have been broken down into large categories - Attack-Responses, FTP, Oracle, Web-IIS, etc. It's trivial to look at the 53 different categories we provide and turn entire groups of rules off at a time depending on their relevance to your situation; after all, if you don't run any Oracle servers, you can turn off the entire Oracle category without even worrying about it. For open-source users, this is as simple as commenting out, say, the "include $RULE_PATH/oracle.rules" line in your snort.conf; Sourcefire customers can take a similar step through the administration interface.
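
If you have more than a couple of categories to switch off, it can be worth scripting. Here's a minimal sketch in Python, assuming your snort.conf pulls in categories with lines of the form "include $RULE_PATH/<category>.rules"; the function name and the sample configuration are made up for illustration, and the script only returns new text rather than touching any file.

```python
import re

# Matches lines such as: include $RULE_PATH/oracle.rules
# (assumed layout; adjust the pattern if your snort.conf differs)
INCLUDE_RE = re.compile(r"^\s*include\s+\$RULE_PATH/([\w.-]+)\.rules\s*$")


def disable_categories(conf_text, categories):
    """Return conf_text with the include lines for the given categories commented out."""
    wanted = {c.lower() for c in categories}
    out = []
    for line in conf_text.splitlines():
        m = INCLUDE_RE.match(line)
        if m and m.group(1).lower() in wanted:
            out.append("# " + line.lstrip())
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical snort.conf fragment, for illustration only.
SAMPLE_CONF = """var RULE_PATH ../rules
include $RULE_PATH/ftp.rules
include $RULE_PATH/oracle.rules
include $RULE_PATH/web-iis.rules"""

print(disable_categories(SAMPLE_CONF, ["oracle"]))
```

Lines that are already commented out are left alone, since they no longer start with "include".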

Some of those categories, however, are relatively broad, and can't be turned off in one fell swoop - for example, Web-Client encompasses attacks against everything from Adobe Reader to Internet Explorer, from Mozilla Firefox to RealPlayer. Recognizing this shortcoming, Sourcefire added the "metadata" keyword to the ruleset in January of 2006. That keyword's primary purpose is to help collect rules into default policies, maintained by the VRT, so that users can assess the level of security they think is relevant to their network, and then have a recommended collection of rules to fit their security stance. One of the three default policies we maintain - Connectivity Over Security, Balanced, and Security Over Connectivity - should be a reasonable starting point for most real-world Snort administrators. People with Sourcefire appliances can already choose one of these policies as a starting point for their own custom IDS policy; open source users are now able to use a recently released feature from JJ Cummings' Pulled Pork tool to create policies based on metadata as well.
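
To give a feel for how metadata-based selection works, here's a rough Python sketch that picks out the SIDs belonging to a given policy. It assumes the policy is recorded in the rule as "metadata:policy <name> <action>, ...;" and that names look like "balanced-ips"; check the rules you actually have before relying on that. The three sample rules and their SIDs are invented for illustration, and this is not how Pulled Pork itself is implemented.

```python
import re

METADATA_RE = re.compile(r"metadata\s*:\s*([^;]*);")
SID_RE = re.compile(r"\bsid\s*:\s*(\d+)\s*;")


def policies_of(rule):
    """Return the set of policy names found in a rule's metadata option."""
    m = METADATA_RE.search(rule)
    if not m:
        return set()
    policies = set()
    for item in m.group(1).split(","):
        parts = item.split()
        # Assumed item shape: "policy <name> <action>"
        if len(parts) >= 2 and parts[0] == "policy":
            policies.add(parts[1])
    return policies


def sids_in_policy(rules, policy):
    """Return the SIDs of all rules whose metadata lists the given policy."""
    selected = []
    for rule in rules:
        sid = SID_RE.search(rule)
        if sid and policy in policies_of(rule):
            selected.append(int(sid.group(1)))
    return selected


# Invented example rules (SIDs in the local range), not real VRT rules.
RULES = [
    'alert tcp any any -> any 80 (msg:"EXAMPLE one"; content:"abc"; '
    'metadata:policy balanced-ips drop, policy security-ips drop; sid:1000001; rev:1;)',
    'alert tcp any any -> any 80 (msg:"EXAMPLE two"; content:"xyz"; '
    'metadata:policy security-ips drop; sid:1000002; rev:1;)',
    'alert tcp any any -> any 80 (msg:"EXAMPLE three"; content:"q"; sid:1000003; rev:1;)',
]

print(sids_in_policy(RULES, "balanced-ips"))
print(sids_in_policy(RULES, "security-ips"))
```

A rule with no metadata option, like the third example, simply ends up in no policy.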

We weigh a number of factors when determining which policies, if any, to include a rule in; since these factors will also be relevant to anyone reviewing an individual rule on their network, they're worth listing here:


  • Impact of the vulnerability: Essentially the same process that is used to determine a CVSS Score. Can the vulnerability be exploited remotely? Are authentication credentials required? How much user interaction is required? How reliably can the bug be exploited, and what type of compromise results from successful exploitation? Are there public exploits or proofs-of-concept? How widely adopted is the software in question? Obviously, a simple exploit in, say, a core Windows component that gives administrative privileges and has a virus in the wild is going to be included in all policies, whereas an unproven bug in Jim Bo Bob's PHP Bulletin Board that results in cross-site scripting will not. For the end user attempting to figure this out on their own, a close reading of the relevant CVE entry, Bugtraq listing, and vendor response (if available) should provide most, if not all, of this information; additionally, CVSS scores are publicly available, and can often serve as a simple shortcut for determining vulnerability impact.

  • Reliability of the rule: Simply put, some rules do detection better than others. When SID 14896 - which catches the exploit used by Conficker - fires, you're virtually guaranteed that malicious activity is in progress. DCE/RPC is a well-defined protocol that the VRT understands very well, and a very specific set of conditions must be present in order to trigger the vulnerability; thus, false positives will be minimal to nonexistent. On the other hand, when SID 7070 - which is designed as a generic rule to catch cross-site scripting attacks - fires, the likelihood of a false positive is relatively high, since the rule was intentionally written very broadly. While it may be difficult for an end-user to gauge a rule's reliability accurately, a good rule of thumb to use if you're trying to figure this out yourself is to look at the number of rule options and the size of the content matches - in both cases, the more, the merrier.

  • Performance impact of the rule: While a given rule's performance will necessarily vary based upon the environment it's operating in, there are several things that we know will almost always result in either a fast rule or a slow rule. For example, if your rule consists solely of a pair of long, unique content matches, it should be blazingly fast; in fact, the fast pattern matcher becomes exponentially faster as you feed it longer and longer content matches. Options like byte_test and byte_jump are also particularly quick. Complex PCREs - especially those that contain constructs like ".*" - will be slow, as will two- or three-byte content matches - especially common ones such as |BM| for bitmap headers - particularly if they're not bounded by clauses such as depth, offset, etc. Again, this can be tough for an end-user to fully understand on their own; as a general rule, though, the longer and more unique the content match you start with, the faster the rule will be.

  • Applicability of the rule: This factor is, of course, the most variable of all, depending on the environment a rule is being run in. However, some rules are clearly more applicable to a broad base of users than others: for example, a rule that catches an in-the-wild exploit will appeal to many more people than a rule designed to block Yahoo Messenger conversations. The good news for those of you playing at home is that this is the easiest metric to assess on-site; after all, you know your company's IT policy, you know what software you run, and as a result, you know whether a given rule will apply in your environment.

  • Age of the vulnerability: The longer it's been since a patch for a vulnerable piece of software was released, the higher the likelihood that any given system running that software has been patched, and the smaller the number of vulnerable hosts that remain. That's why, for example, SID 2318 - which covers a vulnerability in the open-source CVS content tracking system from 2003 - is not included in any policy, despite the fact that the exploit allowed attackers to write arbitrary files to vulnerable servers. If you've patched all of your machines against a given exploit, there's no reason to have your IDS look for that exploit (with the one important exception that if a rule is looking for a vulnerability that has occurred across a whole class of software - e.g. a buffer overflow in an FTP command - it may be a good idea to keep it enabled to protect against future vulnerabilities of that type).
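
The rules of thumb for reliability and performance (more rule options, longer content matches) can be roughed out in a few lines of Python. This is only a sketch under simplifying assumptions: it treats anything between "|" characters as hex bytes, does not handle escaped quotes inside content strings, and counts uricontent the same as content. The function names and the first sample rule are invented for illustration; the second sample reuses the "mhtml|3A|//" and "URL" matches discussed for SID 6510, not the full rule.

```python
import re

CONTENT_RE = re.compile(r'content\s*:\s*!?\s*"([^"]*)"')


def content_length(pattern):
    """Byte length of a content string, counting |..| hex blocks as raw bytes."""
    length = 0
    in_hex = False
    for chunk in pattern.split("|"):
        if in_hex:
            # Two hex digits per byte; spaces between bytes are ignored.
            length += len("".join(chunk.split())) // 2
        else:
            length += len(chunk)
        in_hex = not in_hex
    return length


def rule_profile(rule):
    """Summarise a rule: option count, longest content match, and PCRE use."""
    body = rule[rule.index("(") + 1 : rule.rindex(")")]
    # Split on semicolons that are not escaped with a backslash.
    options = [o.strip() for o in re.split(r"(?<!\\);", body) if o.strip()]
    lengths = [content_length(c) for c in CONTENT_RE.findall(body)]
    return {
        "options": len(options),
        "longest_content": max(lengths, default=0),
        "has_pcre": any(o.startswith("pcre") for o in options),
    }


# Invented example rules, for illustration only.
BROAD = 'alert tcp any any -> any any (msg:"EXAMPLE bitmap"; content:"BM"; sid:1000010; rev:1;)'
SPECIFIC = (
    'alert tcp any any -> any 80 (msg:"EXAMPLE mhtml"; content:"mhtml|3A|//"; '
    'nocase; content:"URL"; sid:1000011; rev:1;)'
)

print(rule_profile(BROAD))
print(rule_profile(SPECIFIC))
```

A low option count paired with a two- or three-byte longest match is a hint to look harder at a rule before enabling it; it is not a verdict on its own.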



So what do you do once you've narrowed your ruleset down to something more manageable? If you're in a hurry to get things deployed, it's probably OK to start running Snort at this point; you can tune as you go. From here on out, it's a matter of reviewing individual rules, which can take a considerable amount of time to do for the thousands of potential rules you may wish to be running. I'll go through some examples here, to give people a feel for what all can be involved in the process.


  • SID 818 ("WEB-CGI dcforum.cgi access"): Access to a known vulnerable script from 2001. This one's obvious - turn it off, no one runs this any more - but I'm including this one for a reason: to make the point that with any rule more than 5 years old (i.e. those with a SID under 3000 or so), the default assumption is that it should be turned off, and that a good reason should be found before you decide to enable it.

  • SID 2412 ("ATTACK-RESPONSES Microsoft cmd.exe banner"): This fires when the banner that is displayed when a Windows shell opens leaves your network on a port other than FTP or Telnet (where you might normally expect to see such a banner). I'm including this as an example of an older rule that has met the burden of proof to stay on: any time you see a shell opening up across the network, you'll want to know about it, because chances are high it means you've been compromised.

  • SID 13415 ("EXPLOIT CA BrightStor cheyenneds mailslot overflow"): This one requires a bit more thought. Obviously, you can disable it if you're not running CA BrightStor; however, if you are, you need to assess your patching process, and determine how confident you are that a 4-year-old server-side bug has been patched throughout your organization (which, unfortunately, is not always the case). Given that it has a 19-byte content clause, relatively simple detection otherwise, and is running on a specific port (138), if you're at all unsure as to your patch status, it wouldn't hurt to run it - particularly since this is a server-side exploit that could result in administrative privileges for an attacker, and an exploit was known to be running around in the wild.

  • SID 1842 ("IMAP login buffer overflow attempt"): Sure, the oldest reference in here is from 1999; however, you can see that there are lots of references across the years, up through 2007. Combine that with the fact that no specific product is mentioned in the name, and it's obvious that this rule catches a type of vulnerability commonly found among many different IMAP servers. If you're running one at all, it's probably best to leave this rule enabled.

  • SID 13517 ("EXPLOIT Apple QTIF malformed idsc atom"): Based solely on the name, you might be tempted to discard this. Looking up the CVE entry, however, shows that it's a buffer overflow in a widely deployed program, QuickTime, and that it's only two years old (which is still fairly current in the world of client-side exploits, where patch management is a much bigger issue than server-side). Checking the Bugtraq entry's exploit section, we see that there are no known exploits; combine this with the fact that it's got a four-byte content match and a single byte_test, which means it may be prone to false positives, and it makes the decision of whether to enable the rule almost a matter of personal preference and/or paranoia level.

  • SID 6510 ("WEB-CLIENT Internet Explorer mhtml uri shortcut buffer overflow attempt"): Yes, this is an older rule - the vulnerability is three and a half years old - but you can be virtually guaranteed that someone in your organization is running Internet Explorer, and it would be no surprise if you had, say, a sales guy whose laptop hadn't been updated in that time frame. The SecurityFocus exploits page only has proof-of-concept exploits, but buffer overflows are notoriously easy to turn into attacks that result in code execution. At first glance the rule may look slow - one of the two content clauses is "URL", and the PCRE is 64 characters long - but in reality performance should be solid because "mhtml|3A|//" is rare in web traffic, and the PCRE is looking for a relatively well-defined string. If I were looking at this, I'd leave the rule on for another year or so, just to be sure.

  • SID 15727 ("POLICY Attempted download of a PDF with embedded Flash"): This rule doesn't detect any specific exploits, just a type of document that has been known to have security problems that are difficult to detect accurately. Given the recent rash of 0-day attacks against Adobe products, you need to consider whether your organization actually has legitimate reason to be working with PDF files that have Flash videos embedded within them (a practice that's not particularly widespread). Unless you can identify specific reasons that you need such files, it's probably a good idea to enable this rule, to help protect against vulnerabilities you may not even know exist yet.
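
The first pass over old rules can be sketched in Python as well. This assumes the rough "SID under 3000" cutoff from the dcforum example and a hand-maintained set of older SIDs you've already decided to keep; the function names and the "disable"/"review" labels are invented for illustration. Everything that isn't old still needs the kind of individual review walked through above.

```python
OLD_SID_CUTOFF = 3000  # approximate: "a SID under 3000 or so"


def default_state(sid, keep_anyway=frozenset()):
    """First-pass decision: old rules default to off unless explicitly kept."""
    if sid < OLD_SID_CUTOFF and sid not in keep_anyway:
        return "disable"
    return "review"


def triage(sids, keep_anyway=frozenset()):
    """Map each SID to its first-pass decision."""
    return {sid: default_state(sid, keep_anyway) for sid in sids}


# SIDs from the examples above; 2412 and 1842 are old rules that earned their place.
KEEP = {2412, 1842}
for sid, state in triage([818, 2412, 1842, 13415], KEEP).items():
    print(sid, state)
```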

Does this sound like a lot of work? Definitely - but at the end of the day, doing this work will save you time later, as tuning like this will help to alleviate potential false positives, and will allow you to focus on actual attacks against your systems.

2 comments:

mish said...

This is a really great post, thanks Alex. One of those posts I'll probably refer to extensively in the upcoming future. :-)

Jeganesh said...

Hi Alex,

Nice tips. Anyway, I still don't get how to tune those rules. Do you mean that we just comment out the rules?
Please advise on this. Thanks.