Many organizations focus their detection strategy almost exclusively on malware, not realizing that attackers don't need it to compromise their networks. When you start to look at the extensive intruder behavior outside of malware, you quickly recognize the massive detection challenges we face today. Not only do these intruders change their techniques when they become easy to detect, but all too much of the detection available depends on events occurring at a single point in time. This inability to automatically correlate events over time, so that only intelligent, valuable alerts trigger, is one of the most significant limitations we need to overcome. I will explain why we need to leverage more sophisticated rules engines, for lack of a better term, with some basic examples.
Most IOCs are designed to indicate "bad only" activity
Network compromises are no longer likely to be a single piece of malware with a known signature running amok and installing itself on every system it can reach. Sure, there are always going to be infections in your organization which involve a piece of known malware, but more malicious activities come in the form of unknown software, attackers with interactive tools, and seemingly legitimate access through stolen credentials. The IOCs most organizations use to detect these activities are dependent on guaranteed negative attributes seen simultaneously, such as attempting to send unencrypted data on a non-standard port to a .su domain (because what good things are occurring on Soviet Union domains these days?).
This can be effective against certain threats, but it's the equivalent of searching for villains by checking every mustachioed man's house for stockpiles of rope and large mallets. Building up more and more rules for alerting on this kind of activity has helped a lot of organizations thwart attacks, but it also takes a great deal of experience with missed attacks to compile a solid series of rules, any of which may alert again or may never trigger at all. Do you alert when you see a man with both rope and mallet who just hasn't shaved in a day? What if someone with a mustache has been seen with both, but never at the same time? If the only certainty is a threat possessing all three attributes at once, this shows how limited even very strong indicators of compromise can be when used in solutions which only correlate data points at a single point in time.
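To make the limitation concrete, here is a minimal sketch of a "bad only" rule of the kind described above, written in Python. The `Connection` type, the `STANDARD_PORTS` set, and the `is_known_bad` function are hypothetical names invented for illustration, not part of any product. The point is simply that every attribute must be present in the same event for the rule to fire.

```python
from dataclasses import dataclass

# Hypothetical event shape, assumed for illustration only.
@dataclass
class Connection:
    dest_domain: str
    dest_port: int
    encrypted: bool

# Assumption: ports treated as "standard" for outbound web traffic.
STANDARD_PORTS = {80, 443}

def is_known_bad(conn: Connection) -> bool:
    """Fire only when all three negative attributes appear in one event."""
    return (
        not conn.encrypted
        and conn.dest_port not in STANDARD_PORTS
        and conn.dest_domain.endswith(".su")
    )
```

An attacker who changes any one attribute, for example by encrypting the traffic and sending it over port 443, falls outside the rule entirely, because the rule has no memory of anything seen before or after this single event.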
Attackers have adjusted to use a great deal more "grey" behavior
If you want to evade "bad only" detection, your best bet is to avoid activity that can only possibly be malicious. Since attackers are motivated people and motivated people adjust their approach to a problem when they have been unsuccessful, they have started using remote administration tools, stolen credentials, and a compromised machine in your company's primary country to serve as their access point. Then, to get the data out, they encrypt it and transmit over port 443. All of these actions, in isolation, are not enough to trigger any "known bad" alerts, so they have a nearly unlimited amount of time to operate, until a database administrator is lucky enough to notice his account being used without his permission.
These attackers are not the old school villain tying a damsel to railroad tracks, but closer to "The Talented Mr. Ripley". No single action is enough to raise your suspicion, and you occasionally get an uncomfortable feeling, but by the time you realize what is happening, you are already a victim. We don't have the luxury of viewing all of the events of an attack from the audience's viewpoint. That is left to the follow-up reports wrapping it into a single hindsight narrative scapegoating the security team as incompetent.
Building alerts for "possibly bad" activity causes alert fatigue
A lot of the better-funded teams recognized the shift toward questionable behavior rather than the obviously malicious, so they started building rules to alert on both. However, due to limitations in the rules engines at their disposal, they had to create a great many rules guaranteed to generate a lot of false positives. Without being able to look at activity even one minute before or after an event, alerts were the best way to take billions of events and pare them down to hundreds of thousands for triage. However, just as quickly as new attacker behaviors were learned and corresponding rules created, the triage teams started to lose their sense of urgency because they knew over ninety percent of the alerts would be false positives.
The teams creating the rules are not to blame. It is the rules engines themselves. Just as you can tweak a car alarm's sensitivity so it doesn't sound when a loud truck passes, you can tweak or remove rules which are too frequently triggered, but the amount of effort involved is tremendous because you are still tweaking the triggers based on simultaneous events. Think about the last time you saw anyone urgently rush over to make sure an alarming car wasn't being stolen. Car alarms can only recognize events in the moment, so they are never going to be as useful as a human interpreting a slight nudge of the car to mean nothing unless it is then followed by another indicator a few moments later.
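The car alarm comparison can be sketched as a rule that stays silent on a first event and alerts only if a follow-up event arrives shortly afterward. This is an illustrative Python sketch, not how any particular rules engine works; the `FollowUpRule` class, its parameters, and the event names are all hypothetical, and it assumes events arrive in time order.

```python
from datetime import datetime, timedelta

class FollowUpRule:
    """Alert when `follow_up` is seen within `within` of `first` for the same entity."""

    def __init__(self, first: str, follow_up: str, within: timedelta):
        self.first = first
        self.follow_up = follow_up
        self.within = within
        self.pending = {}  # entity -> time of the most recent `first` event

    def observe(self, entity: str, event: str, ts: datetime) -> bool:
        if event == self.first:
            # Remember the first event, but do not alert on it alone.
            self.pending[entity] = ts
            return False
        if event == self.follow_up and entity in self.pending:
            started = self.pending.pop(entity)
            # Alert only if the follow-up landed inside the time window.
            return ts - started <= self.within
        return False
```

For example, `FollowUpRule("nudge", "door_open", timedelta(seconds=30))` ignores a passing truck that only nudges the car, yet alerts when the same car's door opens ten seconds after the nudge.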
Detection for modern intruders must correlate actions across time
You aren't going to pick up the nuance of tripley not being the real Tom Ripley simply by creating a rule to flag every account which opens two concurrent VPN sessions from different originating IP addresses. The chances that these sessions overlap are incredibly slim, even if the credentials are stolen, because the intruder is going to use various means to access the network, change accounts frequently after initial access, and use a lot more "low and slow" actions than the malware of years past. We don't have the benefit of enjoying every single moment as dogs do, but as humans recognize cause and effect rather well, we can realize when an event explains what occurred two minutes earlier. We need our rules engines and detection to be able to effectively identify the sequence of negative events over time.
At Rapid7, this recognition and comparison of events over time was built into the UserInsight solution from day one. We've moved from static indicators of compromise, such as concurrent VPN sessions, to behavioral patterns happening over the course of an hour or day. We are not trying to create the most IOCs of any solution ever, but each is designed as an intelligent, low-noise alert which can wait after witnessing event X until a follow-up event Y occurs within a given time period before triggering. Accessing a new asset for the first time is not noteworthy enough to cause a team to investigate the event, but accessing a few new assets in an hour might be extremely abnormal for the real tripley and, thus, warrant an alert.
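As a rough illustration of the new-asset example, the Python sketch below counts first-time asset accesses per user inside a sliding one-hour window. It is not UserInsight's implementation; the `NewAssetTracker` class, the threshold of three, and the event fields are assumptions made for the example, and events are assumed to arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

class NewAssetTracker:
    """Alert when a user touches several never-before-seen assets in a short window."""

    def __init__(self, window: timedelta = timedelta(hours=1), threshold: int = 3):
        self.window = window          # assumed window size
        self.threshold = threshold    # assumed count that counts as abnormal
        self.seen = defaultdict(set)      # user -> assets accessed at any time
        self.recent = defaultdict(deque)  # user -> timestamps of first-time accesses

    def observe(self, user: str, asset: str, ts: datetime) -> bool:
        if asset in self.seen[user]:
            return False  # a familiar asset is not noteworthy
        self.seen[user].add(asset)
        recent = self.recent[user]
        recent.append(ts)
        # Drop first-time accesses that have aged out of the window.
        while recent and ts - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.threshold
```

A single first-time access returns `False`, as does the same count spread over several hours, while three new assets inside the hour returns `True`. In practice the threshold would need to reflect what is normal for each user rather than a fixed number.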
Near-time events are much more likely than simultaneous events, but they're a much bigger analytical challenge. UserInsight leverages its cloud-based analytics platform to detect these types of events, and we're constantly building out even more sophisticated intruder behavior patterns. As older solutions now offer you the ability to Hadoop this data and NoSQL that data with the promise to immediately bring all the benefits of security analytics, you need to ask how it improves your team's ability to detect and investigate attacks. There are benefits to these technologies when built into the core of solutions, but as an afterthought, they give your team a lot more questions you can ask your historical data for reporting purposes, but not smarter alerts or clear answers.
If you want to learn more about UserInsight and our approach to detecting intruder behavior, check out the Rapid7 Incident Detection and Response page. I think you'll find we have the services and solutions to bring low-noise incident alerting into your organization.