Note: This is the introductory post of a five-part series assessing three common security analytics approaches that typically don’t work well. Future articles will examine each of the three in greater detail and then we’ll wrap up with a piece on how they could, and should, work.
Today’s security products are broken — and we all know it. The threats have morphed and multiplied. The attackers outmaneuver or overwhelm the defenders. Even the underlying doctrines are wrong for the times.
We also read in articles and white papers that ‘Security Analytics’ is the solution, the wave of the future, the Next Big Thing. At this meta level, though, the term has little meaning. So let’s unpack it a bit and examine three analytical approaches that are often held up as shining exemplars of this brave new world: Bayesian networks, machine learning, and rules-based systems. We know these three approaches well because they form the core of our Haystax Analytics Platform™, but we also see plenty of implementations where they don’t produce good results, don’t scale or are too hard to work with.
In this series we’ll explore why they’re so challenging and describe how we’ve addressed those challenges in our products. But first, a quick summary of the approaches and why they typically don’t work:
Bayesian probability theory says it’s possible to come up with a surprisingly accurate prediction of the likelihood of something happening (or not happening) in a transparent and analytically defensible way, and a Bayesian network (BayesNet) captures all the elements of the problem and its possible outcomes mathematically. The harder the problem, the better it works, in theory at least. In reality, the typical approach is to gather a roomful of PhDs, who take a really long time and much wrangling to build a BayesNet (boy, that math is hard). Then, with even greater difficulty and more man-hours, a roomful of coders turns the BayesNet into software. The resulting product is something the user struggles to even understand, let alone use. Not surprisingly, there’s an emerging camp that says BayesNets are old-fashioned and not suited to solving today’s security challenges, especially now that you can do machine learning.
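To make the Bayesian idea a little more concrete, here is a minimal sketch of the update at the heart of it: start with a prior belief that something is a threat, then revise that belief as each piece of evidence arrives. This is not a full BayesNet and not how any particular product works. The function names and all of the probabilities are made up for illustration, and the sketch assumes the pieces of evidence are independent of one another given the hypothesis.

```python
def posterior(prior, p_evidence_given_threat, p_evidence_given_benign):
    """Bayes' rule for a yes/no hypothesis: returns P(threat | evidence)."""
    numerator = p_evidence_given_threat * prior
    total_evidence = numerator + p_evidence_given_benign * (1.0 - prior)
    return numerator / total_evidence


def update_belief(prior, observations):
    """Apply Bayes' rule once per observation.

    Each observation is a pair:
    (P(evidence | threat), P(evidence | benign)).
    Assumes observations are conditionally independent given the hypothesis.
    """
    belief = prior
    for p_threat, p_benign in observations:
        belief = posterior(belief, p_threat, p_benign)
    return belief


# Hypothetical numbers: 1% of accounts are compromised to begin with.
# Observation 1: a login at an odd hour (12x more likely if compromised).
# Observation 2: an unusually large download (3x more likely if compromised).
observations = [(0.60, 0.05), (0.30, 0.10)]
belief = update_belief(0.01, observations)
print(f"Belief after both observations: {belief:.3f}")
```

The point of the sketch is the transparency: every number that goes into the final belief can be inspected and argued over, which is what makes the result defensible.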
In Arthur Samuel’s classic definition, machine learning “gives computers the ability to learn without being explicitly programmed,” and it can be used, for example, to uncover hidden insights from historical relationships and trends found in data. While that may have excited the circa-1959 set, we’ve had over 50 years to discover some of its limitations as well. First, there are no real, generalizable approaches to machine learning. Second, correlation ain’t everything in today’s world of black-swan scenarios and asymmetric threats. Third, most machine learning solutions come black-boxed; users who have to make and then defend their critical decisions hate that. And finally, hasn’t science taught us always to start with a hypothesis anyway?
Much simpler and more common than BayesNets and machine learning, rules-based systems have their own inherent drawbacks. Because they are typically binary, the outputs tend to be too coarse-grained for the often subtle threats they’re trying to detect and identify. This leads to a proliferation of red flags (many of them false positives), which then leads to a proliferation of pricey analysts. Try to create special rules for special cases, though, and you get a proliferation of rules. Paralysis reigns, and the world is still not safer.
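The coarseness problem is easy to see in a toy example. The sketch below compares a yes/no rule with a graded score for the same signal. The rule, the threshold and the scoring curve are all hypothetical choices made for illustration, not a description of any real product.

```python
import math


def binary_rule(failed_logins, threshold=5):
    """A typical yes/no rule: flag when the count reaches the threshold."""
    return failed_logins >= threshold


def graded_score(failed_logins, midpoint=5, steepness=1.0):
    """A graded alternative: a logistic curve mapping the count to (0, 1).

    The midpoint and steepness values are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (failed_logins - midpoint)))


for count in (4, 5, 6, 50):
    flagged = binary_rule(count)
    score = graded_score(count)
    print(f"{count:>3} failed logins  flag={flagged!s:<5}  score={score:.2f}")
```

Under the binary rule, four failed logins and five failed logins get opposite answers, while five and fifty get the same one. The graded score keeps both distinctions visible, so an analyst can rank what to look at first instead of wading through a pile of identical red flags.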
The good news is that these approaches can be harnessed in meaningful ways — as long as they’re thoughtfully built, combined and applied. We’ll explain how in the next several posts.
Bryan Ware is CEO of Haystax Technology.