Many of the security threats that organizations face every day are technological in nature, but at their core they are still perpetrated by people. This applies regardless of whether the threat is a network exploit or the theft of valuable intellectual property. Moreover, not all threats originate from outside an organization; often it is the trusted insider who poses the greatest risk.
Further complicating matters is the fact that an insider threat can originate from a variety of situations. Some attacks will be malicious — perhaps from the greedy accountant who sees an opportunity to steal money, or the disgruntled IT support rep bent on sabotaging his company’s computer systems. But other threats could be due to a willfully negligent employee who misuses his work computer and exposes sensitive customer data, or from a distracted manager who unwittingly clicks on a phishing link in an email.
In this multilayered and asymmetric security environment, security decision-makers must therefore be effective at managing not just the technological aspects of risk but the human element as well. Failure to do so could result in harm to the enterprise’s finances, personnel, assets, networks or reputation.
Since the mid-1990s a growing number of companies have responded to the rise of cyber threats with tools designed to detect and analyze a range of adverse events. Initial solutions focused on ‘end-point’ devices such as laptops, then expanded in the early 2000s to detect threats across entire networks. Big-data analytics came along in the mid-2000s to aggregate and correlate burgeoning volumes of threat information across networks and devices, spawning a family of solutions called security information and event management (SIEM).
SIEM tools have evolved considerably in the last 10 years or so, but remain centered on collecting and correlating network events. In response, the security analytics community has started deploying a new generation of more sophisticated solutions generally known as user behavior analytics, or UBA* for short.
UBA is primarily focused on human activity and behavior, and is well suited to applications for detecting cyber attackers and insider threats. UBA solutions use a combination of analytic approaches — including rules-based, pattern-matching and statistical methods, and in some cases supervised and unsupervised machine learning — to establish baselines of how people typically use networks and devices, and then to detect significant anomalies in their behavior and send alerts to security analysts for further validation.
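The statistical flavor of this baseline-and-anomaly approach can be sketched in a few lines. This is an illustrative toy, not Haystax's implementation: it builds a per-user baseline from a history of daily login counts and flags days that deviate beyond a z-score threshold. The function name, data and threshold are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(daily_counts, threshold=2.5):
    """Flag indices whose value deviates from the user's baseline
    by more than `threshold` sample standard deviations.
    Purely illustrative of the baseline/anomaly idea."""
    mu, sigma = mean(daily_counts), stdev(daily_counts)
    if sigma == 0:
        return []  # perfectly uniform history: nothing is anomalous
    return [i for i, x in enumerate(daily_counts)
            if abs(x - mu) / sigma > threshold]

# A user who normally logs in ~10 times a day, with one 60-login spike:
history = [9, 11, 10, 10, 12, 9, 60, 10, 11, 10]
print(flag_anomalies(history))  # the spike at index 6 is flagged
```

In a real UBA pipeline the flagged anomaly would not be an alert by itself; it would become one piece of evidence passed to analysts (or, as described later, to a model) for validation in context.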
Source data for UBA systems typically includes structured data such as logs from network devices, SIEM platforms and endpoints. Gartner adds that ingesting unstructured information such as performance appraisals, travel records and social media activity “can be extremely useful in helping discover and score risky user behavior” because it provides better historical context.
Focusing on user behavior is a step forward for security analytics, but it won’t be sufficient to counteract the latest wave of rapidly evolving threats. In this white paper we will examine important lessons learned from the application of current UBA solutions, and describe a model-based approach that fundamentally redefines what we mean by effective user behavior analytics.
In today’s security environment, threat data is plentiful. As this bow wave of data increases, expectations for harnessing it to solve complex insider threat problems are also rising.
Unfortunately, many existing data mining techniques and SIEM/UBA tools fail to solve such problems, and in some key respects they actually make matters worse. Patterns are discerned where there are none, key signals are missed entirely and false positives rapidly escalate. The cybersecurity company FireEye recently polled C-level security executives at large enterprises worldwide and found that 37% of respondents received more than 10,000 alerts each month, of which 52% were false positives and 64% were redundant alerts. This represents a huge burden on companies, since approximately 40% of them manually review each alert, overwhelming their analyst teams.
With the continuing proliferation of smartphones, wireless networks and (most recently) Internet-of-Things (IoT) devices — not to mention the people using them — the volume, velocity and variety of signals containing potentially actionable threat intelligence will balloon even further, meaning the ‘analyst overload’ problem will only get worse.
Over the last several years Haystax has worked with leading private-sector enterprises, and with agencies in the US government, developing and deploying risk management software solutions across a variety of critical missions. One key area of specialization has been insider threats.
This experience has given us a front-row seat to the issues organizations confront every day as they work to implement a comprehensive insider threat mitigation program across their enterprise. Here are the five most common lessons our partner organizations have learned.
Even the best-designed insider threat program will fall short if it relies solely on the analytical outputs of rules-based or machine-learning systems to monitor user and device behavior. Rules-based systems do well at flagging anomalies based on known behaviors but also tend to be too coarse-grained for the threats they are trying to detect, leading to an excess of red flags (most of them false positives) that overwhelm analysts. If more rules are added to manage a growing list of exceptions, the entire system becomes unwieldy very quickly.
Machine learning systems can handle massive quantities of log data and automatically recognize patterns and correlations in that data ‘on the fly’ – but the systems must constantly be trained and retrained by experts. They tend to miss weak signals because they rely on statistical correlation, and they can’t say why a particular activity is anomalous. Moreover, like rules-based systems they are ineffective at detecting unprecedented ‘black swan’ scenarios, including many of the latest wave of asymmetric threats.
As threats have proliferated, the limitations of these data-driven UBA approaches have become increasingly apparent. Even more troubling: by the time a threat is detected, the attack has already occurred.
What all strictly ‘big-data’ systems have in common is a lack of expert human judgment; they operate on the opposing principle that ‘the data is the model.’ To be truly effective, though, UBA solutions must reason the way the best analysts do – by assembling many pieces of disparate information about an individual and fusing them into a picture of composite risk. In fact, in today’s multilayered threat environment they must leverage the combined wisdom of multiple experts across multiple domains. Given the scale and speed of the incoming data being analyzed, the analytics must be able to automate much of this human reasoning process, allowing the system to scale up to process thousands of events continuously as though they’d been individually evaluated by a team of experts.
Existing enterprise systems contain a wealth of data that can provide key insights and indicators to enhance the overall signal. But absent sufficient internal resources to analyze them properly, the signals are often ignored. UBA solutions should take advantage of as many internal sources as possible, not just badge scans and existing network feeds but also performance reviews, travel documents, HR records, CRM data and threat intelligence feeds. Even public and third-party sources – for example, criminal and financial records, as well as open-source data – should be used for evidence that can bolster often weak internal signals. And an effective system should be adaptable enough to leverage these sources without requiring complex integrations or the building of an entirely new infrastructure.
It goes without saying that a modern analytics solution must be designed for scale. But for threat detection and analysis, scale means much more than managing large amounts of data with computational power and throughput. Most critically, the system must be designed to minimize the number of analysts required to investigate alerts, even as the data load grows exponentially.
Reducing false positives and focusing analysts on the most important risks while absorbing an ever increasing amount of data therefore requires an additional layer of sophisticated reasoning algorithms that can fuse a wide variety of data types to provide the context needed to rapidly identify serious threats – without generating correspondingly high volumes of nuisance alerts.
Many of the rules-based and machine-learning systems being deployed today are closed-loop or ‘black box’ solutions, meaning the underlying analytic processes and algorithms remain unknown to the user. The bulk of today’s cases revolve around sensitive personnel and corporate security issues, and any deployed system must provide transparency into what factors raised an individual’s threat score, and when. Organizations that take proactive steps to mitigate risks must be able to explain and defend how and why they arrived at their decision.
Likewise, in order to tap into multiple data sources for context and reinforcement of weak signals, UBA system interfaces must be open. This enables effective system integrations and avoids the common traps of walled gardens, vendor lock-in and expensive data integration projects. The solution should provide a means not just to bring data in but to share the UBA solution’s insights with other enterprise systems.
As organizations have learned what doesn’t work well from their own experiences, they have developed a clear sense of what is required to prevail in today’s evolving threat landscape. And there is an emerging consensus among them that any UBA solution that aspires to manage insider threats as part of a world-class enterprise security and risk management program must have the following three core characteristics.
Organizations want more than just a threat detection system that tells them an attack has already taken place. They need an early warning system that allows them to anticipate major adverse events through a comprehensive threat assessment framework that leverages all available internal and external data, while producing far fewer false positives. The way to do this is to build an expert model of their specific security challenges, in coordination with their own analysts, and then run all available and relevant data sources against that model in software, and in real time. Another key to becoming more predictive is that the system must be able to display indicators and issue alerts as soon as a high-priority risk is detected.
The system must be able to evolve as the understanding of a threat improves over time, and to adjust to shifting organization-specific needs. It also should be able to integrate with existing enterprise systems that contain potentially valuable intelligence. The capability should additionally be flexible and open enough to provide transparency and traceability in analytic results, and it naturally must comply with an organization’s own security protocols and technology standards.
The system must be able to absorb new data loads regardless of format or volume, as soon as the data becomes available. More importantly, as threats spike, the system should be able to handle the increased load without either bogging down or producing a surge in false positives that overwhelms the existing analyst team. And the system must scale to an organization’s national or global footprint as it grows.
Haystax Technology believes that decision-makers get better answers to complex problems — and start getting them more quickly — when careful analysis takes precedence over ‘big data’ collection and mining. That is why we first define a problem set and extract knowledge from subject-matter experts and reference materials, then build a mathematical representation, or model, of that particular problem domain. Only then do we identify and apply data to the model.
Our unique patented approach blends the resulting qualitative model-based expert judgments with quantitative artificial intelligence techniques to find threat signals buried inside multiple internal and external data sources, producing prioritized indicators of risk and rapidly delivering those indicators as actionable intelligence to analysts and decision-makers.
Haystax’s fully operational UBA solution, known as Constellation for Insider Threat, has been deployed at public and private organizations to establish and continuously validate the trustworthiness of personnel (plus contractors and vendors), regardless of their roles or the level of potential risk they pose.
Within an enterprise, typical users and uses for the Haystax solution include:
One of the key benefits of the Haystax solution is that it will indicate not only insiders who might do harm to an organization, but also individuals who exhibit the highest levels of trustworthiness and the fewest issues of concern, as well as everyone in between.
Because it starts with a model, Constellation for Insider Threat is the opposite of most data-driven UBA solutions deployed today, providing users with a more predictive way to pinpoint their highest-priority threats at very large scale without being deluged with noisy alerts or having to hire an army of analysts.
Constellation for Insider Threat is the ideal solution for users who:
Haystax builds several types of models, but the most critical is the Bayesian inference network, which is able to efficiently represent probabilistic knowledge about a complex problem domain and then use that knowledge to reason intelligently in the domain under conditions of uncertain, incomplete or even contradictory data.
The Bayesian network first captures a set of top-level concepts from the problem domain. For example, for insider threat applications it seeks to know above all else if a person is trustworthy, or reliable. It then captures the relationships between the concepts (e.g., ‘trustworthy people are likely to be reliable’), and most importantly the strength of the relationships between the concepts (e.g., ‘trustworthy people are likely to be very reliable’).
In addition, the Bayesian network (see image below) captures our knowledge about how knowing one concept influences our belief in another. When concepts and their influences are connected in a multi-level network, a Bayesian network can capture multiple levels of influence, and thus express the relationships and dependencies of many related concepts in a complex domain.
Individual nodes of the Bayesian network break each concept down into smaller and smaller sub-concepts until they become causal indicators that are measurable in data. In order to use a Bayesian network as a model, data is applied to it, in a process called ingestion, to set the belief states of the variables. The Bayesian network update algorithm then propagates the beliefs to all other concepts in the model.
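The belief-propagation idea described above can be illustrated with a deliberately tiny example: a hypothetical two-node network, Trustworthy → Reliable, in which observed evidence about reliability updates the belief about trustworthiness via Bayes' rule. The probabilities here are invented for illustration and are not drawn from the Carbon model; real networks have many levels and hundreds of nodes, but each update follows the same principle.

```python
# Toy two-node Bayesian network: Trustworthy -> Reliable.
# All probabilities are illustrative placeholders, not model values.

P_TRUSTWORTHY = 0.95          # prior belief that a person is trustworthy
P_REL_GIVEN_T = 0.90          # 'trustworthy people are likely to be very reliable'
P_REL_GIVEN_NOT_T = 0.40      # untrustworthy people are less often reliable

def update_trustworthy(observed_reliable: bool) -> float:
    """Propagate evidence about Reliable back to Trustworthy (Bayes' rule)."""
    like_t = P_REL_GIVEN_T if observed_reliable else 1 - P_REL_GIVEN_T
    like_not_t = P_REL_GIVEN_NOT_T if observed_reliable else 1 - P_REL_GIVEN_NOT_T
    numerator = like_t * P_TRUSTWORTHY
    return numerator / (numerator + like_not_t * (1 - P_TRUSTWORTHY))

print(round(update_trustworthy(True), 3))   # evidence of reliability raises belief
print(round(update_trustworthy(False), 3))  # evidence of unreliability lowers it
```

Note how a single piece of contrary evidence lowers but does not destroy the prior belief; accumulating many weak signals is what eventually shifts the posterior decisively, which is why this kind of model degrades gracefully under incomplete or contradictory data.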
Some nodes can be informed fairly directly by input data, but in other cases we will use machine-learning, rules-based and other artificial intelligence techniques — collectively known as augmentation — to enrich the data and generate new events. Augmentation is particularly useful for flagging anomalies and discrepancies, which can then be used as evidence in the Bayesian inference network.
It is important to note that Haystax employs a recurring ‘test and tune’ process to ensure that all model elements work as expected, and that any model or data issues are identified and resolved quickly. During model tuning we seek out opportunities to define new concepts, new augmentations and new data, and to refine the model accordingly.
The Bayesian network at the heart of Constellation for Insider Threat is a probabilistic model called Carbon that mathematically represents over 700 different human behaviors and life events as key risk indicators, culled from subject-matter experts in psychology, cybersecurity, fraud, HR and other relevant domains (see image below).
Carbon is intuitively simple, in that it uses common phrases and terminology familiar to all security analysts. It was developed by our data scientists to detect individuals who show an inclination to commit or abet a variety of adverse malicious insider acts, including: committing fraud (see image on page 6); leaving a firm or agency with stolen information, or selling the information illegally; sabotaging an organization’s reputation, systems or infrastructure; and committing workplace violence. It also can detect acts of willful negligence and unwitting or accidental behavior that could harm an organization.
In this way Carbon reflects the broad and disparate array of underlying causes and drivers of insider threat behavior. Malicious acts are accounted for with nodes that address factors such as disgruntlement, financial pressures, mental illness, ideology, greed, etc.; negligence is detected with indicators of rule-flouting, hubris, careless attitudes, etc.; and inadvertent behaviors are indicated by fatigue, human error, substance abuse, etc.
One of the added benefits of using a model-first approach is that it provides a structure for identifying data that will be applied to the model. In this way users can select the data they need (and pinpoint any critical data gaps they may have) while ignoring data that’s unlikely to be useful.
Data applicable to Constellation for Insider Threat can take many forms. It can be structured or unstructured, streaming or in batches. It can include network and printer activity, building access logs, investigations case data, employment applications, performance reviews, travel and HR records and any other internally sourced documents or files about individuals in the organization. It can also include public records, third-party data and even social media posts and news/RSS feeds related to the individual. Employing simple connectors and configuration tools, users can add new data as they discover it and apply it to the problem at hand.
There are literally dozens of distinct threat signals that could be visible to an enterprise, and for which data is readily available. Among the indicators being monitored within the Carbon model are: pending departmental layoffs (HR plans); accessing workplace at unusual times of day (badge in/out logs); copying or printing large or sensitive files (network behavior, printer logs); unauthorized access and/or escalating privileges (network behavior); poor workplace performance reviews (HR files); financial difficulties (public bankruptcy, divorce records); narcissistic behavior (HR, staff/management reports); substance abuse (public arrest records, SF-86, HR); and family relationship issues (public divorce, criminal records).
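The normalization step implied above — turning heterogeneous raw events into named indicators a model can consume — can be sketched as follows. This is a hypothetical illustration; the source names, record fields, thresholds and indicator labels are all invented and do not reflect the Constellation implementation.

```python
# Hypothetical normalization of raw events from disparate enterprise
# sources into named risk indicators. All names/thresholds are invented.

RAW_EVENTS = [
    {"source": "badge_log", "user": "jdoe", "hour": 2},     # 2 a.m. building entry
    {"source": "printer",   "user": "jdoe", "pages": 1200}, # very large print job
    {"source": "hr_review", "user": "jdoe", "rating": 1},   # rating on a 1-5 scale
]

def to_indicators(events):
    """Map raw source events to (indicator, user) pairs for a risk model."""
    indicators = []
    for e in events:
        if e["source"] == "badge_log" and not 6 <= e["hour"] <= 20:
            indicators.append(("after_hours_access", e["user"]))
        elif e["source"] == "printer" and e["pages"] > 500:
            indicators.append(("large_print_job", e["user"]))
        elif e["source"] == "hr_review" and e["rating"] <= 2:
            indicators.append(("poor_performance_review", e["user"]))
    return indicators

for name, user in to_indicators(RAW_EVENTS):
    print(user, name)
```

Each resulting indicator would then set the belief state of a corresponding node in the model, so that no single signal triggers an alert on its own; it is the fused combination that drives the risk score.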
The results of this seemingly complex collection of data are made available as easy-to-understand risk ratings that give an organization, at both macro and micro levels, a clear view of the specific risks it faces. Beyond the simple score, analysts and other users can drill down into the specific factors that led to the overall risk score, ensuring they have a complete understanding of the situation and are able to take appropriate mitigation actions.
Haystax has engineered Constellation for Insider Threat so that it can be extended to handle any velocity or volume of data. And because the system has the ability to map new data inputs in real time, users can get continuously updated awareness of their security environment and immediately detect even slight changes in threat levels.
Constellation for Insider Threat can be run as software as a service (SaaS) in the cloud or on-premises to meet an organization’s specific requirements. Haystax deploys each system with a simple interface that contains a variety of visualization and reporting tools so that every user, from an analyst in the SOC to a decision-maker in the C-suite, is provided with a tailored way of viewing data and understanding the risk results generated by the system.
The nerve center of Constellation for Insider Threat is the Dashboard, a single-screen environment for viewing threatening patterns and trends within an organization (see image below). The Dashboard contains windows that highlight specific high-risk individuals as well as organization-wide trends — color-coded and mapped to grids for easy viewing — providing users with a rapid way of understanding what is driving a particular threat. The Carbon model itself can also be viewed in the Dashboard, and users can generate a series of custom reports directly on-screen.
Besides displaying analytical results, Constellation for Insider Threat features a series of applications that correspond to existing workflows and additional security-related tasks like managing data on individuals, performing assessments, managing incidents and viewing threat information.
For example, the Assets app helps users analyze in greater depth each monitored individual in the system (see image below), providing all the information they need in one location. Editable data fields include address, contact information and custom tags to describe risk levels, roles in the organization, departmental assignments, etc. An events field captures birth date, education and employment history, awards, military service and other life milestones. Model results displaying specific indicators of the individual’s level of trustworthiness can also be viewed in the Assets app.
Personnel assessments are another key element of investigative workflows, providing critical intelligence on staff whose activities have been detected in the system. These can be carried out in the Assessments app. Pre-packaged assessment templates include Supervisor Interview, Insider Employment Application, Insider Risk Level, Insider Watch-List Screening and Insider Reference Check; users can upload their own existing personnel assessment forms as well. These assessments ease corporate workflows and help ensure all those who need to know are kept in the loop when it comes to making critical insider threat-related decisions.
Haystax Technology has consistently emphasized a pioneering model-driven approach to security analytics. In our early work on critical infrastructure protection, we built and applied models of threat likelihoods, asset vulnerabilities and post-attack consequences to analyze and prioritize an organization’s asset risks.
More recently, we have developed new models and analytic approaches optimized for detecting and prioritizing adverse cyber events — such as malware attacks, financial fraud, IP theft and other exploits — regardless of whether the sources of the threat are trusted insiders, external actors or a combination of both.
This approach enables Constellation for Insider Threat to filter out much of the noise that would be generated by other UBA tools. But it goes a step further as well. By incorporating additional data on each individual in an organization, it builds a more textured picture, so that concerning behavior is detected early enough for a manager to offer help to an employee going through difficulties.
It seems that every time there’s a major insider event, we later learn it was possible to conclude from readily available evidence that the individuals behind the events were high risk, or at least were deviating from their normal patterns of life. By favoring a model-driven approach over conventional big-data techniques, Constellation for Insider Threat excels at intelligently correlating information about a person in broader context and in finding individuals who are still in the earliest stages of becoming a threat. In other words, the system is predictive rather than reactive. In addition, the system not only tells our users who may be risky, but why, providing an unprecedented level of transparency to decision-makers.
Thanks to open APIs and simple model tuning, moreover, the system is adaptable enough to integrate with a wide and growing array of existing information sources and SIEM/UBA data, and to keep pace with the latest threats. Finally, Constellation for Insider Threat has already been proven across thousands of personnel at leading corporations and large government agencies, demonstrating that it can scale to the size and scope of any organization, and any threat.
* Leading industry analyst groups use somewhat different terms and definitions for UBA, but all are essentially describing a similar range of features and capabilities. Gartner refers to User and Entity Behavior Analytics (UEBA), Forrester has Security User Behavior Analytics (SUBA) and Enterprise Strategy Group favors Security Operations and Analytics Platform Architecture (SOAPA). In this white paper we will use UBA as the standard term.
Haystax Technology is a leading security analytics platform provider, delivering advanced security analytics and risk-management solutions that enable rapid understanding and response to virtually any type of cyber or physical threat. Based on a patented model-driven approach that applies multiple artificial intelligence techniques, it reasons like a team of expert analysts to detect complex threats and prioritize risks in real time at scale. Top federal government agencies and large commercial enterprises, as well as state and local public-safety organizations, rely on Haystax for more effective protection of their critical systems, data, facilities and people.