
Principled AI with Probabilistic Machine Learning

May 29, 2018

At Haystax Technology, we are proponents and early adopters of principled approaches to machine learning (ML) and artificial intelligence (AI) for cybersecurity.

We use the term ‘principled AI’ to describe what we call our Bayesian AI approach, which is built on the coherent mathematical principles of probability theory, information theory and Bayesian decision theory. These principles help us keep our AI transparent, explainable and interpretable. Most importantly, they enable our systems to quantify uncertainty, unlike the black-box approach of deep neural networks. Our users and followers often hear us evangelize this principled approach through publications and conferences, boot camps and local meetups.

Last month, I gave a presentation titled “Introduction to Probabilistic Machine Learning using PyMC3” at two local meetup groups (Bayesian Data Science D.C. and Data Science & Cybersecurity) in McLean, Virginia. The following is a summary of the concepts we discussed during the meetup.

General Overview

Many data-driven solutions in cybersecurity rely heavily on machine learning to detect and predict cyber crimes. This may include monitoring streams of network data and flagging unusual events that deviate from the norm, for example, an employee downloading large volumes of intellectual property on a weekend. Immediately, we face our first challenge: we are dealing with quantities (unusual volume, unusual period) whose values are uncertain. To be more concrete, we start off very uncertain whether this download event is unusually large, and then become more and more certain as we uncover additional clues, such as the day of the week, the employee’s performance reviews, or whether they have visited WikiLeaks.
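To make the idea of "getting more certain as clues arrive" concrete, here is a minimal sketch in Python that updates a belief one clue at a time using Bayes’ rule in odds form. The prior and the likelihood ratios are hypothetical numbers chosen only for illustration, and multiplying them in sequence assumes the clues are conditionally independent, which a real model would need to justify.

```python
def update(prior_prob, likelihood_ratio):
    """Return the posterior probability after one clue (Bayes' rule in odds form)."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical starting belief that the download is genuinely unusual.
prior = 0.01

# Hypothetical likelihood ratios: how much more likely each clue is
# when the download is unusual than when it is routine.
clues = [
    ("download happened on a weekend", 5.0),
    ("recent poor performance review", 3.0),
    ("visited WikiLeaks", 10.0),
]

belief = prior
history = [belief]
for name, likelihood_ratio in clues:
    belief = update(belief, likelihood_ratio)
    history.append(belief)
    print(f"after '{name}': p(unusual) = {belief:.3f}")
```

Each clue moves the belief further from the prior, and no single clue is treated as conclusive on its own.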

In fact, the need to deal with uncertainty arises throughout our increasingly data-driven world. Whether it’s Uber autonomous vehicles needing to predict pedestrians on roadways or Amazon’s logistics apparatus having to optimize its supply chain, these applications are compelled to handle and manipulate uncertainty. Consequently, we need a principled framework for quantifying uncertainty that will allow us to create applications and build solutions in ways that can represent and process uncertain values.

Fortunately, there is a simple framework for manipulating uncertain quantities, and it uses probability to quantify the degree of uncertainty. As Prof. Zoubin Ghahramani, Uber’s Chief Scientist and Professor of AI at the University of Cambridge, put it:

The mathematical language for representing uncertainty is probability theory. So in the same way as calculus is the language for thinking about rates of change, probability theory is the mathematical language for representing uncertainty.

This has resulted in a principled approach to machine learning based on probability theory called probabilistic machine learning. It is an exciting area of research that is currently receiving a lot of attention in conferences (NIPS, UAI, AISTATS), journals (JMLR, Nature), open-source software tools (TensorFlow Probability, Pyro) and practical applications at notable companies such as Uber AI, Facebook AI Research, Google AI, and Microsoft Research.

Probabilistic Machine Learning

In general, probabilistic machine learning (PML) can be defined as an interdisciplinary field focusing on both the mathematical foundations and practical applications of systems that learn models from data. It brings together ideas from statistics, computer science, engineering and cognitive science as illustrated in the figure below.

Image Credit: https://mlg.eng.cam.ac.uk/zoubin/

In this framework, a model is defined as a description of data one could observe from a system. In other words, a model is a set of assumptions that describe the process by which the observed data was generated. This model can be developed graphically in the form of a probabilistic graphical model as illustrated in the figure below.

The circular nodes above represent random variables for the uncertain quantities (e.g., unusual volume or unusual period) and the square nodes represent the uncertainty over the corresponding quantities (e.g., the probability of unusual volume). The downward arrow shows the direction of the process that generated the data. The upward arrow shows the direction of inference, that is, given observed data we can learn the parameters of the probability distributions that generated the observed data. As we observe more and more data, our uncertainty over the random variables (e.g., unusual volume) decreases. This is the modern view of machine learning according to Prof. Chris Bishop of Microsoft Research.
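The two directions described above can be sketched with a very small model. The example below is an illustration under stated assumptions, not the model from the talk: it assumes each download is "unusual" with a fixed but unknown rate, simulates events from that process (the generative direction), and then infers the rate with a conjugate Beta(1, 1) prior (the inference direction). The `true_rate` value and the helper `beta_mean_sd` are hypothetical.

```python
import random


def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance ** 0.5


rng = random.Random(0)   # fixed seed so the run is repeatable
true_rate = 0.2          # hypothetical chance that a download is unusual

# Generative direction: simulate 1,000 download events from the assumed process.
events = [rng.random() < true_rate for _ in range(1000)]

# Inference direction: update a Beta(1, 1) prior on the rate as data arrives.
results = []
for n in (0, 10, 100, 1000):
    k = sum(events[:n])                        # unusual downloads seen so far
    mean, sd = beta_mean_sd(1 + k, 1 + n - k)  # conjugate Beta posterior
    results.append((n, mean, sd))
    print(f"n={n:4d}  posterior mean={mean:.3f}  sd={sd:.3f}")
```

The posterior standard deviation shrinks as `n` grows, which is the sense in which uncertainty over the random variable decreases with more data.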

Learning follows from two simple rules of probability, namely:

  • The sum rule: p(\mathbf{\theta}) = \sum_{y} p(\mathbf{\theta}, y)
  • The product rule: p(\mathbf{\theta}, y) = p(\mathbf{\theta}) p(y \mid \mathbf{\theta})
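As a quick numerical check of the two rules, the sketch below uses a small joint distribution over two binary variables. The probabilities and the labels ("normal"/"unusual" for the hypothesis, "small"/"large" for the observed download volume) are made up for illustration.

```python
# Hypothetical joint distribution p(theta, y).
joint = {
    ("normal", "small"): 0.72,
    ("normal", "large"): 0.18,
    ("unusual", "small"): 0.02,
    ("unusual", "large"): 0.08,
}
thetas = ("normal", "unusual")
ys = ("small", "large")

# Sum rule: marginalize y out of the joint to obtain p(theta).
p_theta = {t: sum(joint[(t, y)] for y in ys) for t in thetas}

# Conditional p(y | theta), defined as joint / marginal.
p_y_given_theta = {
    (y, t): joint[(t, y)] / p_theta[t] for t in thetas for y in ys
}

# Product rule: p(theta, y) = p(theta) * p(y | theta) recovers the joint.
for t in thetas:
    for y in ys:
        rebuilt = p_theta[t] * p_y_given_theta[(y, t)]
        assert abs(rebuilt - joint[(t, y)]) < 1e-12

print(p_theta)
```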

These two rules can be combined into Bayes’ theorem, which tells us how to update our belief in the original hypothesis (or parameters) in light of observed data. Writing the product rule in both orders and dividing by the probability of the data gives:

\begin{equation}\label{eqn:gpsim}
p(\mathbf{\theta}\mid \textbf{y}) = \frac{p(\textbf{y} \mid \mathbf{\theta}) \, p(\mathbf{\theta})}{p(\textbf{y})},
\end{equation}

where:

p(\mathbf{\theta}\mid \textbf{y}) = the posterior distribution of the hypothesis (or parameters), given the observed data
p(\textbf{y} \mid \mathbf{\theta}) = the data likelihood, given the hypothesis (or parameters)
p(\mathbf{\theta}) = the prior over all possible hypotheses (or parameters)
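p(\textbf{y}) = the marginal likelihood (or evidence) of the data, obtained by summing the numerator over all hypotheses; it is constant with respect to \mathbf{\theta}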
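A short worked example may help fix the roles of the four terms. The numbers are hypothetical: suppose the prior probability that a download is unusual is 0.01, and the observed data y (say, a weekend download) has probability 0.90 under the "unusual" hypothesis and 0.05 under the "normal" one.

```latex
\begin{align*}
p(\theta = \text{unusual}) &= 0.01, \qquad
p(y \mid \text{unusual}) = 0.90, \qquad
p(y \mid \text{normal}) = 0.05 \\
p(y) &= 0.90 \times 0.01 + 0.05 \times 0.99 = 0.0585 \\
p(\text{unusual} \mid y) &= \frac{0.90 \times 0.01}{0.0585} \approx 0.154
\end{align*}
```

A single observation raises the belief from 1% to roughly 15%. The denominator only rescales the numerator so that the posterior probabilities of the two hypotheses sum to one.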

This PML approach has proven to be preferable to deep learning in many applications that require transparency and oversight. Although deep learning has produced impressive performance on many benchmark tasks in specific applications, such as computer vision and conversational AI (e.g., the recent Google Duplex), it has several limitations in broader use cases such as cybersecurity and banking. Deep learning systems are generally:

    • Very data hungry (i.e., they often require millions of examples for training)
    • Very compute-intensive to train and deploy (i.e., they require cloud GPU & TPU resources)
    • Poor at representing uncertainty
    • Easily fooled by adversarial examples
    • Finicky to optimize: choice of architecture, learning procedure, etc., require expert knowledge and experimentation
    • Uninterpretable black-boxes, lacking in transparency and difficult to trust

In contrast, PML systems are transparent and explainable, and typically do not require large amounts of data or computing power.

Currently, it is easier than ever to get started building PML systems, thanks to a plethora of open-source software tools called probabilistic programming languages. These include Google’s TensorFlow Probability, Uber’s Pyro, Microsoft’s Infer.NET, PyMC3, Stan and many others.
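To give a feel for what these tools automate, here is a rough sketch in plain Python of the two pieces a probabilistic program needs: a model (prior plus likelihood) and an inference routine. It does not use the API of any of the libraries named above. It is a hand-written random-walk Metropolis sampler, with a hypothetical data set of 6 unusual downloads out of 40, and the function names are illustrative only.

```python
import math
import random


def log_posterior(theta, data):
    """Unnormalized log posterior: flat Beta(1, 1) prior plus Bernoulli likelihood."""
    if not 0.0 < theta < 1.0:
        return -math.inf          # outside the support of the prior
    k = sum(data)
    n = len(data)
    return k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def metropolis(data, n_samples=5000, step=0.1, seed=0):
    """Draw samples of theta with a random-walk Metropolis sampler."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, data)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


data = [1] * 6 + [0] * 34        # hypothetical: 6 unusual downloads out of 40
samples = metropolis(data)
kept = samples[1000:]            # discard early samples as burn-in
posterior_mean = sum(kept) / len(kept)
print(f"posterior mean of theta: {posterior_mean:.3f}")
```

For this simple model the exact posterior is Beta(7, 35), with mean 7/42, so the sampler’s estimate can be checked against a known answer. Probabilistic programming languages let you state only the model and supply more sophisticated inference algorithms for you.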

The following presentation contains a few of the topics that we discussed during the recent meetup. Materials from the meetup, including slides and source code, are provided below.

Daniel Emaasit is a Data Scientist at Haystax Technology. For a more detailed treatment of this subject, please see Daniel’s blog.

Source code

For interested readers, two options are provided below to access the source code used for the demo:

  1. The entire project (code, notebooks, data, and results) can be found here on GitHub.
  2. Click the Binder icon to open the notebooks in a web browser and explore the entire project without downloading and installing any software.

References

  1. Ghahramani, Z. (2015). Probabilistic machine learning and artificial intelligence. Nature, 521(7553), 452.
  2. Bishop, C. M. (2013). Model-based machine learning. Phil. Trans. R. Soc. A, 371(1984), 20120222.
  3. Murphy, K. P. (2012). Machine learning: a probabilistic perspective. MIT Press.
  4. Barber, D. (2012). Bayesian reasoning and machine learning. Cambridge University Press.
  5. Salvatier, J., Wiecki, T. V., & Fonnesbeck, C. (2016). Probabilistic programming in Python using PyMC3. PeerJ Computer Science, 2, e55.

