Anomaly Detection in Cyber Networks using Graph-node Role-dynamics and NetFlow Bayesian Normalcy Modeling

January 2018 Presentation
Anthony Palladino (Boston Fusion Corporation), Andrew Spisak (Boston Fusion Corporation), Christopher Thissen (Boston Fusion Corporation)

In this presentation, the authors describe a novel approach to cyber-anomaly detection that combines multi-modal data fusion, graph-based analytics, and Bayesian normalcy modeling.

Abstract

Advanced Persistent Threats (APTs), i.e., “low and slow” cyber-attacks, are difficult to detect using standard network defense tools. APTs typically hide within the noise of normal network operations, and may persist undetected for months or even years. As a result, the warning signs of an APT can easily be lost in the flood of alerts generated by intrusion detection systems (IDSs) and NetFlow data.

This paper describes ongoing research in APT detection. Our approach is two-fold. First, we fuse alerts generated by multiple IDSs (e.g., Snort, OSSEC, and Bro) into a single weighted graph that allows us to identify anomalies across modalities. To detect the anomalies, we apply the role-dynamics algorithm, which has successfully identified anomalies in social media, email, and IP communication graphs. In the cyber domain, each node in the fused IDS-alert graph is assigned a probability distribution across a small set of roles based on that node's features. A cyber-attack should trigger IDS alerts that change node features, but rather than tracking every feature of every node individually, we use roles as a succinct, integrated summary of those feature changes. We measure changes in each node's probabilistic role assignment over time and identify anomalies as deviations from expected roles.
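The two steps above (fusing alerts into one weighted graph, then comparing each node's role distribution against its expected one) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the abstract: the per-sensor weights in `SENSOR_WEIGHT`, the function names, the use of Jensen-Shannon divergence as the measure of role change, and the threshold value. The role distributions are supplied by hand; the role-dynamics algorithm that would derive them from node features is not reproduced.

```python
from collections import defaultdict
from math import log2

# Hypothetical per-sensor weights; the abstract does not specify any.
SENSOR_WEIGHT = {"snort": 1.0, "ossec": 0.8, "bro": 0.6}

def fuse_alerts(alerts):
    """Merge (sensor, src, dst) alerts into one weighted directed graph."""
    graph = defaultdict(float)
    for sensor, src, dst in alerts:
        graph[(src, dst)] += SENSOR_WEIGHT[sensor]
    return dict(graph)

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two role distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def role_anomalies(expected, observed, threshold=0.3):
    """Return nodes whose observed role mix deviates from the expected one."""
    return sorted(
        node for node, p in expected.items()
        if js_divergence(p, observed[node]) > threshold
    )

# Alerts from three sensors collapse onto shared edges.
alerts = [
    ("snort", "10.0.0.5", "10.0.0.9"),
    ("bro", "10.0.0.5", "10.0.0.9"),
    ("ossec", "10.0.0.7", "10.0.0.9"),
]
graph = fuse_alerts(alerts)

# Hand-written role distributions over three illustrative roles.
expected = {"10.0.0.5": [0.8, 0.1, 0.1], "10.0.0.7": [0.2, 0.7, 0.1]}
observed = {"10.0.0.5": [0.1, 0.1, 0.8], "10.0.0.7": [0.25, 0.65, 0.1]}
flagged = role_anomalies(expected, observed)
```

In this toy input, only the node whose role mass shifts sharply from the first role to the third exceeds the threshold; the node with a small drift does not.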

Second, we implement a Bayesian dynamic packet flow model to characterize NetFlow patterns within the network. The algorithm provides a probabilistic measure of traffic volatility from which Bayesian inference can be used to forecast expected normal behavior. The model triggers an indication of compromise when deviations from the expected behavior occur, such as during the exfiltration of data.
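As a rough illustration of Bayesian normalcy modeling on flow counts, the sketch below uses a Gamma-Poisson model with a discount factor. This is a simple stand-in chosen for brevity, not the authors' dynamic packet flow model, which the abstract does not specify. The class name, the prior, the discount factor, and the z-score threshold are all hypothetical.

```python
from math import sqrt

class FlowNormalcyModel:
    """Gamma-Poisson model of per-interval packet counts (illustrative only)."""

    def __init__(self, a, b, discount=0.9, z_threshold=4.0):
        self.a = a                      # Gamma shape: pseudo-count of packets
        self.b = b                      # Gamma rate: pseudo-count of intervals
        self.discount = discount        # < 1 lets older evidence fade
        self.z_threshold = z_threshold  # assumed alerting threshold

    def forecast(self):
        """One-step-ahead predictive mean and variance of the next count."""
        mean = self.a / self.b
        var = mean * (1.0 + 1.0 / self.b)  # negative-binomial variance
        return mean, var

    def observe(self, count):
        """Score a new count; learn from it only if it looks normal."""
        mean, var = self.forecast()
        anomalous = abs(count - mean) / sqrt(var) > self.z_threshold
        if not anomalous:
            self.a = self.discount * self.a + count
            self.b = self.discount * self.b + 1.0
        return anomalous

# Hypothetical prior: roughly 100 packets per interval, weakly held.
model = FlowNormalcyModel(a=100.0, b=1.0)
counts = [98, 105, 101, 97, 103, 900]  # last value mimics a bulk transfer
flags = [model.observe(c) for c in counts]
```

Skipping the update on flagged intervals is a design choice made here so that a sustained burst does not quickly become the new baseline; a production model would need a more careful policy.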