
Uncovering Priority Anomalies using Pattern Discovery as a Roadmap for Contextual Analysis

August 2020 Presentation
Thomas S. Henretty, PhD (Reservoir Labs)

This talk presents an approach to network anomaly detection and prioritization that combines tensor decompositions with deeper, query-based analysis.




In most real-world network environments, abnormal activity is a routine part of normal operation. Systems that flag statistically abnormal events flood IT specialists with meaningless alerts. Meanwhile, systems that key on signatures are limited to discovering what is already known to be anomalous. In this talk, we describe an approach to anomaly detection based on the insight that large-scale tensor decompositions can create an effective roadmap for targeted database and graph queries that confirm or reject behavior hypotheses.

Tensor decompositions, which are based on matrix operations extended to higher dimensions, have been shown to isolate coherent patterns of behavior from within complex network traffic logs. This pattern-based approach can immediately link together multiple discrete activities separated by time, entity, or location in multidimensional data and can embody interactions that cannot be expressed (or often even anticipated) by rule signatures. Tensor decompositions alone, however, are limited in that they cannot ascribe significance to discovered patterns.

Large database and graph structures are a natural choice for representing linked metadata at scale and offer rich query capabilities. As a first-line analytics tool, however, search-based approaches can suffer from the “boil-the-ocean” problem of having to examine the totality of massive data stores to find instances of specific, sometimes complex patterns among potentially billions of interconnected records.
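To make the idea of a tensor decomposition concrete, the following is a minimal sketch of a rank-R CP (CANDECOMP/PARAFAC) decomposition fitted by alternating least squares, written with NumPy. It is an illustration only and not the implementation used in the talk: the choice of CP, the tensor layout (source x destination x time bin), the toy data, and every function name here are assumptions. Each recovered component is a triple of vectors that ties together a group of sources, a group of destinations, and a temporal profile, which is the sense in which a component acts as a "pattern".

```python
import numpy as np

def khatri_rao(a, b):
    """Column-wise Kronecker product of a (I x R) and b (J x R), giving (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)

def unfold(x, mode):
    """Matricize a tensor along `mode`; remaining modes keep their original order."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)

def cp_als(x, rank, n_iter=200, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product, computed without forming it.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors

# Hypothetical toy log: 6 sources, 5 destinations, 12 time bins, two planted patterns.
a_true = np.zeros((6, 2)); a_true[[0, 1], 0] = 1; a_true[[3, 4, 5], 1] = 1
b_true = np.zeros((5, 2)); b_true[0, 0] = 1; b_true[[2, 3], 1] = 1
c_true = np.zeros((12, 2)); c_true[::3, 0] = 1; c_true[6:, 1] = 2
x = np.einsum("ir,jr,kr->ijk", a_true, b_true, c_true)

factors = cp_als(x, rank=2)
recon = np.einsum("ir,jr,kr->ijk", *factors)
rel_err = np.linalg.norm(x - recon) / np.linalg.norm(x)
print(f"relative reconstruction error: {rel_err:.2e}")
```

In this toy setting the first planted pattern is periodic contact from two sources to one destination, and the second is sustained traffic from three sources to two destinations. A production system would work on sparse, far larger tensors and would likely use a dedicated tensor library rather than dense NumPy arrays.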

We show how patterns discovered through tensor decomposition can be thought of as documents that can be subjected to a variety of analyses, in parallel, with successively increasing need for deep contextual information. Patterns first undergo topic-based analysis, using models trained on prior decompositions, to discover anomalies. Anomalous patterns are then further categorized using tests for the existence of a variety of descriptive behaviors including beaconing, mapping, and scanning, among others. Once patterns requiring deeper, targeted inspection have been identified, elements of these patterns drive query-based analysis scripted according to the category assigned to the pattern. These analyses are capable of combining records from the original network log data with contextual information including network topology, whitelist and blacklist information, and external information such as published alerts. This segmentation of analysis allows adaptation and customization to specific environments, resulting in scalable, network-aware prioritization of alerts while also reducing alert clutter.
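As one example of what a descriptive-behavior test might look like, the sketch below scores a sequence of connection timestamps for beaconing by measuring how regular the gaps between events are. The talk does not specify how its tests are implemented; the coefficient-of-variation heuristic, the thresholds, and the function names are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def gap_variation(timestamps):
    """Coefficient of variation of inter-arrival gaps, or None if undefined."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None
    average = mean(gaps)
    if average == 0:
        return None
    return pstdev(gaps) / average

def looks_like_beaconing(timestamps, max_cv=0.1, min_events=6):
    """Flag near-periodic contact: enough events with nearly constant spacing.

    max_cv and min_events are illustrative thresholds, not tuned values.
    """
    if len(timestamps) < min_events:
        return False
    cv = gap_variation(timestamps)
    return cv is not None and cv <= max_cv

# Hypothetical timestamps in seconds for two (source, destination) pairs.
regular = [i * 60 for i in range(20)]        # one contact per minute
irregular = [0, 5, 7, 100, 101, 400, 2000]   # bursty, human-like activity

print(looks_like_beaconing(regular), looks_like_beaconing(irregular))
```

A pattern that passes such a test would then be a candidate for the category-specific queries described above, for example checking the destination against blacklist information or published alerts.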