Data Fusion: Enhancing NetFlow Graph Analytics
January 2016 • Presentation
Emilie Purvine, Bryan Olsen (Pacific Northwest National Laboratory), Cliff Joslyn (Pacific Northwest National Laboratory)
In this FloCon 2016 presentation, the authors explain RDP logins and why they are important to analyze in the context of NetFlow.
Network defenders are often required to analyze multiple data sources, going beyond just NetFlow, to begin to understand the context and full extent of an attack. One prominent additional source is Windows event log information. The authors present their work on fusing Windows event logs with NetFlow to enhance analysis of Remote Desktop Protocol (RDP) sessions. One of the main objectives of this presentation is to help the community understand RDP logins and why they are important to analyze in the context of NetFlow. The authors also present further analysis of how enhancing NetFlow graphs with metadata from Windows events provides a better understanding of activity at the enterprise level and may highlight opportunities for behavioral analysis. Finally, they present their research on the use and deployment of spectral and algebraic topological methods to identify features and events.
For event logs, the authors use the Windows Logging Service (WLS), developed by the Department of Energy's Kansas City Plant, to enhance and standardize information coming from Windows logging. They have incorporated network interface information with Windows events to create a hybrid data set that enables more accurate NetFlow/event log fusion at the enterprise level. Their overall goal is then to compare a NetFlow graph with the login graph to enable a higher-level understanding of linked events and deviations within session behavior. Their initial work focused on understanding RDP sessions and how they appear in both NetFlow and Windows event log data. The authors found that the Windows event log data had unique and interesting features that required some novel approaches to fusing the events before attempting to correlate them with remote login NetFlows over RDP.
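To make the idea of NetFlow/event log fusion concrete, the following is a minimal sketch rather than the authors' method. It assumes a simplified, hypothetical record layout (the `Flow` and `LoginEvent` fields, the `fuse` helper, and the `slack` tolerance are all illustrative and do not reflect the WLS schema). It pairs each login event with flows to the default RDP port, TCP 3389, that share the same source and destination address and overlap the event in time.

```python
from dataclasses import dataclass

RDP_PORT = 3389  # default RDP port; a deployment may use a different one


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified NetFlow record."""
    src_ip: str
    dst_ip: str
    dst_port: int
    start: float  # seconds since some common epoch
    end: float


@dataclass(frozen=True)
class LoginEvent:
    """Hypothetical, simplified remote-login event from a Windows log."""
    user: str
    src_ip: str
    dst_ip: str
    time: float


def fuse(flows, events, slack=5.0):
    """Pair each login event with RDP flows between the same hosts.

    An event matches a flow when its timestamp falls inside the flow's
    lifetime, widened by `slack` seconds to absorb clock skew (assumed).
    """
    by_pair = {}
    for f in flows:
        if f.dst_port == RDP_PORT:
            by_pair.setdefault((f.src_ip, f.dst_ip), []).append(f)

    fused = []
    for e in events:
        for f in by_pair.get((e.src_ip, e.dst_ip), []):
            if f.start - slack <= e.time <= f.end + slack:
                fused.append((f, e))
    return fused


def login_graph_edges(fused):
    """Collapse fused pairs into labeled edges (src, dst, user)."""
    return sorted({(f.src_ip, f.dst_ip, e.user) for f, e in fused})


if __name__ == "__main__":
    flows = [
        Flow("10.0.0.5", "10.0.0.9", 3389, 100.0, 400.0),
        Flow("10.0.0.5", "10.0.0.9", 443, 100.0, 400.0),
    ]
    events = [LoginEvent("alice", "10.0.0.5", "10.0.0.9", 102.0)]
    print(login_graph_edges(fuse(flows, events)))
```

Matching on address pairs is only workable when the event log carries reliable network interface information, which is the gap the hybrid data set described above is meant to close.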
Their analysis of the fused data focuses on the evolution of both the graph spectrum and the algebraic topological structure (i.e., the presence of loops and voids in a higher-dimensional representation of the graphs). The graph spectrum can pick up changes in statistics such as graph density, whereas the algebraic topology looks at more subtle, higher-dimensional changes in the graphs.
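As a rough illustration of tracking these two kinds of features across time windows, the sketch below computes one spectral quantity (the largest adjacency eigenvalue, via power iteration) and the two lowest Betti numbers of a graph (connected components and independent loops). This is a deliberate simplification and not the authors' pipeline: it treats each graph as a one-dimensional complex, so it cannot detect the voids that a higher-dimensional representation would expose. The function names and the host labels in the example are hypothetical.

```python
from collections import defaultdict


def adjacency(edges):
    """Build an undirected adjacency map from (src, dst) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return adj


def spectral_radius(edges, iters=500):
    """Largest adjacency eigenvalue, estimated by power iteration.

    Iterating with A + I instead of A avoids the oscillation that plain
    power iteration shows on bipartite graphs.
    """
    adj = adjacency(edges)
    if not adj:
        return 0.0
    x = {n: 1.0 for n in adj}
    for _ in range(iters):
        y = {n: x[n] + sum(x[m] for m in adj[n]) for n in adj}
        norm = sum(v * v for v in y.values()) ** 0.5
        x = {n: v / norm for n, v in y.items()}
    # Rayleigh quotient x^T A x for the unit vector x
    return sum(x[n] * sum(x[m] for m in adj[n]) for n in adj)


def betti_numbers(edges):
    """Return (b0, b1): connected components and independent loops."""
    adj = adjacency(edges)
    seen, b0 = set(), 0
    for start in adj:
        if start in seen:
            continue
        b0 += 1
        seen.add(start)
        stack = [start]
        while stack:
            n = stack.pop()
            for m in adj[n]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return b0, n_edges - len(adj) + b0


if __name__ == "__main__":
    # Two hypothetical time windows; the second closes a loop.
    windows = {
        "t0": [("a", "b"), ("b", "c")],
        "t1": [("a", "b"), ("b", "c"), ("c", "a")],
    }
    for name, edges in windows.items():
        print(name, round(spectral_radius(edges), 4), betti_numbers(edges))
```

In this toy example the added edge raises the spectral radius and also creates one loop, so both features change between windows; on real data the two can diverge, which is the motivation for tracking them together.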