Scalable Flow Analysis (White Paper)
October 2006 • White Paper
Abhishek Kumar (University of Maryland), Sapan Bhatia (Princeton)
In this paper, the authors present a new approach for summarization and analysis of flow records.
Abstract
While current toolkits for analysis of flow records, such as SiLK, are powerful and versatile, real-time analysis of flow records at very large flow collection installations continues to be a challenge. In this paper we present a new approach for summarization and analysis of flow records. Through the use of approximate data structures, a large volume of flow records is reduced to a compact representation that is 100 times smaller than the original flow records. The techniques are suitable for implementing a small number of predefined queries that are evaluated repeatedly in a periodic manner. The operations involved in summarization and query processing are fast enough to keep up with 2.5 million flows per second in a software implementation running on general-purpose hardware.
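To make the idea of an approximate summary concrete, the following is a minimal sketch of one well-known approximate data structure, a count-min sketch, applied to a single predefined query (bytes sent per source address). The abstract does not name the data structures the authors use, so this is an illustration under assumptions and not the paper's method. The class name, the table dimensions, and the (source, destination, bytes) record layout are all hypothetical.

```python
import hashlib


class CountMinSketch:
    """Fixed-size approximate counter table (illustrative, not the paper's structure)."""

    def __init__(self, width=2048, depth=4):
        # Memory use is width * depth counters, independent of the number of flows.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # One independent hash per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        # Summarization step: fold one record into the table.
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key):
        # Query step: collisions can only inflate a counter, so the minimum
        # across rows is an upper bound on the true total for this key.
        return min(
            self.rows[row][self._index(key, row)] for row in range(self.depth)
        )


# Hypothetical flow records: (source address, destination address, byte count).
flows = [
    ("10.0.0.1", "192.0.2.7", 1500),
    ("10.0.0.2", "192.0.2.7", 400),
    ("10.0.0.1", "198.51.100.3", 600),
]

sketch = CountMinSketch()
for src, dst, nbytes in flows:
    sketch.add(src, nbytes)

print(sketch.estimate("10.0.0.1"))  # never less than the true total of 2100
```

The table stays the same size however many records are folded in, which is what allows a summary to be much smaller than the raw records. The cost is that estimates may overcount, and the query (here, the choice of key) has to be fixed before summarization, which matches the abstract's restriction to a small number of predefined queries.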