
Bayes at 10+ Gbps: Identifying Malicious and Vulnerable Processes from Passive Traffic Fingerprinting

August 2020 Presentation
David McGrew (Cisco Systems, Inc.)

This presentation describes an inferencing system and its implementation, the results of applying it to real-world traffic, and open issues in this technology area.

As network monitoring techniques have evolved in response to the rise of encrypted traffic, protocol fingerprinting has become an essential component of network defense. While exact-match fingerprinting of TLS clients is now widespread, it is too imprecise to use for process identification. To more reliably determine the process associated with a session, we applied inferencing based on naïve Bayes to fingerprints and destination information, using equivalence classes of destinations derived from auxiliary data. Our implementation of the packet capture and inferencing uses Linux TPACKETv3 and can identify processes on 10+ Gbps enterprise internet connections. This system detects many interesting categories of processes, including malware, evasive applications, scanners, and obsolete and vulnerable software. As it is based on an interpretable machine learning model, its findings are readily understandable and it can adapt to different prior probabilities.
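The abstract does not give the details of the model, so the following is only a minimal sketch of naïve Bayes inference over a fingerprint and a destination equivalence class. It is not the presenter's implementation. The class name, the feature names (`fingerprint`, `dest_class`), the training data, and the use of Laplace smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesProcessClassifier:
    """Toy naive Bayes over categorical session features (hypothetical names)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                           # Laplace smoothing (an assumption)
        self.process_counts = Counter()              # sessions observed per process
        self.feature_counts = defaultdict(Counter)   # (process, feature) -> value counts
        self.feature_values = defaultdict(set)       # feature -> distinct values seen

    def observe(self, process, features):
        """Record one labeled session."""
        self.process_counts[process] += 1
        for name, value in features.items():
            self.feature_counts[(process, name)][value] += 1
            self.feature_values[name].add(value)

    def posterior(self, features, priors=None):
        """Return P(process | features); `priors` optionally overrides learned priors."""
        total = sum(self.process_counts.values())
        log_scores = {}
        for process, count in self.process_counts.items():
            prior = priors.get(process, 0.0) if priors is not None else count / total
            if prior <= 0.0:
                continue
            log_p = math.log(prior)
            for name, value in features.items():
                seen = self.feature_counts[(process, name)][value]
                vocab = len(self.feature_values[name]) + 1  # +1 slot for unseen values
                log_p += math.log((seen + self.alpha) / (count + self.alpha * vocab))
            log_scores[process] = log_p
        # Normalize in log space to avoid underflow.
        peak = max(log_scores.values())
        weights = {p: math.exp(s - peak) for p, s in log_scores.items()}
        norm = sum(weights.values())
        return {p: w / norm for p, w in weights.items()}


# Invented training data: one fingerprint shared by a browser and a malware sample.
clf = NaiveBayesProcessClassifier()
training = [
    ("browser", {"fingerprint": "fp_A", "dest_class": "cdn"}),
    ("browser", {"fingerprint": "fp_A", "dest_class": "cdn"}),
    ("browser", {"fingerprint": "fp_A", "dest_class": "search"}),
    ("malware", {"fingerprint": "fp_A", "dest_class": "rare_host"}),
    ("updater", {"fingerprint": "fp_B", "dest_class": "cdn"}),
]
for process, features in training:
    clf.observe(process, features)

session = {"fingerprint": "fp_A", "dest_class": "rare_host"}
fingerprint_only = clf.posterior({"fingerprint": "fp_A"})
with_destination = clf.posterior(session)
# Hypothetical priors for a network where malware is believed to be more common.
shifted = clf.posterior(session, priors={"browser": 0.2, "malware": 0.7, "updater": 0.1})

for label, result in [("fingerprint only", fingerprint_only),
                      ("with destination", with_destination),
                      ("shifted priors", shifted)]:
    print(label, {p: round(v, 3) for p, v in sorted(result.items())})
```

In this toy data the fingerprint alone cannot separate the browser from the malware sample, because both use `fp_A`. Adding the destination class raises the malware posterior, and supplying different priors changes the ranking without retraining. Each factor in the product can be inspected on its own, which is one sense in which such a model is interpretable.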