New DNS Traffic Analysis Techniques to Identify Global Internet Threats

January 11, 2016 • Presentation

By

Dhia Mahjoub (OpenDNS) and Thomas Mathew (OpenDNS)

In this presentation, the authors describe how they extracted domains associated with exploit kit, DGA, and spam-run campaigns from their worldwide live DNS traffic.

Publisher

Software Engineering Institute

Subjects

FloCon, Situational Awareness

Abstract

Leveraging DNS data to detect new Internet threats has gained popularity over the past few years. However, most industry and academic work examines DNS solely from the authoritative layer through the use of passive DNS. This presentation covers three novel methods that can be used to detect network threats at Internet scale: analyzing DNS traffic below and above the recursive layer, monitoring malware-hosting IP infrastructures, and applying graph analytics to DNS lookup patterns. Several research papers present methods to identify emerging DGA threats by using lexical features and passive DNS. Despite their proven merits, these approaches do not provide a complete overview of DNS traffic. Recursive DNS data is the missing piece, as it gives information about client lookup patterns. By treating the DNS request counts for each domain as a time series vector, we construct a set of algorithms that identify anomalous query patterns recorded by our worldwide recursive resolvers. Certain spikes in DNS traffic can signal the emergence of DGAs or exploit kit campaigns. We describe in this presentation how, through a combination of clustering and supervised learning methods, we consistently and efficiently extracted domains associated with exploit kit, DGA, and spam-run campaigns from our worldwide live DNS traffic during an eight-month study. In addition, authoritative DNS and IP space data are used to show the evolving patterns of TTPs that adversaries adopt when setting up their DNS and IP infrastructures for resilience and scale. Such patterns include improper ASN peering relationships, offshore registration of hosting businesses, diversification of IP space across various RIRs, rogue ASNs and affiliated hosters, and use of compromised registrant accounts. This combined process allows us to examine and mitigate new threats (Angler, Nuclear, Neutrino, etc.) as they begin to emerge on the network.
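
The abstract does not specify which algorithms flag anomalous query patterns, so the following is only a minimal sketch of the general idea of treating per-domain request counts as a time series and looking for spikes. It substitutes a generic robust z-score (median and median absolute deviation over a trailing window) for the authors' clustering and supervised learning pipeline. The function name spike_hours, the 24-sample window, and the threshold of 5 are all hypothetical choices made for illustration.

```python
from statistics import median


def spike_hours(counts, window=24, threshold=5.0):
    """Return indices whose count sits far above the trailing baseline.

    Hypothetical helper, not the presenters' method: `counts` is a list of
    per-interval DNS request counts for one domain. An index is flagged when
    its count exceeds the median of the previous `window` samples by more
    than `threshold` scaled median absolute deviations (MAD).
    """
    spikes = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        med = median(baseline)
        mad = median(abs(c - med) for c in baseline)
        # 1.4826 makes the MAD comparable to a standard deviation for
        # normally distributed data; fall back to 1.0 on a flat baseline.
        scale = 1.4826 * mad or 1.0
        if (counts[i] - med) / scale > threshold:
            spikes.append(i)
    return spikes


if __name__ == "__main__":
    # A quiet domain that suddenly receives a burst of lookups.
    hourly = [10] * 24 + [10, 500, 10]
    print(spike_hours(hourly))
```

A median-based baseline is used here because a single burst barely moves the median, so one spike does not mask the next. A production system would likely need per-domain seasonality handling, which this sketch ignores.
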
An important model based on recursive traffic is the co-occurrence model, which expresses the temporal locality of domain queries. From a set of domains and timestamps, we build a directed graph to represent co-occurrence relationships between pairs of domains. Domains that are requested shortly after one another are thereby linked. Our third method applies graph analytics to study this large, scale-free co-occurrence graph. We identify the strongly connected components and use these individual components to help us isolate distinguishable subgraph structures that constitute signatures of specific attacks. Relying on a variety of graph analytic measures, such as density and centrality, we can create classification schemes for different families of subgraphs. These methods have helped us identify new DGA botnet patterns and closely connected malware chains.
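
The co-occurrence model above can be illustrated with a small, self-contained sketch. It assumes the input is a list of (timestamp, domain) pairs for a single client, that "shortly after" means within a fixed gap of a few seconds, and that density is the usual directed-graph ratio of edges to possible edges. None of these details are given in the abstract, and the function names and the max_gap parameter are hypothetical. Strongly connected components are found with Kosaraju's algorithm, a standard technique that may differ from what the authors used.

```python
from collections import defaultdict


def cooccurrence_graph(events, max_gap=5.0):
    """Link a -> b when b is queried no later than max_gap seconds after a.

    `events` is an iterable of (timestamp_seconds, domain) pairs.
    """
    ordered = sorted(events)
    graph = defaultdict(set)
    for i, (t_a, a) in enumerate(ordered):
        graph[a]  # make sure isolated domains still appear as nodes
        for t_b, b in ordered[i + 1:]:
            if t_b - t_a > max_gap:
                break
            if a != b:
                graph[a].add(b)
    return graph


def strongly_connected_components(graph):
    """Kosaraju's algorithm, written iteratively to avoid recursion limits."""
    order, seen = [], set()
    for root in list(graph):
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)  # all successors done: record finish
                stack.pop()
    reverse = defaultdict(set)
    for a, targets in graph.items():
        for b in targets:
            reverse[b].add(a)
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, todo = set(), [root]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse.get(node, ()):
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components


def density(component, graph):
    """Directed density: edges inside the component / n * (n - 1)."""
    n = len(component)
    if n < 2:
        return 0.0
    edges = sum(1 for a in component
                for b in graph.get(a, ()) if b in component)
    return edges / (n * (n - 1))


if __name__ == "__main__":
    events = [(0, "a"), (1, "b"), (2, "a"), (100, "c")]
    g = cooccurrence_graph(events)
    for comp in strongly_connected_components(g):
        print(sorted(comp), density(comp, g))
```

In this toy input, the two domains queried back and forth within seconds of each other form one dense component, while the domain queried much later stays isolated. Features such as component size and density could then feed a classifier, as the abstract suggests.
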
