Look Ma, No Malware!
August 2020 • Presentation
This presentation uses a specific instance of this problem, DNS-based DDoS attacks, as a case study to highlight how the application of unsupervised learning, and some particular methodologies, can help address this threat intelligence problem.
Over the last few years, we have studied a type of DNS DDoS attack that first appeared at scale in 2014. Known as Slow Drip or Random Qname attacks, they were particularly disruptive in 2014-2015, especially to the Internet's middle infrastructure. Little malware was ever recovered, and none that explained the breadth and magnitude of the attacks. These attacks continue today, but in the largest known study of them, we found that the threat landscape has changed significantly in the last few years. Through a combination of text and time series features, we are able to characterize the dominant malware and demonstrate that the number of global-scale attack systems is relatively small. These results are based on large-scale global pDNS analysis over eight months.
While the results are useful to organizations needing to understand global DNS-based DDoS threat actors, the methodologies are more universal. We consider the case where a reasonably large amount of unlabeled data exists over time; this might be the case for certain DGAs, for example, or for DNS tunneling. In our case, this data comes from a strong statistical classifier, but the approach could encompass weaker classifiers.
The observable metadata, in our case the DNS queries, are the source for understanding the underlying malware. We use traditional Exploratory Data Analysis (EDA) and feature engineering to gain intuition about how different malware may manifest in our data. The divergence of character distributions between different attacks proves enlightening, but it does not scale over time in the way a production system requires. Identifying archetypical distributions from an initial large sample allows us to overcome this hurdle and create a distance measure that can be combined with other features to cluster attacks and, by extension, the attack generators.
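As a rough illustration of the archetype idea, the sketch below computes a character distribution for the random labels seen in one attack and measures its divergence from a small set of fixed archetype distributions. The abstract does not name the divergence used, so Jensen-Shannon divergence is an assumption here, and the alphabet, the function names, the two archetypes, and the sample labels are all hypothetical. The other features mentioned above (for example the time series features) and the clustering step are not shown.

```python
import math
from collections import Counter

# Assumed alphabet: characters allowed in a hostname label.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"


def char_distribution(labels):
    """Relative frequency of each ALPHABET character across all labels."""
    counts = Counter(
        ch for label in labels for ch in label.lower() if ch in ALPHABET
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no characters from ALPHABET found in labels")
    return [counts[ch] / total for ch in ALPHABET]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    Symmetric and bounded in [0, 1], so it stays finite even when one
    distribution has zero mass where the other does not.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def archetype_distances(labels, archetypes):
    """Divergence from one attack's character distribution to each archetype."""
    dist = char_distribution(labels)
    return {name: js_divergence(dist, arch) for name, arch in archetypes.items()}


def uniform_over(chars):
    """Hypothetical archetype: uniform distribution over a character set."""
    return [1 / len(chars) if ch in chars else 0.0 for ch in ALPHABET]


# Two made-up archetypes; in practice they would be derived from a large
# initial sample of attacks rather than written by hand.
ARCHETYPES = {
    "hex_like": uniform_over("0123456789abcdef"),
    "uniform_alnum": uniform_over("abcdefghijklmnopqrstuvwxyz0123456789"),
}

# Made-up random labels, standing in for the leftmost labels of attack queries.
sample_labels = ["a3f9c1", "00beef", "d4e5f6"]
distances = archetype_distances(sample_labels, ARCHETYPES)
nearest = min(distances, key=distances.get)
```

Because every attack is compared to the same fixed archetypes, the cost of adding a new attack grows with the number of archetypes rather than with the number of attacks already seen, which is the scaling property the paragraph above relies on.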
The use of archetypes in unsupervised learning allows us to reliably compare data over time to fixed points, and in a way that scales. We need to be concerned about model drift, where the underlying threat changes, and in our study we addressed this by applying the unsupervised model to data collected six months later.
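The abstract does not say how drift was measured, so the following is only one possible check, under stated assumptions: each attack has already been reduced to the distance to its nearest archetype, and drift is flagged when attacks in a later period sit farther from every archetype than was typical in the baseline period. The function name, the quantile, and the sample values are hypothetical.

```python
def drift_share(baseline_distances, later_distances, quantile=0.95):
    """Share of later attacks whose nearest-archetype distance exceeds
    the chosen quantile of the baseline distances.

    A share well above (1 - quantile) hints that the fixed archetypes no
    longer describe the traffic; it does not say what changed.
    """
    if not baseline_distances or not later_distances:
        raise ValueError("both periods need at least one distance")
    ordered = sorted(baseline_distances)
    # Simple empirical quantile: index into the sorted baseline values.
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    threshold = ordered[idx]
    exceeding = sum(1 for d in later_distances if d > threshold)
    return exceeding / len(later_distances)


# Made-up nearest-archetype distances for two periods.
baseline = [0.1, 0.2, 0.3, 0.4]
later = [0.1, 0.35, 0.5, 0.2]
share = drift_share(baseline, later, quantile=0.5)
```

A check like this only signals that later data has moved away from the archetypes; deciding whether to derive new archetypes would still need a closer look at the attacks that exceed the threshold.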