Software Engineering Institute | Carnegie Mellon University

Digital Library

White Paper

Blacklist Ecosystem Analysis: 2016 Update

  • Abstract

    This update, which is the latest in a series of regular updates, builds upon the analysis of blacklists presented in our 2013 and 2014 reports. In those reports, we established that the contents of blacklists generally fail to overlap substantially with each other. This report further corroborates that overarching result. Our results suggest that available blacklists present an incomplete and fragmented picture of the malicious infrastructure on the Internet, and practitioners should be aware of that insight. This result also provides a starting point for further investigation to understand the dynamics of the blacklist ecosystem.

    We have included 123 lists in our latest analysis. This includes 88 IP-address-based lists and 35 domain-name-based lists. The number of indicators included on any individual list varies from under 1,000 to over 50 million. Our analysis covers the 18-month period from July 1, 2014 to December 31, 2015.

    In this report, we revisit three of the metrics considered in the 2014 report to characterize overlaps: reverse counts, list counts, and pairwise intersection counts. We have omitted the “following” metric so that the issue of following can receive a more complete treatment in a future report. We have added two new metrics: a reverse lookup metric that counts the domains seen being resolved in passive DNS, and a persistence metric that captures how long IP addresses remain on blacklists over long spans of time.
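
    As a rough illustration of a pairwise intersection count, the sketch below counts the indicators shared by each pair of lists. It assumes each list can be treated as a plain set of indicator strings; the function name, list names, and sample addresses are hypothetical and are not taken from the report, whose exact computation may differ.

```python
from itertools import combinations


def pairwise_intersection_counts(lists):
    """Map each unordered pair of list names to the number of shared indicators.

    `lists` is a dict of list name -> set of indicator strings (an assumed layout).
    """
    counts = {}
    # sorted() gives a stable, alphabetical ordering of the name pairs
    for a, b in combinations(sorted(lists), 2):
        counts[(a, b)] = len(lists[a] & lists[b])
    return counts


# Hypothetical toy data using documentation-reserved IP addresses
example_lists = {
    "list_a": {"192.0.2.1", "192.0.2.2", "198.51.100.7"},
    "list_b": {"192.0.2.2", "203.0.113.9"},
    "list_c": {"203.0.113.9", "198.51.100.7", "192.0.2.2"},
}

pairwise = pairwise_intersection_counts(example_lists)
for pair, shared in pairwise.items():
    print(pair, shared)
```

    Small counts relative to the sizes of the two lists would correspond to the low overlap described in the abstract.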

    Most indicators appear on a single list. Our analysis revealed that 86.6% of IP address indicators appear on exactly one of the lists included in the study. For domain name indicators, 93.7% appear on a single list. Additionally, in the case of domain-name-based lists, there are two distinct “clusters” of lists: 13 of the lists (out of 35) are populated in such a way that fewer than half of the domain names listed are active, while 18 of the 35 are populated such that 80% or more of their entries do resolve.
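
    The single-list percentages above can be read as a share computed from per-indicator list counts. The following is a minimal sketch under that assumption: it counts how many lists each indicator appears on and reports the fraction that appear on exactly one. The function names and sample domains are hypothetical, and the report's own methodology may differ in detail.

```python
from collections import Counter


def list_counts(lists):
    """Count, for every indicator, how many lists it appears on.

    `lists` is a dict of list name -> set of indicators (an assumed layout).
    """
    counts = Counter()
    for indicators in lists.values():
        # Each list is a set, so it adds at most 1 to any indicator's count
        counts.update(indicators)
    return counts


def single_list_share(lists):
    """Fraction of distinct indicators that appear on exactly one list."""
    counts = list_counts(lists)
    if not counts:
        return 0.0
    singles = sum(1 for n in counts.values() if n == 1)
    return singles / len(counts)


# Hypothetical toy data using reserved example domains
toy_lists = {
    "list_a": {"example.com", "example.net"},
    "list_b": {"example.net", "example.org"},
    "list_c": {"example.edu"},
}

share = single_list_share(toy_lists)
print(f"{share:.1%} of indicators appear on exactly one list")
```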


    See the prior report at
