Prioritizing Alerts from Multiple Static Analysis Tools, Using Classification Models
August 2018 • Conference Paper
Lori Flynn, William Snavely, David Svoboda, Nathan VanHoudnos, Richard Qin, Jennifer Burns, David Zubrow, Robert W. Stoddard, Guillermo Marce-Santurio
This paper was accepted by the SQUADE workshop at ICSE 2018. It describes the development of several classification models for the prioritization of alerts produced by static analysis tools and how those models were tested for accuracy.
Static analysis (SA) tools examine code for flaws without executing it and produce warnings ("alerts") about possible flaws. A human auditor then evaluates the validity of each purported flaw. The effort required to manually audit all alerts and repair all confirmed code flaws is often more than a project's budget and schedule allow. An alert triaging tool enables auditors to prioritize alerts strategically for examination, and such a tool could use classifier confidence to rank them. We developed and tested classification models that predict whether static analysis alerts are true or false positives, using a novel combination of multiple static analysis tools, features derived from the alerts, alert fusion, code base metrics, and archived audit determinations. We developed classifiers using one partition of the data, then evaluated their performance using standard measurements, including specificity, sensitivity, and accuracy. Test results and overall data analysis show that accurate classifiers were developed and, specifically, that using multiple SA tools increased classifier accuracy. However, labeled data for many types of flaws were underrepresented (or absent) in the archived data, resulting in poor predictive accuracy for those flaw types.
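The standard measurements named above (specificity, sensitivity, and accuracy) can be computed directly from a confusion matrix of audit determinations versus classifier predictions. The sketch below uses hypothetical labels, not the paper's archive data, and is only meant to illustrate the metric definitions:

```python
# Hedged sketch: compute the evaluation metrics the paper reports
# (specificity, sensitivity, accuracy) from audit labels vs. predictions.
# The alert data below are hypothetical examples, not the paper's data.

def confusion_counts(labels, predictions):
    """Count true/false positives and negatives.

    labels, predictions: sequences of booleans,
    where True means the alert is a real flaw (true positive alert).
    """
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    return tp, tn, fp, fn

def metrics(labels, predictions):
    tp, tn, fp, fn = confusion_counts(labels, predictions)
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate (recall)
        "specificity": tn / (tn + fp),   # true-negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical audit determinations vs. classifier output
audited = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
print(metrics(audited, predicted))
```

In practice the labels would come from archived audit determinations and the predictions from a classifier trained on a separate data partition, as the paper describes.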