CERT PODCAST SERIES: SECURITY FOR BUSINESS LEADERS: SHOW NOTES

Characterizing and Prioritizing Malicious Code

Key Message: Identify which malware to analyze first, based on its ability to do harm.

Executive Summary

"Every day, major anti-virus companies and research organizations are inundated with new malware samples. Although estimates vary, approximately 150,000 new malware strains are released each day. Not enough manpower exists to manually address the volume of new malware samples that arrive daily in analysts' queues. Malware analysts need an approach that allows them to sort samples in a fundamental way so they can assign priority to the most malicious binary files." [1]

In this podcast, Jose Morales, a malicious software researcher with the CERT Division, discusses an approach for prioritizing malware samples, helping analysts to identify the most destructive malware to examine first, based on the binary file's execution behavior and its potential impact. This podcast is based on two SEI blogs that are listed in Resources.

PART 1: IDENTIFY CHARACTERISTICS OF DESTRUCTIVE BEHAVIOR

Motivation

In early 2012, a backdoor Trojan malware named Flame was discovered in the wild. Flame had been in one company's anti-malware repository for two years before it became public knowledge. Jose and his research team wanted to know why it didn’t get analyzed sooner, given the damage it caused.

Malware analysts receive a huge number of new malware samples daily. Knowing which malware to analyze first is a daunting task.

To address this problem, the research team decided to pursue an approach for automating the prioritization of malware based on defined criteria.

Destructive Characteristics

Some malicious behaviors that malware executes behind the scenes include:

self-replication
code injection
process execution
killing anti-malware-related execution processes
modifying operating systems
reaching out to various remote hosts

Binary vs. Malware Behavior

Binaries may execute some of the following actions:

create files
open sockets
do a DNS lookup

These actions are very low-level and, on their own, not of great interest, yet they are often used as the basis for analyzing malware. What the binary does is not the same as what the malware does.

Malware may execute binaries to accomplish the actions listed above plus some additional ones such as:

remove themselves from the process list
log keystrokes
upload information
set themselves up to run during system reboot

A key to prioritizing malware is to enumerate such behaviors and determine how to detect them on a given operating system by running samples through an analysis system.
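As an illustration of the difference between low-level binary actions and malware behaviors, the sketch below maps a list of low-level events to a few of the abstract behaviors named above. The event format, the field names (action, target, same_hash_as_sample), and the process names in AV_PROCESSES are hypothetical and do not come from any particular analysis system. The only Windows-specific detail relied on is that entries under the registry's CurrentVersion\Run key start programs automatically.

```python
# Hypothetical placeholder names for anti-malware processes.
AV_PROCESSES = {"avservice.exe", "avscanner.exe"}


def abstract_behaviors(events):
    """Map low-level events (assumed dict format) to abstract behaviors."""
    found = set()
    for ev in events:
        action = ev["action"]
        target = ev.get("target", "").lower()
        if action == "registry_set" and "\\currentversion\\run" in target:
            # Writing to a Run key sets the sample up to start automatically.
            found.add("runs_at_reboot")
        elif action == "file_create" and ev.get("same_hash_as_sample"):
            # The sample wrote a copy of itself to disk.
            found.add("self_replication")
        elif action == "process_kill" and target in AV_PROCESSES:
            found.add("kills_anti_malware")
        elif action == "remote_thread_create":
            # Starting a thread in another process is a common injection sign.
            found.add("code_injection")
    return found


example_events = [
    {"action": "file_create", "target": "C:\\Users\\x\\copy.exe",
     "same_hash_as_sample": True},
    {"action": "registry_set",
     "target": "HKLM\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\updater"},
    {"action": "dns_lookup", "target": "example.com"},
]
example_behaviors = abstract_behaviors(example_events)
```

In this example the DNS lookup contributes nothing on its own, while the file creation and registry write are lifted to self-replication and run-at-reboot behaviors.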

Digital Signatures

Almost all malware behaviors can be observed only by running the binary on the target operating system and watching what it does. The one characteristic that can be checked without executing the binary is its digital signature.

The absence of a digital signature or the presence of an unverified signature indicates a lack of attribution and provenance. Such a binary may be suspicious and may call for further analysis. This is a key prioritization characteristic.

Some malware authors create bogus companies for the sole purpose of being able to attribute digital signatures to those companies.

PART 2: STEPS TO DETECT AND PRIORITIZE

First Step – Collect Malicious and Benign Code Samples

The first step was to create a large set of known malware and known benign code samples for the Windows XP and Windows 7 operating systems for algorithm training purposes. Approximately 11,000 samples were used.

The sample malware set included viruses, worms, password stealers, key loggers, botnets, backdoors, droppers, and downloaders. The set also included Advanced Persistent Threat (APT) samples from Mandiant.

The benign code set was taken from a number of desktops and laptops. It included standard Windows software as well as third-party applications. The users of these machines included designers, developers, home users, and business users.

The malware set and the benign set were approximately equal in number of samples.

Second Step – Establish Training Set

All samples were run through the CERT Malicious Code Automated Run-Time Analysis (MCARTA) system, which describes what each binary does. From the MCARTA report, the research team figured out how to identify the destructive characteristics described above and developed a script to detect them.

The team also worked on minimizing both false positives and false negatives.
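One simple way to track false positives and false negatives while tuning a detection script is to compare its verdicts against the known labels of the sample set. The sketch below is an assumption about how that check might look, not the team's procedure. The function name and the boolean encoding (True means malware) are illustrative.

```python
def error_rates(predicted, actual):
    """Return false positive and false negative rates.

    predicted, actual: equal-length lists of booleans, True = malware.
    """
    pairs = list(zip(predicted, actual))
    false_pos = sum(1 for p, a in pairs if p and not a)  # benign flagged
    false_neg = sum(1 for p, a in pairs if a and not p)  # malware missed
    benign = sum(1 for a in actual if not a)
    malicious = sum(1 for a in actual if a)
    return {
        "false_positive_rate": false_pos / benign if benign else 0.0,
        "false_negative_rate": false_neg / malicious if malicious else 0.0,
    }


known_labels = [True, True, True, False, False, False, False]
script_verdicts = [True, True, False, True, False, False, False]
rates = error_rates(script_verdicts, known_labels)
```

Here one of four benign samples is wrongly flagged and one of three malware samples is missed.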

This initial sample was used as the training set for the machine learning algorithms that would be used to identify the desired behaviors for future (and larger) malicious and benign samples.

Third Step – Machine Learning Classification

The team used the Random Forest and AdaBoost algorithms to identify which samples were malware and which samples were benign using the training input from the second step. These algorithms use a scale of zero to one to connote confidence. A "one" means that the algorithm is 100 percent confident that the sample is malicious.
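The notes name the algorithms but give no implementation details. The following is a minimal sketch of AdaBoost only (Random Forest is not shown), using one-feature decision stumps over 0/1 behavior flags in plain Python. The feature columns, the toy data, and the rescaling of the weighted vote to a zero-to-one score are assumptions made for illustration, not the team's method.

```python
import math


def stump_predict(x, feature, polarity):
    """Predict +polarity if the behavior flag is set, else -polarity."""
    return polarity if x[feature] else -polarity


def train_adaboost(X, y, rounds=3):
    """X: list of 0/1 feature tuples; y: +1 (malware) or -1 (benign)."""
    n = len(X)
    weights = [1.0 / n] * n
    model = []  # list of (alpha, feature, polarity)
    for _ in range(rounds):
        best = None
        for feature in range(len(X[0])):
            for polarity in (1, -1):
                # Weighted error of this stump on the training set.
                err = sum(w for w, x, label in zip(weights, X, y)
                          if stump_predict(x, feature, polarity) != label)
                if best is None or err < best[0]:
                    best = (err, feature, polarity)
        err, feature, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, feature, polarity))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * label * stump_predict(x, feature, polarity))
                   for w, x, label in zip(weights, X, y)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def malice_score(model, x):
    """Rescale the weighted vote to [0, 1]; 1 means confidently malicious."""
    margin = sum(a * stump_predict(x, f, p) for a, f, p in model)
    total = sum(a for a, _, _ in model)
    return 0.5 * (1 + margin / total)


# Columns (illustrative): self_replication, code_injection, kills_av, unsigned
X_train = [(1, 1, 0, 1), (0, 1, 1, 1), (1, 0, 1, 0),
           (0, 0, 0, 1), (0, 0, 0, 0), (0, 1, 0, 0)]
y_train = [1, 1, 1, -1, -1, -1]
model = train_adaboost(X_train, y_train, rounds=3)
train_scores = [malice_score(model, x) for x in X_train]
```

On this toy set, the three malware rows score above 0.5 and the three benign rows score below it. A real training set would be far larger and would use the full list of destructive characteristics.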

Fourth Step – Determining which Malware to Analyze

The team confirmed that the samples that scored at "one" or closest to one were malware, matching the characterization as defined by the training set.

Less obviously, the team also examined samples that scored at the bottom – anything below 10 percent confidence all the way down to zero confidence. They knew these samples were also malicious and they wondered why the algorithms didn’t identify them. Some of the reasons may have been:

they didn’t execute correctly
they determined that they were in an analysis environment and so acted benign
they were able to run in some type of stealth mode, evading all of the known destructive behaviors
they needed a specific environment to perform their malicious behavior

Stealth Malware

The team determined that the malware that scored low or no confidence possessed one or more of the following abilities:

to run stealthily
to target only certain systems
to avoid analysis under certain conditions

Thus these samples were likely more sophisticated and just as dangerous, if not more so, than the ones that scored with high confidence.
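The selection of both ends of the score range can be sketched as follows. The low cutoff of 0.1 follows the "below 10 percent confidence" band described above. The high cutoff of 0.9 and the function and sample names are assumptions for illustration.

```python
def triage(scores, high=0.9, low=0.1):
    """Split samples into high-confidence and low-confidence queues.

    scores: dict mapping sample name -> confidence in [0, 1] that the
    sample is malicious. Returns (top, bottom), where top is ordered from
    highest score down and bottom from lowest score up.
    """
    top = sorted((name for name, s in scores.items() if s >= high),
                 key=scores.get, reverse=True)
    bottom = sorted((name for name, s in scores.items() if s < low),
                    key=scores.get)
    return top, bottom


example_scores = {"a.exe": 1.0, "b.exe": 0.95, "c.exe": 0.5,
                  "d.exe": 0.05, "e.exe": 0.0}
top_queue, bottom_queue = triage(example_scores)
```

The bottom queue is kept separate rather than discarded because, as noted above, known-malicious samples that score near zero may be the stealthiest ones.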

PART 3: 96-98 PERCENT DETECTION ACCURACY

Using This Approach

A malware analyst can use this approach by performing the following steps:

Use the list of destructive characteristics provided by this research.
Identify when they occur in your analysis system to build a training set of known malware samples.
Train your machine learning algorithms, such as Random Forest and AdaBoost.
Use the algorithms to classify an unknown set of malware samples.
Analyze those that score with high and low confidence.

This research approach detects known malware with 98 percent accuracy and detects APTs with 96 percent accuracy. These results reinforce the finding that the set of destructive characteristics is useful in detecting high-priority malware.

It is important to remember that the objective is NOT to describe what the binary does but to identify the more abstract behaviors that the malware demonstrates – and then identify how these are implemented in your target operating systems.

Malware Infection Trees

Malware binaries often cooperate with one another, particularly those that replicate. One binary process may start other processes and create other files. It is important to capture these linkages using malware infection trees so that all binaries are analyzed and eliminated as a set. If not, some elements stay behind and may continue to do damage.
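A minimal sketch of the idea, assuming the analysis system can report (parent, child) pairs whenever one binary starts a process or creates a file. The pair format, function names, and file names are hypothetical. This is not the construction from the infection tree paper listed in Resources.

```python
from collections import defaultdict, deque


def build_infection_tree(links):
    """links: iterable of (parent, child) pairs, e.g. 'X created or ran Y'."""
    children = defaultdict(list)
    for parent, child in links:
        children[parent].append(child)
    return children


def infection_set(children, root):
    """Return every binary reachable from root, including root itself."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:  # guard against cycles and repeats
                seen.add(child)
                queue.append(child)
    return seen


example_links = [
    ("dropper.exe", "a.exe"),
    ("dropper.exe", "b.dll"),
    ("a.exe", "c.exe"),
    ("other.exe", "d.exe"),  # unrelated activity, not part of this tree
]
tree = build_infection_tree(example_links)
to_remove = infection_set(tree, "dropper.exe")
```

Starting from the original dropper, the traversal collects every linked binary so the whole set can be analyzed and removed together.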

Resources

[1] Morales, Jose. "Prioritizing Malware Analysis." SEI blog post, November 2013.

Morales, Jose. "A New Approach to Prioritizing Malware Analysis." SEI blog post, April 2014.

Morales, Jose, et al. "Building Malware Infection Trees." 6th International Conference on Malicious and Unwanted Software (MALWARE), 2011.