IME

Logic and Data Flow Extraction for Live and Informed Malware Execution





Honeynets provide an interesting and important perspective on intrusion activity on the Internet. Moderate- and high-interaction honeynets deployed over distributed dark space can quickly gather vast amounts of malware. However, a vital step in reaping the benefits of honeynet deployments remains largely unsolved: the effective and timely analysis of captured binary executables. These executables are typically malicious in nature (self-propagating worms, bots, etc.) and carefully packed to evade detection systems.

The current state of the art in malware identification relies heavily on fine-grained, instance-specific malware signatures that are developed through manual examination and repeated time-bound execution of binaries in a statically defined environment. Such techniques tend to be subjective, labor intensive, and inaccurate, because the analyst has no prior knowledge of the environmental dependencies, communication requirements, or system services required to observe the malware's full range of functional capabilities. These limitations profoundly affect the quality and efficiency with which malware behavioral profiles and signatures can be generated.

In this research project, we hope to stimulate the development of the next generation of malware defense and forensic analysis tools. The project will develop novel techniques for fast and efficient analysis of malware binaries that will enable the informed and controlled execution of malware. The innovative ideas include 1) the introduction of new methodologies for malware binary unpacking; 2) improved static analysis with structural and behavioral abstraction techniques for facilitating automated domain-specific code block annotations; 3) the use of such abstractions to compute logical equivalence of binaries; and 4) techniques to instrument the runtime execution environment and the malware instance based on the code block annotations formulated by the static analysis.
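To make idea 3 more concrete, the following is a minimal sketch of how abstracted code blocks could be compared across two binaries. It is an illustration only and not the project's method: the instruction representation (lists of mnemonic/operand pairs), the function names, and the choice of Jaccard similarity are all assumptions made here, and set overlap of abstracted blocks is a far weaker notion than true logical equivalence.

```python
# Hypothetical toy model: a binary is a list of code blocks, and each code
# block is a list of (mnemonic, operands) pairs as a disassembler might emit.

def abstract_block(instructions):
    """Abstract a code block to its mnemonic sequence, discarding operands.

    Dropping operands makes the abstraction insensitive to register
    renaming and address changes, at the cost of losing precision.
    """
    return tuple(mnemonic for mnemonic, _operands in instructions)


def block_signature(blocks):
    """Return the set of abstracted blocks found in a binary."""
    return frozenset(abstract_block(block) for block in blocks)


def similarity(blocks_a, blocks_b):
    """Jaccard similarity between the abstracted block sets of two binaries."""
    sig_a = block_signature(blocks_a)
    sig_b = block_signature(blocks_b)
    if not sig_a and not sig_b:
        return 1.0  # two empty binaries are treated as identical
    return len(sig_a & sig_b) / len(sig_a | sig_b)


# Two made-up variants that differ only in the register they clear.
variant_a = [
    [("push", "ebp"), ("mov", "ebp, esp")],
    [("xor", "eax, eax"), ("ret", "")],
]
variant_b = [
    [("push", "ebp"), ("mov", "ebp, esp")],
    [("xor", "ebx, ebx"), ("ret", "")],
]
# A made-up sample that shares only one abstracted block with variant_a.
other = [
    [("call", "0x401000")],
    [("xor", "eax, eax"), ("ret", "")],
]

same_family = similarity(variant_a, variant_b)
different = similarity(variant_a, other)
```

In this toy model the two variants score as identical because their differences are confined to operands, while the third sample shares only one of three distinct abstracted blocks. A realistic abstraction would need to capture data flow and behavior rather than mnemonics alone.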

Our research will enable a deep understanding of malware logic and will allow us to fully control its execution in order to reveal its full potency. This will lead to generalized (non-instance-specific) signatures that are much more effective in detecting malware than current signatures. It will also aid the design of rapid and effective responses against the expanding threats to the computing infrastructure on which our society depends.