New perspectives in classification, complexity analysis and unpacking of run-time packers

  1. Ugarte Pedrero, Xabier
Dirigida por:
  1. Igor Santos Grueiro Director/a
  2. Pablo García Bringas Director

Universidad de defensa: Universidad de Deusto

Fecha de defensa: 24 de febrero de 2015

Tribunal:
  1. Javier Alonso López Presidente/a
  2. Pablo Garaizar Secretario
  3. Davide Balzarotti Vocal

Tipo: Tesis

Resumen

Run-time packers are widely used by malware writers in order to hinder reverse engineering and automated analysis. These tools consist in the encryption of the original program, which is restored at runtime and afterwards executed. When these tools became popular for malware protection, the research community focused the efforts on the detection and generic unpacking of this type of obfuscations. Researchers quickly shifted their attention to other problems in this domain. The current malware landscape evidences that the problem is not yet completely solved. Malware authors keep protecting their samples with run-time packers and available on-line malware analysis services do not provide any information about the packer used for protection beyond the result provided by signature-based detection methods. Malware writers typically scramble well-known versions of packers or implement their own custom unpackers. The fact that malware writers still make an effort to implement these techniques highlights that they are still effective to protect binaries. In this dissertation we combine different static techniques for packed binary classification. First, we propose the use of structural features extracted from Portable Executable files in order to discriminate packed from non-packed binaries, and study the performance differences among several feature-sets for this classification task. In this context, we also propose the application of anomaly detection under the intuition that common compilers produce binaries following standards and conventions, making the set of non-packed binaries easier to model than packed binaries. Second, although the research community has published successful solutions to generically unpack malware, this technique is still employed to protect samples. Following this idea, we raise a series of questions: What is the actual complexity of current run-time packers? Do these packers violate the assumptions made by previous approaches? What has been the evolution of the packer landscape over the years? In order to answer these questions we focus on studying the structural complexity of run-time packers. To this aim we design and develop a complete framework based on a dynamic analysis platform to record and analyse many different system events related to run-time packer behaviour. Moreover, we propose a taxonomy that combines several dimensions of complexity into one single score. This framework allowed us to perform the first longitudinal study on run-time packer complexity over custom-packed binaries, collected by the Anubis on-line sandbox since 2007. Finally, we focus on the technical challenges and limitations involved to apply multi-path exploration to the unpacking domain. It is well-known that multi-path exploration presents severe limitations for the analysis of highly obfuscated software and does not scale to large programs. In order to unpack binaries that partially reveal their code on demand (the most complex type of packer represented in our taxonomy), the first solution that comes to mind is multi-path exploration. Our research describes some of these limitations and proposes a set of domain-specific optimisations and a heuristic that combined together improve the feasibility of multi-path exploration for unpacking.