Applying normalised compression distance for architecture classification

An NCC Group whitepaper: Applying normalised compression distance for architecture classification

In malware research and black-box penetration testing, it is not always clear what kind of data you are working with. To disassemble a binary properly, you first need to know the architecture it was compiled for.

Normalised compression distance (NCD) has been shown to be useful for classifying unknown data in forensics and malware analysis. It was introduced by R. Cilibrasi and P.M.B. Vitanyi in 2005.
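
For reference, normalised compression distance between two byte strings x and y is defined as NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed length of s and xy is the concatenation of x and y. The following is a minimal sketch of that formula, not the whitepaper's implementation. The choice of zlib as the compressor and the function names `compressed_size` and `ncd` are assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of data after zlib compression (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two non-empty byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # C(xy): compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Values near 0 suggest the two inputs share a lot of structure, and values near 1 suggest they share little. Because real compressors are imperfect, results can fall slightly outside the range 0 to 1.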

This whitepaper presents a technique for classifying binaries and shellcode by statistical analysis using normalised compression distance.

It demonstrates that normalised compression distance can be used to identify the architecture that a binary was compiled for.
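
One way such a classifier could work is sketched below: compare the unknown sample against reference data of known architecture and pick the label with the smallest distance. This is an assumption about the general approach, not the procedure used in the whitepaper. The nearest-reference rule, the zlib compressor, and the names `ncd` and `classify` are all illustrative.

```python
import zlib
from typing import Mapping


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance, using zlib as an assumed compressor."""
    cx = len(zlib.compress(x, 9))
    cy = len(zlib.compress(y, 9))
    cxy = len(zlib.compress(x + y, 9))
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(sample: bytes, references: Mapping[str, bytes]) -> str:
    """Return the label of the reference closest to sample by NCD.

    references maps a hypothetical architecture label to bytes taken
    from code known to be compiled for that architecture.
    """
    return min(references, key=lambda label: ncd(sample, references[label]))
```

In practice the reference set would presumably be built from the code sections of binaries whose architecture is already known. Note that zlib only looks back 32 KiB, so with this compressor large inputs would need to be truncated or sampled for the distance to stay meaningful.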

