(The version of Ghidra used in this article is 10.1.2. For the Go string recovery tool release, skip ahead to Ghostrings Release.)
Introduction
A well-known issue with reverse engineering Go programs is that the lack of null terminators in Go strings makes recovering string definitions from compiled binaries difficult. Within a compiled Go program, many of the constant string values are stored together in one giant blob, without any terminator characters built into the string data to mark where one string ends and another begins. Even a simple program that just prints “Hello world!” has over 1,500 strings in it related to the Go runtime system and other standard libraries. This can cause typical ASCII string discovery implementations, such as the one provided by Ghidra, to create false positive string definitions that are tens of thousands of characters long.
Instead of null terminated strings, Go uses a string structure that consists of a pointer and length value. Many of these string structures are created on the program’s stack at runtime, so recovering individual string start locations and length values requires analyzing the compiled machine code. There are a few existing scripts that perform this analysis by checking for certain patterns of x86-64 instructions,1 but they miss structures created with unhandled variations of instructions that ultimately have the same effect on the stack, and they’re also restricted to a specific ISA.
I thought this problem presented a good opportunity to leverage the Ghidra decompiler for program analysis. Ideally, the decompiler’s lifted P-Code output would simplify analysis by translating different variations of instructions that write a value to the stack into a single copy operation. With a convenient way to scan for operations that write potential string address and length values to adjacent locations on the stack, the remaining analysis step would be to check if a chunk of memory defined by those address and length values contains only printable ASCII characters. As a bonus, the analysis wouldn’t be restricted to a single ISA, thanks to the many SLEIGH specifications that translate machine code into the initial raw form of P-Code.
However, when I first attempted to implement this analysis, the machine instructions in question didn’t appear to produce any P-Code output at all…
The Case of the Missing P-Code Operations
Consider the following example Go program:
package main import "fmt" func main() { fmt.Printf("Who killed Mr. COPY?n") }
In an x86-64 ELF build of this program, the instructions that set up the string structure on the stack in main.main
can be found at addresses 0x48e7c4
through 0x48e7d0
:
0048e7c4 LEA RAX,[DAT_004c584a] 0048e7cb MOV qword ptr [RSP + local_48],RAX=>DAT_004c584a 0048e7d0 MOV qword ptr [RSP + local_40],0x15 0048e7d9 MOV qword ptr [RSP + local_38],0x0 0048e7e2 XORPS XMM0,XMM0 0048e7e5 MOVUPS xmmword ptr [RSP + local_30[0]],XMM0 0048e7ea CALL fmt.Fprintf
The address of the string content, 0x4c584a
, is written to the stack by the MOV instruction at 0x48e7cb
. The string’s length, 0x15
, is written to the stack by the MOV instruction at 0x48e7d0
.
We can display the raw P-Code for each machine instruction by enabling the P-Code listing field in Ghidra’s code listing:
0048e7c4 LEA RAX,[DAT_004c584a] RAX = COPY 0x4c584a:8 0048e7cb MOV qword ptr [RSP + local_48],RAX=>DAT_004c584a $U3800:8 = INT_ADD 16:8, RSP $Uc000:8 = COPY RAX STORE ram($U3800:8), $Uc000:8 0048e7d0 MOV qword ptr [RSP + local_40],0x15 $U3800:8 = INT_ADD 24:8, RSP $Uc080:8 = COPY 21:8 STORE ram($U3800:8), $Uc080:8
Note that both MOV instructions are translated into an INT_ADD, COPY, and STORE operation sequence in the raw P-Code.
To get the high (analyzed) P-Code output from the decompiler, we can write a script to use the decompiler API. This script decompiles the currently selected function and prints the resulting P-Code operations to the console:
PrintHighPCode.java> Running... PCode for function main.main @ 0048e790 ... 0x48e799:0xb2 (unique, 0xc000, 8) CAST (unique, 0x1000007a, 8) 0x48e799:0xb3 (unique, 0x10000082, 8) CAST (register, 0x20, 8) 0x48e79d:0x11 --- CBRANCH (ram, 0x48e7f9, 1) , (unique, 0x1000005f, 1) 0x48e79d:0xac (unique, 0x1000005f, 1) BOOL_AND (register, 0x200, 1) , (register, 0x206, 1) 0x48e7ea:0x3a --- CALL (ram, 0x4866f0, 8) 0x48e7ea:0x8c (ram, 0x5601b0, 8) INDIRECT (ram, 0x5601b0, 8) , (const, 0x3a, 4) 0x48e7f8:0x49 --- RETURN (const, 0x0, 8) 0x48e7f8:0x8d (ram, 0x5601b0, 8) COPY (ram, 0x5601b0, 8) ... PrintHighPCode.java> Finished!
The instructions for setting up the stack address and length are completely missing! In fact, all thirteen instructions leading up to the function call at 0x48e7ea
are missing from the P-Code output.
Running the “Decompiler Parameter ID” analyzer provides a small hint as to what’s happening with these instructions. For reference, here’s the analyzer’s description:
Creates parameter and local variables for a Function using Decompiler. WARNING: This can take a SIGNIFICANT Amount of Time! Turned off by default for large programs You can run this later using "Analysis->Decompiler Parameter ID"
(This analyzer won’t be enabled by default unless the program is a PE binary under 2 MB in size.2 A Windows build of a Go “hello world” program is just over 2 MB, so it’s unlikely to run automatically for a Go binary.)
After the analyzer finishes, some parameters are associated with the fmt.Fprintf
function call in the decompiled C code view. Although the call setup is incorrect, some of the missing stack data now appears:
fmt.Fprintf(param_1,param_2,param_3,go.itab.*os.File,io.Writer,param_5,param_6, (long)go.itab.*os.File,io.Writer,os.Stdout,(sigaction *) DAT_004c584a, (sigaction *)0x15,0,(sigaction *)0x0,0);
Looking at the decompiler P-Code output again, the CALL op now has those same parameters associated with it, but the P-Code ops for the MOV instructions are still missing:
PrintHighPCode.java> Running... PCode for function main.main @ 0048e790 ... 0x48e79d:0x11 --- CBRANCH (ram, 0x48e7f9, 1) , (unique, 0x100000d5, 1) 0x48e79d:0xcb (unique, 0x100000d5, 1) BOOL_AND (register, 0x200, 1) , (register, 0x206, 1) 0x48e7ea:0x3a --- CALL (ram, 0x4866f0, 8) , (register, 0x38, 8) , (register, 0x30, 8) , (register, 0x10, 8) , (unique, 0x100000e0, 8) , (register, 0x80, 8) , (register, 0x88, 8) , (unique, 0x10000130, 8) , (ram, 0x5601b0, 8) , (unique, 0x100000d8, 8) , (const, 0x15, 8) , (const, 0x0, 8) , (const, 0x0, 8) , (const, 0x0, 8) 0x48e7ea:0xa7 (ram, 0x5601b0, 8) INDIRECT (ram, 0x5601b0, 8) , (const, 0x3a, 4) 0x48e7ea:0xce (unique, 0x10000128, 8) PTRSUB (const, 0x0, 8) , (const, 0x4c584a, 8) 0x48e7ea:0xcf (unique, 0x100000e0, 8) PTRSUB (const, 0x0, 8) , (const, 0x4dc7c0, 8) 0x48e7ea:0xd0 (unique, 0x100000e8, 8) PTRSUB (const, 0x0, 8) , (const, 0x4dc7c0, 8) 0x48e7ea:0xd8 (unique, 0x100000d8, 8) CAST (unique, 0x10000128, 8) 0x48e7ea:0xd9 (unique, 0x10000130, 8) CAST (unique, 0x100000e8, 8) ... PrintHighPCode.java> Finished!
This suggests that the stack write operations are eaten up by analysis that associates stack values with function calls. This may be due to incorrect function signatures or calling convention definitions, but in the interest of performing automatic analysis, I’d prefer not to have to fix up these definitions when encountering a new Go binary, or any other type of binary that I might want to use P-Code analysis on.
So, what’s actually happening to the raw P-Code ops that represent the machine instructions we’re interested in? Is there any way to get them to appear in the decompiler output?
Simplification Styles
Looking over the DecompInterface
class documentation, one method stands out:
public boolean setSimplificationStyle(java.lang.String actionstring)
This allows the application to the type of analysis performed by the decompiler, by giving the name of an analysis class. Right now, there are a few predefined classes. But there soon may be support for applications to define their own class and tailoring the decompiler’s behaviour for that class.
The current predefined analysis class are:
- “decompile” – this is the default, and performs all analysis steps suitable for producing C code.
- “normalize” – omits type recovery from the analysis and some of the final clean-up steps involved in making valid C code. It is suitable for creating normalized pcode syntax trees of the dataflow.
- “firstpass” – does no analysis, but produces an unmodified syntax tree of the dataflow from the
- “register” – does ???.
- “paramid” – does required amount of decompilation followed by analysis steps that send parameter measure information for parameter id analysis. raw pcode.
The analysis class descriptions here are limited, but I experimented with the different simplification style options and found that the register
and firstpass
styles produce P-Code ops for the target MOV instructions. However, the output for these styles is also more cumbersome: the decompile
style outputs 24 ops, whereas the firstpass
and register
styles output over 150.
Here’s how the P-Code ops for the LEA and MOV instructions that produce the string structure look with the register
simplification style:
PrintHighPCode.java> Running... PCode for function main.main @ 0048e790 (simplification style: register) ... 0x48e7c4:0x27 (register, 0x0, 8) COPY (const, 0x4c584a, 8) 0x48e7cb:0x28 (unique, 0x3800, 8) INT_ADD (register, 0x20, 8) , (const, 0xffffffffffffffb8, 8) 0x48e7cb:0x29 (unique, 0xc000, 8) COPY (const, 0x4c584a, 8) 0x48e7cb:0x2a --- STORE (const, 0x1b1, 4) , (unique, 0x3800, 8) , (const, 0x4c584a, 8) 0x48e7cb:0x8c (ram, 0x5601b0, 8) INDIRECT (ram, 0x5601b0, 8) , (const, 0x2a, 4) 0x48e7d0:0x2b (unique, 0x3800, 8) INT_ADD (register, 0x20, 8) , (const, 0xffffffffffffffc0, 8) 0x48e7d0:0x2c (unique, 0xc080, 8) COPY (const, 0x15, 8) 0x48e7d0:0x2d --- STORE (const, 0x1b1, 4) , (unique, 0x3800, 8) , (const, 0x15, 8) 0x48e7d0:0x8d (ram, 0x5601b0, 8) INDIRECT (ram, 0x5601b0, 8) , (const, 0x2d, 4) ... PrintHighPCode.java> Finished!
This output is very similar to the raw P-Code we saw earlier in the code listing. For example, compare the raw P-Code for 0x48e7cb
:
MOV qword ptr [RSP + local_48],RAX=>DAT_004c584a $U3800:8 = INT_ADD 16:8, RSP $Uc000:8 = COPY RAX STORE ram($U3800:8), $Uc000:8
Although this less processed P-Code output isn’t ideal, it appears to be the only readily available option. Let’s see how difficult it is to implement an analysis script based on this simplification style.
Register Style Analysis
The STORE operations appear to be the best targets for analyzing stack writes. The string address value conveniently propagates from the LEA instruction, through RAX, to the input of the P-Code STORE op for the MOV instruction. However, it’s not immediately obvious how to map the STORE output destination “(const, 0x1b1, 4), (unique, 0x3800, 8)
” to its stack pointer offset without tracking all of these “unique” space varnodes.3
Digging through the API documentation some more, we can see the VarnodeAST
class has a getDef()
method that links to the P-Code operation that defined the varnode (if available). Using the Eclipse debugger on the P-Code listing script, we can interactively explore the decompiler output in the Eclipse debug console. Here’s the STORE op that copies the string address to the stack again:
pcodeOpAST.toString() (java.lang.String) --- STORE (const, 0x1b1, 4) , (unique, 0x3800, 8) , (const, 0x4c584a, 8)
The STORE operation setup is a little weird; instead of having an output, it has two inputs that define the output location. Input 0 is the constant ID of the destination address space, and input 1 is the varnode containing the destination pointer offset. Here’s input 1:
pcodeOpAST.getInput(1) (ghidra.program.model.pcode.VarnodeAST) (unique, 0x3800, 8)
Finally, input 1 has a reference to its defining operation – the INT_ADD that calculates an offset from RSP:
pcodeOpAST.getInput(1).getDef() (ghidra.program.model.pcode.PcodeOpAST) (unique, 0x3800, 8) INT_ADD (register, 0x20, 8) , (const, 0xffffffffffffffb8, 8)
Since we’d like to know that the constant is written to a location on the stack, the next question is how to map “(register, 0x20, 8)
” to the stack pointer register. The Program
class has a getRegister (Varnode varnode)
method that returns a varnode’s corresponding Register
object:
pcodeOpAST.getInput(1).getDef().getInput(0) (ghidra.program.model.pcode.VarnodeAST) (register, 0x20, 8) currentProgram.getRegister(pcodeOpAST.getInput(1).getDef().getInput(0)) (ghidra.program.model.lang.Register) RSP
The Register
class has a getTypeFlags
method and TYPE_SP
flag constant that would apparently indicate a stack pointer type register. Unfortunately, we’ll have to rely on checking the register name string, as no type flags are set on the RSP register object:
currentProgram.getRegister(pcodeOpAST.getInput(1).getDef().getInput(0)).getTypeFlags() (int) 0
To see if a potential address and length value are adjacent to each other on the stack, we also want to determine what the stack offset value is. The STORE destination’s defining op has always been an INT_ADD in the binaries I tested, usually with a stack pointer register as the first input and a constant offset value as the second input. However, the offset input varnode is sometimes a register instead of a constant value, which requires traversing more defining P-Code ops to attempt to recover a constant offset value that the register would hold.
Similarly, the string address constant isn’t always propagated to the input of the STORE operation. For example, the STORE’s input varnode could be a register that the address was loaded into. This happens in a 32-bit ARM binary when the string data address is loaded into a register from the constant pool, and then copied from the register to the stack:
0009af44 ldr r0,[PTR_DAT_0009af88] = 000c580b r0 = LOAD ram(0x9af88:4) 0009af48 str r0=>DAT_000c580b,[sp,#local_20] $U8280:4 = INT_ADD sp, 12:4 STORE ram($U8280:4), r0
Here’s how the register
style P-Code output looks for the above ARM instructions:
0x9af44:0x19 (register, 0x20, 4) LOAD (const, 0x1a1, 4) , (const, 0x9af88, 4) 0x9af48:0x1a (unique, 0x8280, 4) INT_ADD (register, 0x54, 4) , (const, 0xffffffe0, 4) 0x9af48:0x1b --- STORE (const, 0x1a1, 4) , (unique, 0x8280, 4) , (register, 0x20, 4)
To handle a register value defined by a LOAD operation we’ll have to check if the data at the load address is recognized as a pointer:
// Register may hold an address PcodeOp def = dataToStore.getDef(); // Check for LOAD op that loaded an address into the register, // e.g. getting address from constant pool in ARM 32 if (def != null def.getOpcode() == PcodeOp.LOAD) { int spaceId = (int) def.getInput(0).getOffset(); long offset = def.getInput(1).getOffset(); Address loadFrom = program.getAddressFactory().getAddress(spaceId, offset); Data dataLoaded = getDataAt(loadFrom); if (dataLoaded != null dataLoaded.isPointer()) { candidateAddr = (Address) dataLoaded.getValue(); } }
One more API quirk to mention is that constant value varnodes store their value as an address offset in the “constant” address space.
if (varnode.isConstant()) { // The constant value is stored as its "address" offset long constVal = varnode.getAddress().getOffset(); // ... }
Combining this information is enough to make a working Go string analysis script based on scanning STORE operations from the register
style P-Code output, but this is much clunkier than the ideal scenario I imagined. Preferably, the P-Code operations would just have a constant length or address value as the input and an address in the stack space as the output.
To figure out why P-Code ops for the target machine instructions disappear completely in the higher-level analysis styles, we’ll have to look into the Ghidra decompiler internals.
Decompiler Internals
The decompiler C++ source code is included with Ghidra in the Ghidra/Features/Decompiler/
directory. After installing the necessary build dependencies, run make doc
in Ghidra/Features/Decompiler/src/decompile/cpp
to generate the decompiler documentation.
The SetAction
class documentation provides slightly better info on the simplification styles than “???”:
- decompile – The main decompiler action
- normalize – Decompilation tuned for normalization
- jumptable – Simplify just enough to recover a jump-table
- paramid – Simplify enough to recover function parameters
- register – Perform one analysis pass on registers, without stack variables
- firstpass – Construct the initial raw syntax tree, with no simplification
Searching for the simplification style names in the source code turns up two of the most important pieces of code to look at. The ActionDatabase::buildDefaultGroups
method shows that these simplification styles are defined as groups of analysis rules:
/// (Re)build the default e root Actions: decompile, jumptable, normalize, paramid, register, firstpass void ActionDatabase::buildDefaultGroups(void) { if (isDefaultGroups) return; groupmap.clear(); const char *members[] = { "base", "protorecovery", "protorecovery_a", "deindirect", "localrecovery", "deadcode", "typerecovery", "stackptrflow", "blockrecovery", "stackvars", "deadcontrolflow", "switchnorm", "cleanup", "merge", "dynamic", "casts", "analysis", "fixateglobals", "fixateproto", "segment", "returnsplit", "nodejoin", "doubleload", "doubleprecis", "unreachable", "subvar", "floatprecision", "conditionalexe", "" }; setGroup("decompile",members); const char *jumptab[] = { "base", "noproto", "localrecovery", "deadcode", "stackptrflow", "stackvars", "analysis", "segment", "subvar", "conditionalexe", "" }; setGroup("jumptable",jumptab); const char *normali[] = { "base", "protorecovery", "protorecovery_b", "deindirect", "localrecovery", "deadcode", "stackptrflow", "normalanalysis", "stackvars", "deadcontrolflow", "analysis", "fixateproto", "nodejoin", "unreachable", "subvar", "floatprecision", "normalizebranches", "conditionalexe", "" }; setGroup("normalize",normali); const char *paramid[] = { "base", "protorecovery", "protorecovery_b", "deindirect", "localrecovery", "deadcode", "typerecovery", "stackptrflow", "siganalysis", "stackvars", "deadcontrolflow", "analysis", "fixateproto", "unreachable", "subvar", "floatprecision", "conditionalexe", "" }; setGroup("paramid",paramid); const char *regmemb[] = { "base", "analysis", "subvar", "" }; setGroup("register",regmemb); const char *firstmem[] = { "base", "" }; setGroup("firstpass",firstmem); isDefaultGroups = true; }
To clear up some of the terminology: “simplification styles” are also referred to as “root actions” or “groups” in the decompiler source code. They consist of groups of “base groups” such as “stackvars” or “typerecovery”, which are more fine-grained groups of specific analysis operations.
Just below buildDefaultGroups
is the universal action list, which lists all analysis operations (“actions” and “rules”) in the order they should run. Each operation is associated with a base group. Operations are enabled when their base group is included in the current root action.
/// Construct the b universal Action that contains all possible components /// param conf is the Architecture that will use the Action void ActionDatabase::universalAction(Architecture *conf) { vector<Rule *>::iterator iter; ActionGroup *act; ActionGroup *actmainloop; ActionGroup *actfullloop; ActionPool *actprop,*actprop2; ActionPool *actcleanup; ActionGroup *actstackstall; AddrSpace *stackspace = conf->getStackSpace(); act = new ActionRestartGroup(Action::rule_onceperfunc,"universal",1); registerAction(universalname,act); act->addAction( new ActionStart("base")); act->addAction( new ActionConstbase("base")); act->addAction( new ActionNormalizeSetup("normalanalysis")); act->addAction( new ActionDefaultParams("base")); // act->addAction( new ActionParamShiftStart("paramshift") ); act->addAction( new ActionExtraPopSetup("base",stackspace) ); act->addAction( new ActionPrototypeTypes("protorecovery")); act->addAction( new ActionFuncLink("protorecovery") ); act->addAction( new ActionFuncLinkOutOnly("noproto") ); //... snip ... act->addAction( new ActionDynamicSymbols("dynamic") ); act->addAction( new ActionNameVars("merge") ); act->addAction( new ActionSetCasts("casts") ); act->addAction( new ActionFinalStructure("blockrecovery") ); act->addAction( new ActionPrototypeWarnings("protorecovery") ); act->addAction( new ActionStop("base") ); }
There are about 220 different actions in the universal list, so instead of going through all of them, I looked for a way to dynamically inspect the decompilation process. There is a way to build the decompiler with a debugging CLI, described in this GitHub issue. In short, once the necessary build dependencies are installed, go to Ghidra/Features/Decompiler/src/decompile/cpp
and run make decomp_dbg
.
The issue page describes how to save the function data XML payload that Ghidra generates for IPC with the decompiler and load it in the debugging CLI to reproduce a function decompilation. I couldn’t find any overall documentation on the CLI itself, though some of the individual command handler classes have documentation that describes the arguments they take (see the IfaceCommand
class documentation for a list of handler classes). Beyond the initial XML loading steps, I had to browse through the code a bit to find my way around the CLI and understand what the debugging trace output meant.
To save a copy of the function data XML from the Ghidra GUI, we can use the “Debug Function Decompilation” option in the Decompile window drop-down menu. From a script we can achieve the same thing with the DecompInterface.enableDebug
method:
File debugDump = askFile("Decompiler IPC XML Dump", "Save"); decompIfc.enableDebug(debugDump);
Start decomp_dbg
with the SLEIGHHOME
environment variable set to the Ghidra install directory. Then load the XML file with the restore
command:
$ SLEIGHHOME=/opt/ghidra_10.1.2 ./decomp_dbg [decomp]> restore /tmp/mystery_printf_main.xml /tmp/mystery_printf_main.xml successfully loaded: Intel/AMD 64-bit x86
Next we can specify the function to decompile and set trace points on the individual instructions we want to observe analysis actions for. Here I’m including the addresses of the LEA and MOV instructions that set up the string structure on the stack, as well as the call to fmt.Fprintf
:
[decomp]> load function main.main Function main.main: 0x0048e790 [decomp]> trace address 0x48e7c4 OK (1 ranges) [decomp]> trace address 0x48e7cb OK (2 ranges) [decomp]> trace address 0x48e7d0 OK (3 ranges) [decomp]> trace address 0x48e7ea OK (4 ranges)
Now we can use the decompile
command to decompile the main.main
function and trace the analysis process:
[decomp]> decompile Decompiling main.main DEBUG 0: extrapopsetup 0x0048e7ea:52: ** 0x0048e7ea:52: RSP(0x0048e7ea:52) = RSP(free) + #0x8 DEBUG 1: funclink 0x0048e7ea:57: ** 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(free) + #0x0 0x0048e7ea:58: ** 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = *(ram,u0x10000012(0x0048e7ea:57)) 0x0048e7ea:3a: call ffmt.Fprintf(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58)) DEBUG 2: heritage 0x0048e7ea:5b: ** 0x0048e7ea:5b: RAX(0x0048e7ea:5b) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58)) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),RCX(0x0048e7b4:21),XMM0_Da(0x0048e7e2:31)) 0x0048e7ea:5e: ** 0x0048e7ea:5e: RCX(0x0048e7ea:5e) = RCX(0x0048e7b4:21) [] i0x0048e7ea:3a(free) ...
Each trace entry shows the name of the rule that executed (after “DEBUG #”), followed by the list of P-Code transformations it performed. P-Code operations are identified by their target address and “time” value, which distinguishes between multiple P-Code operations for a single native machine instruction (e.g., 0x0048e7ea:3a
vs 0x0048e7ea:5b
). Double asterisks (“**”) after an address listing mean the op is dead or has no parent basic block (see PcodeOp::printDebug
).
For example, this trace shows how the “Propagate Copy” rule replaces a reference to the RAX register with the constant value that was written to it by the COPY op at 0x0048e7c4:27
:
DEBUG 5: propagatecopy 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = RAX(0x0048e7c4:27) 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = #0x4c584a
To investigate these analysis steps in more detail via the source code, search for the name that appears after “DEBUG”. This name comes from the rule or action’s constructor:
class RulePropagateCopy : public Rule { public: RulePropagateCopy(const string g) : Rule(g, 0, "propagatecopy") {} ///< Constructor
The implementation of a rule is found in its class’s applyOp
method, e.g. RulePropagateCopy::applyOp
.
Some of the trace output is a little cryptic, such as these lines with empty square brackets (“[]”):
0x0048e7ea:95: ** 0x0048e7ea:95: s0xffffffffffffffa0(0x0048e7ea:95) = s0xffffffffffffffa0(0x0048e7ea:39) [] i0x0048e7ea:3a(free)
Looking for the relevant printRaw
method that creates the output line can help clarify its meaning. The square brackets happen to refer to INDIRECT operations. The “i”-prefixed address after the brackets shows an iop
space address. This address space is used to store pointers to internal P-Code ops. Documentation for the IopSpace
class and INDIRECT P-Code op both discuss the meaning of this kind of address:
The varnode input1 is not part of the machine state but is really an internal reference to a specific p-code operator that may be affecting the value of the output varnode. A special address space indicates input1’s use as an internal reference encoding.
The decompiler CLI print spaces
command shows the mapping of address prefix characters to address spaces:
[decomp]> print spaces 0 : '#' const constant small addrsize=8 wordsize=1 delay=0 1 : 'o' OTHER processor small addrsize=8 wordsize=1 delay=0 2 : 'u' unique internal small addrsize=4 wordsize=1 delay=0 3 : 'r' ram processor small addrsize=8 wordsize=1 delay=1 4 : '%' register processor small addrsize=4 wordsize=1 delay=0 5 : 'f' fspec special small addrsize=8 wordsize=1 delay=1 6 : 'i' iop special small addrsize=8 wordsize=1 delay=1 7 : 'j' join special small addrsize=4 wordsize=1 delay=0 8 : 's' stack spacebase small addrsize=8 wordsize=1 delay=1
Tracing the Decompiler Analysis
Now that we know a bit more about how to interpret the decompiler trace output, we can use it to understand what happens when we decompile the main.main
function with a higher-level analysis style. For reference, the full trace of the main.main
decompilation follows:
[decomp]> decompile Decompiling main.main DEBUG 0: extrapopsetup 0x0048e7ea:52: ** 0x0048e7ea:52: RSP(0x0048e7ea:52) = RSP(free) + #0x8 DEBUG 1: funclink 0x0048e7ea:57: ** 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(free) + #0x0 0x0048e7ea:58: ** 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = *(ram,u0x10000012(0x0048e7ea:57)) 0x0048e7ea:3a: call ffmt.Fprintf(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58)) DEBUG 2: heritage 0x0048e7ea:5b: ** 0x0048e7ea:5b: RAX(0x0048e7ea:5b) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58)) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),RCX(0x0048e7b4:21),XMM0_Da(0x0048e7e2:31)) 0x0048e7ea:5e: ** 0x0048e7ea:5e: RCX(0x0048e7ea:5e) = RCX(0x0048e7b4:21) [] i0x0048e7ea:3a(free) 0x0048e7ea:61: ** 0x0048e7ea:61: FS_OFFSET(0x0048e7ea:61) = FS_OFFSET(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:64: ** 0x0048e7ea:64: CF(0x0048e7ea:64) = CF(0x0048e79f:12) [] i0x0048e7ea:3a(free) 0x0048e7ea:67: ** 0x0048e7ea:67: PF(0x0048e7ea:67) = PF(0x0048e79f:1a) [] i0x0048e7ea:3a(free) 0x0048e7ea:6a: ** 0x0048e7ea:6a: ZF(0x0048e7ea:6a) = ZF(0x0048e79f:16) [] i0x0048e7ea:3a(free) 0x0048e7ea:6d: ** 0x0048e7ea:6d: SF(0x0048e7ea:6d) = SF(0x0048e79f:15) [] i0x0048e7ea:3a(free) 0x0048e7ea:70: ** 0x0048e7ea:70: DF(0x0048e7ea:70) = DF(0x0048e790:4f) [] i0x0048e7ea:3a(free) 0x0048e7ea:73: ** 0x0048e7ea:73: OF(0x0048e7ea:73) = OF(0x0048e79f:13) [] i0x0048e7ea:3a(free) 0x0048e7ea:76: ** 0x0048e7ea:76: RIP(0x0048e7ea:76) = RIP(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:7c: ** 0x0048e7ea:7c: XMM0_Da(0x0048e7ea:7c) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:7f: ** 0x0048e7ea:7f: XMM0_Db(0x0048e7ea:7f) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:82: ** 0x0048e7ea:82: XMM0_Dc(0x0048e7ea:82) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:85: ** 0x0048e7ea:85: XMM0_Dd(0x0048e7ea:85) = [create] i0x0048e7ea:3a(free) 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = #0x10 + RSP(free) 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = #0x10 + RSP(0x0048e79f:14) 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = RAX(free) 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = RAX(0x0048e7c4:27) 0x0048e7cb:2a: *(ram,u0x00003800(free)) = u0x0000c000(free) 0x0048e7cb:2a: *(ram,u0x00003800(0x0048e7cb:28)) = u0x0000c000(0x0048e7cb:29) 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = #0x18 + RSP(free) 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = #0x18 + RSP(0x0048e79f:14) 0x0048e7d0:2d: *(ram,u0x00003800(free)) = u0x0000c080(free) 0x0048e7d0:2d: *(ram,u0x00003800(0x0048e7d0:2b)) = u0x0000c080(0x0048e7d0:2c) 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(free) - #0x8 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) - #0x8 0x0048e7ea:39: *(ram,RSP(free)) = #0x48e7ef 0x0048e7ea:39: *(ram,RSP(0x0048e7ea:38)) = #0x48e7ef 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(free) + #0x0 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(0x0048e7ea:38) + #0x0 0x0048e7ea:52: RSP(0x0048e7ea:52) = RSP(free) + #0x8 0x0048e7ea:52: RSP(0x0048e7ea:52) = RSP(0x0048e7ea:38) + #0x8 DEBUG 3: deadcode 0x0048e7ea:5b: RAX(0x0048e7ea:5b) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:5b: ** 0x0048e7ea:5e: RCX(0x0048e7ea:5e) = RCX(0x0048e7b4:21) [] i0x0048e7ea:3a(free) 0x0048e7ea:5e: ** 0x0048e7ea:52: RSP(0x0048e7ea:52) = RSP(0x0048e7ea:38) + #0x8 0x0048e7ea:52: ** 0x0048e7ea:61: FS_OFFSET(0x0048e7ea:61) = FS_OFFSET(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:61: ** 0x0048e7ea:64: CF(0x0048e7ea:64) = CF(0x0048e79f:12) [] i0x0048e7ea:3a(free) 0x0048e7ea:64: ** 0x0048e7ea:67: PF(0x0048e7ea:67) = PF(0x0048e79f:1a) [] i0x0048e7ea:3a(free) 0x0048e7ea:67: ** 0x0048e7ea:6a: ZF(0x0048e7ea:6a) = ZF(0x0048e79f:16) [] i0x0048e7ea:3a(free) 0x0048e7ea:6a: ** 0x0048e7ea:6d: SF(0x0048e7ea:6d) = SF(0x0048e79f:15) [] i0x0048e7ea:3a(free) 0x0048e7ea:6d: ** 0x0048e7ea:70: DF(0x0048e7ea:70) = DF(0x0048e790:4f) [] i0x0048e7ea:3a(free) 0x0048e7ea:70: ** 0x0048e7ea:73: OF(0x0048e7ea:73) = OF(0x0048e79f:13) [] i0x0048e7ea:3a(free) 0x0048e7ea:73: ** 0x0048e7ea:76: RIP(0x0048e7ea:76) = RIP(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:76: ** 0x0048e7ea:7c: XMM0_Da(0x0048e7ea:7c) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:7c: ** 0x0048e7ea:7f: XMM0_Db(0x0048e7ea:7f) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:7f: ** 0x0048e7ea:82: XMM0_Dc(0x0048e7ea:82) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:82: ** 0x0048e7ea:85: XMM0_Dd(0x0048e7ea:85) = [create] i0x0048e7ea:3a(free) 0x0048e7ea:85: ** DEBUG 4: termorder 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = #0x10 + RSP(0x0048e79f:14) 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = RSP(0x0048e79f:14) + #0x10 DEBUG 5: propagatecopy 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = RAX(0x0048e7c4:27) 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = #0x4c584a DEBUG 6: propagatecopy 0x0048e7cb:2a: *(ram,u0x00003800(0x0048e7cb:28)) = u0x0000c000(0x0048e7cb:29) 0x0048e7cb:2a: *(ram,u0x00003800(0x0048e7cb:28)) = #0x4c584a DEBUG 7: termorder 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = #0x18 + RSP(0x0048e79f:14) 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = RSP(0x0048e79f:14) + #0x18 DEBUG 8: propagatecopy 0x0048e7d0:2d: *(ram,u0x00003800(0x0048e7d0:2b)) = u0x0000c080(0x0048e7d0:2c) 0x0048e7d0:2d: *(ram,u0x00003800(0x0048e7d0:2b)) = #0x15 DEBUG 9: sub2add 0x0048e7ea:88: ** 0x0048e7ea:88: u0x1000004f(0x0048e7ea:88) = #0x8 * #0xffffffffffffffff 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) - #0x8 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) + u0x1000004f(0x0048e7ea:88) DEBUG 10: propagatecopy 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),RCX(0x0048e7b4:21),XMM0_Da(0x0048e7e2:31)) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),#0x4dc7c0,XMM0_Da(0x0048e7e2:31)) DEBUG 11: identityel 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(0x0048e7ea:38) + #0x0 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(0x0048e7ea:38) DEBUG 12: propagatecopy 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = *(ram,u0x10000012(0x0048e7ea:57)) 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = *(ram,RSP(0x0048e7ea:38)) DEBUG 13: collapseconstants 0x0048e7ea:88: u0x1000004f(0x0048e7ea:88) = #0x8 * #0xffffffffffffffff 0x0048e7ea:88: u0x1000004f(0x0048e7ea:88) = #0xfffffffffffffff8 DEBUG 14: earlyremoval 0x0048e7c4:27: RAX(0x0048e7c4:27) = #0x4c584a 0x0048e7c4:27: ** DEBUG 15: addmultcollapse 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = RSP(0x0048e79f:14) + #0x10 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = RSP(i) + #0xffffffffffffffb8 DEBUG 16: earlyremoval 0x0048e7cb:29: u0x0000c000(0x0048e7cb:29) = #0x4c584a 0x0048e7cb:29: ** DEBUG 17: addmultcollapse 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = RSP(0x0048e79f:14) + #0x18 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = RSP(i) + #0xffffffffffffffc0 DEBUG 18: earlyremoval 0x0048e7d0:2c: u0x0000c080(0x0048e7d0:2c) = #0x15 0x0048e7d0:2c: ** DEBUG 19: propagatecopy 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) + u0x1000004f(0x0048e7ea:88) 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) + #0xfffffffffffffff8 DEBUG 20: propagatecopy 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),#0x4dc7c0,XMM0_Da(0x0048e7e2:31)) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),#0x4dc7c0,#0x0:4) DEBUG 21: earlyremoval 0x0048e7ea:57: u0x10000012(0x0048e7ea:57) = RSP(0x0048e7ea:38) 0x0048e7ea:57: ** DEBUG 22: earlyremoval 0x0048e7ea:88: u0x1000004f(0x0048e7ea:88) = #0xfffffffffffffff8 0x0048e7ea:88: ** DEBUG 23: addmultcollapse 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(0x0048e79f:14) + #0xfffffffffffffff8 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(i) + #0xffffffffffffffa0 DEBUG 24: storevarnode 0x0048e7cb:2a: *(ram,u0x00003800(0x0048e7cb:28)) = #0x4c584a 0x0048e7cb:2a: s0xffffffffffffffb8(0x0048e7cb:2a) = #0x4c584a DEBUG 25: storevarnode 0x0048e7d0:2d: *(ram,u0x00003800(0x0048e7d0:2b)) = #0x15 0x0048e7d0:2d: s0xffffffffffffffc0(0x0048e7d0:2d) = #0x15 DEBUG 26: storevarnode 0x0048e7ea:39: *(ram,RSP(0x0048e7ea:38)) = #0x48e7ef 0x0048e7ea:39: s0xffffffffffffffa0(0x0048e7ea:39) = #0x48e7ef DEBUG 27: loadvarnode 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = *(ram,RSP(0x0048e7ea:38)) 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = s0xffffffffffffffa0:1(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(u0x1000001a:1(0x0048e7ea:58),#0x4dc7c0,#0x0:4) 0x0048e7ea:3a: call ffmt.Fprintf(free)(#0x4dc7c0,#0x0:4) DEBUG 28: heritage 0x0048e7ea:8c: ** 0x0048e7ea:8c: r0x005601b0(0x0048e7ea:8c) = r0x005601b0(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:3a: call ffmt.Fprintf(free)(#0x4dc7c0,#0x0:4) 0x0048e7ea:3a: call ffmt.Fprintf(free)(#0x4dc7c0,#0x0:4,s0x00000000:1(i),s0xffffffffffffffa8(0x0048e7bb:23),s0xffffffffffffffb0(0x0048e7bf:26),s0xffffffffffffffb8(0x0048e7cb:2a),s0xffffffffffffffc0(0x0048e7d0:2d),s0xffffffffffffffc8(0x0048e7d9:30),s0xffffffffffffffd0:10(0x0048e7e5:37),s0xfffffffffffffff8(0x0048e7a3:1d)) 0x0048e7ea:91: ** 0x0048e7ea:91: s0x00000000:1(0x0048e7ea:91) = s0x00000000:1(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:92: ** 0x0048e7ea:92: s0xffffffffffffffa0:1(0x0048e7ea:92) = SUB81(s0xffffffffffffffa0(0x0048e7ea:39),#0x0) 0x0048e7ea:95: ** 0x0048e7ea:95: s0xffffffffffffffa0(0x0048e7ea:95) = s0xffffffffffffffa0(0x0048e7ea:39) [] i0x0048e7ea:3a(free) 0x0048e7ea:98: ** 0x0048e7ea:98: s0xffffffffffffffa8(0x0048e7ea:98) = s0xffffffffffffffa8(0x0048e7bb:23) [] i0x0048e7ea:3a(free) 0x0048e7ea:9b: ** 0x0048e7ea:9b: s0xffffffffffffffb0(0x0048e7ea:9b) = s0xffffffffffffffb0(0x0048e7bf:26) [] i0x0048e7ea:3a(free) 0x0048e7ea:9e: ** 0x0048e7ea:9e: s0xffffffffffffffb8(0x0048e7ea:9e) = s0xffffffffffffffb8(0x0048e7cb:2a) [] i0x0048e7ea:3a(free) 0x0048e7ea:a1: ** 0x0048e7ea:a1: s0xffffffffffffffc0(0x0048e7ea:a1) = s0xffffffffffffffc0(0x0048e7d0:2d) [] i0x0048e7ea:3a(free) 0x0048e7ea:a4: ** 0x0048e7ea:a4: s0xffffffffffffffc8(0x0048e7ea:a4) = s0xffffffffffffffc8(0x0048e7d9:30) [] i0x0048e7ea:3a(free) 0x0048e7ea:a7: ** 0x0048e7ea:a7: s0xffffffffffffffd0:10(0x0048e7ea:a7) = s0xffffffffffffffd0:10(0x0048e7e5:37) [] i0x0048e7ea:3a(free) 0x0048e7ea:ab: ** 0x0048e7ea:ab: s0xfffffffffffffff8(0x0048e7ea:ab) = s0xfffffffffffffff8(0x0048e7a3:1d) [] i0x0048e7ea:3a(free) DEBUG 29: activeparam 0x0048e7ea:3a: call ffmt.Fprintf(free)(#0x4dc7c0,#0x0:4,s0x00000000:1(i),s0xffffffffffffffa8(0x0048e7bb:23),s0xffffffffffffffb0(0x0048e7bf:26),s0xffffffffffffffb8(0x0048e7cb:2a),s0xffffffffffffffc0(0x0048e7d0:2d),s0xffffffffffffffc8(0x0048e7d9:30),s0xffffffffffffffd0:10(0x0048e7e5:37),s0xfffffffffffffff8(0x0048e7a3:1d)) 0x0048e7ea:3a: call ffmt.Fprintf(free) DEBUG 30: deadcode 0x0048e7cb:28: u0x00003800(0x0048e7cb:28) = RSP(i) + #0xffffffffffffffb8 0x0048e7cb:28: ** 0x0048e7d0:2b: u0x00003800(0x0048e7d0:2b) = RSP(i) + #0xffffffffffffffc0 0x0048e7d0:2b: ** 0x0048e7ea:58: u0x1000001a:1(0x0048e7ea:58) = s0xffffffffffffffa0:1(0x0048e7ea:92) 0x0048e7ea:58: ** 0x0048e7ea:38: RSP(0x0048e7ea:38) = RSP(i) + #0xffffffffffffffa0 0x0048e7ea:38: ** 0x0048e7ea:91: s0x00000000:1(0x0048e7ea:91) = s0x00000000:1(i) [] i0x0048e7ea:3a(free) 0x0048e7ea:91: ** 0x0048e7ea:92: s0xffffffffffffffa0:1(0x0048e7ea:92) = SUB81(s0xffffffffffffffa0(0x0048e7ea:39),#0x0) 0x0048e7ea:92: ** 0x0048e7ea:ab: s0xfffffffffffffff8(0x0048e7ea:ab) = s0xfffffffffffffff8(0x0048e7a3:1d) [] i0x0048e7ea:3a(free) 0x0048e7ea:ab: ** DEBUG 31: earlyremoval 0x0048e7ea:95: s0xffffffffffffffa0(0x0048e7ea:95) = s0xffffffffffffffa0(0x0048e7ea:39) [] i0x0048e7ea:3a(free) 0x0048e7ea:95: ** DEBUG 32: earlyremoval 0x0048e7ea:98: s0xffffffffffffffa8(0x0048e7ea:98) = s0xffffffffffffffa8(0x0048e7bb:23) [] i0x0048e7ea:3a(free) 0x0048e7ea:98: ** DEBUG 33: earlyremoval 0x0048e7ea:9b: s0xffffffffffffffb0(0x0048e7ea:9b) = s0xffffffffffffffb0(0x0048e7bf:26) [] i0x0048e7ea:3a(free) 0x0048e7ea:9b: ** DEBUG 34: earlyremoval 0x0048e7ea:9e: s0xffffffffffffffb8(0x0048e7ea:9e) = s0xffffffffffffffb8(0x0048e7cb:2a) [] i0x0048e7ea:3a(free) 0x0048e7ea:9e: ** DEBUG 35: earlyremoval 0x0048e7ea:a1: s0xffffffffffffffc0(0x0048e7ea:a1) = s0xffffffffffffffc0(0x0048e7d0:2d) [] i0x0048e7ea:3a(free) 0x0048e7ea:a1: ** DEBUG 36: earlyremoval 0x0048e7ea:a4: s0xffffffffffffffc8(0x0048e7ea:a4) = s0xffffffffffffffc8(0x0048e7d9:30) [] i0x0048e7ea:3a(free) 0x0048e7ea:a4: ** DEBUG 37: earlyremoval 0x0048e7ea:a7: s0xffffffffffffffd0:10(0x0048e7ea:a7) = s0xffffffffffffffd0:10(0x0048e7e5:37) [] i0x0048e7ea:3a(free) 0x0048e7ea:a7: ** DEBUG 38: earlyremoval 0x0048e7cb:2a: s0xffffffffffffffb8(0x0048e7cb:2a) = #0x4c584a 0x0048e7cb:2a: ** DEBUG 39: earlyremoval 0x0048e7d0:2d: s0xffffffffffffffc0(0x0048e7d0:2d) = #0x15 0x0048e7d0:2d: ** DEBUG 40: earlyremoval 0x0048e7ea:39: s0xffffffffffffffa0(0x0048e7ea:39) = #0x48e7ef 0x0048e7ea:39: ** Decompilation complete
Right near the end of the trace output, at steps 38 and 39, we can see exactly the type of P-Code ops that would be ideal for our analysis:
DEBUG 38: earlyremoval 0x0048e7cb:2a: s0xffffffffffffffb8(0x0048e7cb:2a) = #0x4c584a 0x0048e7cb:2a: ** DEBUG 39: earlyremoval 0x0048e7d0:2d: s0xffffffffffffffc0(0x0048e7d0:2d) = #0x15 0x0048e7d0:2d: **
These copy the string address and length values to offsets in the stack address space. This would be great, except earlyremoval
is deleting them!
Earlier, at steps 24 and 25, these ops are created by the storevarnode
rule:
DEBUG 24: storevarnode 0x0048e7cb:2a: *(ram,u0x00003800(0x0048e7cb:28)) = #0x4c584a 0x0048e7cb:2a: s0xffffffffffffffb8(0x0048e7cb:2a) = #0x4c584a DEBUG 25: storevarnode 0x0048e7d0:2d: *(ram,u0x00003800(0x0048e7d0:2b)) = #0x15 0x0048e7d0:2d: s0xffffffffffffffc0(0x0048e7d0:2d) = #0x15
At step 28, those destination stack varnodes are associated with the fmt.Fprintf
function call by the heritage
action. In the same step, they’re also assigned to what appears to be a different Static Single Assignment (SSA) form version of the same stack location:
0x0048e7ea:9e: ** 0x0048e7ea:9e: s0xffffffffffffffb8(0x0048e7ea:9e) = s0xffffffffffffffb8(0x0048e7cb:2a) [] i0x0048e7ea:3a(free) 0x0048e7ea:a1: ** 0x0048e7ea:a1: s0xffffffffffffffc0(0x0048e7ea:a1) = s0xffffffffffffffc0(0x0048e7d0:2d) [] i0x0048e7ea:3a(free)
Then at step 29, the activeparam
action removes all parameters from the fmt.Fprintf
call (note that the “Decompile Parameter ID” analysis wasn’t performed on the program before generating this decompilation request XML payload). Even if the function call parameters were retained, once the stack parameters can be associated with the function call, the original ops that set them up are no longer needed and get marked as dead code.
I looked into manipulating the root action configurations and manually editing the fmt.Fprintf
function signature, but I couldn’t find an obvious way to prevent the stack write operations from being associated with the call operation. A lower-level analysis style that preserves stack operations and doesn’t depend on correct function call or data structure definitions would be great to have, but I think it would require more involved modification of the decompiler source code.
Instead, I decided to try disabling the earlyremoval
rule itself to preserve the stack copy operations. The decompiler has some code for parsing root action configurations from XML (the setSimplificationStyle
documentation seems to refer to this when it mentions that applications might eventually be able to define their own analysis classes), but Ghidra doesn’t have any implementation to generate this XML.
I found that the DecompileOptions
class in the Ghidra API is susceptible to XML injection,4 so I was able to directly inject the necessary XML elements from my script anyway. The injected currentaction
element below disables the deadcode
base group in the decompiler:
String protoEvalModel = options.getProtoEvalModel(); options.setProtoEvalModel(protoEvalModel + "n" + "n" + " " + SIMPLIFICATION_STYLE + "n" + "deadcode n" + "off n" + " n" + "" + protoEvalModel);
I don’t consider this XML injection issue to be a vulnerability because the injection is triggered through arbitrary script code the user is voluntarily running; I didn’t see a way to abuse this via a malicious binary. This trick mainly saves me the trouble of customizing Ghidra itself to extend the decompiler interface.
With dead code elimination disabled, the normalize
style output now produces the ideal COPY operations for analysis:
0x48e7c4:0x27 (register, 0x0, 8) COPY (const, 0x4c584a, 8) 0x48e7cb:0x28 (unique, 0x3800, 8) INT_ADD (register, 0x20, 8) , (const, 0xffffffffffffffb8, 8) 0x48e7cb:0x29 (unique, 0xc000, 8) COPY (const, 0x4c584a, 8) 0x48e7cb:0x2a (stack, 0xffffffffffffffb8, 8) COPY (const, 0x4c584a, 8) 0x48e7d0:0x2b (unique, 0x3800, 8) INT_ADD (register, 0x20, 8) , (const, 0xffffffffffffffc0, 8) 0x48e7d0:0x2c (unique, 0xc080, 8) COPY (const, 0x15, 8) 0x48e7d0:0x2d (stack, 0xffffffffffffffc0, 8) COPY (const, 0x15, 8)
An unintended side effect of disabling dead code elimination is that redundant intermediate versions of some operations won’t be removed anymore. This hinders analysis based on finding certain patterns of COPY operations. For example, an analysis loop that looks for a length copy operation directly adjacent to an address copy operation would not catch the above sequence due to the redundant intermediate versions of the same COPY (one to the unique space, one to the stack space). Despite that, the lifted COPY operation output is much easier to work with and still helps recover a significant number of strings.
Normalize Style Analysis
The COPY op’s input will now just be a constant value, there’s no need to traverse its defining operations. We can just check if the constant value corresponds to an address in the memory block where string content is stored (e.g., .rodata
for ELF binaries or .rdata
for PE), or if it looks like a reasonable string length. For the output varnode, we just check if its address is in the stack address space and then get its offset.
To demonstrate how this eases analysis, here’s how the string length value check looks with the simplified COPY ops:
protected LengthCandidate storeLenCheck(Program program, PcodeOpAST pcodeOpAST) { if (pcodeOpAST.getOpcode() != PcodeOp.COPY) return null; // Get input, make sure it's a constant Varnode dataToStore = pcodeOpAST.getInput(0); if (!dataToStore.isConstant()) return null; long constantValue = dataToStore.getAddress().getOffset(); // Simple string length bounds check if (constantValue < MIN_STR_LEN || constantValue > MAX_STR_LEN) { return null; } // If output is a stack address, get the offset Varnode storeLoc = pcodeOpAST.getOutput(); if (!storeLoc.getAddress().isStackAddress()) { return null; } Long stackOffset = storeLoc.getAddress().getOffset(); LengthCandidate result = new LengthCandidate((int) constantValue, stackOffset, pcodeOpAST); return result; }
Due to the deadcode
disable hack we’ll miss some strings that the register
style analysis script finds, but there’s a property of the string data blob we can use to help fill in the remaining gaps: the strings are stored in ascending length order. Using the known lengths of two identified strings and the size of the gap between them, we can sometimes uniquely identify what size strings could exist in the gap. The end result for the register
and hacked normalize
analysis styles is generally the same after performing the gap filler step.
Of course, these hacks are not ideal: the XML injection vector could be patched in a future release of Ghidra, and disabling all dead code removal leaves unwanted operations around. Assuming the decompiler analysis configuration interface might later be exposed through the Ghidra API, I checked which rules are necessary to produce the stack COPY operations. Adding the stackvars
base group to the register
root action is enough to produce the simplified COPY ops, but without adding the deadcode
base group as well, the problem of the leftover intermediate transformations remains.
I noticed an interesting command in the decompiler CLI called deadcode delay
. The class documentation for the command handler shows that it delays dead code elimination in a specific address space:
/// class IfcDeadcodedelay /// brief Change when dead code elimination starts: `deadcode delay/// /// An address space is selected by name, along with a pass number. /// Dead code elimination for Varnodes in that address space is changed to start /// during that pass. If there is a e current function, the delay is altered only for /// that function, otherwise the delay is set globally for all functions. `
Instead of disabling all dead code elimination, I could use this to disable dead code elimination for varnodes specifically in the stack space.
[decomp]> deadcode delay stack 40 Successfully overrided deadcode delay for single function [decomp]> decompile Clearing old decompilation Decompiling main.main ...
I haven’t included the trace output here because storevarnode
still creates the same COPY ops shown before; the ops are just never deleted by the end of the trace. This seemed promising, but the only way to use this feature outside of the debug CLI was to edit the current architecture’s compiler specification file, such as Ghidra/Processors/ARM/data/languages/ARM.cspec
for ARM. The decompiler will check the compiler_spec
element for a deadcodedelay
element with this format:
<> space="stack" delay="40"/>
This is yet another kludge, but it eliminates the problem with the intermediate COPY op forms. Here is the P-Code output with dead code delay enabled:
PrintHighPCode.java> Running... PCode for function main.main @ 0048e790 (simplification style: normalize) ... 0x48e7bf:0x26 (stack, 0xffffffffffffffb0, 8) COPY (ram, 0x5601b0, 8) 0x48e7cb:0x2a (stack, 0xffffffffffffffb8, 8) COPY (const, 0x4c584a, 8) 0x48e7d0:0x2d (stack, 0xffffffffffffffc0, 8) COPY (const, 0x15, 8) 0x48e7d9:0x30 (stack, 0xffffffffffffffc8, 8) COPY (const, 0x0, 8) 0x48e7e5:0x37 (stack, 0xffffffffffffffd0, 16) COPY (unique, 0x1000001b, 16) 0x48e7e5:0x79 (unique, 0x1000001b, 16) INT_ZEXT (const, 0x0, 8) 0x48e7ea:0x39 (stack, 0xffffffffffffffa0, 8) COPY (const, 0x48e7ef, 8) 0x48e7ea:0x3a --- CALL (ram, 0x4866f0, 8) ... PrintHighPCode.java> Finished!
The normalize
style analysis using deadcodedelay
performs at least as well as the original register
style analysis – in some cases better – and the implementation is much simpler. I can’t use the XML injection trick to enable this only when running an analysis script, however, so for now this is just information to aid future work.
Concluding Thoughts
Thinking beyond Go binary analysis or this specific string recovery problem, for automated program analysis I would generally like to have a medium-level form of lifted P-Code output that doesn’t depend on correct function signature, calling convention, or data structure definitions (along the lines of the Low Level and Medium Level IL concept in Binary Ninja). The ideal solution for analyzing data structure creation on the stack would be a normalize
-like simplification style that preserves stack operations that would otherwise be consumed by function call analysis. This is the main opportunity for future work I’m interested in.
For now, when implementing new P-Code analysis scripts where I have no need for source code output, I’d favor the decompiler’s normalize
simplification style. When the normalize
P-Code output is unsuitable due to issues like the one I explored in this blog post, I’d fall back to the register
style, at the cost of needing to reimplement some decompiler analysis passes in my Ghidra script.
Ghostrings Release
The Go string definition recovery scripts described in this article have been released as “Ghostrings”. See the tool release announcement post at https://research.nccgroup.com/2022/05/20/tool-release-ghostrings/. This release also includes the PrintHighPCode.java
script for decompiling a function with a certain simplification style and inspecting the P-Code output.
References
Golang Reverse Engineering
- https://rednaga.io/2016/09/21/reversing_go_binaries_like_a_pro/
- https://cujo.com/reverse-engineering-go-binaries-with-ghidra/
- https://www.sentinelone.com/labs/alphagolang-a-step-by-step-go-malware-reversing-methodology-for-ida-pro/
- https://blog.osiris.cyber.nyu.edu/2019/12/19/go-deepdive/
- https://blog.osiris.cyber.nyu.edu/2019/12/19/ugo-ghidra-plugin/
Ghidra Decompiler and P-Code Analysis
- https://www.riverloopsecurity.com/blog/2019/05/pcode/
- https://spinsel.dev/2021/04/02/ghidra-decompiler-debugging.html
For example, see the x86-64 string defining patterns listed in https://github.com/SentineLabs/AlphaGolang/blob/main/4.string_cast.py#L86↩︎
The “unique” space is “a pool of temporary registers that can hold data but that aren’t a formal part of the state of the processor”.↩︎
The raw
protoEvalModel
option string is copied directly into an XML string here.↩︎