tl;dr
eBPF (extended Berkeley Packet Filter) is slowly taking over as a programmatic way for (generally privileged) users to invoke Linux kernel APIs and performantly execute semi-arbitrary code without having to load it from a custom kernel module. eBPF is a general means to load memory-safe, restricted code that reduces the risk of the crashes, deadlocks, and infinite loops inherent to the kernel module alternative.
In this post, we describe how to effectively use eBPF to trace Linux kernel functions. We also discuss how we implemented our eBPF-based tracing tool that can sniff Unix domain sockets across an entire Linux host (this “impossible” task was what got us started with eBPF). You can install it with sudo -H pip install unixdump (or from our repo), but it requires BCC to be installed separately. If you just want to see some of what it can do, click here.
If you’re looking to get into building custom eBPF-based Linux kernel tracing tools, we recommend starting with BCC, busting out its reference guide, and pinning a tab to the Linux kernel codebase.
Background
Unix domain sockets1 are a core OS-provided IPC (inter-process communication) mechanism that enables processes on the same host to communicate through send(2)
/sendto(2)
– and recv(2)
/recvfrom(2)
-able file descriptors similarly to network sockets. Unix domain sockets use a file path (or in the case of Linux’s abstract namespace, a key string) as the bind(2)
/connect(2)
“address;” additionally, “unnamed” Unix domain sockets may be created in connected pairs through the socketpair(2)
syscall. Traditionally, Unix domain sockets are often used when applications or services require bidirectional communication between related processes or communication between unrelated ones that may enforce OS-backed permission checks (though the checks differ between Unixes2). Depending on the internal implementation, Unix domain sockets are often significantly more performant than loopback-networking as they do not go through the networking stack. One downside of this is that Unix domain sockets are extremely difficult to inspect as there is no simple interface or API for intercepting Unix domain socket traffic, such as the pcap(3)
APIs for network traffic. Instead, when needing to observe Unix domain socket traffic, one often resorts to interposing a forwarding daemon with a middle Unix domain socket that the connect(2)
/sendto(2)
-ing peer will actually communicate with. In the case where file descriptors are being passed or peer credentials are being validated, and it may be overly complicated to run such a daemon with the right process information, function hooking (typically LD_PRELOAD
-alikes) may be used to interdict libc stubs used to invoke the communication syscalls. Both methods have drawbacks and additionally require more than a cursory understanding of how an application is already using Unix domain sockets in order to intercept them. Additionally, neither method scales across an entire host and may only be used to individually intercept Unix domain socket traffic for single applications and services. Intercepting all Unix domain socket traffic across a host will require dumping the data directly from the kernel; we can do this by tracing the kernel.
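To make the addressing modes described above concrete, here is a minimal Python sketch (ours, not part of unixdump) that creates an unnamed connected pair with socketpair(2) and a named socket in Linux's abstract namespace. The abstract name used here is a hypothetical placeholder, and the abstract-namespace half only works on Linux.

```python
import os
import socket

# Unnamed, already-connected pair created through socketpair(2).
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
left.sendall(b"ping")
pair_reply = right.recv(4)
left.close()
right.close()

# Named socket in Linux's abstract namespace: the address starts with a NUL
# byte and never appears on the filesystem. The name itself is hypothetical.
name = b"\0example-abstract-" + str(os.getpid()).encode()
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(name)
server.listen(1)

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(name)          # queued in the listen backlog until accept()
conn, _ = server.accept()
client.sendall(b"hello")
named_reply = conn.recv(5)

for sock in (conn, client, server):
    sock.close()

print(pair_reply, named_reply)
```

Neither exchange touches the network stack, and neither is visible to pcap-style tooling, which is the inspection gap discussed above.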
Kernel Tracing and Instrumentation
Generally, kernel tracing utilities enable one to obtain very basic information about the execution within the OS kernel. Most implementations provide the ability to dump metadata about executing functions, including their arguments and return values. Often, this data is very limiting as the relevant information is embedded within structs and arrays for which only the pointer address is returned. Fewer OS kernels directly provide instrumentation APIs that enable deeper information to be gleaned from executing functions and kernel memory. The gold standard for such functionality is DTrace. DTrace was originally developed for Solaris, and has since been ported to FreeBSD and macOS (where, unfortunately, it has been purposefully weakened to align with other DRM-related changes and will deny attempts to trace binaries from core system directories). DTrace has also been ported to Linux twice: once as a /proc/kallsyms
-based kernel module that implements its own function hooking, and, more recently, by Oracle, who relicensed it under the GPL (for kernel code). Unfortunately, the former is essentially unmaintained and fails on current Linux distributions, and the latter is effectively locked to Oracle Linux at the moment and it is unclear if it will be accepted into upstream.
Due to the long period without DTrace on Linux and the eventual realization of the need to have useful debugging features within the kernel, Linux has played fast and loose with a number of tracing frameworks that mostly constitute dead ends. Chief among the failures is SystemTap, which has a painful installation process requiring installation of kernel debug symbols and uses a frustrating NIH clone of the DTrace scripting syntax to dynamically generate kernel modules. Needless to say, one is probably better off writing their own function hooking kernel modules. Sysdig, on the other hand, is a useful tool (but doing anything fancy might involve Lua scripting…). Unfortunately, it is not capable of drilling down into arbitrary kernel structures in memory as it is based entirely on Linux’s built-in tracepoint definitions (primarily those for syscalls and process scheduling events), which provide only specific pre-defined values that the kernel’s developers felt would be useful for debugging and tracing kernel functionality.
Linux’s Lego Problem
Linux’s in-kernel tracing features are very similar to other facets of modern Linux, specifically containers. Linux slowly gained a number of “namespacing” features that were eventually composed to form the “concept” of “containers,” an ape of purpose-built sandboxing technology such as FreeBSD’s jails and Solaris’ zones. Containers have only recently started to see legitimate security hardening materialize for normal users within the past few years, most notably with the introduction of user namespaces, a feature feared by Linux distros, but which wipes out a number of container escapes by isolating privileges to the container’s namespaces. Jessie Frazelle sums up the “concept” distinction nicely in her infamous blog post comparing them. We will only add that if you decide to make an ocean out of the Death Star set, it will cost $500 and your ocean will be a murky gray.
The Linux kernel has several different inter-related features that support dynamic instrumentation and tracing (for a more thorough introduction, see Julia Evans’ overview on Linux’s tracing systems):
- kprobes: Kernel API via register_kprobe(struct kprobe*) that can register callback functions to handle a breakpoint injected at an arbitrary memory address (typically the start of a function). The handlers are provided a struct pt_regs* containing the architecture-specific register values.
- ftrace: A function tracing API provided by the Linux kernel built on lower-level APIs such as kprobes and tracepoints, that provides a filesystem-based userland API (debugfs) to configure and enable various tracing and profiling operations.
- perf (aka perf_events): Kernel API for hardware performance monitoring counters (e.g. number of instructions executed), timed sampling (e.g. find where in the callstack time is spent), and userspace mapped ring buffers via perf_event_open(2)/mmap(2) syscalls.
- tracepoints: Kernel API with tracepoints declared through the TRACE_EVENT macro, inserted via calls to the tracepoint “function,” and callbacks registered through tracepoint_probe_register(struct tracepoint*,void*,void*). The TRACE_EVENT macro creates metadata useful for perf and ftrace to instrument by tracepoint name.
Some of these are implemented in terms of each other, and several of their subsystems interact with or support each other. But, up until recently, if you wanted to directly interact with any of these things in a meaningful way, you needed to write a bunch of GNU-flavored C in a kernel module and implement your own scheme for communicating data back to userspace (assuming you want something more performant than printk
). This is essentially how every single tracing “framework” is implemented, and one of the common things they seem to share is that they all tend to set up their own ring buffer from scratch that is mmap(2)
’d using a file descriptor obtained by open(2)
-ing a special path registered by the module (implementations vary heavily, from use of cdev_add
to mounted in-memory filesystems).
eBPF: Lego Grout
eBPF (extended Berkeley Packet Filter) is an in-kernel JIT-ing virtual machine that adds extra computational resources (more registers, direct ISA mapping to major CPU architectures, and fast C interop with internal Linux kernel functionality)3 to the classic BPF virtual machine and bytecode instruction set; it is referred to as bpf
in the Linux source, APIs, syscalls, etc. eBPF is slowly taking over as a “programmatic” way for users, often privileged ones, to invoke Linux kernel APIs and execute “performant” code in kernel space (which limits context switches) without necessitating the development and loading of a custom kernel module in an unsafe or dangerous language. eBPF is not specifically a tracing or instrumentation feature, but a general means to load memory-safe, restricted code that reduces the risk of the crashes, deadlocks, and infinite loops inherent to the kernel module alternative. Given that eBPF itself has already introduced vulnerabilities (though, being honest, what revolutionary new feature hasn’t had bugs?), it is exacerbating the failings of the Linux capability “model” and raising concerns about the balance between functionality and security. But if the intention behind eBPF is to keep mortals from writing buggy kernel modules, it may well succeed and improve the security status quo in doing so.
eBPF is slowly but surely becoming a framework within the kernel with an ever-increasing menagerie of features, programming capabilities, kernel-backed (and userland-mapped) data structures, helper functions (for which an in-eBPF implementation would otherwise violate the safety restrictions), and pluggable APIs. These include multiple forms of packet and socket filtering and processing, an interesting API for adding encapsulation to packets based on route tables, shenanigans to hook bind(2)
to “fix” broken apps that run in containers, and multiple APIs to attach them to all manner of kernel-based tracing and instrumentation mechanisms. The last of these is most relevant to our purposes, but we will nonetheless remind folks to stay safe and make sure to properly account for variable-length headers when processing packets with eBPF.
BCC: C to eBPF By Way of the Long Way Round
BCC (BPF Compiler Collection) is a toolkit for having userland code (generally Python) interact with kernel space eBPF code, and includes an LLVM-based cross-compilation toolchain that compiles C code into eBPF bytecode. At a very high level, this toolchain is based on using Clang’s RecursiveASTVisitor
AST traversal library to modify the C code into a “suitable” format and then use LLVM’s eBPF backend to emit the bytecode. These modifications exist primarily to replace external memory accesses with equivalent eBPF memory accessing helper functions and expand other simplified C coding constructs such as BCC library functions and “magic” semantics introduced by BCC that are used to denote the eBPF attach target (e.g. kprobe__funcname
to attach the eBPF-compiled function as a kprobe hook on funcname
). For a slightly deeper dive into how BCC C code works and how eBPF kprobes are registered, see our talk given recently at the 35C3 conference.
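As an illustration of the kprobe__funcname convention described above, the following is a minimal BCC sketch, not taken from unixdump. It assumes BCC is installed, that it is run as root, and that the running kernel exposes unix_stream_sendmsg with the (struct socket *, struct msghdr *, size_t) signature; the choice of that function as a hook target is our assumption for the example.

```python
# Minimal BCC sketch. Assumptions: BCC is installed, the caller is root, and
# the kernel exports unix_stream_sendmsg(struct socket *, struct msghdr *, size_t).
BPF_TEXT = r"""
#include <linux/net.h>
#include <linux/socket.h>

// The kprobe__ prefix tells BCC to attach this function as a kprobe on
// unix_stream_sendmsg; the arguments after ctx mirror the hooked function's.
int kprobe__unix_stream_sendmsg(struct pt_regs *ctx, struct socket *sock,
                                struct msghdr *msg, size_t len)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    bpf_trace_printk("unix stream send pid=%d len=%lu\n", pid, len);
    return 0;
}
"""


def main():
    # Imported lazily so this file can be loaded on hosts without BCC.
    from bcc import BPF

    bpf = BPF(text=BPF_TEXT)   # compiles the C above and attaches the kprobe
    bpf.trace_print()          # streams bpf_trace_printk output until Ctrl-C
```

Calling main() as root should print one line per stream send on a Unix domain socket. bpf_trace_printk is only a debugging channel; a real tool would use a perf output buffer instead.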
BCC and eBPF Bytecode Validator Hell
To make eBPF “safe,” the Linux kernel validates all eBPF code before loading it. For example, eBPF code is not allowed to “loop” (to prevent infinite loops), so any attempt to run code at an address/offset before the currently running eBPF instruction will be deemed illegal (this restriction has been weakened slightly in newer versions of Linux to reduce code bloat, but is still enforced by default). Additionally, the eBPF validator may detect loops when there are none in the source code; this can happen due to compiler optimizations or because of faulty identification of normal function calling operations as being loop-like. As such, BCC C often uses unrolled loops and inlined functions. The eBPF validator additionally has a number of call-site validations and register taint tracking logic that attempt to ensure that helper functions, such as those used to manipulate memory-mapped tables and access kernel memory, are only passed “safe” argument values. This “logic” is problematic as it is often not thorough enough to properly determine value bounds. This problem is further complicated by the fact that BCC compiles code with -O2; most naïve attempts to make such bounds “more obvious” are likely to be optimized out by Clang. Additionally, updating BCC (and possibly the Linux kernel) may result in slightly different bytecode output that trips the validator. However, this is generally not the case for very simple code, such as that of the tools and examples within the BCC repo itself. We have also observed errors when using certain Linux/BCC versions where the use of a bool
function parameter was not tolerated in certain variants of our code (e.g. different filtering comparisons being applied) and integer types were not tolerated in the others; we originally had to solve this by using #ifdef
magic to control the parameter’s type depending on the variant of the code until a BCC update unbroke it. These issues are so pervasive that the BCC developers themselves appear to believe that certain tolerable code constructs, such as variable-length byte copies, are not possible in eBPF because the idiomatic C code is not accepted by the eBPF validator.
While these issues present challenges when attempting to develop portable BCC/eBPF-based tooling, it is useful to be able to disable these inane validations when simply attempting to quickly trace a kernel function and extract some interesting data. Unfortunately, the current validator implementation suffers from high coupling-low cohesion as the validation routine itself pre-processes the bytecode and configures the internal kernel data structures responsible for running it. As a result, the validation routine cannot be bypassed directly with a NOP
or by stomping over its implementation with a return 0
. Instead, one has to individually clip the strings of the eBPF validator’s golden fiddle by performing a number of nigh-surgical function hooks that will both bypass lower-level state validation checks and override registers with safe bound values. We have implemented a set of such hooks that have bypassed the more pernicious and maddening errors that we have experienced while writing our Unix domain socket sniffer tool. While we do not recommend using it in production as it can definitely lead to unstable and, more importantly, unsafe eBPF code, our yolo-ebpf
kernel module can help in a pinch when attempting to reverse engineer applications on the fly. And if you still happen to hit an incorrect eBPF validator error while using it, please send us an issue. It is disappointing that such tomfoolery is needed in the first place, but eBPF and BCC are both relatively new and these things take time.
unixdump: An eBPF-based Unix Domain Socket Sniffer
unixdump
is a full-featured utility for passively capturing Unix domain socket traffic from Linux hosts built on top of eBPF and BCC. It can capture all traffic across a host, including file descriptor transfers and Unix credential passes. unixdump
supports fine-grained filtering based on Unix domain socket paths (including abstract namespace keys) and PIDs, and can perform both inclusive and exclusive filtering of PIDs. unixdump
additionally supports outputting to readable log files amenable to extracting binary content (We are currently looking into outputting to the pcapng format, which can support ancillary data, but performantly and accurately timestamping events may pose a challenge).
Design and Implementation
As with other BCC-based tools, our userland event handling code is written in Python and our kernel space kprobe hook that generates events is written in C. Essentially, the C code is what performs the important operations; in our case, this is the retrieval and filtering of metadata and content from sockets and other kernel structures. This code then marshals the event data into a struct that is unpacked on the Python side. The Python code then processes the event stream into a more user-friendly data output.
This flow is implemented through the use of two ring buffers, one the perf_event
ring buffer, and the other a custom ring buffer built on top of an eBPF map. Events are pushed to userspace through the perf_event
ring buffer using perf_submit
calls in the C code. The Python userspace code constantly poll(2)
s file descriptors associated with these ring buffers to detect event submissions. The Python code then attempts to read the rest of the data from the asynchronously updated custom ring buffer mapped into userspace. Following this, the Python code processes the data and clears the custom ring buffer entry.
In unixdump
, we are extracting the data sent over Unix domain sockets at one of the lowest possible levels, from the internal msghdr
structs holding them. When the send
syscall is invoked on a Unix domain socket, a msghdr
parameter, msg
, gets passed along. The data in the msghdr
struct is contained within another structure, iov_iter
, that is embedded into the msghdr
as its msg_iter
field. iov_iter
s can wrap several kernel buffer structures, but in our case, it uses the const struct iovec* iov
union variant, which is a simple structure that contains a buffer base pointer, iov_base
, and a buffer length, iov_len
, that together refer to our Unix domain socket message content.
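The base-pointer-plus-length shape of an iovec can be mirrored in userspace. The sketch below is only an analogy built with ctypes (the IoVec class and gather helper are hypothetical names of ours); it does not touch kernel memory, but it walks an array of iovec-like entries the same way one would conceptually follow msg_iter to the message bytes.

```python
import ctypes


class IoVec(ctypes.Structure):
    """Userspace mirror of struct iovec: a base pointer plus a length."""
    _fields_ = [("iov_base", ctypes.c_void_p), ("iov_len", ctypes.c_size_t)]


def gather(iov_array, count):
    """Concatenate the bytes referenced by the first `count` iovec entries."""
    return b"".join(
        ctypes.string_at(iov_array[i].iov_base, iov_array[i].iov_len)
        for i in range(count)
    )


chunks = [b"hello ", b"world"]
# Keep the backing buffers referenced so the pointers below stay valid.
buffers = [ctypes.create_string_buffer(chunk) for chunk in chunks]

iov = (IoVec * len(chunks))()
for i, (chunk, buf) in enumerate(zip(chunks, buffers)):
    iov[i].iov_base = ctypes.addressof(buf)
    iov[i].iov_len = len(chunk)

message = gather(iov, len(chunks))
print(message)
```

In the kernel, each of those reads would have to go through bpf_probe_read() rather than a direct dereference.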
We extract this data using the bpf_probe_read()
helper function, which acts as a “safe” memcpy
enabling eBPF programs to read arbitrary kernel memory into their own memory space. An interesting quirk of how BCC works is that function calls to bpf_*
functions, which are part of the kernel’s eBPF API, and other BCC-specific helper functions/methods (yes, “methods”) are rewritten using an LLVM-based code generation pass. This enables the helper functions to be translated into the appropriate bpf_call
instructions and is additionally used to translate all kernel memory dereferences into bpf_probe_read()
calls.
Unix domain sockets also allow processes to pass file descriptors to one another (SCM_RIGHTS
), and authenticate their identity (or act on behalf of another) by passing user credentials (SCM_CREDENTIALS
) through the kernel. This “ancillary data” takes the form of several cmsghdr
structures and CMSG_DATA
payloads lined up within a single byte buffer blob. This blob is pointed to by the void* msg_control
field of the msghdr
struct and the size_t msg_controllen
field specifies the total size. To differentiate and identify the raw contents of the CMSG_DATA
payloads, the cmsghdr
struct stores metadata about the type and size of the data. For example, if the int cmsg_type
field is SCM_RIGHTS
, the particular CMSG
is being used to pass file descriptors. An interesting quirk of the CMSG
system in the kernel is that separate CMSG
objects of the same type will be combined into one CMSG
observed by the receiver. Like most ad-hoc data structures in the Linux kernel codebase, CMSG
blobs are not simple to parse given eBPF’s constraints. In particular, these blobs are typically iterated through by using multiple layers of pointer shifting macros that embed a for-loop construct to iterate the initially unknown number of CMSG
objects; it is worth keeping in mind that the msghdr.msg_controllen
field refers to the byte length of the whole CMSG
blob, and is used to ensure that CMSG
objects are not iterated or processed past the end of the buffer allocated to them. To get around the eBPF limitations, we use CLI flag-based tunables to guide (hacky string concat) code generation of C code that statically iterates these blobs, if present, and copies the metadata and typing information into our ring buffer; we feel this was still less painful to implement than it would have been to copy the entire blob to userspace and process it in Python.
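For readers who have not walked a CMSG blob by hand, here is a userspace Python sketch of the layout described above. It is not the generated C that unixdump uses. It assumes the Linux struct cmsghdr layout (size_t cmsg_len; int cmsg_level; int cmsg_type), the Linux value 2 for SCM_CREDENTIALS, and a hypothetical max_entries bound that mimics a statically unrolled loop.

```python
import socket
import struct

# Assumed Linux layout: size_t cmsg_len; int cmsg_level; int cmsg_type.
CMSGHDR = struct.Struct("@Nii")
ALIGN = struct.calcsize("@N")          # CMSG_ALIGN rounds up to sizeof(size_t)
SCM_CREDENTIALS = 2                    # Linux value, hard-coded as an assumption


def cmsg_align(length):
    return (length + ALIGN - 1) & ~(ALIGN - 1)


HDR_LEN = cmsg_align(CMSGHDR.size)


def build_cmsg(level, ctype, payload):
    """Serialize one cmsghdr plus payload, padded like CMSG_SPACE()."""
    cmsg_len = HDR_LEN + len(payload)
    raw = CMSGHDR.pack(cmsg_len, level, ctype).ljust(HDR_LEN, b"\0") + payload
    return raw.ljust(cmsg_align(cmsg_len), b"\0")


def parse_cmsgs(blob, max_entries=4):
    """Walk the blob with a fixed iteration bound, never reading past its end."""
    entries, offset = [], 0
    for _ in range(max_entries):
        if offset + HDR_LEN > len(blob):
            break
        cmsg_len, level, ctype = CMSGHDR.unpack_from(blob, offset)
        if cmsg_len < HDR_LEN or offset + cmsg_len > len(blob):
            break                       # malformed or truncated entry
        entries.append((level, ctype, blob[offset + HDR_LEN:offset + cmsg_len]))
        offset += cmsg_align(cmsg_len)
    return entries


blob = (
    build_cmsg(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack("@2i", 3, 4))
    + build_cmsg(socket.SOL_SOCKET, SCM_CREDENTIALS, struct.pack("@iII", 1234, 0, 0))
)
entries = parse_cmsgs(blob)
fds = struct.unpack("@2i", entries[0][2])
creds = struct.unpack("@iII", entries[1][2])
print(fds, creds)
```

The length check against the end of the blob plays the role that msg_controllen plays in the kernel macros.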
BCC, providing a userland interface on top of a kernel-only one, enables BCC C code to specify I/O data structures that map to the ones in
through the use of BCC-provided macros. The primary benefit of these structures is that they may allocate much more storage space than is otherwise provided on the eBPF stack and they are considered a valid copy target for reading arbitrary kernel memory (there are some inconsistent “protections” around copying pointer addresses directly onto the eBPF stack). unixdump
uses a BPF_PERCPU_ARRAY()
to store (potentially large) message content as it enables easier iterating of ring buffer slots. For simpler one-off event notifications, BCC C supports setting and registering a perf_event
output ring buffer-based output struct through BPF_PERF_OUTPUT()
; the perf_submit()
helper function may then be called on the output object declared by the macro. This function call is actually translated into a bpf_perf_event_output()
helper function call through BCC’s code generation.
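The two macros mentioned above can be combined roughly as follows. This is a simplified sketch under our own assumptions, not unixdump's actual code: the struct names, the 4096-byte slot size, the single-slot array, and the unix_stream_sendmsg hook are all hypothetical choices, and the message copy itself is left as a comment.

```python
# Sketch only. Assumptions: BCC installed, root privileges, and a kernel that
# exposes unix_stream_sendmsg. Struct names and sizes are illustrative.
BPF_TEXT = r"""
#include <linux/net.h>
#include <linux/socket.h>

struct event_t {            // small metadata record sent through perf
    u32 pid;
    u32 slot;
    u64 len;
};

struct slot_t {             // larger storage that would not fit on the eBPF stack
    u8 data[4096];
};

BPF_PERF_OUTPUT(events);
BPF_PERCPU_ARRAY(slots, struct slot_t, 1);

int kprobe__unix_stream_sendmsg(struct pt_regs *ctx, struct socket *sock,
                                struct msghdr *msg, size_t len)
{
    int idx = 0;
    struct slot_t *slot = slots.lookup(&idx);
    if (!slot)
        return 0;           // the validator insists on this NULL check

    // A real tool would bpf_probe_read() message bytes into slot->data here.

    struct event_t ev = {};
    ev.pid = bpf_get_current_pid_tgid() >> 32;
    ev.slot = idx;
    ev.len = len;
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


def main():
    from bcc import BPF  # lazy import so the module loads without BCC

    bpf = BPF(text=BPF_TEXT)

    def on_event(cpu, data, size):
        ev = bpf["events"].event(data)   # BCC deserializes struct event_t
        print(cpu, ev.pid, ev.slot, ev.len)

    bpf["events"].open_perf_buffer(on_event)
    while True:
        bpf.perf_buffer_poll()
```

Only the small event_t record travels through the perf ring buffer; the bulky payload stays in the map, which is the split the surrounding text describes.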
Dividing our message content and event metadata across these two storage mechanisms enables us to better tune memory usage and to detect, mitigate, and report when unixdump
is bottlenecking against the system. One major difference between these two I/O mechanisms is that BPF_PERF_OUTPUT()
-registered data structures will be automatically parsed/deserialized by BCC, whereas BCC-registered tables/arrays/maps will be provided to registered event handlers as raw byte buffers, necessitating the use of custom Python ctypes
parsing logic. However, one major pain point to be aware of with this behavior of BCC is that in the former case, char[]
fields will be parsed as NUL-terminated C strings, and all data after a NUL byte will be lost; it may not be recovered by using ctypes.string_at
. The solution is to use uint8_t[]
for non-C string data, as it will result in BCC reading all of the data.
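The truncation is easy to reproduce with plain ctypes, independent of BCC. In this small sketch the two structure classes are hypothetical stand-ins for what an event struct with a char[] or uint8_t[] field would look like on the Python side.

```python
import ctypes


class CharEvent(ctypes.Structure):
    # Stand-in for an event struct declaring `char data[8]`.
    _fields_ = [("data", ctypes.c_char * 8)]


class ByteEvent(ctypes.Structure):
    # Stand-in for the same struct declaring `uint8_t data[8]`.
    _fields_ = [("data", ctypes.c_uint8 * 8)]


raw = b"ab\x00cd\x00ef"   # binary payload containing embedded NUL bytes

as_string = CharEvent.from_buffer_copy(raw).data          # stops at first NUL
as_bytes = bytes(ByteEvent.from_buffer_copy(raw).data)    # keeps all 8 bytes

print(as_string, as_bytes)
```

Field access on a c_char array yields a bytes object cut at the first NUL, while the c_uint8 array preserves every byte.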
When writing unixdump
, we quickly learned that display server traffic (e.g. X11) for terminal applications goes over Unix domain sockets. A naive implementation would quickly result in a feedback loop that would suck up memory and CPU resources. Since we wanted to avoid locking up the system, and since we also wanted to capture tunable amounts of data larger than the eBPF stack size (since eBPF cannot perform dynamic allocations), we went with a CLI-configurable ring buffer for content storage. The current implementation will simply drop events (but notify userland of the drop with additional metadata) if the ring buffer slot is still in use by the time it is needed again. We also do not directly perf_submit
the large slots of the ring buffer as this resulted in a large number of kernel-dropped perf events. Instead we perf_submit
smaller event metadata, which includes information like PIDs, socket paths, and the index of the ring buffer slot synced to userspace using the bpf_map_*
APIs. This results in a slight “race condition” in that the ring buffer slot may not yet be accessible to userspace by the time we attempt to access it. However, this delay is not subject to the ABA problem or any similar use-after-free-like issues as the userland pages will have always been cleared by the userspace code prior to being updated by the kernel. We use a few fallback mechanisms to poll at it depending on whether or not event order preservation is necessary, but will give up after a few tries as we have observed complete losses of the data in some circumstances where the slot never updates. It is currently unclear if this is due to a flaw in BCC or the Linux kernel itself.
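The drop-when-busy behavior can be modeled in a few lines of pure Python. This is a toy simulation under our own assumptions (the SlotRing class is hypothetical, and advancing the write index even on a drop is a simplification we chose), not a description of unixdump's exact bookkeeping.

```python
class SlotRing:
    """Toy model of a fixed-size content ring where busy slots cause drops."""

    def __init__(self, slots):
        self.in_use = [False] * slots
        self.data = [b""] * slots
        self.next_index = 0
        self.dropped = 0

    def produce(self, payload):
        """Kernel side: claim the next slot, or drop if it is still in use."""
        idx = self.next_index
        self.next_index = (self.next_index + 1) % len(self.in_use)
        if self.in_use[idx]:
            self.dropped += 1      # userland would still be told about the drop
            return None
        self.in_use[idx] = True
        self.data[idx] = payload
        return idx

    def consume(self, idx):
        """Userland side: read the slot, then clear it for reuse."""
        payload = self.data[idx]
        self.data[idx] = b""
        self.in_use[idx] = False
        return payload


ring = SlotRing(2)
first = ring.produce(b"one")
second = ring.produce(b"two")
third = ring.produce(b"three")     # slot 0 is still busy, so this is dropped
drained = [ring.consume(first), ring.consume(second)]
fourth = ring.produce(b"four")     # lands in slot 1, which is free again
print(first, second, third, fourth, ring.dropped)
```

Because the consumer always clears a slot before the producer may reuse it, a reader never sees a stale payload from an earlier event in this model.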
For better throughput, we perform various checks to determine whether it is worth it to continue processing. For example, we will bail out early if various validation checks do not pass (e.g. if certain metadata is missing or unexpected). On top of this, we provide a number of in-kernel filters to reduce and refine the amount of processing done in the kernel. Users can filter on specific Unix domain socket paths (or match path prefixes) and PIDs. It is also possible to exclude certain noisy PIDs altogether (e.g. the GUI terminal process rendering the output of its own Unix socket communication with the display server, or the display server itself). Using the filters will reduce the amount of data copied to the fixed-size perf_events
ring buffer and therefore help to prevent it from overfilling and dropping (“missing” in the Linux parlance) events that cannot fit. We also support configuring the size of this buffer should stable throughput still be too much for the default size.
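The filtering decisions described above amount to a short predicate. The sketch below expresses one plausible ordering of the checks in Python; the function name, parameters, and precedence (exclusions first) are our assumptions, and the real filters run in the kernel-side C rather than in Python.

```python
def should_capture(pid, peer_pid, path, *, prefixes=(), include_pids=(), exclude_pids=()):
    """Return True if an event passes the (hypothetical) filter configuration."""
    # Excluded PIDs win over everything else, e.g. a noisy GUI terminal.
    if pid in exclude_pids or peer_pid in exclude_pids:
        return False
    # If an include list is given, at least one endpoint must be on it.
    if include_pids and pid not in include_pids and peer_pid not in include_pids:
        return False
    # If path prefixes are given, the socket path must match one of them.
    if prefixes and not any(path.startswith(prefix) for prefix in prefixes):
        return False
    return True


frida_event = should_capture(26525, 26527, "/tmp/frida-abc/pipe", prefixes=("/tmp/frida",))
x11_event = should_capture(900, 1200, "/tmp/.X11-unix/X0", prefixes=("/tmp/frida",))
terminal_event = should_capture(1200, 900, "/tmp/frida-abc/pipe", exclude_pids=(1200,))
print(frida_event, x11_event, terminal_event)
```

Rejecting an event this early is what keeps it from being copied into the fixed-size perf ring buffer at all. The socket path in the example is made up for illustration.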
Case Study: Sniffing Frida C2 Traffic
Frida is a popular “cross-platform dynamic instrumentation toolkit” that injects a JavaScript interpreter into a target process and uses it to run a semi-DSL of JavaScript function hooking code. The impetus for unixdump
was part of a greater desire to answer a seemingly simple question: “How does Frida work?” More specifically, we were looking to find out how Frida’s agent communication protocol works. At a high level, Frida works by attaching to a target process using platform-specific debugging APIs (i.e. ptrace(2)
on Linux, task_for_pid()
/mach_vm_*()
on macOS, and OpenProcess()
/VirtualAllocEx()
/WriteProcessMemory()
/CreateRemoteThread()
on Windows), and then uses them to inject an “agent” that runs within the target. This agent is what runs the instrumenting JavaScript code and performs the lower-level operations invoked by it (e.g. function hooking, memory reads/writes, etc.). While Frida does support non-interactively loading a single JavaScript file, its primary mode of operation involves the use of a “client” process that interacts with the agent running within the target. In addition to the protocol used for direct attachment, Frida also supports having the client connect to a frida-server
instance that makes direct attachments to targets. While we are interested in the goings-on of the direct attachment/connection case, it is worth noting that a frida-server
can be loaded into a given process through frida-gadget
, and that the frida-server
and client libraries support several protocols to enable remote attachment to hosts over TCP and mobile devices using a TCP-forwarding mechanism to connect to a frida-server
on a USB-attached device (e.g. ADB’s TCP forwarding for Android, and usbmuxd
TCP forwarding for iOS).
Through strace(1)
-ing the client, it became clear early on that the direct attachment communication protocol had to be transported over Unix domain sockets with dynamically-generated paths. The problem was that multiple such Unix domain sockets are created, and it wasn’t clear which ones were being used. Additionally, because the Frida client is still ptrace(2)
-ing the target, we cannot simply strace(1)
it, as strace(1)
uses ptrace(2)
and a process can only be ptrace(2)
-d by one tracer at a time. While we could have tried to hack up a sniffer by instrumenting the Frida client itself to hook its Unix domain socket I/O, this was not an ideal solution for a number of reasons, and we instead tried to solve the Unix domain socket traffic sniffing problem once and for all (on Linux at least). After we got the MVP version of the eBPF hooks running, it quickly became obvious that Frida uses DBus to serialize custom API calls over Unix domain sockets. In fact, outside of specific protocols to initialize direct connections between Frida clients, servers, and targets, pretty much all of Frida’s communications use the DBus protocol.
Frida Agent Script Communication Protocol
Using the code in the first example defined in the Frida documentation, we demonstrate unixdump
’s ability to intercept Unix domain socket traffic. Knowing (from strace(1)
) that Frida’s Unix domain socket paths begin with /tmp/frida
, we can instruct unixdump
to filter for messages starting with that path name:
sudo unixdump -s '/tmp/frida' -b
We then proceed to start our target binary, hello
, and our Frida hook script, hook.py
, passing the latter the function address printed by the hello
process:
./hello
./hook.py $FUNCTION_VALUE
- When Frida starts, it begins authenticating via DBus and the hooked process begins to send Unix credentials to identify itself (in this case,
hello
was run as root):
Output
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 1
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
ancillary data sent (attempted): 1 CMSG observed
SCM_CREDENTIALS: pid=26525 uid=0(root) gid=0(root)
----
00000000: 00 .
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 6
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 0D 0A AUTH..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 46
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 52 45 4A 45 43 54 45 44 20 45 58 54 45 52 4E 41 REJECTED EXTERNA
00000010: 4C 20 41 4E 4F 4E 59 4D 4F 55 53 20 44 42 55 53 L ANONYMOUS DBUS
00000020: 5F 43 4F 4F 4B 49 45 5F 53 48 41 31 0D 0A _COOKIE_SHA1..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 18
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 41 55 54 48 20 45 58 54 45 52 4E 41 4C 20 33 30 AUTH EXTERNAL 30
00000010: 0D 0A ..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 37
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 4F 4B 20 36 37 36 39 37 34 36 38 37 35 36 32 32 OK 6769746875622
00000010: 65 36 33 36 66 36 64 32 66 36 36 37 32 36 39 36 e636f6d2f6672696
00000020: 34 36 31 0D 0A 461..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 19
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 4E 45 47 4F 54 49 41 54 45 5F 55 4E 49 58 5F 46 NEGOTIATE_UNIX_F
00000010: 44 0D 0A D..
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 15
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 41 47 52 45 45 5F 55 4E 49 58 5F 46 44 0D 0A AGREE_UNIX_FD..
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 7
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 42 45 47 49 4E 0D 0A BEGIN..
- Afterwards, Frida performs a
GetAll
request of the DBus properties:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 156
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 01 00 01 24 00 00 00 01 00 00 00 68 00 00 00 l...$.......h...
00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 1E 00 00 00 ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS
00000030: 65 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 00 00 essionProvider..
00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00 ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65 ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00 1F 00 00 00 72 65 2E 66 perties.....re.f
00000080: 72 69 64 61 2E 41 67 65 6E 74 53 65 73 73 69 6F rida.AgentSessio
00000090: 6E 50 72 6F 76 69 64 65 72 31 32 00 nProvider12.
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 151
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 01 00 01 1F 00 00 00 01 00 00 00 68 00 00 00 l...........h...
00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 19 00 00 00 ..g..s....o.....
00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 43 /re/frida/AgentC
00000030: 6F 6E 74 72 6F 6C 6C 65 72 00 00 00 00 00 00 00 ontroller.......
00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00 ..s.....GetAll..
00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65 ..s.....org.free
00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F desktop.DBus.Pro
00000070: 70 65 72 74 69 65 73 00 1A 00 00 00 72 65 2E 66 perties.....re.f
00000080: 72 69 64 61 2E 41 67 65 6E 74 43 6F 6E 74 72 6F rida.AgentContro
00000090: 6C 6C 65 72 31 32 00 ller12.
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48
command[26525]: './hello'
command[26527]: 'python ./hook.py 0x55bd9092f155'
----
00000000: 6C 02 01 01 08 00 00 00 02 00 00 00 18 00 00 00 l...............
00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}......
00000020: 05 01 75 00 01 00 00 00 00 00 00 00 00 00 00 00 ..u.............
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 48
command[26527]: 'python ./hook.py 0x55bd9092f155'
command[26525]: './hello'
----
00000000: 6C 02 01 01 08 00 00 00 02 00 00 00 18 00 00 00 l...............
00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}......
00000020: 05 01 75 00 01 00 00 00 00 00 00 00 00 00 00 00 ..u.............
- Frida then instructs the script to open and waits for a confirmation that the open succeeded:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 03 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 04 00 00 00 ovider....s..... 00000050: 4F 70 65 6E 00 00 00 00 02 01 73 00 1F 00 00 00 Open......s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 04 00 00 00 03 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 06 00 00 00 ovider....s..... 00000050: 4F 70 65 6E 65 64 00 00 02 01 73 00 1F 00 00 00 Opened....s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 04 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 03 00 00 00 ..g.......u..... 
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 148 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 1C 00 00 00 04 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 01 73 00 00 01 01 6F 00 18 00 00 00 ..g..s....o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 06 00 00 00 47 65 74 41 6C 6C 00 00 ..s.....GetAll.. 00000050: 02 01 73 00 1F 00 00 00 6F 72 67 2E 66 72 65 65 ..s.....org.free 00000060: 64 65 73 6B 74 6F 70 2E 44 42 75 73 2E 50 72 6F desktop.DBus.Pro 00000070: 70 65 72 74 69 65 73 00 17 00 00 00 72 65 2E 66 perties.....re.f 00000080: 72 69 64 61 2E 41 67 65 6E 74 53 65 73 73 69 6F rida.AgentSessio 00000090: 6E 31 32 00 n12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 48 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 08 00 00 00 05 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 05 61 7B 73 76 7D 00 00 00 00 00 00 ..g..a{sv}...... 00000020: 05 01 75 00 04 00 00 00 00 00 00 00 00 00 00 00 ..u.............
- Following this, Frida creates the JavaScript to be injected:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 251 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 83 00 00 00 05 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 02 73 73 00 01 01 6F 00 18 00 00 00 ..g..ss...o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 0C 00 00 00 43 72 65 61 74 65 53 63 ..s.....CreateSc 00000050: 72 69 70 74 00 00 00 00 02 01 73 00 17 00 00 00 ript......s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 31 32 00 00 00 00 00 00 00 00 00 ssion12......... 00000080: 76 00 00 00 0A 49 6E 74 65 72 63 65 70 74 6F 72 v....Interceptor 00000090: 2E 61 74 74 61 63 68 28 70 74 72 28 22 39 34 32 .attach(ptr("942 000000A0: 37 32 36 36 32 37 32 39 30 34 35 22 29 2C 20 7B 72662729045"), { 000000B0: 0A 20 20 20 20 6F 6E 45 6E 74 65 72 3A 20 66 75 . onEnter: fu 000000C0: 6E 63 74 69 6F 6E 28 61 72 67 73 29 20 7B 0A 20 nction(args) {. 000000D0: 20 20 20 20 20 20 20 73 65 6E 64 28 61 72 67 73 send(args 000000E0: 5B 30 5D 2E 74 6F 49 6E 74 33 32 28 29 29 3B 0A [0].toInt32());. 000000F0: 20 20 20 20 7D 0A 7D 29 3B 0A 00 }.});.. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 44 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 04 00 00 00 06 00 00 00 18 00 00 00 l............... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 05 01 75 00 05 00 00 00 01 00 00 00 ..u.........
- Frida then signals the agent to load the script:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 06 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0A 00 00 00 ..........s..... 00000050: 4C 6F 61 64 53 63 72 69 70 74 00 00 00 00 00 00 LoadScript...... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 07 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 06 00 00 00 ..g.......u.....
- The injected script, when run, returns the requested data through the socket:
Output
==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 08 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 31 7D 00 00 00 00 00 payload":1}..... 000000B0: 00 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 09 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 
00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 32 7D 00 00 00 00 00 payload":2}..... 000000B0: 00 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 180 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 2C 00 00 00 0A 00 00 00 78 00 00 00 l...,.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 1B 00 00 00 ssion12......... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 33 7D 00 00 00 00 00 payload":3}..... 000000B0: 00 00 00 00 .... ---snip---
- This message repeats with the payload incrementing by 1, as specified in the example code. When the user is done using Frida, Frida instructs the agent to unload the injected script:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 132 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 04 00 00 00 07 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0D 00 00 00 ..........s..... 00000050: 44 65 73 74 72 6F 79 53 63 72 69 70 74 00 00 00 DestroyScript... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0B 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 07 00 00 00 ..g.......u.....
- The injected script is then closed by Frida:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 112 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 00 00 00 00 08 00 00 00 60 00 00 00 l...........`... 00000010: 08 01 67 00 00 00 00 00 01 01 6F 00 18 00 00 00 ..g.......o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 2F 31 00 00 00 00 00 00 00 00 ession/1........ 00000040: 03 01 73 00 05 00 00 00 43 6C 6F 73 65 00 00 00 ..s.....Close... 00000050: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000060: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 132 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 04 01 01 04 00 00 00 0C 00 00 00 70 00 00 00 l...........p... 00000010: 08 01 67 00 03 28 75 29 00 00 00 00 00 00 00 00 ..g..(u)........ 00000020: 01 01 6F 00 1E 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 a/AgentSessionPr 00000040: 6F 76 69 64 65 72 00 00 03 01 73 00 06 00 00 00 ovider....s..... 00000050: 43 6C 6F 73 65 64 00 00 02 01 73 00 1F 00 00 00 Closed....s..... 00000060: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000070: 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 31 32 00 ssionProvider12. 00000080: 01 00 00 00 .... ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0D 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 08 00 00 00 ..g.......u.....
- And, for the final step, Frida instructs the injected agent to unload:
Output
==== STREAM PID 26527.0xffff8cd6e81ad700 (S) > 26525.0xffff8cd6e81ad000 (C), length 120 command[26527]: 'python ./hook.py 0x55bd9092f155' command[26525]: './hello' ---- 00000000: 6C 01 00 01 00 00 00 00 09 00 00 00 68 00 00 00 l...........h... 00000010: 08 01 67 00 00 00 00 00 01 01 6F 00 1E 00 00 00 ..g.......o..... 00000020: 2F 72 65 2F 66 72 69 64 61 2F 41 67 65 6E 74 53 /re/frida/AgentS 00000030: 65 73 73 69 6F 6E 50 72 6F 76 69 64 65 72 00 00 essionProvider.. 00000040: 03 01 73 00 06 00 00 00 55 6E 6C 6F 61 64 00 00 ..s.....Unload.. 00000050: 02 01 73 00 1F 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000060: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 50 72 6F .AgentSessionPro 00000070: 76 69 64 65 72 31 32 00 vider12. ==== STREAM PID 26525.0xffff8cd6e81ad000 (C) > 26527.0xffff8cd6e81ad700 (S), length 32 command[26525]: './hello' command[26527]: 'python ./hook.py 0x55bd9092f155' ---- 00000000: 6C 02 01 01 00 00 00 00 0E 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 09 00 00 00 ..g.......u.....
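Every message in the captures above (after the BEGIN line that ends the authentication handshake) starts with the same 16-byte DBus header, so the dumps can be read mechanically. The following is a minimal sketch, not part of unixdump, that decodes that fixed header with Python's struct module. The helper name parse_dbus_fixed_header and the dictionary keys are hypothetical; the field layout follows the DBus wire protocol (byte order marker, message type, flags, protocol version, body length, serial, and header field array length).

```python
import struct

# Message type codes defined by the D-Bus wire protocol.
MESSAGE_TYPES = {1: "METHOD_CALL", 2: "METHOD_RETURN", 3: "ERROR", 4: "SIGNAL"}


def parse_dbus_fixed_header(data: bytes) -> dict:
    """Decode the first 16 bytes of a D-Bus message (hypothetical helper)."""
    if len(data) < 16:
        raise ValueError("need at least 16 bytes of message data")
    # Byte 0 selects the byte order: 'l' is little-endian, 'B' is big-endian.
    if data[0] == ord("l"):
        order = "<"
    elif data[0] == ord("B"):
        order = ">"
    else:
        raise ValueError("not a D-Bus message (bad byte order marker)")
    msg_type, flags, version = data[1], data[2], data[3]
    # Three uint32 values follow: body length, serial, header field array length.
    body_len, serial, fields_len = struct.unpack_from(order + "III", data, 4)
    # The header field array is padded to an 8-byte boundary before the body.
    padded_fields_len = (fields_len + 7) & ~7
    return {
        "type": MESSAGE_TYPES.get(msg_type, "UNKNOWN"),
        "flags": flags,
        "version": version,
        "body_length": body_len,
        "serial": serial,
        "fields_length": fields_len,
        "total_length": 16 + padded_fields_len + body_len,
    }


# First 16 bytes of the 156-byte GetAll request captured above.
header = bytes.fromhex("6C 01 00 01 24 00 00 00 01 00 00 00 68 00 00 00")
info = parse_dbus_fixed_header(header)
print(info)
```

For the GetAll request, the computed total length (16 + 104 + 36) matches the "length 156" reported by unixdump, which is a quick sanity check that a stream really carries DBus framing. Type 1 messages are method calls, type 2 are replies, and type 4 are signals such as Opened and MessageFromScript.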
Frida CLI Tab Completion Protocol
The Frida command line tool has a tab completion-based prompt that allows quick access to all of its features. We will examine the communications that occur while Frida is performing a tab complete operation.
- Frida starts the interaction by sending a PostToScript command to the injected script. The script sent calls Object.getOwnPropertyNames() on the this object:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 220 command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' command[26847]: './hello' ---- 00000000: 6C 01 00 01 5C 00 00 00 0B 00 00 00 70 00 00 00 l..........p... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00 ..........s..... 00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00 PostToScript.... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 4A 00 00 00 5B 22 66 72 69 64 61 3A ....J...["frida: 00000090: 72 70 63 22 2C 20 35 2C 20 22 63 61 6C 6C 22 2C rpc", 5, "call", 000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 4F "evaluate", ["O 000000B0: 62 6A 65 63 74 2E 67 65 74 4F 77 6E 50 72 6F 70 bject.getOwnProp 000000C0: 65 72 74 79 4E 61 6D 65 73 28 74 68 69 73 29 22 ertyNames(this)" 000000D0: 5D 5D 00 00 00 00 00 00 00 00 00 00 ]].......... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 02 01 01 00 00 00 00 10 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0B 00 00 00 ..g.......u.....
- This causes the Frida agent within the target process to evaluate the script and return all properties of the this object, which are the possible actions and available commands we are attempting to tab complete:
Output
==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 1736 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 04 01 01 40 06 00 00 11 00 00 00 78 00 00 00 l...@.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 2C 06 00 00 ssion12.....,... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61 payload":["frida 000000B0: 3A 72 70 63 22 2C 35 2C 22 6F 6B 22 2C 5B 22 6F :rpc",5,"ok",["o 000000C0: 62 6A 65 63 74 22 2C 5B 22 4E 61 4E 22 2C 22 49 bject",["NaN","I 000000D0: 6E 66 69 6E 69 74 79 22 2C 22 75 6E 64 65 66 69 nfinity","undefi 000000E0: 6E 65 64 22 2C 22 4F 62 6A 65 63 74 22 2C 22 46 ned","Object","F 000000F0: 75 6E 63 74 69 6F 6E 22 2C 22 41 72 72 61 79 22 unction","Array" 00000100: 2C 22 53 74 72 69 6E 67 22 2C 22 42 6F 6F 6C 65 ,"String","Boole 00000110: 61 6E 22 2C 22 4E 75 6D 62 65 72 22 2C 22 44 61 an","Number","Da 00000120: 74 65 22 2C 22 52 65 67 45 78 70 22 2C 22 45 72 te","RegExp","Er 00000130: 72 6F 72 22 2C 22 45 76 61 6C 45 72 72 6F 72 22 ror","EvalError" 00000140: 2C 22 52 61 6E 67 65 45 72 72 6F 72 22 2C 22 52 ,"RangeError","R 00000150: 65 66 65 72 65 6E 63 65 45 72 72 6F 72 22 2C 22 eferenceError"," 00000160: 53 79 6E 74 61 78 45 72 72 6F 72 22 2C 22 54 79 SyntaxError","Ty 00000170: 70 65 45 72 72 6F 72 22 2C 22 55 52 49 45 72 72 peError","URIErr 00000180: 6F 72 
22 2C 22 4D 61 74 68 22 2C 22 4A 53 4F 4E or","Math","JSON 00000190: 22 2C 22 44 75 6B 74 61 70 65 22 2C 22 50 72 6F ","Duktape","Pro 000001A0: 78 79 22 2C 22 52 65 66 6C 65 63 74 22 2C 22 42 xy","Reflect","B 000001B0: 75 66 66 65 72 22 2C 22 41 72 72 61 79 42 75 66 uffer","ArrayBuf 000001C0: 66 65 72 22 2C 22 44 61 74 61 56 69 65 77 22 2C fer","DataView", 000001D0: 22 49 6E 74 38 41 72 72 61 79 22 2C 22 55 69 6E "Int8Array","Uin 000001E0: 74 38 41 72 72 61 79 22 2C 22 55 69 6E 74 38 43 t8Array","Uint8C 000001F0: 6C 61 6D 70 65 64 41 72 72 61 79 22 2C 22 49 6E lampedArray","In 00000200: 74 31 36 41 72 72 61 79 22 2C 22 55 69 6E 74 31 t16Array","Uint1 00000210: 36 41 72 72 61 79 22 2C 22 49 6E 74 33 32 41 72 6Array","Int32Ar 00000220: 72 61 79 22 2C 22 55 69 6E 74 33 32 41 72 72 61 ray","Uint32Arra 00000230: 79 22 2C 22 46 6C 6F 61 74 33 32 41 72 72 61 79 y","Float32Array 00000240: 22 2C 22 46 6C 6F 61 74 36 34 41 72 72 61 79 22 ","Float64Array" 00000250: 2C 22 70 61 72 73 65 49 6E 74 22 2C 22 70 61 72 ,"parseInt","par 00000260: 73 65 46 6C 6F 61 74 22 2C 22 54 65 78 74 45 6E seFloat","TextEn 00000270: 63 6F 64 65 72 22 2C 22 54 65 78 74 44 65 63 6F coder","TextDeco 00000280: 64 65 72 22 2C 22 70 65 72 66 6F 72 6D 61 6E 63 der","performanc 00000290: 65 22 2C 22 65 76 61 6C 22 2C 22 69 73 4E 61 4E e","eval","isNaN 000002A0: 22 2C 22 69 73 46 69 6E 69 74 65 22 2C 22 64 65 ","isFinite","de 000002B0: 63 6F 64 65 55 52 49 22 2C 22 64 65 63 6F 64 65 codeURI","decode 000002C0: 55 52 49 43 6F 6D 70 6F 6E 65 6E 74 22 2C 22 65 URIComponent","e 000002D0: 6E 63 6F 64 65 55 52 49 22 2C 22 65 6E 63 6F 64 ncodeURI","encod 000002E0: 65 55 52 49 43 6F 6D 70 6F 6E 65 6E 74 22 2C 22 eURIComponent"," 000002F0: 65 73 63 61 70 65 22 2C 22 75 6E 65 73 63 61 70 escape","unescap 00000300: 65 22 2C 22 67 6C 6F 62 61 6C 22 2C 22 46 72 69 e","global","Fri 00000310: 64 61 22 2C 22 53 63 72 69 70 74 22 2C 22 57 65 da","Script","We 00000320: 61 6B 52 65 66 22 2C 22 5F 73 65 74 54 69 6D 65 
akRef","_setTime 00000330: 6F 75 74 22 2C 22 5F 73 65 74 49 6E 74 65 72 76 out","_setInterv 00000340: 61 6C 22 2C 22 63 6C 65 61 72 54 69 6D 65 6F 75 al","clearTimeou 00000350: 74 22 2C 22 63 6C 65 61 72 49 6E 74 65 72 76 61 t","clearInterva 00000360: 6C 22 2C 22 67 63 22 2C 22 5F 73 65 6E 64 22 2C l","gc","_send", 00000370: 22 5F 73 65 74 55 6E 68 61 6E 64 6C 65 64 45 78 "_setUnhandledEx 00000380: 63 65 70 74 69 6F 6E 43 61 6C 6C 62 61 63 6B 22 ceptionCallback" 00000390: 2C 22 5F 73 65 74 49 6E 63 6F 6D 69 6E 67 4D 65 ,"_setIncomingMe 000003A0: 73 73 61 67 65 43 61 6C 6C 62 61 63 6B 22 2C 22 ssageCallback"," 000003B0: 5F 77 61 69 74 46 6F 72 45 76 65 6E 74 22 2C 22 _waitForEvent"," 000003C0: 49 6E 74 36 34 22 2C 22 55 49 6E 74 36 34 22 2C Int64","UInt64", 000003D0: 22 4E 61 74 69 76 65 50 6F 69 6E 74 65 72 22 2C "NativePointer", 000003E0: 22 4E 61 74 69 76 65 52 65 73 6F 75 72 63 65 22 "NativeResource" 000003F0: 2C 22 4E 61 74 69 76 65 46 75 6E 63 74 69 6F 6E ,"NativeFunction 00000400: 22 2C 22 53 79 73 74 65 6D 46 75 6E 63 74 69 6F ","SystemFunctio 00000410: 6E 22 2C 22 4E 61 74 69 76 65 43 61 6C 6C 62 61 n","NativeCallba 00000420: 63 6B 22 2C 22 43 70 75 43 6F 6E 74 65 78 74 22 ck","CpuContext" 00000430: 2C 22 53 6F 75 72 63 65 4D 61 70 22 2C 22 4B 65 ,"SourceMap","Ke 00000440: 72 6E 65 6C 22 2C 22 4D 65 6D 6F 72 79 22 2C 22 rnel","Memory"," 00000450: 4D 65 6D 6F 72 79 41 63 63 65 73 73 4D 6F 6E 69 MemoryAccessMoni 00000460: 74 6F 72 22 2C 22 50 72 6F 63 65 73 73 22 2C 22 tor","Process"," 00000470: 54 68 72 65 61 64 22 2C 22 42 61 63 6B 74 72 61 Thread","Backtra 00000480: 63 65 72 22 2C 22 4D 6F 64 75 6C 65 22 2C 22 4D cer","Module","M 00000490: 6F 64 75 6C 65 4D 61 70 22 2C 22 46 69 6C 65 22 oduleMap","File" 000004A0: 2C 22 49 4F 53 74 72 65 61 6D 22 2C 22 49 6E 70 ,"IOStream","Inp 000004B0: 75 74 53 74 72 65 61 6D 22 2C 22 4F 75 74 70 75 utStream","Outpu 000004C0: 74 53 74 72 65 61 6D 22 2C 22 55 6E 69 78 49 6E tStream","UnixIn 000004D0: 70 75 74 53 74 72 65 
61 6D 22 2C 22 55 6E 69 78 putStream","Unix 000004E0: 4F 75 74 70 75 74 53 74 72 65 61 6D 22 2C 22 53 OutputStream","S 000004F0: 6F 63 6B 65 74 22 2C 22 53 6F 63 6B 65 74 4C 69 ocket","SocketLi 00000500: 73 74 65 6E 65 72 22 2C 22 53 6F 63 6B 65 74 43 stener","SocketC 00000510: 6F 6E 6E 65 63 74 69 6F 6E 22 2C 22 53 71 6C 69 onnection","Sqli 00000520: 74 65 44 61 74 61 62 61 73 65 22 2C 22 53 71 6C teDatabase","Sql 00000530: 69 74 65 53 74 61 74 65 6D 65 6E 74 22 2C 22 49 iteStatement","I 00000540: 6E 74 65 72 63 65 70 74 6F 72 22 2C 22 49 6E 76 nterceptor","Inv 00000550: 6F 63 61 74 69 6F 6E 4C 69 73 74 65 6E 65 72 22 ocationListener" 00000560: 2C 22 49 6E 76 6F 63 61 74 69 6F 6E 43 6F 6E 74 ,"InvocationCont 00000570: 65 78 74 22 2C 22 49 6E 76 6F 63 61 74 69 6F 6E ext","Invocation 00000580: 41 72 67 73 22 2C 22 49 6E 76 6F 63 61 74 69 6F Args","Invocatio 00000590: 6E 52 65 74 75 72 6E 56 61 6C 75 65 22 2C 22 41 nReturnValue","A 000005A0: 70 69 52 65 73 6F 6C 76 65 72 22 2C 22 44 65 62 piResolver","Deb 000005B0: 75 67 53 79 6D 62 6F 6C 22 2C 22 49 6E 73 74 72 ugSymbol","Instr 000005C0: 75 63 74 69 6F 6E 22 2C 22 58 38 36 57 72 69 74 uction","X86Writ 000005D0: 65 72 22 2C 22 58 38 36 52 65 6C 6F 63 61 74 6F er","X86Relocato 000005E0: 72 22 2C 22 53 74 61 6C 6B 65 72 22 2C 22 53 74 r","Stalker","St 000005F0: 61 6C 6B 65 72 49 74 65 72 61 74 6F 72 22 2C 22 alkerIterator"," 00000600: 50 72 6F 62 65 41 72 67 73 22 2C 22 5F 5F 63 6F ProbeArgs","__co 00000610: 72 65 2D 6A 73 5F 73 68 61 72 65 64 5F 5F 22 2C re-js_shared__", 00000620: 22 50 72 6F 6D 69 73 65 22 2C 22 72 70 63 22 2C "Promise","rpc", 00000630: 22 72 65 63 76 22 2C 22 73 65 6E 64 22 2C 22 73 "recv","send","s 00000640: 65 74 54 69 6D 65 6F 75 74 22 2C 22 73 65 74 49 etTimeout","setI 00000650: 6E 74 65 72 76 61 6C 22 2C 22 73 65 74 49 6D 6D nterval","setImm 00000660: 65 64 69 61 74 65 22 2C 22 63 6C 65 61 72 49 6D ediate","clearIm 00000670: 6D 65 64 69 61 74 65 22 2C 22 69 6E 74 36 34 22 mediate","int64" 
00000680: 2C 22 75 69 6E 74 36 34 22 2C 22 70 74 72 22 2C ,"uint64","ptr", 00000690: 22 4E 55 4C 4C 22 2C 22 63 6F 6E 73 6F 6C 65 22 "NULL","console" 000006A0: 2C 22 68 65 78 64 75 6D 70 22 2C 22 4F 62 6A 43 ,"hexdump","ObjC 000006B0: 22 2C 22 4A 61 76 61 22 5D 5D 5D 7D 00 00 00 00 ","Java"]]]}.... 000006C0: 00 00 00 00 00 00 00 00 ........
- Seeing this list, we begin to type out File. and hit tab to see our options. Object.getOwnPropertyNames() is called again, but now it is called on File. This returns the following attributes: prototype, length, and name:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 1220 command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' command[26847]: './hello' ---- 00000000: 6C 01 00 01 44 04 00 00 0C 00 00 00 70 00 00 00 l...D.......p... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00 ..........s..... 00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00 PostToScript.... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 30 04 00 00 5B 22 66 72 69 64 61 3A ....0...["frida: 00000090: 72 70 63 22 2C 20 36 2C 20 22 63 61 6C 6C 22 2C rpc", 6, "call", 000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 74 "evaluate", ["t 000000B0: 72 79 20 7B 5C 6E 20 20 20 20 20 20 20 20 20 20 ry {n 000000C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000000D0: 20 20 20 20 20 20 20 20 20 20 28 66 75 6E 63 74 (funct 000000E0: 69 6F 6E 20 28 6F 29 20 7B 5C 6E 20 20 20 20 20 ion (o) {n 000000F0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000100: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000110: 20 20 20 5C 22 75 73 65 20 73 74 72 69 63 74 5C "use strict 00000120: 22 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 ";n 00000130: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000140: 20 20 20 20 20 20 20 20 20 20 20 20 76 61 72 20 var 00000150: 6B 20 3D 20 4F 62 6A 65 63 74 2E 67 65 74 4F 77 k = Object.getOw 00000160: 6E 50 72 6F 70 65 72 74 79 4E 61 6D 65 73 28 6F nPropertyNames(o 00000170: 29 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 );n 00000180: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000190: 20 20 20 20 20 20 20 20 20 20 20 20 69 66 20 28 if ( 000001A0: 6F 20 21 3D 3D 20 6E 75 
6C 6C 20 26 26 20 6F 20 o !== null o 000001B0: 21 3D 3D 20 75 6E 64 65 66 69 6E 65 64 29 20 7B !== undefined) { 000001C0: 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 000001D0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000001E0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 76 61 va 000001F0: 72 20 70 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 r p;n 00000200: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000210: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000220: 20 20 69 66 20 28 74 79 70 65 6F 66 20 6F 20 21 if (typeof o ! 00000230: 3D 3D 20 27 6F 62 6A 65 63 74 27 29 5C 6E 20 20 == 'object')n 00000240: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000250: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000260: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 70 20 p 00000270: 3D 20 6F 2E 5F 5F 70 72 6F 74 6F 5F 5F 3B 5C 6E = o.__proto__;n 00000280: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000290: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002A0: 20 20 20 20 20 20 20 20 20 20 20 20 65 6C 73 65 else 000002B0: 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 000002C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002D0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000002E0: 20 20 70 20 3D 20 4F 62 6A 65 63 74 2E 67 65 74 p = Object.get 000002F0: 50 72 6F 74 6F 74 79 70 65 4F 66 28 6F 29 3B 5C PrototypeOf(o); 00000300: 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 00000310: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000320: 20 20 20 20 20 20 20 20 20 20 20 20 20 69 66 20 if 00000330: 28 70 20 21 3D 3D 20 6E 75 6C 6C 20 26 26 20 70 (p !== null p 00000340: 20 21 3D 3D 20 75 6E 64 65 66 69 6E 65 64 29 5C !== undefined) 00000350: 6E 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 n 00000360: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000370: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000380: 20 6B 20 3D 20 6B 2E 63 6F 6E 63 61 74 28 4F 62 k = k.concat(Ob 00000390: 6A 65 63 74 2E 67 65 74 4F 77 6E 50 72 6F 70 65 
ject.getOwnPrope 000003A0: 72 74 79 4E 61 6D 65 73 28 70 29 29 3B 5C 6E 20 rtyNames(p));n 000003B0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003C0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003D0: 20 20 20 20 20 20 20 7D 5C 6E 20 20 20 20 20 20 }n 000003E0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000003F0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000400: 20 20 72 65 74 75 72 6E 20 6B 3B 5C 6E 20 20 20 return k;n 00000410: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000420: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000430: 20 7D 29 28 46 69 6C 65 29 3B 5C 6E 20 20 20 20 })(File);n 00000440: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000450: 20 20 20 20 20 20 20 20 20 20 20 20 7D 20 63 61 } ca 00000460: 74 63 68 20 28 65 29 20 7B 5C 6E 20 20 20 20 20 tch (e) {n 00000470: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 00000480: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 5B [ 00000490: 5D 3B 5C 6E 20 20 20 20 20 20 20 20 20 20 20 20 ];n 000004A0: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 000004B0: 20 20 20 20 7D 22 5D 5D 00 00 00 00 00 00 00 00 }"]]........ 000004C0: 00 00 00 00 .... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 02 01 01 00 00 00 00 12 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0C 00 00 00 ..g.......u..... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 240 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 04 01 01 68 00 00 00 13 00 00 00 78 00 00 00 l...h.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 
00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 57 00 00 00 ssion12.....W... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61 payload":["frida 000000B0: 3A 72 70 63 22 2C 36 2C 22 6F 6B 22 2C 5B 22 6F :rpc",6,"ok",["o 000000C0: 62 6A 65 63 74 22 2C 5B 22 70 72 6F 74 6F 74 79 bject",["prototy 000000D0: 70 65 22 2C 22 6C 65 6E 67 74 68 22 2C 22 6E 61 pe","length","na 000000E0: 6D 65 22 5D 5D 5D 7D 00 00 00 00 00 00 00 00 00 me"]]]}.........
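The PostToScript bodies and MessageFromScript replies in these captures carry small JSON arrays tagged with "frida:rpc". As a rough sketch based only on the shapes visible in the dumps above (not on Frida's source), the request and reply could be modeled as follows. The function names build_rpc_call and parse_rpc_reply are hypothetical, and the assumption that every reply has the form ["frida:rpc", id, status, result] comes solely from these two exchanges.

```python
import json


def build_rpc_call(request_id, method, args):
    """Build a request shaped like the PostToScript payloads in the dumps."""
    return json.dumps(["frida:rpc", request_id, "call", method, args])


def parse_rpc_reply(raw):
    """Extract (request_id, status, result) from a MessageFromScript payload.

    Assumes the reply layout observed in the captures; other layouts are
    rejected rather than guessed at.
    """
    message = json.loads(raw)
    payload = message.get("payload")
    if message.get("type") != "send" or not isinstance(payload, list):
        raise ValueError("not a send message with a list payload")
    if len(payload) < 4 or payload[0] != "frida:rpc":
        raise ValueError("not a frida:rpc reply")
    return payload[1], payload[2], payload[3]


# The first tab-completion request: evaluate Object.getOwnPropertyNames(this).
request = build_rpc_call(5, "evaluate", ["Object.getOwnPropertyNames(this)"])
print(len(request), request)

# The reply to the File. completion, copied from the capture.
raw_reply = (
    '{"type":"send","payload":["frida:rpc",6,"ok",'
    '["object",["prototype","length","name"]]]}'
)
request_id, status, result = parse_rpc_reply(raw_reply)
print(request_id, status, result)
```

The request built for ID 5 is 74 bytes long, which matches the 4A 00 00 00 string length that precedes the JSON in the first PostToScript dump. The numeric ID is what lets the client pair each MessageFromScript signal with the call that triggered it.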
- Back in the UI, we tab cycle to the length attribute and hit enter on File.length. This tells the injected script to call evaluate on File.length. The script responds with an array indicating that the type of the evaluated expression is "number" and the value is 2:
Output
==== STREAM PID 26851.0xffff8cd6a8a25900 (S) > 26847.0xffff8cd62e7e9100 (C), length 200 command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' command[26847]: './hello' ---- 00000000: 6C 01 00 01 48 00 00 00 0D 00 00 00 70 00 00 00 l...H.......p... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 0C 00 00 00 ..........s..... 00000050: 50 6F 73 74 54 6F 53 63 72 69 70 74 00 00 00 00 PostToScript.... 00000060: 02 01 73 00 17 00 00 00 72 65 2E 66 72 69 64 61 ..s.....re.frida 00000070: 2E 41 67 65 6E 74 53 65 73 73 69 6F 6E 31 32 00 .AgentSession12. 00000080: 01 00 00 00 35 00 00 00 5B 22 66 72 69 64 61 3A ....5...["frida: 00000090: 72 70 63 22 2C 20 37 2C 20 22 63 61 6C 6C 22 2C rpc", 7, "call", 000000A0: 20 22 65 76 61 6C 75 61 74 65 22 2C 20 5B 22 46 "evaluate", ["F 000000B0: 69 6C 65 2E 6C 65 6E 67 74 68 22 5D 5D 00 00 00 ile.length"]]... 000000C0: 00 00 00 00 00 00 00 00 ........ ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 32 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 02 01 01 00 00 00 00 14 00 00 00 10 00 00 00 l............... 00000010: 08 01 67 00 00 00 00 00 05 01 75 00 0D 00 00 00 ..g.......u..... ==== STREAM PID 26847.0xffff8cd62e7e9100 (C) > 26851.0xffff8cd6a8a25900 (S), length 212 command[26847]: './hello' command[26851]: '/usr/bin/python /usr/local/bin/frida -p 26847' ---- 00000000: 6C 04 01 01 4C 00 00 00 15 00 00 00 78 00 00 00 l...L.......x... 00000010: 08 01 67 00 07 28 75 29 73 62 61 79 00 00 00 00 ..g..(u)sbay.... 
00000020: 01 01 6F 00 18 00 00 00 2F 72 65 2F 66 72 69 64 ..o...../re/frid 00000030: 61 2F 41 67 65 6E 74 53 65 73 73 69 6F 6E 2F 31 a/AgentSession/1 00000040: 00 00 00 00 00 00 00 00 03 01 73 00 11 00 00 00 ..........s..... 00000050: 4D 65 73 73 61 67 65 46 72 6F 6D 53 63 72 69 70 MessageFromScrip 00000060: 74 00 00 00 00 00 00 00 02 01 73 00 17 00 00 00 t.........s..... 00000070: 72 65 2E 66 72 69 64 61 2E 41 67 65 6E 74 53 65 re.frida.AgentSe 00000080: 73 73 69 6F 6E 31 32 00 01 00 00 00 3B 00 00 00 ssion12.....;... 00000090: 7B 22 74 79 70 65 22 3A 22 73 65 6E 64 22 2C 22 {"type":"send"," 000000A0: 70 61 79 6C 6F 61 64 22 3A 5B 22 66 72 69 64 61 payload":["frida 000000B0: 3A 72 70 63 22 2C 37 2C 22 6F 6B 22 2C 5B 22 6E :rpc",7,"ok",["n 000000C0: 75 6D 62 65 72 22 2C 32 5D 5D 7D 00 00 00 00 00 umber",2]]}..... 000000D0: 00 00 00 00 ....
eBPF Coding Tricks
While writing unixdump, we spent an inordinate amount of time attempting to please the eBPF bytecode validator with code constructs it would accept. Most of the time, it rejected idiomatic code that was correct, seemingly because of compiler optimizations applied by the BCC toolchain. Regardless, we often had to obscure our code in ways that would enable it to pass inspection, and, as a result, the code likely performs worse than it would if the validator had accepted the idiomatic version in the first place. Additionally, as some of the data structures we needed to parse are dynamically sized and based on dynamic offsets, we had to write (or generate) inline code to parse them directly without loops or recursion. And then there are the generic eBPF hoops that need to be jumped through on a regular basis.
eBPF Gotchas
No Loops, No Jumper Cables
eBPF doesn’t like loops, that much is clear; but we often still need to perform such operations. Abusing the eBPF memcpy-alike, bpf_probe_read, will only get one so far, especially if one needs to zero out a struct. In practice, short statically-bounded loops will be unrolled by the compiler and pass validation, but longer loops will not be unrolled and will be rejected. However, it is simple to unroll loops with statically-known bounds using compiler pragmas:
#pragma unroll
for (size_t i=0; i < 30; i++) {
  arr[i] = arr[i] + 1;
}
This is a fairly useful construct that can be ruthlessly applied to a number of different problems.
Uninitialized Memory
One of the things to be careful about with eBPF is that when attempting to copy data from the eBPF stack elsewhere, if any uninitialized memory would be copied, the validator will reject the program and report the offending stack offsets, which are entirely unhelpful. Usually, this is the result of having padding between fields in your structs. A simple way of handling this is to use an unrolled loop akin to memset-ing the struct to zero; where possible, such code will be optimized to use 8-byte writes. However, this is computationally wasteful. Instead, another option is to carefully control field types and ordering to fill in all gaps. Failing this, explicitly declaring settable padding values and padding unions can enable a programmer to manually elide double writes to the struct. And lastly, one can always use a packed struct (e.g. struct __attribute__((__packed__)) foo {...}); this may require more byte shuffling operations to write and read, but, by reducing the overall amount of data sent, it can help when the limiting factor of the eBPF code is the effective rate/drop limit of perf_submit.
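To see where such padding comes from without loading anything into the kernel, the layouts can be modelled in userspace. The following is a minimal sketch using Python's ctypes; the struct and field names (Padded, Reordered, Packed, kind, inode, pid) are hypothetical and are not taken from unixdump. It counts the bytes that belong to no field, which are the bytes the validator would treat as uninitialized.

```python
import ctypes


class Padded(ctypes.Structure):
    # Hypothetical event struct: a u8 followed by a u64 forces the compiler
    # to insert alignment padding before the u64 (and usually at the end).
    _fields_ = [
        ("kind", ctypes.c_uint8),
        ("inode", ctypes.c_uint64),
        ("pid", ctypes.c_uint32),
    ]


class Reordered(ctypes.Structure):
    # Same fields, largest first: only trailing padding remains.
    _fields_ = [
        ("inode", ctypes.c_uint64),
        ("pid", ctypes.c_uint32),
        ("kind", ctypes.c_uint8),
    ]


class Packed(ctypes.Structure):
    # Rough equivalent of __attribute__((__packed__)): no padding at all.
    _pack_ = 1
    # Newer Python versions ask for an explicit layout whenever _pack_ is
    # set; older versions simply ignore this attribute.
    _layout_ = "ms"
    _fields_ = [
        ("kind", ctypes.c_uint8),
        ("inode", ctypes.c_uint64),
        ("pid", ctypes.c_uint32),
    ]


def padding_bytes(struct_type):
    """Return how many bytes of the struct are not covered by any field."""
    used = sum(getattr(struct_type, name).size
               for name, _ctype in struct_type._fields_)
    return ctypes.sizeof(struct_type) - used


if __name__ == "__main__":
    for struct_type in (Padded, Reordered, Packed):
        print(struct_type.__name__,
              ctypes.sizeof(struct_type),
              padding_bytes(struct_type))
```

The exact sizes of the first two layouts depend on the platform ABI, so the sketch prints them rather than assuming them; only the packed variant is guaranteed to have no gaps.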
eBPF Chicanery
For unixdump, we had a number of operational needs based on correctness or performance goals that required writing a significant amount of non-idiomatic C code and code generation tooling. While none of this is especially groundbreaking, it is worth discussing how to perform common programmatic tasks while under constraints like those imposed by eBPF.
Ratcheting
In addition to managing memory shared between kernel space and userspace, we also needed to maintain state of the current position within the custom ring buffer. This is achieved simply enough by using another per-CPU ring buffer, one that only holds a single value. This provides a separate position value associated with each per-CPU ring buffer. However, the problem with this setup is not in the data itself, but the mechanism by which it is incremented, or, more importantly, wrapped. The eBPF validator was displeased with any ratchet that tried to perform the wrap via a specific switch case. Instead, it only accepts implementations where wrapping is performed solely under the default: label; attempts to wrap the value in the last “valid” case or “guess” the wrapping position will fail, even if the default: code also wraps. For example, the following is an eBPF-valid position counter ratchet implementation:
u32 pos = UINT32_MAX;
int key = 0;
sync = sync_buf.lookup(&key);
if (!sync) {
  return 0;
}
pos = 0;
switch (sync->next) {
  case 0: { pos = 0; sync->next = 1; break; };
  case 1: { pos = 1; sync->next = 2; break; };
  // Wrapping happens only here, never in a numbered case.
  default: { pos = 0; sync->next = 1; }
}
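With more than a couple of slots, writing these cases by hand gets tedious. The sketch below shows one way such a switch could be generated from Python for an arbitrary slot count. It is an illustration under our own assumptions, not unixdump's generator: the function name gen_ratchet_switch and its parameters are hypothetical. It keeps the property described above, in that no numbered case ever wraps and only default: resets the position.

```python
def gen_ratchet_switch(n_slots, state="sync->next", out="pos"):
    """Emit a C switch that advances a position counter over n_slots slots.

    Every numbered case only moves forward; the wrap back to slot 0 lives
    exclusively under the default label.
    """
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    lines = [f"switch ({state}) {{"]
    for i in range(n_slots):
        lines.append(
            f"  case {i}: {{ {out} = {i}; {state} = {i + 1}; break; }}"
        )
    # The value n_slots (or anything unexpected) falls through to here.
    lines.append(f"  default: {{ {out} = 0; {state} = 1; }}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(gen_ratchet_switch(4))
```

Called with n_slots=2, this produces the same two cases and default branch as the hand-written snippet above.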
Dynamic Structure Parsing
While writing unixdump, we got the crazy idea to keep track of all ancillary data (e.g. file descriptors) passing over Unix domain sockets. While this is of great benefit for tracking how processes are passing file handles, sockets, and other descriptors to each other, the “format” into which the data is marshalled is very fluid and poorly specified. For example, similarly to SMS messages, received messages may have a different structure from what was actually sent; in particular, multiple messages of the same type may be coalesced into a single message containing multiple values, regardless of the order in which they were sent.
In unixdump, we use unrolled nested loops to iterate through the CMSG structures containing ancillary data. Where possible, we use the CMSG_* macros to index into the buffer and access fields; however, we reimplemented several of these macros to be compatible with BCC’s pointer dereference instrumentation, which was unable to handle all of the CMSG_* macros. To store the data and report it back to userspace, we used a typed union struct that can store both SCM_RIGHTS (file descriptors) and SCM_CREDENTIALS (Unix credentials), and that additionally keeps track of the count of the former and whether or not the last element returned to userspace was actually the last element of the in-kernel CMSG structure. Both the max count of copyable CMSGs and slots within a CMSG (for storing SCM_RIGHTS file descriptors) are configurable via the CLI; this also modifies the unrolled loop counts.
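For readers who have not worked with ancillary data, the following Python sketch shows the userspace view of what the in-kernel code has to pick apart: a list of (level, type, data) records, where the data of an SCM_RIGHTS record is a packed array of native ints. It is not part of unixdump, and the function name pass_fds_once is ours for illustration; it assumes a Unix-like host where socket.sendmsg and socket.recvmsg are available.

```python
import array
import os
import socket


def pass_fds_once(n_fds=2):
    """Send n_fds descriptors over a socketpair; return the parsed records.

    Each returned tuple is (cmsg_level, cmsg_type, number_of_fds).
    """
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pipes = [os.pipe() for _ in range(n_fds)]
    int_size = array.array("i").itemsize
    received = []
    try:
        # SCM_RIGHTS carries a packed array of ints (the descriptors).
        payload = array.array("i", [r for r, _w in pipes])
        left.sendmsg([b"x"],
                     [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)])
        _data, ancdata, _flags, _addr = right.recvmsg(
            1, socket.CMSG_SPACE(n_fds * int_size))
        parsed = []
        for level, ctype, cdata in ancdata:
            fds = array.array("i")
            # Ignore any trailing partial int, as the Python docs recommend.
            fds.frombytes(cdata[:len(cdata) - (len(cdata) % int_size)])
            received.extend(fds)
            parsed.append((level, ctype, len(fds)))
        return parsed
    finally:
        # Close the duplicated descriptors and everything we created.
        for fd in received:
            os.close(fd)
        for r, w in pipes:
            os.close(r)
            os.close(w)
        left.close()
        right.close()


if __name__ == "__main__":
    print(pass_fds_once(2))
```

The eBPF side has no such convenience API and must walk the raw control buffer with a fixed, unrolled iteration count, which is why the maximum number of records and slots has to be fixed when the program is generated.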
Static Data Structures and Algorithms
To appropriately handle the glut of data caught by unixdump, we needed to performantly filter PIDs (inclusively or exclusively) in eBPF C code, so as to limit the amount of data and number of events sent to userland. Iteratively comparing each one would be extremely costly, so we instead opted to use a binary search tree. As a recursive binary search implementation will trip the loop check, we instead generate the entire static C implementation (dynamically in Python) for the values being filtered. For reference, the implementation can be found here.
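To give a feel for the approach, here is a minimal sketch of generating a loop-free binary search as nested if/else blocks. It is not the unixdump implementation linked above; the function names (gen_search, gen_pid_filter) and the emitted function name is_filtered_pid are hypothetical.

```python
def gen_search(pids, var="pid", depth=1):
    """Recursively emit nested C if/else blocks over a sorted list of PIDs."""
    pad = "    " * depth
    if not pids:
        return f"{pad}return 0;\n"
    mid = len(pids) // 2
    pivot = pids[mid]
    return (
        f"{pad}if ({var} == {pivot}) {{ return 1; }}\n"
        f"{pad}if ({var} < {pivot}) {{\n"
        + gen_search(pids[:mid], var, depth + 1)
        + f"{pad}}} else {{\n"
        + gen_search(pids[mid + 1:], var, depth + 1)
        + f"{pad}}}\n"
    )


def gen_pid_filter(pids, name="is_filtered_pid"):
    """Return C source for a function that is 1 only for the given PIDs."""
    body = gen_search(sorted(set(pids)))
    return f"static inline int {name}(u32 pid) {{\n{body}}}\n"


if __name__ == "__main__":
    print(gen_pid_filter([4242, 1, 1337]))
```

The recursion happens in Python at generation time, so the emitted C contains only straight-line comparisons whose nesting depth grows logarithmically with the number of PIDs.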
Dark eBPF Thaumaturgy
Even with all of the above tricks to keep it happy, the eBPF validator’s muse is still a fickle miscreant with a very short attention span. Be it due to changes in the toolchain or the Linux kernel itself, the validator may look upon your overly clever code and decide to smite you where you stand. Sometimes, appeasing the validator requires ever greater sacrifices of idiomaticity.
Dynamic Length Byte Copies
Per the issue mentioned earlier, it is not immediately clear that variable length byte copies from kernel memory are possible with eBPF. Given that the recommended solution is to use a helper function for copying NUL-terminated C strings, this would be a problem when the variable length data is binary content that may contain NUL bytes. However, variable length copies of binary data can in fact be performed, albeit with some careful sleight of hand. While this is not an issue for socket paths, of which the sun_path field of struct sockaddr_un is guaranteed to be at least UNIX_PATH_MAX (108) bytes long, this is an issue for copying arbitrary socket data. While it is important to ensure that stack-based arrays are fully written to (e.g. write NUL bytes to the remainder of arrays), this limitation does not exist for eBPF map structures as they are zero-initialized by the kernel to prevent information leaks. Instead, the trouble occurs when one attempts to truncate the copy length. Between BCC and the eBPF validator, it is often the case that a byte copy of the length of an array or less is considered unsafe, and therefore rejected. Instead, when tapering off the array length, it was previously necessary to cap the copy length to sizeof(buffer)-1. The odd behavior here is that if the source length is the same as the destination length, it must still be truncated. Additionally, to prevent optimizations that may elide certain comparisons needed to provide the eBPF validator with register bounds, we found that it was possible to simply wrap the desired code in a static inline function to shadow the variables in play. For example, in unixdump we perform this copy and track whether or not the data was truncated in the following code:
inline static
void copy_into_entry_buffer(data_t* entry, size_t const len, char* base, u8 volatile* trunc) {
  int l = (int)len;
  if (l < 0) {
    l = 0;
  }
  if (l >= BUFFER_SIZE) {
    *trunc = 1;
  }
  if (l >= BUFFER_SIZE) {
    l = BUFFER_SIZE - 1;
  }
  bpf_probe_read(entry->buffer, l, base);
}
Note: This behavior has changed a few times between BCC and Linux kernel versions, and when using current versions of both, it is possible to implement the optimal case of copying right up to the end of the array; however, to support older versions we continue to use the less optimal “truncate on equal” version shown above.
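The “truncate on equal” edge case is easy to misread, so here is a small Python model of the length arithmetic in the C helper above. It is only a userspace illustration; clamp_copy_len is a hypothetical name, and the model returns the truncation flag instead of writing through a pointer.

```python
def clamp_copy_len(length, buffer_size):
    """Model the C helper's clamping: return (bytes_to_copy, truncated)."""
    n = max(length, 0)              # negative after the int cast -> copy nothing
    truncated = n >= buffer_size    # note: equal length also counts as truncated
    if truncated:
        n = buffer_size - 1
    return n, truncated


if __name__ == "__main__":
    for length in (10, 63, 64, 1000, -5):
        print(length, clamp_copy_len(length, 64))
```

With a 64-byte buffer, a 63-byte message is copied whole, while a 64-byte message is cut to 63 bytes and flagged as truncated even though it would have fit.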
Type Juggling
Another spooky behavior we observed with a previous version of BCC (which we have not observed since) was an interesting case where the return type of a function could cause the validator to raise an error. While it may be simple enough to imagine such a situation involving mixing signed and unsigned integers, this instance related to the use of bool as both the return and variable type, which was eventually cast to size_t. In some versions of our code, the validator would raise an error if the return value was bool, but in others it would raise an error if the return value was size_t. For context, unixdump will, based on CLI options for certain features, enable or disable certain kprobe functionality with #if(def)s. As a result, we simply used the same feature detections to set a BOOL_TYPE define, used as the return and variable type, to either bool or size_t. At the time, we did not bother to triage this issue (sorry!), but it does not affect the current unixdump code when using a current BCC. As for whether this is because the current BCC fixed the issue or because our current code is unaffected, it is a mystery.
Obfuscation, or: How I Learned to Stop Worrying and Outsmart the Compiler
When writing eBPF code, one’s greatest enemy is often the compiler’s optimizers. eBPF’s most glaring flaw is that the compiler and the validator have no means to communicate other than through the generated code. Try as you might to write your code in a concise way that would otherwise ensure its correctness, the compiler may simply optimize out all of your “unnecessary” data validation checks, leaving the validator to complain that you are not “properly” validating all of the edge cases. While one can sometimes get around such occurrences with the volatile keyword, other times it will be necessary to rework the code over and over in an attempt to fool both the compiler and eBPF validator. As noted earlier, we have observed that placing code verbatim within an inline static function would result in certain offending code passing validation. This appeared to be due to the fact that certain assumptions on the “parameters” could no longer be made, preventing the compiler from eliding code required by the validator. However, it is worth noting that because of such blunders within the validator, one’s code must sometimes be implemented suboptimally, which will incur unnecessary performance penalties. We still prefer to accept such specific penalties over configuring BCC to compile eBPF C code with -O0.
Conclusion
While it can be a bit tricky to write anything more than the sorts of very simple eBPF kernel tracing tools currently promoted as BCC reference examples focused on basic system profiling, it is very much possible to use eBPF to develop full-featured tracing tools and tooling. Additionally, though the developer experience has a tendency to be extremely perplexing, it does appear to be actively improving over time, given the lessened need for hacky validator appeasement rituals.
We got our feet wet in the world of eBPF-based kernel tracing by attempting to solve a somewhat niche problem, but the outcome seems promising. Our initial test case for eBPF, unixdump, is open source and available on GitHub; check it out here: https://github.com/nccgroup/ebpf/tree/master/unixdump. We plan to continue to add features and filters to unixdump, and would greatly appreciate any contributions. The next features on the roadmap are proper timestamping, and outputting to pcapng so that one can load Unix domain socket traffic dumps into Wireshark/tshark and apply their vast repertoire of protocol dissectors.
1. Depending on your OS, Unix domain sockets may be described in unix(7), unix(4), or sockaddr(3socket).
2. While Linux enforces file path permissions on file path-based Unix domain sockets, this behavior is not consistent across all Unix implementations. However, in general, Unix OSes have similar sets of APIs enabling Unix domain socket peer processes to verify each other’s identity.
3. https://www.kernel.org/doc/Documentation/networking/filter.txt