Santa's ELFs: Running Linux Executables Without execve

This blog is the 11th post in our annual 12 Days of HaXmas blog series.

Merry HaXmas! Now that the holidays are winding down, Santa's elves can finally get some much-needed rest after working tirelessly on dolls, wooden trains, and new copies of Fortnite. Santa's ELFs, however, do not get such a break, since the Executable and Linkable Format (ELF) is the base of numerous Unix-like operating systems such as Linux, most modern BSDs, and Solaris. ELF files are capable of many tricks, like those we use in our *nix Meterpreter implementation, but those tricks require building each executable either with our special toolchain or GCC 8+ with the new -static-pie flag. What if things were different? The kernel doesn't need a file on disk to load and run code, even if it makes us have one. Surely with some HaXmas magic and elbow grease we can do it ourselves.

Handcrafted mirrors

Perhaps the most-desired trick for executable formats is reflective loading. Reflective loading is an important post-exploitation technique used to avoid detection and execute more complex tools in locked-down environments. It generally has three broad steps:

Get code execution (e.g., exploitation or phishing)
Grab your own code from somewhere
Convince the operating system to run your code without loading it like a normal process

That last part is what makes reflective loading desirable. Environments are increasingly locked down, and the components of a normal process start are the most obvious targets. Traditional antivirus scans things on the disk, code signing checks integrity as new processes start, and behavioral monitoring keeps checking to make sure none of those processes start doing anything too weird. The thought is, if attackers can't run any programs, they can't do anything and the system is secure. This is not entirely correct; blocking the most obvious paths just makes things painful.

In particular, Windows, which uses the Portable Executable (PE) format, has seen much research into this topic, both because it is widely deployed and because it has useful reflection building blocks built into the core operating system. In fact, it has the gold standard API for this sort of work, CreateRemoteThread and friends, which allows attackers to not merely load, but inject code into other running processes.

Although Linux lacks fun APIs for reflective injection like the CreateRemoteThread found in Windows, the last few years have seen some interesting research into unorthodox ways of using developer-focused Linux features to make security-focused lives better. A good summary of command-only approaches can be found here, and other techniques that require helper files include tools such as linux-inject. A more modern technique that can be executed from a helper binary or scripting language uses some new-ish syscalls.

These approaches on Linux can be grouped into five categories:

Writing to temporary files: This is not too different from typical code, but it leaves disk artifacts.
Injecting with ptrace: This requires some sophistication to control and classically allows broad process-hopping.
Self-modifying executables: dd is a classic for this, and it requires a bit of finesse and customized shellcode.
FFI integration in scripting languages: Python and Ruby are both suitable, but current techniques only load shellcode.
Creating non-filesystem temporary files: This uses syscalls added in 2014, so it's approaching broad usability.

Strict working conditions

Few people closely watch their Linux boxes (congratulations if you do!), but these technique styles have security/opsec considerations in addition to their respective technical challenges alluded to above. If you are on the bluer end of the security spectrum, think of the following as ways to frustrate any remote attackers lucky enough to compromise something:

Temporary files

It is increasingly common to find /tmp and /dev/shm mounted noexec (that is, no file in that tree can be executed), especially on mobile and embedded systems. Speaking of embedded systems, those often have read-only persistent file storage as well, so you can't even fall back to hoping no one is looking at the disk.
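As a quick way to see how a given box is configured, the check above can be sketched in a few lines of Python. This is only an illustration: the helper names noexec_mounts and current_noexec_mounts are made up for this post, and the sketch assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options).

```python
def noexec_mounts(mounts_text):
    """Return the mount points whose option list contains 'noexec'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, comma-separated options, ...
        if len(fields) >= 4 and "noexec" in fields[3].split(","):
            found.append(fields[1])
    return found


def current_noexec_mounts(path="/proc/mounts"):
    """Apply the same check to the live mount table (Linux only)."""
    with open(path) as handle:
        return noexec_mounts(handle.read())
```

Calling current_noexec_mounts() on a target (or on your own server, if you are on the blue side) lists the trees where dropping and running a temporary file will not work.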

Self-modifying executables and ptrace

Access to ptrace and most of the fun introspection in /proc/<PID> is governed by the kernel.yama.ptrace_scope sysctl variable. On boxes that are not being used for development, it should be set to at least 2 to remove access from non-privileged users. This is the default on many mobile and embedded systems, and modern desktop/server distros default to at least 1, which reduces its usefulness for wildly hopping across processes. Also, it's Linux-specific, so no sweet, sweet BSD shells for you.
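A minimal sketch of reading that setting follows. The function names are hypothetical, the descriptions paraphrase the levels in the kernel's Yama documentation, and the sketch assumes the sysctl is exposed at /proc/sys/kernel/yama/ptrace_scope (it is absent when Yama is not built in).

```python
PTRACE_SCOPE_MEANINGS = {
    0: "classic: a process may attach to others running as the same user",
    1: "restricted: a process may only attach to its own descendants",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(raw):
    """Turn the sysctl's text value into (level, description)."""
    value = int(raw.strip())
    return value, PTRACE_SCOPE_MEANINGS.get(value, "unknown value")


def read_ptrace_scope(path="/proc/sys/kernel/yama/ptrace_scope"):
    """Return (level, description), or None if the file cannot be read."""
    try:
        with open(path) as handle:
            return describe_ptrace_scope(handle.read())
    except OSError:
        return None  # Yama missing or /proc not readable
```

Anything at 2 or above means an unprivileged foothold should not count on ptrace-based injection.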

FFI integrations

Ruby's fiddle and Python's ctypes specifically are very flexible and for a lot of small things can function like an interpreted C. They don't have an assembler, though, so any detail work with registers or bootstrapping into different executables will need to be done with shellcode from your end. You also never know which version, if any, is going to be installed on a given system.
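To give a feel for the "interpreted C" part, here is a small ctypes sketch. It assumes a Unix-like system where ctypes.CDLL(None) hands back the symbols already loaded into the interpreter, libc included; the wrapper names are only for illustration.

```python
import ctypes
import os

# On Linux, passing None opens the running process itself, which
# already has libc mapped, so no library path needs to be guessed.
libc = ctypes.CDLL(None, use_errno=True)

# Declare the C prototypes so ctypes converts arguments correctly.
libc.getpid.argtypes = []
libc.getpid.restype = ctypes.c_int
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_getpid():
    """Call getpid(2) through libc rather than through the os module."""
    return libc.getpid()


def c_strlen(data):
    """Call strlen(3) on a bytes object."""
    return libc.strlen(data)
```

Plain function calls like these are easy. Setting up registers and a fresh stack before jumping into another program is where the shellcode comes in.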

Non-filesystem temporary files

Also Linux-specific, this technique calls a pair of new syscalls that together can bypass any noexec flags I have been able to muster (tested through kernel 4.19.10). The first syscall is memfd_create(2). Added in 3.17, it allocates a new temporary filesystem with default permissions and creates a file inside it that does not show up in any mounted filesystem except /proc. The second syscall, added in 3.19, is execveat(2). It can take a file descriptor and pass it to the kernel for execution. The downside is that the created file can easily be found with find /proc/*/fd -lname '/memfd:*', since all the memfd_create(2) files are represented as symbolic links with a constant prefix. This feature in particular is rarely used in common software; the only legitimate example I can find on my Linux boxen was added to PulseAudio in 2016.

A (runtime) linker to the past

And then there's the big bottleneck: To run other programs, all these techniques use the standard execve(2) (or the related execveat(2) call in that last case). The presence of tailored SELinux profiles or syscall auditing could easily render them off-limits, and on recent SELinux-enforcing Android builds, it does. There is, however, another technique called userland exec or ul_exec, which mimics how the kernel initializes a process during an execve and then hands control off to the runtime linker: ld.so(8). This is one of the earliest techniques in this field, pioneered by the grugq, though it's never been much pursued because it is a bit fiddly compared to the techniques above. There was an update and rewrite for x86_64, but it implements its own standard library, making it difficult to extend and unable to compile with modern stack-smashing protection.
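The first thing a userland exec has to do is read the same headers the kernel would. The following is a rough sketch of that step only, not the grugq's code or the library discussed later: it assumes a little-endian ELF64 image, and plan_load is a made-up name. It pulls out the entry point, the PT_LOAD segments that need mapping, and the PT_INTERP path naming the runtime linker.

```python
import struct

ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes
ELF64_PHDR = struct.Struct("<IIQQQQQQ")            # 56 bytes
PT_LOAD, PT_INTERP = 1, 3
ET_DYN = 3  # shared objects and position-independent executables


def plan_load(image):
    """Summarize what must be mapped to start this ELF64 image."""
    (ident, e_type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum,
     _shstrndx) = ELF64_HEADER.unpack_from(image, 0)
    # ident[4] == 2 is ELFCLASS64, ident[5] == 1 is little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    segments, interp = [], None
    for index in range(phnum):
        (p_type, p_flags, offset, vaddr, _paddr, filesz, memsz,
         _align) = ELF64_PHDR.unpack_from(image, phoff + index * phentsize)
        if p_type == PT_LOAD:
            segments.append({"offset": offset, "vaddr": vaddr,
                             "filesz": filesz, "memsz": memsz,
                             "flags": p_flags})
        elif p_type == PT_INTERP:
            interp = image[offset:offset + filesz].rstrip(b"\0").decode()
    return {"pie": e_type == ET_DYN, "entry": entry,
            "segments": segments, "interp": interp}
```

If interp is set, the same planning has to be repeated for that file, since control goes to the runtime linker first and it in turn jumps to the program's entry point.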

The Linux world is very different from what it was nearly 15 years ago when this technique was first published. In the modern day, with 40+ bits of address space and position-independent executables by default, cobbling together the execve process is more straightforward. Sure, we can't hardcode the stack address and have it work >90% of the time, but programs don't rely on it being constant anymore, either, so we can put it nearly wherever and do less juggling with memory addresses. Linux environments are now also more common, valuable, and closely guarded than in 2004, so going through the effort is worth it in some scenarios.

Process

There are still two requirements for emulating the work of execve that are not a given all the time, depending on execution method and environment. First, we require page-aligned memory allocation and the ability to mark the memory executable after we populate it. Because it is required for JITs, the built-in library loader dlopen(3), and some DRM implementations, it is difficult to completely remove from a system. However, SELinux can restrict executable memory allocation, and some more self-contained platforms like Android use these restrictions to good effect on things that are not approved browsers or DRM libraries. Next, we require the ability to arbitrarily jump into said memory. Trivial in C or shellcode, it does require a full FFI interface in a scripting language and rules out a non-XS Perl implementation, for example.
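Both requirements can be demonstrated in a few lines with ctypes. Treat this as a sketch under assumptions: the six bytes below are hand-assembled x86_64 machine code for "mov eax, 42; ret" and will not work on other architectures, load_and_call is a hypothetical helper, and the mprotect call is precisely the step an SELinux policy may refuse.

```python
import ctypes
import mmap

# Assumption: x86_64 only. Encodes "mov eax, 42" followed by "ret".
X86_64_RETURN_42 = b"\xb8\x2a\x00\x00\x00\xc3"

libc = ctypes.CDLL(None, use_errno=True)
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.mprotect.restype = ctypes.c_int


def load_and_call(code):
    """Copy code into fresh pages, flip them to read+exec, and call it."""
    pages = (len(code) + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    size = pages * mmap.PAGESIZE
    # Requirement one: page-aligned memory, writable while we populate it.
    buf = mmap.mmap(-1, size,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)
    buf.write(code)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    # ...then marked executable (and no longer writable) afterwards.
    if libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_EXEC) != 0:
        raise OSError(ctypes.get_errno(), "mprotect refused PROT_EXEC")
    # Requirement two: jump into it, here as an int-returning C function.
    func = ctypes.CFUNCTYPE(ctypes.c_int)(addr)
    return func()
```

On an x86_64 box without executable-memory restrictions, load_and_call(X86_64_RETURN_42) should come back with the value the stub placed in eax. A real loader would map whole ELF segments this way instead of a six-byte stub.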

The process detailed by grugq has changed in subtle but interesting ways. Even with all the development that has taken place, the overall steps are the same as before. One security feature of the modern GNU/Linux userland is that it is less concerned with particular memory addresses, which gives us more flexibility to implement our other kind of security feature. There are also more hints that the kernel passes to the runtime in the auxiliary vector, and finding the original is now more desirable, but most programs can work fine with a simple one on most architectures.
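For a sense of what a "simple one" looks like, here is a hedged sketch that builds and parses an auxiliary vector as a flat run of (type, value) pairs ending in AT_NULL. It assumes 64-bit little-endian words, uses the AT_* numbers from the Linux <elf.h>, and both function names are invented for this post.

```python
import struct

AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5
AT_PAGESZ, AT_BASE, AT_ENTRY, AT_RANDOM = 6, 7, 9, 25
AUXV_PAIR = struct.Struct("<QQ")  # assumption: 64-bit little-endian


def build_simple_auxv(phdr, phent, phnum, pagesz, base, entry, random_ptr):
    """Pack the handful of entries a runtime linker usually wants."""
    pairs = [(AT_PHDR, phdr), (AT_PHENT, phent), (AT_PHNUM, phnum),
             (AT_PAGESZ, pagesz), (AT_BASE, base), (AT_ENTRY, entry),
             (AT_RANDOM, random_ptr), (AT_NULL, 0)]
    return b"".join(AUXV_PAIR.pack(key, value) for key, value in pairs)


def parse_auxv(raw):
    """Decode a raw vector (for example /proc/self/auxv) into a dict."""
    usable = len(raw) - len(raw) % AUXV_PAIR.size
    entries = {}
    for key, value in AUXV_PAIR.iter_unpack(raw[:usable]):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries
```

Running parse_auxv over the bytes of /proc/self/auxv shows the original vector the kernel supplied, and build_simple_auxv is the kind of stand-in that would be written onto the new stack after argv and envp.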

The tool

The downfall of many tools, especially in open source and security, is staleness. With execve emulation left aside for other loading methods, the two ul_exec implementations have gotten little interest and (as best as I can tell) few updates from their authors.

Our Linux Meterpreter implementation currently lacks support for the popular -m option to the execute command, which on Windows runs a process completely in memory under the guise of a benign program. Using this technique and a fallback or two from above would give us the capabilities we need to run uploaded files from memory and, with a little trickery such as mapping extra benign files or changing the process name before handing control over to the uploaded executable, completely replicate -m. A nice side effect is that this will also make building and distributing plugins easier, as they will no longer need to be made into a memory image at build time.

To enable this, I am creating a shared library that will live alongside our Linux Meterpreter, mettle. It won't depend on any of the Meterpreter code, but by default, it will build with its toolchain. It will also be free and packageable into whatever post-exploitation mechanism you desire. Be sure to check out the pull request if you have any questions or suggestions.

Here we can see an example tool that uses this library in action as viewed through the microscope of the syscall tracer strace(1). To avoid all the output associated with normal tracing, we are just using the %process trace expression to view only calls associated with the process life cycle like fork, execve, and exit. On x86_64, the %process expression also grabs the arch_prctl syscall, which is only used on x86_64 and only for setting up thread-local storage. The execve is from strace starting the executable, and the first pair of arch_prctl calls is from the library initialization. After that, nothing appears until the target library starts up with its own pair of arch_prctl calls and prints out our message.

$ strace -e trace=%process ./noexec $(which cat) haxmas.txt
execve("./noexec", ["./noexec", "/usr/bin/cat", "haxmas.txt"], 0x7ffdcbdf0bc0 /* 23 vars */) = 0
arch_prctl(0x3001 /* ARCH_??? */, 0x7fffa750dd20) = -1 EINVAL (Invalid argument)
arch_prctl(ARCH_SET_FS, 0x7f17ca7db540) = 0
arch_prctl(0x3001 /* ARCH_??? */, 0x7fffa750dd20) = -1 EINVAL (Invalid argument)
arch_prctl(ARCH_SET_FS, 0x7f17ca7f3540) = 0
Merry, HaXmas!
exit_group(0)                       	= ?
+++ exited with 0 +++

