ctf-reverse — Skillopedia

CTF Reverse Engineering Quick reference for RE challenges. For detailed techniques, see supporting files. Prerequisites Python packages (all platforms): Linux (apt): macOS (Homebrew): radare2 plugins: Manual install: - pwndbg — Linux: GitHub, macOS: Additional Resources - tools.md - Static analysis tools (GDB, Ghidra, radare2, IDA, Binary Ninja, dogbolt.org, RISC-V with Capstone, Unicorn emulation, Python bytecode, WASM, Android APK, .NET, packed binaries) - tools-dynamic.md - Dynamic analysis tools: Frida (hooking, anti-debug bypass, memory scanning, Android/iOS), angr symbolic execution (pa…

) to mark the original string's position. Without it, BWT inversion produces n candidates (one per rotation). Use domain-specific constraints (binary format, XOR round structure, flag prefix) to identify the correct candidate.\n\n---\n\n## OpenType Font Ligature Exploitation for Hidden Messages (Hack The Vote 2016)\n\nFont files with custom OpenType ligatures map visible characters to hidden glyphs. The GSUB (Glyph Substitution) table defines these mappings.\n\n```python\nfrom fontTools.ttLib import TTFont\n\ndef decode_font_ligatures(font_path, encoded_text):\n \"\"\"Extract ligature substitution table and decode message\"\"\"\n font = TTFont(font_path)\n\n # Extract GSUB table for ligature substitutions\n gsub = font['GSUB']\n\n # Navigate to ligature lookup\n ligature_map = {}\n for lookup in gsub.table.LookupList.Lookup:\n for subtable in lookup.SubTable:\n if hasattr(subtable, 'ligatures'):\n for glyph_name, ligatures in subtable.ligatures.items():\n for lig in ligatures:\n # Map: input sequence -> output glyph\n input_seq = [glyph_name] + lig.Component\n output = lig.LigGlyph\n ligature_map[tuple(input_seq)] = output\n\n print(\"Ligature mappings found:\")\n for inp, out in ligature_map.items():\n print(f\" {inp} -> {out}\")\n\n # Alternative: convert TTF to XML for manual analysis\n # font.saveXML('font_dump.xml')\n # Search for \u003cLigatureSubst> entries\n\n# Command-line approach:\n# pip install fonttools\n# ttx font.otf # converts to XML\n# grep -A5 'LigatureSubst' font.ttx\n```\n\n**Key insight:** Custom fonts with GSUB ligature tables create a cipher where displayed characters differ from their glyph mappings. The `fonttools` library's `ttx` command dumps the font to XML, making ligature substitution tables easily readable. 
Each ligature maps an input character sequence to a different output glyph.\n\n---\n\n## GLSL Shader VM with Self-Modifying Code (ApoorvCTF 2026)\n\n**Pattern (Draw Me):** A WebGL2 fragment shader implements a Turing-complete VM on a 256x256 RGBA texture. The texture is both program memory and display output.\n\n**Texture layout:**\n- **Row 0:** Registers (pixel 0 = instruction pointer, pixels 1-32 = general purpose)\n- **Rows 1-127:** Program memory (RGBA = opcode, arg1, arg2, arg3)\n- **Rows 128-255:** VRAM (display output)\n\n**Opcodes:** NOP(0), SET(1), ADD(2), SUB(3), XOR(4), JMP(5), JNZ(6), VRAM-write(7), STORE(8), LOAD(9). 16 steps per frame.\n\n**Self-modifying code:** Phase 1 (decryption) uses STORE opcode to XOR-patch program memory that Phase 2 (drawing) then executes. The decryption overwrites SET instructions with correct pixel color values before the drawing code runs.\n\n**Why GPU rendering fails:** The GPU runs all pixels in parallel per frame, but the shader tracks only ONE write target per pixel per frame. With multiple VRAM writes per frame, only the last survives — losing 75%+ of pixels. Similarly, STORE patches conflict during parallel decryption.\n\n**Solve via sequential emulation:**\n```python\nfrom PIL import Image\nimport numpy as np\n\nimg = Image.open('program.png').convert('RGBA')\nstate = np.array(img, dtype=np.int32).copy()\nregs = [0] * 33\n\n# Phase 1: Trace decryption — apply all STORE patches sequentially\nx, y = start_x, start_y\nwhile True:\n r, g, b, a = state[y][x]\n opcode = int(r)\n if opcode == 1: regs[g] = b & 255 # SET\n elif opcode == 4: regs[g] = regs[b] ^ regs[a] # XOR\n elif opcode == 8: # STORE — patches program memory\n tx, ty = regs[g], regs[b]\n state[ty][tx] = [regs[a], regs[a+1], regs[a+2], regs[a+3]]\n elif opcode == 5: break # JMP to drawing phase\n x += 1\n if x > 255: x, y = 0, y + 1\n\n# Phase 2: Execute drawing code — all VRAM writes preserved\nvram = np.zeros((128, 256), dtype=np.uint8)\n# ... 
trace with opcode 7 writing to vram[ty][tx] = color\nImage.fromarray(vram, mode='L').save('output.png')\n```\n\n**Key insight:** GLSL shaders are Turing-complete but GPU parallelism causes write conflicts. Self-modifying code (STORE patches) compounds the problem — patches from parallel executions overwrite each other. Sequential emulation in Python recovers the full output. The program.png file IS the bytecode.\n\n**Detection:** WebGL/shader challenge with a PNG \"program\" file, challenge says \"nothing renders\" or output is garbled. Look for custom opcode tables in GLSL source.\n\n---\n\n## Instruction Counter as Cryptographic State (MetaCTF Flash 2026)\n\n**Pattern (Who's Counting?):** Hand-written assembly binary uses a dedicated register (e.g., `r12`) as an instruction counter that increments after nearly every instruction. The counter value feeds into XOR, ROL, and multiply transformations on each input byte, making the entire transformation path-dependent on the number of instructions executed before reaching each byte.\n\n**Identification:**\n- Hand-written assembly (no compiler patterns, unusual register usage)\n- A register that only increments (`inc r12` or `add r12, 1`) appearing after most instructions\n- Transformations that reference this counter register (`xor rax, r12`, `rol al, cl` where `cl` derives from counter)\n- Sequential byte processing loop where state carries forward\n\n**Solving approach:**\n```python\n# Byte-by-byte brute force with emulation\n# Since each byte's transformation depends on the counter (which depends\n# on all prior instructions), state is path-dependent.\n\nfrom unicorn import *\nfrom unicorn.x86_const import *\n\ndef try_byte(known_prefix, candidate_byte):\n \"\"\"Emulate binary with known prefix + candidate, check output.\"\"\"\n uc = Uc(UC_ARCH_X86, UC_MODE_64)\n # Map code, stack, data segments\n uc.mem_map(CODE_BASE, 0x10000)\n uc.mem_write(CODE_BASE, binary_code)\n uc.mem_map(STACK_BASE, 0x10000)\n 
uc.mem_map(DATA_BASE, 0x10000)\n\n # Write input: known_prefix + candidate\n test_input = known_prefix + bytes([candidate_byte])\n uc.mem_write(DATA_BASE, test_input + b'\\x00' * (64 - len(test_input)))\n\n # Set up registers (rsp, rdi pointing to input, r12 = 0)\n uc.reg_write(UC_X86_REG_RSP, STACK_BASE + 0x8000)\n uc.reg_write(UC_X86_REG_R12, 0) # instruction counter starts at 0\n\n try:\n uc.emu_start(CODE_BASE + ENTRY_OFFSET, CODE_BASE + EXIT_OFFSET)\n # Read transformed output, compare against expected\n output = uc.mem_read(OUTPUT_ADDR, len(test_input))\n return output[:len(test_input)] == expected[:len(test_input)]\n except:\n return False\n\n# Recover flag byte by byte\nflag = b''\nfor pos in range(FLAG_LEN):\n for b in range(256):\n if try_byte(flag, b):\n flag += bytes([b])\n print(f\"Position {pos}: {chr(b)} -> {flag}\")\n break\n```\n\n**Key insight:** When a register acts as an instruction counter feeding into byte transformations, the transformation of byte N depends on the exact number of instructions executed while processing bytes 0 through N-1. This makes analytical inversion impractical because the counter value at each byte position depends on the execution path through all prior bytes. Byte-by-byte brute force with full emulation (Unicorn or GDB scripting) is the most reliable approach -- try all 256 values for each position, keeping the state from the correct prefix.\n\n**When to recognize:** Binary has no standard library calls, uses unusual registers consistently, and shows a register that only increments. The transformation per byte involves operations (XOR, rotate, multiply) that reference this counter. 
Challenge name hints at \"counting\" or \"instructions\".\n\n**Alternative approaches:**\n- GDB scripting: set breakpoint after each byte's transformation, compare output\n- Static analysis: count instructions manually to compute counter values, then invert transforms algebraically (error-prone due to counter accumulation)\n\n**References:** MetaCTF Flash CTF 2026 \"Who's Counting?\"\n\n---\n\n## Thread Race Condition with Signed Integer Overflow (Codegate 2017)\n\n**Pattern (Hunting):** A game binary uses thread-unsafe skill selection. The attack thread checks `skill_id \u003c= 4` using signed comparison, then sleeps briefly before applying damage. During the sleep, switch to a different skill. The fireball skill uses `cdqe` (sign-extend EAX to RAX), converting `0xFFFFFFFF` (icesword damage) to `-1` as a signed 64-bit value. Subtracting `-1` from the boss's HP (`0x7FFFFFFFFFFFFFFF`) causes signed overflow to a negative value, killing the boss.\n\n```python\n# Race condition exploit:\n# Thread A: select fireball (skill_id=2, passes \u003c= 4 check)\n# Thread A: sleeps for animation\n# Main: switch to icesword (skill_id=5, damage=0xFFFFFFFF)\n# Thread A: wakes, reads damage from icesword slot\n# cdqe: 0xFFFFFFFF -> 0xFFFFFFFFFFFFFFFF (-1 signed)\n# boss_hp -= (-1) -> boss_hp = 0x7FFFFFFFFFFFFFFF + 1 = negative -> dead\n\nimport time, threading\ndef race():\n select_skill(2) # fireball - passes bounds check\n time.sleep(0.001)\n select_skill(5) # icesword - race into damage calculation\n```\n\n**Key insight:** `cdqe` (Convert Doubleword to Quadword Extension) sign-extends 32-bit EAX into 64-bit RAX. When the attack code reads a 32-bit damage value and sign-extends it, `0xFFFFFFFF` becomes `-1`. 
Subtracting a negative number adds to HP, but if HP is already at `INT64_MAX`, the addition overflows to negative, killing the target.\n\n---\n\n## ESP32/Xtensa Firmware Reversing with ROM Symbol Map (Insomni'hack 2017)\n\n**Pattern (Internet of Fail):** ESP32 firmware (Xtensa architecture) with no native IDA support. Use radare2 with the ESP32 ROM linker script (`esp32.rom.ld`) to map function addresses to names. Cross-reference with public ESP32 HTTP server source code to identify the password-checking logic, composed of ~20 conditional XOR functions operating on a global state variable.\n\n```bash\n# Load ESP32 firmware in radare2\nr2 -a xtensa -b 32 firmware.bin\n\n# Apply ROM symbol map from ESP-IDF\n# esp32.rom.ld maps addresses like:\n# 0x40000000 = ets_printf\n# 0x400013A0 = cache_Read_Enable\n# Load as flags: . esp32.rom.ld.r2\n\n# Identify HTTP request handler by cross-referencing\n# with esp-idf/examples/protocols/http_server\n# Look for URI handler registration patterns\n```\n\n**Key insight:** ESP32's Xtensa architecture lacks mainstream RE tool support, but the ESP-IDF SDK provides ROM linker scripts mapping every ROM function address to its name. Loading these as symbols in radare2 immediately resolves hundreds of function calls. 
Cross-referencing with public ESP-IDF example code identifies application-level patterns (HTTP handlers, WiFi callbacks) even in stripped firmware.\n\n---\n\n## Batch Crackme Automation via objdump Pattern Extraction (DEF CON 2017)\n\nSolve hundreds of identical-structure crackmes by scripting `objdump` to extract comparison values and arithmetic operations, computing keys without execution.\n\n```bash\n# Simple variant: extract CMP immediates directly\nobjdump -M intel -d $binary | grep -P \"cmp\\s+rdi\" | \\\n grep -oP \"0x\\w{1,2}\" | xxd -r -p\n\n# Complex variant: parse add/sub/cmp chains and reverse-compute\n# Each binary: series of add/sub rdi,N then cmp rdi,target\n# Reverse: start from target, undo operations in reverse order\npython3 \u003c\u003c'EOF'\nimport subprocess, re, glob\nfor binary in sorted(glob.glob(\"crackmes/*\")):\n asm = subprocess.check_output([\"objdump\", \"-M\", \"intel\", \"-d\", binary]).decode()\n ops = re.findall(r'(add|sub)\\s+rdi,(0x\\w+)', asm)\n target = int(re.search(r'cmp\\s+rdi,(0x\\w+)', asm).group(1), 16)\n # Reverse operations\n for op, val in reversed(ops):\n val = int(val, 16)\n target = (target - val) if op == 'add' else (target + val)\n print(chr(target & 0xff), end='')\nEOF\n```\n\n**Key insight:** Mass crackme challenges (100s-1000s of binaries) have identical structure with per-binary constants. Script `objdump` disassembly parsing to extract immediates and arithmetic sequences, then reverse-compute the key algebraically. No execution or emulation needed.\n\n---\n\n## Fork + Pipe + Dead Branch Anti-Analysis (RCTF 2017)\n\nBinary uses fork/pipe IPC where the parent writes data and exits, child reads from pipe and continues. 
Key validation is in a dead branch (always-false comparison) that requires binary patching to reach.\n\n```bash\n# Detection: fork() + pipe() + read()/write() in main\n# The child process reads from pipe, needs to know its own PID\n\n# Dead branch pattern:\n# cmp DWORD PTR [ebp-0xc], 0x1 ; compares 0 with 1, always false\n# je real_flag_computation ; never taken\n\n# Patch: change comparison value from 0x1 to 0x0\n# Find: 83 7d f4 01 → change to: 83 7d f4 00\npython3 -c \"\ndata = open('binary','rb').read()\ndata = data.replace(b'\\x83\\x7d\\xf4\\x01', b'\\x83\\x7d\\xf4\\x00')\nopen('binary_patched','wb').write(data)\n\"\n```\n\n**Key insight:** Fork+pipe creates a parent-child relationship where the parent provides data and exits. Dead branches (comparisons that always evaluate to false) hide the real validation logic. `strace` reveals the fork/pipe/read pattern; patching the comparison constant reaches the hidden code path.\n\n---\n\n---\n\n## Time-Locked Binary with Date-Based Key (Hack.lu 2017)\n\nBinary reads the system date and only executes correctly on a specific date (e.g., December 21, 2012). The date constant appears in the binary as a Unix timestamp or structured date comparison.\n\n**Detection:** Look for comparisons against large integer constants that fall in a recognizable date range (Unix timestamps: 2012 = ~1.35B, 2017 = ~1.5B). Cultural significance helps: apocalypse dates, CTF release dates, historical events.\n\n```bash\n# Set system clock to the required date\nsudo date -s \"2012-12-21 00:00:00\"\n./binary\n\n# Or use faketime to avoid system-wide change\nLD_PRELOAD=/usr/lib/faketime/libfaketime.so.1 FAKETIME=\"2012-12-21 00:00:00\" ./binary\n\n# Restore system time afterward\nsudo ntpdate pool.ntp.org\n```\n\n**In IDA/Ghidra:** Search for `time()` or `localtime()` calls. The struct `tm` fields to watch: `tm_year` (years since 1900), `tm_mon` (0-based), `tm_mday`.\n\n**Key insight:** Time-based keys use culturally significant dates. 
Always check for date comparisons in reversed code and try setting the system clock or using faketime before attempting deeper analysis.\n\n**References:** Hack.lu CTF 2017\n\n---\n\n## ARM Code in Image Pixels via UnicornJS (Hack.lu 2017)\n\nJavaScript challenge embeds ARM bytecode in image pixel data. The image is base64-encoded in the HTML/JS source. Pixel RGBA values encode ARM instructions. A bundled UnicornJS library (ARM CPU emulator in JavaScript) extracts and executes the bytecode.\n\n**Identification flow:**\n1. Find base64 blob in JS source → decode → PNG/BMP file\n2. Identify UnicornJS import (`unicorn.js`, `uc.js`, or similar) → confirms ARM emulation\n3. Pixel extraction loop: RGBA bytes concatenated in raster order form the ARM instruction stream\n4. Feed the extracted bytes to an ARM disassembler\n\n```python\nfrom PIL import Image\nimport capstone\n\nimg = Image.open('decoded.png').convert('RGBA')\npixels = list(img.getdata())\n\n# Extract ARM bytecode from pixel data (4 bytes per pixel: R, G, B, A)\narm_code = bytes([channel for pixel in pixels for channel in pixel])\n\n# Disassemble as ARM Thumb or ARM32\nmd = capstone.Cs(capstone.CS_ARCH_ARM, capstone.CS_MODE_THUMB)\nfor insn in md.disasm(arm_code, 0x0):\n print(f\"0x{insn.address:04x}: {insn.mnemonic} {insn.op_str}\")\n```\n\n**Key insight:** Multi-layer obfuscation: ARM code in image pixels, base64 encoded, emulated via UnicornJS at runtime. Identify the emulator library first to know which ISA to reverse — the library name reveals the architecture.\n\n**References:** Hack.lu CTF 2017\n\n---\n\n## x86 16-bit MBR psadbw Constraint Solving (CSAW 2017)\n\nBootable MBR uses SSE2 `psadbw` (Packed Sum of Absolute Differences of Bytes) on xmm registers to validate the flag. 
Each iteration masks 2 input bytes, computes `psadbw` against known constants, and compares the sum to an expected value.\n\n**`psadbw` semantics:**\n```asm\npsadbw xmm0, xmm1\n; For each 64-bit half: sum += |xmm0[i] - xmm1[i]| over its 8 byte pairs\n; Each 16-bit sum is stored in the low word of that qword of xmm0 (remaining bits zeroed)\n```\n\nThis generates sum-of-absolute-differences equations:\n```text\n|a[0] - k[0]| + |a[1] - k[1]| + ... + |a[7] - k[7]| = C\n```\n\n**Solution approach:**\n```python\nfrom itertools import product\n\n# For each 2-byte masked group, extract the constants and expected sum\n# Equations are not purely linear (absolute value), but printable ASCII\n# constrains each byte to [0x20, 0x7e], limiting brute-force space\n\ndef solve_psadbw_group(known_constants, expected_sum, printable_range=(0x20, 0x7e)):\n    \"\"\"Brute-force 2 unknown bytes given sum-of-abs-diff constraint.\"\"\"\n    lo, hi = printable_range  # inclusive bounds, so 0x7e ('~') is tested too\n    solutions = []\n    for a, b in product(range(lo, hi + 1), repeat=2):\n        pair = [a, b]\n        sad = sum(abs(pair[i] - known_constants[i]) for i in range(len(pair)))\n        if sad == expected_sum:\n            solutions.append(bytes([a, b]))\n    return solutions\n\n# For ambiguous cases with multiple solutions: apply additional constraints\n# (flag format prefix, character frequency, subsequent iterations)\n```\n\n**Key insight:** `psadbw` creates sum-of-absolute-difference equations, which are not purely linear but are solvable with constrained brute-force when bytes are limited to printable ASCII. Each 2-byte group is independent, keeping the search space to 95^2 = 9025 candidates per group.\n\n**References:** CSAW CTF 2017\n\n---\n\n## TensorFlow DNN Inversion by Inverting Sigmoid Layers (N1CTF 2018)\n\n**Pattern:** Binary implements a 5-layer deep neural network with sigmoid activation. The input (flag characters) is transformed as `1.0/char_value` before feeding into the network. 
Extract weights and biases from the binary, then compute the inverse layer-by-layer: apply inverse-sigmoid, subtract bias, multiply by weight matrix inverse.\n\n```python\nimport numpy as np\n\ndef sigmoid_inv(x):\n    return -np.log(1.0/x - 1.0)\n\n# Invert layer by layer from output to input\nv = target_output\nfor i in range(num_layers - 1, -1, -1):\n    v = np.dot(sigmoid_inv(v) - biases[i], np.linalg.inv(weights[i]))\n\n# Input was 1.0/char, so flag chars are the multiplicative inverse\nflag = ''.join(chr(int(round(1.0 / v[j]))) for j in range(len(v)))\n```\n\n**Key insight:** Neural networks with invertible activation functions (sigmoid, tanh) and square weight matrices can be mathematically inverted layer-by-layer. Apply inverse-sigmoid, subtract bias, multiply by weight inverse. Watch for input transformations (e.g., 1/x) that must also be inverted.\n\n**Detection:** Binary with TensorFlow or custom DNN implementation. Look for sigmoid/tanh calls, matrix multiplications, and hardcoded float arrays (weights/biases) in `.rodata`. Square weight matrices (N x N) suggest the network is invertible, provided each matrix is non-singular.\n\n**References:** N1CTF 2018\n\n---\n\n## BPF Filter Analysis via JIT Compilation to x64 Assembly (Midnight Sun CTF 2018)\n\n**Pattern:** Binary creates a raw socket with a BPF (Berkeley Packet Filter) attached. When standard BPF disassemblers fail to produce readable output, enable the kernel's BPF JIT compiler to convert BPF bytecode to native x64 machine code, then read the compiled code from dmesg.\n\n```bash\n# Enable BPF JIT compilation with debug output\n# (1 only enables the JIT; 2 also dumps the generated code to the kernel log)\necho 2 > /proc/sys/net/core/bpf_jit_enable\n\n# Run the binary, then read JIT-compiled BPF from kernel log\ndmesg | grep -A 100 \"flen=\"\n\n# Analysis revealed: expects DNS TXT query on UDP port 3333\ndig @target -p 3333 'M4d!bKn3~l' TXT\n```\n\n**Key insight:** Linux can JIT-compile BPF filters to native x64 machine code. 
When standard BPF disassemblers fail or produce unreadable output, set `bpf_jit_enable` to 2 so the kernel dumps the JIT image, then read the compiled code from dmesg (the kernel tree's `bpf_jit_disasm` tool can decode the dump). The native code is often easier to understand than BPF bytecode.\n\n**Detection:** Binary using `setsockopt` with `SO_ATTACH_FILTER`, raw socket creation (`socket(AF_PACKET, ...)`), or embedded `struct sock_fprog` structures. BPF programs appear as arrays of `struct sock_filter` (8 bytes each: opcode, jt, jf, k).\n\n**References:** Midnight Sun CTF 2018\n\n---\n\n## Single-Byte XOR ROM Deobfuscation Sweep (X-MAS CTF 2018)\n\n**Pattern:** A large opaque blob (GBA ROM, firmware, game binary) refuses `binwalk`/`file` identification. Sweep all 256 single-byte XOR keys and re-run `file` + `strings` over the outputs; the correct key reveals a recognisable magic/signature.\n\n```bash\nfor i in $(seq 0 255); do\n    python3 -c \"\nk = $i\nd = open('blob.bin','rb').read()\nopen(f'xor_{k}','wb').write(bytes(b^k for b in d))\"\n    file \"xor_$i\" | grep -v data\ndone\n# Output files are named by decimal key, e.g. key 0x42 is xor_66\nstrings \"xor_66\" | grep -i \"POKEMON\\|ELF\\|MZ\"\n```\n\n**Key insight:** Brute-forcing 256 XOR keys costs seconds and defeats any single-byte XOR packer. Always run this sweep before assuming a custom algorithm; look for format magics (`ELF`, `PK`, `MZ`, `PDF-`, ROM name strings) in the output.\n\n**References:** X-MAS CTF 2018 — Unown Gift, writeup 12665\n\n---\n\n## WebKit Array.slice OOB CVE-2016-4622 (Codegate 2019)\n\n**Pattern:** Challenge ships a WebKit binary with the `isJSArray(thisObj) && length == toLength(...)` bounds check commented out inside `ArrayPrototype.cpp`. `Array.prototype.slice` then reads beyond the backing store, giving OOB into adjacent JS objects. 
Chain Saelo's `addrof` / `fakeobj` primitives to obtain arbitrary read/write in the JS heap, then pivot to native code via a fake `StructureID`.\n\n```javascript\nlet oob = new Array(8);\nlet victim = {a: 1};\nlet leak = oob.slice(-1, oob.length + 16)[0]; // reads past backing store\n```\n\n**Key insight:** Any JIT/engine challenge that patches out a safety check almost always exposes a classic browser-CVE primitive. Diff the vendored source against upstream `ArrayPrototype.cpp`, `JSArray.cpp`, and `JITOperations.cpp` for removed `if`/`assert` statements — that's the bug.\n\n**References:** Codegate CTF 2019 — Butterfree, writeup 12902\n\n---\n\n## Multi-Modulus CRT Keygen with Matrix Lookup Password (Pragyan CTF 2019)\n\n**Pattern (Super Secure Vault):** `main` asks for a numeric `key` (\u003c= 30 digits) and checks it against five independent modular equations derived from slicing a hardcoded big number `N = \"27644437104591489104652716127\"` into `[27644437, 10459, 1489, 1046527, 16127]`:\n\n```\nkey mod 27644437 == 213\nkey mod 10459 == 229\nkey mod 1489 == 25\nkey mod 1046527 == 83\nkey mod 16127 == 135\n```\n\nThe five moduli are pairwise coprime, so the Chinese Remainder Theorem yields the smallest valid `key = 3087629750608333480917556`. 
After `scanf`, `func2(password, key, N)` concatenates `key + N + \"80\"` into `v12`, then validates each password byte against a 10000-byte lookup table:\n\n```python\n# Round 1: index = 100*(10*d0 + d1) + 10*d_mid + d_mid+1\n# Round 2: index = 100*((10*d0+d1)**2 % 97) + ((10*d_mid+d_mid+1)**2 % 97)\npassword = b\"\"\nv8, v10 = 0, len(v12) // 2\nwhile v8 \u003c len(v12) // 2:\n    idx = 100 * (10*v12[v8] + v12[v8+1]) + 10*v12[v10] + v12[v10+1]\n    password += bytes([matrix[idx]]); v8 += 2; v10 += 2\nv9, v11 = 0, len(v12) // 2\nwhile v9 \u003c len(v12) // 2:\n    a = 10*v12[v9] + v12[v9+1]; b = 10*v12[v11] + v12[v11+1]\n    password += bytes([matrix[100*(a*a % 97) + (b*b % 97)]])\n    v9 += 2; v11 += 2\n```\n\nCRT (via `sympy.ntheory.modular.crt` or a manual `mul_inv` routine) plus `matrix` dumped from the binary reproduces the flag `pctf{R3v3rS1Ng_#s_h311_L0t_Of_Fun}`.\n\n**Key insight:** Five pairwise coprime moduli pin the key down uniquely modulo their product (~7.3e24), which fits well within a 30-digit input. Pick the smallest representative instead of brute-forcing; otherwise you waste hours on 100k+ equally valid but ugly keys. 
The second stage looks complicated but is really a pair of fixed index-generator functions over a static table; dump the table once and both rounds become direct array reads.\n\n**References:** Pragyan CTF 2019 — Super Secure Vault, writeup 13760\n\n---\n\nSee also: [patterns-ctf.md](patterns-ctf.md) for Part 1, [patterns-ctf-2.md](patterns-ctf-2.md) for Part 2 (multi-layer self-decrypting binary, embedded ZIP+XOR license, stack string deobfuscation, prefix hash brute-force, CVP/LLL lattice, decision tree obfuscation, GF(2^8) Gaussian elimination).\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":38811,"content_sha256":"a877bf89d5610f81112cdd02dd56260e82371ea4ec612b580bae68df95b96083"},{"filename":"patterns-ctf.md","content":"# CTF Reverse - Competition-Specific Patterns (Part 1)\n\n## Table of Contents\n- [Hidden Emulator Opcodes + LD_PRELOAD Key Extraction (0xFun 2026)](#hidden-emulator-opcodes--ld_preload-key-extraction-0xfun-2026)\n- [Spectre-RSB SPN Cipher — Static Parameter Extraction (0xFun 2026)](#spectre-rsb-spn-cipher--static-parameter-extraction-0xfun-2026)\n- [Image XOR Mask Recovery via Smoothness (VuwCTF 2025)](#image-xor-mask-recovery-via-smoothness-vuwctf-2025)\n- [Shellcode in Data Section via mmap RWX (VuwCTF 2025)](#shellcode-in-data-section-via-mmap-rwx-vuwctf-2025)\n- [Recursive execve Subtraction (VuwCTF 2025)](#recursive-execve-subtraction-vuwctf-2025)\n- [Byte-at-a-Time Block Cipher Attack (UTCTF 2024)](#byte-at-a-time-block-cipher-attack-utctf-2024)\n- [Mathematical Convergence Bitmap (EHAX 2026)](#mathematical-convergence-bitmap-ehax-2026)\n- [Windows PE XOR Bitmap Extraction + OCR (srdnlenCTF 2026)](#windows-pe-xor-bitmap-extraction--ocr-srdnlenctf-2026)\n- [Two-Stage Loader: RC4 Gate + VM Constraints (srdnlenCTF 2026)](#two-stage-loader-rc4-gate--vm-constraints-srdnlenctf-2026)\n- [GBA ROM VM Hash Inversion via Meet-in-the-Middle (srdnlenCTF 
2026)](#gba-rom-vm-hash-inversion-via-meet-in-the-middle-srdnlenctf-2026)\n- [Sprague-Grundy Game Theory Binary (DiceCTF 2026)](#sprague-grundy-game-theory-binary-dicectf-2026)\n- [Kernel Module Maze Solving (DiceCTF 2026)](#kernel-module-maze-solving-dicectf-2026)\n- [Multi-Threaded VM with Channel Synchronization (DiceCTF 2026)](#multi-threaded-vm-with-channel-synchronization-dicectf-2026)\n- [Backdoored Shared Library Detection via String Diffing (Hack.lu CTF 2012)](#backdoored-shared-library-detection-via-string-diffing-hacklu-ctf-2012)\n- [Custom binfmt Kernel Module with RC4 Flat Binaries (BSidesSF 2026)](#custom-binfmt-kernel-module-with-rc4-flat-binaries-bsidessf-2026)\n- [Hash-Resolved Imports / No-Import Ransomware (BSidesSF 2026)](#hash-resolved-imports--no-import-ransomware-bsidessf-2026)\n- [ELF Section Header Corruption for Anti-Analysis (BSidesSF 2026)](#elf-section-header-corruption-for-anti-analysis-bsidessf-2026)\n- [VM Trace Diffing Instead of Full Disassembly (CONFidence CTF 2019 Teaser)](#vm-trace-diffing-instead-of-full-disassembly-confidence-ctf-2019-teaser)\n\n---\n\n## Hidden Emulator Opcodes + LD_PRELOAD Key Extraction (0xFun 2026)\n\n**Pattern (CHIP-8):** Non-standard opcode `FxFF` triggers hidden `superChipRendrer()` → AES-256-CBC decryption. Key derived from binary constants.\n\n**Technique:**\n1. Check all instruction dispatch branches for non-standard opcodes\n2. Hidden opcode may trigger crypto functions (OpenSSL)\n3. 
Use an `LD_PRELOAD` hook on `EVP_DecryptInit_ex` to capture the AES key at runtime:\n\n```c\n#define _GNU_SOURCE            // required for RTLD_NEXT\n#include \u003cdlfcn.h>\n#include \u003cstdio.h>\n#include \u003copenssl/evp.h>\n\nint EVP_DecryptInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type,\n                       ENGINE *impl, const unsigned char *key,\n                       const unsigned char *iv) {\n    // Log key (key may be NULL when the caller sets the cipher first)\n    if (key) {\n        for (int i = 0; i \u003c 32; i++) printf(\"%02x\", key[i]);\n        printf(\"\\n\");\n    }\n    // Call original\n    return ((typeof(EVP_DecryptInit_ex)*)dlsym(RTLD_NEXT, \"EVP_DecryptInit_ex\"))\n        (ctx, type, impl, key, iv);\n}\n```\n\n```bash\n# Libraries go after the source file; the real function is resolved at runtime via dlsym\ngcc -shared -fPIC hook.c -o hook.so -ldl\nLD_PRELOAD=./hook.so ./emulator rom.ch8\n```\n\n---\n\n## Spectre-RSB SPN Cipher — Static Parameter Extraction (0xFun 2026)\n\n**Pattern:** Binary uses cache side channels to implement S-boxes, but ALL cipher parameters (round keys, S-box tables, permutation) are in the binary's data section.\n\n**Key insight:** Don't try to run on special hardware. Extract parameters statically:\n- 8 S-boxes × 8 output bits, 256 entries each\n- Values `0x340` = bit 1, `0x100` = bit 0\n- 64-byte permutation table, 8 round keys\n\n```python\n# Extract from binary data section\nimport struct\nsbox = [[0]*256 for _ in range(8)]\nfor i in range(8):\n    for j in range(256):\n        val = struct.unpack('\u003cI', data[sbox_offset + (i*256+j)*4 : ...])[0]\n        sbox[i][j] = 1 if val == 0x340 else 0\n```\n\n**Lesson:** Side-channel implementations embed lookup tables in memory. Extract statically.\n\n---\n\n## Image XOR Mask Recovery via Smoothness (VuwCTF 2025)\n\n**Pattern (Trianglification):** Image divided into triangle regions, each XOR-encrypted with `key = (mask * x - y) & 0xFF` where mask is unknown (0-255).\n\n**Recovery:** Natural images have smooth gradients. 
Brute-force mask (256 values per region), score by neighbor pixel differences:\n\n```python\nimport numpy as np\nfrom PIL import Image\n\nimg = np.array(Image.open('encrypted.png'))\n\ndef score_smoothness(region_pixels, mask, positions):\n decrypted = []\n for (x, y), pixel in zip(positions, region_pixels):\n key = (mask * x - y) & 0xFF\n decrypted.append(pixel ^ key)\n # Score: sum of absolute differences between adjacent pixels\n return -sum(abs(decrypted[i] - decrypted[i+1]) for i in range(len(decrypted)-1))\n\nfor region in regions:\n best_mask = max(range(256), key=lambda m: score_smoothness(region, m, positions))\n```\n\n**Search space:** 256 candidates × N regions = trivial. Smoothness is a reliable scoring metric for natural images.\n\n---\n\n## Shellcode in Data Section via mmap RWX (VuwCTF 2025)\n\n**Pattern (Missing Function):** Binary relocates data to RWX memory (mmap with PROT_READ|PROT_WRITE|PROT_EXEC) and jumps to it.\n\n**Detection:** Look for `mmap` with PROT_EXEC flag. Embedded shellcode often uses XOR with rotating key.\n\n**Analysis:** Extract data section, apply XOR key (try 3-byte rotating), disassemble result.\n\n---\n\n## Recursive execve Subtraction (VuwCTF 2025)\n\n**Pattern (String Inspector):** Binary recursively calls itself via `execve`, subtracting constants each time.\n\n**Solution:** Find base case and work backward. Often a mathematical relationship like `N * M + remainder`.\n\n---\n\n## Byte-at-a-Time Block Cipher Attack (UTCTF 2024)\n\n**Pattern (PES-128):** First output byte depends only on first input byte (no diffusion).\n\n**Attack:** For each position, try all 256 byte values, compare output byte with target ciphertext. One match per byte = full plaintext recovery without knowing the key.\n\n**Detection:** Change one input byte → only corresponding output byte changes. 
This means zero cross-byte diffusion = trivially breakable.\n\n---\n\n## Mathematical Convergence Bitmap (EHAX 2026)\n\n**Pattern (Compute It):** Binary classifies complex-plane coordinates by Newton's method convergence. The classification results, arranged as a grid, spell out the flag in ASCII art.\n\n**Recognition:**\n- Input file with coordinate pairs (x, y)\n- Binary iterates a mathematical function (e.g., z^3 - 1 = 0) and outputs pass/fail\n- Grid dimensions hinted by point count (e.g., 2600 = 130×20)\n- 5-pixel-high ASCII art font common in CTFs\n\n**Newton's method for z^3 - 1:**\n```python\ndef newton_converges_to_one(px, py, max_iter=50, target_count=12):\n \"\"\"Returns True if Newton's method converges to z=1 in exactly target_count steps.\"\"\"\n x, y = px, py\n count = 0\n for _ in range(max_iter):\n f_real = x**3 - 3*x*y**2 - 1.0\n f_imag = 3*x**2*y - y**3\n J_rr = 3.0 * (x**2 - y**2)\n J_ri = 6.0 * x * y\n det = J_rr**2 + J_ri**2\n if det \u003c 1e-9:\n break\n x -= (f_real * J_rr + f_imag * J_ri) / det\n y -= (f_imag * J_rr - f_real * J_ri) / det\n count += 1\n if abs(x - 1.0) \u003c 1e-6 and abs(y) \u003c 1e-6:\n break\n return count == target_count\n\n# Read coordinates and render bitmap\npoints = [(float(x), float(y)) for x, y in ...]\nbits = [1 if newton_converges_to_one(px, py) else 0 for px, py in points]\nWIDTH = 130 # 2600 / 20 rows\nfor r in range(len(bits) // WIDTH):\n print(''.join('#' if bits[r*WIDTH+c] else '.' for c in range(WIDTH)))\n```\n\n**Key insight:** The binary is a mathematical classifier, not a flag checker. The flag is in the visual pattern of classifications, not in the binary's output. Reverse-engineer the math, apply to all coordinates, and visualize as bitmap.\n\n---\n\n## Windows PE XOR Bitmap Extraction + OCR (srdnlenCTF 2026)\n\n**Pattern (Artistic Warmup):** Binary renders input text, compares rendered bitmap against expected pixel data stored XOR'd with constant in `.rdata`. 
No need to compute — extract expected pixels directly.\n\n**Attack:**\n1. Reverse the core check function to identify rendering and comparison logic\n2. Find the expected pixel blob in `.rdata` (look for large data block referenced near comparison)\n3. XOR with constant (e.g., 0xAA) to recover expected rendered DIB\n4. Save as image and OCR to recover flag text\n\n```python\nimport numpy as np\nfrom PIL import Image\n\nwith open(\"binary.exe\", \"rb\") as f:\n data = f.read()\n\n# Extract from .rdata section (offsets from reversing)\nblob_offset = 0xC3620 # .rdata offset to XOR'd blob\nblob_size = 0x15F90 # 450 * 50 * 4 (BGRA)\nblob = np.frombuffer(data[blob_offset:blob_offset + blob_size], dtype=np.uint8)\nexpected = blob ^ 0xAA # XOR with constant key\n\n# Reshape as BGRA image (dimensions from reversing)\nimg = expected.reshape(50, 450, 4)\nchannel = img[:, :, 0] # Take one channel (grayscale text)\nImage.fromarray(channel, \"L\").save(\"target.png\")\n\n# OCR with charset whitelist\nimport subprocess\nresult = subprocess.run(\n [\"tesseract\", \"target.png\", \"stdout\", \"-c\",\n \"tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789{}_\"],\n capture_output=True, text=True)\nprint(result.stdout)\n```\n\n**Key insight:** When a binary renders text and compares pixels, the expected pixel data is the flag rendered as an image. Extract it directly from the binary data section without needing to understand the rendering logic. 
OCR with charset whitelist improves accuracy for CTF flag characters.\n\n---\n\n## Two-Stage Loader: RC4 Gate + VM Constraints (srdnlenCTF 2026)\n\n**Pattern (Cornflake v3.5):** Two-stage malware loader — stage 1 uses RC4 username gate, stage 2 downloaded from C2 contains VM-based password validation.\n\n**Stage 1 — RC4 username recovery:**\n```python\ndef rc4(key, data):\n s = list(range(256))\n j = 0\n for i in range(256):\n j = (j + s[i] + key[i % len(key)]) & 0xFF\n s[i], s[j] = s[j], s[i]\n i = j = 0\n out = bytearray()\n for b in data:\n i = (i + 1) & 0xFF\n j = (j + s[i]) & 0xFF\n s[i], s[j] = s[j], s[i]\n out.append(b ^ s[(s[i] + s[j]) & 0xFF])\n return bytes(out)\n\n# Key from binary strings, ciphertext from stored hex\nusername = rc4(b\"s3cr3t_k3y_v1\", bytes.fromhex(\"46f5289437bc009c17817e997ae82bfbd065545d\"))\n```\n\n**Stage 2 — VM constraint extraction:**\n1. Download stage 2 from C2 endpoint (e.g., `/updates/check.php`)\n2. Reverse VM bytecode interpreter (typically 15-20 opcodes)\n3. Extract linear equality constraints over flag characters\n4. Solve constraint system (Z3 or manual)\n\n**Key insight:** Multi-stage loaders often use simple crypto (RC4) for the first gate and more complex validation (custom VM) for the second. The VM memory may be uninitialized (all zeros), drastically simplifying constraint extraction since memory-dependent operations become constants.\n\n---\n\n## GBA ROM VM Hash Inversion via Meet-in-the-Middle (srdnlenCTF 2026)\n\n**Pattern (Dante's Trial):** Game Boy Advance ROM implements a custom VM. Hash function uses FNV-1a variant with uninitialized memory (stays all zeros). 
Meet-in-the-middle attack splits the search space.\n\n**Hash function structure:**\n```python\n# FNV-1a variant with XOR/multiply\nP = 0x100000001b3 # FNV prime\nCUP = 0x9e3779b185ebca87 # Golden ratio constant\nMASK64 = (1 \u003c\u003c 64) - 1\n\ndef fmix64(h):\n \"\"\"Finalization mixer.\"\"\"\n h ^= h >> 33; h = (h * 0xff51afd7ed558ccd) & MASK64\n h ^= h >> 33; h = (h * 0xc4ceb9fe1a85ec53) & MASK64\n h ^= h >> 33\n return h\n\ndef hash_input(chars, seed_lo=0x84222325, seed_hi=0xcbf29ce4):\n hlo, hhi, ptr = seed_lo, seed_hi, 0\n for c in chars:\n # tri_mix(c, mem[ptr]) — mem is always 0\n delta = ((ord(c) * CUP) ^ (0 * P)) & MASK64\n hlo = ((hlo ^ (delta & 0xFFFFFFFF)) * (P & 0xFFFFFFFF)) & 0xFFFFFFFF\n hhi = ((hhi ^ (delta >> 32)) * (P >> 32)) & 0xFFFFFFFF\n ptr = (ptr + 1) & 0xFF\n combined = ((hhi \u003c\u003c 32) | (hlo ^ ptr)) & MASK64\n return fmix64((combined * P) & MASK64)\n```\n\n**Meet-in-the-middle attack:**\n```python\nimport string\n\nTARGET = 0x73f3ebcbd9b4cd93\nLENGTH = 6\nSPLIT = 3\ncharset = [c for c in string.printable if 32 \u003c= ord(c) \u003c 127]\n\n# Forward pass: enumerate first 3 characters from seed state\nforward = {}\nfor c1 in charset:\n for c2 in charset:\n for c3 in charset:\n state = hash_forward(seed, [c1, c2, c3])\n forward[state] = c1 + c2 + c3\n\n# Backward pass: invert fmix64 and final multiply, enumerate last 3 chars\ninv_target = invert_fmix64(TARGET)\nfor c4 in charset:\n for c5 in charset:\n for c6 in charset:\n state = hash_backward(inv_target, [c4, c5, c6])\n if state in forward:\n print(f\"Found: {forward[state]}{c4}{c5}{c6}\")\n```\n\n**Key insight:** Meet-in-the-middle reduces search from `95^6 ≈ 7.4×10^11` to `2×95^3 ≈ 1.7×10^6` — a factor of ~430,000x speedup. Critical when the hash function is invertible from the output side (i.e., `fmix64` and the final multiply can be undone). 
Also: uninitialized VM memory that stays zero simplifies the hash function by removing a variable.\n\n---\n\n## Sprague-Grundy Game Theory Binary (DiceCTF 2026)\n\n**Pattern (Bedtime):** Stripped Rust binary plays N rounds of bounded Nim. Each round has piles and max-move parameter k. Binary uses a PRNG for moves when in a losing position; user must respond optimally so the PRNG eventually generates an invalid move (returns 1). Sum of return values must equal a target.\n\n**Game theory identification:**\n- Bounded Nim: remove 1 to k items from any pile per turn\n- **Grundy value** per pile: `pile_value % (k+1)`\n- **XOR** of all Grundy values: non-zero = winning (N-position), zero = losing (P-position)\n- N-positions: computer wins automatically (returns 0)\n- P-positions: computer uses PRNG, may make invalid move (returns 1)\n\n**PRNG state tracking through user feedback:**\n```python\nMASK64 = (1 \u003c\u003c 64) - 1\n\ndef prng_step(state, pile_count, k):\n \"\"\"Computer's PRNG move. Returns (pile_idx, amount, new_state).\"\"\"\n r12 = state[2] ^ 0x28027f28b04ccfa7\n rax = (state[1] + r12) & MASK64\n s0_new = ROL64((state[0] ** 2 + rax) & MASK64, 32)\n r12_upd = (r12 + rax) & MASK64\n s0_final = ROL64((s0_new ** 2 + r12_upd) & MASK64, 32)\n\n pile_idx = rax % pile_count\n amount = (r12_upd % k) + 1\n return pile_idx, amount, [s0_final, r12_upd, state[2]]\n\n# Critical: state[2] updated ONLY by user moves (XOR of pile_idx, amount, new_value)\n# PRNG moves do NOT affect state[2] — creates feedback loop\n```\n\n**Solving approach:**\n1. Dump game data from GDB (all entries with pile values and parameters)\n2. Classify: count P-positions (return 1) vs N-positions (return 0)\n3. Simulate each P-position: PRNG moves → user responds optimally → track state[2]\n4. 
Encode user moves as input format (4-digit decimal pairs, reversed order)\n\n**Key insight:** When a game binary's PRNG state depends on user input, you must simulate the full feedback loop — not just solve the game theory. Use GDB hardware watchpoints to discover which state variables are affected by user vs computer moves.\n\n---\n\n## Kernel Module Maze Solving (DiceCTF 2026)\n\n**Pattern (Explorer):** Rust kernel module implements a 3D maze via `/dev/challenge` ioctls. Navigate the maze, avoid decoy exits (status=2), find the real exit (status=1), read the flag.\n\n**Ioctl enumeration:**\n| Command | Description |\n|---------|-------------|\n| `0x80046481-83` | Get maze dimensions (3 axes, 8-16 each) |\n| `0x80046485` | Get status: 0=playing, 1=WIN, 2=decoy |\n| `0x80046486` | Get wall bitfield (6 directions) |\n| `0x80406487` | Get flag (64 bytes, only when status=1) |\n| `0x40046488` | Move in direction (0-5) |\n| `0x6489` | Reset position |\n\n**DFS solver with decoy avoidance:**\n```c\n// Minimal static binary using raw syscalls (no libc) for small upload size\n// gcc -nostdlib -static -Os -fno-builtin -o solve solve.c -Wl,--gc-sections && strip solve\n\nint visited[16][16][16];\nint bad[16][16][16]; // decoy positions across resets\n\nvoid dfs(int fd, int x, int y, int z) {\n    if (visited[x][y][z] || bad[x][y][z]) return;\n    visited[x][y][z] = 1;\n\n    int status = ioctl_get_status(fd);\n    if (status == 1) { read_flag(fd); exit(0); }\n    if (status == 2) { bad[x][y][z] = 1; return; } // decoy — mark bad\n\n    int walls = ioctl_get_walls(fd);\n    // Direction numbering is an assumption: 0/1 = +x/-x, 2/3 = +y/-y, 4/5 = +z/-z\n    int dx[] = {1,-1,0,0,0,0}, dy[] = {0,0,1,-1,0,0}, dz[] = {0,0,0,0,1,-1};\n    // opp[d] must undo the delta of d; keep it in sync with dx/dy/dz above\n    int opp[] = {1,0,3,2,5,4};\n\n    for (int dir = 0; dir \u003c 6; dir++) {\n        // As written, a set bit is treated as passable; drop the `!` if set bits mean walls\n        if (!(walls & (1 \u003c\u003c dir))) continue;\n        ioctl_move(fd, dir);\n        dfs(fd, x+dx[dir], y+dy[dir], z+dz[dir]);\n        ioctl_move(fd, opp[dir]); // backtrack\n    }\n}\n// After decoy hit: reset via ioctl 0x6489, clear visited, 
re-run DFS\n```\n\n**Remote deployment:** Upload binary via base64 chunks over netcat shell, decode, execute.\n\n**Key insight:** For kernel module challenges, injecting test binaries into initramfs and probing ioctls dynamically is faster than static RE of stripped kernel modules. Keep solver binary minimal (raw syscalls, no libc) for fast upload.\n\n---\n\n## Multi-Threaded VM with Channel Synchronization (DiceCTF 2026)\n\n**Pattern (locked-in):** Custom stack-based VM runs 16 concurrent threads verifying a 30-char flag. Threads communicate via futex-based channels. Pipeline: input → XOR scramble → transformation → base-4 state machine → final check.\n\n**Analysis approach:**\n1. **Identify thread roles** by tracing channel read/write patterns in GDB\n2. **Extract constants** (XOR scramble values, lookup tables) via breakpoints on specific opcodes\n3. **Watch for inverted logic:** validity check returns 0 for valid, non-zero for blocked (opposite of intuition)\n4. **Detect futex quirks:** `unlock_pi` on unowned mutex returns EPERM=1, which can change all computations\n\n**BFS state space search for constrained state machines:**\n```python\nfrom collections import deque\n\ndef solve_flag(scramble_vals, lookup_table, initial_state, target_state):\n \"\"\"BFS through state machine to find valid flag bytes.\"\"\"\n flag = [None] * 30\n # Known prefix/suffix from flag format\n flag[0:5] = list(b'dice{')\n flag[29] = ord('}')\n\n # For each unknown position, try all printable ASCII\n states = {initial_state}\n for pos in range(28, 4, -1): # processed in reverse\n next_states = {}\n for state in states:\n for ch in range(32, 127):\n transformed = transform(ch, scramble_vals[pos])\n digits = to_base4(transformed)\n new_state = apply_digits(state, digits, lookup_table)\n if new_state is not None: # valid path exists\n next_states.setdefault(new_state, []).append((state, ch))\n states = set(next_states.keys())\n\n # Trace back from target_state to recover 
flag\n```\n\n**Key insight:** Multi-threaded VMs require tracing data flow across thread boundaries. Channel-based communication creates a pipeline — identify each thread's role (input, transform, validate, output) by watching which channels it reads/writes. Constants that affect computation may come from unexpected sources (futex return values, thread IDs).\n\n---\n\n## Backdoored Shared Library Detection via String Diffing (Hack.lu CTF 2012)\n\n**Pattern (Zombie Lockbox):** A setuid binary uses `strcmp` for password validation. The expected password is visible via `strings` and works under GDB (which drops suid), but fails when run normally. The binary links against a non-standard libc that patches function behavior based on suid status.\n\n**Detection steps:**\n1. Check for non-standard library paths with `ldd`:\n```bash\nldd ./binary\n# Suspicious: libc.so.6 => /lib/libc/libc.so.6 (non-standard path)\n# Normal: libc.so.6 => /lib32/libc.so.6\n```\n\n2. Diff strings between the suspicious and system libc:\n```bash\nstrings /lib/libc/libc.so.6 > suspicious_strings\nstrings /lib32/libc-2.15.so > normal_strings\ndiff suspicious_strings normal_strings\n```\n\n3. Disassemble the patched function (e.g., `puts`) to find injected code:\n```bash\ngdb /lib/libc/libc.so.6\n(gdb) disas puts\n# Look for unexpected calls or branches\n# Injected code may check suid status (getuid/geteuid syscalls)\n# and swap the expected password at runtime\n```\n\n**Key insight:** When a binary behaves differently under GDB vs. normal execution, check `ldd` for non-standard library paths. Suid binaries drop privileges under debuggers, so a backdoored libc can detect this via `getuid`/`geteuid` syscalls and change program behavior accordingly. 
The `strings | diff` approach quickly reveals injected data without full disassembly.\n\n---\n\n## Custom binfmt Kernel Module with RC4 Flat Binaries (BSidesSF 2026)\n\n**Pattern (Private Binary):** A custom Linux kernel module (`.ko`) registers a `binfmt` handler for non-standard binary formats. When a file with a specific magic number is executed, the kernel module intercepts it, decrypts the contents in memory, and jumps to the entry point.\n\n**Reverse engineering approach:**\n1. **Analyze the `.ko`:** Look for the `register_binfmt()` call — it registers a `struct linux_binfmt` with a `load_binary` callback\n2. **Find the magic number:** The `load_binary` function checks the file's first bytes against a specific magic number to identify its format\n3. **Extract the encryption key:** Look for `movabs` instructions loading 8-byte constants — these are often RC4 key bytes\n4. **Identify the encryption scheme:** Common choices are RC4, XOR, or AES-ECB. RC4 is identifiable by the S-box initialization loop (256-byte array, swap pattern)\n5. 
**Decrypt the flat binary:** Apply the recovered key to the encrypted file contents, skipping any header\n\n```python\nfrom Crypto.Cipher import ARC4\n\n# Extract RC4 key from kernel module (found via movabs instructions)\nkey = bytes([0x41, 0x42, 0x43, ...]) # Key bytes from .ko disassembly\n\nwith open('encrypted.bin', 'rb') as f:\n header = f.read(HEADER_SIZE) # Skip binfmt header\n encrypted = f.read()\n\ncipher = ARC4.new(key)\ndecrypted = cipher.decrypt(encrypted)\n\n# The decrypted output is a flat binary (no ELF headers)\n# Load at the fixed virtual address specified in the kernel module\n# Disassemble with: objdump -b binary -m i386:x86-64 -D decrypted.bin\n# Or in Ghidra: import as \"Raw Binary\", set base address from .ko\n```\n\n**Detection in kernel module:**\n- `register_binfmt` / `unregister_binfmt` calls\n- `vm_mmap()` or `vm_brk()` for memory allocation at fixed addresses\n- Direct jump to mapped memory (entry point execution)\n- S-box initialization pattern (RC4): loop 0-255, swap `S[i]` with `S[j]`\n\n**Key insight:** The flat binary has no ELF headers, so standard tools won't recognize it. You must extract the load address from the kernel module (look for the `vm_mmap` call's address argument) and import the decrypted blob at that address in your disassembler. RC4 keys in kernel modules are often stored as immediate values in `mov` or `movabs` instructions rather than in data sections.\n\n**References:** BSidesSF 2026 \"Private Binary\"\n\n---\n\n## Hash-Resolved Imports / No-Import Ransomware (BSidesSF 2026)\n\n**Pattern (Ran Somewhere):** Malware binary has zero visible imports — all API calls are resolved at runtime by hashing symbol names and comparing against pre-computed hash values. 
The binary uses `dlopen` + a custom hash table to find libc and libcrypto functions.\n\n**Identification:**\n- `readelf -d` shows no dynamic symbols or very few (just `dlopen`/`dlsym`)\n- Strings reveal no standard API names\n- Disassembly shows hash computation loops followed by indirect calls\n- RC4-encrypted embedded strings (RSA public key, file paths, passphrases)\n\n**Analysis shortcut — LD_PRELOAD key extraction:**\n\nRather than reversing the full hash resolution and key derivation, hook the crypto functions that the malware ultimately calls:\n\n```c\n// hook_crypto.c — captures AES key used by the ransomware\n#define _GNU_SOURCE\n#include \u003cdlfcn.h>\n#include \u003copenssl/evp.h>\n#include \u003cstdio.h>\n\nint EVP_CipherInit_ex(EVP_CIPHER_CTX *ctx, const EVP_CIPHER *type,\n ENGINE *impl, const unsigned char *key,\n const unsigned char *iv) {\n if (key) {\n FILE *f = fopen(\"/tmp/aes_key.bin\", \"wb\");\n fwrite(key, 1, 32, f); // AES-256\n fclose(f);\n fprintf(stderr, \"[HOOK] AES key captured\\n\");\n }\n typedef int (*orig_t)(EVP_CIPHER_CTX*, const EVP_CIPHER*, ENGINE*,\n const unsigned char*, const unsigned char*);\n orig_t orig = (orig_t)dlsym(RTLD_NEXT, \"EVP_CipherInit_ex\");\n return orig(ctx, type, impl, key, iv);\n}\n```\n\n```bash\n# Compile and run\ngcc -shared -fPIC -o hook.so hook_crypto.c -ldl\n# Run in Docker container (ransomware may be destructive!)\ndocker run --rm -v $(pwd):/work -w /work ubuntu:22.04 \\\n bash -c \"LD_PRELOAD=./hook.so ./ransomware; xxd /tmp/aes_key.bin\"\n```\n\n**Hash resolution patterns:**\n- **SipHash variant:** Two 64-bit seeds, iterative mixing with symbol name bytes\n- **DJB2/FNV variants:** Simpler hash functions with recognizable constants (`5381`, `0xcbf29ce484222325`)\n- **ROR13-based:** Windows malware favorite: `hash = (hash >> 13) | (hash \u003c\u003c 19); hash += c`\n\n**Decryption after key capture:**\n```python\nfrom Crypto.Cipher import AES\n\nkey = open('/tmp/aes_key.bin', 'rb').read()\niv = 
open('/tmp/aes_iv.bin', 'rb').read() # Also hookable\ncipher = AES.new(key, AES.MODE_CBC, iv)\n\nwith open('flag.txt.enc', 'rb') as f:\n ct = f.read()\npt = cipher.decrypt(ct)\n# Remove PKCS7 padding\npt = pt[:-pt[-1]]\nprint(pt.decode())\n```\n\n**Key insight:** When a binary resolves all imports via hashing, don't waste time reversing the hash function and building a rainbow table. Instead, let the malware resolve everything itself by running it in a sandboxed environment with `LD_PRELOAD` hooks on the functions you care about (OpenSSL crypto functions, file I/O, network calls). The AES key is deterministic across runs — if it works once, it works always.\n\n**Safety:** Always run suspected ransomware in a Docker container or VM. Mount only copies of the encrypted files, never originals.\n\n**References:** BSidesSF 2026 \"Ran Somewhere\"\n\n---\n\n## ELF Section Header Corruption for Anti-Analysis (BSidesSF 2026)\n\n**Pattern (stubborn-elf):** An ELF binary has deliberately corrupted section header table entries, causing standard analysis tools (`readelf`, `objdump`, IDA, Ghidra) to crash or produce errors. However, the **program headers** (which the OS loader uses) are intact, so the binary executes normally. The flag is appended after the corrupted sections, marked with magic bytes.\n\n```python\nimport sys\n\n# Standard tools fail on corrupted section headers\n# Manual parsing bypasses section headers entirely\n\nwith open(\"stubborn_elf\", \"rb\") as f:\n data = f.read()\n\n# Search for magic marker appended after ELF sections\nmagic = b\"\\xDE\\xAD\\xBE\\xEF\\xCA\\xFE\\xBA\\xBE\"\nidx = data.find(magic)\nif idx >= 0:\n # Data after magic is XOR-encrypted\n encrypted = data[idx + len(magic):]\n decrypted = bytes(b ^ 0x42 for b in encrypted)\n print(decrypted.decode(errors='ignore'))\n```\n\n**Key insight:** ELF execution requires **program headers** (PT_LOAD segments), NOT section headers. 
Section headers are metadata for debuggers and analysis tools — they're optional at runtime. Corrupting `e_shoff`, `e_shnum`, or `e_shstrndx` in the ELF header breaks tools but not execution. When tools fail, parse the binary manually or patch the ELF header to zero out section header references before loading in a disassembler.\n\n**Recovery approach:**\n```bash\n# Patch section header offset to 0 (removes section table)\nprintf '\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00' | dd of=binary bs=1 seek=40 conv=notrunc\n# Now Ghidra/IDA can load it using program headers only\n\n# Or use readelf -l (program headers only, ignores sections)\nreadelf -l stubborn_elf\n```\n\n**When to recognize:** `readelf -S` crashes or shows garbage. `file` command identifies it as ELF. `readelf -l` (lowercase L, program headers) works fine. The binary runs normally despite tool failures.\n\n**References:** BSidesSF 2026 \"stubborn-elf\"\n\n---\n\n## VM Trace Diffing Instead of Full Disassembly (CONFidence CTF 2019 Teaser)\n\n**Pattern (Go Machine):** Go binary runs a 15-handler stack-VM (`0123456789OEQLCI` dispatch string) whose opcode meanings are rotated by an LCG-driven `shuffle` handler after every tick. 
Rewriting the interpreter faithfully is painful; the VM actually computes a simple 32-bit hash over 4-character input groups.\n\nInstead, attach a debugger-driven tracer to the dispatch routine and dump `(opcode, stack)` per step — then compare traces for two nearly identical inputs:\n\n```python\n# Pseudo-code for an IDAPython / gdb conditional-breakpoint tracer\ndef on_dispatch():\n    op = read_byte(bytecode + pc)\n    top = stack[:sp+1]\n    print(f\"{decode(op)}\\t({'|'.join(hex(x) for x in top)})\")\n\n# Replay the dumped trace in plain Python; no bytecode parsing, no shuffle logic.\n# (Excerpt from the replay loop; branches for other trace lines are omitted.)\nelif line.startswith('save at (0x51)'):\n    return stack[top] == expected_hash # calculated hash lands at mem[0x51]\n\n# Diff trace(\"abcd\") vs trace(\"dcba\") -> the same mul/mod sequence shows up,\n# revealing the real algorithm:\ndef calc_hash(x, mod):\n    x_original = x # keep the input; the loop below overwrites x\n    for _ in range(8):\n        x = x * x % mod # eight squarings: x^(2^8) mod `mod`\n    return x * x_original % mod # overall x^257 mod `mod`\n```\n\nDump the per-group moduli (`[0x88ca6b51, 0x8405b751, 0xbfa08c87, 0x82013f23, 0x4666751b, 0x5271083f]`) and expected hashes from the trace, then brute-force 4-character permutations of `string.printable` against `calc_hash`.\n\n**Key insight:** Custom VMs with self-modifying dispatch (shuffle, rotor, LCG-keyed opcode table) are designed to punish naive reimplementation. 
Recording the executed instruction stream bypasses the trick entirely — the trace is deterministic for a given input, and diffing two traces with a single-bit difference localises the \"real\" algorithm hiding under the VM overhead.\n\n**References:** CONFidence CTF 2019 Teaser — Go Machine, writeup 13947\n\n---\n\nSee also: [patterns-ctf-2.md](patterns-ctf-2.md) for Part 2 (multi-layer self-decrypting binary, embedded ZIP+XOR license, stack string deobfuscation, prefix hash brute-force, CVP/LLL lattice, decision tree obfuscation, GF(2^8) Gaussian elimination), [patterns-ctf-3.md](patterns-ctf-3.md) for Part 3 (Z3 boolean circuit, sliding window popcount, keyboard LED Morse code, C++ destructor-hidden validation, VM sequential key-chain brute-force, BWT inversion, OpenType font ligature exploitation, GLSL shader VM with self-modifying code).\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":30764,"content_sha256":"fdc7f03538336bba962ccb84ad51af3d208e8ebe53536f97cae2080eece9ced1"},{"filename":"patterns-runtime.md","content":"# CTF Reverse - Runtime Patching and Oracle Techniques\n\nMalware unpacking, multi-stage shellcode, timing/signal side channels, and CTF-specific oracle attacks that rely on runtime state rather than static pattern matching.\n\nFor static reversing patterns (custom VMs, anti-debug, self-modifying code, LLVM obfuscation, S-box generation, SECCOMP/BPF, memory dumps, x86-64 gotchas, byte-wise transforms), see [patterns.md](patterns.md).\n\n## Table of Contents\n- [Malware Anti-Analysis Bypass via Patching](#malware-anti-analysis-bypass-via-patching)\n- [Multi-Stage Shellcode Loaders](#multi-stage-shellcode-loaders)\n- [Timing Side-Channel Attack](#timing-side-channel-attack)\n- [Multi-Thread Anti-Debug with Decoy + Signal Handler Mixed Boolean-Arithmetic (ApoorvCTF 2026)](#multi-thread-anti-debug-with-decoy--signal-handler-mixed-boolean-arithmetic-apoorvctf-2026)\n- [INT3 Patch + Coredump Brute-Force Oracle (Pwn2Win 
2016)](#int3-patch--coredump-brute-force-oracle-pwn2win-2016)\n- [Signal Handler Chain + LD_PRELOAD Oracle (Nuit du Hack 2016)](#signal-handler-chain--ld_preload-oracle-nuit-du-hack-2016)\n- [printf Format String VM Decompilation to Z3 (SECCON 2017)](#printf-format-string-vm-decompilation-to-z3-seccon-2017)\n- [Quadtree Recursive Image Format Parser (Google CTF Quals 2018)](#quadtree-recursive-image-format-parser-google-ctf-quals-2018)\n\n---\n\n## Malware Anti-Analysis Bypass via Patching\n\n**Pattern (Carrot):** Malware with multiple environment checks before executing payload.\n\n**Common checks to patch:**\n| Check | Technique | Patch |\n|-------|-----------|-------|\n| `ptrace(PTRACE_TRACEME)` | Anti-debug | Change `cmp -1` to `cmp 0` |\n| `sleep(150)` | Anti-sandbox timing | Change sleep value to 1 |\n| `/proc/cpuinfo` \"hypervisor\" | Anti-VM | Flip `JNZ` to `JZ` |\n| \"VMware\"/\"VirtualBox\" strings | Anti-VM | Flip `JNZ` to `JZ` |\n| `getpwuid` username check | Environment | Flip comparison |\n| `LD_PRELOAD` check | Anti-hook | Skip check |\n| Fan count / hardware check | Anti-VM | Flip `JLE` to `JGE` |\n| Hostname check | Environment | Flip `JNZ` to `JZ` |\n\n**Ghidra patching workflow:**\n1. Find check function, identify the conditional jump\n2. Click on instruction → `Ctrl+Shift+G` → modify opcode\n3. For `JNZ` (0x75) → `JZ` (0x74), or vice versa\n4. For immediate values: change operand bytes directly\n5. Export: press `O` → choose \"Original File\" format\n6. `chmod +x` the patched binary\n\n**Server-side validation bypass:**\n- If patched binary sends system info to remote server, patch the data too\n- Modify string addresses in data-gathering functions\n- Change format strings to embed correct values directly\n\n---\n\n## Multi-Stage Shellcode Loaders\n\n**Pattern (I Heard You Liked Loaders):** Nested shellcode with XOR decode loops and anti-debug.\n\n**Debugging workflow:**\n1. Break at `call rax` in launcher, step into shellcode\n2. 
Bypass ptrace anti-debug: step to syscall, `set $rax=0`\n3. Step through XOR decode loop (or break on `int3` if hidden)\n4. Repeat for each stage until final payload\n\n**Flag extraction from `mov` instructions:**\n```python\n# Final stage loads flag 4 bytes at a time via mov ebx, value\n# Extract little-endian 4-byte chunks\nvalues = [0x6174654d, 0x7b465443, ...] # From disassembly\nflag = b''.join(v.to_bytes(4, 'little') for v in values)\n```\n\n---\n\n## Timing Side-Channel Attack\n\n**Pattern (Clock Out):** Validation time varies per correct character (longer sleep on match).\n\n**Exploitation:**\n```python\nimport time\nfrom pwn import *\n\nflag = \"\"\nfor pos in range(flag_length):\n best_char, best_time = '', 0\n for c in string.printable:\n io = remote(host, port)\n start = time.time()\n io.sendline((flag + c).ljust(total_len, 'X'))\n io.recvall()\n elapsed = time.time() - start\n if elapsed > best_time:\n best_time = elapsed\n best_char = c\n io.close()\n flag += best_char\n```\n\n---\n\n## Multi-Thread Anti-Debug with Decoy + Signal Handler Mixed Boolean-Arithmetic (ApoorvCTF 2026)\n\n**Pattern (A Golden Experience Requiem):** Multi-threaded binary with layered anti-analysis: Thread 1 performs decoy operations (fake AES + deliberate crash via `ud2`), Thread 2 does the real flag computation in a SIGSEGV signal handler using Mixed Boolean Arithmetic (MBA), Thread 3 erases memory to prevent post-mortem analysis.\n\n**Thread layout:**\n| Thread | Purpose | Trap |\n|--------|---------|------|\n| Thread 1 | Decoy: AES-looking operations → `ud2` crash | Analysts waste time reversing fake crypto |\n| Thread 2 | Real flag: SIGSEGV handler with MBA transforms | Hidden in signal handler, not main code path |\n| Thread 3 | Memory eraser: zeros out flag data after computation | Prevents memory dumping |\n| Main | rdtsc-based anti-debug timing check | Penalizes debugger-attached execution |\n\n**Solving approach — pure Python emulation of MBA logic:**\n```python\n# 
MBA helpers (extracted from assembly)\ndef mba_add(a, b): return (a + b) & 0xff\ndef mba_xor(a, b): return (a ^ b) & 0xff\n\ndef mba_transform(i):\n \"\"\"Position-dependent transform from signal handler.\"\"\"\n val = (i * 7 + 0x3f) & 0xff\n rotated = ((i \u003c\u003c 3) | (i >> 5)) & 0xff\n return mba_xor(val, rotated)\n\n# S-box (SHA-256 initial hash values repurposed)\nSBOX = [0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,\n 0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19]\n\ndef sbox_lookup(i):\n idx = i & 7\n shift = ((i >> 3) & 3) * 8\n return (SBOX[idx] >> shift) & 0xff\n\n# Two interleaved rodata arrays (even indices → array1, odd → array2)\nrodata1 = bytes.fromhex(\"39407691b717c97879013adf3a2adea11c2b04e0\")\nrodata2 = bytes.fromhex(\"bb19b025e37eaa786c4116e7aeea00c9c623940d\")\n\nflag = []\nfor i in range(40): # flag length\n t = mba_transform(i)\n s = sbox_lookup(i)\n mem = rodata1[i // 2] if i % 2 == 0 else rodata2[i // 2]\n flag.append(chr(t ^ s ^ mem))\n\nprint(''.join(flag))\n```\n\n**Key insight:** The real flag logic is in the signal handler (SIGSEGV/SIGILL), not the main thread. Thread 1's AES-like code and `ud2` crash are intentional misdirection. The `rdtsc` timing check detects debuggers and corrupts output. Bypass by extracting the MBA logic from assembly and reimplementing in Python — never run the binary under a debugger.\n\n**Detection indicators:**\n- Multiple `pthread_create` calls with different handler functions\n- `signal(SIGSEGV, handler)` or `sigaction` setup\n- `ud2` instruction (deliberate illegal instruction)\n- `rdtsc` instructions for timing checks\n- SHA-256 constants (0x6a09e667...) 
used as lookup tables, not for hashing\n\n---\n\n## INT3 Patch + Coredump Brute-Force Oracle (Pwn2Win 2016)\n\nInstead of reversing complex transformation logic, patch a byte to `0xCC` (INT3) just after the transform, enable core dumps, then brute-force each character by running the binary and extracting the transformed result from the coredump via `strings`.\n\n```bash\n# Patch byte at transform output point to 0xCC.\n# dd seeks by file offset, not virtual address: subtract the load base\n# (0x400000 assumes a non-PIE binary whose first segment maps file offset 0).\nprintf '\\xcc' | dd of=binary bs=1 seek=$((0x400ebb - 0x400000)) conv=notrunc\nulimit -c unlimited\n# Brute-force each position:\nfor c in $(seq 32 126); do\n    echo -ne \"$(printf '\\\\x%02x' $c)$known_suffix\" | ./binary 2>/dev/null\n    strings core | grep -q \"$expected\" && echo \"Found: $c\"\ndone\n```\n\n**Key insight:** Use INT3/SIGTRAP as a breakpoint oracle: the coredump captures computed state at the crash point. This avoids full reverse engineering of the transformation.\n\n---\n\n## Signal Handler Chain + LD_PRELOAD Oracle (Nuit du Hack 2016)\n\nBinary uses Unix signals for flow control: `main()` sends SIGINT to itself 1024 times, each handler checks one password character, then calls `signal()` to install the next handler. Bypass: LD_PRELOAD a custom `signal()` that logs when it's called (indicating correct character), brute-force each position.\n\n```c\n// LD_PRELOAD library:\n#define _GNU_SOURCE // exposes the sighandler_t typedef in glibc\n#include \u003csignal.h>\n#include \u003cunistd.h> // write()\n\nsighandler_t signal(int sig, sighandler_t handler) {\n    write(2, \"CORRECT\\n\", 8); // signal() called = char was correct\n    return SIG_DFL;\n}\n```\n\n**Key insight:** Signal-handler-chain anti-reversing can be defeated by hooking `signal()` via LD_PRELOAD. The call to `signal()` (to install the next handler) acts as a side-channel confirming the current character.\n\n---\n\n## printf Format String VM Decompilation to Z3 (SECCON 2017)\n\nA \"virtual machine\" implemented entirely via `%hhn` format strings. The `%hhn` conversion writes the count of characters printed so far (mod 256) to a pointed-to byte. 
A sequence of `%Nc%hhn` instructions implements arbitrary byte-to-memory writes, effectively creating a bytecode VM.\n\n**Step 1: Identify instruction types.**\nCount unique format patterns to determine the instruction set:\n```bash\n# Normalize numbers and count unique patterns\nsed -e 's/[[:digit:]]\\+/1/g' program.fs | sort | uniq -c | sort -nr\n```\n\n**Step 2: Write a decompiler.**\nConvert format patterns to C-style pseudocode. Each `%N...%hhn` pair maps to a memory write: extract the write address (from the argument pointer) and value (from the character count).\n\n**Step 3: Recognize the algorithm.**\nThe pseudocode typically reveals a linear equation system over bytes. Map memory addresses to symbolic variables.\n\n**Step 4: Generate Z3 constraints and solve.**\n```python\nfrom z3 import *\n\nflag_len = 32 # adjust based on decompiled output\nflag = [BitVec(f'f{i}', 8) for i in range(flag_len)]\ns = Solver()\n\n# Constrain to printable ASCII\nfor f in flag:\n s.add(f >= 0x20, f \u003c= 0x7e)\n\n# Add constraints from decompiled format string operations\n# e.g., flag[3] + flag[7] == 0xAB (mod 256)\n# These come from the write sequences: each %hhn accumulates\n# character counts and writes the result to a target byte\ns.add((flag[0] + flag[1]) & 0xFF == 0x9A) # example constraint\ns.add((flag[2] ^ flag[3]) & 0xFF == 0x3F) # example constraint\n# ... (add all constraints from decompilation)\n\nif s.check() == sat:\n m = s.model()\n print(bytes([m[f].as_long() for f in flag]))\n```\n\n**Decompilation approach in detail:**\n1. Extract the write address and value from each `%N...%hhn` pair\n2. Map memory addresses to symbolic variables (flag bytes)\n3. Build an equation system from the write sequences\n4. Solve with Z3\n\n**Key insight:** Format string `%hhn` writes the count of printed characters (mod 256) to a pointed-to byte. A sequence of `%Nc%hhn` instructions implements arbitrary byte-to-memory writes, effectively creating a bytecode VM. 
Decompile by: (1) extract the write address and value from each `%N...%hhn` pair, (2) map memory addresses to symbolic variables, (3) build an equation system from the write sequences, (4) solve with Z3.\n\n**References:** SECCON 2017\n\n---\n\n## Quadtree Recursive Image Format Parser (Google CTF Quals 2018)\n\n**Pattern:** Challenge ships a proprietary image format. Reverse engineering shows it is a quadtree: the canvas is split into the largest enclosing power-of-two square, that square is recursively split into four quadrants, and a 1-byte command tells which of the four to subdivide further. Quadrants marked as \"leaf\" are followed by three bytes of RGB color; the rest recurse.\n\n```python\n# Command byte: bits 3..0 = {top-left, top-right, bottom-left, bottom-right}\n# Bit set ⇒ subdivide; bit clear ⇒ leaf (next 3 bytes = RGB)\n\ndef parse(stream, x, y, size):\n cmd = stream.read(1)[0]\n half = size // 2\n children = [\n (x, y ),\n (x + half, y ),\n (x, y + half),\n (x + half, y + half),\n ]\n for i, (cx, cy) in enumerate(children):\n if cmd & (1 \u003c\u003c (3 - i)):\n parse(stream, cx, cy, half)\n else:\n rgb = stream.read(3)\n fill_rect(cx, cy, half, half, rgb)\n```\n\nWalk the recursion until `half == 1` (or until a \"leaf\" bit is seen) and paint the canvas as the format pushes bytes. The flag image renders correctly once the quadrant bit order is matched.\n\n**Key insight:** Proprietary image/compression formats in CTF challenges are almost always quadtrees, LZ77 variants, or Huffman streams. Look for recursive structures with a short command byte followed by either more commands or fixed-width leaf data. 
Prototype the parser by printing the recursion depth and offset for each call — mismatched depth is the first signal that the bit order or leaf size is wrong.\n\n**References:** Google CTF Quals 2018 — writeup 10335\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":12280,"content_sha256":"d48ba0d017493db44417cb4cffb128286d77dc032d4038b45ff07addf51c8495"},{"filename":"patterns.md","content":"# CTF Reverse - Patterns & Techniques\n\n## Table of Contents\n- [Custom VM Reversing](#custom-vm-reversing)\n - [Analysis Steps](#analysis-steps)\n - [Common VM Patterns](#common-vm-patterns)\n - [RVA-Based Opcode Dispatching](#rva-based-opcode-dispatching)\n - [State Machine VMs (90K+ states)](#state-machine-vms-90k-states)\n - [Custom VM Reverse Engineering via Fuzzing and Instruction Set Discovery (hxp CTF 2017)](#custom-vm-reverse-engineering-via-fuzzing-and-instruction-set-discovery-hxp-ctf-2017)\n- [Anti-Debugging Techniques](#anti-debugging-techniques)\n - [Common Checks](#common-checks)\n - [Bypass Technique](#bypass-technique)\n - [LD_PRELOAD Hook](#ld_preload-hook)\n - [pwntools Binary Patching (Crypto-Cat)](#pwntools-binary-patching-crypto-cat)\n- [Nanomites](#nanomites)\n - [Linux (Signal-Based)](#linux-signal-based)\n - [Windows (Debug Events)](#windows-debug-events)\n - [Analysis](#analysis)\n- [Self-Modifying Code](#self-modifying-code)\n - [Pattern: XOR Decryption](#pattern-xor-decryption)\n- [Known-Plaintext XOR (Flag Prefix)](#known-plaintext-xor-flag-prefix)\n - [Variant: XOR with Position Index](#variant-xor-with-position-index)\n- [Mixed-Mode (x86-64 / x86) Stagers](#mixed-mode-x86-64--x86-stagers)\n- [LLVM (Low Level Virtual Machine) Obfuscation (Control Flow Flattening)](#llvm-low-level-virtual-machine-obfuscation-control-flow-flattening)\n - [Pattern](#pattern)\n - [De-obfuscation](#de-obfuscation)\n- [S-Box / Keystream Generation](#s-box--keystream-generation)\n - [Fisher-Yates Shuffle 
(Xorshift32)](#fisher-yates-shuffle-xorshift32)\n - [Xorshift64* Keystream](#xorshift64-keystream)\n - [Identifying Patterns](#identifying-patterns)\n- [SECCOMP/BPF Filter Analysis](#seccompbpf-filter-analysis)\n - [BPF Analysis](#bpf-analysis)\n- [Exception Handler Obfuscation](#exception-handler-obfuscation)\n - [RtlInstallFunctionTableCallback](#rtlinstallfunctiontablecallback)\n - [Vectored Exception Handlers (VEH)](#vectored-exception-handlers-veh)\n- [Memory Dump Analysis](#memory-dump-analysis)\n - [When Binary Dumps Memory](#when-binary-dumps-memory)\n - [Known Plaintext Attack](#known-plaintext-attack)\n- [Byte-Wise Uniform Transforms](#byte-wise-uniform-transforms)\n- [x86-64 Gotchas](#x86-64-gotchas)\n - [Sign Extension](#sign-extension)\n - [Loop Boundary State Updates](#loop-boundary-state-updates)\n- [Custom Mangle Function Reversing](#custom-mangle-function-reversing)\n- [Position-Based Transformation Reversing](#position-based-transformation-reversing)\n- [Hex-Encoded String Comparison](#hex-encoded-string-comparison)\n- [Signal-Based Binary Exploration](#signal-based-binary-exploration)\n\nFor malware patching, multi-stage shellcode loaders, timing/signal oracles, and CTF-specific runtime attacks (INT3 coredump oracle, signal handler chain, printf format string VM, quadtree image format), see [patterns-runtime.md](patterns-runtime.md).\n\n---\n\n## Custom VM Reversing\n\n### Analysis Steps\n1. Identify VM structure: registers, memory, instruction pointer\n2. Reverse `executeIns`/`runvm` function for opcode meanings\n3. Write a disassembler to parse bytecode\n4. 
Decompile disassembly to understand algorithm\n\n### Common VM Patterns\n```c\nswitch (opcode) {\n case 1: *R[op1] *= op2; break; // MUL\n case 2: *R[op1] -= op2; break; // SUB\n case 3: *R[op1] = ~*R[op1]; break; // NOT\n case 4: *R[op1] ^= mem[op2]; break; // XOR\n case 5: *R[op1] = *R[op2]; break; // MOV\n case 7: if (R0) IP += op1; break; // JNZ\n case 8: putc(R0); break; // PRINT\n case 10: R0 = getc(); break; // INPUT\n}\n```\n\n### RVA-Based Opcode Dispatching\n- Opcodes are RVAs pointing to handler functions\n- Handler performs operation, reads next RVA, jumps\n- Map all handlers by following RVA chain\n\n### State Machine VMs (90K+ states)\n```java\n// BFS for valid path\nvar agenda = new ArrayDeque\u003cState>();\nagenda.add(new State(0, \"\"));\nwhile (!agenda.isEmpty()) {\n var current = agenda.remove();\n if (current.path.length() == TARGET_LENGTH) {\n println(current.path);\n continue;\n }\n for (var transition : machine.get(current.state).entrySet()) {\n agenda.add(new State(transition.getValue(),\n current.path + (char)transition.getKey()));\n }\n}\n```\n\n**Key insight:** Custom VMs appear when the challenge bundles a bytecode blob alongside a dispatcher loop. Reverse the opcode switch table first, then write a disassembler to lift the bytecode before attempting to understand the algorithm.\n\n### Custom VM Reverse Engineering via Fuzzing and Instruction Set Discovery (hxp CTF 2017)\n\nMethodical black-box approach to reversing unknown VM bytecode when static analysis of the dispatch loop is too complex:\n\n**Step 1: Determine instruction alignment.**\nDump the bytecode as bit strings at various widths (6-11 bits) to identify instruction alignment. Look for repeating patterns that suggest opcode boundaries.\n\n**Step 2: Fuzz with random bytes.**\nSend single instructions and observe effects on registers/memory to map opcodes. 
Reduce to minimal programs: find the shortest input that produces each observable effect.\n\n**Step 3: Build the instruction set.**\nExample discovered ISA (variable-length 6-11 bit):\n```text\n000 xxxxxxxx jmpz 001 xxxxxxxx jmp 010 xxxxxxxx call\n011 xxxxxxxx label 1000 xxxxxxx loadram 1001 xxxxxxx saveram\n110 xxxxxxxx loadi 11100 xxxxxx shl 11101 xxxxxx shr\n111100 not 111101 and 111110 or 111111 setif\n```\n\n**Step 4: Build assembler/disassembler.**\nWrite tools to assemble and disassemble the discovered ISA, then disassemble the challenge bytecode to understand its algorithm.\n\n**Step 5: Implement missing primitives.**\nIf the ISA lacks expected operations, synthesize them from available instructions. Example: implementing XTEA decryption using only AND/OR/NOT (no native XOR or ADD):\n```python\n# XOR from AND/OR/NOT: XOR(a, b) = (a OR b) AND NOT(a AND b)\n# ADD via full-adder chains using AND/OR/NOT for carry propagation\ndef xor_from_primitives(a, b):\n return (a | b) & ~(a & b)\n\ndef add_from_primitives(a, b, bits=32):\n carry = 0\n result = 0\n for i in range(bits):\n ai = (a >> i) & 1\n bi = (b >> i) & 1\n sum_bit = xor_from_primitives(xor_from_primitives(ai, bi), carry)\n carry = (ai & bi) | (carry & xor_from_primitives(ai, bi))\n result |= (sum_bit \u003c\u003c i)\n return result\n```\n\n**Key insight:** When static analysis of a VM's dispatch loop is too complex, black-box fuzzing can map the ISA faster. Send single instructions and observe state changes. Variable-length instruction sets require testing multiple bit widths. Once the ISA is known, complex algorithms (XTEA) can be implemented even with minimal primitives (AND/OR/NOT).\n\n**References:** hxp CTF 2017\n\n---\n\n## Anti-Debugging Techniques\n\n### Common Checks\n- `IsDebuggerPresent()` (Windows)\n- `ptrace(PTRACE_TRACEME)` (Linux)\n- `/proc/self/status` TracerPid\n- Timing checks (`rdtsc`, `time()`)\n- Registry checks (Windows)\n\n### Bypass Technique\n1. 
Identify `test` instructions after debug checks\n2. Set breakpoint at the `test`\n3. Modify register to bypass conditional\n\n```bash\n# In radare2\ndb 0x401234 # Break at test\ndc # Run\ndr eax=0 # Clear flag\ndc # Continue\n```\n\n### LD_PRELOAD Hook\n```c\n#define _GNU_SOURCE\n#include \u003cdlfcn.h>\n#include \u003cstdarg.h>\n#include \u003csys/ptrace.h>\n#include \u003csys/types.h>\n\nlong int ptrace(enum __ptrace_request req, ...) {\n long int (*orig)(enum __ptrace_request, pid_t, void*, void*);\n va_list ap;\n // Pull the variadic pid/addr/data arguments off the call\n va_start(ap, req);\n pid_t pid = va_arg(ap, pid_t);\n void *addr = va_arg(ap, void*);\n void *data = va_arg(ap, void*);\n va_end(ap);\n orig = dlsym(RTLD_NEXT, \"ptrace\");\n // Log or modify behavior\n return orig(req, pid, addr, data);\n}\n```\n\nCompile: `gcc -shared -fPIC hook.c -o hook.so -ldl`\nRun: `LD_PRELOAD=./hook.so ./binary`\n\n**Key insight:** Anti-debugging checks are the first obstacle in most reversing challenges. Look for `ptrace`, `IsDebuggerPresent`, or timing checks early in `main()` and patch or hook them before attempting deeper analysis.\n\n### pwntools Binary Patching (Crypto-Cat)\nPatch out anti-debug calls directly using pwntools by replacing the function body with a `ret` instruction:\n```python\nfrom pwn import *\n\nelf = ELF('./challenge', checksec=False)\nelf.asm(elf.symbols.ptrace, 'ret') # Replace ptrace() with immediate return\nelf.save('patched') # Save patched binary\n```\n\nOther common patches:\n```python\nelf.asm(addr, 'nop') # NOP out an instruction\nelf.asm(addr, 'xor eax, eax; ret') # Return 0 (bypass checks)\nelf.asm(addr, 'mov eax, 1; ret') # Return 1 (force success)\n```\n\n---\n\n## Nanomites\n\n### Linux (Signal-Based)\n- `SIGTRAP` (`int 3`) → Custom operation\n- `SIGILL` (`ud2`) → Custom operation\n- `SIGFPE` (`idiv 0`) → Custom operation\n- `SIGSEGV` (null deref) → Custom operation\n\n### Windows (Debug Events)\n- `EXCEPTION_DEBUG_EVENT` → Main handler\n- Parent modifies child via `PTRACE_POKETEXT`\n- Magic markers: `0x1337BABE`, `0xDEADC0DE`\n\n### Analysis\n1. Check for `fork()` + `ptrace(PTRACE_TRACEME)`\n2. Find `WaitForDebugEvent` loop\n3. Map EAX values to operations\n4. 
Log operations to reconstruct algorithm\n\n**Key insight:** Nanomites hide the real computation inside signal/exception handlers that only fire under a debugger parent. If the binary forks and the child calls `ptrace(TRACEME)`, the parent is the real CPU -- log its POKE operations to reconstruct the algorithm.\n\n---\n\n## Self-Modifying Code\n\n### Pattern: XOR Decryption\n```asm\nlea rax, next_block\nmov dl, [rcx] ; Input char\nxor_loop:\n xor [rax+rbx], dl\n inc rbx\n cmp rbx, BLOCK_SIZE\n jnz xor_loop\njmp rax ; Execute decrypted\n```\n\n**Solution:** Known opcode at block start reveals XOR key (flag char).\n\n**Key insight:** Self-modifying code decrypts the next block using each input character as a key. A known-good opcode at the start of each decrypted block (e.g., function prologue) reveals the correct key byte, recovering the flag one character at a time.\n\n---\n\n## Known-Plaintext XOR (Flag Prefix)\n\n**Pattern:** Encrypted bytes given; flag format known (e.g., `0xL4ugh{`).\n\n**Approach:**\n1. Assume repeating XOR key.\n2. Use known prefix (and any hint phrase) to recover key bytes.\n3. 
Try small key lengths and validate printable output.\n\n```python\nenc = bytes.fromhex(\"...\") # ciphertext\nknown = b\"0xL4ugh{say_yes_to_me\"\nfor klen in range(2, 33):\n key = bytearray(klen)\n ok = True\n for i, b in enumerate(known):\n if i >= len(enc):\n break\n ki = i % klen\n v = enc[i] ^ b\n if key[ki] != 0 and key[ki] != v:\n ok = False\n break\n key[ki] = v\n if not ok:\n continue\n pt = bytes(enc[i] ^ key[i % klen] for i in range(len(enc)))\n if all(32 \u003c= c \u003c 127 for c in pt):\n print(klen, key, pt)\n```\n\n**Note:** Challenge hints often appear verbatim in the flag body (e.g., \"say_yes_to_me\").\n\n### Variant: XOR with Position Index\n**Pattern:** `cipher[i] = plain[i] ^ key[i % k] ^ i` (or `^ (i & 0xff)`).\n\n**Symptoms:**\n- Repeating-key XOR almost fits known prefix but breaks at later positions\n- XOR with known prefix yields a \"key\" that changes by +1 per index\n\n**Fix:** Remove index first, then recover key with known prefix.\n```python\nenc = bytes.fromhex(\"...\")\nknown = b\"0xL4ugh{say_yes_to_me\"\nfor klen in range(2, 33):\n key = bytearray(klen)\n ok = True\n for i, b in enumerate(known):\n if i >= len(enc):\n break\n ki = i % klen\n v = (enc[i] ^ (i & 0xff)) ^ b # strip index XOR (masked to one byte)\n if key[ki] != 0 and key[ki] != v:\n ok = False\n break\n key[ki] = v\n if not ok:\n continue\n pt = bytes((enc[i] ^ (i & 0xff)) ^ key[i % klen] for i in range(len(enc)))\n if all(32 \u003c= c \u003c 127 for c in pt):\n print(klen, key, pt)\n```\n\n---\n\n## Mixed-Mode (x86-64 / x86) Stagers\n\n**Pattern:** 64-bit ELF jumps into a 32-bit blob via far return (`retf`/`retfq`), often after anti-debug.\n\n**Identification:**\n- Bytes `0xCB` (retf) or `0xCA` (retf imm16), sometimes preceded by `0x48` (retfq)\n- 32-bit disasm shows SSE ops (`psubb`, `pxor`, `paddb`) in a tight loop\n- Computed jumps into the 32-bit region\n\n**Gotchas:**\n- `retf` (32-bit operand size) pops **8 bytes**: 4-byte EIP + a 4-byte slot whose low 16 bits are CS; `retfq` pops 16 bytes\n- 32-bit blob may rely on inherited **XMM state** and **EFLAGS**\n- Missing 
XMM/flags transfer when switching emulators yields wrong output\n\n**Bypass/Emulation Tips:**\n1. Create a UC_MODE_32 emulator, copy memory + GPRs, **EFLAGS**, and **XMM regs**\n2. Run 32-bit block, then copy memory + regs back to 64-bit\n3. If anti-debug uses `fork/ptrace` + patching, emulate parent to log POKEs and apply them in child\n\n---\n\n## LLVM (Low Level Virtual Machine) Obfuscation (Control Flow Flattening)\n\n### Pattern\n```c\nwhile (1) {\n if (i == 0xA57D3848) { /* block */ }\n if (i != 0xA5AA2438) break;\n i = 0x39ABA8E6; // Next state\n}\n```\n\n### De-obfuscation\n1. GDB script to break at `je` instructions\n2. Log state variable values\n3. Map state transitions\n4. Reconstruct true control flow\n\n**Key insight:** Control flow flattening replaces structured if/else/loops with a single dispatcher switch. The state variable is the key -- trace its values at runtime to reconstruct the original control flow graph without fighting the obfuscation statically.\n\n---\n\n## S-Box / Keystream Generation\n\n### Fisher-Yates Shuffle (Xorshift32)\n```python\ndef gen_sbox():\n sbox = list(range(256))\n state = SEED\n for i in range(255, -1, -1):\n state = ((state \u003c\u003c 13) ^ state) & 0xffffffff\n state = ((state >> 17) ^ state) & 0xffffffff\n state = ((state \u003c\u003c 5) ^ state) & 0xffffffff\n j = state % (i + 1) if i > 0 else 0\n sbox[i], sbox[j] = sbox[j], sbox[i]\n return sbox\n```\n\n### Xorshift64* Keystream\n```python\ndef gen_keystream():\n ks = []\n state = SEED_64\n mul = 0x2545f4914f6cdd1d\n for _ in range(256):\n state ^= (state >> 12)\n # Mask the left shift: Python ints are unbounded, so bits above 64\n # would otherwise leak back in through the next right shift\n state ^= (state \u003c\u003c 25) & 0xffffffffffffffff\n state ^= (state >> 27)\n state = (state * mul) & 0xffffffffffffffff\n ks.append((state >> 56) & 0xff)\n return ks\n```\n\n### Identifying Patterns\n- Xorshift32: shifts 13, 17, 5 (no multiplication constant)\n- Xorshift64*: shifts 12, 25, 27, then multiply by `0x2545f4914f6cdd1d`\n- Other common constant: `0x9e3779b97f4a7c15` (golden ratio)\n\n**Key insight:** Recognize 
S-box generation by the Fisher-Yates shuffle pattern (loop counting down from 255, swap with PRNG-chosen index) and keystream generators by the xorshift constants. Once the PRNG family is identified, the algorithm is fully determined by its seed.\n\n---\n\n## SECCOMP/BPF Filter Analysis\n\n```bash\nseccomp-tools dump ./binary\n```\n\n### BPF Analysis\n- `A = sys_number` followed by comparisons\n- `mem[N] = A`, `A = mem[N]` for memory ops\n- Map to constraint equations, solve with z3\n\n```python\nfrom z3 import *\nflag = [BitVec(f'c{i}', 32) for i in range(14)]\ns = Solver()\ns.add(flag[0] >= 0x20, flag[0] \u003c 0x7f)\n# Add constraints from filter\nif s.check() == sat:\n m = s.model()\n print(''.join(chr(m[c].as_long()) for c in flag))\n```\n\n**Key insight:** SECCOMP (Secure Computing Mode) filters encode flag validation as BPF bytecode operating on syscall arguments. Dump the filter with `seccomp-tools`, translate the comparisons and memory operations into z3 constraints, and solve for the flag without ever running the binary.\n\n---\n\n## Exception Handler Obfuscation\n\n### RtlInstallFunctionTableCallback\n- Dynamic exception handler registration\n- Handler installs new handler, modifies code\n- Use x64dbg with exception handler breaks\n\n### Vectored Exception Handlers (VEH)\n- `AddVectoredExceptionHandler` installs handler\n- Handler decrypts code at exception address\n- Step through, dump decrypted code\n\n**Key insight:** Exception-handler-based obfuscation hides the real control flow inside SEH/VEH handlers that trigger on deliberate faults. 
Set breakpoints inside the exception handlers rather than on the faulting instructions to follow the actual execution path.\n\n---\n\n## Memory Dump Analysis\n\n### When Binary Dumps Memory\n- Check for `/proc/self/maps` reads\n- Check for `/proc/self/mem` reads\n- Heap data often appended to dump\n\n### Known Plaintext Attack\n```python\nprologue = bytes([0xf3, 0x0f, 0x1e, 0xfa, 0x55, 0x48, 0x89, 0xe5])\nencrypted = data[func_offset:func_offset+8]\npartial_key = bytes(a ^ b for a, b in zip(encrypted, prologue))\n```\n\n**Key insight:** When a binary reads `/proc/self/mem` or `/proc/self/maps`, it is dumping its own memory -- possibly after encrypting it. Use known function prologues (`endbr64; push rbp; mov rbp, rsp`) as known plaintext to recover the XOR key from the encrypted dump.\n\n---\n\n## Byte-Wise Uniform Transforms\n\n**Pattern:** Output buffer depends on each input byte independently (no cross-byte coupling).\n\n**Detection:**\n- Change one input position → only one output position changes\n- Fill input with a single byte → output buffer becomes constant\n\n**Solve:**\n1. For each byte value 0..255, run the program with that byte repeated\n2. Record output byte → build mapping and inverse mapping\n3. Apply inverse mapping to static target bytes to recover the flag\n\n---\n\n## x86-64 Gotchas\n\n### Sign Extension\n```python\nesi = 0xffffffc7 # NOT -57\n\n# For XOR: low byte only\nesi_xor = esi & 0xff # 0xc7\n\n# For addition: full 32-bit with overflow\nr12 = (r13 + esi) & 0xffffffff\n```\n\n### Loop Boundary State Updates\nAssembly often splits state updates across loop boundaries:\n```asm\n jmp loop_middle ; First iteration in middle!\n\nloop_top: ; State for iterations 2+\n mov r13, sbox[a & 0xf]\n ; Uses OLD 'a', not new!\n\nloop_middle:\n ; Main computation\n inc a\n jne loop_top\n```\n\n**Key insight:** Decompilers often get x86-64 sign extension and loop boundary state updates wrong. 
Always verify decompiled output against the raw assembly for operations involving `movsx`/`cdqe`, and check whether loop variables update before or after their use in each iteration.\n\n---\n\n## Custom Mangle Function Reversing\n\n**Pattern (Flag Appraisal):** Binary mangles input 2 bytes at a time with intermediate state, compares to static target.\n\n**Approach:**\n1. Extract static target bytes from `.rodata` section\n2. Understand mangle: processes pairs with running state value\n3. Write inverse function (process in reverse, undo each operation)\n4. Feed target bytes through inverse → recovers flag\n\n**Key insight:** When a binary mangles input in pairs with running state and compares to a static target, extract the target from `.rodata` and write the inverse function. Process the target bytes in reverse order, undoing each operation, to recover the original input.\n\n---\n\n## Position-Based Transformation Reversing\n\n**Pattern (PascalCTF 2026):** Binary transforms input by adding/subtracting position index.\n\n**Reversing:**\n```python\nexpected = [...] # Extract from .rodata\nflag = ''\nfor i, b in enumerate(expected):\n if i % 2 == 0:\n flag += chr(b - i) # Even: input = output - i\n else:\n flag += chr(b + i) # Odd: input = output + i\n```\n\n---\n\n## Hex-Encoded String Comparison\n\n**Pattern (Spider's Curse):** Input converted to hex, compared against hex constant.\n\n**Quick solve:** Extract hex constant from strings/Ghidra, decode:\n```bash\necho \"4d65746143...\" | xxd -r -p\n```\n\n---\n\n## Signal-Based Binary Exploration\n\n**Pattern (Signal Signal Little Star):** Binary uses UNIX signals as a binary tree navigation mechanism.\n\n**Identification:**\n- Multiple `sigaction()` calls with `SA_SIGINFO`\n- `sigaltstack()` setup (alternate signal stack)\n- Handler decodes embedded payload, installs next pair of signals\n- Two types: Node (installs children) vs Leaf (prints message + exits)\n\n**Solving approach:**\n1. 
Hook `sigaction` via `LD_PRELOAD` to log signal installations\n2. DFS through the binary tree by sending signals\n3. At each stage, observe which 2 signals are installed\n4. Send one, check if program exits (leaf) or installs 2 more (node)\n5. If wrong leaf, backtrack and try sibling\n\n```c\n// LD_PRELOAD interposer to log sigaction calls\n#define _GNU_SOURCE\n#include \u003cdlfcn.h>\n#include \u003csignal.h>\n#include \u003cstdio.h>\nint sigaction(int signum, const struct sigaction *act, struct sigaction *oldact) {\n int (*real_sigaction)(int, const struct sigaction *, struct sigaction *) =\n dlsym(RTLD_NEXT, \"sigaction\");\n if (act && (act->sa_flags & SA_SIGINFO))\n fprintf(stderr, \"SET %d SA_SIGINFO=1\\n\", signum);\n return real_sigaction(signum, act, oldact);\n}\n```\n\nSee [patterns-runtime.md](patterns-runtime.md) for malware patching, multi-stage shellcode, timing/signal oracles, and CTF writeup techniques.\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":20387,"content_sha256":"d6c3ce394122c47a77e95b3147e1c559d95030516c46cf36bf2e6b384e249314"},{"filename":"platforms-hardware.md","content":"# CTF Reverse - Hardware and Advanced Architecture Reversing\n\nHD44780 LCD GPIO reconstruction, RISC-V advanced extensions and debugging, ARM64/AArch64 reversing and exploitation.\n\n## Table of Contents\n- [HD44780 LCD Controller GPIO Reconstruction (32C3 2015)](#hd44780-lcd-controller-gpio-reconstruction-32c3-2015)\n- [RISC-V (Advanced)](#risc-v-advanced)\n - [Custom Extensions](#custom-extensions)\n - [Privileged Modes](#privileged-modes)\n - [RISC-V Debugging](#risc-v-debugging)\n- [ARM64/AArch64 Reversing and Exploitation](#arm64aarch64-reversing-and-exploitation)\n- [MIPS64 Cavium OCTEON Coprocessor 2 Crypto (SEC-T CTF 2017)](#mips64-cavium-octeon-coprocessor-2-crypto-sec-t-ctf-2017)\n- [EFM32 ARM Microcontroller MMIO AES (SEC-T CTF 2017)](#efm32-arm-microcontroller-mmio-aes-sec-t-ctf-2017)\n- [MBR/Bootloader Reversing with QEMU + GDB (Square CTF 2017)](#mbrbootloader-reversing-with-qemu--gdb-square-ctf-2017)\n- [Game Boy ROM Z80 Analysis in bgb Debugger (Square CTF 2017)](#game-boy-rom-z80-analysis-in-bgb-debugger-square-ctf-2017)\n- [KVM Guest Analysis via ioctl + 
KVM_EXIT_HLT Block Chaining (CSAW 2018)](#kvm-guest-analysis-via-ioctl--kvm_exit_hlt-block-chaining-csaw-2018)\n- [Coreboot ROM XOR-Pair Bit-Flip Address Discovery (Hack.lu 2018)](#coreboot-rom-xor-pair-bit-flip-address-discovery-hacklu-2018)\n\n---\n\n## HD44780 LCD Controller GPIO Reconstruction (32C3 2015)\n\nRecover text displayed on an HD44780 LCD from raw Raspberry Pi GPIO recordings:\n\n1. **Identify signal lines:** Map GPIO pins to HD44780 signals (RS, CLK, D4-D7 for 4-bit mode)\n2. **Clock edge detection:** Sample data lines on falling clock edges (1->0 transition)\n3. **Nibble assembly:** Combine two 4-bit samples into one 8-bit command/data byte\n4. **DRAM address mapping:** HD44780 uses non-contiguous addressing for multi-line displays:\n - Line 0: 0x00-0x27\n - Line 1: 0x40-0x67\n - Line 2: 0x14-0x3B\n - Line 3: 0x54-0x7B\n\n```python\ndisplay = [' '] * 80 # 4 lines x 20 chars\ncursor = 0\n\nfor timestamp, gpio_state in sorted(gpio_log):\n if falling_edge(gpio_state, CLK_PIN):\n nibble = extract_data_bits(gpio_state)\n byte = assemble_nibble(nibble) # Two nibbles per byte\n if rs_high(gpio_state): # RS=1: data write\n display[dram_to_position(cursor)] = chr(byte)\n cursor += 1\n else: # RS=0: command (set cursor, clear, etc.)\n cursor = parse_command(byte)\n```\n\n**Key insight:** GPIO pin-to-signal mapping is rarely documented; identify CLK by finding the pin with most transitions, RS by correlation with data patterns (alternating command/data phases).\n\n---\n\n## RISC-V (Advanced)\n\nBeyond basic disassembly (see [tools.md](tools.md#risc-v-binary-analysis-ehax-2026)):\n\n### Custom Extensions\n\n```text\nBitmanip extensions (Zbb, Zbc, Zbs):\n clz, ctz, cpop -> count leading/trailing zeros, popcount\n orc.b, rev8 -> byte-level bit manipulation\n andn, orn, xnor -> negated logic operations\n clmul, clmulh, clmulr -> carry-less multiplication (crypto)\n bset, bclr, binv, bext -> single-bit operations\n\nCrypto extensions (Zk*):\n aes32esi, aes32dsmi -> 
AES round operations\n sha256sig0, sha512sum0 -> SHA hash acceleration\n sm3p0, sm4ed -> Chinese crypto standards\n```\n\n### Privileged Modes\n\n```text\nMachine mode (M): Highest privilege, firmware/bootloader\nSupervisor mode (S): OS kernel\nUser mode (U): Applications\n\nCSR registers to watch:\n mstatus/sstatus -> privilege level, interrupt enable\n mtvec/stvec -> trap handler address\n mepc/sepc -> exception return address\n mcause/scause -> trap cause\n satp -> page table root (virtual memory)\n```\n\n### RISC-V Debugging\n\n```bash\n# OpenOCD + GDB for hardware debugging\nopenocd -f interface/jlink.cfg -f target/riscv.cfg\n\n# GDB for RISC-V\nriscv64-unknown-elf-gdb binary\n(gdb) target remote :3333\n\n# QEMU with GDB server\nqemu-riscv64 -g 1234 -L /usr/riscv64-linux-gnu/ ./binary\nriscv64-linux-gnu-gdb -ex 'target remote :1234' ./binary\n```\n\n---\n\n## ARM64/AArch64 Reversing and Exploitation\n\nAArch64 (ARM 64-bit) appears in mobile apps, cloud servers (AWS Graviton), Apple Silicon, and CTF challenges. 
Key differences from x86-64 affect both reversing and exploitation.\n\n**Setup and emulation:**\n\n```bash\n# Install cross-toolchain and emulator\napt install gcc-aarch64-linux-gnu gdb-multiarch qemu-user-static\n\n# Run AArch64 binary on x86 host\nqemu-aarch64-static -L /usr/aarch64-linux-gnu/ ./arm64_binary\n\n# Debug with GDB (the -g port must match the port GDB connects to)\nqemu-aarch64-static -g 1234 -L /usr/aarch64-linux-gnu/ ./arm64_binary &\ngdb-multiarch -ex 'set arch aarch64' -ex 'target remote :1234' ./arm64_binary\n\n# With library preloading (for challenges that ship libc)\nqemu-aarch64-static -g 1234 -E LD_PRELOAD=./libc.so.6 -L ./lib ./arm64_binary\n```\n\n**AArch64 calling convention (key differences from x86-64):**\n\n```text\nRegisters:\n x0-x7 -- function arguments AND return values (x0 = first arg / return)\n x8 -- indirect result location (struct returns)\n x9-x15 -- caller-saved temporaries\n x19-x28 -- callee-saved (preserved across calls)\n x29 (fp) -- frame pointer\n x30 (lr) -- link register (return address, NOT on stack by default)\n sp -- stack pointer (must be 16-byte aligned)\n xzr -- zero register (reads as 0, writes discarded)\n\nKey exploitation differences:\n - Return address in LR (x30), not on stack -- pushed only if function calls others\n - No RIP-relative addressing like x86 -- uses ADRP+ADD pairs for PC-relative loads\n - Fixed 4-byte instruction width -- no variable-length gadget tricks\n - NOP = 0xD503201F (not 0x90)\n - BLR x8 / BR x30 -- indirect calls/jumps use register operands\n```\n\n**Common AArch64 patterns in Ghidra/IDA:**\n\n```text\n# PC-relative address loading (equivalent to x86 LEA):\nADRP x0, #0x411000 ; Load page address (4KB aligned)\nADD x0, x0, #0x8 ; Add page offset -> x0 = 0x411008\n\n# Function prologue:\nSTP x29, x30, [sp, #-0x30]! 
; Push fp + lr, decrement sp\nMOV x29, sp ; Set frame pointer\n\n# Function epilogue:\nLDP x29, x30, [sp], #0x30 ; Pop fp + lr, increment sp\nRET ; Branch to x30 (lr)\n\n# Switch/jump table:\nADR x1, jump_table\nLDRB w2, [x1, x0] ; Load offset byte\nADD x1, x1, w2, SXTB ; Sign-extend and add\nBR x1 ; Indirect branch\n```\n\n**ROP on AArch64:**\n\n```python\nfrom pwn import *\n\n# AArch64 gadgets differ from x86:\n# - \"pop {x0}; ret\" equivalent: LDP x0, x1, [sp], #0x10; RET\n# - Prologue gadgets: LDP x29, x30, [sp, #0x20]; ... RET\n# - system() call: x0 = pointer to \"/bin/sh\", BLR to system\n\ncontext.arch = 'aarch64'\nelf = ELF('./arm64_binary')\n\n# Common gadget pattern in AArch64 libc:\n# LDP X19, X20, [SP,#var_s10]\n# LDP X29, X30, [SP+var_s0],#0x20\n# RET\n# Controls x19, x20, x29, x30 and advances sp by 0x20\n```\n\n**Key insight:** AArch64's fixed instruction width and register-based return address (`lr`/`x30`) make ROP gadgets more constrained than x86. Look for `LDP` (load pair) gadgets that pop multiple registers from the stack. The `STP`/`LDP` instruction pairs that save/restore callee-saved registers in function prologues/epilogues are the primary gadget source.\n\n**When to recognize:** `file` shows \"ELF 64-bit LSB ... ARM aarch64\". Ghidra auto-detects but may need manual processor selection for raw binaries. Use `qemu-aarch64-static` for emulation on x86 hosts.\n\n**Tools:** radare2 (`r2 -AA -a arm -b 64`), Ghidra (auto-detect), `aarch64-linux-gnu-objdump -d`, Unicorn Engine (`UC_ARCH_ARM64`)\n\n**References:** Google CTF 2016 \"Forced Puns\", Insomni'hack 2018 \"onecall\"\n\n---\n\n## MIPS64 Cavium OCTEON Coprocessor 2 Crypto (SEC-T CTF 2017)\n\nCavium OCTEON network processors implement hardware AES and SHA256 via MIPS Coprocessor 2 (CP2) using `dmtc2` (move to CP2) and `dmfc2` (move from CP2) instructions. 
These look like ordinary register moves to a disassembler but drive the hardware crypto engine.\n\n**Key CP2 register layout (OCTEON):**\n```text\nAES key registers:\n 0x0104 – AES key quadword 0\n 0x0105 – AES key quadword 1\n 0x0106 – AES key quadword 2\n 0x0107 – AES key quadword 3\n\nSHA256 hash registers:\n 0x400E–0x4012 – SHA256 intermediate hash words\n 0x404F – SHA256 control/result\n\ndmtc2 rN, 0x0104 ; load 64 bits of AES key into CP2 register 0x104\ndmtc2 rN, 0x0105 ; ...next quadword\n```\n\n**Approach:**\n1. Disassemble in IDA/Ghidra — `dmtc2`/`dmfc2` with selector in 0x100-0x40FF range indicates OCTEON CP2\n2. Cross-reference the Cavium OCTEON Hardware Reference Manual for register semantics\n3. Trace the key loading sequence to recover the AES or HMAC key material\n\n**Key insight:** Hardware crypto accelerators on MIPS appear as CP2 register writes (`dmtc2`/`dmfc2`). Identify the base register address and cross-reference vendor documentation.\n\n**References:** SEC-T CTF 2017\n\n---\n\n## EFM32 ARM Microcontroller MMIO AES (SEC-T CTF 2017)\n\nSilicon Labs EFM32 Cortex-M binary — a flat binary loaded at 0x1000 in Thumb mode.\n\n**IDA setup:**\n```text\nProcessor: ARM Little-endian (ARMv7-M)\nLoad address: 0x1000\nSet T register = 1 (force Thumb mode decoding)\n```\n\n**AES accelerator MMIO layout (EFM32 AES peripheral at 0x400E0000):**\n```text\n0x400E0000 + 0x000 CTRL – enable, decrypt mode\n0x400E0000 + 0x004 CMD – start/stop\n0x400E0000 + 0x010 KEYLA – key low word 0\n0x400E0000 + 0x014 KEYLB – key low word 1\n0x400E0000 + 0x018 KEYLC – key low word 2\n0x400E0000 + 0x01C KEYLD – key low word 3\n```\n\nThe binary loads two separate values, XORs them together, then writes the result as the AES key. 
Decrypt the embedded ciphertext block with the composed key in ECB mode.\n\n```python\nfrom Crypto.Cipher import AES\n\nkey_part_a = bytes.fromhex(\"...\") # extracted from IDA .data section\nkey_part_b = bytes.fromhex(\"...\") # second value\nciphertext = bytes.fromhex(\"...\") # embedded ciphertext block from the binary\nkey = bytes(a ^ b for a, b in zip(key_part_a, key_part_b))\n\ncipher = AES.new(key, AES.MODE_ECB)\nplaintext = cipher.decrypt(ciphertext)\n```\n\n**Key insight:** Hardware AES accelerators on microcontrollers appear as MMIO register writes at a specific base address. Cross-reference the vendor reference manual (the EFM32 Reference Manual for Silicon Labs peripherals) to name each register.\n\n**References:** SEC-T CTF 2017\n\n---\n\n## MBR/Bootloader Reversing with QEMU + GDB (Square CTF 2017)\n\nBoot a floppy/disk image in QEMU with the GDB stub enabled, then attach GDB for instruction-level debugging of 16-bit real-mode or 32-bit protected-mode bootloader code. There is no source, so work from the disassembly.\n\n```bash\n# Boot with GDB stub on port 1234; -S pauses execution at start\nqemu-system-x86_64 -fda disk.img -s -S\n\n# In another terminal, attach GDB\ngdb -ex \"set architecture i8086\" \\\n    -ex \"target remote :1234\" \\\n    -ex \"break *0x7c00\" \\\n    -ex \"continue\"\n\n# Common MBR entry point is 0x7c00 (BIOS loads MBR here)\n# Step through bootloader, inspect registers and memory:\n(gdb) x/20i $pc\n(gdb) info registers\n(gdb) x/16xb 0x7c00\n```\n\nTo bypass a password check: identify the conditional jump after the comparison and NOP it out in the image file, or patch the comparison to always succeed.\n\n```bash\n# Find the comparison offset in the image and patch it\npython3 -c \"\ndata = open('disk.img', 'rb').read()\n# offset = file offset of the 2-byte short JNZ (0x75 xx), taken from the disassembly; define it before running\n# Overwrite both bytes of the JNZ with NOPs so the failure branch is never taken\ndata = data[:offset] + b'\\x90\\x90' + data[offset+2:]\nopen('disk_patched.img', 'wb').write(data)\n\"\n```\n\n**Key insight:** QEMU's `-s` flag exposes a GDB stub on port 1234, so MBR/bootloader code can be debugged with nearly the same workflow as a userland binary.\n\n**References:** Square CTF 2017\n\n---\n\n## Game Boy ROM 
Z80 Analysis in bgb Debugger (Square CTF 2017)\n\nGame Boy ROMs use the Sharp SM83 (LR35902) CPU, a Z80/8080 hybrid. Load the ROM in the **bgb** emulator which provides GDB-like debugging: breakpoints, memory inspection, and register display.\n\n**Key instructions for flag comparisons:**\n```asm\nLD A, [HL] ; load byte from memory pointed to by HL into A\nAND [HL] ; A = A & *HL — compares player byte against memory value\nCP N ; compare A with immediate N (sets Z flag if equal)\n```\n\nWhen `and (hl)` or `cp (hl)` fires during input validation, the expected byte is visible at the `(hl)` address in the memory view.\n\n**bgb workflow:**\n1. Load ROM: File → Open ROM\n2. Right-click disassembly → \"Run to cursor\" or set breakpoint (F2)\n3. When comparison fires, inspect Registers panel (HL value) and Memory panel (`*HL`)\n4. Note expected value, advance to next comparison position\n\n**Key insight:** Game Boy ROMs are Z80/SM83 architecture. The bgb debugger provides GDB-like functionality; key comparisons use `(hl)`-indirect addressing so the expected value is directly visible in the memory view during the comparison.\n\n**References:** Square CTF 2017\n\n---\n\n## KVM Guest Analysis via ioctl + KVM_EXIT_HLT Block Chaining (CSAW 2018)\n\n**Pattern:** A userspace process hosts a KVM-based VM whose guest \"program\" is nothing but a sequence of code blocks that end in `HLT`. The host handler reads `KVM_EXIT_HLT`, inspects guest registers, and dispatches to the next block by looking up `rax` in a jump table at `0x2020A0`. Reverse the whole program by (1) running the binary under `strace -v` to capture `KVM_GET_REGS`/`KVM_SET_REGS` pairs, (2) dumping each code block from the KVM memory region, and (3) reconstructing the dispatch graph from the host's rax-indexed table.\n\n```bash\n# 1. Observe KVM ioctls + register snapshots\nstrace -v -e ioctl ./challenge 2>&1 | grep -E \"KVM_RUN|KVM_(GET|SET)_REGS\"\n\n# 2. 
Dump guest code memory (offset + size from KVM_SET_USER_MEMORY_REGION ioctl)\ngdb -batch -ex \"attach $(pgrep challenge)\" \\\n    -ex \"dump binary memory guest.bin 0x400000 0x410000\" \\\n    -ex \"detach\"\n\n# 3. Disassemble each HLT-terminated block\nobjdump -D -b binary -m i386:x86-64 guest.bin | less\n```\n\n```python\n# Rebuild the dispatch graph\nimport struct\nwith open(\"challenge\", \"rb\") as f:\n    data = f.read()\n# Host table at virtual address 0x2020A0 maps rax → next block offset.\n# unpack_from takes a FILE offset: translate the virtual address first\n# (compare the Address and Offset columns of the containing section in `readelf -S`).\nTABLE_FILE_OFFSET = 0x2020A0 # replace with the translated file offset if it differs\ntable = struct.unpack_from(\"\u003c128Q\", data, TABLE_FILE_OFFSET)\nfor rax, ptr in enumerate(table):\n    if ptr:\n        print(f\"rax={rax:02x} → block {ptr:#x}\")\n```\n\n**Key insight:** KVM-backed challenges hide control flow by moving it into the host process, not the guest. The guest code itself is just a pile of opaque blocks. `strace` on the KVM ioctls replaces a debugger: every `KVM_EXIT_HLT` is a \"basic block boundary\" and the host's response is the real transition function. Any time you see a CTF binary opening `/dev/kvm` or issuing `KVM_*` ioctls, drop your normal reverse-engineering pipeline and start from the ioctl trace.\n\n**References:** CSAW CTF Qualification Round 2018 - kvm, writeup 11206\n\n---\n\n## Coreboot ROM XOR-Pair Bit-Flip Address Discovery (Hack.lu 2018)\n\n**Pattern:** A firmware image loads its flag location by XOR-ing two constants from ROM at boot. The challenge service lets the attacker flip a single bit anywhere in the ROM and observe the new flag address. 
Compute the intended address `X = C1 ^ C2` and compare it against the address where the flag actually lives. The bit that differs (exactly one bit in a well-designed challenge) tells you which ROM byte + bit pair controls the redirect.\n\n```python\n# Two constants in ROM\nC1 = 0xEF56BF92\nC2 = 0xEF5A3F92\nintended = C1 ^ C2 # 0xC8000 per the source\nactual = 0xC0000 # where the flag really lives in memory\n\ndiff = intended ^ actual # 0x08000 → bit 15\n# Find the ROM offset that, when a single bit is flipped, produces `actual`.\n# The flip must land in either C1 or C2 so the XOR result has bit 15 cleared.\n```\n\n**Key insight:** XOR is linear over bits: flipping any single bit in either operand flips the same bit in the result. List the bits that differ between the computed and observed addresses and you get a bounded set of candidate patch sites, usually one or two. Pairs of address-like constants in firmware are suspicious: they frequently compose via XOR, ADD, or SUB, and every such arithmetic relation is a candidate for a targeted single-bit flip attack. 
Also applies to rowhammer: the sensitive bit is the *differential* between two constants, not the constants themselves.\n\n**References:** Hack.lu CTF 2018 — 1-bit-missile, writeups 11862, 11865\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":16239,"content_sha256":"f2548b021226489270e669ad99f0a4d10fbd0df59d7a3d2fe85de01a42366e36"},{"filename":"platforms.md","content":"# CTF Reverse - Platform-Specific Reversing\n\nmacOS/iOS, embedded/IoT firmware, kernel driver, automotive, and game engine reverse engineering.\n\n## Table of Contents\n- [macOS / iOS Reversing](#macos--ios-reversing)\n - [Mach-O Binary Format](#mach-o-binary-format)\n - [Code Signing & Entitlements](#code-signing--entitlements)\n - [Objective-C Runtime RE](#objective-c-runtime-re)\n - [Swift Binary Reversing](#swift-binary-reversing)\n - [iOS App Analysis](#ios-app-analysis)\n - [dyld / Dynamic Linking](#dyld--dynamic-linking)\n- [Embedded / IoT Firmware RE](#embedded--iot-firmware-re)\n - [Firmware Extraction](#firmware-extraction)\n - [Firmware Unpacking](#firmware-unpacking)\n - [Architecture-Specific Notes](#architecture-specific-notes)\n - [RTOS Analysis](#rtos-analysis)\n- [Kernel Driver Reversing](#kernel-driver-reversing)\n - [Linux Kernel Modules](#linux-kernel-modules)\n - [eBPF Programs](#ebpf-programs)\n - [Windows Kernel Drivers](#windows-kernel-drivers)\n- [Game Engine Reversing](#game-engine-reversing)\n - [Unreal Engine](#unreal-engine)\n - [Unity (Beyond IL2CPP)](#unity-beyond-il2cpp)\n - [Anti-Cheat Analysis](#anti-cheat-analysis)\n - [Lua-Scripted Games](#lua-scripted-games)\n- [Automotive / CAN Bus RE](#automotive--can-bus-re)\n- [RISC-V QEMU Execution with GLIBC Symbol Version Patching (Pwn2Win 2018)](#risc-v-qemu-execution-with-glibc-symbol-version-patching-pwn2win-2018)\n- [APK Certificate SHA-256 as AES Key (ASIS Finals 2018)](#apk-certificate-sha-256-as-aes-key-asis-finals-2018)\n- [Moxie ISA Custom Opcode Discovery (SECCON 
2018)](#moxie-isa-custom-opcode-discovery-seccon-2018)\n- [Unity APK Assembly-CSharp.dll Runtime Patch (SECCON 2018)](#unity-apk-assembly-csharpdll-runtime-patch-seccon-2018)\n- [Il2CppDumper for Unity IL2CPP Metadata Recovery (SECCON 2018)](#il2cppdumper-for-unity-il2cpp-metadata-recovery-seccon-2018)\n\n---\n\n## macOS / iOS Reversing\n\n### Mach-O Binary Format\n\n```bash\n# File identification\nfile binary # \"Mach-O 64-bit executable arm64\" or \"x86_64\"\notool -l binary # Load commands (segments, dylibs, entry point)\notool -L binary # Linked dynamic libraries\n\n# Universal (fat) binaries — multiple architectures in one file\nlipo -info universal_binary # List architectures\nlipo universal_binary -thin arm64 -output binary_arm64 # Extract one arch\n\n# Segments and sections\notool -l binary | grep -A5 \"segment\\|section\"\n# Key segments: __TEXT (code), __DATA (globals), __LINKEDIT (symbols)\n# Key sections: __text (instructions), __cstring (C strings), __objc_methname\n```\n\n**Key Mach-O concepts:**\n- Load commands drive the dynamic linker (`dyld`)\n- `LC_MAIN` → entry point (replaces `LC_UNIXTHREAD`)\n- `LC_LOAD_DYLIB` → shared library dependencies\n- `LC_CODE_SIGNATURE` → code signing blob\n- `__DATA_CONST.__got` → Global Offset Table\n- `__DATA.__la_symbol_ptr` → Lazy symbol pointers (like PLT)\n\n### Code Signing & Entitlements\n\n```bash\n# Check code signature\ncodesign -dvvv binary\ncodesign --verify binary\n\n# Extract entitlements (capability permissions)\ncodesign -d --entitlements - binary\n# Key entitlements: com.apple.security.app-sandbox, com.apple.security.network.client\n\n# Remove code signature (for patching)\ncodesign --remove-signature binary\n\n# Re-sign (ad-hoc, for testing)\ncodesign -f -s - binary\n```\n\n**CTF relevance:** Patched binaries need re-signing to run on macOS. 
Ad-hoc signing (`-s -`) works for local testing.\n\n### Objective-C Runtime RE\n\n```bash\n# Dump Objective-C class info\nclass-dump binary > classes.h\n# Shows: @interface, @protocol, method signatures with types\n\n# Runtime inspection with lldb\n(lldb) expression -l objc -O -- [NSClassFromString(@\"ClassName\") new]\n(lldb) expression -l objc -O -- [[ClassName alloc] init]\n\n# Method swizzling detection (anti-tamper)\n# Look for: method_exchangeImplementations, class_replaceMethod\n```\n\n**Objective-C in disassembly:**\n```text\n# objc_msgSend(receiver, selector, ...) is THE dispatch mechanism\n# x86_64: RDI = self (receiver), RSI = selector (char* method name)\n# arm64: X0 = self (receiver), X1 = selector\n\n# In Ghidra/IDA, look for:\nobjc_msgSend(obj, \"checkPassword:\", input)\n# Selector strings are in __objc_methname section\n# Cross-reference selectors to find implementations\n```\n\n**class-dump alternatives:**\n- `dsdump`: faster, supports Swift + Objective-C\n- `otool -oV binary`: dump Objective-C segments\n- Ghidra: Enable \"Objective-C\" analyzer in Analysis Options\n\n### Swift Binary Reversing\n\n```bash\n# Detect Swift\nstrings binary | grep \"swift\"\notool -l binary | grep \"swift\" # __swift5_* sections\n\n# Swift demangling\nswift demangle 's14MyApp0A8ClassC10checkInput6resultSbSS_tF'\n# → MyApp.MyAppClass.checkInput(result: String) -> Bool\n\n# xcrun swift-demangle \u003c mangled_names.txt\n```\n\n**Swift in disassembly:**\n```text\n# Swift uses value witness tables (VWT) for type operations\n# Protocol witness tables (PWT) for dynamic dispatch (like vtables)\n\n# Key runtime functions to watch:\nswift_allocObject → heap allocation\nswift_release → reference count decrement\nswift_bridgeObjectRetain → bridged (ObjC ↔ Swift) retain\nswift_once → lazy initialization (like dispatch_once)\n\n# String layout:\n# Small strings (≤15 bytes): inline in 16-byte buffer, tagged pointer\n# Large strings: heap-allocated, pointer + length + flags\n\n# Array\u003cT>: pointer to ContiguousArrayStorage 
(header + elements)\n# Dictionary\u003cK,V>: hash table with open addressing\n```\n\n**Ghidra for Swift:** Enable \"Swift\" language module. Swift metadata sections (`__swift5_types`, `__swift5_proto`) contain type descriptors that Ghidra can parse.\n\n### iOS App Analysis\n\n```bash\n# Extract IPA (iOS app package)\nunzip app.ipa -d extracted/\nls extracted/Payload/*.app/\n\n# Check if encrypted (App Store encryption / FairPlay DRM)\notool -l extracted/Payload/*.app/binary | grep -A4 \"LC_ENCRYPTION_INFO\"\n# cryptid = 1 means encrypted, 0 means decrypted\n\n# Decrypt with frida-ios-dump (requires jailbroken device)\n# Or use Clutch / bfdecrypt on device\nfrida-ios-dump -H jailbroken_ip -p 22 \"App Name\"\n\n# Analyze decrypted binary\nclass-dump decrypted_binary > headers.h\n```\n\n**Jailbreak detection and bypass:**\n```javascript\n// Common jailbreak checks:\n// 1. Check for Cydia/Sileo\n// 2. Check /private/var/lib/apt\n// 3. fork() succeeds (sandboxed apps can't fork)\n// 4. Open /etc/apt, /bin/sh with write\n// 5. 
Check for substrate/substitute libraries\n\n// Frida bypass:\nvar paths = [\"/Applications/Cydia.app\", \"/bin/sh\", \"/etc/apt\",\n \"/private/var/lib/apt\", \"/usr/bin/ssh\"];\nInterceptor.attach(Module.findExportByName(null, \"access\"), {\n onEnter(args) {\n this.path = Memory.readUtf8String(args[0]);\n },\n onLeave(retval) {\n if (paths.some(p => this.path && this.path.includes(p))) {\n retval.replace(-1); // File not found\n }\n }\n});\n```\n\n### dyld / Dynamic Linking\n\n```bash\n# DYLD environment variables (for analysis, blocked in hardened runtime)\nDYLD_PRINT_LIBRARIES=1 ./binary # Print loaded dylibs\nDYLD_INSERT_LIBRARIES=hook.dylib ./binary # Inject dylib (like LD_PRELOAD)\n# Note: SIP (System Integrity Protection) blocks this for system binaries\n\n# Inspect dyld shared cache (contains all system frameworks)\ndyld_shared_cache_util -list /System/Cryptexes/OS/System/Library/dyld/dyld_shared_cache_arm64e\n```\n\n---\n\n## Embedded / IoT Firmware RE\n\n### Firmware Extraction\n\n```bash\n# binwalk — firmware analysis and extraction\nbinwalk firmware.bin # Identify embedded filesystems, compressed data\nbinwalk -e firmware.bin # Extract all identified components\nbinwalk -Me firmware.bin # Recursive extraction (matryoshka)\nbinwalk --dd='.*' firmware.bin # Extract everything raw\n\n# Manual extraction by signature\nstrings firmware.bin | head -50 # Look for version strings, filesystem markers\nhexdump -C firmware.bin | grep \"hsqs\" # SquashFS magic\nhexdump -C firmware.bin | grep \"UBI#\" # UBI magic\n```\n\n**Hardware extraction methods (physical access):**\n```text\nUART: Serial console — often gives root shell or bootloader access\n Tools: USB-UART adapter, baudrate detection (usually 115200)\n Identify: 4 pins (GND, TX, RX, VCC), use multimeter\n\nJTAG: Direct CPU debug — read/write flash, halt CPU, set breakpoints\n Tools: OpenOCD, J-Link, Bus Pirate\n Identify: 10/14/20-pin header, use JTAGulator for auto-detection\n\nSPI Flash: Direct chip read 
— dump entire firmware\n Tools: flashrom, CH341A programmer\n Identify: 8-pin SOIC chip (Winbond, Macronix, etc.)\n\neMMC: Embedded MMC — common in routers, phones\n Tools: eMMC reader, direct solder to test pads\n```\n\n### Firmware Unpacking\n\n```bash\n# SquashFS (most common in routers)\nunsquashfs -d output/ squashfs-root.sqfs\n# If custom compression: try different compressors (-comp xz|lzma|lzo|gzip)\n\n# JFFS2\njefferson -d output/ jffs2.img\n\n# UBI/UBIFS\nubireader_extract_images firmware.ubi\nubireader_extract_files ubifs.img\n\n# CPIO (initramfs)\ncpio -idv \u003c initramfs.cpio\n\n# Device tree blob\ndtc -I dtb -O dts -o output.dts device_tree.dtb\n\n# Kernel extraction\nbinwalk -e firmware.bin\n# Look for: zImage, uImage, vmlinux\n# Extract vmlinux from compressed: vmlinux-to-elf tool\n```\n\n### Architecture-Specific Notes\n\n**ARM (most common in IoT):**\n```bash\n# Cross-toolchain\napt install gcc-arm-linux-gnueabihf gdb-multiarch\n\n# QEMU emulation\nqemu-arm -L /usr/arm-linux-gnueabihf/ ./arm_binary\nqemu-arm -g 1234 ./arm_binary # Start GDB server on port 1234\ngdb-multiarch -ex 'target remote :1234' ./arm_binary\n\n# ARM vs Thumb: ARM instructions are 4 bytes, Thumb are 2 bytes\n# LSB of function pointer indicates mode: 0=ARM, 1=Thumb\n# Ghidra: Right-click → Processor Options → ARM/Thumb mode\n```\n\n**ARM64/AArch64:** See [platforms-hardware.md](platforms-hardware.md#arm64aarch64-reversing-and-exploitation) for AArch64 calling convention, ROP gadgets, and qemu-aarch64-static emulation.\n\n**MIPS (routers, embedded):**\n```bash\n# Big-endian vs little-endian — check ELF header or file command\nfile binary # \"MIPS, MIPS32 rel2 (MIPS-II), big-endian\" or \"little-endian\"\n\n# Emulation\nqemu-mips -L /usr/mips-linux-gnu/ ./mips_binary # Big-endian\nqemu-mipsel -L /usr/mipsel-linux-gnu/ ./mipsel_binary # Little-endian\n\n# Key MIPS patterns:\n# Branch delay slots — instruction AFTER branch always executes\n# $gp (global pointer) — used for PIC, 
points to .got\n# lui + addiu pair — loads 32-bit constant (upper 16 + lower 16)\n```\n\n**RISC-V:** See main [tools.md](tools.md#risc-v-binary-analysis-ehax-2026) for Capstone disassembly and [platforms-hardware.md](platforms-hardware.md#risc-v-advanced) for advanced extensions and debugging.\n\n### RTOS Analysis\n\n```text\nFreeRTOS:\n - Tasks (like threads): xTaskCreate → function pointer + stack\n - Strings: \"IDLE\", \"Tmr Svc\", task names\n - xQueueSend/xQueueReceive → inter-task communication\n - Look for vTaskDelay() for timing, xSemaphoreTake() for sync\n\nZephyr:\n - k_thread_create → kernel thread creation\n - k_msgq_put/k_msgq_get → message queues\n - CONFIG_* symbols reveal kernel configuration\n\nBare metal (no OS):\n - Interrupt vector table at address 0x0 or 0x08000000 (STM32)\n - main loop pattern: while(1) { read_input(); process(); output(); }\n - Peripheral registers at memory-mapped addresses (check datasheet)\n```\n\n---\n\n## Kernel Driver Reversing\n\n### Linux Kernel Modules\n\n```bash\n# Identify kernel module\nfile module.ko # \"ELF 64-bit LSB relocatable\"\nmodinfo module.ko # Module info (description, author, license)\n\n# List module symbols\nnm module.ko | grep -v \" U \" # Exported symbols\n\n# Strings for quick recon\nstrings module.ko | grep -i \"flag\\|secret\\|ioctl\\|device\"\n\n# Find ioctl handler\n# Key pattern: .unlocked_ioctl = my_ioctl_handler in file_operations struct\n# In Ghidra: find struct with function pointers, identify by position\n\n# Load in Ghidra\n# Language: x86:LE:64:default\n# Base address: doesn't matter for .ko (relocatable)\n# Look for init_module / cleanup_module entry points\n```\n\n**Common kernel module CTF patterns:**\n```c\n// Device creation (creates /dev/challenge)\nalloc_chrdev_region(&dev, 0, 1, \"challenge\");\ncdev_init(&cdev, &fops);\n\n// ioctl handler (main interface)\nlong my_ioctl(struct file *f, unsigned int cmd, unsigned long arg) {\n switch (cmd) {\n case CUSTOM_CMD_1: /* operation */ 
break;\n case CUSTOM_CMD_2: /* operation */ break;\n }\n}\n\n// copy_from_user / copy_to_user — data transfer with userspace\ncopy_from_user(kernel_buf, (void __user *)arg, size);\ncopy_to_user((void __user *)arg, kernel_buf, size);\n```\n\n**Debugging kernel modules:**\n```bash\n# QEMU + GDB for kernel debugging\nqemu-system-x86_64 -kernel bzImage -initrd initrd.cpio -s -S \\\n -append \"console=ttyS0 nokaslr\" -nographic\n\n# In another terminal\ngdb vmlinux\n(gdb) target remote :1234\n(gdb) lx-symbols # Load module symbols (requires scripts)\n(gdb) add-symbol-file module.ko 0x\u003cloaded_address>\n```\n\n### eBPF Programs\n\n```bash\n# Dump eBPF programs from running system\nbpftool prog list\nbpftool prog dump xlated id \u003cN> # Disassemble\nbpftool prog dump jited id \u003cN> # JIT'd machine code\n\n# eBPF bytecode analysis\n# eBPF has 11 registers (r0-r10), 64-bit\n# r0 = return value, r1-r5 = arguments, r10 = frame pointer\n# Instructions are 8 bytes each\n\n# Disassemble .o file containing eBPF\nllvm-objdump -d ebpf_prog.o\n\n# Key eBPF patterns:\n# bpf_map_lookup_elem → read from map\n# bpf_map_update_elem → write to map\n# bpf_probe_read → read kernel memory\n# bpf_trace_printk → debug output\n```\n\n### Windows Kernel Drivers\n\n```bash\n# .sys files are PE format — load in IDA/Ghidra as normal PE\n# Entry point: DriverEntry(PDRIVER_OBJECT, PUNICODE_STRING)\n\n# Key patterns:\n# IoCreateDevice → creates device object\n# IRP_MJ_DEVICE_CONTROL → ioctl handler\n# MmMapIoSpace → memory-mapped I/O\n# ObReferenceObjectByHandle → get kernel object from handle\n# ZwCreateFile/ZwReadFile → kernel-mode file operations\n```\n\n---\n\n## Game Engine Reversing\n\n### Unreal Engine\n\n```bash\n# Pak file extraction\n# UnrealPakTool or quickbms with unreal_tournament_4.bms\nunrealpak.exe extract GameName.pak -output extracted/\n\n# UE4/UE5 asset formats:\n# .uasset — serialized UObject (meshes, textures, blueprints)\n# .umap — level/map data\n# .ushaderbytecode — 
compiled shader\n# FModel (https://fmodel.app/) — GUI asset viewer/extractor\n```\n\n**Blueprint reversing:**\n```text\nBlueprints compile to bytecode in .uasset files.\n- UAssetGUI / FModel to browse Blueprint assets\n- Kismet bytecode → visual scripting logic\n- Look for: K2_SetTimer, DoOnce, Branch, Custom Events\n- Flag logic often in Blueprint event graphs, not C++\n```\n\n**UE4/UE5 C++ reversing:**\n```bash\n# Key engine classes:\n# UObject → base class for everything\n# AActor → entities in the world\n# UGameInstance → game state\n# APlayerController → player input handling\n\n# Reflection system — UCLASS(), UPROPERTY(), UFUNCTION() macros\n# Generates metadata accessible at runtime\n# In Ghidra: look for UClass::StaticClass() calls → type identification\n\n# String handling: FString (UTF-16), FName (hashed identifier), FText (localized)\n# In memory: FString = {TCHAR* Data, int32 ArrayNum, int32 ArrayMax}\n```\n\n### Unity (Beyond IL2CPP)\n\nSee [languages.md](languages.md#unity-il2cpp-games) for IL2CPP basics.\n\n**Mono-based Unity (not IL2CPP):**\n```bash\n# Managed assemblies in Data/Managed/ directory\n# Assembly-CSharp.dll contains game logic\ndnspy Assembly-CSharp.dll # Full decompilation + debugging\nilspy Assembly-CSharp.dll # Decompilation only\n\n# Common Unity patterns:\n# MonoBehaviour.Start() → initialization\n# MonoBehaviour.Update() → per-frame logic\n# PlayerPrefs.GetString(\"key\") → stored data\n# SceneManager.LoadScene(\"level\") → scene transitions\n```\n\n**Unity asset extraction:**\n```bash\n# AssetStudio — extract textures, models, audio, scripts\n# AssetRipper — comprehensive Unity asset extraction\n# UABE (Unity Asset Bundle Extractor) — low-level asset editing\n\n# Search for flags in:\n# - Text assets (.txt, .json)\n# - TextMesh / UI Text components\n# - Shader source code\n# - ScriptableObject assets\n# - PlayerPrefs save files\n```\n\n### Anti-Cheat Analysis\n\n```text\nFor CTF challenges involving game 
anti-cheat:\n\nEasyAntiCheat (EAC):\n- Kernel driver (EasyAntiCheat_EOS.sys)\n- User-mode module injected into game\n- Integrity checks on game memory\n- Bypass: kernel-level memory R/W (for research only)\n\nBattlEye:\n- BEService.exe → BEClient.dll injected\n- Communication via encrypted channel\n- Screenshot capture, process scanning\n- Module: BEClient2.dll\n\nValve Anti-Cheat (VAC):\n- User-mode only (no kernel driver)\n- Module hashing, memory scanning\n- Network-based detection (server-side)\n- Delayed bans (not immediate)\n\nCTF approach:\n1. Identify which anti-cheat (strings, loaded modules)\n2. For CTF: usually need to bypass specific check, not full anti-cheat\n3. Memory patching: find game state in memory, modify values\n4. Save file manipulation: often easier than runtime cheating\n```\n\n### Lua-Scripted Games\n\n```bash\n# Many games embed Lua for scripting\n# Look for: lua51.dll, luajit.dll, .lua files in assets\n\n# Luac bytecode decompilation\nluadec bytecode.luac > decompiled.lua # Lua 5.1-5.3\nunluac bytecode.luac > decompiled.lua # Alternative\n\n# LuaJIT bytecode\nluajit -bl bytecode.lua # Disassemble\n# ljd (LuaJIT decompiler): python3 ljd bytecode.lua\n\n# Embedded Lua: strings binary | grep \"lua_\\|luaL_\\|LUA_\"\n# Hook lua_pcall to intercept script execution\n```\n\n---\n\n## Automotive / CAN Bus RE\n\n```bash\n# CAN bus interface setup\nsudo ip link set can0 type can bitrate 500000\nsudo ip link set up can0\n\n# Capture CAN traffic\ncandump can0 # Live capture\ncandump -l can0 # Log to file\ncansniffer can0 # Filter/highlight changes\n\n# Replay CAN messages\ncanplayer -I logfile.log can0\ncansend can0 7DF#0201000000000000 # Send single frame (OBD-II request)\n\n# UDS (Unified Diagnostic Services) — common in automotive CTF\n# Service 0x27: Security Access (seed-key authentication)\n# Service 0x2E: Write Data By Identifier\n# Service 0x31: Routine Control\n\n# Decode CAN frames\n# ID: 11-bit or 29-bit identifier\n# DLC: Data Length 
Code (0-8 bytes)\n# Data: up to 8 bytes payload\n```\n\n**CTF automotive patterns:**\n- Seed-key bypass: Reverse the key derivation algorithm from ECU firmware\n- CAN message replay: Capture legitimate command, replay to unlock feature\n- Firmware extraction from ECU via UDS/KWP2000\n\n---\n\n## RISC-V QEMU Execution with GLIBC Symbol Version Patching (Pwn2Win 2018)\n\n**Pattern:** Challenge binary targets a RISC-V Debian that isn't available natively. Extract the Debian libc6 and dynamic linker, then patch the binary's required GLIBC symbol version (e.g., `GLIBC_2.25` → `GLIBC_2.27`) in a hex editor so the available libc can serve it. Run with `qemu-riscv64 -L \u003csysroot>`.\n\n```bash\nar x libc6_2.27-5_riscv64.deb && tar xf data.tar.xz\nsed 's@GLIBC_2.25@GLIBC_2.27@g' -i binary\n# Patch the symbol version hash too\nobjdump -p binary # note old hash\n# replace bytes with xxd / hexedit\nqemu-riscv64 -L ./sysroot ./binary\n```\n\n**Key insight:** `ld.so` enforces symbol versioning through the ELF version-needs table (`.gnu.version_r`), where each required version stores its name together with a hash of that name. Patching both the string and its hash bypasses the check without rebuilding libc; `objdump -p` lists the hash next to each required version, which identifies the bytes to overwrite.\n\n**References:** Pwn2Win CTF 2018 - Too Slow, writeup 12501+\n\n---\n\n## APK Certificate SHA-256 as AES Key (ASIS Finals 2018)\n\n**Pattern:** Android app derives its AES key from `SHA-256(packageInfo.signatures[0].toByteArray())`, truncated to 16 bytes. Because the signing cert is embedded in the APK, the key is recoverable offline, with no reverse engineering of native code required.\n\n```python\nfrom hashlib import sha256\nimport base64, zipfile\n\n# CERT.RSA is a PKCS#7 container; signatures[0].toByteArray() returns only the DER X.509\n# certificate inside it. If the key fails, extract that certificate first and hash it instead.\ncert = zipfile.ZipFile('app.apk').read('META-INF/CERT.RSA')\n# Keeps the first 16 characters of the base64-encoded digest; if the app truncates\n# the raw digest instead, use sha256(cert).digest()[:16]\nkey = base64.b64encode(sha256(cert).digest())[:16]\n# Decrypt config resources with AES-ECB using this key\n```\n\n**Key insight:** \"Deterministic key from a public fingerprint\" is a recurring Android anti-pattern. 
Audit `getSignature`, `getPackageInfo`, and MessageDigest usages for this shape; the fix is to store a real secret server-side.\n\n**References:** ASIS CTF Finals 2018 — Gunshop, writeup 12420\n\n---\n\n## Moxie ISA Custom Opcode Discovery (SECCON 2018)\n\n**Pattern:** Binary is compiled for Moxie (obscure CPU architecture). `strings` on the ELF reveals a help banner declaring custom opcodes `SETRSEED (0x16)` and `GETRAND (0x17)`, with a non-standard xorshift32 implementation. Emulate those opcodes in Python to recover the PRNG stream, then XOR the ciphertext with the sequence.\n\n```python\ndef xorshift32(s):\n s ^= (s \u003c\u003c 13) & 0xffffffff\n s ^= (s >> 17)\n s ^= (s \u003c\u003c 15) & 0xffffffff # Note: *not* standard (\u003c\u003c 5)\n return s & 0xffffffff\n```\n\n**Key insight:** Obscure ISAs often lean on small custom additions to implement crypto inline. Grep the binary for human-readable opcode documentation; it's frequently left in the help text for challenge authors' own debugging.\n\n**References:** SECCON 2018 — Special Instructions, writeup 12001\n\n---\n\n## Unity APK Assembly-CSharp.dll Runtime Patch (SECCON 2018)\n\n**Pattern:** Unity game ships compiled C# game logic in `assets/bin/Data/Managed/Assembly-CSharp.dll`. Decompile with dnSpy/ILSpy, modify the `Update()` or `Start()` methods (e.g., remove rotation animations that hide the flag), recompile, repack the APK with `apktool`, resign with `jarsigner`, install on emulator.\n\n```bash\napktool d game.apk -o game_src\n# Replace game_src/assets/bin/Data/Managed/Assembly-CSharp.dll with patched version\napktool b game_src -o patched.apk\njarsigner -keystore debug.keystore -storepass android patched.apk androiddebugkey\nadb install -r patched.apk\n```\n\n**Key insight:** Unity games are \"just C#\" once you open the DLL. 
Any hidden flag rendered at the wrong angle or masked by an overlay is exposed by deleting the animation code.\n\n**References:** SECCON 2018 — block, writeup 12001\n\n---\n\n## Il2CppDumper for Unity IL2CPP Metadata Recovery (SECCON 2018)\n\n**Pattern:** Modern Unity games compile to IL2CPP (native code plus metadata) instead of shipping `Assembly-CSharp.dll`. Use `Il2CppDumper` against `libil2cpp.so` plus `assets/bin/Data/Managed/Metadata/global-metadata.dat` to recover pseudo-C#, then grep the output for endpoint strings, API paths, or crypto constants.\n\n```bash\nIl2CppDumper libil2cpp.so global-metadata.dat out/\ngrep -r \"https://\" out/ # find hardcoded endpoints\n```\n\n**Key insight:** IL2CPP looks opaque but the metadata file carries every type name, method name, and string literal. That is enough to reverse CTF-level logic without ever touching native disassembly.\n\n**References:** SECCON 2018 — shooter, writeup 12001\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":23005,"content_sha256":"e7592cabda16bfce9ab57c8f1f57bcc03ac1d68a725fe420643d8d3c54ca64f3"},{"filename":"tools-advanced-2.md","content":"# Advanced Reverse Engineering Tools (Part 2)\n\nAdvanced GDB scripting, Ghidra automation, patching frameworks, and CTF-specific GDB-driven techniques. 
Continuation of [tools-advanced.md](tools-advanced.md).\n\n## Table of Contents\n- [Advanced GDB Techniques](#advanced-gdb-techniques)\n - [Python Scripting](#python-scripting)\n - [Brute-Force with GDB Script](#brute-force-with-gdb-script)\n - [Conditional Breakpoints](#conditional-breakpoints)\n - [Watchpoints](#watchpoints)\n - [Reverse Debugging (rr)](#reverse-debugging-rr)\n - [GDB Dashboard / GEF / pwndbg](#gdb-dashboard--gef--pwndbg)\n- [Advanced Ghidra Scripting](#advanced-ghidra-scripting)\n- [Patching Strategies](#patching-strategies)\n - [Binary Ninja Patching (Python API)](#binary-ninja-patching-python-api)\n - [LIEF (Library for Instrumenting Executable Formats)](#lief-library-for-instrumenting-executable-formats)\n- [GDB Constraint Extraction with ILP/LP Solver (BackdoorCTF 2017)](#gdb-constraint-extraction-with-ilplp-solver-backdoorctf-2017)\n- [GDB Position-Encoded Input with Zero Flag Monitoring (EKOPARTY 2017)](#gdb-position-encoded-input-with-zero-flag-monitoring-ekoparty-2017)\n- [LD_PRELOAD to Dump Execute-Only Binary (BackdoorCTF 2017)](#ld_preload-to-dump-execute-only-binary-backdoorctf-2017)\n- [PEDA current_inst Bit-by-Bit Flag Scraper (CONFidence CTF 2019 Teaser)](#peda-current_inst-bit-by-bit-flag-scraper-confidence-ctf-2019-teaser)\n\n---\n\n## Advanced GDB Techniques\n\n### Python Scripting\n\n```python\n# ~/.gdbinit or source from GDB\nimport gdb\n\nclass TraceCompare(gdb.Breakpoint):\n \"\"\"Log all comparison operations.\"\"\"\n def __init__(self, addr):\n super().__init__(f\"*{addr}\", gdb.BP_BREAKPOINT)\n\n def stop(self):\n frame = gdb.selected_frame()\n rdi = int(frame.read_register(\"rdi\"))\n rsi = int(frame.read_register(\"rsi\"))\n rdx = int(frame.read_register(\"rdx\"))\n # Read compared buffers\n inferior = gdb.selected_inferior()\n buf1 = inferior.read_memory(rdi, rdx).tobytes()\n buf2 = inferior.read_memory(rsi, rdx).tobytes()\n print(f\"memcmp({buf1!r}, {buf2!r}, {rdx})\")\n return False # Don't stop, just log\n\n# Usage 
in GDB:\n# (gdb) source trace_cmp.py\n# (gdb) python TraceCompare(0x401234)\n```\n\n### Brute-Force with GDB Script\n\n```python\n# Byte-by-byte brute force via GDB Python API\nimport gdb, string\n\ndef bruteforce_flag(check_addr, success_addr, fail_addr, flag_len):\n flag = []\n for pos in range(flag_len):\n for ch in string.printable:\n candidate = ''.join(flag) + ch + 'A' * (flag_len - pos - 1)\n gdb.execute('start', to_string=True)\n gdb.execute(f'b *{check_addr}', to_string=True)\n # Write candidate to stdin pipe\n # ... (setup input)\n gdb.execute('continue', to_string=True)\n rip = int(gdb.parse_and_eval('$rip'))\n if rip == success_addr:\n flag.append(ch)\n break\n gdb.execute('delete breakpoints', to_string=True)\n return ''.join(flag)\n```\n\n### Conditional Breakpoints\n\n```bash\n# Break only when register has specific value\n(gdb) b *0x401234 if $rax == 0x41\n(gdb) b *0x401234 if *(char*)$rdi == 'f'\n\n# Break on Nth hit\n(gdb) b *0x401234\n(gdb) ignore 1 99 # Skip first 99 hits, break on 100th\n\n# Log without stopping\n(gdb) b *0x401234\n(gdb) commands\n> silent\n> printf \"rax=%lx rdi=%lx\\n\", $rax, $rdi\n> continue\n> end\n```\n\n### Watchpoints\n\n```bash\n# Hardware watchpoint — break when memory changes\n(gdb) watch *(int*)0x601050 # Break on write to address\n(gdb) rwatch *(int*)0x601050 # Break on read\n(gdb) awatch *(int*)0x601050 # Break on read or write\n\n# Watch a variable by name (needs debug symbols)\n(gdb) watch flag_buffer[0]\n\n# Conditional watchpoint\n(gdb) watch *(int*)0x601050 if *(int*)0x601050 == 0x42\n```\n\n### Reverse Debugging (rr)\n\n```bash\n# Record execution\nrr record ./binary\n# Replay with reverse execution support\nrr replay\n\n# In rr replay (GDB commands plus):\n(gdb) reverse-continue # Run backward to previous breakpoint\n(gdb) reverse-stepi # Step backward one instruction\n(gdb) reverse-next # Reverse next\n(gdb) when # Show current event number\n\n# Set checkpoint and return to it\n(gdb) checkpoint\n(gdb) 
restart 1 # Return to checkpoint 1\n```\n\n**Key use:** When you step past the critical moment, reverse back instead of restarting. Invaluable for anti-debug that corrupts state.\n\n### GDB Dashboard / GEF / pwndbg\n\n```bash\n# pwndbg (most popular for CTF)\n# https://github.com/pwndbg/pwndbg\ngit clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh\n\n# Key pwndbg commands:\npwndbg> context # Show registers, stack, code, backtrace\npwndbg> vmmap # Memory map (like /proc/self/maps)\npwndbg> search -s \"flag{\" # Search memory for string\npwndbg> telescope $rsp 20 # Smart stack dump\npwndbg> cyclic 200 # Generate De Bruijn pattern\npwndbg> hexdump $rdi 64 # Pretty hex dump\npwndbg> got # Show GOT entries\npwndbg> plt # Show PLT entries\n\n# GEF (alternative)\n# https://github.com/hugsy/gef\nbash -c \"$(curl -fsSL https://gef.blah.cat/sh)\"\n\n# Key GEF commands:\ngef> xinfo $rdi # Detailed info about address\ngef> checksec # Binary security features\ngef> heap chunks # Heap chunk listing\ngef> pattern create 100 # De Bruijn pattern\n```\n\n---\n\n## Advanced Ghidra Scripting\n\n```python\n# Ghidra Python (Jython) — run via Script Manager or headless\n\n# Batch rename functions matching a pattern\nfrom ghidra.program.model.symbol import SourceType\nfm = currentProgram.getFunctionManager()\nfor func in fm.getFunctions(True):\n if func.getName().startswith(\"FUN_\"):\n # Check if function contains specific instruction pattern\n body = func.getBody()\n inst_iter = currentProgram.getListing().getInstructions(body, True)\n for inst in inst_iter:\n if inst.getMnemonicString() == \"CPUID\":\n func.setName(\"anti_vm_check_\" + hex(func.getEntryPoint().getOffset()),\n SourceType.USER_DEFINED)\n break\n\n# Extract all XOR constants from a function\ndef extract_xor_constants(func):\n \"\"\"Find all XOR operations and their immediate operands.\"\"\"\n constants = []\n body = func.getBody()\n inst_iter = currentProgram.getListing().getInstructions(body, True)\n for 
inst in inst_iter:\n if inst.getMnemonicString() == \"XOR\":\n for i in range(inst.getNumOperands()):\n op = inst.getOpObjects(i)\n if op and hasattr(op[0], 'getValue'):\n constants.append(int(op[0].getValue()))\n return constants\n\n# Bulk decompile and search for pattern\nfrom ghidra.app.decompiler import DecompInterface\ndecomp = DecompInterface()\ndecomp.openProgram(currentProgram)\n\nfor func in fm.getFunctions(True):\n result = decomp.decompileFunction(func, 30, monitor) # 30-second timeout per function\n if result.decompileCompleted():\n code = result.getDecompiledFunction().getC()\n if \"strcmp\" in code or \"memcmp\" in code:\n # str.format instead of an f-string: Jython is Python 2.7\n print(\"Comparison in {} at {}\".format(func.getName(), func.getEntryPoint()))\n```\n\n---\n\n## Patching Strategies\n\n### Binary Ninja Patching (Python API)\n\n```python\nimport binaryninja as bn\n\nbv = bn.open_view(\"binary\")\n\n# NOP out instruction\nbv.write(0x401234, b\"\\x90\" * 5) # 5-byte NOP\n\n# Patch conditional jump (JNZ → JZ)\nbv.write(0x401234, b\"\\x74\") # 0x75 (JNZ) → 0x74 (JZ)\n\n# Insert always-true (mov eax, 1; ret)\nbv.write(0x401234, b\"\\xb8\\x01\\x00\\x00\\x00\\xc3\")\n\nbv.save(\"patched\")\n```\n\n### LIEF (Library for Instrumenting Executable Formats)\n\n```python\nimport lief\n\n# Parse and modify ELF/PE/Mach-O\nbinary = lief.parse(\"binary\")\n\n# Add a new section\nsection = lief.ELF.Section(\".patch\")\nsection.content = list(b\"\\xcc\" * 0x100)\nsection.type = lief.ELF.SECTION_TYPES.PROGBITS\nsection.flags = lief.ELF.SECTION_FLAGS.EXECINSTR | lief.ELF.SECTION_FLAGS.ALLOC\nbinary.add(section)\n\n# Modify entry point\nbinary.header.entrypoint = 0x401000\n\n# Hook imported function\nbinary.patch_pltgot(\"strcmp\", 0x401000)\n\nbinary.write(\"patched\")\n```\n\n**LIEF advantages:** Cross-format (ELF, PE, Mach-O), Python API, can add sections/segments, modify headers, patch imports.\n\n---\n\n## GDB Constraint Extraction with ILP/LP Solver (BackdoorCTF 2017)\n\nWhen a binary enforces linear arithmetic relationships between input bytes, extract 
constraints automatically via GDB and solve with an ILP solver.\n\n**Technique:** Send position-encoded input (`input[i] = i`) so that when a comparison fires, you know exactly which positions are involved and what their sum/difference must equal. Collect all constraints from logged comparisons, then feed to PuLP or Gurobi.\n\n```python\nfrom pulp import *\n\nn = 32 # flag length\nprob = LpProblem(\"crackme\", LpMinimize)\nx = [LpVariable(f'x{i}', 32, 126, cat='Integer') for i in range(n)]\nprob += 0 # dummy objective\n\n# Constraints extracted via GDB automation (input[i]=i, monitor comparisons):\nprob += x[3] + x[7] == 0xAB\nprob += x[1] - x[5] == 0x0C\n# ... add all extracted constraints ...\n\n# Constrain to printable ASCII\nfor xi in x:\n prob += xi >= 32\n prob += xi \u003c= 126\n\nprob.solve(PULP_CBC_CMD(msg=0))\nflag = ''.join(chr(int(value(xi))) for xi in x)\nprint(\"Flag:\", flag)\n```\n\n**GDB automation to extract constraints:**\n```python\n# In GDB Python: set input[i]=i, run, log every CMP instruction result\nimport gdb\n\nclass CmpLogger(gdb.Breakpoint):\n def stop(self):\n frame = gdb.selected_frame()\n # Read compared values, map back to input indices via position encoding\n return False\n```\n\n**Key insight:** When a binary enforces linear arithmetic relationships between input bytes, ILP solvers directly find the satisfying assignment once constraints are extracted via GDB automation.\n\n**References:** BackdoorCTF 2017\n\n---\n\n## GDB Position-Encoded Input with Zero Flag Monitoring (EKOPARTY 2017)\n\nSend input where `input[i] = i` (position-encoded). Single-step through the binary monitoring the CPU zero flag (ZF). 
When ZF is set at a comparison involving a specific position's value, the comparison matched — log the expected value for that position.\n\n```python\nimport gdb\n\n# Script: single-step binary with position-encoded input, watch ZF\nclass ZFMonitor(gdb.Breakpoint):\n def stop(self):\n zf = (int(gdb.parse_and_eval('$eflags')) >> 6) & 1\n if zf:\n rip = int(gdb.parse_and_eval('$rip'))\n # Disassemble at rip to find the compared immediate\n disasm = gdb.execute(f'x/1i {rip-5}', to_string=True)\n print(f\"ZF set at {rip:#x}: {disasm.strip()}\")\n return False\n\n# Run once with input b'\\x00\\x01\\x02\\x03...\\x1f'\n# ZF fires when comparison matches the position's own value -> that IS the key byte\n```\n\nMaps each input byte to its required value in one pass without manual reversing.\n\n**Key insight:** Position-encoded input (`input[i]=i`) combined with zero flag monitoring reveals the full key/password in one pass — the zero flag fires when the expected value for position i equals i itself.\n\n**References:** EKOPARTY CTF 2017\n\n---\n\n## LD_PRELOAD to Dump Execute-Only Binary (BackdoorCTF 2017)\n\nA binary has execute-only permissions (mode `--x`, no read bit). 
The file cannot be read directly or with standard tools, but the kernel still maps it into memory on execution.\n\nLD_PRELOAD a shared library with a constructor that runs inside the process and reads its own memory via `/proc/self/mem`:\n\n```c\n// dump_xo.c — compile: gcc -shared -fPIC -o dump_xo.so dump_xo.c\n#include \u003cstdio.h>\n#include \u003cstdlib.h>\n#include \u003cstring.h>\n\n__attribute__((constructor)) void dump() {\n FILE *maps = fopen(\"/proc/self/maps\", \"r\");\n char line[256];\n unsigned long base = 0, end = 0;\n\n // Find the execute-only binary's mapping (r-xp or --xp)\n while (fgets(line, sizeof(line), maps)) {\n if (strstr(line, \"binary_name\")) {\n sscanf(line, \"%lx-%lx\", &base, &end);\n break;\n }\n }\n fclose(maps);\n\n FILE *mem = fopen(\"/proc/self/mem\", \"rb\");\n fseek(mem, base, SEEK_SET);\n size_t size = end - base;\n void *buf = malloc(size);\n fread(buf, 1, size, mem);\n fclose(mem);\n\n FILE *out = fopen(\"/tmp/dumped_binary\", \"wb\");\n fwrite(buf, 1, size, out);\n fclose(out);\n}\n// Usage: LD_PRELOAD=./dump_xo.so ./binary_xo\n```\n\n**Key insight:** Execute-only prevents file reading but not execution. LD_PRELOAD constructors run inside the process where `/proc/self/mem` provides access to mapped memory regardless of file permissions.\n\n**References:** BackdoorCTF 2017\n\n---\n\n## PEDA current_inst Bit-by-Bit Flag Scraper (CONFidence CTF 2019 Teaser)\n\n**Pattern (Elementary):** A large obfuscated validator dispatches one `call functionN` per flag bit. Each check reads `flag[offset] >> bit & 1`, calls an opaque `functionN(that_bit)`, and compares the return with a constant. 
Rather than decompile hundreds of wrapper functions, drive GDB+PEDA to step through the dispatcher and read `edi` (argument) / `eax` (return) directly:\n\n```python\n# peda.current_inst(rip) returns (addr, mnemonic_str); use it as a cheap\n# disassembler to track the shift/add offsets passed to each checker.\ndef get_current_inst():\n return peda.current_inst(peda.getreg(\"rip\"))[1]\n\npeda.execute('file ./elementary')\npeda.set_breakpoint(0x555555554000 + 0xCEB88)\npeda.execute(\"run \u003c _input\") # _input = 'A' * 103 (length only)\n\nflag = ['0'] * 832 # 104 bytes * 8 bits\nwhile peda.getreg(\"rip\") \u003c 0x555555554000 + 0xD827F:\n offset = bit = 0\n while 'and' != get_current_inst()[:3]:\n cur = get_current_inst()\n if cur == 'mov rax,QWORD PTR [rbp-0x18]':\n offset = 0; bit = 0\n elif cur.startswith('sar'):\n bit = int(cur.split(',')[1], 16) # shift count -> bit index\n elif cur.startswith('add'):\n offset = int(cur.split(',')[1], 16) # byte index in flag\n peda.execute('si')\n while 'call' not in get_current_inst(): peda.execute('si')\n tmp = peda.getreg('edi')\n peda.execute('ni')\n ret = peda.getreg('eax')\n # Oracle: if return == arg the checker voted \"bit=0\", else \"bit=1\"\n flag[offset * 8 + bit] = '0' if (ret != 0 and tmp == ret) else '1'\n peda.execute('set $eax=0') # neutralise so loop continues\n```\n\n**Key insight:** Any validator of the form `f_i(bit_i) == const_i` is a black-box oracle — you do not need to understand `f_i`. 
PEDA's `current_inst()` + `si`/`ni` give a 30-line Python scraper that harvests all bits in one run; parsing the preceding `sar imm` / `add imm` instructions recovers `(byte_offset, bit_index)` without disassembling the validator's arithmetic.\n\n**References:** CONFidence CTF 2019 Teaser — Elementary, writeup 13927\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":14829,"content_sha256":"2e7bcf73790eaa0f2fdaa289155704bfdb0ca899e617267811b04cd62b307eea"},{"filename":"tools-advanced.md","content":"# CTF Reverse - Advanced Tools & Deobfuscation\n\nAdvanced tooling for commercial packers/protectors, binary diffing, deobfuscation frameworks, emulation, and symbolic execution beyond angr.\n\nFor advanced GDB scripting, Ghidra automation, patching frameworks, and GDB-driven CTF techniques, see [tools-advanced-2.md](tools-advanced-2.md).\n\n## Table of Contents\n- [VMProtect Analysis](#vmprotect-analysis)\n - [Recognition](#recognition)\n - [Approach](#approach)\n - [Tools](#tools)\n - [CTF Strategy](#ctf-strategy)\n- [Themida / WinLicense Analysis](#themida--winlicense-analysis)\n - [Themida Recognition](#themida-recognition)\n - [Approach for CTF](#approach-for-ctf)\n- [Binary Diffing](#binary-diffing)\n - [BinDiff](#bindiff)\n - [Diaphora](#diaphora)\n- [Deobfuscation Frameworks](#deobfuscation-frameworks)\n - [D-810 (IDA)](#d-810-ida)\n - [GOOMBA (Ghidra)](#goomba-ghidra)\n - [Miasm](#miasm)\n- [Qiling Framework (Emulation)](#qiling-framework-emulation)\n- [Triton (Dynamic Symbolic Execution)](#triton-dynamic-symbolic-execution)\n- [Manticore (Symbolic Execution)](#manticore-symbolic-execution)\n- [Rizin / Cutter](#rizin--cutter)\n- [RetDec (Retargetable Decompiler)](#retdec-retargetable-decompiler)\n- [Custom VM Bytecode Lifting to LLVM IR (Google CTF 2017)](#custom-vm-bytecode-lifting-to-llvm-ir-google-ctf-2017)\n\n---\n\n## VMProtect Analysis\n\nVMProtect virtualizes x86/x64 code into custom bytecode interpreted by a generated VM. 
One of the most challenging protectors in CTF.\n\n### Recognition\n\n```bash\n# VMProtect signatures\nstrings binary | grep -i \"vmp\\|vmprotect\"\n# PE sections: .vmp0, .vmp1 (VMProtect adds its own sections)\nreadelf -S binary | grep \".vmp\"\n# Large binary with entropy > 7.5 in certain sections\n```\n\n**Key indicators:**\n- `push` / `pop` heavy prologues (VM entry pushes all registers to stack)\n- Large switch-case dispatcher (the VM handler loop)\n- Anti-debug checks embedded in VM handlers\n- Mutation engine: same opcode has different handlers per build\n\n### Approach\n\n```text\n1. Identify VM entry points — look for pushad/pushaq-like sequences\n2. Find the handler table — large indirect jump (jmp [reg + offset])\n3. Trace handler execution — each handler ends with jump to next\n4. Identify handlers:\n - vAdd, vSub, vMul, vXor, vNot (arithmetic)\n - vPush, vPop (stack operations)\n - vLoad, vStore (memory access)\n - vJmp, vJcc (control flow)\n - vRet (VM exit — restores real registers)\n5. Build disassembler for VM bytecode\n6. Simplify / deobfuscate the lifted IL\n```\n\n### Tools\n\n- **VMPAttack** (IDA plugin): Automatically identifies VM handlers\n- **NoVmp**: Devirtualization via VTIL (open-source)\n- **VMProtect devirtualizer scripts**: Community IDA/Binary Ninja scripts\n- **Approach for CTF:** Often easier to trace specific operations (crypto, comparisons) than fully devirtualize\n\n### CTF Strategy\n\n```python\n# Trace VM execution dynamically to extract operations on flag\n# Hook VM handler dispatch to log opcode + operands\n\nimport frida\n\nscript = \"\"\"\nvar vm_dispatch = ptr('0x...'); // Address of handler table jump\nInterceptor.attach(vm_dispatch, {\n onEnter(args) {\n // Log handler index and stack state\n var handler_idx = this.context.rax; // or whichever register\n console.log('Handler:', handler_idx, 'RSP:', this.context.rsp);\n }\n});\n\"\"\"\n```\n\n**Key insight:** Full devirtualization is rarely needed for CTF. 
Focus on tracing what operations are performed on your input. Hook comparison/crypto functions called from within the VM.\n\n---\n\n## Themida / WinLicense Analysis\n\nSimilar to VMProtect but with additional anti-debug layers.\n\n### Themida Recognition\n- Sections: `.themida`, `.winlice`\n- Extremely heavy anti-debug (kernel-level checks, driver installation)\n- Code mutation + virtualization + packing combined\n\n### Approach for CTF\n1. **Dump unpacked code:** Let it run, dump process memory after unpacking\n2. **Bypass anti-debug:** ScyllaHide in x64dbg with Themida-specific preset\n3. **Fix imports:** Use Scylla plugin for IAT reconstruction\n4. **Focus on dumped code:** Once unpacked, analyze as normal binary\n\n```bash\n# x64dbg workflow for Themida:\n1. Load binary\n2. Enable ScyllaHide → Profile: Themida\n3. Run to OEP (Original Entry Point) — may need several attempts\n4. Dump with Scylla: OEP → IAT Autosearch → Get Imports → Dump\n5. Fix dump: Scylla → Fix Dump\n6. Analyze fixed dump in Ghidra/IDA\n```\n\n---\n\n## Binary Diffing\n\nCritical for patch analysis, 1-day exploit development, and CTF challenges that provide two versions of a binary.\n\n### BinDiff\n\n```bash\n# Export from IDA/Ghidra first, then diff\n# IDA: File → BinExport → Export as BinExport2\n# Ghidra: Use BinExport plugin\n\n# Command line diffing\nbindiff primary.BinExport secondary.BinExport\n# Opens in BinDiff GUI — shows matched/unmatched functions\n```\n\n**Key metrics:**\n- Similarity score (0.0-1.0) per function pair\n- Changed instructions highlighted\n- Unmatched functions = new/removed code\n\n### Diaphora\n\nFree, open-source alternative to BinDiff, runs as IDA plugin.\n\n```bash\n# In IDA:\n# File → Script file → diaphora.py\n# Export first binary, then open second and diff\n\n# Ghidra version: diaphora_ghidra.py\n```\n\n**Useful for CTF:** When challenge provides \"patched\" and \"original\" binaries, diff reveals the vulnerability or hidden functionality.\n\n---\n\n## 
Deobfuscation Frameworks\n\n### D-810 (IDA)\n\nPattern-based deobfuscation plugin for IDA Pro. Excellent for OLLVM-obfuscated binaries.\n\n```text\nCapabilities:\n- MBA simplification: (a ^ b) + 2*(a & b) → a + b\n- Dead code elimination\n- Opaque predicate removal\n- Constant folding\n- Control flow unflattening (partial)\n\nInstallation: Copy to IDA plugins directory\nUsage: Edit → Plugins → D-810 → Select rules → Apply\n```\n\n### GOOMBA (Ghidra)\n\n```text\nGOOMBA (Ghidra-based Obfuscated Object Matching and Bytes Analysis):\n- Integrates with Ghidra's P-Code\n- Simplifies MBA expressions\n- Pattern matching for known obfuscation\n\nInstallation: Copy .jar to Ghidra extensions\nUsage: Code Browser → Analysis → GOOMBA\n```\n\n### Miasm\n\nPowerful reverse engineering framework with symbolic execution and IR lifting.\n\n```python\nfrom miasm.analysis.binary import Container\nfrom miasm.analysis.machine import Machine\nfrom miasm.expression.expression import *\n\n# Load binary and lift to Miasm IR\ncont = Container.from_stream(open(\"binary\", \"rb\"))\nmachine = Machine(cont.arch)\nmdis = machine.dis_engine(cont.bin_stream, loc_db=cont.loc_db)\n\n# Disassemble function\nasmcfg = mdis.dis_multiblock(entry_addr)\n\n# Lift to IR\nlifter = machine.lifter_model_call(loc_db=cont.loc_db)\nircfg = lifter.new_ircfg_from_asmcfg(asmcfg)\n\n# Symbolic execution\nfrom miasm.ir.symbexec import SymbolicExecutionEngine\nsb = SymbolicExecutionEngine(lifter)\n# Execute symbolically, then simplify expressions\n```\n\n**Use case:** Deobfuscate expression trees, simplify complex arithmetic, trace data flow through obfuscated code.\n\n---\n\n## Qiling Framework (Emulation)\n\nCross-platform emulation framework built on Unicorn, with OS-level support (syscalls, filesystem, registry).\n\n```python\nfrom qiling import Qiling\nfrom qiling.const import QL_VERBOSE\n\n# Emulate Linux ELF\nql = Qiling([\"./binary\"], \"rootfs/x8664_linux\",\n verbose=QL_VERBOSE.DEBUG)\n\n# Hook specific 
address\ndef hook_check(ql):\n ql.arch.regs.rax = 0 # Bypass check\n ql.log.info(\"Anti-debug bypassed\")\n\nql.hook_address(hook_check, 0x401234) # callback fires when execution reaches 0x401234\n\n# Hook syscall\ndef hook_ptrace(ql, request, pid, addr, data):\n return 0 # Always succeed\n\nql.os.set_syscall(\"ptrace\", hook_ptrace)\n\n# Hook API (Windows targets; callback signature may vary by Qiling version)\ndef hook_isdebug(ql, address, params):\n return 0\n\nql.os.set_api(\"IsDebuggerPresent\", hook_isdebug)\n\nql.run()\n```\n\n**Advantages over Unicorn:**\n- OS emulation (file I/O, network, registry)\n- Multi-platform (Linux, Windows, macOS, Android, UEFI)\n- Built-in debugger interface\n- Rootfs for library loading\n\n**CTF use cases:**\n- Emulate binaries for foreign architectures (ARM, MIPS, RISC-V)\n- Bypass all anti-debug at once (no debugger artifacts)\n- Fuzz embedded/IoT firmware without hardware\n- Trace execution without code modification\n\n---\n\n## Triton (Dynamic Symbolic Execution)\n\nPin-based dynamic binary analysis framework with symbolic execution, taint analysis, and AST simplification.\n\n```python\nfrom triton import *\n\nctx = TritonContext(ARCH.X86_64)\n\n# Load binary sections\nwith open(\"binary\", \"rb\") as f:\n binary = f.read()\nctx.setConcreteMemoryAreaValue(0x400000, binary)\n\n# Symbolize input\nfor i in range(32):\n ctx.symbolizeMemory(MemoryAccess(INPUT_ADDR + i, CPUSIZE.BYTE), f\"input_{i}\")\n\n# Emulate instructions\npc = ENTRY_POINT\nwhile pc:\n inst = Instruction(pc, ctx.getConcreteMemoryAreaValue(pc, 16))\n ctx.processing(inst)\n\n # At comparison point, extract path constraint\n if pc == CMP_ADDR:\n ast = ctx.getPathConstraintsAst()\n model = ctx.getModel(ast)\n for k, v in sorted(model.items()):\n print(f\"input[{k}] = {chr(v.getValue())}\", end=\"\")\n break\n\n pc = ctx.getConcreteRegisterValue(ctx.registers.rip)\n```\n\n**Triton vs angr:**\n| Feature | Triton | angr |\n|---|---|---|\n| Execution | Concrete + symbolic (DSE) | Fully symbolic |\n| Speed | Faster 
(concrete-driven) | Slower (explores all paths) |\n| Path explosion | Less prone (follows one path) | Major issue |\n| API | C++ / Python | Python |\n| Best for | Single-path deobfuscation, taint tracking | Multi-path exploration |\n\n**Key use:** Triton excels at deobfuscation: run the program concretely while tracking symbolic state, then simplify the collected constraints.\n\n---\n\n## Manticore (Symbolic Execution)\n\nTrail of Bits' symbolic execution tool. Similar to angr but with native EVM (Ethereum) support.\n\n```python\nfrom manticore.native import Manticore\n\nm = Manticore(\"./binary\")\n\n# Hook success/failure\n@m.hook(0x401234)\ndef success(state):\n buf = state.solve_one_n_batched(state.input_symbols, 32)\n print(\"Flag:\", bytes(buf))\n m.kill()\n\n@m.hook(0x401256)\ndef fail(state):\n state.abandon()\n\nm.run()\n```\n\n**Best for:** EVM/smart contract analysis, simpler Linux binaries. angr is generally more mature for complex RE tasks.\n\n---\n\n## Rizin / Cutter\n\nRizin is a fork of radare2. Cutter is its Qt-based GUI.\n\n```bash\n# Rizin CLI (r2-compatible commands)\nrizin -d ./binary\n> aaa # Analyze all\n> afl # List functions\n> pdf @ main # Print disassembly\n> VV # Visual graph mode\n\n# Cutter GUI\ncutter binary # Open in GUI with decompiler\n```\n\n**Cutter advantages:**\n- Built-in Ghidra decompiler (via rz-ghidra plugin)\n- Graph view, hex editor, debug panel in one GUI\n- Integrated Python/JavaScript scripting console\n- Free and open source\n\n---\n\n## RetDec (Retargetable Decompiler)\n\nLLVM-based decompiler supporting many architectures. 
Free and open-source.\n\n```bash\n# Install: download a prebuilt release from https://github.com/avast/retdec/releases\n# (or build from source) and add its bin/ directory to PATH\n\n# CLI\nretdec-decompiler binary\n# Outputs: binary.c (decompiled C), binary.dsm (disassembly)\n\n# Specific function\nretdec-decompiler --select-ranges 0x401000-0x401100 binary\n```\n\n**Strengths:** Multi-arch support (x86, ARM, MIPS, PowerPC, PIC32), free, aims to produce compilable C. Good for architectures not well-supported by Ghidra.\n\n---\n\n## Custom VM Bytecode Lifting to LLVM IR (Google CTF 2017)\n\nFor complex custom VMs, transpile the VM bytecode to LLVM IR and use LLVM's optimization passes to simplify the code, then decompile the optimized IR.\n\n```python\n# Pipeline: VM bytecode → custom disassembler → LLVM IR → optimize → decompile\n# 1. Write disassembler for the custom VM opcodes\n# 2. Emit LLVM IR for each opcode:\n# INC reg → %reg = add i32 %reg, 1\n# CDEC reg → conditional decrement\n# CALL fn → call void @fn()\n# 3. Optimize the lifted IR with LLVM's opt:\n# opt -O3 -S vm_lifted.ll -o vm_optimized.ll\n# 4. Compile the optimized IR (llc/clang) and load it in IDA, or decompile with RetDec\n# Result: 1300 lines → 150 lines after inlining + constant folding\n```\n\n**Key insight:** LLVM's optimization passes (inlining, constant folding, dead code elimination) dramatically simplify lifted VM bytecode. 
A custom VM with 26 registers and 3 opcodes that produces 1300 lines of IL reduces to ~150 lines after `-O3`, revealing the underlying algorithm (e.g., Collatz sequence computation).\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":12375,"content_sha256":"8ac3db030d0d6daf366de53005d12d9720b952ae8aa6216c97926111d8fa0b36"},{"filename":"tools-dynamic.md","content":"# CTF Reverse - Dynamic Analysis Tools\n\n## Table of Contents\n- [Frida (Dynamic Instrumentation)](#frida-dynamic-instrumentation)\n - [Installation](#installation)\n - [Basic Function Hooking](#basic-function-hooking)\n - [Anti-Debug Bypass](#anti-debug-bypass)\n - [Memory Scanning and Patching](#memory-scanning-and-patching)\n - [Function Replacement](#function-replacement)\n - [Tracing and Stalker](#tracing-and-stalker)\n - [r2frida (Radare2 + Frida Integration)](#r2frida-radare2--frida-integration)\n - [Frida for Android/iOS](#frida-for-androidios)\n - [Frida Memoization for Recursive Function Speedup (hxp CTF 2017)](#frida-memoization-for-recursive-function-speedup-hxp-ctf-2017)\n- [angr (Symbolic Execution)](#angr-symbolic-execution)\n - [angr Installation](#angr-installation)\n - [Basic Path Exploration](#basic-path-exploration)\n - [Symbolic Input with Constraints](#symbolic-input-with-constraints)\n - [Hook Functions to Simplify Analysis](#hook-functions-to-simplify-analysis)\n - [Exploring from Specific Address](#exploring-from-specific-address)\n - [Common Patterns and Tips](#common-patterns-and-tips)\n - [Dealing with Path Explosion](#dealing-with-path-explosion)\n - [angr CFG Recovery](#angr-cfg-recovery)\n- [lldb (LLVM Debugger)](#lldb-llvm-debugger)\n - [Basic Commands](#basic-commands)\n - [Scripting (Python)](#scripting-python)\n- [x64dbg (Windows Debugger)](#x64dbg-windows-debugger)\n - [Key Features](#key-features)\n - [Scripting](#scripting)\n - [Common CTF Workflow](#common-ctf-workflow)\n- [GDB Register Side-Channel on putchar() (picoCTF 
2018)](#gdb-register-side-channel-on-putchar-picoctf-2018)\n- [radare2 Visual Panels for Custom VM Tracing (OTW Advent 2018)](#radare2-visual-panels-for-custom-vm-tracing-otw-advent-2018)\n- [libSegFault.so Register Dump at Crash (OTW Advent 2018)](#libsegfaultso-register-dump-at-crash-otw-advent-2018)\n- [r2pipe Binary Walking + DP Constraint Solver (OTW Advent 2018)](#r2pipe-binary-walking--dp-constraint-solver-otw-advent-2018)\n- [GDB Commands at strcmp to Recover Dynamic XOR Key (TAMUctf 2019)](#gdb-commands-at-strcmp-to-recover-dynamic-xor-key-tamuctf-2019)\n\nFor Qiling/Triton emulation and Intel Pin / LD_PRELOAD side-channel techniques, see [tools-emulation.md](tools-emulation.md).\n\n---\n\n## Frida (Dynamic Instrumentation)\n\nFrida injects JavaScript into running processes for real-time hooking, tracing, and modification. Essential for anti-debug bypass, runtime inspection, and mobile RE.\n\n### Installation\n\n```bash\npip install frida-tools frida\n# Verify\nfrida --version\n```\n\n### Basic Function Hooking\n\n```javascript\n// hook.js — intercept a function and log arguments/return value\nInterceptor.attach(Module.findExportByName(null, \"strcmp\"), {\n onEnter: function(args) {\n this.arg0 = Memory.readUtf8String(args[0]);\n this.arg1 = Memory.readUtf8String(args[1]);\n console.log(`strcmp(\"${this.arg0}\", \"${this.arg1}\")`);\n },\n onLeave: function(retval) {\n console.log(` → ${retval}`);\n }\n});\n```\n\n```bash\n# Attach to running process\nfrida -p $(pidof binary) -l hook.js\n\n# Spawn and instrument from start\nfrida -f ./binary -l hook.js --no-pause\n\n# One-liner: hook strcmp and dump comparisons\nfrida -f ./binary --no-pause -e '\nInterceptor.attach(Module.findExportByName(null, \"strcmp\"), {\n onEnter(args) {\n console.log(\"strcmp:\", Memory.readUtf8String(args[0]), Memory.readUtf8String(args[1]));\n }\n});\n'\n```\n\n### Anti-Debug Bypass\n\n```javascript\n// Bypass ptrace(PTRACE_TRACEME) — returns 0 (success) without 
calling\nInterceptor.attach(Module.findExportByName(null, \"ptrace\"), {\n onEnter: function(args) {\n this.request = args[0].toInt32();\n },\n onLeave: function(retval) {\n if (this.request === 0) { // PTRACE_TRACEME\n retval.replace(ptr(0));\n console.log(\"[*] ptrace(TRACEME) bypassed\");\n }\n }\n});\n\n// Bypass IsDebuggerPresent (Windows)\nvar isDbg = Module.findExportByName(\"kernel32.dll\", \"IsDebuggerPresent\");\nInterceptor.attach(isDbg, {\n onLeave: function(retval) {\n retval.replace(ptr(0));\n }\n});\n\n// Bypass timing checks: hook clock_gettime to return constant\nInterceptor.attach(Module.findExportByName(null, \"clock_gettime\"), {\n onEnter: function(args) {\n // Save the struct timespec pointer now; argument registers may be clobbered by the time onLeave runs\n this.ts = args[1];\n },\n onLeave: function(retval) {\n // Force constant timestamp to defeat timing checks\n Memory.writeU64(this.ts, 0); // tv_sec\n Memory.writeU64(this.ts.add(8), 0); // tv_nsec\n }\n});\n```\n\n### Memory Scanning and Patching\n\n```javascript\n// Scan for flag pattern in memory\nProcess.enumerateRanges('r--').forEach(function(range) {\n Memory.scan(range.base, range.size, \"66 6c 61 67 7b\", { // \"flag{\"\n onMatch: function(address, size) {\n console.log(\"[FLAG] Found at:\", address, Memory.readUtf8String(address, 64));\n },\n onComplete: function() {}\n });\n});\n\n// Patch instruction (NOP out a check)\nvar addr = Module.findBaseAddress(\"binary\").add(0x1234);\nMemory.patchCode(addr, 2, function(code) {\n var writer = new X86Writer(code, { pc: addr });\n writer.putNop();\n writer.putNop();\n writer.flush();\n});\n```\n\n### Function Replacement\n\n```javascript\n// Replace a validation function to always return true\nvar checkFlag = Module.findExportByName(null, \"check_flag\");\nInterceptor.replace(checkFlag, new NativeCallback(function(input) {\n console.log(\"[*] check_flag called with:\", Memory.readUtf8String(input));\n return 1; // always valid\n}, 'int', ['pointer']));\n```\n\n### Tracing and Stalker\n\n```javascript\n// Trace all calls in a function 
(Stalker — instruction-level tracing)\nvar targetAddr = Module.findExportByName(null, \"main\");\nStalker.follow(Process.getCurrentThreadId(), {\n transform: function(iterator) {\n var instruction;\n while ((instruction = iterator.next()) !== null) {\n if (instruction.mnemonic === \"call\") {\n iterator.putCallout(function(context) {\n console.log(\"CALL at\", context.pc, \"→\", ptr(context.pc).readPointer());\n });\n }\n iterator.keep();\n }\n }\n});\n```\n\n### r2frida (Radare2 + Frida Integration)\n\n```bash\n# Attach radare2 to process via Frida\nr2 frida://spawn/./binary\n\n# r2frida commands\n\\ii # List imports\n\\il # List loaded modules\n\\dt strcmp # Trace strcmp calls\n\\dc # Continue execution\n\\dm # List memory maps\n```\n\n### Frida for Android/iOS\n\n```bash\n# Android (requires rooted device or Frida server)\nadb push frida-server /data/local/tmp/\nadb shell \"chmod 755 /data/local/tmp/frida-server && /data/local/tmp/frida-server &\"\n\n# Hook Android Java methods\nfrida -U -f com.example.app -l hook_android.js --no-pause\n```\n\n```javascript\n// hook_android.js — hook Java method\nJava.perform(function() {\n var MainActivity = Java.use(\"com.example.app.MainActivity\");\n MainActivity.checkPassword.implementation = function(input) {\n console.log(\"[*] checkPassword called with:\", input);\n var result = this.checkPassword(input);\n console.log(\"[*] Result:\", result);\n return result;\n };\n});\n```\n\n**Key insight:** Frida excels where static analysis fails — obfuscated code, packed binaries, and runtime-generated data. Hook comparison functions (`strcmp`, `memcmp`, custom validators) to extract expected values without reversing the algorithm. 
Use `Interceptor.attach` for observation, `Interceptor.replace` for modification.\n\n**When to use:** Anti-debugging bypass, extracting runtime-computed keys, hooking crypto functions to dump plaintext, mobile app analysis, packed binary inspection.\n\n### Frida Memoization for Recursive Function Speedup (hxp CTF 2017)\n\nHook a recursive function with Frida, memoize results, and replay cached values to skip redundant computation. Fibonacci-like recursive challenges with exponential complexity become instant with memoization.\n\n```javascript\n// memo_hook.js — memoize a recursive function to skip redundant calls\nvar memo = {};\nvar funcAddr = ptr(\"0x400abc\"); // Address of the recursive function\nvar retAddr = ptr(\"0x400def\"); // Address of the function's ret instruction\n\nInterceptor.attach(funcAddr, {\n onEnter: function(args) {\n this.key = args[0].toInt32();\n if (memo[this.key] !== undefined) {\n // Skip computation entirely: set return value and jump to ret\n this.context.rax = memo[this.key];\n this.context.rip = retAddr;\n }\n },\n onLeave: function(retval) {\n // Cache the result for future calls with the same argument\n memo[this.key] = retval.toInt32();\n }\n});\n```\n\n```bash\n# Usage\nfrida -f ./binary -l memo_hook.js --no-pause\n```\n\nFor multi-argument functions, build a composite key:\n```javascript\nInterceptor.attach(funcAddr, {\n onEnter: function(args) {\n this.key = args[0].toInt32() + \",\" + args[1].toInt32();\n if (memo[this.key] !== undefined) {\n this.context.rax = memo[this.key];\n this.context.rip = retAddr;\n }\n },\n onLeave: function(retval) {\n memo[this.key] = retval.toInt32();\n }\n});\n```\n\n**Key insight:** Frida's `Interceptor` can both read and modify register state, allowing you to skip function execution entirely by setting `rax` (return value) and `rip` (to the `ret` instruction). This works on any recursive function where the same arguments produce the same result. 
Exponential-time recursive computations (Fibonacci, Ackermann, tree traversals) become linear with memoization.\n\n**References:** hxp CTF 2017\n\n---\n\n## angr (Symbolic Execution)\n\nangr automatically explores program paths to find inputs satisfying constraints. Solves many flag-checking binaries in minutes that take hours manually.\n\n### angr Installation\n\n```bash\npip install angr\n```\n\n### Basic Path Exploration\n\n```python\nimport angr\nimport claripy\n\n# Load binary\nproj = angr.Project('./binary', auto_load_libs=False)\n\n# Find address of \"Correct!\" print, avoid \"Wrong!\" print\n# Get these from disassembly (objdump -d or Ghidra)\nFIND_ADDR = 0x401234 # Address of success path\nAVOID_ADDR = 0x401256 # Address of failure path\n\n# Create simulation manager and explore\nsimgr = proj.factory.simgr()\nsimgr.explore(find=FIND_ADDR, avoid=AVOID_ADDR)\n\nif simgr.found:\n found = simgr.found[0]\n # Get stdin that reaches the target\n print(\"Flag:\", found.posix.dumps(0)) # fd 0 = stdin\n```\n\n### Symbolic Input with Constraints\n\n```python\nimport angr\nimport claripy\n\nproj = angr.Project('./binary', auto_load_libs=False)\n\n# Create symbolic input (e.g., 32-byte flag)\nflag_len = 32\nflag_chars = [claripy.BVS(f'flag_{i}', 8) for i in range(flag_len)]\nflag = claripy.Concat(*flag_chars + [claripy.BVV(b'\\n')])\n\n# Constrain to printable ASCII\nstate = proj.factory.entry_state(stdin=flag)\nfor c in flag_chars:\n state.solver.add(c >= 0x20)\n state.solver.add(c \u003c= 0x7e)\n\n# Constrain known prefix: \"flag{\"\nstate.solver.add(flag_chars[0] == ord('f'))\nstate.solver.add(flag_chars[1] == ord('l'))\nstate.solver.add(flag_chars[2] == ord('a'))\nstate.solver.add(flag_chars[3] == ord('g'))\nstate.solver.add(flag_chars[4] == ord('{'))\nstate.solver.add(flag_chars[flag_len-1] == ord('}'))\n\nsimgr = proj.factory.simgr(state)\nsimgr.explore(find=0x401234, avoid=0x401256)\n\nif simgr.found:\n found = simgr.found[0]\n result = found.solver.eval(flag, 
cast_to=bytes)\n print(\"Flag:\", result.decode())\n```\n\n### Hook Functions to Simplify Analysis\n\n```python\nimport angr\n\nproj = angr.Project('./binary', auto_load_libs=False)\n\n# Hook printf to avoid path explosion in I/O\n@proj.hook(0x401100, length=5) # Address of call to printf\ndef skip_printf(state):\n pass # Do nothing, just skip\n\n# Hook sleep/anti-debug functions\n@proj.hook(0x401050, length=5) # Address of call to sleep\ndef skip_sleep(state):\n pass\n\n# Replace a function with a summary\nclass AlwaysSucceed(angr.SimProcedure):\n def run(self):\n return 1\n\nproj.hook_symbol('check_license', AlwaysSucceed())\n```\n\n### Exploring from Specific Address\n\n```python\n# Start from middle of function (skip initialization)\nstate = proj.factory.blank_state(addr=0x401200)\n\n# Set up registers/memory manually\nstate.regs.rdi = 0x600000 # Pointer to input buffer\nstate.memory.store(0x600000, b\"AAAA\" + b\"\\x00\" * 28)\n\nsimgr = proj.factory.simgr(state)\nsimgr.explore(find=0x401300, avoid=0x401350)\n```\n\n### Common Patterns and Tips\n\n```python\n# Pattern 1: argv-based input\nstate = proj.factory.entry_state(args=['./binary', flag_sym])\n\n# Pattern 2: Multiple find/avoid addresses\nsimgr.explore(\n find=[0x401234, 0x401300], # Any success path\n avoid=[0x401256, 0x401400] # All failure paths\n)\n\n# Pattern 3: Find by output string (no address needed)\ndef is_successful(state):\n stdout = state.posix.dumps(1) # fd 1 = stdout\n return b\"Correct\" in stdout\n\ndef should_avoid(state):\n stdout = state.posix.dumps(1)\n return b\"Wrong\" in stdout\n\nsimgr.explore(find=is_successful, avoid=should_avoid)\n\n# Pattern 4: Timeout protection\nsimgr.explore(find=0x401234, avoid=0x401256, num_find=1)\n# Or use exploration techniques:\nsimgr.use_technique(angr.exploration_techniques.DFS()) # Depth-first\nsimgr.use_technique(angr.exploration_techniques.LengthLimiter(max_length=500))\n```\n\n### Dealing with Path Explosion\n\n```python\n# Use DFS 
instead of BFS (default) for flag checkers\nsimgr.use_technique(angr.exploration_techniques.DFS())\n\n# Limit symbolic memory operations\nstate.options.add(angr.options.ZERO_FILL_UNCONSTRAINED_MEMORY)\nstate.options.add(angr.options.ZERO_FILL_UNCONSTRAINED_REGISTERS)\n\n# Hook expensive functions (crypto, hashing) to avoid explosion\nimport hashlib\nclass SHA256Hook(angr.SimProcedure):\n def run(self, data, length, output):\n # Concretize input and compute hash\n concrete_data = self.state.solver.eval(\n self.state.memory.load(data, self.state.solver.eval(length)),\n cast_to=bytes\n )\n h = hashlib.sha256(concrete_data).digest()\n self.state.memory.store(output, h)\n\nproj.hook_symbol('SHA256', SHA256Hook())\n```\n\n### angr CFG Recovery\n\n```python\n# Control flow graph for understanding structure\ncfg = proj.analyses.CFGFast()\nprint(f\"Functions found: {len(cfg.functions)}\")\n\n# Find main\nfor addr, func in cfg.functions.items():\n if func.name == 'main':\n print(f\"main at {addr:#x}\")\n break\n\n# Cross-references\nnode = cfg.model.get_any_node(0x401234)\nprint(\"Predecessors:\", [hex(p.addr) for p in cfg.model.get_predecessors(node)])\n```\n\n**Key insight:** angr works best on flag-checker binaries with clear success/failure paths. For complex binaries, hook expensive functions (crypto, I/O) and use DFS exploration. Start with the simplest approach (just find/avoid addresses) before adding constraints. If angr is slow, constrain input to printable ASCII and add known prefix.\n\n**When to use:** Flag validators with branching logic, maze/path-finding binaries, constraint-heavy checks, automated binary analysis. Less effective for: heavy crypto, floating-point math, complex heap operations.\n\n---\n\n## lldb (LLVM Debugger)\n\nPrimary debugger for macOS/iOS. Also works on Linux. 
Preferred for Swift/Objective-C and Apple platform binaries.\n\n### Basic Commands\n\n```bash\nlldb ./binary\n(lldb) run # Run program\n(lldb) b main # Breakpoint on main\n(lldb) b 0x401234 # Breakpoint at address\n(lldb) breakpoint set -r \"check.*\" # Regex breakpoint\n(lldb) c # Continue\n(lldb) si # Step instruction\n(lldb) ni # Next instruction\n(lldb) register read # Show all registers\n(lldb) register write rax 0 # Modify register\n(lldb) memory read 0x401000 -c 32 # Read 32 bytes\n(lldb) x/s $rsi # Examine string (GDB-style)\n(lldb) dis -n main # Disassemble function\n(lldb) image list # Loaded modules + base addresses\n```\n\n### Scripting (Python)\n\n```python\n# lldb Python scripting\nimport lldb\n\ndef hook_strcmp(debugger, command, result, internal_dict):\n target = debugger.GetSelectedTarget()\n process = target.GetProcess()\n thread = process.GetSelectedThread()\n frame = thread.GetSelectedFrame()\n arg0 = frame.FindRegister(\"rdi\").GetValueAsUnsigned()\n arg1 = frame.FindRegister(\"rsi\").GetValueAsUnsigned()\n s0 = process.ReadCStringFromMemory(arg0, 256, lldb.SBError())\n s1 = process.ReadCStringFromMemory(arg1, 256, lldb.SBError())\n print(f'strcmp(\"{s0}\", \"{s1}\")')\n\n# Register in lldb: command script add -f script.hook_strcmp hook_strcmp\n```\n\n**Key insight:** Use lldb for macOS binaries (Mach-O), iOS apps, and when GDB isn't available. `image list` gives ASLR slide for PIE binaries. Scripting API is more structured than GDB's.\n\n---\n\n## x64dbg (Windows Debugger)\n\nOpen-source Windows debugger with modern UI. 
Alternative to OllyDbg/WinDbg for Windows RE challenges.\n\n### Key Features\n\n```bash\n# Launch\nx64dbg.exe binary.exe # 64-bit\nx32dbg.exe binary.exe # 32-bit\n\n# Essential shortcuts\nF2 → Toggle breakpoint\nF7 → Step into\nF8 → Step over\nF9 → Run\nCtrl+G → Go to address\nCtrl+F → Find pattern in memory\n```\n\n### Scripting\n\n```bash\n# x64dbg command line\nbp 0x401234 # Breakpoint\nSetBPX 0x401234, 0, \"log {s:utf8@[esp+4]}\" # Log string arg on hit\nrun # Continue\nStepOver # Step over\n```\n\n### Common CTF Workflow\n\n1. Set breakpoint on `GetWindowTextA`/`MessageBoxA` for GUI crackers\n2. Trace back from success/failure message\n3. Use **Scylla** plugin for IAT reconstruction on packed binaries\n4. **Snowman** decompiler plugin for quick pseudo-C\n\n**Key insight:** x64dbg has built-in pattern scanning, hardware breakpoints, and conditional logging. For Windows CTF binaries, it's often faster than IDA/Ghidra for dynamic analysis. Use the **xAnalyzer** plugin for automatic function argument annotation.\n\n---\n\n## GDB Register Side-Channel on putchar() (picoCTF 2018)\n\n**Pattern:** The binary decrypts a flag one character at a time and calls `putchar()` with a `usleep()` between prints. Rather than wait out the sleeps, set a breakpoint on `putchar@plt` and log `$rdi` (on glibc x86-64 the character lives there) at every hit. A GDB logging loop dumps the full flag in milliseconds regardless of the artificial delay.\n\n```gdb\n# ~/.gdbinit for this challenge\nset pagination off\nset logging file flag.log\nset logging overwrite on\nset logging on\n\nbreak putchar\ncommands\n silent\n printf \"%c\", $rdi\n continue\nend\n\nrun\n```\n\n```bash\ngdb -batch -x script.gdb ./crackme\ncat flag.log\n```\n\n**Key insight:** Any time a program artificially slows output with `usleep`, `nanosleep`, or busy-loop delays, the character to be printed is already in a register before the sleep runs. 
Breakpoint on the output function (`putchar`, `fputc`, `write` with `fd=1`), print the first-argument register (`$rdi` on x86-64, `$r0` on ARM, `$a0` on RISC-V/MIPS), and let GDB scripting batch-extract the data. Works even on anti-debug binaries when hardware breakpoints are available.\n\n**References:** picoCTF 2018 — learn gdb, writeup 11784\n\n---\n\n## radare2 Visual Panels for Custom VM Tracing (OTW Advent 2018)\n\n**Pattern:** Custom-VM binaries look opaque until you can see the program counter, next opcode, stack, and heap simultaneously. radare2's panel mode (`V!`) lets you pin all four views on one screen and step through host-level instructions while watching the VM state move.\n\n```text\nf sp @ rbp-0x160 # flag VM sp\nf ip @ rbp-0x158 # flag VM ip\nf stack @ rbp-0x150\nf heap @ rbp-0x148\n\nV! # enter panels\n# panel 1: ?v [ip]; pd 1 @ [ip] (next VM instruction)\n# panel 2: pxQ 0x60 @ sp (stack)\n# panel 3: pxQ 0x60 @ heap (heap)\n# panel 4: afvd (local vars / registers)\n```\n\nSet conditional breakpoints on host-level branches that correspond to VM opcode dispatch, and step with `ds`. Combine with `e io.cache=true` for non-destructive patching of VM opcodes during analysis.\n\n**Key insight:** Custom VMs are reversible in minutes once you watch their state live. Panel mode beats static decompilation because the host binary often lacks decompiler-friendly structure; the VM becomes self-explanatory when you see every register tick in real time.\n\n**References:** OverTheWire Advent 2018 — Jackinthebox, writeup 12789\n\n---\n\n## libSegFault.so Register Dump at Crash (OTW Advent 2018)\n\n**Pattern:** You need the exact register state at shellcode entry but gdb is unavailable or hooked. 
Preload `libSegFault.so` (shipped with glibc before 2.35, which removed it; newer systems may need a separately built copy) and crash the program: it prints a full register dump, backtrace, and memory map to stderr.\n\n```bash\nLD_PRELOAD=/usr/lib/x86_64-linux-gnu/libSegFault.so ./target\n# or 32-bit:\nLD_PRELOAD=/lib32/libSegFault.so ./target\n\n# Then force the crash (e.g., feed the faulting input).\n# segfault_handler dumps: RIP, RSP, RAX..R15, stack backtrace\n```\n\nRead the printed registers to discover which already point at your shellcode (common: `RAX` → buffer, `RDI` → zero) and design minimal shellcode.\n\n**Key insight:** libSegFault ships with glibc releases before 2.35 as part of the standard debugging infrastructure, so it is present on many older CTF hosts. It turns any segfault into a free register snapshot, even on hardened boxes without `strace`/`gdb` permissions.\n\n**References:** OverTheWire Advent Bonanza 2018 — Day 22, writeup 12757\n\n---\n\n## r2pipe Binary Walking + DP Constraint Solver (OTW Advent 2018)\n\n**Pattern:** 12 MB binary with 300k+ basic blocks performs chained hash checks on `argv[1]`. Walk every block via `r2pipe`, classify each instruction as hash/cmp/jmp/print, build a constraint graph, then solve with dynamic programming + backtracking over input positions.\n\n```python\nimport r2pipe\n\nconstraints, targets = [], []\nr = r2pipe.open('./huge_binary')\nr.cmd('aaa')\nfor fn in r.cmdj('aflj'):\n for block in r.cmdj(f\"pdfj @ {fn['offset']}\")['ops']:\n op = block['type']\n # parse_cmp is a challenge-specific helper (not shown) that turns a cmp into a constraint\n if op == 'cmp': constraints.append(parse_cmp(block))\n if op == 'call': targets.append(block['jump'])\n# DP: memoize (position, accepted_set) -> char\n```\n\n**Key insight:** Big binaries with hash chains are solvable if you treat each branch as an inequality on input bytes. 
r2pipe's JSON output is machine-readable; DP over position/value tuples prunes most branches before running.\n\n**References:** OverTheWire Advent Bonanza 2018 — Day 8, writeup 12771\n\n---\n\n## GDB Commands at strcmp to Recover Dynamic XOR Key (TAMUctf 2019)\n\n**Pattern (Obfuscaxor):** Binary uses the [obfy](https://github.com/fritzone/obfy) C++ template obfuscator to bury a simple `enc(input)` XOR loop under thousands of opaque predicates. The terminal check is still `strcmp(expected_ciphertext, enc(input))` — so instead of unwinding obfy, break at the `strcmp` call and dump both operands:\n\n```\ndisassemble verify_key\n# ... 0x5555555560b9 \u003c+96>: call strcmp@plt\nbreak *verify_key+96\ncommands\n silent\n printf \"RDI (expected): \"\n x/4xg $rdi\n printf \"RSI (computed): \"\n x/4xg $rsi\n continue\nend\nrun\n```\n\nFeed a known plaintext (`AAAAAAAAA`, extended to the target length) and record `computed_A[i]`. Because `enc` is a byte-wise XOR keystream, each key byte is `computed_A[i] ^ 'A'`, and the required input byte follows by XORing that key byte with the expected ciphertext byte:\n\n```python\n# enc(c) = c ^ key, so key = got_A ^ ord('A');\n# the input byte that encrypts to `expected` is expected ^ key.\ndef to_ans(got_A, expected):\n return chr(got_A ^ ord('A') ^ expected)\n\n# Sanity: flip just one byte of input and confirm only one computed byte moves.\n```\n\nChain the per-byte recovery over the full 16-byte target and reconstruct the correct key (`p3Asujmn9CEeCB3A` for this challenge).\n\n**Key insight:** When `strcmp` is the last gate, the obfuscator is irrelevant — its output still has to equal a fixed string at a known call site. GDB's `commands` block turns the breakpoint into an automatic oracle: one run with `AAAA...` leaks the keystream, and a second pass with any target string gives the valid input. 
Works for any keyed transform that is effectively a permutation of the input under a fixed key.\n\n**References:** TAMUctf 2019 — Obfuscaxor, writeup 13574\n\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":24249,"content_sha256":"f999c7279538fc22ae746483cf95770b3d4c72508a13ea48572e8ccd077ffe6e"},{"filename":"tools-emulation.md","content":"# CTF Reverse - Emulation and Side-Channel Tooling\n\nEmulation frameworks (Qiling, Triton) and side-channel measurement tools (Intel Pin, LD_PRELOAD hooks) for CTF challenges where anti-debug, self-modifying code, or cross-architecture targets make plain GDB/Frida impractical.\n\nFor core dynamic analysis tools (Frida, angr, lldb, x64dbg), see [tools-dynamic.md](tools-dynamic.md).\n\n## Table of Contents\n- [Qiling Framework (Cross-Platform Emulation)](#qiling-framework-cross-platform-emulation)\n - [Qiling Installation](#qiling-installation)\n - [Basic Usage](#basic-usage)\n - [Anti-Debug Bypass via Emulation](#anti-debug-bypass-via-emulation)\n - [Input Fuzzing with Qiling](#input-fuzzing-with-qiling)\n- [Triton (Dynamic Symbolic Execution)](#triton-dynamic-symbolic-execution)\n- [Intel Pin Instruction-Counting Side Channel (Hackover CTF 2015)](#intel-pin-instruction-counting-side-channel-hackover-ctf-2015)\n - [Intel Pin Instruction Counting with Genetic Algorithm (hxp CTF 2017)](#intel-pin-instruction-counting-with-genetic-algorithm-hxp-ctf-2017)\n- [Opcode-Only Trace Reconstruction (0CTF 2016)](#opcode-only-trace-reconstruction-0ctf-2016)\n- [LD_PRELOAD time() Freeze for Deterministic Analysis (EKOPARTY 2017)](#ld_preload-time-freeze-for-deterministic-analysis-ekoparty-2017)\n - [LD_PRELOAD memcmp Side-Channel for Byte-by-Byte Bruteforce (Blaze CTF 2018)](#ld_preload-memcmp-side-channel-for-byte-by-byte-bruteforce-blaze-ctf-2018)\n\n---\n\n## Qiling Framework (Cross-Platform Emulation)\n\nQiling emulates binaries with OS-level support (syscalls, filesystem, registry). 
Built on Unicorn but adds the OS layer that Unicorn lacks.\n\n### Qiling Installation\n\n```bash\npip install qiling\n# Download rootfs for target OS:\ngit clone https://github.com/qilingframework/rootfs\n```\n\n### Basic Usage\n\n```python\nfrom qiling import Qiling\nfrom qiling.const import QL_VERBOSE\n\n# Linux ELF emulation\nql = Qiling([\"./binary\", \"arg1\"], \"rootfs/x8664_linux\",\n verbose=QL_VERBOSE.DEFAULT)\nql.run()\n\n# Windows PE emulation (no Windows needed!)\nql = Qiling([\"rootfs/x86_windows/bin/binary.exe\"], \"rootfs/x86_windows\")\nql.run()\n\n# ARM/MIPS emulation (IoT firmware)\nql = Qiling([\"rootfs/arm_linux/bin/binary\"], \"rootfs/arm_linux\")\nql.run()\n```\n\n### Anti-Debug Bypass via Emulation\n\n```python\nfrom qiling import Qiling\n\nql = Qiling([\"./binary\"], \"rootfs/x8664_linux\")\n\n# Hook ptrace syscall — return 0 (success)\ndef hook_ptrace(ql, ptrace_request, pid, addr, data):\n ql.log.info(\"ptrace bypassed\")\n return 0\n\nql.os.set_syscall(\"ptrace\", hook_ptrace)\n\n# Hook specific address (e.g., anti-VM check)\ndef skip_check(ql):\n ql.arch.regs.rax = 0 # Force success\n ql.log.info(f\"Skipped check at {ql.arch.regs.rip:#x}\")\n\nql.hook_address(skip_check, 0x401234)\n\nql.run()\n```\n\n### Input Fuzzing with Qiling\n\n```python\n# Emulate binary with different inputs to find flag\nimport string\nfrom qiling import Qiling\nfrom qiling.const import QL_VERBOSE\nfrom qiling.extensions import pipe\n\ndef test_input(candidate):\n ql = Qiling([\"./binary\"], \"rootfs/x8664_linux\",\n verbose=QL_VERBOSE.DISABLED)\n # Redirect emulated stdin/stdout to in-memory streams\n # (assumes the pipe extension shipped with recent Qiling releases)\n ql.os.stdin = pipe.SimpleInStream(0)\n ql.os.stdout = pipe.SimpleOutStream(1)\n ql.os.stdin.write(candidate.encode())\n ql.run()\n return ql.os.stdout.read()\n\nfor ch in string.printable:\n output = test_input(\"flag{\" + ch)\n if b\"Correct\" in output:\n print(f\"Found: {ch}\")\n```\n\n**Advantages over GDB/Frida:**\n- No debugger artifacts (sidesteps most debugger-detection checks by default)\n- Cross-platform without hardware (ARM, MIPS, RISC-V on x86 host)\n- Scriptable with Python (faster iteration than GDB)\n- Snapshot/restore for brute-forcing\n\n**Key insight:** Qiling emulates the entire 
OS layer (syscalls, filesystem, registry), not just the CPU. This means anti-debug checks like `ptrace(TRACEME)` naturally return success without patching, and you can analyze ARM/MIPS binaries on an x86 host without QEMU or real hardware.\n\n**When to use:** Foreign architecture binaries, IoT firmware, heavy anti-debug, automated testing of many inputs.\n\n---\n\n## Triton (Dynamic Symbolic Execution)\n\nSee [tools-advanced.md](tools-advanced.md#triton-dynamic-symbolic-execution) for full Triton reference. Quick usage:\n\n```python\nfrom triton import *\n\nctx = TritonContext(ARCH.X86_64)\n\n# Symbolize input buffer\nfor i in range(32):\n ctx.symbolizeMemory(MemoryAccess(0x600000 + i, CPUSIZE.BYTE), f\"flag_{i}\")\n\n# Process instructions and collect constraints\n# At comparison point, solve for flag\nmodel = ctx.getModel(ctx.getPathConstraintsAst())\nflag = ''.join(chr(v.getValue()) for _, v in sorted(model.items()))\n```\n\n**Key insight:** Triton excels at single-path DSE (Dynamic Symbolic Execution) where angr's path explosion is a problem. Feed it a concrete execution trace, symbolize specific inputs, and solve for constraints at comparison points. Faster than angr for linear code paths with known execution flow.\n\n**Best for:** Single-path symbolic execution, deobfuscation, taint analysis. Faster than angr for linear code paths.\n\n---\n\n## Intel Pin Instruction-Counting Side Channel (Hackover CTF 2015)\n\n**Pattern:** Brute-force input character-by-character against a binary using Intel Pin's `inscount0` tool. 
Each correct character causes deeper execution (more instructions) in the comparison logic.\n\n```python\nimport string\nfrom subprocess import Popen, PIPE\n\npin = './pin'\ntool = './source/tools/ManualExamples/obj-ia32/inscount0.so'\nbinary = './target'\n\nkey = ''\nwhile True:\n best_count, best_char = 0, ''\n for c in string.printable:\n cmd = [pin, '-injection', 'child', '-t', tool, '--', binary]\n p = Popen(cmd, stdout=PIPE, stdin=PIPE, stderr=PIPE)\n p.communicate((key + c + '\\n').encode())\n with open('inscount.out') as f:\n count = int(f.read().split()[-1])\n if count > best_count:\n best_count, best_char = count, c\n key += best_char\n print(f\"Found: {key}\")\n```\n\n**Key insight:** Movfuscated binaries (compiled with `movfuscator`) expand every instruction into sequences of `mov` operations, making static analysis impractical. However, character-by-character comparison still creates measurable instruction count differences. Pin's `inscount0.so` counts total executed instructions — the correct character at each position causes ~1000+ more instructions (proceeding further in the comparison). Also works for obfuscated binaries with sequential input checks.\n\n---\n\n### Intel Pin Instruction Counting with Genetic Algorithm (hxp CTF 2017)\n\nFor self-modifying code that decrypts the next chunk only after each character check passes, standard character-by-character Pin counting fails because the search space is too large and characters may interact. 
Use a genetic algorithm instead to explore the input space more efficiently.\n\n```python\nimport subprocess\nimport random\nimport string\n\nPIN_PATH = '/tmp/pin-3.5/pin'\nTOOL_PATH = 'source/tools/ManualExamples/obj-intel64/inscount0.so'\n\ndef fitness(candidate):\n \"\"\"Run binary under Pin and return instruction count as fitness.\"\"\"\n proc = subprocess.Popen(\n [PIN_PATH, '-t', TOOL_PATH, '--', './binary'],\n stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n stdout, stderr = proc.communicate(candidate.encode())\n # inscount0 writes count to stderr or inscount.out\n try:\n with open('inscount.out') as f:\n return int(f.read().split()[-1])\n except:\n return 0\n\ndef mutate(individual, rate=0.1):\n \"\"\"Randomly mutate characters in the individual.\"\"\"\n result = list(individual)\n for i in range(len(result)):\n if random.random() \u003c rate:\n result[i] = random.choice(string.printable[:62])\n return result\n\n# Genetic algorithm parameters\nFLAG_LEN = 40\nPOP_SIZE = 100\nSURVIVORS = 20\n\n# Initialize random population\npopulation = [random.choices(string.printable[:62], k=FLAG_LEN) for _ in range(POP_SIZE)]\n\nfor generation in range(10000):\n # Score each individual by instruction count\n scored = [(fitness(''.join(p)), p) for p in population]\n scored.sort(reverse=True)\n best_score, best_individual = scored[0]\n print(f\"Gen {generation}: {best_score} {''.join(best_individual)}\")\n\n # Keep top survivors, mutate to refill population\n survivors = [s[1] for s in scored[:SURVIVORS]]\n population = survivors + [mutate(random.choice(survivors)) for _ in range(POP_SIZE - SURVIVORS)]\n```\n\n**Modified Pin for Go binaries (table-lookup flag checking):**\nWhen standard `inscount` fails because counter increments don't correlate with correctness (e.g., table-lookup comparison), modify Pin's icount tool to only count executions at the success-branch address. 
Brute-force character-by-character with this targeted counter:\n```cpp\n// Modified inscount0.cpp — count only executions of a specific address\nstatic ADDRINT target_addr = 0x401234; // success-branch address\nstatic UINT64 target_count = 0;\n\nVOID CountAtTarget(ADDRINT ip) {\n if (ip == target_addr) target_count++;\n}\n\nVOID Instruction(INS ins, VOID *v) {\n INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)CountAtTarget,\n IARG_INST_PTR, IARG_END);\n}\n```\n\n**Key insight:** When each correct character unlocks a new code section (self-modifying or multi-stage decryption), instruction count increases monotonically with correctness. A genetic algorithm explores the input space more efficiently than character-by-character brute-force because it can discover multiple correct characters simultaneously. Converges in approximately 30 minutes for 40-character flags. For table-lookup comparisons where total instruction count doesn't correlate, target a specific branch address instead.\n\n**References:** hxp CTF 2017\n\n---\n\n## Opcode-Only Trace Reconstruction (0CTF 2016)\n\nGiven an execution trace with only opcodes (no register/memory values), reconstruct the program: sort/dedup trace by address, split into basic blocks, annotate functions. Sorting algorithms are particularly vulnerable -- branch decisions leak element ordering.\n\n**Approach:**\n1. Sort trace entries by address, deduplicate to recover code layout\n2. Identify basic block boundaries (jumps, calls, returns)\n3. Map branch taken/not-taken decisions from trace order\n4. For sorting algorithms, partition comparisons reveal relative ordering of all input elements\n\n**Key insight:** Execution traces without data values still leak information through branch decisions. 
Quicksort partition comparisons reveal which element is greater/lesser at each step, enabling recovery of the relative order of the input elements from branch direction alone.\n\n---\n\n## LD_PRELOAD time() Freeze for Deterministic Analysis (EKOPARTY 2017)\n\nOverride `time()` via LD_PRELOAD to return a constant value, freezing any timestamp-seeded PRNG. Once the binary's cipher becomes deterministic, brute-force each output byte without understanding the VM or cipher internals.\n\n```c\n// freeze_time.c — compile: gcc -shared -fPIC -o freeze.so freeze_time.c\n#include \u003ctime.h>\n\ntime_t time(time_t *t) {\n if (t) *t = 1234567890;\n return 1234567890;\n}\n```\n\n```bash\n# Build and use:\ngcc -shared -fPIC -o freeze.so freeze_time.c\nLD_PRELOAD=./freeze.so ./binary\n\n# Byte-at-a-time oracle: run with frozen time, try each candidate byte,\n# observe output — correct byte produces expected output character.\nfor byte in $(seq 0 255); do\n # Inner printf builds the hex digits; outer printf emits the raw byte\n output=$(printf \"\\\\x$(printf '%02x' \"$byte\")\" | LD_PRELOAD=./freeze.so ./binary)\n # Check output against known/expected\ndone\n```\n\nIf `srand()` or `rand()` is also involved, override `rand()` too:\n```c\nint rand(void) { return 42; }\n```\n\n**Key insight:** LD_PRELOAD function interception freezes non-determinism sources (time, rand). Once deterministic, even complex VMs become tractable byte-at-a-time oracles.\n\n**References:** EKOPARTY CTF 2017\n\n---\n\n### LD_PRELOAD memcmp Side-Channel for Byte-by-Byte Bruteforce (Blaze CTF 2018)\n\n**Pattern:** Replace `memcmp` with an LD_PRELOAD library that returns the number of matching bytes instead of the standard negative/zero/positive result. This converts any memcmp-based validation into a byte-by-byte oracle. 
Automate with GDB Python scripting to bruteforce each character position.\n\n```c\n// memcmp_hook.c - compile: gcc -shared -fPIC -o hook.so memcmp_hook.c\n#include \u003cstddef.h>\n\n// Same prototype as libc's memcmp, but returns the length of the matching prefix\nint memcmp(const void *s1, const void *s2, size_t n) {\n const unsigned char *a = s1, *b = s2;\n int cnt = 0;\n for (size_t i = 0; i \u003c n; ++i) {\n if (a[i] == b[i]) cnt++;\n else break;\n }\n return cnt;\n}\n```\n\n```bash\n# Use with GDB: gdb -ex 'set environment LD_PRELOAD ./hook.so' ./binary\n# (preloads the hook into the target only, not into GDB itself)\n# Set breakpoint after memcmp, read return value to count matching bytes\n# Iterate characters at each position to find the one that increases count\n```\n\n**Key insight:** Replacing memcmp via LD_PRELOAD to return match count converts any comparison-based validation into a byte-by-byte oracle. Combined with GDB scripting, this automates bruteforce of password/flag checks without reversing the validation algorithm.\n\n**Detection:** Binary uses `memcmp` or `strcmp` for flag validation (visible in `ltrace` output or import table). The comparison function is called with user input and a computed/stored expected value.\n\n**References:** Blaze CTF 2018\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":12985,"content_sha256":"8eec4b79a61f5f32211d09fb8537b663e637281be4060012a980609cc85afb80"},{"filename":"tools.md","content":"# CTF Reverse - Tools Reference\n\n## Table of Contents\n- [GDB](#gdb)\n - [Basic Commands](#basic-commands)\n - [PIE Binary Debugging](#pie-binary-debugging)\n - [One-liner Automation](#one-liner-automation)\n - [Memory Examination](#memory-examination)\n- [Radare2](#radare2)\n - [Basic Session](#basic-session)\n - [r2pipe Automation](#r2pipe-automation)\n- [Ghidra](#ghidra)\n - [Headless Analysis](#headless-analysis)\n - [Emulator for Decryption](#emulator-for-decryption)\n - [MCP Commands](#mcp-commands)\n- [Unicorn Emulation](#unicorn-emulation)\n - [Basic Setup](#basic-setup)\n - [Mixed-Mode (64 to 32) Switch](#mixed-mode-64-to-32-switch)\n - [Register Tracing Hook](#register-tracing-hook)\n - [Track Register 
Changes](#track-register-changes)\n- [Python Bytecode](#python-bytecode)\n - [Disassembly](#disassembly)\n - [Extract Constants](#extract-constants)\n - [Pyarmor Static Unpack (1shot)](#pyarmor-static-unpack-1shot)\n- [WASM Analysis](#wasm-analysis)\n - [Decompile to C](#decompile-to-c)\n - [Common Patterns](#common-patterns)\n- [Android APK](#android-apk)\n - [Extraction](#extraction)\n - [Key Locations](#key-locations)\n - [Search](#search)\n - [Flutter APK (Blutter)](#flutter-apk-blutter)\n - [HarmonyOS HAP/ABC (abc-decompiler)](#harmonyos-hapabc-abc-decompiler)\n- [.NET Analysis](#net-analysis)\n - [Tools](#tools)\n - [Two-Stage XOR + AES-CBC Decode Pattern (Codegate 2013)](#two-stage-xor--aes-cbc-decode-pattern-codegate-2013)\n - [NativeAOT](#nativeaot)\n- [Packed Binaries](#packed-binaries)\n - [UPX](#upx)\n - [Custom Packers](#custom-packers)\n - [PyInstaller](#pyinstaller)\n- [LLVM IR](#llvm-ir)\n - [Convert to Assembly](#convert-to-assembly)\n- [RISC-V Binary Analysis (EHAX 2026)](#risc-v-binary-analysis-ehax-2026)\n- [Binary Ninja](#binary-ninja)\n- [Decompiler Comparison with dogbolt.org](#decompiler-comparison-with-dogboltorg)\n- [Useful Commands](#useful-commands)\n- [boolector SMT2 for Custom Hash Reversal (OTW Advent 2018)](#boolector-smt2-for-custom-hash-reversal-otw-advent-2018)\n\nFor dynamic instrumentation tools (Frida, angr, lldb, x64dbg), see [tools-dynamic.md](tools-dynamic.md).\n\n---\n\n## GDB\n\n### Basic Commands\n```bash\ngdb ./binary\nrun # Run program\nstart # Run to main\nb *0x401234 # Breakpoint at address\nb *main+0x100 # Relative breakpoint\nc # Continue\nsi # Step instruction\nni # Next instruction (skip calls)\nx/s $rsi # Examine string\nx/20x $rsp # Examine stack\ninfo registers # Show registers\nset $eax=0 # Modify register\n```\n\n### PIE Binary Debugging\n```bash\ngdb ./binary\nstart # Forces PIE base resolution\nb *main+0xca # Relative to main\nb *main+0x198\nrun\n```\n\n### One-liner Automation\n```bash\ngdb -ex 'start' -ex 
'b *main+0x198' -ex 'run' ./binary\n```\n\n### Memory Examination\n```bash\nx/s $rsi # String at RSI\nx/38c $rsi # 38 characters\nx/20x $rsp # 20 hex words from stack\nx/10i $rip # 10 instructions from RIP\n```\n\n---\n\n## Radare2\n\n### Basic Session\n```bash\nr2 -d ./binary # Open in debug mode\naaa # Analyze all\nafl # List functions\npdf @ main # Disassemble main\ndb 0x401234 # Set breakpoint\ndc # Continue\nood # Restart debugging\ndr # Show registers\ndr eax=0 # Modify register\n```\n\n### r2pipe Automation\n```python\nimport r2pipe\nr2 = r2pipe.open('./binary', flags=['-d'])\nr2.cmd('aaa')\nr2.cmd('db 0x401234')\n\nfor char in range(256):\n r2.cmd('ood') # Restart\n r2.cmd(f'dr eax={char}')\n output = r2.cmd('dc')\n if 'correct' in output:\n print(f\"Found: {chr(char)}\")\n```\n\n---\n\n## Ghidra\n\n### Headless Analysis\n```bash\nanalyzeHeadless /path/to/project tmp -import binary -postScript script.py\n```\n\n### Emulator for Decryption\n```java\nEmulatorHelper emu = new EmulatorHelper(currentProgram);\nemu.writeRegister(\"RSP\", 0x2fff0000);\nemu.writeRegister(\"RBP\", 0x2fff0000);\n\n// Write encrypted data\nemu.writeMemory(dataAddress, encryptedBytes);\n\n// Set function arguments\nemu.writeRegister(\"RDI\", arg1);\n\n// Run until return\nemu.setBreakpoint(returnAddress);\nemu.run(functionEntryAddress);\n\n// Read result\nbyte[] decrypted = emu.readMemory(outputAddress, length);\n```\n\n### MCP Commands\n- Recon: `list_functions`, `list_imports`, `list_strings`\n- Analysis: `decompile_function`, `get_xrefs_to`\n- Annotation: `rename_function`, `rename_variable`\n\n---\n\n## Unicorn Emulation\n\n### Basic Setup\n```python\nfrom unicorn import *\nfrom unicorn.x86_const import *\n\nmu = Uc(UC_ARCH_X86, UC_MODE_64)\n\n# Map code segment\nmu.mem_map(0x400000, 0x10000)\nmu.mem_write(0x400000, code_bytes)\n\n# Map stack\nmu.mem_map(0x7fff0000, 0x10000)\nmu.reg_write(UC_X86_REG_RSP, 0x7fff0000 + 0xff00)\n\n# Run\nmu.emu_start(start_addr, end_addr)\n```\n\n### 
Mixed-Mode (64 to 32) Switch\n```python\n# When a 64-bit stub jumps into 32-bit code via retf/retfq:\n# - retf pops 4-byte EIP + 2-byte CS (6 bytes)\n# - retfq pops 8-byte RIP + 8-byte CS (16 bytes)\n\nuc32 = Uc(UC_ARCH_X86, UC_MODE_32)\n# Copy memory regions, then GPRs\nreg_map = {\n UC_X86_REG_EAX: UC_X86_REG_RAX,\n UC_X86_REG_EBX: UC_X86_REG_RBX,\n UC_X86_REG_ECX: UC_X86_REG_RCX,\n UC_X86_REG_EDX: UC_X86_REG_RDX,\n UC_X86_REG_ESI: UC_X86_REG_RSI,\n UC_X86_REG_EDI: UC_X86_REG_RDI,\n UC_X86_REG_EBP: UC_X86_REG_RBP,\n}\nfor e, r in reg_map.items():\n uc32.reg_write(e, mu.reg_read(r) & 0xffffffff) # mu = 64-bit emulator from above\nuc32.reg_write(UC_X86_REG_EFLAGS, mu.reg_read(UC_X86_REG_RFLAGS) & 0xffffffff)\n\n# SSE-heavy blobs need XMM registers copied\nfor xr in [UC_X86_REG_XMM0, UC_X86_REG_XMM1, UC_X86_REG_XMM2, UC_X86_REG_XMM3,\n UC_X86_REG_XMM4, UC_X86_REG_XMM5, UC_X86_REG_XMM6, UC_X86_REG_XMM7]:\n uc32.reg_write(xr, mu.reg_read(xr))\n\n# Run 32-bit, then copy regs/memory back to 64-bit\n```\n\n**Tip:** set `UC_IGNORE_REG_BREAK=1` to silence warnings on unimplemented regs.\n\n### Register Tracing Hook\n```python\ndef hook_code(uc, address, size, user_data):\n if address == TARGET_ADDR:\n rsi = uc.reg_read(UC_X86_REG_RSI)\n print(f\"0x{address:x}: rsi=0x{rsi:016x}\")\n\nmu.hook_add(UC_HOOK_CODE, hook_code)\n```\n\n### Track Register Changes\n```python\nprev_rsi = [None]\ndef hook_rsi_changes(uc, address, size, user_data):\n rsi = uc.reg_read(UC_X86_REG_RSI)\n if rsi != prev_rsi[0]:\n print(f\"0x{address:x}: RSI changed to 0x{rsi:016x}\")\n prev_rsi[0] = rsi\n\nmu.hook_add(UC_HOOK_CODE, hook_rsi_changes)\n```\n\n---\n\n## Python Bytecode\n\n### Disassembly\n```python\nimport marshal, dis\n\nwith open('file.pyc', 'rb') as f:\n f.read(16) # Skip header (varies by Python version)\n code = marshal.load(f)\n dis.dis(code)\n```\n\n### Extract Constants\n```python\nfor ins in dis.get_instructions(code):\n if ins.opname == 'LOAD_CONST':\n print(ins.argval)\n```\n\n### 
Pyarmor Static Unpack (1shot)\n\nRepository: `https://github.com/Lil-House/Pyarmor-Static-Unpack-1shot`\n\n```bash\n# Basic usage (recursive processing)\npython /path/to/oneshot/shot.py /path/to/scripts\n\n# Specify pyarmor runtime library explicitly\npython /path/to/oneshot/shot.py /path/to/scripts -r /path/to/pyarmor_runtime.so\n\n# Save outputs to another directory\npython /path/to/oneshot/shot.py /path/to/scripts -o /path/to/output\n```\n\nNotes:\n- `oneshot/pyarmor-1shot` must exist before running `shot.py`.\n- Supported focus: Pyarmor 8.x-9.x (`PY` + six digits header style).\n- Pyarmor 7 and earlier (`PYARMOR` header) are out of scope.\n- Disassembly output is generally reliable; decompiled source is experimental.\n\n---\n\n## WASM Analysis\n\n### Decompile to C\n```bash\nwasm2c checker.wasm -o checker.c\ngcc -O3 checker.c wasm-rt-impl.c -o checker\n```\n\n### Common Patterns\n- `w2c_memory` - Linear memory array\n- `wasm_rt_trap(N)` - Runtime errors\n- Function exports: `flagChecker`, `validate`\n\n---\n\n## Android APK\n\n### Extraction\n```bash\napktool d app.apk -o decoded/ # Best - decodes XML\njadx app.apk # Decompile to Java\nunzip app.apk -d extracted/ # Simple extraction\n```\n\n### Key Locations\n- `res/values/strings.xml` - String resources\n- `AndroidManifest.xml` - App metadata\n- `classes.dex` - Dalvik bytecode\n- `assets/`, `res/raw/` - Resources\n\n### Search\n```bash\ngrep -r \"flag\\|CTF\" decoded/\nstrings decoded/classes*.dex | grep -i flag\n```\n\n### Flutter APK (Blutter)\n\n```bash\n# Run Blutter on arm64 build\npython3 blutter.py path/to/app/lib/arm64-v8a out_dir\n```\n\n### HarmonyOS HAP/ABC (abc-decompiler)\n\nRepository: `https://github.com/ohos-decompiler/abc-decompiler`\n\n```bash\n# Extract .hap first to obtain .abc files\nunzip app.hap -d hap_extracted/\n```\n\nCritical startup mode:\n```bash\n# Use CLI entrypoint (avoid java -jar GUI mode)\njava -cp \"./jadx-dev-all.jar\" jadx.cli.JadxCLI [options] 
\u003cinput>\n```\n\n```bash\n# Basic decompile\njava -cp \"./jadx-dev-all.jar\" jadx.cli.JadxCLI -d \"out\" \".abc\"\n\n# Recommended for .abc\njava -cp \"./jadx-dev-all.jar\" jadx.cli.JadxCLI -m simple --log-level ERROR -d \"out_abc_simple\" \".abc\"\n```\n\nNotes:\n- Start with `-m simple --log-level ERROR`.\n- If `auto` fails, retry with `-m simple` first.\n- Errors do not always mean total failure; check `out_xxx/sources/`.\n- Use a fresh output directory per run.\n\n---\n\n## .NET Analysis\n\n### Tools\n- **dnSpy** - Debugging + decompilation (best)\n- **ILSpy** - Decompiler\n- **dotPeek** - JetBrains decompiler\n\n### NativeAOT\n- Look for `System.Private.CoreLib` strings\n- Type metadata present but restructured\n- Search for length-prefixed UTF-16 patterns\n\n### Two-Stage XOR + AES-CBC Decode Pattern (Codegate 2013)\n\n**Pattern:** .NET binary stores an encrypted byte array that undergoes XOR decoding followed by AES-256-CBC decryption. The same key value serves as both the AES key and IV.\n\n**Steps:**\n1. Extract hardcoded byte array and key string from binary (dnSpy/ILSpy)\n2. XOR each byte (may be multi-pass, e.g., `0x25` then `0x58`, equivalent to single `0x7D`)\n3. Base64-decode the XOR result\n4. AES-256-CBC decrypt with `RijndaelManaged` using the extracted key as both Key and IV\n\n```python\nfrom Crypto.Cipher import AES\nfrom base64 import b64decode\n\n# Step 1: XOR decode\ndata = bytearray(encrypted_bytes)\nfor i in range(len(data)):\n data[i] ^= 0x7D # Combined XOR key (0x25 ^ 0x58)\n\n# Step 2: Base64 decode\nct = b64decode(bytes(data))\n\n# Step 3: AES-256-CBC decrypt (same value for key and IV)\nkey = b\"9e2ea73295c7201c5ccd044477228527\" # Padded to 32 bytes\ncipher = AES.new(key, AES.MODE_CBC, iv=key)\nplaintext = cipher.decrypt(ct)\n```\n\n**Key insight:** When `RijndaelManaged` appears in .NET decompilation, check if Key and IV are set to the same value — this is a common CTF pattern. 
The XOR stage often serves as a simple obfuscation layer before the real crypto.\n\n---\n\n## Packed Binaries\n\n### UPX\n```bash\nupx -d packed -o unpacked\nstrings binary | grep UPX # Check for UPX signature\n```\n\n### Custom Packers\n1. Set breakpoint after unpacking stub\n2. Dump memory\n3. Fix PE/ELF headers\n\n### PyInstaller\n```bash\npython pyinstxtractor.py binary.exe\n# Look in: binary.exe_extracted/\n```\n\n---\n\n## LLVM IR\n\n### Convert to Assembly\n```bash\nllc task.ll --x86-asm-syntax=intel\ngcc -c task.s -o file.o\n```\n\n---\n\n## RISC-V Binary Analysis (EHAX 2026)\n\n**Pattern (iguessbro):** Statically linked, stripped RISC-V ELF binary. Can't run natively on x86.\n\n**Disassembly with Capstone:**\n```python\nfrom capstone import *\n\nwith open('binary', 'rb') as f:\n code = f.read()\n\n# RISC-V 64-bit with compressed instruction support\nmd = Cs(CS_ARCH_RISCV, CS_MODE_RISCVC | CS_MODE_RISCV64)\nmd.detail = True\n\n# Disassemble from entry point (check ELF header for e_entry)\nTEXT_OFFSET = 0x10000 # typical for static RISC-V\nfor insn in md.disasm(code[TEXT_OFFSET:], TEXT_OFFSET):\n print(f\"0x{insn.address:x}:\\t{insn.mnemonic}\\t{insn.op_str}\")\n```\n\n**Common RISC-V patterns:**\n- `li a0, N` → load immediate (argument setup)\n- `mv a0, s0` → register move\n- `call offset` → function call (auipc + jalr pair)\n- `beq/bne a0, zero, label` → conditional branch\n- `sd/ld` → 64-bit store/load\n- `addiw` → 32-bit add (W-suffix = word operations)\n\n**Key differences from x86:**\n- No flags register — comparisons are inline with branch instructions\n- Arguments in a0-a7 (not rdi/rsi/rdx)\n- Return value in a0\n- Saved registers s0-s11 (callee-saved)\n- Compressed instructions (2 bytes) mixed with standard (4 bytes) — use `CS_MODE_RISCVC`\n\n**Anti-RE tricks in RISC-V:**\n- Fake flags as string constants (check for `\"n0t_th3_r34l\"` patterns)\n- Timing anti-brute-force (rdtime instruction)\n- XOR decryption with incremental key: `decrypted[i] = 
enc[i] ^ (key & 0xFF) ^ 0xA5; key += 7`\n\n**Emulation:** `qemu-riscv64 -L /usr/riscv64-linux-gnu/ ./binary` (needs cross-toolchain sysroot)\n\n---\n\n## Binary Ninja\n\nInteractive disassembler/decompiler with rapid community growth.\n\n**Decompilation outputs:** High-Level Intermediate Language (HLIL), pseudo-C, pseudo-Rust, pseudo-Python.\n\n```bash\n# Open binary\nbinaryninja binary\n```\n\n```python\n# Headless analysis (Python API)\nimport binaryninja\nbv = binaryninja.open_view(\"binary\")\nfor func in bv.functions:\n print(func.name, hex(func.start))\n print(func.hlil) # High-Level IL\n```\n\n**Community plugins:** Available via Plugin Manager (Ctrl+Shift+P → \"Plugin Manager\").\n\n**Free version:** https://binary.ninja/free/ (desktop application with limited features; the browser-based option is Binary Ninja Cloud).\n\n**Advantages over Ghidra:** Faster startup, cleaner IL representations, better Python API for scripting.\n\n---\n\n## Decompiler Comparison with dogbolt.org\n\n**dogbolt.org** runs multiple decompilers simultaneously on the same binary and shows results side-by-side.\n\n**Supported decompilers:** Hex-Rays (IDA), Ghidra, Binary Ninja, angr, RetDec, Snowman, dewolf, Reko, Relyze.\n\n**When to use:**\n- Decompiler output is confusing — compare with alternatives for clarity\n- One decompiler mishandles a construct — another may get it right\n- Quick triage without installing every tool locally\n- Validate decompiler correctness by cross-referencing outputs\n\n```bash\n# Upload via web interface: https://dogbolt.org/\n# Or use the API:\ncurl -F \"file=@binary\" https://dogbolt.org/api/binaries/\n```\n\n**Key insight:** Different decompilers excel at different constructs. When one produces unreadable output, another often generates clearer pseudocode. 
Cross-referencing catches decompiler bugs.\n\n---\n\n## Useful Commands\n\n```bash\n# File info\nfile binary\nchecksec --file=binary\nrabin2 -I binary\n\n# String extraction\nstrings binary | grep -iE \"flag|secret\"\nrabin2 -z binary\n\n# Sections\nreadelf -S binary\nobjdump -h binary\n\n# Symbols\nnm binary\nreadelf -s binary\n\n# Disassembly\nobjdump -d binary\nobjdump -M intel -d binary\n```\n\n---\n\n## boolector SMT2 for Custom Hash Reversal (OTW Advent 2018)\n\n**Pattern:** Custom hash functions built from bit operations fall to SMT solvers. boolector's QF_BV (bitvector) logic is noticeably faster than Z3 for such instances. Translate the hash into SMT2 directly, assert the output, and solve for the input.\n\n```smt\n(set-logic QF_BV)\n(declare-fun input () (_ BitVec 64))\n(assert (bvuge input #x0000000020202020)) ; printable lower bound\n(assert (bvule input #x000000007e7e7e7e)) ; printable upper bound\n\n; Emit the hash function as bvxor/bvrol/bvadd chains\n(define-fun hash ((x (_ BitVec 64))) (_ BitVec 64) ...)\n(assert (= (hash input) #xdeadbeefcafef00d))\n(check-sat) (get-model)\n```\n\n```bash\nboolector -m --output-format=smt2 hash.smt2\n```\n\n**Key insight:** Z3 is the default, but for bit-level hash puzzles boolector is often 10-100× faster. Emit SMT2 from IDA/r2 by lifting each basic block into `bvxor`/`bvrol`/`bvadd` and let the solver pick the preimage.\n\n**References:** OverTheWire Advent 2018 — Jackinthebox, writeup 12789\n","content_type":"text/markdown; charset=utf-8","language":"markdown","size":15891,"content_sha256":"3934f59cf3bd15cbc4050a3fc1f41dd750d4fe8f9f455c42daa0e1d2a5380c01"}],"content_json":{"type":"doc","content":[{"type":"heading","attrs":{"level":1},"content":[{"text":"CTF Reverse Engineering","type":"text"}]},{"type":"paragraph","content":[{"text":"Quick reference for RE challenges. 
For detailed techniques, see supporting files.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Prerequisites","type":"text"}]},{"type":"paragraph","content":[{"text":"Python packages (all platforms):","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"pip install frida-tools angr qiling uncompyle6 capstone lief z3-solver\n# For Python 3.9+ bytecode: build pycdc from source\ngit clone https://github.com/zrax/pycdc && cd pycdc && cmake . && make","type":"text"}]},{"type":"paragraph","content":[{"text":"Linux (apt):","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"apt install gdb radare2 binutils strace ltrace apktool upx","type":"text"}]},{"type":"paragraph","content":[{"text":"macOS (Homebrew):","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"brew install gdb radare2 binutils apktool upx ghidra","type":"text"}]},{"type":"paragraph","content":[{"text":"radare2 plugins:","type":"text","marks":[{"type":"strong"}]}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"r2pm -ci r2ghidra # Native Ghidra decompiler for radare2","type":"text"}]},{"type":"paragraph","content":[{"text":"Manual install:","type":"text","marks":[{"type":"strong"}]}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"pwndbg — Linux: ","type":"text"},{"text":"GitHub","type":"text","marks":[{"type":"link","attrs":{"href":"https://github.com/pwndbg/pwndbg","title":null}}]},{"text":", macOS: ","type":"text"},{"text":"brew install pwndbg/tap/pwndbg-gdb","type":"text","marks":[{"type":"code_inline"}]}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Additional 
Resources","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"tools.md","type":"text","marks":[{"type":"link","attrs":{"href":"tools.md","title":null}}]},{"text":" - Static analysis tools (GDB, Ghidra, radare2, IDA, Binary Ninja, dogbolt.org, RISC-V with Capstone, Unicorn emulation, Python bytecode, WASM, Android APK, .NET, packed binaries)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"tools-dynamic.md","type":"text","marks":[{"type":"link","attrs":{"href":"tools-dynamic.md","title":null}}]},{"text":" - Dynamic analysis tools: Frida (hooking, anti-debug bypass, memory scanning, Android/iOS), angr symbolic execution (path exploration, constraints, CFG), lldb (macOS/LLVM debugger), x64dbg (Windows)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"tools-emulation.md","type":"text","marks":[{"type":"link","attrs":{"href":"tools-emulation.md","title":null}}]},{"text":" - Emulation frameworks and side-channel tooling: Qiling (cross-platform OS-level emulation), Triton (DSE), Intel Pin instruction-counting + genetic algorithm side channel, opcode-only trace reconstruction, LD_PRELOAD time freeze and memcmp side-channel for byte-by-byte bruteforce","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"tools-advanced.md","type":"text","marks":[{"type":"link","attrs":{"href":"tools-advanced.md","title":null}}]},{"text":" - Advanced tools (Part 1): VMProtect/Themida analysis, binary diffing (BinDiff, Diaphora), deobfuscation frameworks (D-810, GOOMBA, Miasm), Qiling framework, Triton DSE, Manticore, Rizin/Cutter, RetDec, custom VM bytecode lifting to LLVM IR","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"tools-advanced-2.md","type":"text","marks":[{"type":"link","attrs":{"href":"tools-advanced-2.md","title":null}}]},{"text":" - Advanced tools 
(Part 2): advanced GDB (Python scripting, brute-force, conditional breakpoints, watchpoints, reverse debugging with rr, pwndbg/GEF), advanced Ghidra scripting, patching (Binary Ninja API, LIEF), GDB constraint extraction + ILP solver (BackdoorCTF 2017), GDB position-encoded input zero flag monitoring (EKOPARTY 2017), LD_PRELOAD execute-only binary dump (BackdoorCTF 2017), PEDA current_inst bit-by-bit flag scraper (CONFidence CTF 2019 Teaser)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"anti-analysis.md","type":"text","marks":[{"type":"link","attrs":{"href":"anti-analysis.md","title":null}}]},{"text":" - Anti-analysis taxonomy: Linux anti-debug (ptrace, /proc, timing, signals, direct syscalls), Windows anti-debug (PEB, NtQueryInformationProcess, heap flags, TLS callbacks, HW/SW breakpoint detection, exception-based, thread hiding), anti-VM/sandbox (CPUID, MAC, timing, artifacts, resources), anti-DBI (Frida detection/bypass), code integrity/self-hashing, anti-disassembly (opaque predicates, junk bytes), MBA identification/simplification, comprehensive bypass strategies","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"anti-analysis-ctf.md","type":"text","marks":[{"type":"link","attrs":{"href":"anti-analysis-ctf.md","title":null}}]},{"text":" - CTF writeup techniques: SIGILL handler for execution mode switching (Hack.lu 2015), SIGFPE signal handler side-channel via strace counting (PlaidCTF 2017), instruction trace inversion with Keystone and Unicorn (MeePwn 2017), call-less function chaining via stack frame manipulation (THC 2018), parent-patched child binary dump via ","type":"text"},{"text":"process_vm_writev","type":"text","marks":[{"type":"code_inline"}]},{"text":" (Google CTF Quals 
2018)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"patterns.md","type":"text","marks":[{"type":"link","attrs":{"href":"patterns.md","title":null}}]},{"text":" - Foundational binary patterns: custom VMs, anti-debugging, nanomites, self-modifying code, XOR ciphers, mixed-mode stagers, LLVM obfuscation, S-box/keystream, SECCOMP/BPF, exception handlers, memory dumps, byte-wise transforms, x86-64 gotchas, custom mangle reversing, position-based transforms, hex-encoded string comparison, signal-based binary exploration","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"patterns-runtime.md","type":"text","marks":[{"type":"link","attrs":{"href":"patterns-runtime.md","title":null}}]},{"text":" - Runtime patching and oracle techniques: malware anti-analysis bypass, multi-stage shellcode loaders, timing side-channel attacks, multi-thread anti-debug with decoy + signal handler MBA (ApoorvCTF 2026), INT3 patch + coredump brute-force oracle (Pwn2Win 2016), signal handler chain + LD_PRELOAD oracle (Nuit du Hack 2016), printf format string VM decompilation to Z3 (SECCON 2017), quadtree recursive image format parser (Google CTF Quals 2018)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"patterns-ctf.md","type":"text","marks":[{"type":"link","attrs":{"href":"patterns-ctf.md","title":null}}]},{"text":" - Competition-specific patterns (Part 1): hidden emulator opcodes, LD_PRELOAD key extraction, SPN static extraction, image XOR smoothness, byte-at-a-time cipher, mathematical convergence bitmap, Windows PE XOR bitmap OCR, two-stage RC4+VM loaders, GBA ROM meet-in-the-middle, Sprague-Grundy game theory, kernel module maze solving, multi-threaded VM channels, backdoored shared library detection via string diffing, custom binfmt kernel module with RC4 flat binaries, hash-resolved imports / no-import ransomware, ELF section header corruption for 
anti-analysis","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"patterns-ctf-2.md","type":"text","marks":[{"type":"link","attrs":{"href":"patterns-ctf-2.md","title":null}}]},{"text":" - Competition-specific patterns (Part 2): multi-layer self-decrypting brute-force, embedded ZIP+XOR license, stack string deobfuscation, prefix hash brute-force, CVP/LLL lattice for integer validation, decision tree function obfuscation, GF(2^8) Gaussian elimination, ROP chain obfuscation analysis (ROPfuscation)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"patterns-ctf-3.md","type":"text","marks":[{"type":"link","attrs":{"href":"patterns-ctf-3.md","title":null}}]},{"text":" - Competition-specific patterns (Part 3): Z3 single-line Python circuit, sliding window popcount, keyboard LED Morse code via ioctl, C++ destructor-hidden validation, syscall side-effect memory corruption, MFC dialog event handlers, VM sequential key-chain brute-force, Burrows-Wheeler transform inversion, OpenType font ligature exploitation, GLSL shader VM with self-modifying code, instruction counter as cryptographic state, batch crackme automation via objdump, fork+pipe+dead branch anti-analysis, TensorFlow DNN inversion via sigmoid layer inversion, BPF filter analysis via kernel JIT to x64 assembly","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"languages.md","type":"text","marks":[{"type":"link","attrs":{"href":"languages.md","title":null}}]},{"text":" - Language-specific: Python bytecode & opcode remapping, Python version-specific bytecode, Pyarmor static unpack, DOS stubs, Unity IL2CPP, HarmonyOS HAP/ABC, Brainfuck/esolangs (+ BF character-by-character static analysis, BF side-channel read count oracle, BF comparison idiom detection), UEFI, transpilation to C, code coverage side-channel, OPAL functional reversing, non-bijective substitution, FRACTRAN program 
inversion","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"languages-platforms.md","type":"text","marks":[{"type":"link","attrs":{"href":"languages-platforms.md","title":null}}]},{"text":" - Platform/framework-specific: Roblox place file analysis, Godot game asset extraction, Rust serde_json schema recovery, Android JNI RegisterNatives obfuscation, Android DEX runtime bytecode patching via /proc/self/maps, Android native .so loading bypass via new project, Frida Firebase Cloud Functions bypass, Verilog/hardware RE, prefix-by-prefix hash reversal, Ruby/Perl polyglot constraint satisfaction, Electron ASAR extraction + native binary analysis, Node.js npm runtime introspection","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"languages-compiled.md","type":"text","marks":[{"type":"link","attrs":{"href":"languages-compiled.md","title":null}}]},{"text":" - Go binary reversing (GoReSym, goroutines, memory layout, channel ops, embed.FS, Go binary UUID patching for C2 enumeration), Rust binary reversing (demangling, Option/Result, Vec, panic strings), Swift binary reversing (demangling, protocol witness tables), Kotlin/JVM (coroutine state machines), Haskell GHC CMM intermediate language for recursive structure analysis, C++ (vtable reconstruction, RTTI, STL patterns)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"platforms.md","type":"text","marks":[{"type":"link","attrs":{"href":"platforms.md","title":null}}]},{"text":" - Platform-specific RE: macOS/iOS (Mach-O, code signing, Objective-C runtime, Swift, dyld, jailbreak bypass), embedded/IoT firmware (binwalk, UART/JTAG/SPI extraction, ARM/MIPS, RTOS), kernel drivers (Linux .ko, eBPF, Windows .sys), game engines (Unreal Engine, Unity, anti-cheat, Lua), automotive CAN 
bus","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"platforms-hardware.md","type":"text","marks":[{"type":"link","attrs":{"href":"platforms-hardware.md","title":null}}]},{"text":" - Hardware and advanced architecture RE: HD44780 LCD controller GPIO reconstruction, RISC-V advanced (custom extensions, privileged modes, debugging), ARM64/AArch64 reversing and exploitation (calling convention, ROP gadgets, qemu-aarch64-static emulation)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"field-notes.md","type":"text","marks":[{"type":"link","attrs":{"href":"field-notes.md","title":null}}]},{"text":" - Quick reference notes: binary types, anti-debugging bypass, specialized patterns, CTF case notes","type":"text"}]}]}]},{"type":"hr","attrs":{"markup":"---"}},{"type":"heading","attrs":{"level":2},"content":[{"text":"When to Pivot","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If you already understand the binary and now need heap, ROP, or kernel exploitation, switch to ","type":"text"},{"text":"/ctf-pwn","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the challenge is really about recovering deleted files, PCAP data, or disk artifacts, switch to ","type":"text"},{"text":"/ctf-forensics","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the target is a web app and you are only reversing a small client-side helper script, switch to ","type":"text"},{"text":"/ctf-web","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the binary implements a machine learning model and the challenge is about model attacks or 
adversarial inputs, switch to ","type":"text"},{"text":"/ctf-ai-ml","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the reversed binary's core logic is a cryptographic algorithm or math problem, switch to ","type":"text"},{"text":"/ctf-crypto","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the binary is a real malware sample with C2, packing, or evasion behavior, switch to ","type":"text"},{"text":"/ctf-malware","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"If the challenge is a toy VM, encoding puzzle, or pyjail rather than a real binary, switch to ","type":"text"},{"text":"/ctf-misc","type":"text","marks":[{"type":"code_inline"}]},{"text":".","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Problem-Solving Workflow","type":"text"}]},{"type":"ordered_list","attrs":{"order":1,"listStyle":"number"},"content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Start with strings extraction","type":"text","marks":[{"type":"strong"}]},{"text":" - many easy challenges have plaintext flags","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Try ltrace/strace","type":"text","marks":[{"type":"strong"}]},{"text":" - dynamic analysis often reveals flags without reversing","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Try Frida hooking","type":"text","marks":[{"type":"strong"}]},{"text":" - hook strcmp/memcmp to capture expected values without reversing","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Try angr","type":"text","marks":[{"type":"strong"}]},{"text":" - symbolic execution solves many 
flag-checkers automatically","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Try Qiling","type":"text","marks":[{"type":"strong"}]},{"text":" - emulate foreign-arch binaries or bypass heavy anti-debug without artifacts","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Map control flow","type":"text","marks":[{"type":"strong"}]},{"text":" before modifying execution","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Automate manual processes","type":"text","marks":[{"type":"strong"}]},{"text":" via scripting (r2pipe, Frida, angr, Python)","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Validate assumptions","type":"text","marks":[{"type":"strong"}]},{"text":" by comparing decompiler outputs (dogbolt.org for side-by-side)","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Quick Wins (Try First!)","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# Plaintext flag extraction\nstrings binary | grep -E \"flag\\{|CTF\\{|pico\"\nstrings binary | grep -iE \"flag|secret|password\"\nrabin2 -z binary | grep -i \"flag\"\n\n# Dynamic analysis - often captures flag directly\nltrace ./binary\nstrace -f -s 500 ./binary\n\n# Hex dump search\nxxd binary | grep -i flag\n\n# Run with test inputs\n./binary AAAA\necho \"test\" | ./binary","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Initial Analysis","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"file binary # Type, architecture\nchecksec --file=binary # Security features (for pwn)\nchmod +x binary # Make executable","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Memory Dumping Strategy","type":"text"}]},{"type":"paragraph","content":[{"text":"Key 
insight:","type":"text","marks":[{"type":"strong"}]},{"text":" Let the program compute the answer, then dump it. Break at final comparison (","type":"text"},{"text":"b *main+OFFSET","type":"text","marks":[{"type":"code_inline"}]},{"text":"), enter any input of correct length, then ","type":"text"},{"text":"x/s $rsi","type":"text","marks":[{"type":"code_inline"}]},{"text":" to dump computed flag.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Decoy Flag Detection","type":"text"}]},{"type":"paragraph","content":[{"text":"Pattern:","type":"text","marks":[{"type":"strong"}]},{"text":" Multiple fake targets before real check. Look for multiple comparison targets in sequence with different success messages. Set breakpoint at FINAL comparison, not earlier ones.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"GDB PIE Debugging","type":"text"}]},{"type":"paragraph","content":[{"text":"PIE binaries randomize base address. Use relative breakpoints:","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"gdb ./binary\nstart # Forces PIE base resolution\nb *main+0xca # Relative to main\nrun","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Comparison Direction (Critical!)","type":"text"}]},{"type":"paragraph","content":[{"text":"Two patterns: (1) ","type":"text"},{"text":"transform(flag) == stored_target","type":"text","marks":[{"type":"code_inline"}]},{"text":" — reverse the transform. 
(2) ","type":"text"},{"text":"transform(stored_target) == flag","type":"text","marks":[{"type":"code_inline"}]},{"text":" — flag IS the transformed data, just apply transform to stored target.","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Common Encryption Patterns","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"XOR with single byte - try all 256 values","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"XOR with known plaintext (","type":"text"},{"text":"flag{","type":"text","marks":[{"type":"code_inline"}]},{"text":", ","type":"text"},{"text":"CTF{","type":"text","marks":[{"type":"code_inline"}]},{"text":")","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"RC4 with hardcoded key","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Custom permutation + XOR","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"XOR with position index (","type":"text"},{"text":"^ i","type":"text","marks":[{"type":"code_inline"}]},{"text":" or ","type":"text"},{"text":"^ (i & 0xff)","type":"text","marks":[{"type":"code_inline"}]},{"text":") layered with a repeating key","type":"text"}]}]}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Quick Tool Reference","type":"text"}]},{"type":"code_block","attrs":{"wrap":false,"language":"bash"},"content":[{"text":"# Radare2\nr2 -d ./binary # Debug mode\naaa # Analyze\nafl # List functions\npdf @ main # Disassemble main\n\n# Ghidra (headless)\nanalyzeHeadless project/ tmp -import binary -postScript script.py\n\n# IDA\nida64 binary # Open in IDA64","type":"text"}]},{"type":"heading","attrs":{"level":2},"content":[{"text":"Deep-Dive Notes","type":"text"}]},{"type":"paragraph","content":[{"text":"Use 
","type":"text"},{"text":"field-notes.md","type":"text","marks":[{"type":"link","attrs":{"href":"field-notes.md","title":null}}]},{"text":" after the first round of triage when you know what kind of target you have.","type":"text"}]},{"type":"bullet_list","content":[{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Target formats: Python bytecode, WASM, Android, Flutter, .NET, UPX, Tauri","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Technique notes: anti-debug bypass, VM analysis, x86-64 gotchas, iterative solvers, Unicorn, timing side channels","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Platform notes: Godot, Roblox, macOS/iOS, embedded firmware, kernel drivers, game engines, Swift, Kotlin, Go, Rust, D","type":"text"}]}]},{"type":"list_item","content":[{"type":"paragraph","content":[{"text":"Case notes: modern CTF-specific reversing patterns and older classic challenge patterns","type":"text"}]}]}]},{"type":"hr","attrs":{"markup":"---"}}]},"metadata":{"date":"2026-06-05","name":"ctf-reverse","author":"@skillopedia","source":{"stars":2252,"repo_name":"ctf-skills","origin_url":"https://github.com/ljagiello/ctf-skills/blob/HEAD/ctf-reverse/SKILL.md","repo_owner":"ljagiello","body_sha256":"468a1ece1fb6e6046f9e55afa2d3988a1abed0ceb3e82f27053ffbbb8af0ecf0","cluster_key":"6cdaad9369e17c82f83d3e8906f4108bf29e0580bedd57b0cba145232fccdee4","clean_bundle":{"format":"clean-skill-bundle-v1","source":"ljagiello/ctf-skills/ctf-reverse/SKILL.md","attachments":[{"id":"f19d999c-9b53-5982-9654-d2cbdf709b37","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/f19d999c-9b53-5982-9654-d2cbdf709b37/attachment.md","path":"anti-analysis-ctf.md","size":10958,"sha256":"765391aea65cc073bf805622c5b6e77b18f130501865f73e4592224087bdc22c","contentType":"text/markdown; 
charset=utf-8"},{"id":"3a816257-7521-5648-8f41-89075788a57f","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/3a816257-7521-5648-8f41-89075788a57f/attachment.md","path":"anti-analysis.md","size":23818,"sha256":"5df2569a02d060bd3066d5ececbaac7953317c264657b873f38722e40cf01f7b","contentType":"text/markdown; charset=utf-8"},{"id":"fdb42955-e038-58f1-84a2-696422243e60","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/fdb42955-e038-58f1-84a2-696422243e60/attachment.md","path":"field-notes.md","size":32679,"sha256":"6730e90e1b875bf70a8e5d6760dded192d743485fefefa3a654365e7454202c3","contentType":"text/markdown; charset=utf-8"},{"id":"cb097e7f-da1d-5fb4-adac-b4f7a3c9068f","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/cb097e7f-da1d-5fb4-adac-b4f7a3c9068f/attachment.md","path":"languages-compiled.md","size":27264,"sha256":"9916d12612f4acd34a4b0c84b3334f68759de7398a9e159045ff5fab601eaf1c","contentType":"text/markdown; charset=utf-8"},{"id":"e43e491f-23d8-573f-9b54-510944d1d175","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/e43e491f-23d8-573f-9b54-510944d1d175/attachment.md","path":"languages-platforms.md","size":28990,"sha256":"98ee43b6c250c4c189bcabbc4408a0c63cdd9b3541308019db1078e7ea436b93","contentType":"text/markdown; charset=utf-8"},{"id":"65b795ca-3f03-5ec1-b795-115d3942e9a4","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/65b795ca-3f03-5ec1-b795-115d3942e9a4/attachment.md","path":"languages.md","size":23468,"sha256":"f065b7bbab7f269c0b1c4238cdf6424134930f32904ab480319c2d7f58a2deaa","contentType":"text/markdown; charset=utf-8"},{"id":"e739e2be-4bdf-5f7a-9fe8-e77b6aeba9eb","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/e739e2be-4bdf-5f7a-9fe8-e77b6aeba9eb/attachment.md","path":"patterns-ctf-2.md","size":19276,"sha256":"8f5d39380a11b26cfba08f72e796c284464931eac33c761b2608557706d2735c","contentType":"text/markdown; 
charset=utf-8"},{"id":"c1ebae94-067d-5a79-a380-d207dff7ffb3","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c1ebae94-067d-5a79-a380-d207dff7ffb3/attachment.md","path":"patterns-ctf-3.md","size":38811,"sha256":"a877bf89d5610f81112cdd02dd56260e82371ea4ec612b580bae68df95b96083","contentType":"text/markdown; charset=utf-8"},{"id":"808d92f6-97fb-57fe-901b-59f04bfc5e0c","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/808d92f6-97fb-57fe-901b-59f04bfc5e0c/attachment.md","path":"patterns-ctf.md","size":30764,"sha256":"fdc7f03538336bba962ccb84ad51af3d208e8ebe53536f97cae2080eece9ced1","contentType":"text/markdown; charset=utf-8"},{"id":"b5250c9d-1002-52cb-97ab-0347fb812499","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/b5250c9d-1002-52cb-97ab-0347fb812499/attachment.md","path":"patterns-runtime.md","size":12280,"sha256":"d48ba0d017493db44417cb4cffb128286d77dc032d4038b45ff07addf51c8495","contentType":"text/markdown; charset=utf-8"},{"id":"c00d0f76-84ed-5a18-9ffe-7284958dd490","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c00d0f76-84ed-5a18-9ffe-7284958dd490/attachment.md","path":"patterns.md","size":20387,"sha256":"d6c3ce394122c47a77e95b3147e1c559d95030516c46cf36bf2e6b384e249314","contentType":"text/markdown; charset=utf-8"},{"id":"c434f104-92c7-5ea2-aeec-ba8d21b79287","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/c434f104-92c7-5ea2-aeec-ba8d21b79287/attachment.md","path":"platforms-hardware.md","size":16239,"sha256":"f2548b021226489270e669ad99f0a4d10fbd0df59d7a3d2fe85de01a42366e36","contentType":"text/markdown; charset=utf-8"},{"id":"29f8009b-a0e3-5be2-a8aa-27ef9a16eb5e","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/29f8009b-a0e3-5be2-a8aa-27ef9a16eb5e/attachment.md","path":"platforms.md","size":23005,"sha256":"e7592cabda16bfce9ab57c8f1f57bcc03ac1d68a725fe420643d8d3c54ca64f3","contentType":"text/markdown; 
charset=utf-8"},{"id":"26d4622d-8c26-5473-b4c5-ef9b41145a6f","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/26d4622d-8c26-5473-b4c5-ef9b41145a6f/attachment.md","path":"tools-advanced-2.md","size":14829,"sha256":"2e7bcf73790eaa0f2fdaa289155704bfdb0ca899e617267811b04cd62b307eea","contentType":"text/markdown; charset=utf-8"},{"id":"90a8de33-e2db-5b82-8628-82429a0512ed","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/90a8de33-e2db-5b82-8628-82429a0512ed/attachment.md","path":"tools-advanced.md","size":12375,"sha256":"8ac3db030d0d6daf366de53005d12d9720b952ae8aa6216c97926111d8fa0b36","contentType":"text/markdown; charset=utf-8"},{"id":"35b16b57-e3e9-5ee7-8e62-9a00185c958c","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/35b16b57-e3e9-5ee7-8e62-9a00185c958c/attachment.md","path":"tools-dynamic.md","size":24249,"sha256":"f999c7279538fc22ae746483cf95770b3d4c72508a13ea48572e8ccd077ffe6e","contentType":"text/markdown; charset=utf-8"},{"id":"d138fb15-f6ce-5def-8c2d-6cc4337661d2","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/d138fb15-f6ce-5def-8c2d-6cc4337661d2/attachment.md","path":"tools-emulation.md","size":12985,"sha256":"8eec4b79a61f5f32211d09fb8537b663e637281be4060012a980609cc85afb80","contentType":"text/markdown; charset=utf-8"},{"id":"bc416ded-29a8-57d3-b357-a3a19a819648","key":"uploads/10433ee7-ad12-4ae0-b34e-97553e46c6c8/bc416ded-29a8-57d3-b357-a3a19a819648/attachment.md","path":"tools.md","size":15891,"sha256":"3934f59cf3bd15cbc4050a3fc1f41dd750d4fe8f9f455c42daa0e1d2a5380c01","contentType":"text/markdown; 
charset=utf-8"}],"bundle_sha256":"b5c4a739bb1d5c6714e87a986f3e0946a6ae3f91ca62694a629106b321e04f28","attachment_count":18,"text_attachments":18,"attachment_storage":"skillopedia-attachments-v1","binary_attachments":0,"excluded_attachments":[]},"cluster_size":1,"skill_md_path":"ctf-reverse/SKILL.md","import_metadata":{"date":"2026-06-05","author":"@skillopedia","version":"v1","category":"security","category_label":"Security"},"exact_dupes_collapsed_into_this":0},"license":"MIT","version":"v1","category":"security","metadata":{"user-invocable":"false"},"import_tag":"clean-skills-v1","description":"Provides reverse engineering techniques for CTF challenges. Use when the main job is to understand how a compiled, obfuscated, packed, or virtualized target works before exploiting or solving it, including binaries, APKs, WASM, firmware, custom VMs, bytecode, game clients, malware-like loaders, and anti-debug or anti-analysis logic. Do not use it when the vulnerability is already understood and the remaining task is exploitation; use pwn instead. Do not use it for pure web workflows, log or disk forensics, or standalone crypto problems unless reversing the implementation is the real blocker.","allowed-tools":"Bash Read Write Edit Glob Grep Task WebFetch WebSearch","compatibility":"Requires filesystem-based agent (Claude Code or similar) with bash, Python 3, and internet access for tool installation."}},"renderedAt":1782979607217}
