Wave 1 of the code-analysis layer, built on the x86-64 decoder:
- vmie_win32_callgraph walks each .pdata function with the decoder and emits an
edge for every direct call/jmp whose target lands in the module - the
intra-module call graph. Indirect edges are left to the IAT and jump tables.
- gva_jumptable recovers a switch's case targets from an indirect jump's table:
consecutive pointer entries that land in an executable region.
- cfg_blocks splits one function view into basic blocks (a generic handler:
leaders from intra-function branch targets, cut after jmp/jcc/ret).
- gva_imm_xref finds the instructions whose immediate operand equals a constant:
the dual of code-xref for magic values, error codes, syscall numbers.
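
As an illustration of the jump-table rule above (consecutive pointer entries that land in an executable region), here is a minimal sketch. It is not the gva_jumptable implementation: the names jt_recover and exec_range are hypothetical, the table is assumed to hold 8-byte absolute addresses, and the host is assumed to be little-endian like the target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical executable range [lo, hi); not a type from codeanalysis.h. */
typedef struct { uint64_t lo, hi; } exec_range;

/* Read consecutive 8-byte entries (an assumption about the table layout)
 * and stop at the first one that falls outside the executable range.
 * Returns the number of case targets written to out. */
static size_t jt_recover(const uint8_t *table, size_t table_len,
                         exec_range x, uint64_t *out, size_t out_cap)
{
    assert(table != NULL && out != NULL);
    size_t n = 0;
    while (n < out_cap && (n + 1) * sizeof(uint64_t) <= table_len) {
        uint64_t target;
        /* memcpy avoids an unaligned load; assumes a little-endian host. */
        memcpy(&target, table + n * sizeof target, sizeof target);
        if (target < x.lo || target >= x.hi)
            break;                  /* first non-code entry ends the table */
        out[n++] = target;
    }
    return n;
}
```

The out_cap bound doubles as a sanity limit, so a run of plausible-looking pointers cannot extend the table indefinitely.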
The decoder now also reports imm_off/imm_len so a caller can read or match the
immediate operand. The generic primitives live in the new codeanalysis.h
(jump tables, basic blocks) and scan.h (constant xref); the .pdata-bound call
graph stays on the win32 surface and reuses the existing function/section/decode
primitives - no second PE or instruction parser.
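
A caller-side sketch of how imm_off/imm_len might be used to match a constant. The helper names imm_read and imm_equals are hypothetical, and the comparison is zero-extended only; matching sign-extended immediates is deliberately left out of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read the imm_len-byte little-endian immediate at insn + imm_off. */
static uint64_t imm_read(const uint8_t *insn, size_t imm_off, size_t imm_len)
{
    assert(insn != NULL && imm_len <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < imm_len; i++)
        v |= (uint64_t)insn[imm_off + i] << (8 * i);
    return v;
}

/* True when the instruction has an immediate and it equals wanted
 * (zero-extended comparison; an assumption of this sketch). */
static bool imm_equals(const uint8_t *insn, size_t imm_off, size_t imm_len,
                       uint64_t wanted)
{
    return imm_len != 0 && imm_read(insn, imm_off, imm_len) == wanted;
}
```

For `mov eax, 0xC0000005` (bytes B8 05 00 00 C0) the immediate sits at offset 1 with length 4, so imm_equals(insn, 1, 4, 0xC0000005) holds.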
Three reversing capabilities on the win32 surface plus a pure sig-gen handler:
- vmie_win32_functions enumerates a module's functions from the exception
directory (.pdata RUNTIME_FUNCTION), folding unwind chain continuations into
their primary - authoritative non-leaf boundaries, not prologue heuristics.
- vmie_win32_exports resolves the export table to {name, rva, ordinal,
forwarded}: named functions with no PDB or network. vmie_win32_pdb_ref pulls
the CodeView/RSDS {guid, age, pdb} from the debug directory - the symbol-server
key for any module (full PDB parsing stays out of scope).
- sig_generate (siggen.h) builds a unique masked signature for a code span,
wildcarding the rel/RIP-relative displacement bytes the x86 decoder locates and
growing until it matches the scope exactly once - the dual of sigscan.
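
For reference, the CodeView RSDS record that vmie_win32_pdb_ref reads is laid out as a 4-byte "RSDS" signature, a 16-byte GUID, a 4-byte little-endian age, and a NUL-terminated PDB path. The sketch below parses that layout from a byte buffer; rsds_ref and rsds_parse are hypothetical names, not the shared parser in the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: {guid, age, pdb}. */
typedef struct {
    uint8_t     guid[16];
    uint32_t    age;
    const char *pdb;        /* points into the caller's buffer */
} rsds_ref;

/* Parse one RSDS record; returns false on a bad signature, a short
 * buffer, or a PDB path that is not NUL-terminated within len. */
static bool rsds_parse(const uint8_t *rec, size_t len, rsds_ref *out)
{
    assert(rec != NULL && out != NULL);
    if (len < 25 || memcmp(rec, "RSDS", 4) != 0)
        return false;
    const uint8_t *name = rec + 24;
    if (memchr(name, 0, len - 24) == NULL)
        return false;
    memcpy(out->guid, rec + 4, 16);
    out->age = (uint32_t)rec[20] | (uint32_t)rec[21] << 8 |
               (uint32_t)rec[22] << 16 | (uint32_t)rec[23] << 24;
    out->pdb = (const char *)name;
    return true;
}
```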
The decoder now also reports disp_off/disp_len so a caller can mask the floating
bytes. The MZ/PE walk gains one shared data-directory accessor and one shared
CodeView/RSDS parser; the kernel bootstrap is moved onto both, removing its
private copies - one PE parser in the tree.
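
A minimal sketch of the masking-and-growing idea, assuming the caller already has per-instruction lengths and disp_off/disp_len from the decoder. insn_span, sig_count and sig_grow are hypothetical names and this is not the siggen.h code: the signature grows one instruction at a time, displacement bytes are wildcarded, and growth stops once the pattern matches the scope exactly once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-instruction record filled from the decoder output. */
typedef struct { uint8_t len, disp_off, disp_len; } insn_span;

/* Count positions in scope where every non-wildcard byte matches. */
static size_t sig_count(const uint8_t *scope, size_t scope_len,
                        const uint8_t *pat, const uint8_t *mask, size_t n)
{
    size_t hits = 0;
    if (n == 0 || n > scope_len)
        return 0;
    for (size_t i = 0; i + n <= scope_len; i++) {
        size_t j = 0;
        while (j < n && (!mask[j] || scope[i + j] == pat[j]))
            j++;
        hits += (j == n);
    }
    return hits;
}

/* Grow a signature from scope + start, one instruction at a time, until it
 * is unique in scope. mask[i] == 0 marks a wildcard byte. Returns the
 * signature length, or 0 if it never becomes unique. */
static size_t sig_grow(const uint8_t *scope, size_t scope_len, size_t start,
                       const insn_span *insns, size_t n_insns,
                       uint8_t *mask, size_t mask_cap)
{
    assert(scope != NULL && insns != NULL && mask != NULL);
    size_t n = 0;
    for (size_t k = 0; k < n_insns; k++) {
        size_t len = insns[k].len;
        if (n + len > mask_cap || start + n + len > scope_len)
            return 0;
        memset(mask + n, 1, len);                              /* fixed bytes */
        memset(mask + n + insns[k].disp_off, 0, insns[k].disp_len); /* floating */
        n += len;
        if (sig_count(scope, scope_len, scope + start, mask, n) == 1)
            return n;
    }
    return 0;
}
```

With two `call rel32` sites in scope, the masked first instruction (E8 ? ? ? ?) matches both, so the sketch keeps growing until the following instruction disambiguates them.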
The reversing keystone: a length-disassembly decoder with control-flow and
RIP-relative target extraction (x86dec.h), pure over a byte buffer - no vmie_mem,
no cr3, no Windows. Length decoding is table-driven over the 1-byte / 0F / 0F38 /
0F3A maps and covers legacy + REX + VEX prefixes, ModRM/SIB, displacements and
immediates (66 and REX.W operand-size aware). It reports the instruction length
plus the rel and
RIP-relative targets of near call/jmp/jcc and any RIP-relative memory operand.
EVEX is a documented gap (decodes as length 0). This is the primitive the rest
of the static-reversing layer builds on (function inventory, call graph, xref).
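
To make the rel-target arithmetic concrete: a near branch target is the address of the next instruction plus the sign-extended displacement. The sketch below covers only the unprefixed E8/E9 (rel32), EB and 70-7F (rel8), and 0F 80-8F (rel32) encodings; it is a toy, far narrower than the table-driven decoder in x86dec.h, and branch_target is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one unprefixed near call/jmp/jcc at p (loaded at address addr).
 * Returns the instruction length and stores the branch target, or returns
 * 0 when the bytes are not one of the handled encodings or are truncated. */
static size_t branch_target(const uint8_t *p, size_t avail, uint64_t addr,
                            uint64_t *target)
{
    assert(p != NULL && target != NULL);
    size_t opc_len, rel_len;
    if (avail == 0)
        return 0;
    if (p[0] == 0xE8 || p[0] == 0xE9) {                 /* call/jmp rel32 */
        opc_len = 1; rel_len = 4;
    } else if (p[0] == 0xEB || (p[0] & 0xF0) == 0x70) { /* jmp/jcc rel8 */
        opc_len = 1; rel_len = 1;
    } else if (p[0] == 0x0F && avail >= 2 && (p[1] & 0xF0) == 0x80) {
        opc_len = 2; rel_len = 4;                       /* jcc rel32 */
    } else {
        return 0;
    }
    size_t len = opc_len + rel_len;
    if (avail < len)
        return 0;
    int64_t rel;
    if (rel_len == 1) {
        rel = (int8_t)p[opc_len];
    } else {
        uint32_t u = (uint32_t)p[opc_len] | (uint32_t)p[opc_len + 1] << 8 |
                     (uint32_t)p[opc_len + 2] << 16 |
                     (uint32_t)p[opc_len + 3] << 24;
        rel = (int32_t)u;
    }
    *target = addr + len + (uint64_t)rel;   /* relative to the next insn */
    return len;
}
```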
gva_code_xref now brute-scans with the decoder instead of its own ad-hoc E8/E9
and REX.W-lea heuristic, which is removed - one decoder in the tree. Because a
brute scan can re-enter a prefixed instruction one byte in and decode a shorter
aliased form with the same target, the scan drops a match that starts inside the
extent of an already-accepted one; real, non-overlapping instructions are
unaffected.
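
The overlap rule can be sketched as a single pass over hits sorted by start offset; xref_hit and drop_overlaps are hypothetical names, not the gva_code_xref internals. The aliasing case is, for example, a 7-byte `48 8D 05 rel32` (REX.W lea) accepted at offset 0 and a 6-byte `8D 05 rel32` decoded at offset 1: both end at the same byte and carry the same displacement, so they resolve to the same target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hit record: offset of the instruction and its length. */
typedef struct { size_t off, len; } xref_hit;

/* Keep a hit only if it does not start inside the extent of the most
 * recently accepted one. hits must be sorted by ascending off; the
 * array is compacted in place and the number kept is returned. */
static size_t drop_overlaps(xref_hit *hits, size_t n)
{
    assert(hits != NULL || n == 0);
    size_t kept = 0, end = 0;
    for (size_t i = 0; i < n; i++) {
        if (kept != 0 && hits[i].off < end)
            continue;               /* starts inside an accepted extent */
        hits[kept++] = hits[i];
        end = hits[i].off + hits[i].len;
    }
    return kept;
}
```

A hit that begins exactly where the previous one ends is kept, which is why real back-to-back instructions are unaffected.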