Commit Graph

2 Commits

Author SHA1 Message Date
lirent c4419964aa Add function inventory (.pdata), signature generation, and export/PDB symbols
Three reversing capabilities on the win32 surface plus a pure sig-gen handler:

- vmie_win32_functions enumerates a module's functions from the exception
  directory (.pdata RUNTIME_FUNCTION), folding unwind chain continuations into
  their primary - authoritative non-leaf boundaries, not prologue heuristics.
- vmie_win32_exports resolves the export table to {name, rva, ordinal,
  forwarded}: named functions with no PDB or network. vmie_win32_pdb_ref pulls
  the CodeView/RSDS {guid, age, pdb} from the debug directory - the symbol-server
  key for any module (full PDB parsing stays out of scope).
- sig_generate (siggen.h) builds a unique masked signature for a code span,
  wildcarding the rel/RIP-relative displacement bytes the x86 decoder locates and
  growing until it matches the scope exactly once - the dual of sigscan.

The decoder now also reports disp_off/disp_len so a caller can mask the floating
bytes. The MZ/PE walk gains one shared data-directory accessor and one shared
CodeView/RSDS parser; the kernel bootstrap is moved onto both, removing its
private copies - one PE parser in the tree.
2026-06-16 19:27:42 +03:00
lirent 3199fbf258 Add a light x86-64 decoder; back code-xref with it
The reversing keystone: a length-disassembly decoder with control-flow and
RIP-relative target extraction (x86dec.h), pure over a byte buffer - no vmie_mem,
no cr3, no Windows. Table-driven length over the 1-byte / 0F / 0F38 / 0F3A maps,
legacy + REX + VEX prefixes, ModRM/SIB, displacements and immediates (66 and
REX.W operand-size aware). It reports the instruction length plus the rel and
RIP-relative targets of near call/jmp/jcc and any RIP-relative memory operand.
EVEX is a documented gap (decodes as length 0). This is the primitive the rest
of the static-reversing layer builds on (function inventory, call graph, xref).

gva_code_xref now brute-scans with the decoder instead of its own ad-hoc E8/E9
and REX.W-lea heuristic, which is removed - one decoder in the tree. Because a
brute scan can re-enter a prefixed instruction one byte in and decode a shorter
aliased form with the same target, the scan drops a match that starts inside the
extent of an already-accepted one; real, non-overlapping instructions are
unaffected.
2026-06-16 18:11:29 +03:00