Commit Graph

3 Commits

Author SHA1 Message Date
lirent 50ed32b7dc Add function-level code diff over caller-supplied views
code_diff compares two views of the same code in one coordinate space - an
on-disk image section against the live in-memory section, or one .text across
two snapshots - and reports the functions whose body changed. For each function
extent it func_hash()es the slice of each view and flags a mismatch: a patch, an
inline hook, or an unpacked/JIT-rewritten body. A thin handler over func_hash +
mem_sub, with no file I/O of its own - the caller owns reading the on-disk image.
The relocation limit (absolute-address immediates) is documented; two snapshots
at the same base diff exactly. Closes the non-starred reversing series.
2026-06-16 20:21:36 +03:00
lirent 35c5dc06ba Add imports, inline-hook detection, function hashing, per-function imports
Wave 2 of the code-analysis layer:

- vmie_win32_imports resolves the import directory (INT/IAT) to {iat_rva, dll,
  name, ordinal} - named APIs, walking the name and slot thunks in lockstep so
  every import carries the IAT slot a call lands on.
- vmie_win32_inline_hooks decodes each .pdata function's entry and reports any
  whose first instruction is a direct jmp/call leaving the module image - the
  detour/trampoline shape.
- vmie_win32_func_imports records, in order, the IAT slots a function calls
  through (call qword [rip+disp] onto an import slot): the function's API-call
  sequence, named by correlating with vmie_win32_imports.
- func_hash (codeanalysis.h) hashes a function position-independently, zeroing
  the displacement bytes the decoder locates - one primitive for fingerprinting
  known code and for detecting a changed body across snapshots.

Devirtualization needs no new call and is documented as a composition: a
vtable's methods are gva_jumptable(vtable_va), its instances are
pmap_referrers(vtable_va), and func_hash names each method. Imports reuse the
shared data-directory accessor; the analyses reuse the function/section/decode
primitives - no second PE or instruction parser.
2026-06-16 20:03:49 +03:00
lirent 79e82ffc6a Add code-structure analysis: call graph, jump tables, basic blocks, constant xref
Wave 1 of the code-analysis layer, built on the x86-64 decoder:

- vmie_win32_callgraph walks each .pdata function with the decoder and emits an
  edge for every direct call/jmp whose target lands in the module - the
  intra-module call graph. Indirect edges are left to the IAT and jump tables.
- gva_jumptable recovers a switch's case targets from an indirect jump's table:
  consecutive pointer entries that land in an executable region.
- cfg_blocks splits one function view into basic blocks (a generic handler:
  leaders from intra-function branch targets, cut after jmp/jcc/ret).
- gva_imm_xref finds the instructions whose immediate operand equals a constant
  - the dual of code-xref for magic values, error codes, syscall numbers.

The decoder now also reports imm_off/imm_len so a caller can read or match the
immediate operand. The generic primitives live in the new codeanalysis.h
(jump tables, basic blocks) and scan.h (constant xref); the .pdata-bound call
graph stays on the win32 surface and reuses the existing function/section/decode
primitives - no second PE or instruction parser.
2026-06-16 19:52:25 +03:00