code_diff compares two views of the same code in one coordinate space - an
on-disk image section against the live in-memory section, or one .text across
two snapshots - and reports the functions whose body changed. For each function
extent it hashes that slice of each view with func_hash and flags a mismatch,
which indicates a patch, an inline hook, or an unpacked/JIT-rewritten body. It is
a thin handler over func_hash + mem_sub with no file I/O of its own - the caller
owns reading the on-disk image.
The documented limit is relocation: absolute-address immediates differ when the
two views sit at different bases; two snapshots at the same base diff exactly.
Closes the non-starred reversing series.
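A minimal sketch of the per-extent comparison described above, not the actual
handler. Everything named sk_* is hypothetical, and sk_plain_hash is a plain
FNV-1a stand-in for func_hash (it does not mask displacement bytes). Both views
are assumed to be already loaded and to share one coordinate space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function extent: offset and length inside the shared view. */
typedef struct { size_t off; size_t len; } sk_extent;

/* Stand-in for func_hash: 64-bit FNV-1a over the raw bytes. */
static uint64_t sk_plain_hash(const uint8_t *p, size_t n) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Compare two equally sized views extent by extent. Stores the index of each
 * changed extent in changed[] (up to cap) and returns the total number of
 * changed extents. Extents that do not fit inside the view are skipped. */
static size_t sk_code_diff(const uint8_t *a, const uint8_t *b, size_t view_len,
                           const sk_extent *fn, size_t nfn,
                           size_t *changed, size_t cap) {
    size_t n = 0;
    for (size_t i = 0; i < nfn; i++) {
        if (fn[i].off > view_len || fn[i].len > view_len - fn[i].off)
            continue;
        if (sk_plain_hash(a + fn[i].off, fn[i].len) !=
            sk_plain_hash(b + fn[i].off, fn[i].len)) {
            if (n < cap)
                changed[n] = i;
            n++;
        }
    }
    return n;
}
```

The caller supplies both buffers, which mirrors the "no file I/O of its own"
split: reading the on-disk image stays outside the comparison.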
Wave 2 of the code-analysis layer:
- vmie_win32_imports resolves the import directory (INT/IAT) to named APIs, one
{iat_rva, dll, name, ordinal} record per import, walking the name and slot
thunks in lockstep so every import carries the IAT slot a call lands on.
- vmie_win32_inline_hooks decodes each .pdata function's entry and reports any
whose first instruction is a direct jmp/call leaving the module image - the
detour/trampoline shape.
- vmie_win32_func_imports records, in order, the IAT slots a function calls
through (call qword [rip+disp] onto an import slot): the function's API-call
sequence, named by correlating with vmie_win32_imports.
- func_hash (codeanalysis.h) hashes a function position-independently, zeroing
the displacement bytes the decoder locates - one primitive for fingerprinting
known code and for detecting a changed body across snapshots.
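The position-independent hashing in the func_hash bullet can be sketched as
follows. This is an illustration under assumptions, not the codeanalysis.h
implementation: the sk_* names are hypothetical, FNV-1a is a stand-in hash, and
the displacement spans are passed in by the caller rather than located by the
decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical span of displacement bytes inside the function body. */
typedef struct { size_t off; size_t len; } sk_span;

/* FNV-1a over the function bytes, treating every byte that falls inside a
 * displacement span as zero, so the same code hashes identically no matter
 * where it sits relative to its call/jump targets. */
static uint64_t sk_masked_hash(const uint8_t *code, size_t n,
                               const sk_span *disp, size_t ndisp) {
    uint64_t h = 0xcbf29ce484222325ULL;
    for (size_t i = 0; i < n; i++) {
        uint8_t b = code[i];
        for (size_t k = 0; k < ndisp; k++) {
            if (i >= disp[k].off && i - disp[k].off < disp[k].len) {
                b = 0; /* displacement byte: excluded from the fingerprint */
                break;
            }
        }
        h ^= b;
        h *= 0x100000001b3ULL;
    }
    return h;
}
```

Two copies of `call rel32; ret` with different rel32 values hash equal once the
four displacement bytes are masked, while a changed opcode byte still produces
a different hash.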
Devirtualization needs no new call and is documented as a composition: a
vtable's methods are gva_jumptable(vtable_va), its instances are
pmap_referrers(vtable_va), and func_hash names each method. Imports reuse the
shared data-directory accessor; the analyses reuse the function/section/decode
primitives - no second PE or instruction parser.
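The composition works because a vtable can be walked like a jump table:
consecutive pointer entries that land in executable memory. A minimal sketch of
that walk over a flat buffer follows; sk_walk_table and its parameters are
hypothetical and are not the gva_jumptable signature, and little-endian 8-byte
entries are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read consecutive little-endian 64-bit entries starting at table_off and
 * stop at the first entry outside [exec_lo, exec_hi) or at the end of the
 * buffer. Stores up to cap entries in out[] and returns how many were found. */
static size_t sk_walk_table(const uint8_t *mem, size_t mem_len,
                            size_t table_off,
                            uint64_t exec_lo, uint64_t exec_hi,
                            uint64_t *out, size_t cap) {
    size_t n = 0;
    while (table_off <= mem_len && mem_len - table_off >= 8) {
        uint64_t v = 0;
        for (int i = 0; i < 8; i++)
            v |= (uint64_t)mem[table_off + i] << (8 * i);
        if (v < exec_lo || v >= exec_hi)
            break; /* first non-code pointer ends the table */
        if (n < cap)
            out[n] = v;
        n++;
        table_off += 8;
    }
    return n;
}
```

Each recovered entry is a method address; hashing the function at that address
is what lets a method be matched against known code.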
Wave 1 of the code-analysis layer, built on the x86-64 decoder:
- vmie_win32_callgraph walks each .pdata function with the decoder and emits an
edge for every direct call/jmp whose target lands in the module - the
intra-module call graph. Indirect edges are left to the IAT and jump tables.
- gva_jumptable recovers a switch's case targets from an indirect jump's table:
consecutive pointer entries that land in an executable region.
- cfg_blocks splits one function view into basic blocks (a generic handler:
leaders from intra-function branch targets, cut after jmp/jcc/ret).
- gva_imm_xref finds the instructions whose immediate operand equals a given
constant - the dual of code-xref for magic values, error codes, syscall numbers.
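The basic-block rule in the cfg_blocks bullet (leaders from intra-function
branch targets, cut after jmp/jcc/ret) can be sketched over an already decoded
instruction list. The sk_* types below are hypothetical and stand in for
whatever the decoder actually reports; the quadratic leader test keeps the
sketch free of allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { SK_OTHER, SK_JMP, SK_JCC, SK_RET } sk_kind;

/* Hypothetical decoded instruction, offsets relative to the function start. */
typedef struct {
    uint32_t off;        /* offset of the instruction */
    uint32_t len;        /* encoded length in bytes */
    sk_kind  kind;       /* control-flow class */
    int      has_target; /* nonzero if target is a known in-function offset */
    uint32_t target;     /* branch target, valid when has_target is set */
} sk_insn;

typedef struct { uint32_t start; uint32_t end; } sk_block; /* [start, end) */

/* An instruction starts a block if it is first, follows a jmp/jcc/ret, or is
 * the target of a branch inside the function. */
static int sk_is_leader(const sk_insn *ins, size_t n, size_t i) {
    if (i == 0 || ins[i - 1].kind != SK_OTHER)
        return 1;
    for (size_t j = 0; j < n; j++) {
        if ((ins[j].kind == SK_JMP || ins[j].kind == SK_JCC) &&
            ins[j].has_target && ins[j].target == ins[i].off)
            return 1;
    }
    return 0;
}

/* Stores up to cap blocks in out[] and returns the total block count. */
static size_t sk_cfg_blocks(const sk_insn *ins, size_t n,
                            sk_block *out, size_t cap) {
    size_t nb = 0;
    for (size_t i = 0; i < n; i++) {
        if (sk_is_leader(ins, n, i)) {
            if (nb < cap) {
                out[nb].start = ins[i].off;
                out[nb].end = ins[i].off + ins[i].len;
            }
            nb++;
        } else if (nb <= cap) {
            out[nb - 1].end = ins[i].off + ins[i].len; /* extend current block */
        }
    }
    return nb;
}
```

For a function of the shape `cmp; jcc L; mov; L: ret` this yields three blocks:
the compare and branch, the fall-through mov, and the ret at the branch target.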
The decoder now also reports imm_off/imm_len so a caller can read or match the
immediate operand. The generic primitives live in the new codeanalysis.h
(jump tables, basic blocks) and scan.h (constant xref); the .pdata-bound call
graph stays on the win32 surface and reuses the existing function/section/decode
primitives - no second PE or instruction parser.
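As an illustration of what imm_off/imm_len make possible, the sketch below
matches a constant against each instruction's immediate. Only the field names
imm_off and imm_len come from the text above; the sk_dec layout, the convention
that imm_off is relative to the instruction start and that imm_len is 0 when
there is no immediate, and the zero-extended comparison (no sign extension) are
all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder record for one instruction. */
typedef struct {
    uint32_t off;     /* offset of the instruction in the code buffer */
    uint32_t len;     /* encoded length in bytes */
    uint8_t  imm_off; /* offset of the immediate within the instruction */
    uint8_t  imm_len; /* immediate size in bytes, 0 if none */
} sk_dec;

/* Read the immediate as a zero-extended little-endian value. */
static uint64_t sk_read_imm(const uint8_t *code, const sk_dec *d) {
    uint64_t v = 0;
    for (unsigned i = 0; i < d->imm_len && i < 8; i++)
        v |= (uint64_t)code[d->off + d->imm_off + i] << (8 * i);
    return v;
}

/* Stores the offset of each instruction whose immediate equals value in
 * hits[] (up to cap) and returns the total number of matches. */
static size_t sk_imm_xref(const uint8_t *code, const sk_dec *d, size_t n,
                          uint64_t value, uint32_t *hits, size_t cap) {
    size_t nh = 0;
    for (size_t i = 0; i < n; i++) {
        if (d[i].imm_len == 0)
            continue;
        if (sk_read_imm(code, &d[i]) == value) {
            if (nh < cap)
                hits[nh] = d[i].off;
            nh++;
        }
    }
    return nh;
}
```

Matching on the decoded immediate rather than on raw bytes avoids false hits
where the constant's byte pattern happens to straddle two instructions.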