Add process-scoped scanning algorithms: multi-pattern, code-xref, pointer-map, dissection, snapshot diff

All are OS-agnostic handlers keyed by vmie_mem* + cr3, built on the windowed
sweep / region walk / matcher; none names a Windows concept and each compiles
against include/ alone.

Scanning: a compiled multi-pattern automaton (Aho-Corasick over each pattern's
longest literal anchor, then a masked verify) finds N signatures in one sweep
pass (sigscan.h sigset; scan.h gva_sig_scan_multi). gva_code_xref decodes
rel32 call/jmp and RIP-relative lea/mov to find every instruction targeting a
given VA.
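
The target arithmetic behind a code-xref pass can be sketched as follows. This is an illustration only, not gva_code_xref's implementation: the helper names (rd_i32, xref_target) are hypothetical, and it assumes just four encodings (E8 call, E9 jmp, and REX.W-prefixed 8D lea / 8B mov with a RIP-relative ModRM). A byte-wise probe like this can also fire on operand bytes that merely look like an opcode, so a real pass would need some way to reject those.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian signed 32-bit displacement. */
static int32_t rd_i32(const uint8_t *p) {
    uint32_t u = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    int32_t v;
    memcpy(&v, &u, sizeof v);
    return v;
}

/* Hypothetical helper: if the bytes at code[off] look like one of the handled
 * encodings, store the referenced VA in *target and return the instruction
 * length; otherwise return 0. base_va is the VA of code[0]. */
static size_t xref_target(const uint8_t *code, size_t len, size_t off,
                          uint64_t base_va, uint64_t *target) {
    assert(code != NULL && target != NULL);
    if (off >= len) return 0;
    const uint8_t *p = code + off;
    size_t left = len - off;

    /* E8 rel32 (call) / E9 rel32 (jmp): target = next instruction + rel32. */
    if (left >= 5 && (p[0] == 0xE8 || p[0] == 0xE9)) {
        *target = base_va + off + 5 + (uint64_t)(int64_t)rd_i32(p + 1);
        return 5;
    }
    /* REX.W + 8D (lea) or 8B (mov), ModRM mod=00 rm=101: [rip + disp32]. */
    if (left >= 7 && (p[0] & 0xF8) == 0x48 &&
        (p[1] == 0x8D || p[1] == 0x8B) && (p[2] & 0xC7) == 0x05) {
        *target = base_va + off + 7 + (uint64_t)(int64_t)rd_i32(p + 3);
        return 7;
    }
    return 0;
}
```

A caller would run xref_target at every offset of an executable window and report the offsets whose *target equals the VA of interest.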

Pointer graph (pmap.h): one sweep indexes every qword whose value lands in a
mapped region into reverse + forward edges. pmap_referrers is the keystone: it
answers who-points-here, class-instance enumeration (referrers of a vtable
VA), and string xref (referrers of a string VA) from the same index.
pmap_paths is the indexed counterpart to scan_pointer's one-shot DFS.
struct_dissect classifies the qwords of an instance (pointer/vtable/float/
int/string) into a field map.
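
The reverse index behind a who-points-here query can be sketched as a sorted edge list with a binary-search lookup. This is a minimal sketch, not pmap.h's layout: edge_t, index_build and index_referrers are hypothetical names, a single [lo, hi) range stands in for the set of mapped regions, and qwords are read host-endian (assumes a little-endian guest and host).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint64_t dst, src; } edge_t;   /* qword at src points to dst */

static int edge_cmp(const void *a, const void *b) {
    const edge_t *x = a, *y = b;
    if (x->dst != y->dst) return x->dst < y->dst ? -1 : 1;
    if (x->src != y->src) return x->src < y->src ? -1 : 1;
    return 0;
}

/* Index every 8-byte-aligned qword in buf whose value falls in [lo, hi).
 * Returns the edge count, or (size_t)-1 on OOM. Caller frees *out. */
static size_t index_build(const uint8_t *buf, size_t len, uint64_t base_va,
                          uint64_t lo, uint64_t hi, edge_t **out) {
    assert(buf != NULL && out != NULL);
    size_t cap = len / 8, n = 0;
    edge_t *e = malloc((cap ? cap : 1) * sizeof *e);
    if (!e) return (size_t)-1;
    for (size_t off = 0; off + 8 <= len; off += 8) {
        uint64_t v;
        memcpy(&v, buf + off, 8);
        if (v >= lo && v < hi) {
            e[n].dst = v;
            e[n].src = base_va + off;
            n++;
        }
    }
    qsort(e, n, sizeof *e, edge_cmp);            /* group edges by target */
    *out = e;
    return n;
}

/* Return the first edge whose dst == target and store the run length in
 * *count (0 if nothing points at target). */
static const edge_t *index_referrers(const edge_t *e, size_t n,
                                     uint64_t target, size_t *count) {
    size_t lo = 0, hi = n;
    while (lo < hi) {                            /* lower bound on dst */
        size_t mid = lo + (hi - lo) / 2;
        if (e[mid].dst < target) lo = mid + 1; else hi = mid;
    }
    size_t end = lo;
    while (end < n && e[end].dst == target) end++;
    *count = end - lo;
    return e + lo;
}
```

With the index built once, referrers of a vtable VA and referrers of a string VA are the same lookup with a different target.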

Temporal (snapdiff.h): snap_take captures a window's bytes; snap_diff reports
the runs of bytes that changed in a later read of the same window.
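
The changed-run report can be sketched as a single pass over two equal-length buffers. This is a minimal sketch under assumed names (run_t, diff_runs), not the snapdiff.h interface, and it takes the snapshot and the later read as plain byte arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { size_t off, len; } run_t;      /* one maximal changed run */

/* Find maximal runs where before[i] != after[i]. Writes at most max_runs
 * entries to runs and returns the total number of runs found, so a caller
 * can detect truncation. runs may be NULL when max_runs is 0. */
static size_t diff_runs(const uint8_t *before, const uint8_t *after,
                        size_t len, run_t *runs, size_t max_runs) {
    assert(before != NULL && after != NULL);
    size_t n = 0, i = 0;
    while (i < len) {
        if (before[i] == after[i]) { i++; continue; }
        size_t start = i;
        while (i < len && before[i] != after[i]) i++;
        if (n < max_runs) {
            runs[n].off = start;
            runs[n].len = i - start;
        }
        n++;
    }
    return n;
}
```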
2026-06-16 17:38:10 +03:00
parent 25b8ed8ca9
commit c36ffe295d
9 changed files with 1160 additions and 1 deletions
@@ -79,4 +79,34 @@ uint64_t sig_rip(mem_view_t v, uint64_t hit_va, size_t disp_off, size_t instr_le
* is actually available. Useful for narrowing a scan to a [start,end] window. */
mem_view_t mem_sub(mem_view_t v, uint64_t start_va, size_t size);
/* ---- compiled multi-pattern matcher (Aho-Corasick anchors) --------------- *
* A sigset compiles N patterns into one automaton scanned in a single pass. It
* is still PURE (only mem_view_t, no vmie_mem). Each pattern contributes its
* longest contiguous non-wildcard run as a literal anchor; an Aho-Corasick goto
* over those anchors finds candidate sites, and on an anchor hit the FULL masked
* pattern is verified (mem_sub + mask compare) before the match is reported.
* This is the building block under gva_sig_scan_multi (see scan.h). */
typedef struct sigset sigset; /* compiled automaton (opaque) */
/* Compile `n` patterns into a sigset. The patterns are borrowed for the call
* only (their bytes are copied into the automaton). Returns NULL on OOM, on
* n <= 0, or if any pattern is empty / all-wildcard (no literal anchor). Release
* with sigset_free(). */
sigset* sigset_compile(const sig_pattern_t* pats, int n);
/* Release a sigset produced by sigset_compile. Safe on NULL. */
void sigset_free(sigset* s);
/* Invoke cb(user, pat_index, match_va) for every full-pattern match of any
* compiled pattern in `v`. Matches are reported in anchor-hit order, which is
* not necessarily ascending by VA across patterns. `cb` returns nonzero to
* stop early. A windowed caller overlaps consecutive windows by
* sigset_maxlen() - 1 bytes so no full pattern is split at a seam. */
void sig_set_each(const sigset* s, mem_view_t v,
int (*cb)(void* user, int pat, uint64_t va), void* user);
/* Longest compiled pattern length, in bytes. A windowed sweep carries
* (this - 1) leading-overlap bytes so no full pattern is split at a seam (the
* gva_sig_scan_multi overlap contract). 0 on NULL. */
size_t sigset_maxlen(const sigset* s);
#endif /* VMIE_SIGSCAN_H */
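
The anchor-then-verify step described in the sigset comment can be sketched as below. This is a hedged illustration, not the sigset internals: pat_t is a hypothetical stand-in for sig_pattern_t (a byte array plus a mask where nonzero means "must match"), and the Aho-Corasick search that produces anchor hits is left out, so the sketch only shows how an anchor is chosen and how a hit is turned into a verified match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pattern layout: mask[i] != 0 means bytes[i] must match,
 * mask[i] == 0 means wildcard. */
typedef struct {
    const uint8_t *bytes;
    const uint8_t *mask;
    size_t len;
} pat_t;

/* Longest contiguous non-wildcard run. Returns its length (0 if the pattern
 * is empty or all-wildcard) and stores its offset in *start. On a tie the
 * earliest run wins. */
static size_t longest_anchor(const pat_t *p, size_t *start) {
    assert(p != NULL && start != NULL);
    size_t best = 0, best_at = 0, run = 0;
    for (size_t i = 0; i < p->len; i++) {
        if (p->mask[i]) {
            run++;
            if (run > best) { best = run; best_at = i + 1 - run; }
        } else {
            run = 0;
        }
    }
    *start = best_at;
    return best;
}

/* Masked compare of the full pattern against buf[at .. at + p->len). */
static int masked_match(const pat_t *p, const uint8_t *buf, size_t len,
                        size_t at) {
    if (at > len || p->len > len - at) return 0;    /* would run off the end */
    for (size_t i = 0; i < p->len; i++)
        if (p->mask[i] && buf[at + i] != p->bytes[i]) return 0;
    return 1;
}

/* An anchor found at buf[hit] implies a candidate pattern start at
 * hit - anchor_start; the full pattern is verified before reporting. */
static int verify_at_anchor(const pat_t *p, size_t anchor_start,
                            const uint8_t *buf, size_t len, size_t hit) {
    if (hit < anchor_start) return 0;               /* pattern starts before buf */
    return masked_match(p, buf, len, hit - anchor_start);
}
```

A pattern with no literal run yields an anchor length of 0, which corresponds to the all-wildcard case that sigset_compile rejects.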