mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2025-04-06 17:49:11 +03:00

FreeBSD: don't verify recycled vnode for zfs control directory

Under certain loads, the following panic is hit:

    panic: page fault
    KDB: stack backtrace:
     0xffffffff805db025 at kdb_backtrace+0x65
     0xffffffff8058e86f at vpanic+0x17f
     0xffffffff8058e6e3 at panic+0x43
     0xffffffff808adc15 at trap_fatal+0x385
     0xffffffff808adc6f at trap_pfault+0x4f
     0xffffffff80886da8 at calltrap+0x8
     0xffffffff80669186 at vgonel+0x186
     0xffffffff80669841 at vgone+0x31
     0xffffffff8065806d at vfs_hash_insert+0x26d
     0xffffffff81a39069 at sfs_vgetx+0x149
     0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
     0xffffffff8065a28c at lookup+0x45c
     0xffffffff806594b9 at namei+0x259
     0xffffffff80676a33 at kern_statat+0xf3
     0xffffffff8067712f at sys_fstatat+0x2f
     0xffffffff808ae50c at amd64_syscall+0x10c
     0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() calls VOP_CLOSE() on active
vnodes, and zfsctl_ops_snapshot does not define vop_close. Define
vop_close for zfsctl_ops_snapshot to fix this. While here, define
vop_open for consistency.
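
The missing operation can be illustrated with a small userland model.
This is only a sketch, not the kernel's dispatch code: every name in it
(dispatch_ops, dispatch_vnode, dispatch_vop_close and so on) is
hypothetical, and it assumes that an operation left out of the vector
ends up as a null function pointer. Where the kernel would fault, the
model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

struct dispatch_vnode;
typedef int (*dispatch_vop_t)(struct dispatch_vnode *);

/* Hypothetical stand-in for a vnode operations vector. */
struct dispatch_ops {
	dispatch_vop_t op_open;		/* NULL when left undefined */
	dispatch_vop_t op_close;	/* NULL when left undefined */
};

struct dispatch_vnode {
	const struct dispatch_ops *ops;
};

/* Trivial handlers, standing in for the common open/close routines. */
static int
dispatch_common_open(struct dispatch_vnode *vp)
{
	(void)vp;
	return (0);
}

static int
dispatch_common_close(struct dispatch_vnode *vp)
{
	(void)vp;
	return (0);
}

/*
 * Model of calling the close operation on a vnode. Returns -1 at the
 * point where a call through a null pointer would fault.
 */
static int
dispatch_vop_close(struct dispatch_vnode *vp)
{
	assert(vp != NULL && vp->ops != NULL);
	if (vp->ops->op_close == NULL)
		return (-1);
	return (vp->ops->op_close(vp));
}

/* Vector before the change: neither open nor close is defined. */
static const struct dispatch_ops dispatch_ops_before = { NULL, NULL };

/* Vector after the change: both operations are defined. */
static const struct dispatch_ops dispatch_ops_after = {
	dispatch_common_open,
	dispatch_common_close,
};
```

With dispatch_ops_before, dispatch_vop_close() reports the would-be
fault; with dispatch_ops_after, it returns 0.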

After adding the necessary vop, the bug progresses to the following
panic:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
     0xffffffff805e29c5 at kdb_backtrace+0x65
     0xffffffff8059620f at vpanic+0x17f
     0xffffffff81a27f4a at spl_panic+0x3a
     0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
     0xffffffff8066fdee at vinactivef+0xde
     0xffffffff80670b8a at vgonel+0x1ea
     0xffffffff806711e1 at vgone+0x31
     0xffffffff8065fa0d at vfs_hash_insert+0x26d
     0xffffffff81a39069 at sfs_vgetx+0x149
     0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
     0xffffffff80661c2c at lookup+0x45c
     0xffffffff80660e59 at namei+0x259
     0xffffffff8067e3d3 at kern_statat+0xf3
     0xffffffff8067eacf at sys_fstatat+0x2f
     0xffffffff808b5ecc at amd64_syscall+0x10c
     0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race that can occur when a new vnode is allocated
and inserted into the vfs hash. If the newly created vnode loses the
race during insertion, vrecycle() does not recycle it because its
usecount is greater than zero. vrecycle() then returns 0, which trips
the above assertion.

Fix this by dropping the assertion.
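
The effect of dropping the assertion can be sketched in userland C.
This is a simplified model under stated assumptions, not the kernel
code: race_vnode, race_vrecycle, race_inactive_strict and
race_inactive_relaxed are hypothetical names, and the model only
captures that a vnode with a nonzero usecount is not recycled. The
strict handler returns -1 where the VERIFY would panic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for a vnode. */
struct race_vnode {
	int usecount;	/* active references held on the vnode */
	int doomed;	/* set to 1 once the vnode has been recycled */
};

/*
 * Model of vrecycle(): recycle the vnode only if it is unused.
 * Returns 1 if the vnode was recycled, 0 otherwise.
 */
static int
race_vrecycle(struct race_vnode *vp)
{
	assert(vp != NULL);
	if (vp->usecount > 0)
		return (0);
	vp->doomed = 1;
	return (1);
}

/*
 * Model of the old inactive handler, which asserted that the vnode
 * was recycled. Returns -1 where the assertion would have fired.
 */
static int
race_inactive_strict(struct race_vnode *vp)
{
	if (race_vrecycle(vp) != 1)
		return (-1);
	return (0);
}

/*
 * Model of the new inactive handler: attempt the recycle and ignore
 * the result, so a vnode that is still in use is left alone.
 */
static int
race_inactive_relaxed(struct race_vnode *vp)
{
	(void)race_vrecycle(vp);
	return (0);
}
```

For a vnode that lost the race (usecount of 1), the strict handler
reports the would-be panic, while the relaxed handler returns 0 and
leaves the vnode unrecycled. An unused vnode is recycled by either.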

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes 
rob-wing 2023-02-21 16:26:33 -09:00 committed by Brian Behlendorf
parent 45c4b3e680
commit f786232b2a


@@ -1165,7 +1165,7 @@ zfsctl_snapshot_inactive(struct vop_inactive_args *ap)
 {
 	vnode_t *vp = ap->a_vp;
-	VERIFY3S(vrecycle(vp), ==, 1);
+	vrecycle(vp);
 	return (0);
 }
@@ -1249,6 +1249,8 @@ static struct vop_vector zfsctl_ops_snapshot = {
 #if __FreeBSD_version >= 1300121
 	.vop_fplookup_vexec = VOP_EAGAIN,
 #endif
+	.vop_open = zfsctl_common_open,
+	.vop_close = zfsctl_common_close,
 	.vop_inactive = zfsctl_snapshot_inactive,
 #if __FreeBSD_version >= 1300045
 	.vop_need_inactive = vop_stdneed_inactive,