Define sops->free_inode() to prevent use-after-free during lookup

On Linux, when doing path lookup with LOOKUP_RCU, dentry and inode can
be dereferenced without refcounts and locks. For this reason, dentry and
inode must only be freed after RCU grace period.

However, zfs currently frees inode in zfs_inode_destroy synchronously
and we can't use GPL-only call_rcu() in zfs directly. Fortunately, on
Linux 5.2 and after, if we define sops->free_inode(), the kernel will do
call_rcu() for us.

This issue may be triggered more easily with init_on_free=1 boot
parameter:

BUG: kernel NULL pointer dereference, address: 0000000000000020
RIP: 0010:selinux_inode_permission+0x10e/0x1c0
Call Trace:
 ? show_trace_log_lvl+0x1be/0x2d9
 ? show_trace_log_lvl+0x1be/0x2d9
 ? show_trace_log_lvl+0x1be/0x2d9
 ? security_inode_permission+0x37/0x60
 ? __die_body.cold+0x8/0xd
 ? no_context+0x113/0x220
 ? exc_page_fault+0x6d/0x130
 ? asm_exc_page_fault+0x1e/0x30
 ? selinux_inode_permission+0x10e/0x1c0
 security_inode_permission+0x37/0x60
 link_path_walk.part.0.constprop.0+0xb5/0x360
 ? path_init+0x27d/0x3c0
 path_lookupat+0x3e/0x1a0
 filename_lookup+0xc0/0x1d0
 ? __check_object_size.part.0+0x123/0x150
 ? strncpy_from_user+0x4e/0x130
 ? getname_flags.part.0+0x4b/0x1c0
 vfs_statx+0x72/0x120
 ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
 __do_sys_newlstat+0x39/0x70
 ? __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x30/0x40
 entry_SYSCALL_64_after_hwframe+0x62/0xc7

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Co-authored-by: Chunwei Chen <david.chen@nutanix.com>
Closes #17546
This commit is contained in:
Chunwei Chen
2025-07-18 08:45:13 -07:00
committed by Alexander Motin
parent 347d68048a
commit c79d5e4f33
5 changed files with 54 additions and 2 deletions
+24
View File
@@ -0,0 +1,24 @@
dnl #
dnl # Linux 5.2 API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SOPS_FREE_INODE], [
ZFS_LINUX_TEST_SRC([super_operations_free_inode], [
#include <linux/fs.h>
static void free_inode(struct inode *) { }
static struct super_operations sops __attribute__ ((unused)) = {
.free_inode = free_inode,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_SOPS_FREE_INODE], [
AC_MSG_CHECKING([whether sops->free_inode() exists])
ZFS_LINUX_TEST_RESULT([super_operations_free_inode], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SOPS_FREE_INODE, 1, [sops->free_inode() exists])
],[
AC_MSG_RESULT(no)
])
])
+2
View File
@@ -132,6 +132,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_PIN_USER_PAGES
ZFS_AC_KERNEL_SRC_TIMER
ZFS_AC_KERNEL_SRC_SUPER_BLOCK_S_WB_ERR
ZFS_AC_KERNEL_SRC_SOPS_FREE_INODE
case "$host_cpu" in
powerpc*)
ZFS_AC_KERNEL_SRC_CPU_HAS_FEATURE
@@ -248,6 +249,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_PIN_USER_PAGES
ZFS_AC_KERNEL_TIMER
ZFS_AC_KERNEL_SUPER_BLOCK_S_WB_ERR
ZFS_AC_KERNEL_SOPS_FREE_INODE
case "$host_cpu" in
powerpc*)
ZFS_AC_KERNEL_CPU_HAS_FEATURE