mirror_zfs/module/os/linux/zfs
Prakash Surya 13312e2fa1
Reduce need for contiguous memory for ioctls
We've had cases where we trigger an OOM despite having memory freely
available on the system. For example, here, we had about 21GB free:

    kernel: Node 0 Normal: 2418758*4kB (UME) 1549533*8kB (UE) 0*16kB
    0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =
    22071296kB

The problem being, all the memory is in 4K and 8K contiguous regions,
but the allocation request was for a 16K contiguous region:

    kernel: SafeExecutors-4 invoked oom-killer:
    gfp_mask=0x42dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_COMP|__GFP_ZERO),
    order=2, oom_score_adj=0

The offending allocation came from this call trace:

    kernel: Call Trace:
    kernel:  dump_stack+0x57/0x7a
    kernel:  dump_header+0x4f/0x1e1
    kernel:  oom_kill_process.cold.33+0xb/0x10
    kernel:  out_of_memory+0x1ad/0x490
    kernel:  __alloc_pages_slowpath+0xd55/0xe40
    kernel:  __alloc_pages_nodemask+0x2df/0x330
    kernel:  kmalloc_large_node+0x42/0x90
    kernel:  __kmalloc_node+0x25a/0x320
    kernel:  ? spl_kmem_free_impl+0x21/0x30 [spl]
    kernel:  spl_kmem_alloc_impl+0xa5/0x100 [spl]
    kernel:  spl_kmem_zalloc+0x19/0x20 [spl]
    kernel:  zfsdev_ioctl+0x2b/0xe0 [zfs]
    kernel:  do_vfs_ioctl+0xa9/0x640
    kernel:  ? __audit_syscall_entry+0xdd/0x130
    kernel:  ksys_ioctl+0x67/0x90
    kernel:  __x64_sys_ioctl+0x1a/0x20
    kernel:  do_syscall_64+0x5e/0x200
    kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    kernel: RIP: 0033:0x7fdca3674317

The problem is, for each ioctl that ZFS makes, it has to allocate a
zfs_cmd_t structure, which is 13744 bytes in size (on my system):

    sdb> sizeof zfs_cmd
    (size_t)13744

This size, coupled with the fact that we currently allocate it with
kmem_zalloc, means we need a 16K contiguous region of memory to satisfy
the request.

The solution taken by this change, is to use "vmem" instead of "kmem" to
do the allocation, such that we don't necessarily need a contiguous 16K
memory region to satisfy the allocation.

Arguably, a better solution would be not to require such a large
allocation to begin with (e.g. reduce the size of the zfs_cmd_t
structure), but that'd be a much larger change than this "one liner".
Thus, I've opted for this approach for now; we can always circle back
and attempt to reduce the size of the structure in the future.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #14474
2023-02-13 16:35:59 -08:00
..
abd_os.c Add assertion and make variables unsigned in abd_alloc_chunks() 2023-02-06 11:10:50 -08:00
arc_os.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
mmp_os.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
policy.c Support idmapped mount in user namespace 2022-11-08 10:28:56 -08:00
qat_compress.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat_crypt.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_misc_os.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
trace.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_disk.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00
vdev_file.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
zfs_acl.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00
zfs_ctldir.c Remove zpl_revalidate: fix snapshot rollback 2022-10-28 09:47:19 -07:00
zfs_debug.c Cleanup: Replace oldstyle struct hack with C99 flexible array members 2023-01-12 16:00:03 -08:00
zfs_dir.c Fix unprotected zfs_znode_dmu_fini 2023-01-19 16:59:05 -08:00
zfs_file_os.c Cleanup: Remove branches that always evaluate the same way 2022-11-03 10:47:48 -07:00
zfs_ioctl_os.c Reduce need for contiguous memory for ioctls 2023-02-13 16:35:59 -08:00
zfs_racct.c module: zfs: fix unused, remove argsused 2021-12-23 09:42:47 -08:00
zfs_sysfs.c Introduce kmem_scnprintf() 2022-10-29 13:05:11 -07:00
zfs_uio.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_vfsops.c zfs_get_temporary_prop() should not pass NULL to strcpy() 2023-02-06 11:08:57 -08:00
zfs_vnops_os.c Cleanup of dead code suggested by Clang Static Analyzer (#14380) 2023-01-17 09:57:12 -08:00
zfs_znode.c Fix unprotected zfs_znode_dmu_fini 2023-01-19 16:59:05 -08:00
zio_crypt.c Fix GCC 12 compilation errors 2022-11-30 13:45:53 -08:00
zpl_ctldir.c Support idmapped mount in user namespace 2022-11-08 10:28:56 -08:00
zpl_export.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zpl_file.c Support idmapped mount in user namespace 2022-11-08 10:28:56 -08:00
zpl_inode.c linux 6.2 compat: get_acl() got moved to get_inode_acl() in 6.2 2023-01-06 14:40:54 -08:00
zpl_super.c Support idmapped mount 2022-10-19 11:17:09 -07:00
zpl_xattr.c linux 6.2 compat: zpl_set_acl arg2 is now struct dentry 2023-01-24 11:20:50 -08:00
zvol_os.c Optionally skip zil_close during zvol_create_minor_impl 2022-11-08 12:38:08 -08:00