mirror_zfs/module/os/linux/zfs
Rob Norris c6be6ce175 abd_iter_page: don't use compound heads on Linux <4.5
Before 4.5 (specifically, torvalds/linux@ddc58f2), head and tail pages
in a compound page were refcounted separately. This means that using the
head page without taking a reference to it could see it cleaned up later
before we're finished with it. Specifically, bio_add_page() would take a
reference, and drop its reference after the bio completion callback
returns.

If the zio is executed immediately from the completion callback, this is
usually ok, as any data is referenced through the tail page referenced
by the ABD, and so becomes "live" that way. If there's a delay in zio
execution (high load, error injection), then the head page can be freed,
along with any dirty flags or other indicators that the underlying
memory is used. Later, when the zio completes and that memory is
accessed, it's either unmapped and an unhandled fault takes down the
entire system, or it is mapped and we end up messing around in someone
else's memory. Both of these are very bad.

The solution on these older kernels is to take a reference to the head
page when we use it, and release it when we're done. There's not really
a sensible way under our current structure to do this; the "best" would
be to keep a list of head page references in the ABD, and release them
when the ABD is freed.

Since this additional overhead is totally unnecessary on 4.5+, where
head and tail pages share refcounts, I've opted to simply not use the
compound head in ABD page iteration on kernels before 4.5. This is
theoretically less efficient (though cleaning up head page references
would add overhead), but it's safe, and we still get the other benefits
of not mapping pages before adding them to a bio and not mis-splitting
pages.
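
To make the efficiency trade-off concrete, here is a simplified userspace
model of the two iteration strategies. It is a sketch, not the code in
abd_os.c: every name prefixed with sketch_ or SKETCH_ is hypothetical, the
page size is assumed to be 4096 bytes, and a compound page is modelled as
npages contiguous pages. With the compound head, one chunk can cover
everything up to the end of the compound page; without it, each chunk stops
at a page boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	SKETCH_PAGE_SIZE	4096u	/* assumed page size for this model */

/* One (page, offset, length) tuple, as would be handed to a bio. */
struct sketch_chunk {
	size_t page_index;	/* page within the compound page that is used */
	size_t offset;		/* byte offset relative to that page */
	size_t len;		/* bytes covered by this chunk */
};

/*
 * Return the next chunk for byte position pos inside a compound page made
 * of npages pages, with remaining bytes still to be covered.
 */
static struct sketch_chunk
sketch_next_chunk(size_t pos, size_t remaining, size_t npages, bool use_head)
{
	size_t total = npages * SKETCH_PAGE_SIZE;
	struct sketch_chunk c;

	assert(pos < total && remaining > 0);

	if (use_head) {
		/* Refer to the head page; run to the end of the compound. */
		c.page_index = 0;
		c.offset = pos;
		c.len = total - pos;
	} else {
		/* Refer to the individual page; stop at its boundary. */
		c.page_index = pos / SKETCH_PAGE_SIZE;
		c.offset = pos % SKETCH_PAGE_SIZE;
		c.len = SKETCH_PAGE_SIZE - c.offset;
	}
	if (c.len > remaining)
		c.len = remaining;
	return (c);
}

/* Count how many chunks are needed to cover len bytes starting at pos. */
static size_t
sketch_count_chunks(size_t pos, size_t len, size_t npages, bool use_head)
{
	size_t n = 0;

	while (len > 0) {
		struct sketch_chunk c =
		    sketch_next_chunk(pos, len, npages, use_head);
		pos += c.len;
		len -= c.len;
		n++;
	}
	return (n);
}
```

In this model, covering the rest of a four-page compound page from byte
offset 512 takes one chunk with the head page and four chunks without it,
which is the extra per-page work referred to above.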

There doesn't appear to be an obvious symbol name or config option we
can match on to discover this behaviour in configure (and the mm/page
APIs have changed a lot since then anyway), so I've gone with a simple
version check.
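
As an illustration of what such a version check compares, the sketch below
reproduces the encoding used by the kernel's KERNEL_VERSION() macro from
<linux/version.h> under a hypothetical name, so that it compiles in
userspace. In kernel code the comparison would normally be a preprocessor
test of LINUX_VERSION_CODE against KERNEL_VERSION(4, 5, 0); the helper
function here is an assumption made for the example, not a function from
this change.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Same encoding as KERNEL_VERSION() in current kernels: major and minor
 * are shifted into place and the sublevel is clamped to 255.
 */
#define	SKETCH_KERNEL_VERSION(a, b, c) \
	(((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

/*
 * Hypothetical helper: true when the given version code is 4.5 or later,
 * that is, when head and tail pages share a refcount and the compound
 * head can be used without taking an extra reference.
 */
static bool
sketch_can_use_compound_head(unsigned int version_code)
{
	assert(version_code > 0);
	return (version_code >= SKETCH_KERNEL_VERSION(4, 5, 0));
}
```

Because the sublevel is clamped, a late 4.4.y stable release still compares
as older than 4.5.0.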

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #15533
Closes #15588
2024-03-25 16:51:54 -07:00
abd_os.c abd_iter_page: don't use compound heads on Linux <4.5 2024-03-25 16:51:54 -07:00
arc_os.c Linux 6.7 compat: rework shrinker setup for heap allocations 2023-12-20 11:47:55 -08:00
mmp_os.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
policy.c Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
qat_compress.c Intel QAT 1.7 compatibility 2023-09-07 14:38:17 -07:00
qat_crypt.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
qat.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
spa_misc_os.c Selectable block allocators 2023-09-01 18:00:30 -07:00
trace.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
vdev_disk.c vdev_disk: use bio_chain() to submit multiple BIOs 2024-03-25 16:51:47 -07:00
vdev_file.c Cleanup: 64-bit kernel module parameters should use fixed width types 2022-10-13 10:03:29 -07:00
vdev_label_os.c RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
zfs_acl.c Fixed parameter passing error when calling zfs_acl_chmod 2024-02-26 11:41:44 -08:00
zfs_ctldir.c Linux 6.7 compat: use inode atime/mtime accessors 2023-12-20 11:47:40 -08:00
zfs_debug.c RAID-Z expansion feature 2023-11-08 10:19:41 -08:00
zfs_dir.c Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
zfs_file_os.c Cleanup: Remove branches that always evaluate the same way 2022-11-03 10:47:48 -07:00
zfs_ioctl_os.c Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
zfs_racct.c module: zfs: fix unused, remove argsused 2021-12-23 09:42:47 -08:00
zfs_sysfs.c Introduce kmem_scnprintf() 2022-10-29 13:05:11 -07:00
zfs_uio.c zvol: Remove broken blk-mq optimization 2023-10-24 14:37:52 -07:00
zfs_vfsops.c Linux 6.7 compat: handle superblock shrinker member change 2023-12-20 11:47:50 -08:00
zfs_vnops_os.c Fix corruption caused by mmap flushing problems 2024-03-25 14:56:49 -07:00
zfs_znode.c Linux 6.7 compat: use inode atime/mtime accessors 2023-12-20 11:47:40 -08:00
zio_crypt.c ZIL: Assert record sizes in different places 2023-11-28 13:35:14 -08:00
zpl_ctldir.c Linux 6.6 compat: generic_fillattr has a new u32 request_mask added at arg2 2023-09-21 18:38:40 -07:00
zpl_export.c Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zpl_file_range.c Linux 6.8 compat: use splice_copy_file_range() for fallback 2024-03-20 16:46:15 -07:00
zpl_file.c Fix corruption caused by mmap flushing problems 2024-03-25 14:56:49 -07:00
zpl_inode.c Linux 6.7 compat: use inode atime/mtime accessors 2023-12-20 11:47:40 -08:00
zpl_super.c Unify arc_prune_async() code 2023-10-30 16:56:04 -07:00
zpl_xattr.c Linux 6.6 compat: use inode_get/set_ctime*(...) 2023-09-21 18:38:31 -07:00
zvol_os.c ZVOL: Minor code cleanup 2023-11-27 13:16:59 -08:00