mirror_zfs/include/os/linux/zfs/sys
Umer Saleem 27e8f56102
Fix inconsistent mount options for ZFS root
While mounting ZFS root during boot on Linux distributions from initrd,
the busybox mount is effectively used, which executes the mount system
call directly. This skips the ZFS helper mount.zfs, which checks and
enables the mount options specified in dataset properties. As a result,
datasets mounted during boot from initrd do not have the mount options
specified in their ZFS dataset properties.

There was an attempt to use mount.zfs in the zfs initrd script, which is
responsible for mounting the ZFS root filesystem (PR#13305). This was
later reverted (PR#14908) after discovering that using mount.zfs breaks
mounting of snapshots on root (/), and that other child datasets of root
have the same issue (Issue#9461).

This happens because mount.zfs, unlike busybox mount, correctly parses
the mount options but also adds 'mntpoint=/root' to them, which is then
prepended to the snapshot mountpoint in '.zfs/snapshot'. '/root' is the
directory where Debian with initramfs-tools mounts the root filesystem
before pivot_root. Once the Linux runtime is reached, accessing the
snapshots on root triggers an automount of the snapshot on
'/root/.zfs/*', which fails.
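
The effect of the stale prefix can be illustrated with a small userspace
sketch. This is not the OpenZFS code: snapshot_automount_path() is a
hypothetical helper, and the assumption is only that the automount target
is formed by prepending the recorded mountpoint to '.zfs/snapshot/<name>'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an OpenZFS function): build the path at which
 * a snapshot would be automounted, given the mountpoint that was recorded
 * when the dataset was mounted.
 *
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int
snapshot_automount_path(const char *recorded_mntpoint, const char *snapname,
    char *buf, size_t buflen)
{
	/* A mountpoint of "/" must not produce a leading "//". */
	const char *prefix =
	    (strcmp(recorded_mntpoint, "/") == 0) ? "" : recorded_mntpoint;
	int n = snprintf(buf, buflen, "%s/.zfs/snapshot/%s", prefix, snapname);

	return ((n >= 0 && (size_t)n < buflen) ? 0 : -1);
}
```

With a recorded mountpoint of '/root', the helper produces a path under
'/root/.zfs/snapshot/', which no longer matches the real location once the
root dataset is visible at '/'. With '/', it produces the expected path
under '/.zfs/snapshot/'.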

This commit attempts to fix the automounting of snapshots on root while
using mount.zfs in the initrd script. Since the mountpoint of a dataset
is stored in the vfs_mntpoint field, we can check whether the current
mountpoint of the dataset and vfs_mntpoint are the same. If they differ,
vfs_mntpoint is reset to the current mountpoint. This corrects the
vfs_mntpoint fields of the root dataset and its children when we try to
access their snapshots. With the correct mountpoints stored in
vfs_mntpoint, all snapshots of the root dataset are mounted correctly
and become accessible.

This fix comes into play only if the current process trying to access
the snapshots is not in a chroot context. The Linux kernel API used to
convert a struct path into a string (d_path) returns the complete path
for the given struct path. It works in a chroot environment as well and
returns the correct path from the original filesystem root.

However, d_path fails to return the complete path if any directory from
the original root filesystem is mounted with the --bind or --rbind flag
in the chroot environment. In this case, if we try to access the
snapshot from outside the chroot environment, d_path returns the correct
path, i.e. the path to the directory that is mounted with --bind. Inside
the chroot environment, however, it returns only the path inside the
chroot.

For now, to my understanding, there is no better way to obtain the
complete path as a string that also handles the case where directories
from the root filesystem are mounted with --bind or --rbind on another
path which the user later chroots into. So this fix is enabled only if
the current process trying to access the snapshot is not in a chroot
context.
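
The check and the chroot restriction described above can be sketched in
userspace C as follows. This is a simplified model under stated
assumptions, not the code from this commit: struct vfs_sketch and
refresh_mntpoint() are hypothetical names, only the vfs_mntpoint field
name comes from the description, and the caller is assumed to supply the
current path (what d_path would return) and the chroot status.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the per-mount state (hypothetical type). */
struct vfs_sketch {
	char *vfs_mntpoint;	/* mountpoint recorded at mount time */
};

/*
 * Hypothetical helper: reset vfs_mntpoint when it no longer matches the
 * path at which the dataset is currently mounted.
 *
 * Returns 1 if the field was updated, 0 if it was left alone, and -1 if
 * memory could not be allocated.
 */
static int
refresh_mntpoint(struct vfs_sketch *vfs, const char *current_path,
    bool in_chroot)
{
	size_t len;
	char *copy;

	/* Inside a chroot the reported path may be incomplete; skip. */
	if (in_chroot)
		return (0);

	/* Nothing to do when the recorded mountpoint is already right. */
	if (vfs->vfs_mntpoint != NULL &&
	    strcmp(vfs->vfs_mntpoint, current_path) == 0)
		return (0);

	len = strlen(current_path) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return (-1);
	memcpy(copy, current_path, len);

	/* Replace the stale value with the current mountpoint. */
	free(vfs->vfs_mntpoint);
	vfs->vfs_mntpoint = copy;
	return (1);
}
```

In this model, a dataset mounted from initrd with vfs_mntpoint set to
'/root' is updated to '/' the first time a process outside a chroot
triggers the check, and is left untouched when the caller is in a chroot.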

With the snapshot issue fixed for the root filesystem, using mount.zfs
in the ZFS initrd script mounts the datasets with the correct mount
options.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #16646
2024-10-17 09:09:39 -04:00
abd_impl_os.h abd_os: break out platform-specific header parts 2024-08-21 13:37:18 -07:00
abd_os.h Adding Direct IO Support 2024-09-14 13:47:59 -07:00
policy.h Linux 6.3 compat: idmapped mount API changes 2023-04-10 14:15:36 -07:00
trace_acl.h Linux: use filemap_range_has_page() 2023-02-14 11:04:34 -08:00
trace_arc.h ARC: Remove b_bufcnt/b_ebufcnt from ARC headers 2023-10-06 08:56:17 -07:00
trace_common.h zio: remove io_cmd and DKIOCFLUSHWRITECACHE 2024-04-11 17:17:11 -07:00
trace_dbgmsg.h Linux 6.10 compat: Fix tracepoints definitions 2024-09-17 13:38:02 -07:00
trace_dbuf.h Linux 6.10 compat: Fix tracepoints definitions 2024-09-17 13:38:02 -07:00
trace_dmu.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_dnode.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_multilist.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_rrwlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_txg.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_vdev.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_zfs.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_zil.h ZIL: Update Linux tracing after #15635 2024-01-08 16:49:39 -08:00
trace_zio.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
trace_zrlock.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_bootenv_os.h zfs label bootenv should store data as nvlist 2020-09-15 15:42:27 -07:00
zfs_context_os.h Replace dead opensolaris.org license link 2022-07-11 14:16:13 -07:00
zfs_ctldir.h snapdir: add 'disabled' value to make .zfs inaccessible 2024-10-02 09:12:02 -07:00
zfs_dir.h zfs_rename: restructure to have cleaner fallbacks 2022-10-28 09:48:58 -07:00
zfs_vfsops_os.h Fix inconsistent mount options for ZFS root 2024-10-17 09:09:39 -04:00
zfs_vnops_os.h Support for longnames for files/directories (Linux part) 2024-10-01 13:40:27 -07:00
zfs_znode_impl.h config: remove HAVE_INODE_TIMESPEC64_TIMES 2024-09-18 11:23:50 -07:00
zpl.h config: remove HAVE_INODE_TIMESPEC64_TIMES 2024-09-18 11:23:50 -07:00