mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-05-23 10:54:35 +03:00

Author	SHA1	Message	Date
Tony Hutter	91c87648a7	[2.4.2-only] GCC: Fix uu_ident.c strchr() Convert 'char *' to 'const char *' to make GCC happy on Fedora 44. Signed-off-by: Tony Hutter <hutter2@llnl.gov>	2026-05-05 15:41:43 -07:00
Prakash Surya	0b58f1db89	libspl/mnttab: follow symlinks when resolving path via statx (#18469) When the path argument to "zfs list -Ho name <path>" (or any caller of zfs_path_to_zhandle()) is a symlink that crosses a mount boundary, the wrong dataset is returned. Instead of returning the dataset that owns the symlink's target, getextmntent() matches the dataset containing the symlink itself. For example, given two ZFS datasets "tank/ds1" and "tank/ds2", and a symlink "/tank/ds1/link" pointing into "/tank/ds2": $ sudo zfs list -Ho name /tank/ds1/link tank/ds1 The expected (and previous) behavior is to return "tank/ds2", since the symlink's target resides in that dataset. The problem is in getextmntent(), in lib/libspl/os/linux/mnttab.c. That function calls statx() on the caller-supplied path to obtain its mnt_id (used to match against the mnt_id of each entry in /proc/self/mounts), and it passes AT_SYMLINK_NOFOLLOW to that statx() call. As a result, the mnt_id returned reflects the symlink's location rather than the symlink target's mount, and the wrong /proc/self/mounts entry is matched. The same function also calls stat64() on the caller-supplied path (used as a fallback when STATX_MNT_ID is not available, and to populate the statbuf out-parameter). stat64() always follows symlinks, so the statx() and stat64() calls were inconsistent: one resolved the symlink, the other didn't. The AT_SYMLINK_NOFOLLOW behavior may be appropriate when statx() is called on a mount entry from /proc/self/mounts (which is always a real directory), but it is wrong for caller-supplied paths, which may be symlinks. This bug was introduced by `523d9d6007` ("Validate mountpoint on path-based unmount using statx"), which added the STATX_MNT_ID code path. However, the bug was latent: config/user-statx.m4 omitted "#define _GNU_SOURCE" when checking for STATX_MNT_ID in <sys/stat.h>, so HAVE_STATX_MNT_ID was never defined, and the buggy statx() path was never compiled in. getextmntent() always fell back to the dev_t comparison via stat64(), which correctly follows symlinks. The fix to that autoconf check, in `2b930f63f8` ("config: fix STATX_MNT_ID detection"), caused HAVE_STATX_MNT_ID to be properly defined on kernels that support it, activating the broken AT_SYMLINK_NOFOLLOW path for the first time and exposing the regression. The fix is to drop AT_SYMLINK_NOFOLLOW from the statx() call so that symlinks are followed, matching the behavior of stat64() on the same path. Verified with a minimal reproducer: created two ZFS datasets, placed a symlink inside the first pointing into the second, and confirmed that "zfs list -Ho name <symlink>" returns the dataset containing the symlink's target rather than the dataset containing the symlink. Signed-off-by: Prakash Surya <prakash.surya@perforce.com> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>	2026-05-04 13:33:57 -07:00
Brian Behlendorf	3862aadf78	Fix vdev_rebuild_range() tx commit The spa_sync thread waits on ->spa_txg_zio and will set ZIO_WAIT_DONE before running the sync tasks. The dmu_tx_commit() call must be done after we add the child zio to the ->spa_txg_zio parent, otherwise it's possible the child is added after txg_sync has waited. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18276	2026-05-04 13:09:02 -07:00
Akash B	9f92266b76	Fix redundant declaration of dsl_pool_t Remove redundant dsl_pool variable and duplicate spa_get_dsl() call in vdev_rebuild_thread. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Akash B <akash-b@hpe.com> Closes #18263	2026-05-04 13:09:02 -07:00
Brian Behlendorf	7b10409fbf	CI: FreeBSD 15.1 PRERELEASE (#18490) Update freebsd15-0s builder to freebsd15-1s and point it at the 15.1-PRERELEASE tag. The previous freebsd-15.0-STABLE images are no longer available. Additionally, add a freebsd15-0r stanza for the RELEASE. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>	2026-05-04 10:38:46 -07:00
Tony Hutter	7534fa4df7	CI/GCC: Add Fedora 44, fix build errors and threadsappend - Add Fedora 44 to CI tests - Fix build issues from the newer compiler. These are mostly 'char *' to 'const char *' conversions. - Fix threadsappend.c test waiting for the same thread TID twice. This caused the test to hang on F44 (but strangely not other OSs?) Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18478	2026-05-04 10:38:46 -07:00
Rob Norris	65b4a5c551	Linux 7.1: access dentry d_alias directly The d_u union introduced in 3.18 is now anonymous, so we need to detect it and decide the right way to name d_alias. Note that we used to have support for both names to support kernels before 3.18, so this commit is effectively reverting the commit that removed that support, `efc293e371`. Sponsored-by: TrueNAS Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18471	2026-05-04 10:38:46 -07:00
Brian Behlendorf	fc87e269e2	Initialize vr_last_txg for rebuild Only call txg_wait_synced() when rebuild IOs were issued for this metaslab. This is a small optimization since in practice the first metaslab is very likely to have allocations and cause vr_last_txg to be initialized. After this point when processing empty metaslabs txg_wait_synced() is called but with an already committed txg so it will not wait. Still it's better not to call txg_wait_synced() at all when it's not needed. Reviewed-by: Andriy Tkachuk <atkachuk@wasabi.com> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18482	2026-05-04 10:38:46 -07:00
Andriy Tkachuk	76fd64ac9f	Fix rare cksum errors after rebuild Currently, after rebuild (aka sequential resilver), checksum errors can sometimes be seen on the spare vdev or draid spare. On my laptop, it happens 2 to 4 times when running the redundancy_draid_spare1 test in a loop 100 times. It looks like there's a race in vdev_rebuild_thread() when the rebuild of space map ranges is finished and we re-enable allocations from the metaslab too soon: new allocations may happen from that metaslab before the txg with the rebuilt ranges is synced, causing undesirable interference. Solution: wait for the txg to be synced before enabling the metaslab. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com> Signed-off-by: Andriy Tkachuk <atkachuk@wasabi.com> Closes #18307 Closes #18319 Closes #18473	2026-05-04 10:38:46 -07:00
Brian Behlendorf	b0c1dcb531	ZTS: add targeted redundancy_draid_spare exception When sequentially resilvering a dRAID pool it's possible that a few correctable checksum errors will be reported. This is a known issue which is occasionally observed in the CI. Until it's resolved we want the test case to tolerate a few checksum errors in this scenario to prevent false positives in the CI. This change also has the additional side effect of standardizing in one location how the dRAID pool integrity is verified. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #18307 Issue #18319 Closes #18436	2026-05-04 10:38:46 -07:00
Christos Longros	887bfc1a64	build: use pax tar format for make dist Automake's default tar formats (v7 pre-1.18, ustar since) impose path length limits that drop several long test filenames from the release tarball when `make dist` runs. Pax format has no such limit and is read by GNU tar 1.14+ and libarchive/bsdtar. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christos Longros <chris.longros@gmail.com> Closes: #17276 Closes: #18465	2026-04-27 10:57:52 -07:00
Tony Hutter	19354abc53	CI: curl fallback, print killed tests, FreeBSD URL - We've seen occasional 'ERROR 502: Bad Gateway' from the runner trying to download an image with axel. Axel can open multiple connections for a faster download, so maybe that's causing problems. This commit adds in a fallback to curl if the axel download doesn't work. - Update merge_summary.awk to print out killed tests in the summary. We've seen cases where the summary page was red but there were no test failures printed. This is because one of the VMs had too many killed tests, which caused the total test time to run too long and caused the runner to time out qemu-6-tests.sh. When the runner kills off qemu-6-tests.sh, it means we never generate the nice summary page for that VM listing the killed off tests. This commit parses the partial test logs for killed off tests and includes them in the merge_summary.awk output. - Print an error message in the summary page if one of the VMs didn't complete ZTS. This helps draw attention to a VM crash. - FreeBSD sometimes has broken links to their CI image. When that happens, select the newest nightly snapshot image as an alternative. This is needed right now, since the current images in the FreeBSD 16 "current/" directory are returning 404 errors. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18460	2026-04-27 10:57:52 -07:00
Tony Hutter	aa62ae87dd	Fix 'kernel BUG at mm/usercopy.c' Fix a bug where a cgroup-OOM-killed process can cause a panic: usercopy: Kernel memory exposure attempt detected from vmalloc (offset 1007584, size 217120)! kernel BUG at mm/usercopy.c:102! This was caused by zfs_uiomove() not correctly returning EFAULT for short copies. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #15918 Closes #18408	2026-04-27 10:57:52 -07:00
Gality	b8addf9221	dmu_direct: avoid UAF in dmu_write_direct_done() dmu_write_direct_done() passes dmu_sync_arg_t to dmu_sync_done(), which updates the override state and frees the completion context. The Direct I/O error path then still dereferences dsa->dsa_tx while rolling the dirty record back with dbuf_undirty(), resulting in a use-after-free. Save dsa->dsa_tx in a local variable before calling dmu_sync_done() and use that saved tx for the error rollback. This preserves the existing ownership model for dsa and does not change the Direct I/O write semantics. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: gality369 <gality369@example.com> Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Closes #18440	2026-04-27 10:57:52 -07:00
Alek P	7590972f76	Prevent range tree corruption race by updating dnode_sync() Switch to incremental range tree processing in dnode_sync() to avoid unsafe lock dropping during zfs_range_tree_walk(). This also ensures the free ranges remain visible to dnode_block_freed() throughout the sync process, preventing potential stale data reads. This patch: - Keeps the range tree attached during processing for visibility. - Processes segments one-by-one by restarting from the tree head. - Uses zfs_range_tree_clear() to safely handle ranges that may have been modified while the lock was dropped. - adds ASSERT()s to document that we don't expect dn_free_ranges modification outside of sync context. Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Issue #18186 Closes #18235	2026-04-27 10:57:52 -07:00
clefru	b06caaeec4	range_tree: use zfs_panic_recover() for partial-overlap remove zfs_range_tree_remove_impl() used a bare panic() when a segment to be removed was not completely overlapped by an existing tree entry. Every other consistency check in range_tree.c uses zfs_panic_recover(), which respects the zfs_recover tunable and allows pools with on-disk corruption to be imported and recovered. This one call was inconsistent, making the partial-overlap case unrecoverable regardless of zfs_recover. Replace panic() with zfs_panic_recover() so that operators can set zfs_recover=1 to import a corrupted pool and reclaim data, consistent with all other range tree error paths. Related-to: https://github.com/openzfs/zfs/issues/13483 Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Clemens Fruhwirth <clemens@endorphin.org> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Closes #18255	2026-04-27 10:57:52 -07:00
Tony Hutter	9edfdd6e41	[zfs-2.4.2] Whitelist some Makefile.am files from SPDX The Makefile.am files from libshare, libtpool, libunicode, and libuutil do not have SPDX lines. This is because those Makefiles only got SPDX lines after the big Makefile merge in commits like `309006a0c` and `0d44b58d7` (which have not been ported to this branch). Add the Makefiles to the whitelist here so spdxcheck.pl passes. Signed-off-by: Tony Hutter <hutter2@llnl.gov>	2026-04-23 15:08:21 -07:00
Gary Guo	e7524594a9	Fix read corruption after block clone after truncate When copy_file_range overwrites a recent truncation, subsequent reads can incorrectly determine that they are reading a hole instead of reading the cloned blocks. This can happen when the following conditions are met: - Truncate adds blkid to dn_free_ranges - A new TXG is created - copy_file_range calls dmu_brt_clone, which overrides the block pointer and sets DB_NOFILL - Subsequent read, given DB_NOFILL, hits dbuf_read_impl and dbuf_read_hole - dbuf_read_hole calls dnode_block_freed, which returns TRUE because the truncated blkids are still in dn_free_ranges This will not happen if the clone and truncate are in the same TXG, because the block clone would update the current TXG's dn_free_ranges, which is why this bug only triggers under high IO load (such as compilation). Fix this by skipping the dnode_block_freed call if the block is overridden. The fix shouldn't cause an issue when the cloned block is subsequently freed in later TXGs, as dbuf_undirty would remove the override. This requires a dedicated test program as it is much harder to trigger with scripts (this needs to generate a lot of I/O in a short period of time for the bug to trigger reliably). Assisted-by: Gemini:gemini-3.1-pro Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Gary Guo <gary@kernel.org> Closes #18412 Closes #18421	2026-04-23 15:02:27 -07:00
Ameer Hamza	b2602a400a	Fix snapshot automount deadlock during concurrent zfs recv zfsctl_snapshot_mount() holds z_teardown_lock(R) across call_usermodehelper(), which spawns a mount process that needs namespace_sem(W) via move_mount. Reading /proc/self/mountinfo holds namespace_sem(R) and needs z_teardown_lock(R) via zpl_show_devname. When zfs_suspend_fs (from zfs recv or zfs rollback) queues z_teardown_lock(W), the rrwlock blocks new readers, completing the deadlock cycle. Fix by releasing z_teardown_lock(R) after gathering the dataset name and mount path, before any blocking operation. Everything after the release operates on local string copies or uses its own synchronization. The parent zfsvfs pointer remains valid because the caller holds a path reference to the automount trigger dentry. Releasing the lock allows zfs_suspend_fs to proceed concurrently with the mount helper, so dmu_objset_hold in zpl_get_tree can transiently fail with ENOENT during the clone swap. The mount helper fails, EISDIR is returned, and the VFS falls back to the ctldir stub (empty directory) until the next access retries. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #18415	2026-04-23 15:02:23 -07:00
Ameer Hamza	5d569358c8	Fix options memory leak in zfsctl_snapshot_mount Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #18415	2026-04-23 15:02:18 -07:00
mischivus	b40cd91913	Fix s_active leak in zfsvfs_hold() when z_unmounted is true When getzfsvfs() succeeds (incrementing s_active via zfs_vfs_ref()), but z_unmounted is subsequently found to be B_TRUE, zfsvfs_hold() returns EBUSY without calling zfs_vfs_rele(). This permanently leaks the VFS superblock s_active reference, preventing generic_shutdown_super() from ever firing, which blocks dmu_objset_disown() and makes the pool permanently unexportable (EBUSY). Add the missing zfs_vfs_rele() call, guarded by zfs_vfs_held() to handle the zfsvfs_create() fallback path where no VFS reference exists. This matches the existing cleanup pattern in zfsvfs_rele(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: mischivus <1205832+mischivus@users.noreply.github.com> Closes #18309 Closes #18310	2026-04-23 15:02:14 -07:00
Alek P	aba3ed30a3	fix memleak in spa_errlog.c Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Alan Somers <asomers@freebsd.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alek Pinchuk <apinchuk@axcient.com> Closes #18403	2026-04-23 15:02:10 -07:00
Tony Hutter	afc6e08160	CI: Add more debugging to qemu-1-setup.sh - Remove line where we disable stdout at the end of qemu-1-setup.sh - Fix comment switching the 2x75GB -> 1x150GB cases - Add some more debug to the end of the script Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18441	2026-04-23 15:01:25 -07:00
Brian Behlendorf	f99954c01f	CI: tolerate missing artifacts When a VM fails to launch or is unreachable the qemu-7-prepare.sh script will fail to collect the artifacts due to the missing vm* directories. We want to collect as much diagnostic information as possible, when missing create the directory to allow the subsequent steps to proceed normally. Additionally, we don't want to fail if the /tmp/summary.txt file is missing. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18438	2026-04-23 15:01:19 -07:00
Tony Hutter	6cb1e850b2	CI: Do not set scheduler in qemu-1-setup.sh We've seen some qemu-1-setup failures while trying to change the runner's block device scheduler value to 'none': We have a single 150GB block device Setting up swapspace version 1, size = 16 GiB (17179865088 bytes) no label, UUID=7a790bfe-79e5-4e38-b208-9c63fe523294 tee: '/sys/block/s*/queue/scheduler': No such file or directory Luckily, we don't need to set the scheduler anymore on modern kernels: https://github.com/openzfs/zfs/issues/9778#issuecomment-569347505 This commit just removes the code that sets the scheduler. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18437	2026-04-23 15:01:14 -07:00
Brian Behlendorf	eb3331a83e	Linux 7.0 compat: META Update the META file to reflect compatibility with the 7.0 kernel. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18435	2026-04-23 15:01:10 -07:00
Christos Longros	a6b3ff9bab	deb.am: propagate build errors in native-deb targets Replace semicolons with && so build failures are not masked by the subsequent lockfile cleanup. Use trap to ensure the lockfile is removed on both success and failure. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christos Longros <chris.longros@gmail.com> Closes #18206 Closes #18424	2026-04-23 15:01:05 -07:00
Andriy Tkachuk	da44040bbb	draid: fix cksum errors after rebuild with degraded disks Currently, when more than nparity disks get faulted during the rebuild, only first nparity disks would go to faulted state, and all the remaining disks would go to degraded state. When a hot spare is attached to that degraded disk for rebuild creating the spare mirror, only that hot spare is getting rebuilt, but not the degraded device. So when later during scrub some other attached draid spare happens to map to that spare, it will end up with cksum error. Moreover, if the user clears the degraded disk from errors, the data won't be resilvered to it, hot spare will be detached almost immediately and the data that was resilvered only to it will be lost. Solution: write to all mirrored devices during rebuild, similar to traditional/healing resilvering, but only if we can verify the integrity of the data, or when it's the draid spare we are writing to, in which case we are writing to a reserved spare space, and there is no danger to overwrite any good data. The argument that writing only to rebuilding draid spare vdev is faster than writing to normal device doesn't hold since, at a specific offset being rebuilt, draid spare will be mapped to a normal device anyway. redundancy_draid_degraded2 automation test is added also to cover the scenario. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andriy Tkachuk <atkachuk@wasabi.com> Closes #18414	2026-04-23 15:00:46 -07:00
Tony Hutter	eec8b9b929	CI: Disable ZIP file artifacts, update versions The GH artifacts action now lets you disable auto-zipping your artifacts. Previously, GH would always automatically put your artifacts in a ZIP file. This is annoying when your artifacts are already in a tarball. Also update the following action versions: checkout: v4 -> v6, upload-artifact: v4 -> v7, download-artifact: v4 -> v8. Lastly, fix an issue where zfs-qemu-packages now needs to power cycle the VM. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18411	2026-04-23 14:59:56 -07:00
Brian Behlendorf	f4e5eb7e51	CI: set /etc/hostid in zloop runner ztest can enable and disable the multihost property when testing. This can result in a failure when attempting to import an existing pool when multihost=on but no /etc/hostid file exists. Update the workflow to use zgenhostid to create /etc/hostid when not present. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18413	2026-04-23 14:59:52 -07:00
Brian Behlendorf	e9a8c6e080	draid: allow seq resilver reads from degraded vdevs When sequentially resilvering allow a dRAID child to be read as long as the DTLs indicate it should have a good copy of the data and the leaf isn't being rebuilt. The previous check was slightly too broad and would skip dRAID spare and replacing vdevs if one of their children was being replaced. As long as there exists enough additional redundancy this is fine, but when there isn't this vdev must be read in order to correctly reconstruct the missing data. A new test case has been added which exhausts the available redundancy, faults another device causing it to be degraded, and then performs a sequential resilver for the degraded device. In such a situation enough redundancy exists to perform the replacement and a scrub should detect no checksum errors. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Andriy Tkachuk <andriy.tkachuk@seagate.com> Reviewed-by: Akash B <akash-b@hpe.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #18405	2026-04-23 14:59:47 -07:00
Alexander Motin	63b8da8ff7	Linux: Refactor zpl_fadvise() Similar to FreeBSD, stop issuing prefetches on POSIX_FADV_SEQUENTIAL. It should not have these semantics; it should only hint the speculative prefetcher, if an access ever happens later. Instead, after POSIX_FADV_WILLNEED handling, call generic_fadvise(), if available, to do all the generic stuff, including setting f_mode in struct file, that we could later use to control the prefetcher as part of read/write operations. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18395	2026-04-23 14:59:39 -07:00
Tony Hutter	26e9a69fea	CI: Free 35GB of unused files on the runner Free 35GB of unused files, mostly from unused development environments. This helps with the out of disk space problems we were seeing on FreeBSD runners. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18400	2026-04-23 14:59:35 -07:00
Rob Norris	fc285caa84	linux/vfsops: remove zfs_mnt_t, pass directly A cleanup of opportunity. Since we already are modifying the contents of zfs_mnt_t, we've broken any API guarantee, so we might as well go the rest of the way and get rid of it, and just pass the osname and/or the vfs_t directly. It seems like zfs_mnt_t was never really needed anyway; it was added in `1c2555ef92` (March 2017) to minimise the difference to illumos, but zfs_vfsops was made platform-specific anyway in `7b4e27232d`. We also remove setting SB_RDONLY on the caller's flags when failing a read-write remount on a read-only snapshot or pool. Since `0f608aa6ca` the caller's flags have been a pointer back to fc->sb_flags, which are discarded without further ceremony when the operation fails, so the change is unnecessary and we can simplify the call further. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:31 -07:00
Rob Norris	a8942fdb89	linux/super: work around kernels that enforce "forbidden" mount options Before Linux 5.8 (including RHEL8), a fixed set of "forbidden" options would be rejected outright. For those, we work around it by providing our own option parser to avoid the codepath in the kernel that would trigger it. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:27 -07:00
Rob Norris	0b223ef577	linux/super: implement new mount params parser Adds zpl_parse_param and wires it up to the fs_context. This uses the kernel's standard mount option parsing infrastructure to keep the work we need to do to a minimum. We simply fill in the vfs_t we attached to the fs_context in the previous commit, ready to go for the mount/remount call. Here we also document all the options we need to support, and why. It's a lot of history but in the end the implementation is straightforward. Finally, if we get SB_RDONLY on the proposed superblock flags, we record that as the readonly mount option, because we haven't necessarily seen a "ro" param and we still need to know for remount, the `readonly` dataset property, etc. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:22 -07:00
Rob Norris	43eed9ee41	linux/super: match vfs_t lifetime to fs_context vfs_t is initially just parameters for the mount or remount operation, so match them to the lifetime of the fs_context that represents that operation. When we actually execute the operation (calling .get_tree or .reconfigure), transfer ownership of those options to the associated zfsvfs_t. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:18 -07:00
Rob Norris	f5a60b6cae	linux/super: remove zpl_parse_monolithic Final bit of cleanup of the old method. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:14 -07:00
Rob Norris	36ae5a65aa	linux/vfsops: remove old options parser We're working to replace this, and it's easier to drop it outright while we get set up. To keep things compiling, the calls to zfsvfs_parse_options() are replaced with zfsvfs_vfs_alloc(), though without any option parsing at all nothing will work. That's ok, next commits are working towards it. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:09 -07:00
Rob Norris	7843c42b27	linux/vfsops: add vfs_t allocator, make public In a few commits, we're going to need to allocate and free vfs_t from zpl_super.c as well, so let's keep them uniform. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18377	2026-04-23 14:59:02 -07:00
Andriy Tkachuk	9b8ccbd2cb	draid: fix import failure after disks replacements Currently, it's possible that draid vdev asize would decrease after disks replacements when the disk size is a little less than all other disks in the pool. In such situations, import would fail on this check in vdev_open(): /* * Make sure the allocatable size hasn't shrunk too much. */ if (asize < vd->vdev_min_asize) { vdev_set_state(vd, B_TRUE, VDEV_STATE_CANT_OPEN, VDEV_AUX_BAD_LABEL); return (SET_ERROR(EINVAL)); } Solution: fix vdev_draid_min_asize() so that it would round up the required minimal disk capacity to the VDEV_DRAID_ROWHEIGHT. This would refuse replacements with the disks whose size is less than minimally required to avoid draid asize decrement. Note: we also use VDEV_DRAID_ROWHEIGHT in vdev_draid_open() when calculating asize, and that's why we need to round up min_size at vdev_draid_min_asize() to avoid asize drops. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com> Closes #18380	2026-04-23 14:58:57 -07:00
Rob Norris	3ca81f610b	Linux 7.0: ensure LSMs get to process mount options Normally, the kernel gives any LSM registering a `sb_eat_lsm_opts` hook a first look at mount options coming in from a userspace mount request. The LSM may process and/or remove any options. Whatever is left is passed to the filesystem. This is how the dataset properties `context`, `fscontext`, `defcontext` and `rootcontext` are used to configure ZFS mounts for SELinux. libzfs will fetch those properties from the dataset, then add them to the mount options. In `0f608aa6ca` (#18216) we added our own mount shims to cover the loss of the kernel-provided ones. It turns out that if a filesystem provides a `.parse_monolithic` callback, it is expected to do _all_ mount option parameter processing - the kernel will not get involved at all. Because of that, LSMs are never given a chance to process mount options. The `context` properties are never seen by SELinux, nor are any other options targeting other LSMs. Fix this by calling `security_sb_eat_lsm_opts()` in `zpl_parse_monolithic()`, before we stash the remaining options for `zfs_domount()`. Sponsored-by: TrueNAS Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18376	2026-04-23 14:58:50 -07:00
Christos Longros	74052404c6	ci: update FreeBSD CI images from 14.3 to 14.4 Update FreeBSD CI targets from 14.3 to 14.4 in both the QEMU start script and the workflow configuration. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Christos Longros <chris.longros@gmail.com> Closes #18362	2026-04-23 14:58:44 -07:00
John Cabaj	6756fd4740	Linux 7.0: autoconf: Remove copy-from-user-inatomic API checks (#18348) (#18354) This function was removed in `c6442bd3b6`: "Removing old code outside of 4.18 kernsls", but fails at present on PowerPC builds due to the recent inclusion of 6bc9c0a90522: "powerpc: fix KUAP warning in VMX usercopy path" in the upstream kernel, which introduces a use of cpu_feature_keys[], which is a GPL-only symbol. Removing the API check as it doesn't appear necessary. Signed-off-by: John Cabaj <john.cabaj@canonical.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>	2026-04-23 14:58:39 -07:00
Tony Hutter	0d42a6c357	CI: Add ARM builder Do a ZFS build inside of an ARM runner. This only does a simple build, it does not run the test suite. The build runs on the runner itself rather than in a VM, since nesting is not supported on Github ARM runners. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18343	2026-04-23 14:58:34 -07:00
Ameer Hamza	2c861ebcde	CI: Support repository variable override for ZTS OS selection Allow restricting ZTS OS targets by setting the vars.ZTS_OS_OVERRIDE repository variable (e.g. '["debian13"]') to reduce shared runner contention when running the full OS matrix is unnecessary. When unset, the existing ci_type-based OS selection is used unchanged. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #18342	2026-04-23 14:58:28 -07:00
Rob Norris	20b8936c1a	linux/super: flatten zpl_fill_super into zpl_get_tree Target of opportunity; with no other callers, there's no need for it to be a static function. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18339	2026-04-23 14:57:44 -07:00
Rob Norris	04692b29da	linux/super: flatten zpl_mount_impl into zpl_get_tree Target of opportunity; with no other callers, there's no need for it to be a static function. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18339	2026-04-23 14:57:37 -07:00
Rob Norris	7c3f75af2f	linux/super: flatten mount/remount into get_tree/reconfigure With the old API gone, there's no need to massage new-style calls into its shape and call another function; we can just make those handlers work directly. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18339	2026-04-23 14:57:29 -07:00
Rob Norris	0edbfbfb2d	linux/super: remove support for old mount API Removing the HAVE_FS_CONTEXT gates and anything that would be used if it wasn't set. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18339	2026-04-23 14:57:23 -07:00
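
The symlink behavior described in commit 0b58f1db89 can be illustrated with a minimal sketch. This is not the code from lib/libspl/os/linux/mnttab.c: it swaps in the POSIX fstatat() call and compares st_dev/st_ino, as a stand-in for the statx()/mnt_id lookup the commit message describes, and the helper names stat_path() and same_object() are hypothetical. The point it shows is only that AT_SYMLINK_NOFOLLOW makes the call describe the link itself, while omitting the flag makes it describe the link's target.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: stat a path, either following a trailing symlink
 * (follow != 0) or describing the symlink itself (follow == 0).
 * Returns 0 on success, -1 on error (errno set by fstatat()).
 */
static int
stat_path(const char *path, int follow, struct stat *sb)
{
	int flags = follow ? 0 : AT_SYMLINK_NOFOLLOW;

	return (fstatat(AT_FDCWD, path, sb, flags));
}

/*
 * Hypothetical helper: returns 1 if both paths describe the same object
 * (same device and inode), 0 if they differ, -1 if either lookup fails.
 */
static int
same_object(const char *a, const char *b, int follow)
{
	struct stat sa, sb;

	if (stat_path(a, follow, &sa) != 0 || stat_path(b, follow, &sb) != 0)
		return (-1);

	return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino);
}
```

For a symlink and the regular file it points to, same_object() reports a match when symlinks are followed and a mismatch when AT_SYMLINK_NOFOLLOW is used. That mismatch is the analogue of the wrong dataset being matched in the commit's description.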

10495 Commits