9858 Commits

Author SHA1 Message Date
Tony Hutter
4b014840ea Fix double spares for failed vdev
It's possible for two spares to get attached to a single failed vdev.
This happens when you have a failed disk that is spared, and then you
replace the failed disk with a new disk, but during the resilver
the new disk fails, and ZED kicks in a spare for the failed new
disk.  This commit checks for that condition and disallows it.

Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes: #16547
Closes: #17231
(cherry picked from commit f40ab9e399)
2025-05-28 16:00:28 -07:00
Tino Reichardt
cd777ba5ad ZTS: Fix replacement/resilver_restart_001 on FreeBSD
Decrease the RESILVER_MIN_TIME_MS variable from 50 to 20.
So the test, which expects 2 resilver starts, will see them.

Logfile of the seen failures before this fix:
log: NOTE: expected 2 resilver start(s) after offline/online, found 1
log: expected 2 resilver start(s) after offline/online, found 1

The test time also decreases, from around 00:42 to 00:24.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16822
Closes #17279
(cherry picked from commit 3b18877269)
2025-05-28 16:00:28 -07:00
Artem
ad63ab2d90 Sort the blocking snapshots list #12751 (#17264)
When multiple snapshots prevent the destruction/rollback of the
respective dataset/snapshot/volume via zfs destroy or zfs rollback,
the error message does not list the blocking snapshots in order of
creation. This is inconvenient and can be confusing, and it is
inconsistent with the output of zfs list -t snap.

Closes: #12751

Signed-off-by: Artem-OSSRevival <artem.vlasenko@ossrevival.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 27f3d94940)
2025-05-28 16:00:28 -07:00
Aleksandr Liber
edae295af9 Double quote variables to prevent globbing and word splitting
This change goes through and quotes variables where appropriate to
avoid issues with incorrect splitting. The performance tests ran into
an issue with $SUDO_COMMAND splitting incorrectly because it was not
quoted. This change fixes that issue and hopefully gets ahead of any
other similar problems.

Reviewed-by: John Wren Kennedy <jwk404@gmail.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Aleksandr Liber <aleksandr.liber@perforce.com>
Closes #17235
(cherry picked from commit aa46cc9812)
2025-05-28 16:00:28 -07:00
Rob Norris
c85f2fd531 cred: properly pass and test creds on other threads (#17273)
### Background

Various admin operations will be invoked by some userspace task, but the
work will be done on a separate kernel thread at a later time. Snapshots
are an example, which are triggered through zfs_ioc_snapshot() ->
dsl_dataset_snapshot(), but the actual work is from a task dispatched to
dp_sync_taskq.

Many such tasks end up in dsl_enforce_ds_ss_limits(), where various
limits and permissions are enforced. Among other things, it is necessary
to ensure that the invoking task (that is, the user) has permission to
do things. We can't simply check if the running task has permission; it
is a privileged kernel thread, which can do anything.

However, in the general case it's not safe to simply query the task for
its permissions at the check time, as the task may not exist any more,
or its permissions may have changed since it was first invoked. So
instead, we capture the permissions by saving CRED() in the user task,
and then using it for the check through the secpolicy_* functions.

### Current implementation

The current code calls CRED() to get the credential, which gets a
pointer to the cred_t inside the current task and passes it to the
worker task. However, it doesn't take a reference to the cred_t, and so
expects that it won't change, and that the task continues to exist. In
practice that is always the case, because we don't let the calling task
return from the kernel until the work is done.

For Linux, we also take a pointer to the current task, because the
Linux credential APIs for the most part do not check an arbitrary
credential, but rather, query what a task can do. See
secpolicy_zfs_proc(). Again, we don't take a reference on the task, just
a pointer to it.

### Changes

We change to calling crhold() on the task credential, and crfree() when
we're done with it. This ensures it stays alive and unchanged for the
duration of the call.
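
The hold/release pattern described above can be sketched in userspace C. This is a minimal mock, not the OpenZFS or kernel code: `mock_cred_t`, `mock_crhold()`, `mock_crfree()`, `mock_dispatch()` and `mock_run()` are hypothetical names that only imitate the idea of taking a reference when the work is queued and dropping it when the work is done.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted credential. */
typedef struct {
	int refs;	/* number of holders */
	int uid;	/* identity the permission check would look at */
} mock_cred_t;

/* Take and drop a reference; the real crhold()/crfree() are kernel APIs. */
static void mock_crhold(mock_cred_t *cr) { cr->refs++; }
static void mock_crfree(mock_cred_t *cr) { cr->refs--; }

/* Hypothetical deferred task that carries the caller's credential. */
typedef struct {
	mock_cred_t *cr;
} mock_task_t;

/* Called in the user task: hold the credential before handing it off. */
static void
mock_dispatch(mock_task_t *t, mock_cred_t *caller)
{
	mock_crhold(caller);
	t->cr = caller;
}

/* Called later on the worker: use the credential, then release it. */
static int
mock_run(mock_task_t *t)
{
	int uid = t->cr->uid;	/* stands in for the permission check */

	mock_crfree(t->cr);
	t->cr = NULL;
	return (uid);
}
```

Because the task holds its own reference, the credential stays valid for the worker even if the caller drops its reference in the meantime.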

On the Linux side, we change the main policy checking function
priv_policy_ns() to use override_creds()/revert_creds() if necessary to
make the provided credential active in the current task, allowing the
standard task-permission APIs to do the needed check. Since the task
pointer is no longer required, this lets us entirely remove
secpolicy_zfs_proc() and the need to carry a task pointer around as
well.

Sponsored-by: https://despairlabs.com/sponsor/

Signed-off-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Pavel Snajdr <snajpa@snajpa.net>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Kyle Evans <kevans@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit c8fa39b46c)
2025-05-28 16:00:28 -07:00
Tino Reichardt
aa9335bbbc ZTS: Optimize KSM on Linux and remove it for FreeBSD
Don't use KSM on the FreeBSD VMs and optimize KSM settings for
Linux to have faster run times.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17247
(cherry picked from commit ba17cedf65)
2025-05-28 16:00:28 -07:00
Quentin Thébault
4f34e8dcf6 zfs-rollback.8: fix typo in example number
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alexander Ziaee <ziaee@FreeBSD.org>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Quentin Thébault <quentin.thebault@defenso.fr>
Closes #17282
(cherry picked from commit 63de2d2dbd)
2025-05-28 16:00:28 -07:00
Tino Reichardt
a33e8b05ee ZTS: Use Ubuntu default url for cloud-image
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17278
(cherry picked from commit 88ec6c4f40)
2025-05-28 16:00:28 -07:00
Alexander Motin
fbff1ae9f6 ZTS: Make zvol_stress write some more
Sometimes it fails, unable to see any injected write errors.
I guess writing 25KB of zeroes might not be enough to trigger
errors with the probability set to 10%.  Let's try to write more.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17270
(cherry picked from commit d947b9aedd)
2025-05-28 16:00:28 -07:00
Alexander Motin
273db246a4 ZTS: Reduce extra caching in pool_checkpoint (#17268)
Those tests are write-mostly at the nested pool.  Considering we have
3 more layers of caching underneath, we can hint ZFS how to use the
memory better by setting primarycache=metadata.

While there, add the missing zpool sync after rm in checkpoint_capacity.
It is needed before the freed space could become visible, were there
no pool checkpoint.

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 1ef706c4ad)
2025-05-28 16:00:28 -07:00
Sebastian Pauka
28f0c5cfdc Support using llvm-libunwind
This commit adds support for using llvm-libunwind for kernels built
using llvm and clang. The differences are that the largest register
index is given by _LIBUNWIND_HIGHEST_DWARF_REGISTER, that we need to
check whether the register is a floating point register, and that the
prototype for unw_regname takes the unwind cursor as the first argument.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Sebastian Pauka <me@spauka.se>
Closes #17230
(cherry picked from commit 1b4826b9a2)
2025-05-28 16:00:28 -07:00
Brian Atkinson
a77d641f01 Export correct symbols for Lustre Direct I/O
Originally the Lustre ZFS OSD code was going to use zfs_uio_t structs
for supporting Direct I/O with ZFS. However, this has changed to using
abd_t structs instead. This exports the proper symbols that will be used
by the Lustre ZFS OSD code.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Atkinson <batkinson@lanl.gov>
Closes #17256
(cherry picked from commit 7031a48c70)
2025-05-28 16:00:28 -07:00
Artem-OSSRevival
0956fd736c Add more descriptive destroy error message
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Artem-OSSRevival <artem.vlasenko@ossrevival.org>
Fixes: #14538
Closes: #17234
(cherry picked from commit 37a3e26552)
2025-05-28 16:00:28 -07:00
Alexander Motin
7bb7ff7b49 ZTS: Fix 256MB file leak in zed_cksum_reported
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes: #17267
(cherry picked from commit 38c3a8be83)
2025-05-28 16:00:28 -07:00
Tino Reichardt
658526db99 ZTS: Update FreeBSD version numbers
All defined variants:
- freebsd13-4r, freebsd13-5r, freebsd14-1r, freebsd14-2r (RELEASE)
- freebsd13-5s, freebsd14-2s (STABLE)
- freebsd15-0c (CURRENT)

Used for testing:
- freebsd13-4r (RELEASE)
- freebsd14-2s (STABLE)
- freebsd15-0c (CURRENT)

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes: #17260
(cherry picked from commit 6afb405d96)
2025-05-28 16:00:28 -07:00
Alexander Motin
b590bfc6c8 ZTS: Remove fixed sleeps from slog_006_pos
Replace `sleep 15` with `zpool wait`, which should take much less
than the 15 seconds.  And considering it is called 16 times, this
should save us up to 4 minutes total.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes: #17257
(cherry picked from commit 8f2c2dea3c)
2025-05-28 16:00:28 -07:00
Alexander Motin
03ac770008 ZTS: Polish online_offline tests
- Kill workload first for faster cleanup.
 - Use `zpool wait` for resilver instead of `sleep`.
 - Remove irrelevant workload from `online_offline_003_neg`.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes: #17259
(cherry picked from commit cb49e7701f)
2025-05-28 16:00:28 -07:00
Alexander Motin
95df01020d ZTS: Remove ashift setting from dedup_quota test (#17250)
The test writes 1M of 1KB blocks, which may produce up to 1GB of
dirty data.  On top of that ashift=12 likely produces additional
4GB of ZIO buffers during sync process.  On top of that we likely
need some page cache since the pool resides on files.  And finally
we need to cache the DDT.  Not surprising that the test regularly
ends up in OOMs, possibly depending on TXG size variations.
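
The figures above can be reproduced with a back-of-envelope calculation. The sketch below assumes that "1M" means 2^20 blocks and that each 1KB block is padded up to one whole sector of 2^ashift bytes; `padded_total()` is a hypothetical helper written for this illustration, not a ZFS function.

```c
#include <stdint.h>

/*
 * Total bytes when every block is rounded up to a whole number of
 * sectors of (1 << ashift) bytes.
 */
static uint64_t
padded_total(uint64_t nblocks, uint64_t blksz, unsigned ashift)
{
	uint64_t sector = UINT64_C(1) << ashift;
	uint64_t padded = (blksz + sector - 1) / sector * sector;

	return (nblocks * padded);
}
```

Under these assumptions, `padded_total(1 << 20, 1024, 9)` gives 1GiB (no padding with 512-byte sectors) and `padded_total(1 << 20, 1024, 12)` gives 4GiB, which matches the estimate in the message.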

Also replace fio and its pretty strange parameter set with a set of
dd writes and TXG commits, which is just what we need here.

While here, remove compression.  It does nothing here but
waste CI CPU time.

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 1d8f625233)
2025-05-28 16:00:28 -07:00
Alexander Motin
243a46f28d Cleanup VERIFY() macros (#17163)
- Fix VERIFY3B() when given non-boolean values.
 - Map EQUIV() into VERIFY3B(,==,) as equivalent.
 - Tune messages for better readability and to closer match source
code for easier search.  Unify user-space messages with kernel.
 - Tune printed types and remove %px outside of Linux kernel.
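
One plausible reading of the non-boolean problem is that comparing raw values treats two different nonzero values as unequal even though both mean "true". The sketch below illustrates that idea only; `RAW_EQ`, `BOOL_EQ` and `IFF` are hypothetical macros, not the OpenZFS definitions of VERIFY3B() or EQUIV().

```c
#include <stdbool.h>

/* Hypothetical macros for illustration; not the OpenZFS definitions. */

/* Compares raw integer values, so two "true" values can differ. */
#define	RAW_EQ(l, r)	((l) == (r))

/* Converts each side to bool first, so any nonzero value means true. */
#define	BOOL_EQ(l, r)	((bool)(l) == (bool)(r))

/* "a if and only if b" expressed as a boolean equality. */
#define	IFF(a, b)	BOOL_EQ(a, b)
```

With these definitions `RAW_EQ(2, 1)` is false while `BOOL_EQ(2, 1)` is true, and an "if and only if" check reduces to the same boolean comparison.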

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Reviewed-by: @ImAwsumm
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 4866c2fabf)
2025-05-28 16:00:28 -07:00
Rob Norris
7fde3933fb vdev_to_nvlist_iter: ignore draid parameters when matching names (#17228)
Various tools will display draid vdev names with parameters embedded in
them, but would not accept them as valid vdev names when looking them
up, making it difficult to build pipelines involving draid vdevs.

This commit makes it so that if a full draid name is offered for match,
it gets truncated at the first ':' character.
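
The truncation rule can be sketched as follows. This is an illustration under assumptions, not the code from this commit: `strip_draid_params()` is a hypothetical helper and the sample name used in the note below is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy a vdev name into dst, dropping everything
 * from the first ':' onward.  Names without ':' are copied unchanged.
 * Returns the length of the result, or (size_t)-1 if it does not fit.
 */
static size_t
strip_draid_params(const char *name, char *dst, size_t dstlen)
{
	assert(name != NULL && dst != NULL);

	/* strcspn() gives the length of the prefix that contains no ':'. */
	size_t len = strcspn(name, ":");

	if (len >= dstlen)
		return ((size_t)-1);

	memcpy(dst, name, len);
	dst[len] = '\0';
	return (len);
}
```

For an input such as "draid2:4d:11c:1s" the helper yields "draid2"; a name without parameters passes through untouched.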

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 131df3bbf2)
2025-05-28 16:00:28 -07:00
Alexander Motin
c2424f8d1a Improve L2 caching control for prefetched indirects
dbuf_prefetch_impl() should look at the level of the current indirect,
not the target prefetch level.  dbuf_prefetch_indirect_done() should
call dnode_level_is_l2cacheable() if we have dpa_dnode to pass it.
It should fix some false positives and false negatives in L2ARC caching.

While there, fix redacted feature activation assertions.  One was
always true, while another could give a false positive if dpa_dnode
is NULL.

Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17204
(cherry picked from commit a497c5fc8b)
2025-05-28 16:00:28 -07:00
Alexander Motin
40b9ad19cc ZTS: Remove TXG_TIMEOUT from dedup_quota test (#17150)
It seems `fio` in `ddt_dedup_vdev_limit` overwhelms the system
with the amount of dirty data caused by DDT updates within one
TXG due to tiny 1KB records used, while I see no reason for this
test to extend the TXGs beyond default.

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: @ImAwsumm
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
(cherry picked from commit 21850f519b)
2025-05-28 16:00:28 -07:00
Alexander Motin
602fecc316 Prefer embedded blocks to dedup
Ever since embedded blocks were introduced 11 years ago, writing them
has been blocked when dedup is enabled.  After searching through the
modern code I see no reason for this restriction to exist.  At the same
time, embedded blocks are dramatically cheaper.  Even a regular write of
such small blocks would likely be cheaper than deduplication, even
when the deduplication succeeds, let alone when it does not.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17113
(cherry picked from commit 09f4dd06c3)
2025-05-28 16:00:28 -07:00
Alexander Motin
588fa16830 ZAP: Reduce leaf array and free chunks fragmentation
Previous implementation of zap_leaf_array_free() put chunks on the
free list in reverse order.  Also zap_leaf_transfer_entry() and
zap_entry_remove() were freeing name and value arrays in reverse
order.  Together this created a mess in the free list, making
following allocations much more fragmented than necessary.

This patch re-implements zap_leaf_array_free() to keep existing
chunks order, and implements non-destructive zap_leaf_array_copy()
to be used in zap_leaf_transfer_entry() to allow properly ordered
freeing name and value arrays there and in zap_entry_remove().

With this change, a test of some writes and deletes shows the percentage
of non-contiguous chunks in DDT dropping from 61% and 47% to 0% and
17% for arrays and frees respectively.  Sure, some explicit sorting
could do even better, especially for ZAPs with variable-size arrays,
but it would also cost much more, while this should be very cheap.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16766
(cherry picked from commit 9a81484e35)
2025-05-28 16:00:28 -07:00
Tony Hutter
92f430b00f Tag zfs-2.3.2
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-04-30 10:58:28 -07:00
Tony Hutter
2f8fc4a869 RPM: Hold back incompatible kernel packages on Fedora
A user reported that when you upgrade your kernel packages on Fedora
with ZFS installed, only the kernel-devel package gets held back to the
ZFS-supported version, but not the other kernel packages. So if ZFS only
supports the 6.13 kernel, Fedora will still happily upgrade the kernel
RPM to 6.14, but hold back kernel-devel at 6.13, for example.

This commit includes version checks for the 'kernel-uname-r' dependency,
typically provided by the 'kernel-core' package.

Original-patch-by: @jkool702
Reviewed-by: @jkool702
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17265
Closes #17271
2025-04-30 10:58:28 -07:00
Tony Hutter
a39a14eb6e CI: Add Fedora 42 runner (#17249)
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
2025-04-22 12:29:07 -07:00
Tony Hutter
36864e3d77 GCC 15: Fix unterminated-string-initialization (#17244)
Fix build errors on Fedora 42 like:

  module/zcommon/zfs_valstr.c:193:16: error: initializer-string for
  array of 'char' truncates NUL terminator but destination lacks
  'nonstring' attribute (3 chars into 2 available)

The arrays in zpool_vdev_os.c and zfs_valstr.c don't need to be
NUL terminated, but we do so to make GCC happy.
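
The diagnostic quoted above fires when a string literal exactly fills a char array, leaving no room for the terminating NUL. A minimal sketch of the pattern and one way to silence it, using a hypothetical table (`flag_name_t` and its contents are invented for this example and are not taken from zfs_valstr.c):

```c
/*
 * Hypothetical table entry holding a two-character code.
 * With "char code[2]", the initializer "ab" would fill both bytes and
 * leave no room for the terminating NUL, which is what the diagnostic
 * complains about.  One extra byte makes room for it.
 */
typedef struct {
	char code[3];		/* two characters plus the NUL */
	const char *name;
} flag_name_t;

static const flag_name_t flag_names[] = {
	{ "ab", "example-flag-a" },
	{ "cd", "example-flag-b" },
};
```

Growing the array by one byte costs a little space per entry but keeps the initializers readable as plain string literals.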

Closes: #17242

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2025-04-16 09:59:45 -07:00
Rob Norris
fea534d1d0 gcm_avx_init: zero the ghash state after hashing the IV
IVs != 96 bits get hashed with GHASH to bring them to 96 bits. Any call
to GHASH will mix the ghash state in gcm_ghash. This is expected to be
zero at first use in an encrypt or decrypt operation, so it needs to be
zeroed after using GHASH in setup.

gcm_init() does this, but gcm_avx_init() zeroed it before setup, not
after, resulting in incorrect encrypt/decrypt results when using AVX GCM
with an IV != 96 bits.

OpenZFS _always_ uses a 96 bit IV (ZIO_DATA_IV_LEN) so this will never
have been hit in any real-world use, which is extremely fortunate, as we
would have incorrectly-encrypted data on-disk. Still, as long as we have
this code here we should make sure it's correct.

Thanks-to: Joel Low <joel@joelsplace.sg>
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
2025-04-16 09:59:45 -07:00
IIIPr0t0typ3III
cc43549b08 Fixed zfs_notify_email for programs like sendmail
zfs_notify_email will now include an empty line separating the header
from the body of the email in case the subject is not provided via a
command line argument. This is necessary for programs like sendmail to
function correctly (everything up to the first empty line is interpreted
as header, which previously resulted in either missing message parts or
unsent emails)

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Felix Schmidt <felixschmidt20@aol.com>
Closes #17238
2025-04-16 09:59:45 -07:00
Rob Norris
04b02f0663 config: fix ZFS_LINUX_TEST_RESULT_SYMBOL with --enable-linux-builtin
The tiniest typo in dd2a46b5e6 (#17106) broke it, by setting the wrong
var with the test var, resulting in it always producing "no".

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17236
2025-04-16 09:59:45 -07:00
Tony Hutter
20f00819f3 Linux 6.0 compat: Check for migratepage VFS (#17217)
The 6.0 kernel removes the 'migratepage' VFS op. Check for
migratepage.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
2025-04-16 09:59:45 -07:00
Tony Hutter
81de1eae4c debian: Add libtirpc-dev dependency (#17220)
Debian requires libtirpc-dev.  Update our debian/control file to
match Debian's upstream one.

Closes: #17197

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: @manfromafar
2025-04-16 09:59:45 -07:00
Richard Kojedzinszky
5952fc15b9 Fix memory leaks in pool properties handling
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Kojedzinszky <richard@kojedz.in>
Closes #17208
2025-04-16 09:59:45 -07:00
Syed Shahrukh Hussain
a486cac359 Added fix for zpool get state segfaults with two or more vdevs (#15972). (#17213)
The problem was in the handling of the zpool get state command line
arguments. A pointer, vdev, was set to argv[1], and the address of that
pointer was stored in cb.cb_vdevs.cb_names (a pointer to an array of
strings), so any increment of cb_names resulted in a segfault. The fix
covers the special case of a root parameter at argv[1]; the remaining
cases are handled by passing in argv + 1, which lets cb_names iterate
over the following command line arguments (vdevs).
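
The difference between the two approaches can be sketched outside of zpool. The types and names below (`name_list_t`, `name_list_from_args()`) are hypothetical and only loosely modeled on the description above.

```c
#include <stddef.h>

/* Hypothetical callback state holding a list of vdev names. */
typedef struct {
	char **names;	/* array of name strings */
	int count;	/* number of valid entries in names */
} name_list_t;

/*
 * Broken pattern, shown as a comment only:
 *
 *	char *vdev = argv[1];
 *	nl->names = &vdev;	// array of exactly one element
 *	nl->names[1];		// out of bounds: reads past 'vdev'
 *
 * Working pattern: point at the argument vector itself, which is a
 * contiguous array, so the entries after the first can be indexed.
 */
static void
name_list_from_args(name_list_t *nl, int argc, char **argv)
{
	nl->names = argv + 1;	/* skip the first argument */
	nl->count = argc - 1;
}
```

With three arguments, the list ends up with two entries that alias the second and third elements of the argument vector.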

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>

Signed-off-by: Syed Shahrukh Hussain <syed.shahrukh@ossrevival.org>
2025-04-16 09:59:45 -07:00
Paul Dagnelie
fbac52e1e9 Fix FDT rollback to not overwrite unnecessary fields (#17205)
When a dedup write fails, we try to roll the DDT entry back to a known
good state. However, this also rolls the refcounts and the last-update
time back to the state they were at when we started this write. This
doesn't appear to be able to cause any refcount leaks (after the fix in
17123). This PR prevents that from happening by only rolling back the
parts of the DDT entry that have been updated by the write so far.

Sponsored-by: iXsystems, Inc.
Sponsored-by: Klara, Inc.

Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-04-16 09:59:45 -07:00
Rob Norris
8539bdf568 [2.3.2] uconv: add SPDX license tag
Signed-off-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-04-16 09:59:45 -07:00
Martin Matuška
c312a988b5 freebsd: unbreak module/Makefile.bsd build on 15-CURRENT-arm64
- don't include foreign machine assembly files
- reduce diff to FreeBSD module Makefile

Discovered in FreeBSD port filesystems/openzfs-kmod

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #17219
2025-04-16 09:59:45 -07:00
Paul Dagnelie
bd5465e4eb Fix nonrot property being incorrectly unset (#17206)
When opening a vdev and setting the nonrot property, we used to wait for
each child to be opened before examining its nonrot property. When the
change was made to open vdevs asynchronously, we didn't move the nonrot
check out of the main loop. As a result, the nonrot property is almost
always set to false, regardless of the actual type of the underlying
disks. The fix is simply to move the nonrot check to a separate loop
after the taskq has been waited for.

Sponsored-by: Klara, Inc.
Sponsored-by: Eshtek, Inc.

Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
2025-04-16 09:59:45 -07:00
Martin Matuška
5fb1d520fe Multiple printf() size fixes (#17199)
cmd/zinject/zinject.c:
 - use PRIu64 when printing uint64_t

tests/zfs-tests/cmd/clonefile.c:
 - use an unsigned long long to store result from strtoull()
 - use %jd for printing off_t, %zu for size_t, %zd for ssize_t

tests/zfs-tests/tests/functional/vdev_disk/page_alignment.c:
 - use %zx to print size_t

Discovered when compiling on FreeBSD i386.
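
The conversions listed above matter on 32-bit targets, where `long`, `size_t` and `off_t` do not all share one width. A minimal sketch of the portable forms; `format_sizes()` is a hypothetical helper written for this illustration.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format one value of each type with a conversion that matches its
 * width on every platform:
 *   uint64_t -> PRIu64
 *   off_t    -> %jd, after a cast to intmax_t
 *   size_t   -> %zu (decimal) or %zx (hex)
 *   ssize_t  -> %zd
 * Returns what snprintf() returns.
 */
static int
format_sizes(char *buf, size_t buflen, uint64_t u, off_t off, size_t sz,
    ssize_t ssz)
{
	return (snprintf(buf, buflen, "%" PRIu64 " %jd %zu %zd %zx",
	    u, (intmax_t)off, sz, ssz, sz));
}
```

The cast to `intmax_t` is what makes `%jd` correct for `off_t`, whose width varies between platforms and build options.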

Signed-off-by: Martin Matuska <mm@FreeBSD.org>

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: @ImAwsumm
2025-04-16 09:59:45 -07:00
Alexander Motin
6f2080f1ab Fix lock reversal on device removal cancel
FreeBSD kernel's WITNESS code detected a lock ordering violation in
spa_vdev_remove_cancel_sync().  It took svr_lock while holding
ms_lock, which is the opposite of the order used elsewhere.  I was
thinking of resolving it similarly to #17145, but looking closer I
don't think we even need svr_lock at that point, since we already
asserted that svr_allocd_segs is empty, and we don't need to add to it
the segments we are going to call free_mapped_segment_cb for.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17164
2025-04-16 09:59:45 -07:00
Paul Dagnelie
9f0be8fca0 Fix dspace underflow bug
Since spa_dspace accounts only for normal allocation class space,
spa_nonallocating_dspace should do the same.  Otherwise we may get a
negative overflow or the respective assertion in spa_update_dspace() if
a removed special/dedup vdev is bigger than all normal class space.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17183
2025-04-16 09:59:45 -07:00
Piotr Kubaj
12657df52a simd_powerpc.h: enable FPU on FreeBSD
FreeBSD nowadays supports FPU in the kernel on powerpc*, so enable it.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Piotr Kubaj <pkubaj@FreeBSD.org>
Closes #17191
2025-04-16 09:59:45 -07:00
aokblast
153c982aac spl_vfs: fix vrele task runner signature mismatch
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: SHENGYI HONG <aokblast@FreeBSD.org>
Closes #17101
2025-04-16 09:59:45 -07:00
Attila Fülöp
398bdcd884 ZTS: Fix zpool dry run tests output formatting
Signed-off-by: Attila Fülöp <attila@fueloep.org>
2025-04-16 09:59:45 -07:00
Attila Fülöp
a9c37b7c38 ZTS: Fix zpool dry run tests depending on output format
Signed-off-by: Attila Fülöp <attila@fueloep.org>
2025-04-16 09:59:45 -07:00
Friedrich Weber
1c6d2f71aa contrib/initramfs: use LVM autoactivation for activating VGs (#17125)
Currently, the zfs initramfs-tools boot script under local-top calls
`vgchange -ay`, which unconditionally activates all logical volumes
(LVs) in all discovered volume groups (VGs). This causes all LVs to be
active after boot. However, users may prefer to not activate certain
VGs/LVs on boot. They might normally use the `--setautoactivation n`
VG/LV flag or the `auto_activation_volume_list` LVM config option to
achieve this, but since the script unconditionally activates all LVs,
neither has an effect.

To fix this, call `vgchange -aay` instead. This triggers LVM
autoactivation, which honors autoactivation settings such as the
`--setautoactivation` flag. It is also more in line with the LVM
documentation, which says autoactivation is "meant to be used by
activation commands that are run automatically by the system" [1].

Note that this change might break misconfigured setups that have ZFS
on top of an LV for which autoactivation is disabled.

[1] https://gitlab.com/lvmteam/lvm2/-/blob/cff93e4d/conf/example.conf.in#L1579


Reviewed-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>

Signed-off-by: Friedrich Weber <f.weber@proxmox.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-04-16 09:59:45 -07:00
Ameer Hamza
ab455c7b80 zed: Ensure spare activation after kernel-initiated device removal
In addition to hotplug events, the kernel may also mark a failing vdev
as REMOVED. This was observed in a customer report and reproduced by
forcing the NVMe host driver to disable the device after a failed reset
due to command timeout. In such cases, the spare was not activated
because the device had already transitioned to a REMOVED state before
zed processed the event.
To address this, explicitly attempt hot spare activation when the
kernel marks a device as REMOVED.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17187
2025-04-16 09:59:45 -07:00
Rob Norris
76bd2ae5c8 config: cache results of kernel checks (#17106)
Kernel checks are the heaviest part of the configure checks. This allows
the results to be cached through the normal autoconf cache.

Since we don't want to reuse cached values for different kernels, but
don't want to discard the entire cache on every kernel, we instead add a
short checksum to kernel config cache keys, based on the version and
path, so the cache can hold results for multiple different kernels.

Sponsored-by: https://despairlabs.com/sponsor/

Signed-off-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-04-16 09:59:45 -07:00
Alexander Motin
e6f8c1f612 Block remap for cloned blocks on device removal
When handling block pointer remaps after device removal, skip blocks
that might be cloned.  BRTs are indexed by vdev id and offset from the
block pointer's DVA[0].  So if we start addressing the same block by
a different DVA, we won't get the proper reference counter.  As a
result, we might either remap the block twice, which may trip an
assertion during indirect mapping condense, or free it prematurely,
which may result in data being overwritten, or free it twice, which
may trip an assertion in the spacemap code.

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
Sponsored by:   iXsystems, Inc.
Closes #15604
Closes #17180
2025-04-16 09:59:45 -07:00