mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs.git synced 2026-04-17 08:54:52 +03:00

Author	SHA1	Message	Date
Christos Longros	6a717f31e6	Improve misleading error messages for ZPOOL_STATUS_CORRUPT_POOL When devices are missing or claimed by another subsystem (e.g. mdadm, LVM), zpool import reports "The pool metadata is corrupted" and suggests destroying the pool. This is misleading because the metadata is not necessarily corrupted -- it may simply be incomplete due to inaccessible devices. Update the status, action, and recovery messages to acknowledge that missing devices can trigger this status, and suggest checking device availability before resorting to pool destruction. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Chris Longros <chris.longros@gmail.com> Closes #18251 Closes #8236	2026-02-23 09:41:24 -08:00
Tony Hutter	d2f5cb3a50	Move range_tree, btree, highbit64 to common code Break out the range_tree, btree, and highbit64/lowbit64 code from kernel space into shared kernel and userspace code. This is needed for the updated `zpool status -vv` error byte range reporting that will be coming in a future commit. That commit needs the range_tree code in kernel and userspace. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #18133	2026-02-22 11:43:51 -08:00
Rob Norris	d11c661544	zdb: handle key load/derive failures a bit more gracefully There's no real need to outright crash if key loading fails; we can just unwind nicely. Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18230	2026-02-20 13:37:43 -08:00
Rob Norris	9f874ad092	zdb: don't try to load key for unencrypted dataset Previously using -K/--key on an unencrypted dataset would trip a VERIFY, because the dataset has nowhere to load the key into. Now, just ignore it. This makes zdb much easier to drive when there's a mix of encrypted and non-encrypted datasets, as the key can be provided for all of them (at least, assuming the same encryption root, which is a common enough case). Sponsored-by: TrueNAS Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Rob Norris <rob.norris@truenas.com> Closes #18230	2026-02-20 13:37:11 -08:00
Tim Hatch	64bae56b00	Include missing newline in 'man' error Because the `strerror` result doesn't include a newline, we need to add one. Observed on a minimal system that doesn't have `man` installed, which behaves like this before the fix: ``` [root@upper tim]# zpool help import couldn't run man program: No such file or directory[root@upper tim]# ``` Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tim Hatch <tim@timhatch.com> Closes #18183	2026-02-09 10:19:08 -08:00
Alexander Motin	2646bd5585	Allow rewrite skip cloned and snapshotted blocks Rewrite of cloned and snapshotted blocks can allocate additional space, which may be undesired. In some cases it may make sense to still rewrite snapshotted blocks, expecting the snapshots to rotate with time, freeing space. In other cases rewrite of cloned blocks may be acceptable, despite persistent space usage increase. For this reason add them as separate flags to `zfs rewrite`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18179	2026-02-09 10:17:56 -08:00
Brian Behlendorf	d4c0e52188	zhack: add "action idle" subcommand In order to reliably test the multihost protection we need two (or more) systems attempting to import the pool at the same time. Historically, we've used ztest running in userspace to simulate an active pool and attempted to import the pool with the kernel modules. This works but ztest is a bit unwieldy for this and if it crashes for unrelated reasons it can result in false positives. All we really need is the pool imported in userspace so the MMP thread is active and writing out uberblocks. We can extend zhack which already knows how to import the pool read/write and add an option to leave the pool open and idle. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com>	2026-02-09 09:36:14 -08:00
Brian Behlendorf	731ff0a5ac	zhack: add -G option to dump debug buffer Add a -G option to zhack to dump the internal debug buffer on exit. We were able to use the same code from zdb for this which was nice. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com>	2026-02-09 09:36:10 -08:00
Brian Behlendorf	20176224ee	mmp: claim sequence id before final import As part of SPA_LOAD_IMPORT add an additional activity check to detect simultaneous imports from different hosts. This check is only required when the timing is such that there's no activity for the read-only tryimport check to detect. This extra safety check operates as follows: 1. Repeats the following MMP check 10 times: a. Write out an MMP uberblock with the best txg and a random sequence id to all primary pool vdevs. b. Verify a minimum number of good writes such that even if the pool appears degraded on the remote host it will see at least one of the updated MMP uberblocks. c. Wait for the MMP interval; this leaves a window for other racing hosts to make similar modifications which can be detected. d. Call vdev_uberblock_load() to determine the best uberblock to use; this should be the MMP uberblock just written. e. Verify the txg and random sequence number match the MMP uberblock written in 1a. 2. Restore the original MMP uberblocks. This allows the check to be performed again if the pool fails to import for an unrelated reason. This change also includes some refactoring and minor improvements. - Never try loading earlier txgs during import when the import fails with EREMOTEIO or EINTER. These errors don't indicate the txg is damaged but instead that it's either in use on a remote host or the import was interactively cancelled. No rewind is also performed for EBADD which can result from a stale trusted config when doing a verbatim import. - Refactor the code for consistent logging of the multihost activity check using spa_load_note() and console messages indicating when the activity check was triggered and the result. - Added MMP_*_MASK and MMP_SEQ_CLEAR() macros to allow easier modification of the sequence number in an uberblock. - Added ZFS_LOAD_INFO_DEBUG environment variable which can be set to dump to stdout the spa_load_info nvlist returned during import. This is used by the updated mmp test cases to determine if an activity check was run and its result. - Standardize the mmp messages similarly to make it easier to find all the relevant mmp lines in the debug log. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com>	2026-02-09 09:36:01 -08:00
Rob Norris	85391ee931	build: add SPDX license tags to build system files Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18077	2026-01-08 15:08:03 -08:00
Ivan Shapovalov	dbb3f247ed	cmd/zfs: clone: accept `-u` to not mount newly created datasets Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #18080	2026-01-05 12:21:56 -05:00
Alexander Moch	e72f3054e3	cmd/ztest: avoid `PATH_MAX` stack allocation in `ztest_get_zdb_bin()` (#18085 ) Calling realpath(path, buf) can trigger fortified header wrappers that allocate a PATH_MAX-sized temporary buffer on the stack, exceeding the 4 KiB frame limit on some systems. Use the heap-allocating realpath(path, NULL) form instead. Sponsored-by: ERNW Research GmbH - https://ernw-research.de/ Signed-off-by: Alexander Moch <amoch@ernw.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2025-12-29 11:16:34 -08:00
Ivan Shapovalov	3c4193333b	zed.d, contrib: fix shellcheck errors in scripts Not sure why this was not caught by CI; perhaps my shellcheck is new enough to catch more things. Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Ivan Shapovalov	1e7280cece	zfs_main: cosmetic: add missing flag to the comment for create Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>	2025-12-23 11:12:21 -08:00
Rob Norris	0d44b58d7f	libshare: fold into libzfs and reorg headers a little libzfs is the only user of libshare, and only internally, so there's no particular reason to build it separately, nor to export its symbols. So, pull it into libzfs proper, remove its "public" header, and hide its symbols. The bare minimum "public" API is just to count and enumerate the supported share types. These are moved to libzfs.h with the other share API. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #18072	2025-12-19 19:52:33 -08:00
Allan Jude	1d43387dd8	zdb: Add -O option for -r to specify object-id "zdb -r -O pool/dataset obj-id destination" will copy the file with object-id obj-id to the named destination; without -O it'll still be interpreted as a pathname. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com> Signed-off-by: Sean Eric Fagan <sean.fagan@klarasystems.com> Closes #16307	2025-12-18 09:25:09 -08:00
Alexander Motin	4754ac8529	raidz_test: Restore rand_data protection It feels dirty to modify protection of a memory allocated via libc, but at least we should try to restore it before freeing. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17977	2025-12-01 14:34:52 -08:00
Alexander Motin	338d432b42	raidz_test: Fix ZIO ABDs initialization - When filling ABDs of several segments, consider offset. - "Corrupt" ABDs with actually different data to fail something. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17977	2025-12-01 14:34:48 -08:00
Alexander Motin	95b2eb50f2	raidz_test: Set io_offset reasonably - io_offset of 1 makes no sense. Set default to 0. - Initialize io_offset in all cases. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17977	2025-12-01 14:34:43 -08:00
Rob Norris	e37937f42d	ztest: fix broken random call Bad copypasta in `4d451bae8a`, leading to random stuff being blasted all over stack, destroying the program. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Sean Eric Fagan <sean.fagan@klarasystems.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17957	2025-11-24 12:43:15 -05:00
Shreshth3	1f3444f2bb	zpool: fix special vdev -v -o conflict Right now, running `zpool list` with -v and -o passed does not work properly for special vdevs. This commit fixes that problem. See the discussion on #17839. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com> Closes #17932	2025-11-19 08:30:20 -08:00
Rob Norris	71609a9264	zfs: replace tpool with taskq They're basically the same thing; let's just carry one. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17948	2025-11-19 08:16:51 -08:00
Rob Norris	adb316f411	libuutil: remove the whole thing Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17934	2025-11-17 06:23:05 -08:00
Rob Norris	871fa61d26	zfs: replace uu_list with sys/list Let's just use the list implementation we use everywhere else. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17934	2025-11-17 06:22:48 -08:00
Rob Norris	b593748287	zfs: replace uu_avl with sys/avl Let's just use the AVL implementation we use everywhere else. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17934	2025-11-17 06:21:26 -08:00
Toomas Soome	e63d026b91	cmd/zpool cstyle issues add missing headers. usage() is no-return, so anything after call to it is unreachable code. use (void) cast where we do ignore return value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Toomas Soome <tsoome@me.com> Closes #17885	2025-11-14 15:58:50 -08:00
Rob Norris	23d17f3587	libspl/random: add switch to force pseudo-random numbers for all calls ztest wants to force all kernel random calls to use the pseudo-random generator (/dev/urandom), to avoid depleting the system entropy pool just for testing. Up until the previous commit, it did this by switching the path that the libzpool (now libspl) random API would use to get random data from; that is, it took advantage of an implementation detail. Now that that hole is closed to it, we need another method. This commit introduces that; a simple API call to enable/disable "force pseudo" mode. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17861	2025-11-12 10:04:30 -08:00
Rob Norris	4d451bae8a	libspl: hide global data objects Currently libspl is a static archive that is linked into multiple shared objects, which then re-export its symbols. We intend to fix this soon. For the moment though, most programs shipped with OpenZFS depend on two or more of these shared objects, and see the same symbols twice. For functions this is not a problem, as they do not have any mutable state and so the linker can simply select the first one and use that for all. For global data objects however, each shared object will have direct (non-relocatable) references to its own instance of the symbol, such that changes on one will not necessarily be seen by the other. While this shouldn't be a problem in practice as these reexported interfaces are not supposed to be used, they are technically undefined behaviour in C (C17 6.9.2) and are reported by ASAN as a violation of C++'s "One Definition Rule". To fix this, we hide these globals inside their compilation units, and add access functions and macros as appropriate to preserve the existing API (though not ABI). Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17861	2025-11-12 10:04:22 -08:00
Rob Norris	4e3b88927c	libzpool: separate driver-side include Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17861	2025-11-12 10:01:04 -08:00
Alexander Motin	b4f073b5a6	Add BRT support to zpool prefetch command Implement BRT (Block Reference Table) prefetch functionality similar to existing DDT prefetch. This allows preloading BRT metadata into ARC to improve performance for block cloning operations and frees of earlier cloned blocks. Make -t parameter optional. When omitted, prefetch all supported metadata types (both DDT and BRT now). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17890	2025-11-10 16:16:22 -08:00
Rob Norris	6e12f0bd77	spa_misc: add an API for spa_namespace_lock This is useful as debugging support, as it lets namespace lock operations be traced directly. It will also be useful for future work to reduce the use of spa_namespace_lock, traditionally a source of difficult deadlocks. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17906	2025-11-10 14:23:39 -08:00
Tony Hutter	f93506d1df	Linux 6.17 compat: Fix broken projectquota on 6.17 We need to specifically use the FS_XFLAG_* macros in zpl_ioctl_attr() codepaths, and the FS_*_FL macros in the zpl_ioctl_flags() codepaths. The earlier code just assumes the FS_*_FL macros for both codepaths. The 6.17 kernel adds a bitmask check in copy_fsxattr_from_user() that exposed this error via failing 'projectquota' ZTS tests. Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #17884 Closes #17869	2025-11-05 16:22:03 -08:00
Alexander Motin	dcada084b9	Pass flags to more DMU write/hold functions Over time, many DMU functions got a flags argument to control prefetch, caching, etc. A few functions, though, were left without it, even though a closer look showed that many of them do not require prefetch due to their access pattern. This patch adds the flags argument to dmu_write(), dmu_buf_hold_array() and dmu_buf_hold_array_by_bonus(), passing DMU_READ_NO_PREFETCH where applicable. I am going to also pass DMU_UNCACHEDIO to some of them later. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17872	2025-10-29 11:17:51 -07:00
Shreshth3	44704616b4	zpool: fix conflict with -v and -o options Right now, the -v and -o options for `zpool list` work independently, but when paired, the -v "wins out" and the -o effect is lost. This commit fixes that problem. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com> Closes #11040 Closes #17839	2025-10-21 15:10:52 -07:00
Shreshth3	3ea8ca8c0f	zdb: fix bug with -A flag Fixes #10544. According to the manpage, zdb -A should ignore all assertions. But it currently does not do that. This commit fixes this bug. Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17825	2025-10-20 09:30:20 -04:00
Alexander Motin	51de2d76f8	Explicit set ashift for non-leaf vdevs Before this change the ashift property was applied only to leaf vdevs. As a result, it worked only as a minimal value for parent vdevs, since a bigger physical_ashift value reported by any child could be used instead when deciding the parent's ashift, as if the ashift property was never set. This change explicitly passes ZPOOL_CONFIG_ASHIFT to all vdevs, allowing override for parents only if the passed value is below logical_ashift and so unacceptable. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17826	2025-10-13 10:41:02 -07:00
Ivan Shapovalov	f8b082b5af	zdb: adjust block histogram binning strategy Previously, a bin included all blocks _starting_ from given size (e.g., a "4K" bin would include all blocks within the [4K; 8K) region). This is counter-intuitive and does not match the typical use-case of the block histogram (that is, to estimate disk usage considering how ZFS' block allocation works). In other words, if I'm looking at the "4K" row, I'm interested in records that _fit into_ a 4K block. Adjust the binning strategy such that a bin includes all blocks _up to_ given size, such that e.g. a "4K" bin would include all blocks within the (2K; 4K] region. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Closes #16999	2025-10-06 09:35:32 -07:00
Ivan Shapovalov	3a1a22abb4	zdb: factor out block histogram bin number computation Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Closes #16999	2025-10-06 09:35:28 -07:00
Ivan Shapovalov	c0a874fced	zdb: add `--class=(normal|special|...)` to filter blocks by alloc class When counting blocks to generate block size histograms (`-bb`), accept a `--class=` argument (as a comma-separated list of either "normal", "special", "dedup" or "other") to only consider blocks that belong to these metaslab classes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Closes #16999	2025-10-06 09:35:23 -07:00
Ivan Shapovalov	8e97b98140	zdb: add `--bin=(lsize|psize|asize)` arg to control histogram binning When counting blocks to generate block size histograms (`-bb`), accept a `--bin=` argument to force placing blocks into all three bins based on this size. E.g. with `--bin=lsize`, a block with lsize=512K, psize=128K, asize=256K will be placed into the "512K" bin in all three output columns. This way, by looking at the "512K" row the user will be able to determine how well ZFS was able to compress blocks of this logical size. Conversely, with `--bin=psize`, by looking at the "128K" row the user will be able to determine how much overhead was incurred for storage of blocks of this physical size. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Closes #16999	2025-10-06 09:35:02 -07:00
Ivan Shapovalov	1269fa9b79	zdb: convert `ALLOCATED_OPT` into anonymous enum We are adding more long-only options, so use an enum for all of them to avoid manually numbering these constants. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name> Closes #16999	2025-10-06 09:34:03 -07:00
Rob Norris	5605a6d79b	pool_iter_refresh: don't refresh pools twice In "all pools" mode, pool_iter_refresh() will call zpool_iter(), which will call zpool_refresh_stats() before calling add_pool(). If we already have the pool, this is a different handle, so we just release it and return. Back in pool_iter_refresh(), we then call zpool_stats_refresh() again for our handle on the same pool. All together, this means we're doing two ZFS_IOC_POOL_STATS calls into the kernel for every pool in the system. This isn't wrong, but it does double the pressure on global locks. Instead, we add a new function zpool_refresh_stats_from_handle() that simply copies the pool config and state from one handle to another, and use it to update our handle before we release it in add_pool(), so we only have one call per pool per interval. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17807	2025-10-03 14:39:09 -07:00
Rob Norris	5f09781cca	pool_iter_refresh: don't flag existing pools as refreshed zpool_iter() passes the callback a new instance of zpool_handle_t each time, so the existing handle in the pool_list AVL never actually gets a refresh. Internally, that means its zpool_config is never updated, and the old config is never moved to zpool_old_config. As a result, print_iostat() never sees any updated config, and so repeats the first line forever. This is the simplest workaround: just don't mark existing pools as refreshed. pool_list_refresh() will see this and refresh them. The downside is a second call to ZFS_IOC_POOL_STATS for existing pools, because zpool_iter() just called it for the handle we threw away. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17807	2025-10-03 14:39:00 -07:00
Rob Norris	1a32adca0f	zpool iostat: update pool counter when skipping boot row When skipping the boot row (with -y), the early loop meant we weren't updating the "last_npools" count. That means the count never advanced past zero, so cb_iteration was always reset to 0, leading to it being "stuck" on the boot line, printing the header and nothing else forever. Updating the pool counter on every loop sorts that out: it advances, cb_iteration moves properly, and normal rows are printed. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17807	2025-10-03 14:38:35 -07:00
Ameer Hamza	ac2d8c80b6	Make mount/share errors non-fatal for zfs create/clone If zfs_mount_and_share() fails, the error propagates to zfs create/clone commands despite successful operation. If create/clone operations were successful, there's no point in making zfs_mount_and_share() failures fatal. Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17799	2025-10-02 11:24:26 -04:00
Robert Evans	8869caae5f	zinject: Introduce ready delay fault injection This adds a pause to the ZIO pipeline in the ready stage for matching I/O (data, dnode, or raw bookmark). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Akash B <akash-b@hpe.com> Signed-off-by: Robert Evans <evansr@google.com> Closes #17787	2025-10-01 12:17:13 -07:00
Rob Norris	f0a95e8971	zpool iostat: refresh pool list every interval When running zpool iostat in interval mode, it would not notice any new pools created or imported, and would forget any destroyed or exported, so would not notice if they came back. This leads to outputting "no pools available" every interval until killed. It looks like this was at least intended to work; the comment above zpool_do_iostat() indicates that it is expected to "deal with pool creation/destruction" and that pool_list_update() would detect new pools. That call however was removed in `3e43edd2c5`, though it's unclear if that broke this behaviour and it wasn't noticed, or if it never worked, or if something later broke it. That said, the lack of pool_list_update() is only part of the reason it doesn't work properly. The fundamental problem is that the various things involved in refreshing or updating the list of pools would aggressively ignore, remove, skip or fail on pools that stop existing, or that already exist. Mostly this meant that once a pool is removed from the list, it will never be seen again. Restoring pool_list_update() to the zpool_do_iostat() loop only partially fixes this - it would find "new" pools again, but only in the "all pools" (no args) mode, and because its iterator callback add_pool() would abort the iterator if it already has a pool listed, it would only add pools if there weren't any already. So, this commit reworks the structure somewhat. pool_list_update() becomes pool_list_refresh(), and will ensure the state of all pools in the list are updated. In the "all pools" mode, it will also add new pools and remove pools that disappear, but when a fixed list of pools is used, the list doesn't change, only the state of the pools within it. The rest of the commit is adjusting things for this much simpler structure. Regardless of the mode in use, pool_list_refresh() will always do the right thing, so the driver code can just get on with the display. Now that pools can appear and disappear, I've made it so the header (if enabled) is re-printed when the list changes, so that it's easier to see what's happening if the column widths change. Since this is all rather complicated, I've included tests for the "all pools" and "set of pools" modes. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Igor Kozhukhov <igor@dilos.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17786	2025-09-29 16:35:27 -07:00
patrickxia	5c38029f4b	zdb: add ZFS_KEYFORMAT_RAW support for -K option This change adds support for ZFS_KEYFORMAT_RAW to zdb_derive_key in zdb.c. The implementation reads the raw key from the file specified by the -K option which is consistent with how raw keys are handled in the other parts of ZFS, along with a check to ensure that the keyfile doesn't have too many bytes. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Patrick Xia <patrickx@google.com> Closes #17783	2025-09-25 12:05:42 -07:00
Brian Behlendorf	c722bf8812	Add interface to spa_get_worst_case_min_alloc() function Provide an interface to retrieve the lowest and highest minimum allocation size for the normal allocation class. This can be used by external consumers of the DMU to estimate potential wasted capacity when setting the recordsize for an object. The new "min_alloc" and "max_alloc" keys are added to the pool configuration and used by default_volblocksize() to warn when an inefficient block size is requested. For older kmods which don't yet include the new keys, fall back to the previous logic. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #17758	2025-09-25 09:35:35 -07:00
Brian Behlendorf	0e1a53a8c0	Fix 'zpool add' safety check corner cases Three cases were discovered where 'zpool add' would fail to warn when adding vdevs to a pool with a mismatched replication level. These are: 1. When a pool contains mixed file and disk vdevs. 2. When a pool contains an active dRAID distributed spare 3. When a pool contains an active hot spare The lack of warnings is caused by get_replication() assessing the current pool configuration as inconsistent and disabling the mismatched replication check for the new pool configuration after 'zpool add'. This change updates get_replication() to be slightly more tolerant in the non-fatal case. The zpool_add_010_pos.ksh test case was split into separate tests: zpool_add_warn_create.ksh, pool_add_warn_degraded.ksh, and zpool_add_warn_removal. These tests were extended to include coverage for dRAID pools and the three scenarios described above. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #17780	2025-09-25 09:32:59 -07:00
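
The binning change described in commit f8b082b5af (zdb: adjust block histogram binning strategy) can be illustrated with a small sketch. This is not the zdb code: the function names `bin_starting_from()` and `bin_up_to()` are hypothetical, and the sketch assumes power-of-two bins where bin n is labelled 2^n bytes. It only contrasts the two interval conventions the commit message describes, [4K; 8K) versus (2K; 4K] for the "4K" row.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration only; these names do not come
 * from zdb. Each returns a bin index n, where bin n is labelled 2^n bytes.
 */

/* Previous strategy: bin n holds sizes in [2^n; 2^(n+1)). */
static unsigned
bin_starting_from(uint64_t size)
{
	unsigned bin = 0;

	assert(size > 0);
	/* Floor of log2(size): count how often size can be halved. */
	while (size > 1) {
		size >>= 1;
		bin++;
	}
	return (bin);
}

/* Adjusted strategy: bin n holds sizes in (2^(n-1); 2^n]. */
static unsigned
bin_up_to(uint64_t size)
{
	unsigned bin = 0;

	assert(size > 0);
	/*
	 * Ceiling of log2(size): grow the bin until it can hold size.
	 * Sizes above 2^63 are capped at bin 63 in this sketch.
	 */
	while (bin < 63 && ((uint64_t)1 << bin) < size)
		bin++;
	return (bin);
}
```

Under this sketch, a 3000-byte block lands in bin 11 (the "2K" row) with the previous strategy and in bin 12 (the "4K" row) with the adjusted one, which matches the "fits into a 4K block" reading given in the commit message.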


1702 Commits