Compare commits

282 Commits

Author SHA1 Message Date
Tony Hutter
1c702dda34 Tag zfs-2.4.1
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2026-02-19 11:14:37 -08:00
Alexander Motin
3dcd071b51 Fix available space accounting for special/dedup (#18222)
Currently, spa_dspace (base to calculate dataset AVAIL) only includes
the normal allocation class capacity, but dd_used_bytes tracks space
allocated across all classes.  Since we don't want to report free
space of other classes as available (we can't promise new allocations
will be able to use it), report only allocated space, similar to how
we report space saved by dedup and block cloning.

Since we need deflated space here, make allocation classes track
deflated allocated space also.  While here, make mc_deferred also
deflated, matching its use contexts.  Also while there, use
atomic_load() to read the allocation class stats.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18190
Closes #18222
2026-02-19 11:14:37 -08:00
Tony Hutter
46500a0803 CI: Test & fix Linux ZFS built-in build
ZFS can be built directly into the Linux kernel.  Add a test build
of this to the CI to verify it works.  The test build is only enabled
on Fedora runners (since they run the newest kernels) and is done in
parallel with ZTS.  The test build is done on vm2, since it typically
finishes ~15min before vm1 and thus has time to spare.

In addition:

- Update 'copy-builtin' to check that $1 is a directory
- Fix some VERIFYs that were causing the built-in build to fail

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18234
2026-02-19 11:14:37 -08:00
Attila Fülöp
c629e594e4 Linux 6.19 compat: in-tree build: fix duplicate GCM assembly functions
Linux 6.19 added an AES-GCM VAES-AVX2 assembly implementation. It's
basically a translation from the BoringSSL perlasm syntax to macro
assembly. We're using the same source, but in the form of
perlasm-generated flat assembly, which shares some global function
names with the former. When building in-tree this results in the
linker failing due to the duplicate symbols.

To avoid the error we prepend `icp_` via a macro to our function
names.
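
The renaming idea can be sketched in C with token pasting. This is only
an illustration: the `ICP_SYM()` macro and the `gcm_demo_init` name are
hypothetical, and the actual change applies the prefix in the assembly
sources, which are not shown here.

```c
/* Hypothetical illustration: prefix a symbol name at preprocessing time. */
#define	ICP_SYM(name)	icp_##name

/* Expands to: int icp_gcm_demo_init(int rounds) */
int
ICP_SYM(gcm_demo_init)(int rounds)
{
	return (rounds * 2);
}
```

Callers that also go through `ICP_SYM()` pick up the prefixed name, so
an unprefixed `gcm_demo_init` defined elsewhere no longer collides at
link time.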

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Moch <mail@alexmoch.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #18204
Closes #18224
2026-02-17 13:52:43 -08:00
rmacklem
f83a7864aa zfs_vnops_os.c: Move a vput() to after zfs_setattr_dir()
Without this patch, the following crash can occur when
a file system is configured with "xattr=dir".

VNASSERT failed: locked not true at
 /posix-acl/freebsd-rdma/sys/kern/vfs_subr.c:5786 (assert_vop_locked)
    hold count flags ()
    flags ()
    lock type zfs: UNLOCKED
panic: zfs_dirent_lookup: vnode is not locked but should be
cpuid = 3
time = 1770520763
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x136/frame 0xfffffe00914c8270
panic() at panic+0x43/frame 0xfffffe00914c82d0
assert_vop_locked() at assert_vop_locked+0x78
zfs_dirent_lookup() at zfs_dirent_lookup+0x41
zfs_setattr_dir() at zfs_setattr_dir+0x123
zfs_setattr() at zfs_setattr+0x1389
zfs_freebsd_setattr() at zfs_freebsd_setattr+0x56b
VOP_SETATTR_APV() at VOP_SETATTR_APV+0x5d
setfown() at setfown+0xb1
kern_fchownat() at kern_fchownat+0x192

This patch fixes the problem by moving the vput() call for
attrzp to after the zfs_setattr_dir() call that takes it as
an argument.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #18188
2026-02-17 11:54:58 -08:00
Austin Wise
612d4019f1 Fix activating large_microzap on receive
This ensures that the in-memory state of the feature is recorded and
that `dsl_dataset_activate_feature` is not called when the feature
is already active.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Austin Wise <AustinWise@gmail.com>
Closes #18143
Closes #18144
2026-02-17 11:54:58 -08:00
Alexander Motin
25327ed7ce Improve caching for dbuf prefetches
To avoid read errors with a transaction open, dmu_tx_check_ioerr()
is used to read everything required in advance.  But there seems
to be a chance for the buffer to be evicted from the dbuf cache in
between, which results in immediate eviction from ARC, which may
require an additional disk read later in a place where error
handling is problematic.

To partially work around this, introduce a new flag DMU_IS_PREFETCH,
relayed to ARC as ARC_FLAG_PREFETCH | ARC_FLAG_PRESCIENT_PREFETCH,
making ARC delay eviction by at least several seconds, or until the
actual read inside the transaction, which will promote it to demand
access.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18160
2026-02-17 11:54:58 -08:00
Mariusz Zaborski
11647c669e Flush RRD only when TXGs contain data
This change modifies the behavior of spa_sync_time_logger when
flushing the RRD database.

Previously, once the sync interval elapsed, a flush would always
be generated. On solid-state devices, especially when the pool was
otherwise idle, this caused disks to wake up solely to write RRD
data. Since RRD is best-effort telemetry, this behavior is
unnecessary and wasteful.

With this change, spa_sync_time_logger delays flushing until a TXG
that already contains data is being synced. The RRD update is
appended to that TXG instead of forcing the creation of
a new write-only TXG.

During pool export, flushing is forced regardless of whether
the TXG contains user data. At that stage, data durability takes
precedence and a write must be issued.
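
The flush decision described above can be sketched as a small
predicate. This is a hedged illustration, not the actual
spa_sync_time_logger code: the function name and parameters are
hypothetical, and it assumes export forces a flush regardless of the
elapsed interval.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether RRD data should be flushed in
 * the TXG currently being synced.
 */
bool
rrd_should_flush(uint64_t now, uint64_t last_flush, uint64_t interval,
    bool txg_has_data, bool exporting)
{
	/* Export: always flush (assumed to ignore the interval too). */
	if (exporting)
		return (true);
	/* Interval not yet elapsed: nothing to do. */
	if (now - last_flush < interval)
		return (false);
	/* Interval elapsed: piggyback only on a TXG that has data. */
	return (txg_has_data);
}
```

With this shape an idle pool never generates a write-only TXG, since the
last branch returns false until some other write dirties a TXG.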

Sponsored by: [Wasabi Technology, Inc.; Klara, Inc.]
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Closes #18082
Closes #18138
2026-02-11 11:41:13 -08:00
Marc Sladek
a0350f61c4 Fix send:raw permission for send -w -I
When performing an incremental raw send with intermediates (-w -I),
the standard 'send' permission was incorrectly required instead of
allowing 'send:raw'. This was due to a strict boolean comparison on
the 'rawok' flag in zfs_secpolicy_send() with non-boolean value.

This change normalizes the 'rawok' variable to be strictly 0/1 and
updates the test suite to properly verify delegated raw send behavior.
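
The class of bug can be sketched as follows. The flag bit and function
names are hypothetical, and how zfs_secpolicy_send() actually obtains
'rawok' is not shown in this message; the sketch only assumes a
non-zero value other than 1 being compared against 1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit, for illustration only. */
#define	DEMO_FLAG_RAW	0x8u

/* Buggy pattern: the masked value is 0x8, which never equals 1. */
bool
demo_rawok_unnormalized(uint32_t flags)
{
	int rawok = flags & DEMO_FLAG_RAW;

	return (rawok == 1);
}

/* Normalized pattern: "!!" collapses any non-zero value to exactly 1. */
bool
demo_rawok_normalized(uint32_t flags)
{
	int rawok = !!(flags & DEMO_FLAG_RAW);

	return (rawok == 1);
}
```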

Introduced-by: https://github.com/openzfs/zfs/pull/17543
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Marc Sladek <marc@sladek.dev>
Closes #18198
Closes #18193
2026-02-11 11:41:13 -08:00
Tony Hutter
936a98c716 ZTS: Fix zed_synchronous_zedlet
Wait for scrub_finish (as the comments in the code suggest) rather
than trim_finish in zed_synchronous_zedlet.ksh.  This seems to
work around the ZTS failures in #18192.  Also, fix some typos.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18192
Closes #18196
2026-02-11 11:41:13 -08:00
Tony Hutter
e1ade37573 Linux 6.19 compat: META
Update the META file to reflect compatibility with the 6.19
kernel.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18197
2026-02-11 09:38:39 -08:00
Tony Hutter
fdaec98d4b CI: Test build Lustre against ZFS
The Lustre filesystem calls a number of exported ZFS functions.  Do a
test build on the Almalinux runners to make sure we're not breaking
Lustre.  We do the Lustre build in parallel with the normal ZTS test
for efficiency, since ZTS isn't very CPU intensive. The full Lustre
build takes around 15min when run on its own.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18161
2026-02-10 17:03:02 -08:00
Tim Hatch
a42bb54050 Include missing newline in 'man' error
Because the `strerror` result doesn't include a newline, we need to add
one.  Observed on a minimal system that doesn't have `man` installed,
which behaves like this before the fix:

```
[root@upper tim]# zpool help import
couldn't run man program: No such file or directory[root@upper tim]#
```
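
A minimal sketch of the fix pattern, assuming a hypothetical helper
name; the message text is taken from the output above, and the actual
call site in the tools is not shown here.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the error line for a failed attempt to run `man`.
 * strerror() returns text without a trailing newline, so the format
 * string supplies one.  Returns what snprintf() returns.
 */
int
format_man_error(char *buf, size_t len, int err)
{
	return (snprintf(buf, len, "couldn't run man program: %s\n",
	    strerror(err)));
}
```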

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tim Hatch <tim@timhatch.com>
Closes #18183
2026-02-10 17:01:50 -08:00
Brian Behlendorf
618cfa02ea ZTS: update the relevant mmp test cases
- mmp_concurrent_import: added test case to verify concurrent
  import correctness.  The pool may only be imported once.

- mmp_exported_import: an activity check is now required for pools
  which were cleanly exported if the system and pool hostids don't
  match.

- mmp_inactive_import: an activity check is now required for any
  pool which wasn't cleanly exported, even if the system and pool
  hostids match.

- mmp_on_uberblocks: updated expected uberblocks to take into account
  the value MMP_INTERVAL_DEFAULT is set to.

- mmp_reset_interval: reduce the number of iterations from 10 to 3.
  This is sufficient to verify functionality and significantly speeds
  up the test.

- mmp_on_uberblocks: adjust the thresholds and increase the runtime
  to avoid false positives observed in CI.

- Update tests to use 'zhack action idle' instead of ztest to improve
  the reliability of the tests.

- Add additional log_note messages to test cases which have multiple
  verification steps to make it clear which portion of a test failed
  when reviewing the logs.

- Replace default_setup/cleanup_noexit calls with 'zpool create' and
  'zpool destroy' calls to avoid additional unnecessary dataset
  creation work.

- Update activity/noactivity check helper functions to use the
  ZFS_LOAD_INFO_DEBUG information now available from 'zpool import'
  to determine if this activity check ran and why.  This is more
  reliable in the CI than measuring the runtime.

- Removed all mmp tests from the zts-report.py exceptions list.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
82ed6842ba zhack: add "action idle" subcommand
In order to reliably test the multihost protection we need two (or more)
systems attempting to import the pool at the same time.  Historically, we've
used ztest running in userspace to simulate an active pool and attempted to
import the pool with the kernel modules.  This works but ztest is a bit
unwieldy for this and if it crashes for unrelated reasons it can result
in false positives.

All we really need is the pool imported in userspace so the MMP thread is
active and writing out uberblocks.  We can extend zhack which already knows
how to import the pool read/write and add an option to leave the pool open
and idle.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
184e9b3cd5 zhack: add -G option to dump debug buffer
Add a -G option to zhack to dump the internal debug buffer on exit.
We were able to use the same code from zdb for this which was nice.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
c710f87923 mmp: claim sequence id before final import
As part of SPA_LOAD_IMPORT add an additional activity check to
detect simultaneous imports from different hosts.  This check is
only required when the timing is such that there's no activity
for the read-only tryimport check to detect.  This extra
safety check operates as follows:

1. Repeat the following MMP check 10 times:
  a. Write out an MMP uberblock with the best txg and a random
     sequence id to all primary pool vdevs.
  b. Verify a minimum number of good writes such that even if
     the pool appears degraded on the remote host it will see
     at least one of the updated MMP uberblocks.
  c. Wait for the MMP interval; this leaves a window for other
     racing hosts to make similar modifications which can be
     detected.
  d. Call vdev_uberblock_load() to determine the best uberblock
     to use; this should be the MMP uberblock just written.
  e. Verify the txg and random sequence number match the MMP
     uberblock written in 1a.

2. Restore the original MMP uberblocks.  This allows the check
   to be performed again if the pool fails to import for an
   unrelated reason.
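
The claim loop in steps 1 and 2 can be sketched as a toy in-memory
simulation. Everything here is hypothetical (types, names, the racer
hook); it omits step 1b (counting good writes) and the real wait, and
it assumes the original uberblock is restored whether or not the claim
succeeds.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an on-disk MMP uberblock. */
typedef struct demo_ub {
	uint64_t txg;
	uint64_t seq;
} demo_ub_t;

/* Optional hook simulating a racing host writing during the wait. */
typedef void (*demo_race_fn)(demo_ub_t *disk);

/* Example racer: overwrite the sequence id with a different value. */
void
demo_racer_overwrite(demo_ub_t *disk)
{
	disk->seq = ~disk->seq;
}

/*
 * Write a random sequence id, give other hosts a window, re-read, and
 * verify nobody overwrote it.  Returns true if every round held.
 */
bool
demo_mmp_claim(demo_ub_t *disk, int rounds, demo_race_fn racer)
{
	demo_ub_t orig = *disk;
	bool ok = true;

	for (int i = 0; i < rounds && ok; i++) {
		demo_ub_t mine = { .txg = orig.txg, .seq = (uint64_t)rand() };

		*disk = mine;		/* 1a: write our claim */
		if (racer != NULL)
			racer(disk);	/* 1c: window for racing hosts */
		demo_ub_t best = *disk;	/* 1d: reload the best uberblock */
		/* 1e: txg and sequence id must still be ours */
		ok = (best.txg == mine.txg && best.seq == mine.seq);
	}
	*disk = orig;			/* 2: restore the original */
	return (ok);
}
```

Passing `NULL` as the racer models an uncontended import; passing
`demo_racer_overwrite` models a second host claiming the pool during
the wait, which makes the first round fail.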

This change also includes some refactoring and minor improvements.

- Never try loading earlier txgs during import when the import
  fails with EREMOTEIO or EINTR.  These errors don't indicate
  the txg is damaged but instead that it's either in use on a
  remote host or the import was interactively cancelled.  Rewind
  is also skipped for EBADD which can result from a
  stale trusted config when doing a verbatim import.

- Refactor the code for consistent logging of the multihost
  activity check using spa_load_note() and console messages
  indicating when the activity check was triggered and the result.

- Added MMP_*_MASK and MMP_SEQ_CLEAR() macros to allow easier
  modification of the sequence number in an uberblock.

- Added ZFS_LOAD_INFO_DEBUG environment variable which can be
  set to dump to stdout the spa_load_info nvlist returned
  during import.  This is used by the updated mmp test cases
  to determine if an activity check was run and its result.

- Standardize the mmp messages similarly to make it easier to
  find all the relevant mmp lines in the debug log.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
96ffe51004 mmp: add spa_load_name() for tryimport
Tryimport adds a unique prefix to the pool name to avoid name
collisions.  This makes it awkward to log user-friendly info
during a tryimport.  Add a spa_load_name() function which can
be used to report the unmodified pool name.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
f2c40b4586 mmp: move "Starting import" log message
Move the "Starting import" log message into the import block so
it's matched with the "Finished importing" debug message.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Brian Behlendorf
e78596e05e mmp: further restrict mmp exported pool check
For cleanly exported pools there exists a small window where
both systems may determine it's safe to import the pool and skip
the activity check.  Only allow the check to be skipped when the
last imported hostid matches the system's hostid and the pool was
cleanly exported.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-10 17:01:29 -08:00
Erik Larsson
8a9bbaa7cf Fix build for Linux 6.18 with PowerPC/RISC-V kernels. (#18145)
The macro 'flush_dcache_page(...)' modifies the page flags, but in Linux
6.18 the type of the page flags changed from 'unsigned long' to the
struct type 'memdesc_flags_t' with a single member 'f' which is the page
flags field.
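
The type change can be sketched in userspace as below. The `demo_`
types are stand-ins for the kernel definitions, and the accessors are
hypothetical; the actual compatibility fix in this commit is not shown
in the message.

```c
/* Stand-in for the 6.18 wrapper type with its single member 'f'. */
typedef struct {
	unsigned long f;
} demo_memdesc_flags_t;

/* Page layout before 6.18: flags are a plain unsigned long. */
struct demo_page_old {
	unsigned long flags;
};

/* Page layout from 6.18: flags are wrapped in a struct. */
struct demo_page_new {
	demo_memdesc_flags_t flags;
};

/* Code that needs the raw bits must now reach through 'f'. */
unsigned long *
demo_flags_ptr_old(struct demo_page_old *p)
{
	return (&p->flags);
}

unsigned long *
demo_flags_ptr_new(struct demo_page_new *p)
{
	return (&p->flags.f);
}
```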

Signed-off-by: Erik Larsson <catacombae@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2026-02-10 17:00:04 -08:00
John Cabaj
2328b37eb9 Linux 6.19: handle --werror with CONFIG_OBJTOOL_WERROR=y
Linux upstream commit 56754f0f46f6: "objtool: Rename
--Werror to --werror" did just that, so we should check for
either "--Werror" or "--werror", else the build will fail

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: John Cabaj <john.cabaj@canonical.com>
Closes #18152
2026-02-10 16:59:50 -08:00
Alexander Moch
8dec2d94b4 CI: Add Alpine Linux 3.23 runner to the pipeline (#18087)
Add an Alpine Linux 3.23 runner to the CI chain to run OpenZFS builds
and tests against musl libc.

Currently, zfs_send_sparse is killed after 10 minutes on Alpine, causing
cascading EBUSY failures in the test suite. With zfs_send_sparse
disabled, the ZFS test suite reaches a pass rate of 94.62%.

This commit introduces the required Alpine-specific setup and a small
set of shell and cloud-init compatibility fixes that also apply to
existing Linux runners.

The Alpine runner is not enabled by default and is not executed for new
pull requests.

Sponsored-by: ERNW Research GmbH - https://ernw-research.de/

Signed-off-by: Alexander Moch <amoch@ernw.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
2026-02-10 16:59:18 -08:00
Alexander Moch
8e946b5ae8 cmd/ztest: avoid PATH_MAX stack allocation in ztest_get_zdb_bin() (#18085)
Calling realpath(path, buf) can trigger fortified header wrappers that
allocate a PATH_MAX-sized temporary buffer on the stack, exceeding the
4 KiB frame limit on some systems. Use the heap-allocating
realpath(path, NULL) form instead.

Sponsored-by: ERNW Research GmbH - https://ernw-research.de/

Signed-off-by: Alexander Moch <amoch@ernw.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2026-02-10 16:59:11 -08:00
Ivan Shapovalov
f1321648a5 zed.d, contrib: fix shellcheck errors in scripts
Not sure why this was not caught by CI; perhaps my shellcheck is new
enough to catch more things.

Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2026-02-10 16:58:49 -08:00
Ivan Shapovalov
5889b7ce90 zfs_main: cosmetic: add missing flag to the comment for create
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
2026-02-10 16:58:36 -08:00
Tony Hutter
c62c3aeb13 CI: Test 2.4.x in qemu-test-repo-vm.sh, quick mode
The qemu-test-repo-vm.sh script test-installs ZFS from different
repos.  Have it test from the new 2.4.x repos as well.

Also add a checkbox to run in "lookup mode".  This just does a
quick lookup to see what version is installed in each repo.  It does
not do a test install and module load.  It only takes 3min to run vs
over an hour for the full version.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18070
2026-02-10 16:57:41 -08:00
Turbo Fredriksson
026d4ee1a9 Change shellcheck and checkbashism triggers.
Newer versions of `shellcheck` and `checkbashism` find more issues than
previous versions, so fix those.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:57:22 -08:00
Turbo Fredriksson
f27550e985 Replace bashisms in ZFS shell function stub.
The `type` command is an optional feature in POSIX, so shouldn't be
used.

Instead, use `command -v`, which commit
  e865e7809e
did, but it missed this file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:57:15 -08:00
Turbo Fredriksson
c54825f7eb Make lines stay within 80 char limit.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:57:04 -08:00
Turbo Fredriksson
9ef326b987 Add some comments to clarify the mounting of filesystems.
There's no real documentation (which should probably be written!),
so instead document the code the best we can on what's going on
with the mounting of file systems to make future updates easier.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:56:57 -08:00
Turbo Fredriksson
8a3ff09350 Standardise if/then/else and for/do/done lines.
More code standard changes, where if/then is on different lines.
Whether to keep them on the same line or on different lines can be
argued, but we need to pick one and not mix styles.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:56:43 -08:00
Turbo Fredriksson
bea96d7d4b Add missing initrd config variables.
The `ZFS_INITRD_ADDITIONAL_DATASETS` variable is used in the initrd
script to boot additional OS file systems besides the root file system.
But it wasn't included as an example in the config files.

The `ZFS_POOL_EXCEPTIONS` *was* included in the example defaults file,
but it was not exported, so not available in the initrd.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:56:36 -08:00
Turbo Fredriksson
1028571218 Remove unnecessary sourcing of variables.
The file `/etc/default/zfs` is already sourced by `/etc/zfs/zfs-functions`,
so there is no need to source it again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:56:24 -08:00
Turbo Fredriksson
b86f15d84b Fix issue with finding degraded pool(s).
When a pool is degraded, or needs special action, the `zpool import`
(without pool to import) line will report:
```
  pool: rpool
    id: 01234567890123456789
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
   [..]
```
If the import with the pool name fails, it is supposed to try importing
using the pool ID.

However, the script is also getting the `action` line (and probably `scrub:`
if/when that's available):
  pool; The pool can be imported using its name or numeric identifier.;config:;
which causes issues on subsequent import attempts.

Clean up the information by rewriting the `sed` command line.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:56:09 -08:00
Turbo Fredriksson
425691cf59 Prefix all variables that are local with underscore.
This is just to make them easier to see.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:55:46 -08:00
Turbo Fredriksson
01f089509e Shell script good practices changes.
It's considered good practice to:
1) Wrap the variable name in `{}`.
   As in `${variable}` instead of `$variable`.
2) Put variables in `"`.

Also some minor error message tuning.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:55:32 -08:00
Turbo Fredriksson
ddbfd0f2e1 Fix potential global variable overwrite.
In a previous commit (e865e7809e), the
`local` keyword was removed in functions because of bashism.

Removing bashisms is correct, however this could cause variable overwrites,
since several functions use the same variable name.

So this commit makes function variables unique in the (now) global
namespace.

The problem from the original bug report (see #17963) could not be duplicated,
but it is still sane to make sure that variables stay unique.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Turbo Fredriksson <turbo@bayour.com>
Closes #18000
2026-02-10 16:55:11 -08:00
Shreshth3
c4ad5e2938 zpool: fix conflict with -v and -o options
Right now, the -v and -o options for `zpool list` work independently,
but when paired, the -v "wins out" and the -o effect is lost. This
commit fixes that problem.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Closes #11040
Closes #17839
2026-02-10 16:53:37 -08:00
Tony Hutter
22b959d2e5 CI: Fix qemu-1-setup failure, remove debug stuff
- For whatever reason, the runner will now start up with either two 75GB
  disks or one 150GB disk.  Previously the runner was always booting
  with two 75GB, but about a quarter of the time it now starts up
  with a single 150GB disk.  This caused qemu-1-setup.sh to fail
  since it expected the two 75GB disks.  This commit updates
  qemu-1-setup.sh to work with either disk config.

- Remove the watchdog from qemu-1-setup.sh.  It didn't turn out to be
  useful.

- Remove the timestamps that zfs-qemu.yml added to the qemu-1-setup.sh
  output.  The timestamps were redundant, since you can already
  download timestamped logs from the Github web interface.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18166
2026-02-05 13:48:31 -08:00
Tony Hutter
23476277c0 CI: Use Ubuntu mirrors instead of azure (#18057)
Use the official Ubuntu apt mirrors instead of
azure.archive.ubuntu.com, since that mirror can be slow:

    https://github.com/actions/runner-images/issues/7048

This can help speed up the 'Setup QEMU' stage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18057
2026-02-05 13:48:31 -08:00
Brooks Davis
7c80abdd7c nvpair: chase FreeBSD xdrproc_t definition
As of FreeBSD 16, xdrproc_t will take exactly two arguments in both
kernel and userspace in line with the Linux kernel.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Alan Somers <asomers@freebsd.org>
Signed-off-by: Brooks Davis <brooks@capabilitieslimited.co.uk>
Closes #18154
2026-02-05 13:48:31 -08:00
Mariusz Zaborski
79c3810088 Make sure we can still write data to txg
The final txgs are used only to clear out any remaining deferred
frees, and we cannot write new data to them. Make sure we do not
try to do so.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Closes #18139
2026-02-05 13:48:31 -08:00
Alexander Motin
554a81b20a Lock db_mtx around arc_release() in a couple of places
* Lock db_mtx around arc_release() in dbuf_release_bp()

While this function is called only in sync context, the same buffer
can be touched by dbuf_hold_impl() in open context, creating races.
All other accesses to arc_release() are already protected by db_mtx,
so just take it here too.

Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>

* Lock db_mtx in sa_byteswap()

While SA code seems protected by sa_lock, there is a back door of
dmu_objset_userquota_get_ids(), that may hold and access the dbuf
without sa_lock, relying only on db_mtx. Taking db_mtx here should
protect both the arc_release() and the data for db_buf.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18146
2026-02-05 13:48:31 -08:00
Alek P
aebbfdb37a remove thread unsafe debug code causing FreeBSD double free panic
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes #18140
2026-02-05 13:48:31 -08:00
Mark Johnston
343cc96d7d FreeBSD: Remove references to DEBUG_VFS_LOCKS
This option is removed upstream in favour of plain INVARIANTS.

VNASSERT is always defined so I see no reason to use it conditionally.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #18136
2026-02-05 13:48:31 -08:00
Martin Matuška
d69f7c5e9b FreeBSD: unbreak compilation on i386
tests/zfs-tests/cmd/mmap_seek.c: use correct printf specifier
module/zfs/vdev.c: vdev_clear(): correctly cast argument to
atomic_add_64().
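
As a hedged illustration of the printf side of this kind of fix: on
i386 `long` is 32 bits, so printing a wider integer type with `%ld` is
wrong. The sketch below assumes the value is an `off_t` and uses the
portable `%jd` plus `intmax_t` cast; the exact specifier chosen in
mmap_seek.c is not stated in this message, and the helper name is
hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Format a file offset portably.  Casting to intmax_t and using %jd
 * works regardless of how wide off_t or long are on the platform.
 */
int
format_offset(char *buf, size_t len, off_t off)
{
	return (snprintf(buf, len, "%jd", (intmax_t)off));
}
```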

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #18096
2026-02-05 13:48:31 -08:00
Alan Somers
d08e561d0a Fix --enable-invariants on FreeBSD
The make symbols were never getting forwarded to the correct make
subprocess.  As far as I can tell, this has never worked.  Either that,
or something has changed in the behavior of make.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #18131
2026-02-05 13:48:31 -08:00
shuppy
6218a5eb03 Fix history logging for zpool create -t
`zpool create` is supposed to log the command to the new pool’s history,
as a special record that never gets evicted from the ring buffer. but
when you create a pool with `zpool create -t`, no such record is ever
logged (#18102). that bug may be the cause of issues like #16408.

`zpool create -t` (83e9986f6e) and `zpool
import -t` (26b42f3f9d) are both designed
to override the on-disk zpool property `name` with an in-core
“temporary” name, but they work somewhat differently under the hood.

importing with a temporary name sets `spa->spa_import_flags |=
ZFS_IMPORT_TEMP_NAME` in ZFS_IOC_POOL_IMPORT, which tells
spa_write_cachefile() and spa_config_generate() to use the
ZPOOL_CONFIG_POOL_NAME in `spa->spa_config` instead of `spa->spa_name`.

creating with a temporary name permanently(!) sets the internal zpool
property `tname` (ZPOOL_PROP_TNAME) in the `zc->zc_nvlist_src` of
ZFS_IOC_POOL_CREATE, which tells zfs_ioc_pool_create()
(4ceb8dd6fd) and spa_create() to use that
name instead of `zc->zc_name`, then sets `spa->spa_import_flags |=
ZFS_IMPORT_TEMP_NAME` like an import.

but zfsdev_ioctl_common() fails to check for `tname` when saving the
pool name to `zfs_allow_log_key`, so when we call ZFS_IOC_LOG_HISTORY,
we call spa_open() on the wrong pool name and get ENOENT, so the logging
silently fails.

this patch fixes #18102 by checking for `tname` in zfsdev_ioctl_common()
like we do in zfs_ioc_pool_create().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: delan azabani <dazabani@igalia.com>
Closes #18118
Closes #18102
2026-02-05 13:48:31 -08:00
Alexander Motin
2c9fec38d0 DDT: Add locking for table ZAP destruction
Similar to BRT, the DDT ZAP can be destroyed by sync context when it
becomes empty.  As with BRT, introduce an RW-lock to protect open
context methods from the destruction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18115
2026-02-05 13:48:31 -08:00
Andrew Walker
6f7f71825f Add fh_to_parent export definition
This commit adds support for converting a file handle to its
parent dentry. This is called in exportfs_decode_fh_raw()
when subtree checking is enabled in NFS. Defining this and
handling the expanded filehandles allows the knfsd to succeed
in handling the file handle where it might otherwise fail
with ESTALE when trying to open by filehandle.

A side effect of this change is that name_to_handle_at(2)
and open_by_handle_at(2) now support AT_HANDLE_CONNECTABLE.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Andrew Walker <andrew.walker@truenas.com>
Closes #18099
2026-02-05 13:48:31 -08:00
Rob Norris
42c2b2d774 spl: remove a _KERNEL check
This code is only compiled for the Linux kernel module, so that define
is always set.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18117
2026-02-05 13:48:31 -08:00
Rob Norris
26fcf5848b spl: unexport kstat_proc_entry functions
These are used to implement the kstat and procfs_list interfaces, and
aren't used from outside. There's no need to export them.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18117
2026-02-05 13:48:31 -08:00
Rob Norris
eaa645be5d spl: lift 64-bit math compat out to separate file
It's a lot of rarely-compiled code, so move it to the side to make other
code easier to read.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18117
2026-02-05 13:48:31 -08:00
Rob Norris
b1d3b5e567 spl: remove old atomic lock
Long ago, SPL atomics were implemented as a global spinlock over
conventional operations. In 5e9b5d832b (2009-10) they were converted to
proper atomics, with the spinlock retained as a fallback.

The switch to compile with the fallback was later removed in a91258913f
(2018-05), but the code it enabled wasn't. So let's do that.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18117
2026-02-05 13:48:31 -08:00
Dimitry Andric
4cc3056c56 icp: emit .note.GNU-stack section for all ELF targets
On FreeBSD, linking the zfs kernel module with binutils ld 2.44 shows
the following warning:

    ld: warning: aesni-gcm-avx2-vaes.o: missing .note.GNU-stack section
    implies executable stack
    ld: NOTE: This behaviour is deprecated and will be removed in a
    future version of the linker

Some of the `.S` files under `module/icp/asm-x86_64/modes` check whether
to emit the `.note.GNU-stack` section using:

    #if defined(__linux__) && defined(__ELF__)

We could add `&& defined(__FreeBSD__)` to the test, but since all other
`.S` files in the OpenZFS tree use:

    #ifdef __ELF__

it would seem more logical to use that instead. Any recent ELF platform
should support these note sections by now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dimitry Andric <dimitry@andric.com>
Closes #18119
2026-02-05 13:48:31 -08:00
Austin Wise
65e13c33d8 When receiving a stream with the large block flag, activate feature
ZFS send streams include a feature flag DMU_BACKUP_FEATURE_LARGE_BLOCKS
to indicate the presence of large blocks in the dataset. On the sending
side, this flag is included if the `-L` flag is passed to `zfs send`
and the feature is active in the dataset. On the receive side, the
stream is refused if the feature is active in the destination dataset
but the stream does not include the feature flag.

The problem is the feature is only activated when a large block is
born. If a large block has been born in the destination, but never
the source, the send can't work. This can arise when sending streams
back and forth between two datasets.

This commit fixes the problem by always activating the large blocks
feature when receiving a stream with the large block feature flag.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Austin Wise <AustinWise@gmail.com>
Closes #18105
2026-02-05 13:48:31 -08:00
Jitendra Patidar
8a826c0f68 Fix zfs_open() to skip zil_async_to_sync() for the snapshot
Fix zfs_open() to skip zil_async_to_sync() for the snapshot, as it won't
have any transactions. zfsvfs->z_log is NULL for the snapshot.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com>
Closes #18091
2026-02-05 13:48:31 -08:00
shuppy
bc3320f0cc ZTS: add regression test for #17180
In #17180, we fixed an interesting bug that i believe i hit in one of my
pools, but as far as i can tell, there was no test for it.

this patch adds a regression test for #17180, minimised from my attempts
to reproduce the bug in a way that resembled the history of my pool.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Adam Moss <c@yotes.com>
Signed-off-by: delan azabani <dazabani@igalia.com>
Closes #18109
2026-02-05 13:48:31 -08:00
Dimitry Andric
6a9d7820e6 Rename several printf attributes declarations to __printf__
For kernel builds on FreeBSD, we redefine `__printf__` to
`__freebsd_kprintf__`, to support FreeBSD kernel printf(9) extensions
with clang.

In OpenZFS various printf related functions are declared with
`__attribute__((format(printf, X, Y)))`, so these won't work with the
above redefinition. With clang 21 and higher, this leads to errors
similar to:

    sys/contrib/openzfs/module/zfs/spa_misc.c:414:38: error: passing
    'printf' format string where 'freebsd_kprintf' format string is
    expected [-Werror,-Wformat]
      414 |         (void) vsnprintf(buf, sizeof (buf), fmt, adx);
          |                                             ^

Since attribute names can always be spelled with leading and trailing
double underscores, rename these instances.

Note that in the FreeBSD base system we usually use `__printflike` from
`<sys/cdefs.h>`, but that does not apply to OpenZFS.
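
As a minimal sketch of the spelling change, assuming a hypothetical
helper `fmt_msg()` that is not part of OpenZFS: the archetype is written
as `__printf__`, so a build that redefines `__printf__` can substitute
its own archetype, while GCC and clang otherwise treat it the same as
`printf`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * fmt_msg() is a hypothetical helper for illustration only.
 * Arguments 3 and 4 are the format string and the first variadic
 * argument, checked by the compiler against the __printf__ archetype.
 */
static int fmt_msg(char *buf, size_t len, const char *fmt, ...)
    __attribute__((__format__(__printf__, 3, 4)));

static int
fmt_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && len > 0);
	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}
```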

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Dimitry Andric <dimitry@andric.com>
Closes #18095
2026-02-05 13:48:31 -08:00
Andrew Walker
edd3d33433 Add handling for STATX_CHANGE_COOKIE
This commit adds handling for the STATX_CHANGE_COOKIE so that
we can properly surface the ZFS znode sequence to NFS clients via
knfsd.

If knfsd does not have STATX_CHANGE_COOKIE in statx result then
it will synthesize the NFS change_info4 structure and related
change4id values algorithmically based on the ctime value of the
file. Since internally ZFS is using ktime_get_coarse_real_ts64()
for the timestamp calculation here it introduces the possibility
that the change will not increment the change4id of directories
/ files causing a failure in the client to invalidate its attr
cache (among other things). See RFC 8881 Section 10.8 for
discussion of how clients may implement name and directory
caching.

Notable in this commit is that we are not initializing the
inode->i_version to the znode->z_seq number. The reason for this
is that we're intentionally not setting `SB_I_VERSION`. This
indicates that the filesystem manages its own i_version and
so it is not populated in the generic_fillattr.

The following compares a tight loop of setattr over the NFSv4
protocol while tracking nfsd4_change_attribute.

Before change:
inode, change_attribute
4723, 7590032215978780890
4723, 7590032215978780890
4723, 7590032215978780890
4723, 7590032215982780865
4723, 7590032215982780865

After change:
inode, change_attribute
7602, 7590032992517123951
7602, 7590032992517123952
7602, 7590032992517123953
7602, 7590032992517123954
7602, 7590032992517123955

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Andrew Walker <andrew.walker@truenas.com>
Closes #18097
2026-02-05 13:48:31 -08:00
Rob Norris
a5d9f233fa kmem: don't add __GFP_RECLAIMABLE for KM_VMEM allocations
vmalloc()'d memory is not movable/reclaimable, so __GFP_RECLAIMABLE is
not a valid flag, and since 6.19 the kernel warns if you use it.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18107
2026-02-05 13:48:31 -08:00
Ivan Shapovalov
6ab8f46c6c cmd/zfs: clone: accept -u to not mount newly created datasets
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18080
2026-02-05 13:48:31 -08:00
Rob Norris
cb1833023f kmem: don't add __GFP_COMP for KM_VMEM allocations
It hasn't been necessary since Linux 3.13
(torvalds/linux@a57a49887e), and since 6.19 the kernel warns if you
use it.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18053
2026-02-05 13:48:31 -08:00
Rob Norris
2422c1f3b9 kmem: don't pass __GFP_HIGHMEM to __vmalloc
Since Linux 4.12 (torvalds/linux@19809c2da2) __GFP_HIGHMEM has been
automatically added to calls to __vmalloc() internally, so we don't need
it anymore. This is good, because since 6.19 the kernel warns if you use
__GFP_HIGHMEM.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18053
2026-02-05 13:48:31 -08:00
Rob Norris
ccf956c2b3 Linux 6.19: replace i_state access with inode_state_read_once()
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18053
2026-02-05 13:48:31 -08:00
Alexander Motin
09587c7385 Use reduced precision for scan times
Scan time limits do not need precision beyond 1ms.  Switching
scn_sync_start_time and spa_sync_starttime from gethrtime() to
getlrtime() saves ~3% of CPU time during resilver scan stage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18061
2026-02-05 13:48:31 -08:00
Alexander Motin
35ee242abc Reduce minimal scrub/resilver times
With higher throughput and lower latency of modern devices ZFS can
happily live with pretty short (fractions of a second) TXGs.  But
the two decade old multi-second minimal time limits can almost stop
payload writes by extending TXGs beyond dirty data limits of ARC
ability to amortize it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18060
2026-02-05 13:48:31 -08:00
Allan Jude
ccb7c82aa1 zdb: Add -O option for -r to specify object-id
"zdb -r -O pool/dataset obj-id destination" will copy
the file with object-id obj-id to the named destination;
without -O it'll still be interpreted as a pathname.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Sean Eric Fagan <sean.fagan@klarasystems.com>
Closes #16307
2026-02-05 13:48:31 -08:00
Mark Maybee
0de2da6a37 Fix rangelock test for growing block size
If the file already has more than one block, then the current
block size cannot change. But if the file block size is less
than the maximum block size supported by the file system, and
there are multiple blocks in the file, the current code will
almost always extend the rangelock to its maximum size.
This means that all writes become serialized and even reads
are slowed as they will more often contend with writes. This
commit adjusts the test so that we will not lock the entire
range if there is more than one block in the file already.
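
The adjusted test can be pictured roughly as the predicate below.  This
is a simplified sketch with hypothetical names, not the actual rangelock
code, and it ignores any other conditions the real check may include.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: a write needs the whole-file lock only when
 * it could change the block size, i.e. the file still fits in a single
 * block, that block is below the maximum, and the write extends past it.
 */
static bool
write_may_grow_blocksize(uint64_t file_size, uint64_t blksz,
    uint64_t max_blksz, uint64_t write_end)
{
	assert(blksz > 0 && blksz <= max_blksz);

	if (file_size > blksz)
		return (false);	/* more than one block: size is fixed */
	if (blksz >= max_blksz)
		return (false);	/* already at the maximum block size */
	return (write_end > blksz);
}
```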

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mark Maybee <mark.maybee@perforce.com>
Closes #18046
Closes #18064
2026-02-05 13:48:31 -08:00
Alexander Motin
8d391531eb Bypass snprintf() in quota checks if no quotas set
This improves synthetic 1 byte write speed by ~2.5%.

Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18063
2026-02-05 13:48:31 -08:00
Alexander Motin
8dd01181aa RAIDZ: Remove some excessive logging
There was some per-I/O logging into dbgmsg in the RAIDZ code that
increased CPU load and wiped useful content out of dbgmsg, for
example during a routine disk replacement.  I don't think
we need it to be that verbose.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18059
2026-02-05 13:48:31 -08:00
Alan Somers
76871c295a Remove the obsolete FreeBSD 14.2-RELEASE from CI
Sponsored by: ConnectWise
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #18013
2026-02-05 13:48:31 -08:00
Alexander Motin
96b1d2fae9 DDT: Fix compressed entry buffer size
The first byte of the entry after compression is used for the algorithm
and byte order flag.  We should decrement the buffer size by one when
calling the compression/decompression algorithm.
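
A toy illustration of the size adjustment, under the assumption that
byte 0 holds a header and the payload follows it.  `toy_copy()` stands
in for a real compressor and `encode_entry()` is hypothetical; neither
is a DDT function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in "compressor": returns 0 if the payload does not fit. */
static size_t
toy_copy(const uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_len)
{
	if (src_len > dst_len)
		return (0);
	memcpy(dst, src, src_len);
	return (src_len);
}

/* Returns the total encoded length, or 0 if the entry does not fit. */
static size_t
encode_entry(const uint8_t *src, size_t src_len, uint8_t *dst,
    size_t dst_len, uint8_t header)
{
	size_t n;

	assert(dst_len >= 1);
	dst[0] = header;
	/* One byte is taken by the header, so offer only dst_len - 1. */
	n = toy_copy(src, src_len, dst + 1, dst_len - 1);
	return (n == 0 ? 0 : n + 1);
}
```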

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18055
2026-02-05 13:48:31 -08:00
Alexander Motin
4ab2027f59 DDT: Add/use zap_lookup_length_uint64_by_dnode()
Unlike other ZAP consumers, DDT does not know, due to compression,
how big an entry it is reading from ZAP.  Because of this it called
zap_length_uint64_by_dnode() and zap_lookup_uint64_by_dnode(),
each of which does a full ZAP entry lookup.

Introduction of the combined ZAP method dramatically reduces the
CPU overhead and lock contention at the DBUF layer.
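
The saving can be sketched with a toy table in which every search is
counted.  All names here are hypothetical and the table is not a ZAP;
the point is only that one search can return both the length and the
value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct toy_entry {
	const char *key;
	uint64_t val[4];
	size_t nvals;
};

static unsigned toy_searches;	/* number of table searches performed */

static const struct toy_entry *
toy_find(const struct toy_entry *tab, size_t n, const char *key)
{
	toy_searches++;
	for (size_t i = 0; i < n; i++) {
		if (strcmp(tab[i].key, key) == 0)
			return (&tab[i]);
	}
	return (NULL);
}

/* Combined call: a single search yields the length and the value. */
static int
toy_lookup_length(const struct toy_entry *tab, size_t n, const char *key,
    uint64_t *buf, size_t buf_n, size_t *actual_n)
{
	const struct toy_entry *e;

	assert(buf != NULL && actual_n != NULL);
	e = toy_find(tab, n, key);
	if (e == NULL || e->nvals > buf_n)
		return (-1);
	memcpy(buf, e->val, e->nvals * sizeof (uint64_t));
	*actual_n = e->nvals;
	return (0);
}
```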

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18048
2026-02-05 13:48:31 -08:00
Alexander Motin
4905686e67 DDT: Switch to using ZAP _by_dnode() interfaces
As was previously done for BRT, avoid holding/releasing DDT ZAP
dnodes for every access.  Instead, hold the dnodes for their whole
lifetime, never releasing them.

While here, add _by_dnode() interfaces for zap_length_uint64()
and zap_count(), actively used by DDT code.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18047
2026-02-05 13:48:31 -08:00
Alexander Motin
fa857113a3 DDT: Move logs searches out of the lock
Postponing entry removal from the DDT log, in case of a hit, until the
later single-threaded sync stage makes ddl_tree stable during the
multi-threaded ZIO processing stage.  That allows dropping the DDT lock
before the search instead of after, reducing the contention a lot.

Actually ddt_log_update_entry() was already handling the case of an
entry present in the active log, so we only need to remove it from
the flushing log, if the entry happens to be there.

My tests with parallel 4KB block writes show throughput increase
from 480MB/s (122K blocks/s) to 827MB/s (212K blocks/s), even
though still limited by the global DDT lock contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18044
2026-02-05 13:48:31 -08:00
Alexander Motin
2428043709 Improve async destroy processing timing
Previous code effectively enforced that all async free ZIOs were
_issued_ within the TXG timeout.  But they could take forever to
complete, especially if the required metadata were not in ARC.

This patch introduces periodic waits every 2000 ZIOs, which should
give at least somewhat reasonable TXG timings even for single HDD
pools with empty ARC.  It also makes them complete within half of the
TXG timeout, since we might still need time to sync DDT and BRT.

While there, change zfs_max_async_dedup_frees semantics to also
include clone and gang blocks, which are similar.  Bump the default
value, set long ago, to be more forgiving to block cloning
(still not having logs and benefiting from large TXGs), now that
we have better working time limits.  The limit now is a possible
amount of dirty data produced by BRT updates.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18043
2026-02-05 13:48:31 -08:00
Alexander Motin
135103a648 Defer async destroys on pool import
We've observed a number of cases where pool import got stuck for many
minutes due to a large async destroy trying to load DDT or BRT from
an HDD pool.  While proper destroy dosage is a separate problem,
let's give the import process a chance to complete before that at all.
It may not be enough if there is a lot of ZIL to replay, but that
is harder to cover, since those are in separate syscalls.

Code investigation showed that we already have this mechanism used
for scrub/resilver, so this patch converts SCAN_IMPORT_WAIT_TXGS
into a tunable and applies it to async destroys also.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18033
2026-02-05 13:48:30 -08:00
Alexander Motin
d5724f8f3f ZTS: Fix zvol_misc_fua SLOG writes check
Instead of comparing number of SLOG writes to number of normal
writes we should just make sure SLOG got the required number of
writes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18033
2026-02-05 13:48:30 -08:00
Alexander Motin
e865ddad5c ZIO: ZIO_STAGE_DDT_WRITE is a blocking stage
ddt_lookup() in zio_ddt_write() might require synchronous DDT ZAP
read.  Running it from interrupt taskq might lead to deadlock.
Inclusion of ZIO_STAGE_DDT_WRITE into ZIO_BLOCKING_STAGES should
hopefully fix that, even though I am not sure how I got there.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17981
2026-02-05 13:48:30 -08:00
Alexander Motin
8a79d09680 ARC: Increase parallel eviction batching
Before the parallel eviction implementation,
zfs_arc_evict_batch_limit caused loop exits after evicting 10
headers.  The cost of that is not big and is well motivated.  Now,
though, a taskq task exit after the same 10 headers is much more
expensive.  To cover the context switch overhead of taskq, introduce
another level of batching, controlled by the
zfs_arc_evict_batches_limit tunable, used only for parallel eviction.

My tests, including 36 parallel reads with 4KB recordsize, which showed
1.4GB/s (~460K blocks/s) before with heavy arc_evict_lock contention,
now show 6.5GB/s (~1.6M blocks/s) without arc_evict_lock contention.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17970
2026-02-05 13:48:30 -08:00
Alexander Motin
5e0f20088d ARC: Pre-convert zfs_arc_min_prefetch_ms
There is no need to do MSEC_TO_TICK() for each evicted ARC header.
We can do it when tunables are set, since we already have separate
internal variables for those.
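
The idea can be sketched as follows, with hypothetical names and an
assumed tick rate (`TOY_HZ`); this is not the ARC code itself.  The
conversion happens once in the setter, and the hot path compares ticks
directly.

```c
#include <assert.h>
#include <stdint.h>

#define	TOY_HZ	100	/* assumed tick rate, for illustration only */

static uint64_t toy_min_prefetch_ms;	/* value as set by the admin */
static uint64_t toy_min_prefetch_ticks;	/* cached copy for the hot path */

/* Convert once, when the tunable changes. */
static void
toy_set_min_prefetch_ms(uint64_t ms)
{
	toy_min_prefetch_ms = ms;
	toy_min_prefetch_ticks = ms * TOY_HZ / 1000;
}

/* Hot path: no per-call millisecond-to-tick conversion. */
static int
toy_prefetch_too_young(uint64_t now_ticks, uint64_t access_ticks)
{
	assert(now_ticks >= access_ticks);
	return (now_ticks - access_ticks < toy_min_prefetch_ticks);
}
```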

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17965
2026-02-05 13:48:30 -08:00
Alexander Motin
6482a27e81 Reduce dataset buffers re-dirtying
For each block written or freed ZFS dirties ds_dbuf of the dataset.
While dbuf_dirty() has a fast path for already dirty dbufs, it still
requires taking the lock and doing some things visible in profiler.

Investigation showed that ds_dbuf dirtying by dsl_dataset_block_born()
and some of dsl_dataset_block_kill() is just not needed, since
by the time they are called in sync context the ds_dbuf is already
dirtied by dsl_dataset_sync().

Tests show this reducing large file deletion time by ~3% by saving
CPU time of single-threaded part of the sync thread.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18028
2026-02-05 13:48:30 -08:00
Tony Hutter
8dc656b873 CI: Increase setup timeout to 20min, add timestamps
- Increase qemu-1-setup.sh timeout to 20min since it sometimes
  fails to complete after 15min.

- Timestamp all qemu-1-setup.sh lines to look for hangs.

- Add a 'watchdog' process to print out the top running process every
  30sec to help with debugging.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17714
2026-02-05 13:48:30 -08:00
Tony Hutter
743334913e Tag 2.4.0
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-12-18 09:31:29 -08:00
Tony Hutter
2143bff328 CI: Change timeout values
The 'Setup QEMU' CI step updates and installs all packages necessary to
start up QEMU.  Typically the step takes a little over a minute, but
we've seen cases where it can legitimately take more than 45
minutes.  Change the timeout to 60 minutes.

In addition, change the 'Install dependencies' timeout to 60min since
we've also seen timeouts there.

Lastly, remove all timeouts from the zfs-qemu-packages workflow.
We do this so that we can always build packages from a branch, even if
the time it takes to do a CI step changes over time.  It's ok to
eliminate the timeouts from the zfs-qemu-packages completely since that
workflow is only run manually.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18056
2025-12-18 09:31:29 -08:00
Brian Behlendorf
42411327cb Tag 2.4.0-rc5
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2025-12-10 10:21:29 -08:00
Ameer Hamza
47319ef7a6 ZTS: Add test for snapshot automount race
Add snapshot_019_pos to verify parallel snapshot automount operations
don't cause AVL tree panic. Regression test for commit 4ce030e025.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #18035
2025-12-10 10:21:29 -08:00
Ameer Hamza
0bcbee6040 Fix snapshot automount race causing duplicate mounts and AVL tree panic
Multiple threads racing to automount the same snapshot can both spawn
mount helper processes that successfully complete, causing both parent
threads to attempt AVL tree registration and triggering a VERIFY()
panic in avl_add(). This occurs because the fsconfig/fsmount API lacks
the serialization provided by traditional mount() via lock_mount().

The fix adds a per-entry mutex (se_mtx) to zfs_snapentry_t that
serializes mount and unmount operations on the same snapshot. The first
mount thread creates a pending entry with se_spa=NULL and holds se_mtx
during the helper execution. Concurrent mounts find the pending entry
and return success without spawning duplicate helpers. Unmount waits on
se_mtx if a mount is pending, ensuring proper serialization. This allows
different snapshots to mount in parallel while preventing the AVL panic.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17943
2025-12-10 10:21:29 -08:00
Ameer Hamza
74bbdda1ef Fix snapshot automount expiry cancellation deadlock
A deadlock occurs when snapshot expiry tasks are cancelled while holding
locks. The snapshot expiry task (snapentry_expire) spawns an umount
process and waits for it to complete. Concurrently, ARC memory pressure
triggers arc_prune which calls zfs_exit_fs(), attempting to cancel the
expiry task while holding locks. The umount process spawned by the
expiry task blocks trying to acquire locks held by arc_prune, which is
blocked waiting for the expiry task to complete. This creates a circular
dependency: expiry task waits for umount, umount waits for arc_prune,
arc_prune waits for expiry task.

Fix by adding non-blocking cancellation support to taskq_cancel_id().
The zfs_exit_fs() path calls zfsctl_snapshot_unmount_delay() to
reschedule the unmount, which needs to cancel any existing expiry task.
It now uses non-blocking cancellation to avoid waiting while holding
locks, breaking the deadlock by returning immediately when the task is
already running.

The per-entry se_taskqid_lock has been removed, with all taskqid
operations now protected by the global zfs_snapshot_lock held as
WRITER. Additionally, an se_in_umount flag prevents recursive waits when
zfsctl_destroy() is called during unmount. The taskqid is now only
cleared by the caller on successful cancellation; running tasks clear
their own taskqid upon completion.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17941
2025-12-10 10:21:29 -08:00
Ameer Hamza
663dc86de2 Fix taskq NULL pointer dereference on timer race
Remove unsafe timer_pending() check in taskq_cancel_id() that created a
race where:
- Timer expires and timer_pending() returns FALSE
- task_done() frees task with tqent_func = NULL
- Timer callback executes and queues freed task
- Worker thread crashes executing NULL function

Always call timer_delete_sync() unconditionally to ensure timer callback
completes before task is freed.

Reliably reproducible by injecting mdelay(10) after setting CANCEL flag
to widen the race window, combined with frequent task cancellations
(e.g., snapshot automount expiry).

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17942
2025-12-10 10:21:29 -08:00
Brian Behlendorf
145c606c60 Linux 6.18 compat: META (#18039)
Update the META file to reflect compatibility with the 6.18
kernel.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-12-10 10:21:29 -08:00
Rob Norris
c9845a1332 Linux: work around use of GPL-only symbol kasan_flag_enabled
We may not be able to avoid our code referencing the symbol, but we can
ensure that a symbol of that name is available to the linker during
build, and so not require linking the GPL-exported version.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18009
Closes #18040
2025-12-10 10:21:29 -08:00
Chunwei Chen
028d66b9dd Fix ddtprune causing space leak
In zio_ddt_free, if a pruned dde is still in ddt, it would do nothing
and cause a space leak.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #17982
Closes #17983
2025-12-10 10:21:29 -08:00
Tony Hutter
206487b9b1 CI: Fix Ubuntu 22.01 rsend failures
For whatever reason, the single `log_note` in the `directory_diff`
function causes the function to stop executing on Ubuntu 22.  This
causes most of the rsend tests to fail.  Remove the line since it's only
informational.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-12-10 10:21:29 -08:00
Alex
f8572e2a97 Fix a declaration position of the nth_page.
Compile-time bug introduced by commit 87df5e4.
Fix for the compilation error (Linux kernel 6.18.0):
"zfs/module/os/linux/zfs/abd_os.c:920:32: error: implicit declaration
of function ‘nth_page’; did you mean ‘pte_page’?
[-Werror=implicit-function-declaration]".

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: agiUnderground <alex.dev.cv@gmail.com>
Closes #18034
2025-12-10 10:21:29 -08:00
Brian Behlendorf
8c1eaea952 CI: exclude signed-off-by/reviewed-by from 72 char limit
Allow an author or reviewer's name and email address to exceed
the 72 character limit enforced by the commitcheck target.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #18030
2025-12-10 10:21:29 -08:00
bspengler-oss
25d755e108 Fix HIGHMEM/kmap API violation in zfs_uiomove_bvec_impl()
Fix another instance where ZFS assumes multiple pages can be
mapped at once via zfs_kmap_local(), resulting in crashes and
potential memory corruption on HIGHMEM-enabled (typically 32-bit)
systems.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-10 10:21:29 -08:00
bspengler-oss
5946eeb8df Preserve LIFO ordering of kmap ops in abd_raidz_gen_iterate()
ZFS typically preserves proper LIFO ordering regarding map/unmap
operations that wrap the Linux kernel's kmap interfaces that
require such ordering, but one instance in abd_raidz_gen_iterate()
did not.

Similar issues have been fixed in the Linux kernel in the past,
see for instance CVE-2025-39899 for userfaultfd.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-10 10:21:29 -08:00
bspengler-oss
5e271995d1 Fix interaction of abd_iter_map()/abd_iter_unmap() with HIGHMEM
HIGHMEM kmap interfaces operate on only a single page at a time
yet ZFS hadn't accounted for this, resulting in crashes and
potential memory corruption on HIGHMEM (typically 32-bit) systems.
This was caught by PaX's KERNSEAL feature as it makes use of
HIGHMEM functionality on x64.

On typical 64-bit systems, this issue wouldn't have been observed,
as the map interfaces simply fall back to returning an address in
lowmem where the contiguous pages can be accessed directly.
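
A rough sketch of the single-page constraint, using hypothetical names
(`toy_map_chunk()`, `TOY_PAGE_SIZE`) that are not OpenZFS or kernel
APIs: each mapping covers at most the rest of the current page, so a
range that crosses page boundaries has to be walked in several
map/unmap rounds.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_PAGE_SIZE	4096	/* assumed page size for the sketch */

/*
 * Bytes that can be touched through a mapping of a single page,
 * starting at byte offset "off" within the buffer.
 */
static size_t
toy_map_chunk(size_t off, size_t len)
{
	size_t in_page = TOY_PAGE_SIZE - (off % TOY_PAGE_SIZE);

	return (len < in_page ? len : in_page);
}

/* Count the map/unmap rounds needed to walk [off, off + len). */
static size_t
toy_count_mappings(size_t off, size_t len)
{
	size_t rounds = 0;

	while (len > 0) {
		size_t n = toy_map_chunk(off, len);

		assert(n > 0);
		off += n;
		len -= n;
		rounds++;
	}
	return (rounds);
}
```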

Joint work with the PaX Team, tested by Mark van Dijk

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-10 10:21:29 -08:00
Mark Johnston
a2f768f61f FreeBSD: Fix a potential null dereference in zfs_freebsd_fsync()
In general it's possible for a vnode to not have an associated VM
object.  This happens in particular with named pipes, which have
some distinct VOPs, defined in zfs_fifoops.  Thus, this chunk of
zfs_freebsd_fsync() needs to check for the FIFO case, like other
vm_object_mightbedirty() callers do.

(Note that vn_flush_cached_data() calls are predicated on
zn_has_cached_data() returning true, and it checks for a NULL v_object
pointer already.)

Fixes: ef4058fcdc
Reported-by: Collin Funk <collin.funk1@gmail.com>
Reviewed-by: Sean Eric Fagan <sef@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #18015
2025-12-10 10:21:29 -08:00
Alan Somers
872266a5f3 During CI, use nproc instead of sysctl -n hw.ncpu
The latter may give the wrong result if cpusets are in use.

Sponsored by:	ConnectWise
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Closes #18012
2025-12-10 10:21:29 -08:00
Brian Behlendorf
ed87bc593f ZTS: Add slow_vdev_degraded_sit_out retry
While not common, the draid3 vdev type has been observed to
not always sit out a vdev when run in the CI.  To prevent
continued false positives, allow the test to be retried up
to three times before considering it a failure.
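
A rough sketch of the retry policy described above, written in Python
purely for illustration.  It is not the ZTS implementation;
run_with_retries and make_flaky_test are hypothetical names, and the
sketch assumes "up to three times" means three attempts in total.

```python
def run_with_retries(test_fn, max_attempts=3):
    """Run test_fn until it passes or max_attempts is reached.

    test_fn is any callable returning True on success and False on
    failure.  Returns (passed, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        if test_fn():
            return True, attempt
    return False, max_attempts


def make_flaky_test(failures_before_pass):
    """Build a hypothetical test that fails a fixed number of times first."""
    state = {"calls": 0}

    def flaky_test():
        state["calls"] += 1
        return state["calls"] > failures_before_pass

    return flaky_test
```

A test that fails twice and then passes is reported as a pass on the
third attempt; one that fails three times in a row is a failure.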

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #18003
2025-12-10 10:21:29 -08:00
Alexander Motin
e1f0baa546 FreeBSD: Remove HAVE_INLINE_FLSL use
These macros have been deprecated in the FreeBSD kernel for several
years, and unneeded for much longer.  Instead, similar to Linux, let
the kernel let the compiler do the right thing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18004
2025-12-10 10:21:29 -08:00
Alexander Motin
071369803e raidz_test: Restore rand_data protection
It feels dirty to modify the protection of memory allocated via libc,
but at least we should try to restore it before freeing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-10 10:21:29 -08:00
Alexander Motin
6e10a51b74 raidz_test: Fix ZIO ABDs initialization
- When filling ABDs of several segments, consider offset.
- "Corrupt" ABDs with actually different data to fail something.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-10 10:21:29 -08:00
Alexander Motin
001ce40cd4 raidz_test: Set io_offset reasonably
- io_offset of 1 makes no sense.  Set default to 0.
- Initialize io_offset in all cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-10 10:21:29 -08:00
Alexander Motin
68c1df8db3 ZFS: Enable more logs for raidz_001_neg
The output is not so big here, so let's collect something useful.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-10 10:21:29 -08:00
Alexander Motin
a41ef36858 DDT: Reduce global DDT lock scope during writes
Before this change DDT lock was taken 4 times per written block,
and as effectively a pool-wide lock it can be highly congested.
This change introduces a new per-entry dde_io_lock, protecting some
fields during I/O ready and done stages, so that we don't need the
global lock there.

According to my write tests on a 64-thread system with 4KB blocks, this
significantly reduces the global lock contention, reducing CPU usage
from 100% to the expected ~80%, and increasing write throughput by 10%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17960
2025-12-10 10:21:29 -08:00
Alexander Motin
a785ddc5f3 DDT: Switch to using wmsums for lookup stats
ddt_lookup() is very busy code running under a highly congested global
lock.  Anything we can save here is very important.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17980
2025-12-10 10:21:29 -08:00
Alexander Motin
2aad3dee23 DDT: Make children writes inherit allocator
Even though unlike gang children it is not so critical for dedup
children to inherit parent's allocator, there is still no reason
for them to have allocation policy different from normal writes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17961
2025-12-10 10:21:29 -08:00
Tony Hutter
cdbe788a39 CI: zfs-test-packages: Add in new repos
Test install from our new repos: zfs-latest, zfs-legacy,
zfs-2.3, zfs-2.2, from the zfs-test-packages workflow.
This on-demand workflow is used to verify that the zfs RPMs
in the repos are correct.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17956
2025-12-10 10:21:29 -08:00
Rob Norris
d12eb47d96 config/kmap_atomic: initialise test data
6.18 changes kmap_atomic() to take a const pointer. This is no problem
for the places we use it, but Clang fails the test due to a warning
about being unable to guarantee that uninitialised data will definitely
not change. Easily solved by forcibly initialising it.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17954
2025-12-10 10:21:29 -08:00
Rob Norris
304810208e zvol_id: make array length properly known at compile time
Using strlen() in a static array declaration is a GCC extension. Clang
calls it "gnu-folding-constant" and warns about it, which breaks the
build. If it were widespread we could just turn off the warning, but
since there's only one case, let's just change the array to an explicit
size.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17954
2025-12-10 10:21:29 -08:00
Rob Norris
aa091a17bd Linux: bump -std to gnu11
Linux switched from -std=gnu89 to -std=gnu11 in 5.18
(torvalds/linux@e8c07082a8). We've always overridden that with gnu99
because we use some newer features.

More recent kernels are using C11 features in headers that we include.
GCC generally doesn't seem to care, but more recent versions of Clang
seem to be enforcing our gnu99 override more strictly, which breaks the
build in some configurations.

Just bumping our "override" to match the kernel seems to be the easiest
workaround. It's an effective no-op since 5.18, while still allowing us
to build on older kernels.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17954
2025-12-10 10:21:29 -08:00
Alexx Saver
f45622ff42 chksum: run 256K benchmark on demand, preserve chksum_stat_data
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexx Saver <lzsaver.eth@ethermail.io>
Co-authored-by: Adam Moss <c@yotes.com>
Closes #17945
Closes #17946
2025-12-10 10:21:29 -08:00
Alexander Motin
2e09f166f0 FreeBSD: Fix uninitialized variable error
On FreeBSD errno is defined as (* __error()), which means the compiler
can't tell whether two consecutive reads will return the same value.
Without this knowledge the reported error is formally right.

Caching errno in a local variable fixes the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17975
2025-12-10 10:21:29 -08:00
Shreshth3
c8ecd63acd zpool: fix special vdev -v -o conflict
Right now, running `zpool list` with -v and -o passed
does not work properly for special vdevs. This commit
fixes that problem.

See the discussion on #17839.
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Closes #17932
2025-12-10 10:21:29 -08:00
Brian Behlendorf
d06ebddee4 CI: Add smatch static analysis workflow
Smatch is an actively maintained kernel-aware static analyzer
for C with a low false positive rate.  Since the code checker
can be run relatively quickly against the entire OpenZFS code
base (15 min) it makes sense to add it as a GitHub Actions
workflow.  Today smatch reports a significant number of warnings,
so the workflow is configured to always pass as long as the
analysis was run.  The results are available for reference.
Long term it would be ideal to resolve all of the errors/warnings,
at which point the workflow can be updated to fail when new
problems are detected.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Toomas Soome <tsoome@me.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17935
2025-12-09 15:34:45 -08:00
Toomas Soome
040c533280 cmd/zpool cstyle issues
add missing headers.
usage() is no-return, so anything after call to it is unreachable code.
use (void) cast where we do ignore return value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #17885
2025-12-09 15:34:13 -08:00
Brian Behlendorf
099f69ff5d Tag 2.4.0-rc4
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2025-11-12 13:10:09 -08:00
Brian Behlendorf
7a919fb70c Update all ABI files
Refresh all ABI files using the CI generated files to reflect
the library interfaces to be published for the 2.4 release.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17911
2025-11-12 13:07:36 -08:00
Brian Behlendorf
5714090fb9 libspl: hide zfs_tunable_* symbols
The zfs_tunable_* functions are a public interface that is
part of the internal libspl convenience library.  They should
be hidden to prevent an unnecessary ABI change in installed
libraries which link against libspl (e.g. libzfs_core, libuutil).

We do already leak long standing libspl symbols.  This commit is
solely intended to prevent leaking these new ones until this is
properly sorted out.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17911
2025-11-12 13:07:32 -08:00
Brian Behlendorf
5b2489caf2 Bump SONAME of libzfs and libzpool
The ABIs of libzfs and libzpool have breaking changes since the
last major release.  Bump the SONAME for the upcoming 2.4 release
branch to libzfs7 and libzpool7.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17911
2025-11-12 13:07:28 -08:00
Brian Behlendorf
ff536b1538 Bump SONAME on libnvpair
The nvlist_snprintf() function was added to the ABI of libnvpair.
No other symbols were modified or removed.  Bump the library-info
SONAME current and age args to reflect this is a minor library
version update.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17911
2025-11-12 13:07:23 -08:00
Adi-Goll
7ebb5e9b3f Reduce timeout to zero when running inside a container
Detect container environments and set timeout to zero unless
ZFS_MODULE_TIMEOUT is already set. This avoids an unnecessary ten
second delay after running zfs/zpool commands in a container where
/dev/zfs is unavailable.
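
The decision described above can be sketched as follows.  This is an
illustrative Python model only, not the libzfs code.  module_timeout is
a hypothetical name, the container check is passed in as a flag because
the detection method is not described here, and the ten second default
is taken from the delay mentioned in this message.

```python
DEFAULT_TIMEOUT_SECONDS = 10  # the ten second delay mentioned above


def module_timeout(env, in_container):
    """Return how many seconds to wait for /dev/zfs to appear.

    env          -- mapping of environment variables (e.g. os.environ)
    in_container -- result of some container check (hypothetical; the
                    real detection method is not modelled here)
    """
    override = env.get("ZFS_MODULE_TIMEOUT")
    if override is not None:
        # An explicit setting always wins, even inside a container.
        return int(override)
    if in_container:
        return 0
    return DEFAULT_TIMEOUT_SECONDS
```

With an empty environment the sketch returns 0 inside a container and
10 outside one; setting ZFS_MODULE_TIMEOUT overrides both.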

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adi Gollamudi <adigollamudi@gmail.com>
Closes #15165
Closes #17922
2025-11-12 13:07:20 -08:00
Mariusz Zaborski
1e8c96d7d5 Add knob to disable slow io notifications
Introduce a new vdev property `VDEV_PROP_SLOW_IO_REPORTING` that
allows users to disable notifications for slow devices.
This prevents ZED and/or ZFSD from degrading the pool due to slow
I/O.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <oshogbo@FreeBSD.org>
Closes #17477
2025-11-12 13:07:14 -08:00
Alexander Motin
41878d57ea Add BRT support to zpool prefetch command
Implement BRT (Block Reference Table) prefetch functionality similar
to existing DDT prefetch.  This allows preloading BRT metadata into
ARC to improve performance for block cloning operations and frees
of earlier cloned blocks.

Make -t parameter optional.  When omitted, prefetch all supported
metadata types (both DDT and BRT now).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17890
2025-11-12 13:07:09 -08:00
Alexander Motin
002bc3da6a BRT: Increase block size from 4KB to 8KB
According to my observations, BRT ZAPs are typically compressible
3:1 for data and 2:1 for indirects.  With ashift=12, typical these
days, this means that by increasing the block size to 8KB we may get
most of the possible compression, cutting the on-disk and in-ARC BRT
footprint in half at the cost of some compression/decompression
overhead, but without real write inflation, only some dirty data
increase.

An increase to 32KB, similar to DDT, could further improve compression
and storage efficiency, but at the cost of write inflation and a
much bigger dirty data increase, which we cannot properly control
now.  So let's leave this for a time when BRT log gets implemented.
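
The arithmetic behind these numbers can be sketched with a simplified
model.  This is an illustration in Python, not ZFS code: it assumes the
compressed size of a block is simply rounded up to a whole number of
2^ashift byte sectors, and allocated_size and footprint_ratio are
hypothetical helper names.

```python
import math


def allocated_size(block_size, compress_ratio, ashift=12):
    """Bytes allocated for one block after compression.

    Simplified model: the compressed size is rounded up to a whole
    number of 2**ashift-byte sectors.  Real allocation has more cases.
    """
    sector = 1 << ashift
    compressed = block_size / compress_ratio
    return math.ceil(compressed / sector) * sector


def footprint_ratio(block_size, compress_ratio, ashift=12):
    """Allocated bytes per logical byte (lower is better)."""
    return allocated_size(block_size, compress_ratio, ashift) / block_size
```

Under this model a 4KB block at 3:1 still occupies one 4KB sector, an
8KB block at 3:1 also fits in one 4KB sector (half the footprint), and
a 32KB block at 3:1 needs three sectors (12KB).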

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17916
2025-11-12 13:07:04 -08:00
Alexander Motin
e895c76194 ZAP: Remove dmu_object_info_from_dnode() call
dmu_object_info_from_dnode() takes two locks and copies plenty of
data that we don't need in zap_lockdir_impl().  Just read dn_type
directly in this hot path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17921
2025-11-12 13:07:00 -08:00
Rob Norris
ac0bc4cc00 spa_misc: add an API for spa_namespace_lock
This is useful as debugging support, as it lets namespace lock
operations be traced directly. It will also be useful for future work to
reduce the use of spa_namespace_lock, traditionally a source of
difficult deadlocks.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17906
2025-11-12 13:06:54 -08:00
Alexander Motin
e305c7d596 BRT: Fix ranges to blocks conversion math
BRT_RANGESIZE_TO_NBLOCKS() takes number of ranges as its argument.
To get number of blocks we should multiply it by the entry size,
not divide by it, as it was due to missing parentheses.

Before #17875 this could cause small memory corruptions for vdevs
bigger than 64TB, but the change made the bug more noticeable.
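
A small model of the precedence problem, in Python for illustration.
The exact form of the macro is not shown in this message, so both
functions are assumed shapes, and BLOCK_SIZE and ENTRY_SIZE are
illustrative values rather than the real constants.

```python
BLOCK_SIZE = 4096  # bytes per on-disk block (illustrative value)
ENTRY_SIZE = 2     # bytes per range entry (illustrative value)


def nblocks_buggy(nranges):
    # Mirrors a macro missing parentheses: the range count ends up
    # divided by both the block size and the entry size.
    return (nranges - 1) // BLOCK_SIZE // ENTRY_SIZE + 1


def nblocks_fixed(nranges):
    # Each block holds BLOCK_SIZE // ENTRY_SIZE entries, which is the
    # same as multiplying the range count by ENTRY_SIZE to get bytes
    # and then dividing by BLOCK_SIZE.
    entries_per_block = BLOCK_SIZE // ENTRY_SIZE
    return (nranges - 1) // entries_per_block + 1
```

For 10000 ranges the fixed version yields 5 blocks while the buggy one
yields 2, which is the kind of undercount that leads to writes past
the end of an allocation.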

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17886
Closes #17915
2025-11-12 13:06:48 -08:00
Adi-Goll
e1734111fd Update man page description of zpool rewind
Update description of zpool import --rewind-to-checkpoint in
man/man7/zpoolconcepts.7 to explain that rewinding automatically
discards a checkpoint.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adi Gollamudi <adigollamudi@gmail.com>
Closes #12646
Closes #17918
2025-11-12 13:06:43 -08:00
Alexander Motin
aaf374bd40 ZIO: Set minimum number of free issue threads to 32
Free issue threads might block waiting for synchronous DDT, BRT or
GANG header reads. So unlike other taskqs using ZTI_SCALE to scale
with number of CPUs, here we also need some amount of threads to
potentially saturate pool reads.  I am not sure we always want the
96 threads we had before ZTI_SCALE introduction at #11966 on small
systems, but let's make it at least 32.

While here, make free taskqs configurable, similar to read and
write ones.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17903
2025-11-12 13:06:39 -08:00
rmacklem
583db40030 FreeBSD: Add support for _PC_CASE_INSENSITIVE
FreeBSD now has a pathconf name called _PC_CASE_INSENSITIVE
used to check if a file system performs case insensitive
name lookups.

This patch adds support for this name.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #17908
2025-11-12 13:06:36 -08:00
Brian Behlendorf
84dd55510b zstd: disable intrinsics
Disable the aarch64 NEON SIMD intrinsics for kernel builds.  Safely
using them in the kernel context requires saving/restoring the FPU
registers which is not currently done.

Additionally, remove the aarch64 optimized PREFETCH_L1 and PREFETCH_L2
instructions.  Rely on the more portable compiler built-ins.

This lets us remove the problematic workaround in the aarch64_compat.h
header which undefines the __aarch64__ macro.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17904
Closes #17852
2025-11-12 13:06:22 -08:00
Adi-Goll
015729a11b Fix typo in vdev_raidz.c
Change the spelling of "begining" on line 4875 to
"beginning".

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adi Gollamudi <adigollamudi@gmail.com>
Closes #17905
2025-11-12 13:06:19 -08:00
Toomas Soome
4fd926ab40 libzfs: ignoring unreachable code
We have an infinite loop and, on a certain condition, we exit this loop
and the thread with pthread_exit(). But after this loop we also
have code to perform pthread_cleanup_pop() and return from the
thread.

The problem is that modern compilers are able to recognize that we
never actually reach the statements after the loop, and therefore
that code is dead.

I think, instead of calling pthread_exit(), it is better to break out of
the loop and let the last statements work as intended. This is because
we do need to keep pthread_cleanup_pop() anyhow. Of course,
it is a matter of taste whether we use return or pthread_exit() as the
very last statement in this function.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #17900
2025-11-12 13:06:15 -08:00
Rob Norris
7b121388fb man: describe zfs-rewrite method and properties
We've heard anecdotes that suggest some
confusion/surprise/disappointment that a changed recordsize is not
applied during rewrite. Until such time as we actually can do that, we
can at least explicitly mention it as something that doesn't work.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17898
2025-11-12 13:06:10 -08:00
Alexander Ziaee
055e908d47 zfs-jail.8: Add introductory sentence, refactor
Add an introductory sentence explaining why the reader may want to use
this command, and establishing the requirement that the jail must be
running. Move other requirements from the description of the subcommands
to follow this for flow and structure. Move the caveat that this is for
FreeBSD down to a canonical CAVEATS section, and cross-reference Linux's
equivalent functionality. Also mention in that section that this utility
cannot be used to delegate the root directory of the jail.

Reported by: Jan Brankamp <crest@rlwinm.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Alexander Ziaee <ziaee@FreeBSD.org>
Closes #17883
2025-11-12 13:06:06 -08:00
Tony Hutter
a2a34d9212 Linux 6.17 compat: Fix broken projectquota on 6.17
We need to specifically use the FS_XFLAG_* macros in zpl_ioctl_*attr()
codepaths, and the FS_*_FL macros in the zpl_ioctl_*flags() codepaths.
The earlier code just assumed the FS_*_FL macros for both codepaths.
The 6.17 kernel adds a bitmask check in copy_fsxattr_from_user() that
exposed this error via failing 'projectquota' ZTS tests.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17884
Closes #17869
2025-11-12 13:06:01 -08:00
Paul Dagnelie
dda711dbb5 Fix gang write late_arrival bug
When a write comes in via dmu_sync_late_arrival, its txg is equal to the
open TXG. If that write gangs, and we have not yet activated the new
gang header feature, and the gang header we pick can store a larger gang
header, we will try to schedule the upgrade for the open TXG + 1. In
debug mode, this causes an assertion to trip. This PR sets the TXG for
activating the feature to be the larger of either the current open TXG
or the syncing TXG + 1.
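
The selection rule in the last sentence amounts to a single max().  A
minimal Python sketch, with feature_activation_txg as a hypothetical
name that does not appear in the ZFS source:

```python
def feature_activation_txg(open_txg, syncing_txg):
    """Pick the TXG in which to schedule the feature activation.

    Takes the larger of the open TXG and the syncing TXG + 1, as the
    commit message describes.
    """
    return max(open_txg, syncing_txg + 1)
```

For a late-arrival write with open TXG 100 and syncing TXG 98 this
picks 100, so the activation is never scheduled past the open TXG.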

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17824
2025-11-12 13:05:54 -08:00
Tino Reichardt
be1e5d599b CI: Update FreeBSD versions and ci-type handling
Update FreeBSD versions:
- add FreeBSD 15.0-STABLE
- add FreeBSD 16.0-CURRENT

So we use the latest versions of each line now:
  - FreeBSD 14.3 (RELEASE)
  - FreeBSD 15.0 (STABLE)
  - FreeBSD 16.0 (CURRENT)

In commits - you may specify which type of CI should run:
- ZFS-CI-Type: quick
- ZFS-CI-Type: linux
- ZFS-CI-Type: freebsd
- ZFS-CI-Type: full

Reviewed-by: Alexx Saver <lzsaver@users.noreply.github.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17896
2025-11-12 13:05:49 -08:00
Toomas Soome
612e8f1e57 get_key_material_https: label 'kfdok' defined but not used
The label 'kfdok' is only used with O_TMPFILE, we need to use
the same #ifdef around this label.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #17894
2025-11-12 13:05:45 -08:00
Robert Evans
5582e8b08e Update dnode_next_offset_level to accept blkid instead of offset
Currently this function uses L0 offsets which:
1. is hard to read since it maps offsets to blkid and back each call
2. necessitates dnode_next_block to handle edge cases at limits
3. makes it hard to tell if the traversal can loop infinitely

Instead, update this and dnode_next_offset to work in (blkid, index).
This way the blkid manipulations are clear, and it's also clear that
the traversal always terminates since blkid goes one direction.

I've also considered updating dnode_next_offset to operate on blkid.
Callers use both patterns, so maybe another PR can split the cases?

While here tidy up dnode_next_offset_level comments.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #17792
2025-11-12 13:05:40 -08:00
jamisiveshkumar
c9835dab1f Fix capitalization typo in README.md
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Sivesh Kumar <siveshjami@gmail.com>
Closes #17889
2025-11-12 13:05:35 -08:00
Alexander Motin
67fc49433f Cleanup ZIO_FLAG_IO_RETRY vs TRYHARD usage
In cases where all issued ZIOs must succeed, and we can't do
anything clever about the errors, we should just explicitly set
ZIO_FLAG_TRYHARD and let the OS do all the reasonable retries.

In other cases, where retries can be different from the original,
for example, some ZIOs are allowed to fail due to redundancy, or
we can disable aggregation on retry to get at least some of
the data, we can do the first pass without TRYHARD, and only if needed
retry with ZIO_FLAG_IO_RETRY (which implies TRYHARD semantics).

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17877
2025-11-12 13:05:31 -08:00
Alexander Motin
e3acd0a728 Fix caching of DDT log and BRT
We read both the DDT log and the BRT counters on pool import and then
only append or overwrite them in full blocks.  We don't need them in
the DMU or ARC caches.  Fortunately we have DMU_UNCACHEDIO for this now.

Even more so, we don't need BRT in the non-evictable metadata DMU caches,
since it will likely never fit there, while blocking the cache from
its original users.  Since DMU_OT_IS_METADATA_CACHED() has no way
to differentiate the new metadata types, mark BRT with a storage
type of DMU_OT_DDT_ZAP.  As a side effect this will also put it on
the dedup device, but that should actually be right.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17875
2025-11-12 13:05:25 -08:00
Alexander Motin
178a8be216 BRT: Round bv_entcount up to BRT_BLOCKSIZE
Since we set the bv_mos_brtvdev block size, and since we keep the dirty
bitmap at the same granularity, we should do allocations and writes at
that granularity too.  Otherwise the last block write is short, which
will be odd once we implement writing of only dirty blocks, and which
also requires read-modify-write at the DMU layer.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17875
2025-11-12 13:05:21 -08:00
Joseph Anthony Pasquale Holsten
29567f13f6 autogen.sh: remove workaround for automake <1.14, needed for EL <=7
Ultimately this is a revert of 779ac93, which according to
@nabijaczleweli is to paper over automake <1.14's lack of
%reldir% support.

As I understand it, EL8 is the lowest current build target.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Joseph Holsten <joseph@josephholsten.com>
Closes #17878
2025-11-12 13:05:16 -08:00
Brian Behlendorf
e8d2e08345 Retire ZoL patch scripts
Remove the out of date helper scripts originally used to port
Illumos commits to the ZoL repository.  Due to layout changes
made to this repository they're no longer entirely correct.
Remove them to make it clear they're no longer being used or
actively maintained.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17880
2025-11-12 13:05:11 -08:00
Alexander Motin
5847626175 Pass flags to more DMU write/hold functions
Over time many DMU functions got a flags argument to control
prefetch, caching, etc.  A few functions were left without it, even
though a closer look showed that many of them do not require prefetch
due to their access pattern.  This patch adds the flags argument to
dmu_write(), dmu_buf_hold_array() and dmu_buf_hold_array_by_bonus(),
passing DMU_READ_NO_PREFETCH where applicable.

I am going to also pass DMU_UNCACHEDIO to some of them later.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17872
2025-11-12 13:04:58 -08:00
Quartz
9a9e06e5dd man: Update zpool-event subclass names and document new types
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Quartz <yyhran@163.com>
Closes #17868
2025-11-12 13:04:51 -08:00
Toomas Soome
82d59f7666 ZTS: autotrim_config.ksh is missing pool type
The functional/trim tests create pools of different types to test
trim, but autotrim_config.ksh is missing the type from the zpool
create command line while looping over the different pool
types.

Sponsored-by: Edgecast Cloud LLC.
Signed-off-by: Toomas Soome <tsoome@me.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17874
2025-11-12 13:04:47 -08:00
Ryan Libby
672fea2a50 FreeBSD zio_crypt.c: initialize uio variables before access
In zio_crypt_key_wrap and zio_crypt_key_unwrap, the cuio_s variable was
not initialized before the calls to zfs_uio_init, leading to
uninitialized access to cuio_s.uio_offset.  Initialize it to avoid gcc
warnings.

Similar issue as fixed in 2bf152021 ("Fix gcc uninitialized warning in
FreeBSD zio_crypt.c")

Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17863
2025-11-12 13:04:42 -08:00
Rob Norris
ad6eee2b9b mailmap/AUTHORS: update with recent new contributors
We’re not always on the same page, but at least we’re in the same book.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17860
2025-11-12 13:04:35 -08:00
Rob Norris
f43839e7fd ZTS: fail test run if test runner crashes unexpectedly
zfs-tests.sh executes test-runner.py to do the actual test work. Any
exit code < 4 is interpreted as success, with the actual value
describing the outcome of the tests inside.

If a Python program crashes in some way (eg an uncaught exception), the
process exit code is 1.

Taken together, this means that test-runner.py can crash during setup,
but return a "success" error code to zfs-tests.sh, which will report and
exit 0. This in turn causes the CI runner to believe the test run
completed successfully.

This commit addresses this by making zfs-tests.sh interpret an exit code
of 255 as a failure in the runner itself. Then, in test-runner.py, the
"fail()" function defaults to a 255 return, and the main function gets
wrapped in a generic exception handler, which prints it and calls
fail().

All together, this should mean that any unexpected failure in the test
runner itself will be propagated out of zfs-tests.sh for CI or any other
calling program to deal with.
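
The shape of the change can be sketched like this.  It is an
illustrative Python model, not the actual test-runner.py or
zfs-tests.sh code: run_guarded, caller_verdict and crashing_main are
hypothetical names, and the handling of exit codes between 4 and 254
is an assumption.

```python
RUNNER_FAILURE = 255  # exit code reserved for a failure in the runner itself


def run_guarded(main_fn):
    """Call main_fn and return the exit code the process should use.

    Any uncaught exception is reported and mapped to RUNNER_FAILURE
    instead of Python's default exit code of 1.
    """
    try:
        return main_fn()
    except Exception as exc:
        print(f"test runner failed: {exc!r}")
        return RUNNER_FAILURE


def caller_verdict(exit_code):
    """How a calling script might interpret the runner's exit code."""
    if exit_code == RUNNER_FAILURE:
        return "runner-failure"
    if exit_code < 4:
        return "tests-ran"
    return "error"  # assumption: anything else is treated as an error


def crashing_main():
    """Hypothetical main function that fails during setup."""
    raise RuntimeError("setup failed")
```

With this wrapper a crash during setup surfaces as 255 rather than 1,
so it can no longer be mistaken for a test outcome in the 0 to 3 range.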

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17858
2025-11-12 12:57:59 -08:00
Tony Hutter
814f9afba7 Tag 2.4.0-rc3
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-10-21 09:52:08 -07:00
Jean-Sébastien Pédron
6f6e1c90ae FreeBSD: zfs_getpages: Don't zero freshly allocated pages
Initially, `zfs_getpages()` is provided with an array of busy pages by
the vnode pager. It then tries to acquire the range lock, but if there
is a concurrent `zfs_write()` running and it fails to acquire that range
lock, it "unbusies" the pages to avoid a deadlock with `zfs_write()`.
After that, it grabs the pages again and retries to acquire the range
lock, and so on.

Once it has the range lock, it filters out valid pages, then copies DMU
data to the remaining invalid pages.

The problem is that freshly allocated zero'd pages it grabbed itself are
marked as valid. Therefore they are skipped by the second part of the
function and DMU data is never copied to these pages. This causes mapped
pages to contain zeros instead of the expected file content.

This was discovered while working on RabbitMQ on FreeBSD. I could
reproduce the problem easily with the following commands:

    git clone https://github.com/rabbitmq/rabbitmq-server.git
    cd rabbitmq-server/deps/rabbit

    gmake distclean-ct RABBITMQ_METADATA_STORE=mnesia \
      ct-amqp_client t=cluster_size_3:leader_transfer_stream_send

The testsuite fails because there is a sendfile(2) that can happen
concurrently to a write(2) on the same file. This leads to sendfile(2)
or read(2) (after the sendfile) sending/returning data with zeros, which
causes a function to crash.

The patch consists of not setting the `VM_ALLOC_ZERO` flag when
`zfs_getpages()` grabs pages again. Then, the last page is zero'd if it
is invalid, in case it would be partially filled with the end of the
file content. Other pages are either valid (and will be skipped) or they
will be entirely overwritten by the file content.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
Closes #17851
2025-10-21 09:50:43 -07:00
Rob Norris
aeff23939a Linux 6.18: generic_drop_inode() and generic_delete_inode() renamed
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
7730109762 sha256_generic: make internal functions a little more private
Linux 6.18 has conflicting prototypes for various sha256_* and sha512_*
functions, which we get through a very long include chain. That's tough
to fix right now; easier is just to rename our internal functions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
2778832e22 Linux 6.18: namespace type moved to ns_common
The namespace type has moved from the namespace ops struct to the
"common" base namespace struct. Detect this and define a macro that does
the right thing for both versions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
005c631499 Linux 6.18: replace write_cache_pages()
Linux 6.18 removed write_cache_pages() without a usable replacement.
Here we implement a minimal zpl_write_cache_pages() that finds the dirty
pages within the mapping, gets them into the expected state and hands
them off to zfs_putpage(), which handles the rest.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
04d0f83f4e Linux 6.18: block_device_operations->getgeo takes struct gendisk*
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
49f078997a Linux 6.18: convert ida_simple_* calls
ida_simple_get() and ida_simple_remove() are removed in 6.18. However,
since 4.19 they have been simple wrappers around ida_alloc() and
ida_free(), so we can just use those directly.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
3fb241157f Linux 6.18: replace nth_page()
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Andrew Walker
799bda73e2 Fix return value for setting zvol threading
We must return -1 instead of ENOENT if the special zvol threading
property set function can't locate the dataset (this would typically
happen with an encrypted and unmounted zvol) so that the operation
gets inserted properly into the nvlist for operations to set. This
is because we want the property to be set once the zvol is
decrypted again.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #17836
2025-10-21 09:50:43 -07:00
Shreshth3
b0106a1b74 zdb: fix bug with -A flag
Fixes #10544.

According to the manpage, zdb -A should
ignore all assertions. But it currently
does not do that. This commit fixes
this bug.

Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17825
2025-10-21 09:50:43 -07:00
Andrew Walker
084f8d0077 Fix ZFS_READONLY implementation on Linux
MS-FSCC 2.6 is the governing document for
DOS attribute behavior. It specifies the following:

For a file, applications can read the file but
cannot write to it or delete it. For a directory,
applications cannot delete it, but applications can
create and delete files from the directory.

Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17837
2025-10-21 09:50:43 -07:00
Brian Behlendorf
7987d4deb4 Update device removal documentation
Make a minor update to the 'zpool remove' man page to clarify both
raidz and draid pools do not support removal, and change sector to
ashift which is what we actually care about.

Update the big theory comment in vdev_removal.c to accurately reflect
which types of vdevs can be removed.  Furthermore, I've added some
discussion for the casual reader to briefly explain the top-level
vdev removal restrictions.  This has been a common area of confusion
and it's not intuitive where they come from without understanding
the implementation details.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17847
2025-10-21 09:50:43 -07:00
Rob Norris
1956417b54 mmap_seek: print error code and text on failure
If lseek() returns an unexpected error, it's useful to know the error
code to help connect it to the trouble spot inside the module.

Since the two seek functions should be basically identical, lift them
into a single generic function.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Robert Evans <evansr@google.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17843
2025-10-21 09:50:43 -07:00
Ameer Hamza
3378a324df CI: Fix FreeBSD 15.0 by staying on ALPHA4 due to broken ALPHA5 image
FreeBSD 15.0-ALPHA5 image fails to boot on cloud VMs due to missing
/boot/efi mount point, causing the system to drop to single user mode
where SSH cannot start. Work around this by staying on ALPHA4 and
setting IGNORE_OSVERSION=yes to bypass pkg's kernel version mismatch
prompt during bootstrap. This allows CI to proceed with ALPHA4 until we
have a stable FreeBSD 15.0 image.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17846
2025-10-21 09:50:43 -07:00
Shreshth3
f16fa115d1 arc: fix small typos
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Closes #17840
2025-10-21 09:50:43 -07:00
Rob Norris
f0c76f8a7b libzpool/cmn_err: remove suppression, add stop option, cleanup
A small uplift of the cmn_err() and panic() calls in userspace:

- remove the suppression on CE_NOTE. We have very few of these calls in
  a standard build, and it's convenient for "print debugging".

- make prefixes clear and consistent.

- add LIBZPOOL_PANIC_STOP environment variable to send SIGSTOP to the
  process group on a panic, rather than abort(), so all threads remain
  alive for inspection.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17834
2025-10-21 09:50:43 -07:00
Mark Johnston
c1f55bff8b Fix the type of the raidz_outlier_check_interval_ms parameter
It's an hrtime_t, which is an unsigned long long.  In practice this is
just a U64.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #17833
2025-10-21 09:50:43 -07:00
Alexander Motin
f0bff230f9 Suppress some ashift warnings
Do not warn about vdev ashifts being smaller than physical ashifts
in a pool status if the pool ashift property is set and the vdev
ashift satisfies it (bigger or equal), since the user explicitly
requested this.  The ashifts of individual vdevs are still reported.
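
For illustration only, the status rule above can be read as the
following predicate.  The function and parameter names are hypothetical
and not taken from the patch, and a property value of 0 is assumed to
mean "not set".

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the actual libzfs code): decide whether
 * pool status should warn about a vdev whose ashift is below the
 * physical ashift reported by the device.
 * pool_ashift_prop == 0 is assumed to mean "property not set".
 */
static bool
should_warn_ashift(uint64_t vdev_ashift, uint64_t physical_ashift,
    uint64_t pool_ashift_prop)
{
	/* Nothing to warn about if the vdev is not under-aligned. */
	if (vdev_ashift >= physical_ashift)
		return (false);
	/* The user asked for this ashift and the vdev satisfies it. */
	if (pool_ashift_prop != 0 && vdev_ashift >= pool_ashift_prop)
		return (false);
	return (true);
}
```

With these assumptions, a vdev at ashift 9 on a 4K-sector device warns
when the property is unset, and stays quiet when the property is 9.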

Do not warn about vdev ashifts in zpool import, since it doesn't
matter much, and we don't even report individual vdev ashifts
there.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17830
2025-10-21 09:50:43 -07:00
Alexander Motin
b9356f06ed Explicit set ashift for non-leaf vdevs
Before this change the ashift property was applied only to leaf
vdevs.  As a result, it worked only as a minimal value for parent
vdevs, since bigger physical_ashift value reported by any child
could be used instead when deciding parent's ashift, as if the
ashift property was never set.

This change explicitly passes ZPOOL_CONFIG_ASHIFT to all vdevs,
allowing override for parents only if the passed value is below
logical_ashift and so unacceptable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17826
2025-10-21 09:50:43 -07:00
Ameer Hamza
30a3e609a2 zpool_reopen_004_pos: Clear label from offline disk after destroy
zpool_reopen_004_pos destroys a pool with an offline disk, leaving its
label intact. In TrueNAS local repo, zpool_reopen_005_pos is skipped,
causing zpool_reopen_007_pos to fail as it doesn't use -f flag when
creating pools unlike zpool_reopen_005_pos.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17831
2025-10-21 09:50:43 -07:00
Dag-Erling Smørgrav
964dfc3176 FreeBSD: Correct _PC_MIN_HOLE_SIZE
The actual minimum hole size on ZFS is variable, but we always report
SPA_MINBLOCKSIZE, which is 512.  This may lead applications to believe
that they can reliably create holes at 512-byte boundaries and waste
resources trying to punch holes that ZFS ends up filling anyway.

* In the general case, if the vnode is a regular file, return its
  current block size, or the record size if the file is smaller than
  its own block size.  If the vnode is a directory, return the dataset
  record size.  If it is neither a regular file nor a directory,
  return EINVAL.

* In the control directory case, always return EINVAL.
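
The rules above can be sketched as a small pure function.  This is
not the FreeBSD vnode code; the enum, the parameter names and the
`in_ctldir` flag are hypothetical stand-ins for what the real
pathconf handler reads from the vnode and dataset.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vnode type. */
enum node_type { NODE_REG, NODE_DIR, NODE_OTHER };

/*
 * Returns 0 and stores the reported minimum hole size in *out,
 * or returns EINVAL when no value applies.
 */
static int
min_hole_size(enum node_type type, bool in_ctldir, uint64_t file_size,
    uint64_t blksz, uint64_t recordsize, uint64_t *out)
{
	/* Control directory case: always EINVAL. */
	if (in_ctldir)
		return (EINVAL);

	switch (type) {
	case NODE_REG:
		/* File smaller than its own block: report the record size. */
		*out = (file_size < blksz) ? recordsize : blksz;
		return (0);
	case NODE_DIR:
		/* Directories report the dataset record size. */
		*out = recordsize;
		return (0);
	default:
		return (EINVAL);
	}
}
```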

Signed-off-by: Dag-Erling Smørgrav <des@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17750
2025-10-21 09:50:43 -07:00
Shreshth3
e4a393cf78 Add missing include statement
Resolve a build failure for user applications that include <sys/uio.h>.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Closes #17781
Closes #17814
2025-10-21 09:50:43 -07:00
Tony Hutter
e09c86cb1f zvol: verify IO type is supported
ZVOLs don't support all block layer IO request types.  Add a check for
the IO types we do support.  Also, remove references to
io_is_secure_erase() since they are not supported on ZVOLs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17803
2025-10-21 09:50:43 -07:00
Mateusz Guzik
6c73fd8eeb Annotate arc_buf_is_shared as __maybe_unused
Otherwise the compiler warns about it on production FreeBSD builds.

The routine proved resilient to attempts to ifdef on debug.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes #17818
2025-10-21 09:50:43 -07:00
Tino Reichardt
9050ecb75c CI: Switch FreeBSD 15 to 15.0-ALPHA4 and add FreeBSD 16
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17815
2025-10-21 09:50:43 -07:00
Ivan Shapovalov
cf9163f250 zdb: adjust block histogram binning strategy
Previously, a bin included all blocks _starting_ from given size
(e.g., a "4K" bin would include all blocks within the [4K; 8K) region).
This is counter-intuitive and does not match the typical use-case of the
block histogram (that is, to estimate disk usage considering how ZFS'
block allocation works). In other words, if I'm looking at the "4K" row,
I'm interested in records that _fit into_ a 4K block.

Adjust the binning strategy such that a bin includes all blocks _up to_
given size, such that e.g. a "4K" bin would include all blocks within
the (2K; 4K] region.
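
As a sketch of the difference, assuming bins are indexed by the power
of two of their label (so "4K" is bin 12), the old strategy rounds the
exponent down and the new one rounds it up.  The helper names below
are made up for this example and are not the zdb functions.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based position of the highest set bit; 0 when x == 0. */
static unsigned
highbit_sketch(uint64_t x)
{
	unsigned n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return (n);
}

/* Old binning: floor(log2(size)), so bin 12 covers [4K; 8K). */
static unsigned
bin_starting_from(uint64_t size)
{
	assert(size > 0);
	return (highbit_sketch(size) - 1);
}

/* New binning: ceil(log2(size)), so bin 12 covers (2K; 4K]. */
static unsigned
bin_up_to(uint64_t size)
{
	assert(size > 0);
	return (highbit_sketch(size - 1));
}
```

Under this sketch a 3K block lands in the "2K" row with the old
strategy and in the "4K" row with the new one.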

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16999
2025-10-21 09:50:43 -07:00
Ivan Shapovalov
250e2ec229 zdb: factor out block histogram bin number computation
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16999
2025-10-21 09:50:43 -07:00
Ivan Shapovalov
968cfc3df2 zdb: add --class=(normal|special|...) to filter blocks by alloc class
When counting blocks to generate block size histograms (`-bb`), accept a
`--class=` argument (as a comma-separated list of either "normal",
"special", "dedup" or "other") to only consider blocks that belong to
these metaslab classes.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16999
2025-10-21 09:50:43 -07:00
Ivan Shapovalov
627b530059 zdb: add --bin=(lsize|psize|asize) arg to control histogram binning
When counting blocks to generate block size histograms (`-bb`), accept a
`--bin=` argument to force placing blocks into all three bins based on
*this* size.

E.g. with `--bin=lsize`, a block with lsize=512K, psize=128K, asize=256K
will be placed into the "512K" bin in all three output columns. This
way, by looking at the "512K" row the user will be able to determine
how well ZFS was able to compress blocks of this logical size.

Conversely, with `--bin=psize`, by looking at the "128K" row the user
will be able to determine how much overhead was incurred for storage
of blocks of this physical size.
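
A rough sketch of the accounting, assuming each histogram row keeps a
byte total per column.  The struct, the enum (including `BIN_OWN` for
the default of binning each column by its own size) and the function
names are hypothetical and only mirror the description above.

```c
#include <assert.h>
#include <stdint.h>

#define	NBINS	64

/* BIN_OWN: every column is binned by its own size (assumed default). */
enum bin_mode { BIN_OWN, BIN_LSIZE, BIN_PSIZE, BIN_ASIZE };

struct hist_row {
	uint64_t lbytes;	/* logical bytes accounted to this row */
	uint64_t pbytes;	/* physical bytes accounted to this row */
	uint64_t abytes;	/* allocated bytes accounted to this row */
};

/* Bin index such that bin N covers (2^(N-1); 2^N]. */
static unsigned
size_to_bin(uint64_t size)
{
	unsigned bin = 0;

	assert(size > 0);
	for (size--; size != 0; size >>= 1)
		bin++;
	return (bin);
}

static void
account_block(struct hist_row rows[NBINS], enum bin_mode mode,
    uint64_t lsize, uint64_t psize, uint64_t asize)
{
	uint64_t forced = 0;

	switch (mode) {
	case BIN_LSIZE: forced = lsize; break;
	case BIN_PSIZE: forced = psize; break;
	case BIN_ASIZE: forced = asize; break;
	case BIN_OWN: break;
	}

	/* With a forced size, all three columns share one row. */
	rows[size_to_bin(forced ? forced : lsize)].lbytes += lsize;
	rows[size_to_bin(forced ? forced : psize)].pbytes += psize;
	rows[size_to_bin(forced ? forced : asize)].abytes += asize;
}
```

With `BIN_LSIZE`, the block from the example above adds 512K, 128K and
256K to the three columns of the same "512K" row, which is what makes
the per-row compression ratio readable.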

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16999
2025-10-21 09:50:43 -07:00
Ivan Shapovalov
6809137db5 zdb: convert ALLOCATED_OPT into anonymous enum
We are adding more long-only options, so use an enum for all of them
to avoid manually numbering these constants.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16999
2025-10-21 09:50:43 -07:00
Rob Norris
3e7e19e028 pool_iter_refresh: don't refresh pools twice
In "all pools" mode, pool_iter_refresh() will call zpool_iter(), which
will call zpool_refresh_stats() before calling add_pool(). If we already
have the pool, this is a different handle, so we just release it and
return. Back in pool_iter_refresh(), we then call zpool_refresh_stats()
again for our handle on the same pool.

All together, this means we're doing two ZFS_IOC_POOL_STATS calls into
the kernel for every pool in the system. This isn't wrong, but it does
double the pressure on global locks.

Instead, we add a new function zpool_refresh_stats_from_handle() that
simply copies the pool config and state from one handle to another, and
use it to update our handle before we release it in add_pool(), so we
only have one call per pool per interval.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17807
2025-10-21 09:50:43 -07:00
Rob Norris
4c84b77bc4 pool_iter_refresh: don't flag existing pools as refreshed
zpool_iter() passes the callback a new instance of zpool_handle_t each
time, so the existing handle in the pool_list AVL never actually gets a
refresh. Internally, that means its zpool_config is never updated, and
the old config is never moved to zpool_old_config. As a result,
print_iostat() never sees any updated config, and so repeats the first
line forever.

This is the simplest workaround: just don't mark existing pools as
refreshed. pool_list_refresh() will see this and refresh them.
The downside is a second call to ZFS_IOC_POOL_STATS for existing pools,
because zpool_iter() just called it for the handle we threw away.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17807
2025-10-21 09:50:43 -07:00
Rob Norris
37d8d4619f zpool iostat: update pool counter when skipping boot row
When skipping the boot row (with -y), the early loop meant we weren't
updating the "last_npools" count. That means the count never advanced
past zero, so cb_iteration was always reset to 0, leading to it being
"stuck" on the boot line, printing the header and nothing else forever.

Updating the pool counter on every loop sorts that out: it advances,
cb_iteration moves properly, and normal rows are printed.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17807
2025-10-21 09:50:43 -07:00
Ameer Hamza
1585a10a85 Make mount/share errors non-fatal for zfs create/clone
If zfs_mount_and_share() fails, the error propagates to zfs create/clone
commands despite successful operation. If create/clone operations were
successful, there's no point in making zfs_mount_and_share() failures
fatal.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17799
2025-10-21 09:50:43 -07:00
Igor Ostapenko
b9d1e28a71 ddt prune: Add SCL_ZIO deadlock workaround
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Igor Ostapenko <igor.ostapenko@klarasystems.com>
Closes #17793
2025-10-21 09:50:43 -07:00
Igor Ostapenko
01180a63bd spa_config: Rename spa_config_enter_mmp() to spa_config_enter_priority()
Originally this was created for MMP, but now new cases are emerging
where the same mechanism is required. Hence the name's generalization.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Igor Ostapenko <igor.ostapenko@klarasystems.com>
Closes #17793
2025-10-21 09:50:43 -07:00
Robert Evans
ead0fb736d zinject: Introduce ready delay fault injection
This adds a pause to the ZIO pipeline in the ready stage for
matching I/O (data, dnode, or raw bookmark).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #17787
2025-10-21 09:50:43 -07:00
Paul Dagnelie
073b34b3ee Fix display of default xattr to show 'sa'
When the default value of the xattr property was changed from 'dir' to
'sa', the code that displays the property's value was not affected. The
problem with this state of affairs is that 1) user tooling that
specifically looked for 'sa' before will be confused now that the code
displays 'on' instead. And 2) users may be confused when manually
running the commands about which specific type of xattr is in use unless
they are up to date on the latest zfs changes.

The fix here is to show the actual type always, rather than 'on' if we
happen to be using the default. This turns out to be easy to do, by
simply reordering the list of xattr values in the properties code. When
the property is displayed, we iterate down the table until we find a row
with a matching value, and use that row's name as the
display. Reordering the row fixes the display without affecting any
other code.
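
To illustrate why reordering is enough, here is a minimal sketch of a
first-match table lookup.  The table contents, numeric values and names
are assumptions made for the example and are not copied from the
properties code.

```c
#include <stddef.h>

/* Hypothetical name/value row, terminated by a NULL name. */
struct index_row {
	const char *name;
	int value;
};

enum { X_OFF = 0, X_DIR = 1, X_SA = 2 };

/* "on" is listed before "sa": the default displays as "on". */
static const struct index_row xattr_before[] = {
	{ "off", X_OFF }, { "on", X_SA }, { "sa", X_SA },
	{ "dir", X_DIR }, { NULL, 0 }
};

/* "sa" is listed before "on": the default displays as "sa". */
static const struct index_row xattr_after[] = {
	{ "off", X_OFF }, { "sa", X_SA }, { "on", X_SA },
	{ "dir", X_DIR }, { NULL, 0 }
};

/* Return the name of the first row whose value matches. */
static const char *
index_to_name(const struct index_row *table, int value)
{
	for (; table->name != NULL; table++) {
		if (table->value == value)
			return (table->name);
	}
	return (NULL);
}
```

Parsing is unaffected in this sketch because both names still map to
the same value; only the first match used for display changes.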

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17801
2025-10-21 09:50:43 -07:00
Shreshth3
c0d63f5435 docs: fix a few small typos (#17804)
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-10-21 09:50:43 -07:00
Brian Behlendorf
2f50d67409 Tag 2.4.0-rc2
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2025-09-30 12:49:15 -07:00
nav1s
0939787e83 manuals: fix typos in zpool-upgrade man page
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: nav1s <nav1s@proton.me>
Closes #17797
2025-09-29 16:50:57 -07:00
hoshinomori
f3295ec763 range_tree: drop duplicate zfs_ prefix from rs_set_fill_raw
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: hoshinomori <hoshinomori@owarisekai.moe>
Closes #17800
2025-09-29 16:50:53 -07:00
Rob Norris
35ec4b14ab zpool iostat: refresh pool list every interval
When running zpool iostat in interval mode, it would not notice any new
pools created or imported, and would forget any destroyed or exported,
so would not notice if they came back. This leads to outputting "no
pools available" every interval until killed.

It looks like this was at least intended to work; the comment above
zpool_do_iostat() indicates that it is expected to "deal with pool
creation/destruction" and that pool_list_update() would detect new
pools. That call however was removed in 3e43edd2c5, though it's unclear
if that broke this behaviour and it wasn't noticed, or if it never
worked, or if something later broke it. That said, the lack of
pool_list_update() is only part of the reason it doesn't work properly.

The fundamental problem is that the various things involved in
refreshing or updating the list of pools would aggressively ignore,
remove, skip or fail on pools that stop existing, or that already exist.
Mostly this meant that once a pool is removed from the list, it will
never be seen again. Restoring pool_list_update() to the
zpool_do_iostat() loop only partially fixes this - it would find "new"
pools again, but only in the "all pools" (no args) mode, and because its
iterator callback add_pool() would abort the iterator if it already has
a pool listed, it would only add pools if there weren't any already.

So, this commit reworks the structure somewhat. pool_list_update()
becomes pool_list_refresh(), and will ensure the state of all pools in
the list are updated. In the "all pools" mode, it will also add new
pools and remove pools that disappear, but when a fixed list of pools is
used, the list doesn't change, only the state of the pools within it.

The rest of the commit is adjusting things for this much simpler
structure. Regardless of the mode in use, pool_list_refresh() will
always do the right thing, so the driver code can just get on with the
display.

Now that pools can appear and disappear, I've made it so the header (if
enabled) is re-printed when the list changes, so that it's easier to see
what's happening if the column widths change.

Since this is all rather complicated, I've included tests for the "all
pools" and "set of pools" modes.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17786
2025-09-29 16:50:49 -07:00
Tony Hutter
abda34b1c0 CI: Add ZTS -O option, log Setup Testing Machines step
Add a -O option to zfs-tests.sh to dump debug information on test
timeout.  The debug info includes:

- 30 lines from 'top'
- /proc/<PID>/stack output of process with highest CPU usage
- Last lines strace-ing process with highest CPU usage
- /proc/sysrq-trigger kernel stack traces

All debug information gets dumped to /dev/kmsg (Linux only).

In addition, print out the VM console lines from the "Setup Testing
Machines" step.  We have often seen VMs time out at this step and don't
know why.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17753
2025-09-29 16:50:46 -07:00
Tony Hutter
9079f986ae zvol: Fix blk-mq sync
The zvol blk-mq codepaths would erroneously send FLUSH and TRIM
commands down the read codepath, rather than write.  This fixes
the issue, and updates the zvol_misc_fua test to verify that
sync writes are actually happening.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17761
Closes #17765
2025-09-29 16:50:43 -07:00
Brian Behlendorf
a9bcf4faf3 CI: Switch FreeBSD 15 to 15.0-ALPHA3
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17795
2025-09-29 16:50:39 -07:00
Brian Behlendorf
ca9b89bd2d CI: Remove Buildbot references
The Buildbot CI infrastructure has been fully replaced by GitHub
Actions.  Remove any lingering references from the repository.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17794
2025-09-29 16:50:35 -07:00
Brian Behlendorf
654a2ccc74 Linux 6.17 compat: META
Update the META file to reflect compatibility with the 6.17
kernel.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17789
2025-09-26 12:16:17 -07:00
Brian Behlendorf
ddecc5ff21 CI: update perf and bpftools with the kernel packages
When updating a Fedora instance to an experimental kernel make sure
to include the matching versioned perf and bpftool packages.  This
helps ensure there are no unexpected conflicts which would prevent
the new packages from being installed.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17791
2025-09-26 12:16:14 -07:00
patrickxia
e1a6ec42d4 zdb: add ZFS_KEYFORMAT_RAW support for -K option
This change adds support for ZFS_KEYFORMAT_RAW to zdb_derive_key in
zdb.c. The implementation reads the raw key from the file specified
by the -K option which is consistent with how raw keys are handled in
the other parts of ZFS, along with a check to ensure that the keyfile
doesn't have too many bytes.
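
A minimal sketch of such a length-checked read, assuming a 32-byte raw
key.  `RAW_KEY_LEN` and `read_raw_key()` are names invented for this
example, and the choice of error codes is an assumption; this is not
the zdb_derive_key code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define	RAW_KEY_LEN	32	/* assumed raw key length in bytes */

/*
 * Read exactly RAW_KEY_LEN bytes from fp into key.  One extra byte
 * is requested so that an oversized keyfile can be detected.
 */
static int
read_raw_key(FILE *fp, uint8_t key[RAW_KEY_LEN])
{
	uint8_t buf[RAW_KEY_LEN + 1];
	size_t n;

	assert(fp != NULL && key != NULL);
	n = fread(buf, 1, sizeof (buf), fp);
	if (ferror(fp))
		return (EIO);
	/* Too short or too long: not a valid raw key. */
	if (n != RAW_KEY_LEN)
		return (EINVAL);
	memcpy(key, buf, RAW_KEY_LEN);
	return (0);
}
```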

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Patrick Xia <patrickx@google.com>
Closes #17783
2025-09-25 12:08:20 -07:00
Robert Evans
460858dfd6 dnode_next_offset: backtrack if lower level does not match
This changes the basic search algorithm from a single search up and down
the tree to a full depth-first traversal to handle conditions where the
tree matches at a higher level but not a lower level.

Normally higher level blocks always point to matching blocks, but there
are cases where this does not happen:

1. Racing block pointer updates from dbuf_write_ready.

   Before f664f1ee7f (#8946), both dbuf_write_ready and
   dnode_next_offset held dn_struct_rwlock which protected against
   pointer writes from concurrent syncs.

   This no longer applies, so sync context can f.e. clear or fill all
   L1->L0 BPs before the L2->L1 BP and higher BPs are updated.

   dnode_free_range in particular can reach this case and skip over L1
   blocks that need to be dirtied. Later, sync will panic in
   free_children when trying to clear a non-dirty indirect block.

   This case was found with ztest.

2. txg > 0, non-hole case. This is #11196.

   Freeing blocks/dnodes breaks the assumption that a match at a higher
   level implies a match at a lower level when filtering txg > 0.

   Whenever some but not all L0 blocks are freed, the parent L1 block is
   rewritten. Its updated L2->L1 BP reflects a newer birth txg.

   Later when searching by txg, if the L1 block matches since the txg is
   newer, it is possible that none of the remaining L1->L0 BPs match if
   none have been updated.

   The same behavior is possible with dnode search at L0.

   This is reachable from dsl_destroy_head for synchronous freeing.
   When this happens open context fails to free objects leaving sync
   context stuck freeing potentially many objects.

   This is also reachable from traverse_pool for extreme rewind where it
   is theoretically possible that datasets not dirtied after txg are
   skipped if the MOS has high enough indirection to trigger this case.

In both of these cases, without backtracking the search ends prematurely
as an ESRCH result implies no more matches in the entire object.
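
The effect can be shown on a toy two-level tree.  This is only a
sketch of the idea under simplified assumptions (fixed fanout, boolean
"matches" flags), with made-up names; it is not the dnode_next_offset
implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define	FANOUT	4

/* Toy L1 block: its own match flag plus flags for its L0 children. */
struct l1_block {
	bool matches;
	bool l0_matches[FANOUT];
};

/* Single descent: give up if the first matching L1 has no matching L0. */
static int
search_single(const struct l1_block *l1, size_t n, size_t *out)
{
	assert(out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!l1[i].matches)
			continue;
		for (size_t j = 0; j < FANOUT; j++) {
			if (l1[i].l0_matches[j]) {
				*out = i * FANOUT + j;
				return (0);
			}
		}
		return (ESRCH);	/* premature: later L1s are never tried */
	}
	return (ESRCH);
}

/* Depth-first with backtracking: move on to the next L1 on a miss. */
static int
search_backtrack(const struct l1_block *l1, size_t n, size_t *out)
{
	assert(out != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!l1[i].matches)
			continue;
		for (size_t j = 0; j < FANOUT; j++) {
			if (l1[i].l0_matches[j]) {
				*out = i * FANOUT + j;
				return (0);
			}
		}
		/* No L0 match under this L1: backtrack and keep going. */
	}
	return (ESRCH);
}
```

When the first L1 matches but none of its children do, the single
descent reports ESRCH even though a later L1 holds a matching L0,
while the backtracking version finds it.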

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16025
Closes #11196
2025-09-25 12:08:17 -07:00
Brian Behlendorf
954fe5e1be Add interface to interface spa_get_worst_case_min_alloc() function
Provide an interface to retrieve the lowest and highest minimum
allocation size for the normal allocation class.  This can be used
by external consumers of the DMU to estimate potential wasted
capacity when setting the recordsize for an object.

The new "min_alloc" and "max_alloc" keys are added to the pool
configuration and used by default_volblocksize() to warn when
an inefficient block size is requested.  For older kmods which
don't yet include the new keys, fall back to the previous logic.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17758
2025-09-25 12:08:14 -07:00
Brian Behlendorf
d33d0cac5a Fix 'zpool add' safety check corner cases
Three cases were discovered where 'zpool add' would fail to
warn when adding vdevs to a pool with a mismatched replication
level.  These are:

  1. When a pool contains mixed file and disk vdevs.
  2. When a pool contains an active dRAID distributed spare
  3. When a pool contains an active hot spare

The lack of warnings is caused by get_replication() assessing
the current pool configuration as inconsistent and disabling
the mismatched replication check for the new pool configuration
after 'zpool add'.  This change updates get_replication() to
be slightly more tolerant in the non-fatal case.

The zpool_add_010_pos.ksh test case was split into separate
tests: zpool_add_warn_create.ksh, zpool_add_warn_degraded.ksh,
and zpool_add_warn_removal.ksh.  These tests were extended to
include coverage for dRAID pools and the three scenarios
described above.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17780
2025-09-25 12:08:09 -07:00
Brian Behlendorf
9bd8f4379c ZTS: update upgrade_readonly_pool.ksh
Modify the test case to use the `zfs mount` command instead
of directly calling the mount command, create a dedicated dataset,
and use the default mount point.  These changes are intended to
preserve the intent of the original test case and resolve some
spurious mount failures which have been observed by the CI.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17785
2025-09-25 12:08:06 -07:00
jozzsi
6dad2f61a3 contrib: dracut: install dependent kernel modules
Eliminates the need for the following workaround

> Add other drivers to dracut:

```
if grep mpt3sas /proc/modules; then
  echo 'force_drivers+=" mpt3sas "'  >> /etc/dracut.conf.d/zfs.conf
fi
if grep virtio_blk /proc/modules; then
  echo 'filesystems+=" virtio_blk "' >> /etc/dracut.conf.d/fs.conf
fi
```

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jo Zzsi <jozzsicsataban@gmail.com>
Closes #17762
2025-09-25 12:08:02 -07:00
Alexander Motin
61a68554de zdb: Fix asize overflow in verify_livelist_allocs()
Spacemap entry might be too big to fit into a block pointer ashift.
We hit an assertion trying to run `zdb -bvy` on a large pool.  But
it seems the code does not really need size there, since we only
need to search for a range of offsets, so setting it to zero should
just make btree return position just before the first entry.  I
suspect the previous code could actually miss the first entry
due to this if its size was smaller.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17764
2025-09-25 12:07:55 -07:00
trick2011
9bcda0b5fe Use "vdev" instead of "devices" when referring to vdevs
Update documentation to use the correct terminology.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: trick2011 <trick2011@users.noreply.github.com>
Closes #17734
Closes #17755
2025-09-25 12:07:52 -07:00
Tony Hutter
2380e0b679 ZTS: Fix stale symlinks with zfs-helpers.sh
zfs-helpers.sh is a utility script that sets up udev symlinks so you
can run ZTS from a local ZFS git workspace.  However, it doesn't check
that the udev symlinks point to the current workspace.  They may point
to an old workspace that has been deleted.  This means the udev rules
never get executed, which in turn causes the zvol tests to fail.

This commit removes old symlinks that do not point to the current
ZFS workspace.
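
The check can be sketched roughly as follows.  This is not the code
from zfs-helpers.sh; the helper name remove_stale_link is hypothetical,
and the sketch assumes the symlinks use absolute targets and that
readlink is available.

```shell
# Hypothetical helper: remove $1 if it is a symlink whose target does not
# live under the workspace directory $2.  Anything else is left alone.
# Assumes absolute link targets; a relative target is treated as stale.
remove_stale_link() {
    link=$1
    workspace=$2
    [ -L "$link" ] || return 0          # not a symlink: nothing to do
    target=$(readlink "$link")
    case "$target" in
        "$workspace"/*) ;;              # points into this workspace: keep
        *) rm -f "$link" ;;             # points elsewhere: stale, remove
    esac
}
```

A caller would run it once per installed symlink, passing the top of
the current git workspace as the second argument.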

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17766
2025-09-25 12:07:47 -07:00
jozzsi
83066c9627 contrib: dracut: always include zfs kernel module
This commit fixes the issue and includes the zfs kernel
module even when dracut is used in hostonly mode.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jo Zzsi <jozzsicsataban@gmail.com>
Closes #17754
2025-09-25 12:07:43 -07:00
Alan Somers
ef9b7dde91 Fix a printf format specifier on FreeBSD/i386
This is breaking the build on FreeBSD/i386.  Originally committed
downstream as https://github.com/freebsd/freebsd-src/commit/2d76470b701

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	ConnectWise
Closes #17705
2025-09-17 16:34:24 -07:00
Alan Somers
9c6f72021d Fix atomic-alignment warnings in libspl on FreeBSD/i386
On i386, Clang complains about misaligned atomic operations.  Silence
these warnings to fix the build on FreeBSD/i386.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	ConnectWise
Closes #17708
2025-09-17 16:34:19 -07:00
Rob Norris
15a6b982c5 linux/super: add tunable to request immediate reclaim of unused dentries
Traditionally, unused dentries would be cached in the dentry cache until
the associated entry is no longer on disk. The cached dentry continues
to hold an inode reference, causing the inode to be pinned (see previous
commit).

Here we implement the dentry op d_delete, which is roughly analogous to
the drop_inode superblock op, and add a zfs_delete_dentry tunable to
control its behaviour. By default it continues the traditional
behaviour, but when the tunable is enabled, we signal that an unused
dentry should be freed immediately, releasing its inode reference, and
so allowing that inode to be deleted if no longer in use.

Sponsored-by: Klara, Inc.
Sponsored-by: Fastmail Pty Ltd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17746
2025-09-17 16:34:14 -07:00
Rob Norris
42b9995f88 linux/super: add tunable to request immediate reclaim of unused inodes
Traditionally, unused inodes would be held on the superblock inode cache
until the associated on-disk file is removed or the kernel requests
reclaim.  On filesystems with millions of rarely-used files, this can be
a lot of unusable memory.

Here we implement the superblock drop_inode method, and add a
zfs_delete_inode tunable to control its behaviour. By default it
continues the traditional behaviour, but when the tunable is enabled, we
signal that the inode should be deleted immediately when the last
reference is dropped, rather than cached. This releases the associated
data to the dbuf cache and ARC, allowing them to be reclaimed normally.

Sponsored-by: Klara, Inc.
Sponsored-by: Fastmail Pty Ltd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17746
2025-09-17 16:34:09 -07:00
buzzingwires
a056b3c341 Add typesets to zhack label repair test scripts
As a quality assurance measure, `typeset` is added to local variable
declarations to actually enforce their intended scope.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: buzzingwires <buzzingwires@outlook.com>
Closes #17732
2025-09-17 16:34:04 -07:00
buzzingwires
5f7253ca11 Refactor zhack label repair and fix -c regression on nonzero TXG
This commit fixes a likely regression introduced by 64db435 where the
checksum repair functionality (`-c` or default behavior) will perform
checks and access data associated with the newer undetach (`-u`)
functionality, resulting in a failure when an uberblock's TXG is not 0
as required by `-u` but not `-c`

Additionally, code is refactored for better separation of tasks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: buzzingwires <buzzingwires@outlook.com>
Closes #17732
2025-09-17 16:33:59 -07:00
Rob Norris
e3eb3ca3dc man: add silent rules for mancheck
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17747
2025-09-17 16:33:52 -07:00
Rob Norris
d0084a4109 mancheck: allow single files
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17747
2025-09-17 16:33:48 -07:00
Rob Norris
4698208c78 Shellcheck.am: add silent rules for shellcheck and checkbashisms
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17747
2025-09-17 16:33:42 -07:00
Alexander Motin
406f76b7e3 CI: Switch FreeBSD 15 to 15.0-ALPHA2
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17749
2025-09-15 12:44:05 -07:00
Igor Ostapenko
1ca4cd8a33 Fix txg_log_time ZAP key typo
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Igor Ostapenko <igor.ostapenko@klarasystems.com>
Sponsored-by: Klara, Inc.
Closes #17748
2025-09-15 12:44:01 -07:00
Kyle Evans
8b548776ff zfsprops(7): attempt to clarify the keylocation description
The current description is somewhat difficult to parse through, and in
some cases is a little unclear as to the behavior.

Split it into paragraphs based on the three distinct behaviors you
may get: prompt, file URL, HTTP(S) URL.  The descriptions of the file
and HTTP(S) behavior seem fine, but prompt is a little vague; expand
on it and make it clear that the behavior is actively based on whether
the inquisitor of key-data is provided with a tty for stdin or not.

Also clarify *why* one shouldn't "place keys which should be kept secret
on the command line" and note that you *have* to supply the key via
stdin if it's a raw key, just to be sure.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Kyle Evans <kevans@FreeBSD.org>
Closes #17742
2025-09-15 12:43:57 -07:00
Brian Behlendorf
a4cb155e8d ZTS: default to random data in fill_fs
Update the fill_fs helper function to request a random fill pattern
when the "data" argument isn't specified.  This ensures the default
behavior is to perform a more realistic fill of incompressible blocks.

Additionally, update a few test cases to specify a random fill.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17739
2025-09-15 12:43:52 -07:00
Brian Behlendorf
53c8d7071d ZTS: Fix zfs_send_delegation_user test
Correct the path in the common.run file.  The zfs_send_delegation_user
test is installed under cli_user not cli_root.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17740
2025-09-15 12:43:48 -07:00
Rob Norris
da33cfd436 vdev_disk_close: take disk write lock before destroying it
Many IO operations are submitted to the kernel async, and so the zio can
complete and trigger followup actions before the submission call returns. If one
of the followup actions closes the disk (eg during pool create/import),
the initiator may be left holding a lock on the disk at destruction.

Instead, take the write lock before finishing up and decoupling the disk
state from the vdev proper. The caller will hold until all IO is
submitted and locks released.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17719
2025-09-15 12:43:44 -07:00
Alexander Motin
efdb4bf07a Fix two infinite loops if dmu_prefetch_max set to zero
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17692
Closes #17729
2025-09-15 12:43:39 -07:00
Paul Dagnelie
cac483dbd4 Fix time database update calculations
The time database update math assumed that the timestamps were in
nanoseconds, but at some point in the development or review process they
changed to seconds. This PR fixes the math to use seconds instead.
    
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17735
2025-09-15 12:43:34 -07:00
Brian Behlendorf
c9de42e089 ZTS: refreserv/refreserv_raidz improvements
Several small changes intended to make this test reliable.

- Leave the default compression enabled for the pool and switch
  to using /dev/urandom as the data source.  Functionally this
  shouldn't impact the test but it's preferable to test with
  the pool defaults when possible.

- Verify the device is created and removed as required.  Switch
  to a unique volume name for more clarity in the logs.

- Use the ZVOL_DEVDIR to specify the device path.

- Speed up the test by creating the pool with an ashift=12 and
  testing 4K, 8K, 128K volblocksizes.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17725
2025-09-12 15:05:26 -07:00
Alexander Motin
41c6eaac8b Fix type in dbrrd_closest()
For ABS() to work, the argument must be signed, but rrdd_time is
uint64_t.  Clang noticed it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Fixes #16853
Closes #17733
2025-09-12 15:05:22 -07:00
Alexander Motin
b5d41deca9 FreeBSD: Satisfy ASSERT_VOP_IN_SEQC()
zfs_aclset_common() might be called for newly created or not even
created vnodes, which triggers assertions on newer FreeBSD versions
with DEBUG_VFS_LOCKS included into INVARIANTS.  In the first case
make sure to call vn_seqc_write_begin()/_end(), in the second just
skip the assertion.

The same has to be done for the project management IOCTL and
file-based extended attributes, since those are not going through VFS.

Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17722
2025-09-12 15:05:18 -07:00
Paul Dagnelie
2f41193a26 Make new zhack test a little more reliable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17728
2025-09-12 15:05:14 -07:00
Chunwei Chen
95d677efde Fix ddle memleak in ddt_log_load
In ddt_log_load(), when removing a duplicate entry from the flushing
tree, the entry is not freed, which causes a memory leak.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Co-authored-by: Chunwei Chen <david.chen@nutanix.com>
Closes #17657
Closes #17730
2025-09-12 15:05:10 -07:00
JT Pennington
43a9d9ac57 Add send:encrypted test
Create tests for the new send:encrypted permission

Sponsored-by: Klara, Inc.
Sponsored-by: Karakun AG
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: JT Pennington <jt.pennington@klarasystems.com>
Closes #17543
2025-09-12 15:05:05 -07:00
Allan Jude
6c4ede4026 ZFS allow send:encrypted
A new `zfs allow` permission that ONLY allows sending replication
streams in raw (encrypted) mode, so encrypted data will not be
decrypted as part of the replication process.

Sponsored-by: Klara, Inc.
Sponsored-by: Karakun AG
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Co-authored-by: JT Pennington <jt.pennington@klarasystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #17543
2025-09-12 15:05:02 -07:00
Tony Hutter
4a7a04630d zed: Add synchronous zedlets
Historically, ZED has blindly spawned off zedlets in parallel and never
worried about their completion order.  This means that you can
potentially have zedlets for event number 2 starting before zedlets for
event number 1 had finished.  Most of the time this is fine, and it
actually helps a lot when the system is getting spammed with hundreds
of events.

However, there are times when you want your zedlets to be executed
in sequence with the event ID.  That is where synchronous zedlets
come in.

ZED will wait for all previously spawned zedlets to finish before
running a synchronous zedlet.  Synchronous zedlets are guaranteed to be
the only zedlet running.  No other zedlets may run in parallel with a
synchronous zedlet.  Users should be careful to only use synchronous
zedlets when needed, since they decrease parallelism.

To make a zedlet synchronous, simply add a "-sync-" immediately
following the event name in the zedlet's file name:

	EVENT_NAME-sync-ZEDLETNAME.sh

For example, if you wanted a synchronous statechange script:

	statechange-sync-myzedlet.sh
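
The naming rule can be sketched as a small shell check.  This is only
an illustration, not ZED's actual parser; the helper name
is_sync_zedlet is hypothetical, and it assumes event names contain no
"-" characters.

```shell
# Hypothetical helper, not ZED's actual parser: succeed if the zedlet file
# name has "-sync-" immediately after the event name.
is_sync_zedlet() {
    name=${1##*/}                        # strip any leading directory
    rest=${name#*-}                      # drop the event name and first "-"
    [ "$rest" != "$name" ] || return 1   # no "-" in the name at all
    case "$rest" in
        sync-?*) return 0 ;;
        *) return 1 ;;
    esac
}

for f in statechange-sync-myzedlet.sh statechange-led.sh; do
    if is_sync_zedlet "$f"; then
        echo "$f: synchronous"
    else
        echo "$f: asynchronous"
    fi
done
```

The loop reports the first name as synchronous and the second as
asynchronous.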

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17335
2025-09-11 15:58:59 -07:00
Brian Behlendorf
3dc345851c Prevent scrubbing a read-only pool
While it would be nice to be able to scrub a pool imported read-only
this will currently trip an ASSERT.  Before we can support this there
are some design challenges which need to be thought through first.

For starters, a read-only import skips reading certain information
from disk which it knows won't be needed, such as the space maps.
Furthermore, the scrub process expects to be able to checkpoint its
progress, update the on-disk error log, and issue repair IO.  None of
these would be possible when the pool is imported read-only.

Each of these wrinkles can certainly be handled, but that will take
some significant work.  In the meantime we disable the 'zpool scrub'
command when the pool is imported read-only.

Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #17527
Closes #17717
2025-09-11 15:58:52 -07:00
Paul Dagnelie
df55ba7c49 Detect a slow raidz child during reads
A single slow responding disk can affect the overall read
performance of a raidz group.  When a raidz child disk is
determined to be a persistent slow outlier, then have it
sit out during reads for a period of time. The raidz group
can use parity to reconstruct the data that was skipped.

Each time a slow disk is placed into a sit out period, its
`vdev_stat.vs_slow_ios` count is incremented and a zevent
class `ereport.fs.zfs.delay` is posted.

The length of the sit out period can be changed using the
`raid_read_sit_out_secs` module parameter.  Setting it to
zero disables slow outlier detection.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Contributions-by: Don Brady <don.brady@klarasystems.com>
Contributions-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17227
2025-09-10 15:31:30 -07:00
Paul Dagnelie
0df85ec27c Remove RAIDZ reconstruct flags from debug defaults
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17227
2025-09-10 15:31:25 -07:00
Tony Hutter
123bfc32f3 ZTS: Print warning if running ZTS user_run test locally
Print a warning if you're attempting to run a ZTS test that calls
'user_run', and the ephemeral user doesn't have permissions to
access the test binaries.

This can happen if you're running ZTS from a local git repo.  In
that case the test user (say, 'testuser1') may need access to the
ZTS binaries in:

/home/<your_username>/zfs/tests/zfs-tests/bin/

... but 'testuser1' doesn't have permission to enter your home dir:

/home/<your_username>

The warning will help alert users to what is going on.  This will
not be an issue when ZTS is actually installed on the system
(via 'make install' or from packages).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17721
2025-09-10 15:01:42 -07:00
Alan Somers
177e9d07d0 Fix the build of crypto_test on LP32 architectures
test->id is a uint64_t, not a long.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	ConnectWise
Closes #17707
2025-09-10 15:01:37 -07:00
Paul Dagnelie
e2e708241a Enable zhack to work properly with 4k sector size disks
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17576
2025-09-10 15:01:32 -07:00
Paul Dagnelie
26983d6fa7 Add allocation profile export and zhack subcommand for import
When attempting to debug performance problems on large systems, one of
the major factors that affect performance is free space
fragmentation. This heavily affects the allocation process, which is an
area of active development in ZFS. Unfortunately, fragmenting a large
pool for testing purposes is time consuming; it usually involves filling
the pool and then repeatedly overwriting data until the free space
becomes fragmented, which can take many hours. And even if the time is
available, artificial workloads rarely generate the same fragmentation
patterns as the natural workloads they're attempting to mimic.

This patch has two parts. First, in zdb, we add the ability to export
the full allocation map of the pool. It iterates over each vdev,
printing every allocated segment in the ms_allocatable range tree. This
can be done while the pool is online, though in that case the allocation
map may actually be from several different TXGs as new ones are loaded
on demand.

The second is a new subcommand for zhack, zhack metaslab leak (and its
supporting kernel changes). This is a zhack subcommand that imports a
pool and then modifies the range trees of the metaslabs, allowing the
sync process to write them out normally. It does not currently store
those allocations anywhere to make them reversible, and there is no
corresponding free subcommand (which would be extremely dangerous); this
is an irreversible process, only intended for performance testing. The
only way to reclaim the space afterwards is to destroy the pool or roll
back to a checkpoint.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17576
2025-09-10 15:01:28 -07:00
Shengqi Chen
ca4f7d6d49 contrib/debian: install files into merged /usr
This commit synchronizes the debian packaging files with the distro
version (also maintained by me) as much as possible.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17712
2025-09-10 15:01:24 -07:00
Shengqi Chen
717c57c834 cmd: rename arcstat to zarcstat
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17712
2025-09-10 15:01:20 -07:00
Shengqi Chen
743866cd2a cmd: rename arc_summary to zarcsummary
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17712
2025-09-10 15:01:16 -07:00
Shengqi Chen
5bf1500ee3 Remove renaming notice and symlinks for arcstat and arc_summary
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Colm Buckley <colm@tuatha.org>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17712
2025-09-10 15:01:12 -07:00
Rob Norris
dc53e5c484 linux/rw_destroy: assert no holders before destroying
While rw_destroy() may do nothing on Linux, we still want to make sure
that we don't have any holders outstanding like we do for mutexes.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17718
2025-09-10 15:01:02 -07:00
Rob Norris
0df91abe82 Linux 6.17: d_set_d_op() is no longer available
We only have extremely narrow uses, so move it all into a single
function that does only what we need, with and without d_set_d_op().

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17621
2025-09-09 17:06:55 -07:00
Rob Norris
d469371033 config: restore ZFS_AC_KERNEL_DENTRY tests
Accidentally removed calls in ed048fdc5b.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17621
2025-09-09 17:06:52 -07:00
Tony Hutter
87e35bd3ab ZTS: Fix fault_limits timeouts
fault_limits would often hit the 10min timeout and be killed on Fedora
41-42.  Investigation showed that the 'fill_fs' portion of the test,
which would fill the pool with junk data before vdev replacement, was
writing highly compressible data (~126x), which would have taxed the
CPUs, potentially causing the timeout.

The fix is to write random data and reduce the number of writes.
This has an added benefit that more real data is written to the
pool (~1GB) vs the old way (~300-400MB).  It also speeds up the test.
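
The difference between the two fill patterns can be seen with a quick
experiment.  The sketch below uses gzip only as a stand-in for the
pool's compression, and the helper name compressed_size is
hypothetical; it is not what the test itself runs.

```shell
# Print the gzip-compressed size in bytes of the first $2 bytes read from $1.
compressed_size() {
    head -c "$2" "$1" | gzip -c | wc -c | tr -d ' '
}

echo "zeros:  $(compressed_size /dev/zero 1048576) bytes after gzip"
echo "random: $(compressed_size /dev/urandom 1048576) bytes after gzip"
```

Highly compressible input shrinks to a small fraction of its size,
while random input stays at roughly its original size.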

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17709
2025-09-09 17:06:48 -07:00
Alan Somers
cfd640c3e8 Fix warnings about sha2_is_supported on FreeBSD/i386
This is one problem currently preventing OpenZFS from building on
FreeBSD/i386.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	ConnectWise
Closes #17704
2025-09-09 17:06:40 -07:00
Alan Somers
b23eae62be Fix the build on 32-bit FreeBSD with GCC
GCC complains about casting a 64-bit integer to a 32-bit pointer.
Originally committed downstream as
https://github.com/freebsd/freebsd-src/commit/2d76470b701

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	ConnectWise
Closes #17706
2025-09-09 17:06:37 -07:00
rmacklem
b727163db9 zfs_vnops_os.c: Add support for the _PC_CLONE_BLKSIZE name
FreeBSD now has a pathconf name called _PC_CLONE_BLKSIZE
which is the block size supported for block cloning for
the file system.  Since ZFS's block size varies per file,
return the largest size likely to be used, or zero if block
cloning is not supported.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #17645
2025-09-09 17:06:33 -07:00
Rob Norris
02fa962af0 cmd: force zarcstat/zarc_summary recreation at install
If the target already exists, ln will fail. Force it to recreate the
symlinks.
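
A minimal sketch of the idea, assuming the links are created with
ln -s (the actual install rule is not shown here, and the helper and
file names below are hypothetical):

```shell
# Hypothetical install step: (re)create the symlink $2 pointing at $1.
# Plain "ln -s" fails if $2 already exists; "-f" removes the existing
# entry first, so repeated installs succeed.
install_symlink() {
    ln -sf "$1" "$2"
}

tmp=$(mktemp -d)
: > "$tmp/tool"
install_symlink tool "$tmp/tool-alias"    # first install
install_symlink tool "$tmp/tool-alias"    # reinstall: succeeds because of -f
ls -l "$tmp/tool-alias"
rm -rf "$tmp"
```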

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17702
2025-09-09 17:06:29 -07:00
Chunwei Chen
c755aa486d Fix wrong dedup_table_size for legacy dedup
If we call ddt_log_load() for legacy ddt, we will end up going into
ddt_log_update_stats() and filling uninitialized value into ddo_dspace.
This value will then get added to dedup_table_size during
ddt_get_dedup_object_stats().

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17019
Closes #17699

Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Co-authored-by: Chunwei Chen <david.chen@nutanix.com>
2025-09-09 17:06:24 -07:00
Shengqi Chen
34ca2b8392 Install zarcstat and zarcsummary in deb / rpm build rules
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17695
2025-09-09 17:06:13 -07:00
Shengqi Chen
f8e2152db7 Install zarcstat and zarcsummary symlinks in Makefile
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17695
2025-09-09 17:05:30 -07:00
Shengqi Chen
cbc6d57012 Add upcoming renaming notice for arc_summary and arcstat
They will become zarcsummary and zarcstat in 2.4.0.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16357
Closes #17695
2025-09-09 17:05:26 -07:00
Shengqi Chen
e5132a3382 ci: fix syntax issues in zfs-qemu.yml
Otherwise it might become `if [ == "" ]` which is ill-formed.
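
The usual remedy for this class of problem is to quote the operand so
it cannot vanish when empty.  The sketch below shows the pattern with a
hypothetical describe_ci_type helper; it is not the workflow's actual
code, and it uses the POSIX "=" operator rather than "==".

```shell
# Hypothetical helper: $1 stands in for a value substituted by the workflow.
describe_ci_type() {
    # Unquoted, an empty value disappears and the test degenerates to
    # `[ = "" ]`.  Quoting keeps the operand in place even when empty.
    if [ "$1" = "" ]; then
        echo "ci_type is empty"
    else
        echo "ci_type is $1"
    fi
}

describe_ci_type ""
describe_ci_type "quick"
```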

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17695
2025-09-09 17:05:21 -07:00
Shengqi Chen
b5b6deb985 ci: use real head sha instead of GITHUB_SHA when generating CI type
GitHub creates a merge commit on top of the real head, so the check
on HEAD will fail regardless.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17695
2025-09-09 17:05:17 -07:00
Maksym Shkolnyi
886f29e1f6 config: Add warning if ARCH environment variable is set
If the ARCH environment variable is set, it can cause the kernel
modules check to fail during the configure step. The resulting error
is confusing and may look like this:

>    checking for kernel config option compatibility... done
>    checking whether CONFIG_MODULES is defined... no
>    configure: error:
>        *** This kernel does not include the required loadable module
>        *** support!

Detect when ARCH is set and print a warning.
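
A standalone version of such a check might look like the sketch below.
It is not the actual autoconf macro; the helper name warn_if_arch_set
and the message wording are assumptions made for illustration.

```shell
# Hypothetical pre-configure check, not the actual autoconf macro:
# warn on stderr when ARCH is set to a non-empty value.
# Returns 0 if a warning was printed, 1 otherwise.
warn_if_arch_set() {
    if [ -n "${ARCH:-}" ]; then
        echo "warning: ARCH=$ARCH is set in the environment;" \
             "the kernel module checks may fail" >&2
        return 0
    fi
    return 1
}
```

The function only warns and does not abort, matching the intent
described above.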

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Maksym Shkolnyi <maksym.shkolnyi@workato.com>
Closes #17680
2025-09-09 17:04:39 -07:00
Rob Norris
56e8ab4a3e zvol: reject suspend attempts when zvol is shutting down
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17690
2025-09-09 17:04:32 -07:00
Brian Behlendorf
24baccb75e config: Fix LLVM-21 -Wuninitialized-const-pointer warning
LLVM-21 enables -Wuninitialized-const-pointer which results in the
following compiler warning and the bdev_file_open_by_path() interface
not being detected for 6.9 and newer kernels.  The blk_holder_ops
are not used by the ZFS code so we can safely use a NULL argument
for this check.

    bdev_file_open_by_path/bdev_file_open_by_path.c:110:54: error:
    variable 'h' is uninitialized when passed as a const pointer
    argument here [-Werror,-Wuninitialized-const-pointer]

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17682
Closes #17684
2025-09-09 17:04:24 -07:00
Alexander Ziaee
92d4b135b6 manuals: Audit/bump dates for last content change
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Ziaee <ziaee@FreeBSD.org>
Closes #17676
2025-09-09 17:04:19 -07:00
classabbyamp
31b9646681 linux: use sys/stat.h instead of linux/stat.h
glibc includes linux/stat.h for statx, but musl defines its own statx
struct and associated constants, which does not include STATX_MNT_ID
yet. Thus, including linux/stat.h directly should be avoided for
maximum libc compatibility.

Tested on:
  - glibc: x86_64, i686, aarch64, armv7l, armv6l
  - musl: x86_64, aarch64, armv7l, armv6l

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-By: Achill Gilgenast <achill@achill.org>
Signed-off-by: classabbyamp <dev@placeviolette.net>
Closes #17675
2025-09-09 17:04:15 -07:00
Eric A. Borisch
04d991dbc4 Update pam_zfs_key.c default path for FreeBSD
As described in https://github.com/freebsd/freebsd-src/pull/1305,
FreeBSD's installer defaults to zroot/home for user home directories.

For FreeBSD only, set the default prefix for pam_zfs_key to match.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Eric A. Borisch <eborisch@gmail.com>
Closes #17600
2025-09-09 17:04:06 -07:00
ofthesun9
5846a85155 Update compatibility.d files
Add an openzfs-2.4 compatibility file for the next release.

While there are no compatibility differences between Linux and
FreeBSD for 2.4, symlinks for the -linux and -freebsd names are
created for any scripts expecting that convention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: ofthesun9 <olivier@ofthesun.net>
Closes #17672
Closes #17673
2025-09-09 17:04:01 -07:00
Shawn Bayern
8604e67dc9 Add description of default sorting behavior to zfs_list.8
The sorting logic is all in cmd/zfs/zfs_iter.c.  I borrowed
where I could from the comments in the source code, but please
note that the comment to zfs_sort() is a little imprecise, or at
least incomplete, because it doesn't give any indication of the
chronological sort that will be used by default for snapshots in
zfs_compare().

While adding this description, I took the liberty to copy-edit
the rest of the file lightly.

In those edits, I've removed "If specified, you can list
property information by the absolute pathname or the relative
pathname" because, in context, it seems more confusing than
helpful.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Shawn Bayern <sbayern@law.fsu.edu>
Closes #15713
Closes #15869
2025-09-09 17:03:55 -07:00
Ivan Shapovalov
c00c3e33bb config: add and use KERNEL_CC check for -Wno-format-zero-length
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16997
2025-09-09 17:03:49 -07:00
Ivan Shapovalov
92579489e0 config: cleanup KERNEL_CC checks, fix broken status output
If $KERNEL_CC was not defined, configure status output would print an
empty string where the kernel compiler should have been. Fix this and
simplify the code generally.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ivan Shapovalov <intelfx@intelfx.name>
Closes #16997
2025-09-09 17:03:44 -07:00
Tony Hutter
c539d6f211 ZTS: add mount_loopback to test zfs behind loop dev
Add a test case to reproduce issue #17277:

1. Make a pool
2. Write a file to the pool
3. Mount the file as a loopback device
4. Make an XFS filesystem on the loopback device
5. Mount the XFS filesystem... <hangs>

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Issue #17277
Closes #17329
2025-09-09 17:03:36 -07:00
Mark Johnston
2fc6bf82b6 zdb: Fix format strings on 32-bit systems
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #17665
2025-09-09 17:03:31 -07:00
youzhongyang
774a34f3ff Synchronize the update of feature refcount
The concurrent execution of feature_sync() can lead to a panic due 
to an unprotected update of the feature refcount.  Resolve this by
using the spa->spa_feat_stats_lock to synchronize the update of the 
refcount.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #17184
Closes #17632
2025-09-09 17:03:27 -07:00
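
The locking idea in the commit above can be sketched in userspace C. This is not the ZFS code: `feat_stats_t`, `feat_refcount_incr()` and `run_concurrent_updates()` are hypothetical names, and a pthread mutex stands in for `spa->spa_feat_stats_lock`. The sketch only shows that a read-modify-write of a shared count stays consistent when every updater holds the same lock.

```c
#include <pthread.h>
#include <stdint.h>

#define	MAX_THREADS	16

/* Hypothetical stand-in for the per-pool feature statistics. */
typedef struct feat_stats {
	pthread_mutex_t fs_lock;	/* plays the role of spa_feat_stats_lock */
	uint64_t fs_refcount;
} feat_stats_t;

typedef struct worker_arg {
	feat_stats_t *wa_stats;
	int wa_iters;
} worker_arg_t;

/* Update the shared refcount only while holding the lock. */
static void
feat_refcount_incr(feat_stats_t *fs)
{
	pthread_mutex_lock(&fs->fs_lock);
	fs->fs_refcount++;
	pthread_mutex_unlock(&fs->fs_lock);
}

static void *
worker(void *arg)
{
	worker_arg_t *wa = arg;

	for (int i = 0; i < wa->wa_iters; i++)
		feat_refcount_incr(wa->wa_stats);
	return (NULL);
}

/* Run several concurrent updaters and return the final refcount. */
uint64_t
run_concurrent_updates(int nthreads, int iters)
{
	feat_stats_t fs = { .fs_refcount = 0 };
	worker_arg_t wa = { &fs, iters };
	pthread_t tids[MAX_THREADS];
	int started = 0;

	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;

	pthread_mutex_init(&fs.fs_lock, NULL);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[started], NULL, worker, &wa) == 0)
			started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	pthread_mutex_destroy(&fs.fs_lock);

	return (fs.fs_refcount);
}
```

With the lock held around each increment, the final count equals the number of threads started times the iteration count; without it, increments from different threads could overwrite one another.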
Cong Zhang
e3392a5e7d Prompt user to unlock when logging in from dropbear
Update the zfsunlock initramfs hook to provide instructions on how
to unlock the root filesystem when appropriate.  The intent is to
make the dropbear ssh MOTD more user friendly.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Cong Zhang <13283869+congzhangzh@users.noreply.github.com>
Closes #17661
Closes #17662
2025-09-09 17:03:22 -07:00
532 changed files with 19424 additions and 37019 deletions


@ -78,6 +78,11 @@ case "$OS" in
 OPTS[0]="--boot"
 OPTS[1]="uefi=on"
 ;;
+fedora41)
+OSNAME="Fedora 41"
+OSv="fedora-unknown"
+URL="https://download.fedoraproject.org/pub/fedora/linux/releases/41/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-41-1.4.x86_64.qcow2"
+;;
 fedora42)
 OSNAME="Fedora 42"
 OSv="fedora-unknown"


@ -58,7 +58,7 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-os: ['almalinux8', 'almalinux9', 'almalinux10', 'fedora42', 'fedora43']
+os: ['almalinux8', 'almalinux9', 'almalinux10', 'fedora41', 'fedora42', 'fedora43']
 runs-on: ubuntu-24.04
 steps:
 - uses: actions/checkout@v4


@ -48,7 +48,7 @@ jobs:
os_selection='["almalinux8", "almalinux9", "almalinux10", "debian12", "fedora42", "freebsd15-0s", "ubuntu24"]' os_selection='["almalinux8", "almalinux9", "almalinux10", "debian12", "fedora42", "freebsd15-0s", "ubuntu24"]'
;; ;;
linux) linux)
os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian11", "debian12", "debian13", "fedora42", "fedora43", "ubuntu22", "ubuntu24"]' os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian11", "debian12", "debian13", "fedora41", "fedora42", "fedora43", "ubuntu22", "ubuntu24"]'
;; ;;
freebsd) freebsd)
os_selection='["freebsd13-5r", "freebsd14-3r", "freebsd13-5s", "freebsd14-3s", "freebsd15-0s", "freebsd16-0c"]' os_selection='["freebsd13-5r", "freebsd14-3r", "freebsd13-5s", "freebsd14-3s", "freebsd15-0s", "freebsd16-0c"]'


@ -53,7 +53,6 @@ Jason Harmening <jason.harmening@gmail.com>
Jeremy Faulkner <gldisater@gmail.com> Jeremy Faulkner <gldisater@gmail.com>
Jinshan Xiong <jinshan.xiong@gmail.com> Jinshan Xiong <jinshan.xiong@gmail.com>
John Poduska <jpoduska@datto.com> John Poduska <jpoduska@datto.com>
Joseph Holsten <joseph@josephholsten.com>
Jo Zzsi <jozzsicsataban@gmail.com> Jo Zzsi <jozzsicsataban@gmail.com>
Justin Scholz <git@justinscholz.de> Justin Scholz <git@justinscholz.de>
Ka Ho Ng <khng300@gmail.com> Ka Ho Ng <khng300@gmail.com>
@ -73,12 +72,10 @@ Roberto Ricci <io@r-ricci.it>
Roberto Ricci <ricci@disroot.org> Roberto Ricci <ricci@disroot.org>
Rob Norris <robn@despairlabs.com> Rob Norris <robn@despairlabs.com>
Rob Norris <rob.norris@klarasystems.com> Rob Norris <rob.norris@klarasystems.com>
Rob Norris <rob.norris@truenas.com>
Sam Lunt <samuel.j.lunt@gmail.com> Sam Lunt <samuel.j.lunt@gmail.com>
Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com> Sanjeev Bagewadi <sanjeev.bagewadi@gmail.com>
Sebastian Wuerl <s.wuerl@mailbox.org> Sebastian Wuerl <s.wuerl@mailbox.org>
SHENGYI HONG <aokblast@FreeBSD.org> SHENGYI HONG <aokblast@FreeBSD.org>
Sivesh Kumar <siveshjami@gmail.com>
Stoiko Ivanov <github@nomore.at> Stoiko Ivanov <github@nomore.at>
Tamas TEVESZ <ice@extreme.hu> Tamas TEVESZ <ice@extreme.hu>
WHR <msl0000023508@gmail.com> WHR <msl0000023508@gmail.com>
@ -86,12 +83,8 @@ Yanping Gao <yanping.gao@xtaotech.com>
Youzhong Yang <youzhong@gmail.com> Youzhong Yang <youzhong@gmail.com>
# Signed-off-by: overriding Author: # Signed-off-by: overriding Author:
Alexander Moch <mail@alexmoch.com> <amoch@ernw.de>
Alexander Moch <mail@alexmoch.com> <github@alexanderjulian.de>
Alexander Ziaee <ziaee@FreeBSD.org> <concussious@runbox.com> Alexander Ziaee <ziaee@FreeBSD.org> <concussious@runbox.com>
delan azabani <dazabani@igalia.com> <delan@azabani.com>
Felix Schmidt <felixschmidt20@aol.com> <f.sch.prototype@gmail.com> Felix Schmidt <felixschmidt20@aol.com> <f.sch.prototype@gmail.com>
George Shammas <george@shamm.as> <georgyo@gmail.com>
Jean-Sébastien Pédron <dumbbell@FreeBSD.org> <jean-sebastien.pedron@dumbbell.fr> Jean-Sébastien Pédron <dumbbell@FreeBSD.org> <jean-sebastien.pedron@dumbbell.fr>
Konstantin Belousov <kib@FreeBSD.org> <kib@kib.kiev.ua> Konstantin Belousov <kib@FreeBSD.org> <kib@kib.kiev.ua>
Olivier Certner <olce@FreeBSD.org> <olce.freebsd@certner.fr> Olivier Certner <olce@FreeBSD.org> <olce.freebsd@certner.fr>
@ -115,7 +108,6 @@ Ned Bass <bass6@llnl.gov> <bass6@zeno1.(none)>
Tulsi Jain <tulsi.jain@delphix.com> <tulsi.jain@Tulsi-Jains-MacBook-Pro.local> Tulsi Jain <tulsi.jain@delphix.com> <tulsi.jain@Tulsi-Jains-MacBook-Pro.local>
# Mappings from Github no-reply addresses # Mappings from Github no-reply addresses
Adi Gollamudi <adigollamudi@gmail.com> <68113680+Adi-Goll@users.noreply.github.com>
ajs124 <git@ajs124.de> <ajs124@users.noreply.github.com> ajs124 <git@ajs124.de> <ajs124@users.noreply.github.com>
Alek Pinchuk <apinchuk@axcient.com> <alek-p@users.noreply.github.com> Alek Pinchuk <apinchuk@axcient.com> <alek-p@users.noreply.github.com>
Aleksandr Liber <aleksandr.liber@perforce.com> <61714074+AleksandrLiber@users.noreply.github.com> Aleksandr Liber <aleksandr.liber@perforce.com> <61714074+AleksandrLiber@users.noreply.github.com>
@ -133,7 +125,6 @@ bernie1995 <bernie.pikes@gmail.com> <42413912+bernie1995@users.noreply.github.co
Bojan Novković <bnovkov@FreeBSD.org> <72801811+bnovkov@users.noreply.github.com> Bojan Novković <bnovkov@FreeBSD.org> <72801811+bnovkov@users.noreply.github.com>
Boris Protopopov <boris.protopopov@actifio.com> <bprotopopov@users.noreply.github.com> Boris Protopopov <boris.protopopov@actifio.com> <bprotopopov@users.noreply.github.com>
Brad Forschinger <github@bnjf.id.au> <bnjf@users.noreply.github.com> Brad Forschinger <github@bnjf.id.au> <bnjf@users.noreply.github.com>
Brad Spengler <94915855+bspengler-oss@users.noreply.github.com>>
Brandon Thetford <brandon@dodecatec.com> <dodexahedron@users.noreply.github.com> Brandon Thetford <brandon@dodecatec.com> <dodexahedron@users.noreply.github.com>
buzzingwires <buzzingwires@outlook.com> <131118055+buzzingwires@users.noreply.github.com> buzzingwires <buzzingwires@outlook.com> <131118055+buzzingwires@users.noreply.github.com>
Cedric Maunoury <cedric.maunoury@gmail.com> <38213715+cedricmaunoury@users.noreply.github.com> Cedric Maunoury <cedric.maunoury@gmail.com> <38213715+cedricmaunoury@users.noreply.github.com>
@ -147,7 +138,6 @@ Daniel Kobras <d.kobras@science-computing.de> <sckobras@users.noreply.github.com
Daniel Reichelt <hacking@nachtgeist.net> <nachtgeist@users.noreply.github.com> Daniel Reichelt <hacking@nachtgeist.net> <nachtgeist@users.noreply.github.com>
David Quigley <david.quigley@intel.com> <dpquigl@users.noreply.github.com> David Quigley <david.quigley@intel.com> <dpquigl@users.noreply.github.com>
Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com> <31087738+dennisfriedrichsen@users.noreply.github.com> Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com> <31087738+dennisfriedrichsen@users.noreply.github.com>
Dennis Vestergaard Værum <github@varum.dk> <6872940+dvaerum@users.noreply.github.com>
Dex Wood <slash2314@gmail.com> <slash2314@users.noreply.github.com> Dex Wood <slash2314@gmail.com> <slash2314@users.noreply.github.com>
DHE <git@dehacked.net> <DeHackEd@users.noreply.github.com> DHE <git@dehacked.net> <DeHackEd@users.noreply.github.com>
Dmitri John Ledkov <dimitri.ledkov@canonical.com> <19779+xnox@users.noreply.github.com> Dmitri John Ledkov <dimitri.ledkov@canonical.com> <19779+xnox@users.noreply.github.com>

14
AUTHORS

@ -14,7 +14,6 @@ CONTRIBUTORS:
Adam D. Moss <c@yotes.com> Adam D. Moss <c@yotes.com>
Adam Leventhal <ahl@delphix.com> Adam Leventhal <ahl@delphix.com>
Adam Stevko <adam.stevko@gmail.com> Adam Stevko <adam.stevko@gmail.com>
Adi Gollamudi <adigollamudi@gmail.com>
adisbladis <adis@blad.is> adisbladis <adis@blad.is>
Adrian Chadd <adrian@freebsd.org> Adrian Chadd <adrian@freebsd.org>
Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
@ -33,10 +32,8 @@ CONTRIBUTORS:
Alek Pinchuk <alek@nexenta.com> Alek Pinchuk <alek@nexenta.com>
Aleksandr Liber <aleksandr.liber@perforce.com> Aleksandr Liber <aleksandr.liber@perforce.com>
Aleksa Sarai <cyphar@cyphar.com> Aleksa Sarai <cyphar@cyphar.com>
Alex <simplecodemaster@gmail.com>
Alexander Eremin <a.eremin@nexenta.com> Alexander Eremin <a.eremin@nexenta.com>
Alexander Lobakin <alobakin@pm.me> Alexander Lobakin <alobakin@pm.me>
Alexander Moch <mail@alexmoch.com>
Alexander Motin <mav@freebsd.org> Alexander Motin <mav@freebsd.org>
Alexander Pyhalov <apyhalov@gmail.com> Alexander Pyhalov <apyhalov@gmail.com>
Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk> Alexander Richardson <Alexander.Richardson@cl.cam.ac.uk>
@ -49,7 +46,6 @@ CONTRIBUTORS:
Alex McWhirter <alexmcwhirter@triadic.us> Alex McWhirter <alexmcwhirter@triadic.us>
Alex Reece <alex@delphix.com> Alex Reece <alex@delphix.com>
Alex Wilson <alex.wilson@joyent.com> Alex Wilson <alex.wilson@joyent.com>
Alexx Saver <lzsaver.eth@ethermail.io>
Alex Zhuravlev <alexey.zhuravlev@intel.com> Alex Zhuravlev <alexey.zhuravlev@intel.com>
Allan Jude <allanjude@freebsd.org> Allan Jude <allanjude@freebsd.org>
Allen Holl <allen.m.holl@gmail.com> Allen Holl <allen.m.holl@gmail.com>
@ -93,7 +89,6 @@ CONTRIBUTORS:
Arun KV <arun.kv@datacore.com> Arun KV <arun.kv@datacore.com>
Arvind Sankar <nivedita@alum.mit.edu> Arvind Sankar <nivedita@alum.mit.edu>
Attila Fülöp <attila@fueloep.org> Attila Fülöp <attila@fueloep.org>
Austin Wise <AustinWise@gmail.com>
Avatat <kontakt@avatat.pl> Avatat <kontakt@avatat.pl>
Bart Coddens <bart.coddens@gmail.com> Bart Coddens <bart.coddens@gmail.com>
Basil Crow <basil.crow@delphix.com> Basil Crow <basil.crow@delphix.com>
@ -115,7 +110,6 @@ CONTRIBUTORS:
Boris Protopopov <boris.protopopov@nexenta.com> Boris Protopopov <boris.protopopov@nexenta.com>
Brad Forschinger <github@bnjf.id.au> Brad Forschinger <github@bnjf.id.au>
Brad Lewis <brad.lewis@delphix.com> Brad Lewis <brad.lewis@delphix.com>
Brad Spengler <bspengler-oss@users.noreply.github.com>
Brandon Thetford <brandon@dodecatec.com> Brandon Thetford <brandon@dodecatec.com>
Brian Atkinson <bwa@g.clemson.edu> Brian Atkinson <bwa@g.clemson.edu>
Brian Behlendorf <behlendorf1@llnl.gov> Brian Behlendorf <behlendorf1@llnl.gov>
@ -202,9 +196,7 @@ CONTRIBUTORS:
David Quigley <david.quigley@intel.com> David Quigley <david.quigley@intel.com>
Debabrata Banerjee <dbanerje@akamai.com> Debabrata Banerjee <dbanerje@akamai.com>
D. Ebdrup <debdrup@freebsd.org> D. Ebdrup <debdrup@freebsd.org>
delan azabani <dazabani@igalia.com>
Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com> Dennis R. Friedrichsen <dennis.r.friedrichsen@gmail.com>
Dennis Vestergaard Værum <github@varum.dk>
Denys Rtveliashvili <denys@rtveliashvili.name> Denys Rtveliashvili <denys@rtveliashvili.name>
Derek Dai <daiderek@gmail.com> Derek Dai <daiderek@gmail.com>
Derek Schrock <dereks@lifeofadishwasher.com> Derek Schrock <dereks@lifeofadishwasher.com>
@ -231,7 +223,6 @@ CONTRIBUTORS:
Eric Desrochers <eric.desrochers@canonical.com> Eric Desrochers <eric.desrochers@canonical.com>
Eric Dillmann <eric@jave.fr> Eric Dillmann <eric@jave.fr>
Eric Schrock <Eric.Schrock@delphix.com> Eric Schrock <Eric.Schrock@delphix.com>
Erik Larsson <catacombae@gmail.com>
Ethan Coe-Renner <coerenner1@llnl.gov> Ethan Coe-Renner <coerenner1@llnl.gov>
Etienne Dechamps <etienne@edechamps.fr> Etienne Dechamps <etienne@edechamps.fr>
Evan Allrich <eallrich@gmail.com> Evan Allrich <eallrich@gmail.com>
@ -264,7 +255,6 @@ CONTRIBUTORS:
George Diamantopoulos <georgediam@gmail.com> George Diamantopoulos <georgediam@gmail.com>
George Gaydarov <git@gg7.io> George Gaydarov <git@gg7.io>
George Melikov <mail@gmelikov.ru> George Melikov <mail@gmelikov.ru>
George Shammas <george@shamm.as>
George Wilson <gwilson@delphix.com> George Wilson <gwilson@delphix.com>
Georgy Yakovlev <ya@sysdump.net> Georgy Yakovlev <ya@sysdump.net>
Gerardwx <gerardw@alum.mit.edu> Gerardwx <gerardw@alum.mit.edu>
@ -353,7 +343,6 @@ CONTRIBUTORS:
Joe Stein <joe.stein@delphix.com> Joe Stein <joe.stein@delphix.com>
John-Mark Gurney <jmg@funkthat.com> John-Mark Gurney <jmg@funkthat.com>
John Albietz <inthecloud247@gmail.com> John Albietz <inthecloud247@gmail.com>
John Cabaj <john.cabaj@canonical.com>
John Eismeier <john.eismeier@gmail.com> John Eismeier <john.eismeier@gmail.com>
John Gallagher <john.gallagher@delphix.com> John Gallagher <john.gallagher@delphix.com>
John Layman <jlayman@sagecloud.com> John Layman <jlayman@sagecloud.com>
@ -369,7 +358,6 @@ CONTRIBUTORS:
Jorgen Lundman <lundman@lundman.net> Jorgen Lundman <lundman@lundman.net>
Josef 'Jeff' Sipek <josef.sipek@nexenta.com> Josef 'Jeff' Sipek <josef.sipek@nexenta.com>
Jose Luis Duran <jlduran@gmail.com> Jose Luis Duran <jlduran@gmail.com>
Joseph Holsten <joseph@josephholsten.com>
Josh Soref <jsoref@users.noreply.github.com> Josh Soref <jsoref@users.noreply.github.com>
Joshua M. Clulow <josh@sysmgr.org> Joshua M. Clulow <josh@sysmgr.org>
José Luis Salvador Rufo <salvador.joseluis@gmail.com> José Luis Salvador Rufo <salvador.joseluis@gmail.com>
@ -634,7 +622,6 @@ CONTRIBUTORS:
Simon Guest <simon.guest@tesujimath.org> Simon Guest <simon.guest@tesujimath.org>
Simon Howard <fraggle@soulsphere.org> Simon Howard <fraggle@soulsphere.org>
Simon Klinkert <simon.klinkert@gmail.com> Simon Klinkert <simon.klinkert@gmail.com>
Sivesh Kumar <siveshjami@gmail.com>
Sowrabha Gopal <sowrabha.gopal@delphix.com> Sowrabha Gopal <sowrabha.gopal@delphix.com>
Spencer Kinny <spencerkinny1995@gmail.com> Spencer Kinny <spencerkinny1995@gmail.com>
Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com> Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com>
@ -717,7 +704,6 @@ CONTRIBUTORS:
Windel Bouwman <windel@windel.nl> Windel Bouwman <windel@windel.nl>
Wojciech Małota-Wójcik <outofforest@users.noreply.github.com> Wojciech Małota-Wójcik <outofforest@users.noreply.github.com>
Wolfgang Bumiller <w.bumiller@proxmox.com> Wolfgang Bumiller <w.bumiller@proxmox.com>
Wolfgang Hoschek <wolfgang.hoschek@mac.com>
XDTG <click1799@163.com> XDTG <click1799@163.com>
Xin Li <delphij@FreeBSD.org> Xin Li <delphij@FreeBSD.org>
Xinliang Liu <xinliang.liu@linaro.org> Xinliang Liu <xinliang.liu@linaro.org>

2
META

@ -1,7 +1,7 @@
 Meta: 1
 Name: zfs
 Branch: 1.0
-Version: 2.4.99
+Version: 2.4.1
 Release: 1
 Release-Tags: relext
 License: CDDL


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
CLEANFILES = CLEANFILES =
dist_noinst_DATA = dist_noinst_DATA =
INSTALL_DATA_HOOKS = INSTALL_DATA_HOOKS =
@ -133,7 +132,6 @@ cstyle:
! -name 'zfs_config.*' ! -name '*.mod.c' \ ! -name 'zfs_config.*' ! -name '*.mod.c' \
! -name 'opt_global.h' ! -name '*_if*.h' \ ! -name 'opt_global.h' ! -name '*_if*.h' \
! -name 'zstd_compat_wrapper.h' \ ! -name 'zstd_compat_wrapper.h' \
! -path './module/zstd/zstd-in.c' \
! -path './module/zstd/lib/*' \ ! -path './module/zstd/lib/*' \
! -path './include/sys/lua/*' \ ! -path './include/sys/lua/*' \
! -path './module/lua/l*.[ch]' \ ! -path './module/lua/l*.[ch]' \


@ -1,4 +1,3 @@
#!/bin/sh #!/bin/sh
# SPDX-License-Identifier: CDDL-1.0
autoreconf -fiv "$(dirname "$0")" && rm -rf "$(dirname "$0")"/autom4te.cache autoreconf -fiv "$(dirname "$0")" && rm -rf "$(dirname "$0")"/autom4te.cache


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
bin_SCRIPTS = bin_SCRIPTS =
bin_PROGRAMS = bin_PROGRAMS =
sbin_SCRIPTS = sbin_SCRIPTS =
@ -36,8 +35,8 @@ zhack_SOURCES = \
zhack_LDADD = \ zhack_LDADD = \
libzpool.la \ libzpool.la \
libzfs_core.la \ libzfs_core.la \
libnvpair.la \ libnvpair.la
librange_tree.la
ztest_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS) ztest_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS)
ztest_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS) ztest_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
raidz_test_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS) raidz_test_CFLAGS = $(AM_CFLAGS) $(KERNEL_CFLAGS)
raidz_test_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS) raidz_test_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)


@ -33,7 +33,6 @@
#include <sys/vdev_raidz_impl.h> #include <sys/vdev_raidz_impl.h>
#include <assert.h> #include <assert.h>
#include <stdio.h> #include <stdio.h>
#include <libzpool.h>
#include "raidz_test.h" #include "raidz_test.h"
static int *rand_data; static int *rand_data;


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
zdb_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS) zdb_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
zdb_CFLAGS = $(AM_CFLAGS) $(LIBCRYPTO_CFLAGS) zdb_CFLAGS = $(AM_CFLAGS) $(LIBCRYPTO_CFLAGS)
@ -14,8 +13,6 @@ zdb_LDADD = \
libzdb.la \ libzdb.la \
libzpool.la \ libzpool.la \
libzfs_core.la \ libzfs_core.la \
libnvpair.la \ libnvpair.la
libbtree.la \
librange_tree.la
zdb_LDADD += $(LIBCRYPTO_LIBS) zdb_LDADD += $(LIBCRYPTO_LIBS)


@ -36,7 +36,6 @@
* Copyright (c) 2021 Toomas Soome <tsoome@me.com> * Copyright (c) 2021 Toomas Soome <tsoome@me.com>
* Copyright (c) 2023, 2024, Klara Inc. * Copyright (c) 2023, 2024, Klara Inc.
* Copyright (c) 2023, Rob Norris <robn@despairlabs.com> * Copyright (c) 2023, Rob Norris <robn@despairlabs.com>
* Copyright (c) 2026, TrueNAS.
*/ */
#include <stdio.h> #include <stdio.h>
@ -90,7 +89,6 @@
#include <sys/zstd/zstd.h> #include <sys/zstd/zstd.h>
#include <sys/backtrace.h> #include <sys/backtrace.h>
#include <libzpool.h>
#include <libnvpair.h> #include <libnvpair.h>
#include <libzutil.h> #include <libzutil.h>
#include <libzfs_core.h> #include <libzfs_core.h>
@ -3391,14 +3389,14 @@ zdb_derive_key(dsl_dir_t *dd, uint8_t *key_out)
static char encroot[ZFS_MAX_DATASET_NAME_LEN]; static char encroot[ZFS_MAX_DATASET_NAME_LEN];
static boolean_t key_loaded = B_FALSE; static boolean_t key_loaded = B_FALSE;
static int static void
zdb_load_key(objset_t *os) zdb_load_key(objset_t *os)
{ {
dsl_pool_t *dp; dsl_pool_t *dp;
dsl_dir_t *dd, *rdd; dsl_dir_t *dd, *rdd;
uint8_t key[WRAPPING_KEY_LEN]; uint8_t key[WRAPPING_KEY_LEN];
uint64_t rddobj; uint64_t rddobj;
int err = 0; int err;
dp = spa_get_dsl(os->os_spa); dp = spa_get_dsl(os->os_spa);
dd = os->os_dsl_dataset->ds_dir; dd = os->os_dsl_dataset->ds_dir;
@ -3411,13 +3409,9 @@ zdb_load_key(objset_t *os)
dsl_dir_rele(rdd, FTAG); dsl_dir_rele(rdd, FTAG);
if (!zdb_derive_key(dd, key)) if (!zdb_derive_key(dd, key))
err = EINVAL; fatal("couldn't derive encryption key");
dsl_pool_config_exit(dp, FTAG);
if (err != 0) { dsl_pool_config_exit(dp, FTAG);
fprintf(stderr, "couldn't derive encryption key\n");
return (err);
}
ASSERT3U(dsl_dataset_get_keystatus(dd), ==, ZFS_KEYSTATUS_UNAVAILABLE); ASSERT3U(dsl_dataset_get_keystatus(dd), ==, ZFS_KEYSTATUS_UNAVAILABLE);
@ -3434,20 +3428,16 @@ zdb_load_key(objset_t *os)
dsl_crypto_params_free(dcp, (err != 0)); dsl_crypto_params_free(dcp, (err != 0));
fnvlist_free(crypto_args); fnvlist_free(crypto_args);
if (err != 0) { if (err != 0)
fprintf(stderr, fatal(
"couldn't load encryption key for %s: %s\n", "couldn't load encryption key for %s: %s",
encroot, err == ZFS_ERR_CRYPTO_NOTSUP ? encroot, err == ZFS_ERR_CRYPTO_NOTSUP ?
"crypto params not supported" : strerror(err)); "crypto params not supported" : strerror(err));
return (err);
}
ASSERT3U(dsl_dataset_get_keystatus(dd), ==, ZFS_KEYSTATUS_AVAILABLE); ASSERT3U(dsl_dataset_get_keystatus(dd), ==, ZFS_KEYSTATUS_AVAILABLE);
printf("Unlocked encryption root: %s\n", encroot); printf("Unlocked encryption root: %s\n", encroot);
key_loaded = B_TRUE; key_loaded = B_TRUE;
return (0);
} }
static void static void
@ -3490,30 +3480,15 @@ open_objset(const char *path, const void *tag, objset_t **osp)
path, strerror(err)); path, strerror(err));
return (err); return (err);
} }
dsl_dataset_long_hold(dmu_objset_ds(*osp), tag);
dsl_pool_rele(dmu_objset_pool(*osp), tag);
/* /* succeeds or dies */
* Only try to load the key and unlock the dataset if it is zdb_load_key(*osp);
* actually encrypted; otherwise we'll just crash. Just
* ignore the -K switch entirely otherwise; it's useful to be
* able to provide even if it's not needed.
*/
if ((*osp)->os_encrypted) {
dsl_dataset_long_hold(dmu_objset_ds(*osp), tag);
dsl_pool_rele(dmu_objset_pool(*osp), tag);
err = zdb_load_key(*osp); /* release it all */
dsl_dataset_long_rele(dmu_objset_ds(*osp), tag);
/* release it all */ dsl_dataset_rele(dmu_objset_ds(*osp), tag);
dsl_dataset_long_rele(dmu_objset_ds(*osp), tag);
dsl_dataset_rele(dmu_objset_ds(*osp), tag);
if (err != 0) {
*osp = NULL;
return (err);
}
} else {
dmu_objset_rele(*osp, tag);
}
} }
int ds_hold_flags = key_loaded ? DS_HOLD_FLAG_DECRYPT : 0; int ds_hold_flags = key_loaded ? DS_HOLD_FLAG_DECRYPT : 0;
@ -3522,7 +3497,6 @@ open_objset(const char *path, const void *tag, objset_t **osp)
if (err != 0) { if (err != 0) {
(void) fprintf(stderr, "failed to hold dataset '%s': %s\n", (void) fprintf(stderr, "failed to hold dataset '%s': %s\n",
path, strerror(err)); path, strerror(err));
*osp = NULL;
return (err); return (err);
} }
dsl_dataset_long_hold(dmu_objset_ds(*osp), tag); dsl_dataset_long_hold(dmu_objset_ds(*osp), tag);


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
include $(srcdir)/%D%/zed.d/Makefile.am include $(srcdir)/%D%/zed.d/Makefile.am
zed_CFLAGS = $(AM_CFLAGS) zed_CFLAGS = $(AM_CFLAGS)
@ -38,7 +37,8 @@ zed_SOURCES = \
zed_LDADD = \ zed_LDADD = \
libzfs.la \ libzfs.la \
libzfs_core.la \ libzfs_core.la \
libnvpair.la libnvpair.la \
libuutil.la
zed_LDADD += -lrt $(LIBATOMIC_LIBS) $(LIBUDEV_LIBS) $(LIBUUID_LIBS) zed_LDADD += -lrt $(LIBATOMIC_LIBS) $(LIBUDEV_LIBS) $(LIBUUID_LIBS)
zed_LDFLAGS = -pthread zed_LDFLAGS = -pthread


@ -29,6 +29,7 @@
#include <stddef.h> #include <stddef.h>
#include <string.h> #include <string.h>
#include <libuutil.h>
#include <libzfs.h> #include <libzfs.h>
#include <sys/types.h> #include <sys/types.h>
#include <sys/time.h> #include <sys/time.h>
@ -95,7 +96,7 @@ typedef struct zfs_case {
uint32_t zc_version; uint32_t zc_version;
zfs_case_data_t zc_data; zfs_case_data_t zc_data;
fmd_case_t *zc_case; fmd_case_t *zc_case;
list_node_t zc_node; uu_list_node_t zc_node;
id_t zc_remove_timer; id_t zc_remove_timer;
char *zc_fru; char *zc_fru;
er_timeval_t zc_when; er_timeval_t zc_when;
@ -125,7 +126,8 @@ zfs_de_stats_t zfs_stats = {
/* wait 15 seconds after a removal */ /* wait 15 seconds after a removal */
static hrtime_t zfs_remove_timeout = SEC2NSEC(15); static hrtime_t zfs_remove_timeout = SEC2NSEC(15);
static list_t zfs_cases; uu_list_pool_t *zfs_case_pool;
uu_list_t *zfs_cases;
#define ZFS_MAKE_RSRC(type) \ #define ZFS_MAKE_RSRC(type) \
FM_RSRC_CLASS "." ZFS_ERROR_CLASS "." type FM_RSRC_CLASS "." ZFS_ERROR_CLASS "." type
@ -172,8 +174,8 @@ zfs_case_unserialize(fmd_hdl_t *hdl, fmd_case_t *cp)
zcp->zc_remove_timer = fmd_timer_install(hdl, zcp, zcp->zc_remove_timer = fmd_timer_install(hdl, zcp,
NULL, zfs_remove_timeout); NULL, zfs_remove_timeout);
list_link_init(&zcp->zc_node); uu_list_node_init(zcp, &zcp->zc_node, zfs_case_pool);
list_insert_head(&zfs_cases, zcp); (void) uu_list_insert_before(zfs_cases, NULL, zcp);
fmd_case_setspecific(hdl, cp, zcp); fmd_case_setspecific(hdl, cp, zcp);
@ -204,8 +206,8 @@ zfs_other_serd_cases(fmd_hdl_t *hdl, const zfs_case_data_t *zfs_case)
next_check = gethrestime_sec() + CASE_GC_TIMEOUT_SECS; next_check = gethrestime_sec() + CASE_GC_TIMEOUT_SECS;
} }
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) { zcp = uu_list_next(zfs_cases, zcp)) {
zfs_case_data_t *zcd = &zcp->zc_data; zfs_case_data_t *zcd = &zcp->zc_data;
/* /*
@ -255,8 +257,8 @@ zfs_mark_vdev(uint64_t pool_guid, nvlist_t *vd, er_timeval_t *loaded)
/* /*
* Mark any cases associated with this (pool, vdev) pair. * Mark any cases associated with this (pool, vdev) pair.
*/ */
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) { zcp = uu_list_next(zfs_cases, zcp)) {
if (zcp->zc_data.zc_pool_guid == pool_guid && if (zcp->zc_data.zc_pool_guid == pool_guid &&
zcp->zc_data.zc_vdev_guid == vdev_guid) { zcp->zc_data.zc_vdev_guid == vdev_guid) {
zcp->zc_present = B_TRUE; zcp->zc_present = B_TRUE;
@ -302,8 +304,8 @@ zfs_mark_pool(zpool_handle_t *zhp, void *unused)
/* /*
* Mark any cases associated with just this pool. * Mark any cases associated with just this pool.
*/ */
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) { zcp = uu_list_next(zfs_cases, zcp)) {
if (zcp->zc_data.zc_pool_guid == pool_guid && if (zcp->zc_data.zc_pool_guid == pool_guid &&
zcp->zc_data.zc_vdev_guid == 0) zcp->zc_data.zc_vdev_guid == 0)
zcp->zc_present = B_TRUE; zcp->zc_present = B_TRUE;
@ -319,8 +321,8 @@ zfs_mark_pool(zpool_handle_t *zhp, void *unused)
if (nelem == 2) { if (nelem == 2) {
loaded.ertv_sec = tod[0]; loaded.ertv_sec = tod[0];
loaded.ertv_nsec = tod[1]; loaded.ertv_nsec = tod[1];
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) { zcp = uu_list_next(zfs_cases, zcp)) {
if (zcp->zc_data.zc_pool_guid == pool_guid && if (zcp->zc_data.zc_pool_guid == pool_guid &&
zcp->zc_data.zc_vdev_guid == 0) { zcp->zc_data.zc_vdev_guid == 0) {
zcp->zc_when = loaded; zcp->zc_when = loaded;
@ -387,7 +389,8 @@ zpool_find_load_time(zpool_handle_t *zhp, void *arg)
static void static void
zfs_purge_cases(fmd_hdl_t *hdl) zfs_purge_cases(fmd_hdl_t *hdl)
{ {
zfs_case_t *zcp, *next; zfs_case_t *zcp;
uu_list_walk_t *walk;
libzfs_handle_t *zhdl = fmd_hdl_getspecific(hdl); libzfs_handle_t *zhdl = fmd_hdl_getspecific(hdl);
/* /*
@ -407,8 +410,8 @@ zfs_purge_cases(fmd_hdl_t *hdl)
/* /*
* Mark the cases as not present. * Mark the cases as not present.
*/ */
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) zcp = uu_list_next(zfs_cases, zcp))
zcp->zc_present = B_FALSE; zcp->zc_present = B_FALSE;
/* /*
@ -422,11 +425,12 @@ zfs_purge_cases(fmd_hdl_t *hdl)
/* /*
* Remove those cases which were not found. * Remove those cases which were not found.
*/ */
for (zcp = list_head(&zfs_cases); zcp != NULL; zcp = next) { walk = uu_list_walk_start(zfs_cases, UU_WALK_ROBUST);
next = list_next(&zfs_cases, zcp); while ((zcp = uu_list_walk_next(walk)) != NULL) {
if (!zcp->zc_present) if (!zcp->zc_present)
fmd_case_close(hdl, zcp->zc_case); fmd_case_close(hdl, zcp->zc_case);
} }
uu_list_walk_end(walk);
} }
/* /*
@ -656,8 +660,8 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
zfs_ereport_when(hdl, nvl, &er_when); zfs_ereport_when(hdl, nvl, &er_when);
for (zcp = list_head(&zfs_cases); zcp != NULL; for (zcp = uu_list_first(zfs_cases); zcp != NULL;
zcp = list_next(&zfs_cases, zcp)) { zcp = uu_list_next(zfs_cases, zcp)) {
if (zcp->zc_data.zc_pool_guid == pool_guid) { if (zcp->zc_data.zc_pool_guid == pool_guid) {
pool_found = B_TRUE; pool_found = B_TRUE;
pool_load = zcp->zc_when; pool_load = zcp->zc_when;
@ -863,8 +867,8 @@ zfs_fm_recv(fmd_hdl_t *hdl, fmd_event_t *ep, nvlist_t *nvl, const char *class)
* Pool level fault. Before solving the case, go through and * Pool level fault. Before solving the case, go through and
* close any open device cases that may be pending. * close any open device cases that may be pending.
*/ */
for (dcp = list_head(&zfs_cases); dcp != NULL; for (dcp = uu_list_first(zfs_cases); dcp != NULL;
dcp = list_next(&zfs_cases, dcp)) { dcp = uu_list_next(zfs_cases, dcp)) {
if (dcp->zc_data.zc_pool_guid == if (dcp->zc_data.zc_pool_guid ==
zcp->zc_data.zc_pool_guid && zcp->zc_data.zc_pool_guid &&
dcp->zc_data.zc_vdev_guid != 0) dcp->zc_data.zc_vdev_guid != 0)
@ -1084,7 +1088,8 @@ zfs_fm_close(fmd_hdl_t *hdl, fmd_case_t *cs)
if (zcp->zc_data.zc_has_remove_timer) if (zcp->zc_data.zc_has_remove_timer)
fmd_timer_remove(hdl, zcp->zc_remove_timer); fmd_timer_remove(hdl, zcp->zc_remove_timer);
list_remove(&zfs_cases, zcp); uu_list_remove(zfs_cases, zcp);
uu_list_node_fini(zcp, &zcp->zc_node, zfs_case_pool);
fmd_hdl_free(hdl, zcp, sizeof (zfs_case_t)); fmd_hdl_free(hdl, zcp, sizeof (zfs_case_t));
} }
@ -1112,11 +1117,23 @@ _zfs_diagnosis_init(fmd_hdl_t *hdl)
if ((zhdl = libzfs_init()) == NULL) if ((zhdl = libzfs_init()) == NULL)
return; return;
list_create(&zfs_cases, if ((zfs_case_pool = uu_list_pool_create("zfs_case_pool",
sizeof (zfs_case_t), offsetof(zfs_case_t, zc_node)); sizeof (zfs_case_t), offsetof(zfs_case_t, zc_node),
NULL, UU_LIST_POOL_DEBUG)) == NULL) {
libzfs_fini(zhdl);
return;
}
if ((zfs_cases = uu_list_create(zfs_case_pool, NULL,
UU_LIST_DEBUG)) == NULL) {
uu_list_pool_destroy(zfs_case_pool);
libzfs_fini(zhdl);
return;
}
if (fmd_hdl_register(hdl, FMD_API_VERSION, &fmd_info) != 0) { if (fmd_hdl_register(hdl, FMD_API_VERSION, &fmd_info) != 0) {
list_destroy(&zfs_cases); uu_list_destroy(zfs_cases);
uu_list_pool_destroy(zfs_case_pool);
libzfs_fini(zhdl); libzfs_fini(zhdl);
return; return;
} }
@ -1131,18 +1148,24 @@ void
_zfs_diagnosis_fini(fmd_hdl_t *hdl) _zfs_diagnosis_fini(fmd_hdl_t *hdl)
{ {
zfs_case_t *zcp; zfs_case_t *zcp;
uu_list_walk_t *walk;
libzfs_handle_t *zhdl; libzfs_handle_t *zhdl;
/* /*
* Remove all active cases. * Remove all active cases.
*/ */
while ((zcp = list_remove_head(&zfs_cases)) != NULL) { walk = uu_list_walk_start(zfs_cases, UU_WALK_ROBUST);
while ((zcp = uu_list_walk_next(walk)) != NULL) {
fmd_hdl_debug(hdl, "removing case ena %llu", fmd_hdl_debug(hdl, "removing case ena %llu",
(long long unsigned)zcp->zc_data.zc_ena); (long long unsigned)zcp->zc_data.zc_ena);
uu_list_remove(zfs_cases, zcp);
uu_list_node_fini(zcp, &zcp->zc_node, zfs_case_pool);
fmd_hdl_free(hdl, zcp, sizeof (zfs_case_t)); fmd_hdl_free(hdl, zcp, sizeof (zfs_case_t));
} }
uu_list_walk_end(walk);
list_destroy(&zfs_cases); uu_list_destroy(zfs_cases);
uu_list_pool_destroy(zfs_case_pool);
zhdl = fmd_hdl_getspecific(hdl); zhdl = fmd_hdl_getspecific(hdl);
libzfs_fini(zhdl); libzfs_fini(zhdl);
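Review note: the `zfs_purge_cases()` and `_zfs_diagnosis_fini()` hunks above remove list elements while walking the list. One column reads the successor before the current element can go away; the other uses a walk opened with `UU_WALK_ROBUST`. The sketch below illustrates only the save-the-successor idea, on a hand-rolled singly linked list. `demo_case_t`, `demo_case_add()`, `demo_purge_cases()` and `demo_case_count()` are hypothetical names invented for this note and are not part of the diagnosis module, libuutil, or libspl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tracked case; not the real zfs_case_t. */
typedef struct demo_case {
	int dc_id;
	bool dc_present;
	struct demo_case *dc_next;
} demo_case_t;

/* Push a new case on the front of the list and return the new head. */
static demo_case_t *
demo_case_add(demo_case_t *head, int id, bool present)
{
	demo_case_t *dc = malloc(sizeof (*dc));

	if (dc == NULL)
		abort();
	dc->dc_id = id;
	dc->dc_present = present;
	dc->dc_next = head;
	return (dc);
}

/*
 * Free every case whose dc_present flag is clear and return how many
 * were removed. The successor is read before the current node can be
 * freed, so the loop never dereferences freed memory.
 */
static size_t
demo_purge_cases(demo_case_t **headp)
{
	demo_case_t **linkp = headp;
	demo_case_t *dc, *next;
	size_t removed = 0;

	assert(headp != NULL);
	for (dc = *headp; dc != NULL; dc = next) {
		next = dc->dc_next;	/* save before any free */
		if (!dc->dc_present) {
			*linkp = next;	/* unlink dc from the chain */
			free(dc);
			removed++;
		} else {
			linkp = &dc->dc_next;
		}
	}
	return (removed);
}

/* Count the cases still on the list. */
static size_t
demo_case_count(const demo_case_t *head)
{
	size_t n = 0;

	for (; head != NULL; head = head->dc_next)
		n++;
	return (n);
}
```

A caller would build the list with `demo_case_add()`, clear `dc_present` on stale entries, and then call `demo_purge_cases(&head)`. Advancing with `dc = dc->dc_next` in the loop header instead would read a node that may already have been freed.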


@ -82,7 +82,7 @@
#include <sys/sunddi.h> #include <sys/sunddi.h>
#include <sys/sysevent/eventdefs.h> #include <sys/sysevent/eventdefs.h>
#include <sys/sysevent/dev.h> #include <sys/sysevent/dev.h>
#include <sys/taskq.h> #include <thread_pool.h>
#include <pthread.h> #include <pthread.h>
#include <unistd.h> #include <unistd.h>
#include <errno.h> #include <errno.h>
@ -98,7 +98,7 @@ typedef void (*zfs_process_func_t)(zpool_handle_t *, nvlist_t *, boolean_t);
libzfs_handle_t *g_zfshdl; libzfs_handle_t *g_zfshdl;
list_t g_pool_list; /* list of unavailable pools at initialization */ list_t g_pool_list; /* list of unavailable pools at initialization */
list_t g_device_list; /* list of disks with asynchronous label request */ list_t g_device_list; /* list of disks with asynchronous label request */
taskq_t *g_taskq; tpool_t *g_tpool;
boolean_t g_enumeration_done; boolean_t g_enumeration_done;
pthread_t g_zfs_tid; /* zfs_enum_pools() thread */ pthread_t g_zfs_tid; /* zfs_enum_pools() thread */
@ -749,8 +749,8 @@ zfs_iter_pool(zpool_handle_t *zhp, void *data)
continue; continue;
if (zfs_toplevel_state(zhp) >= VDEV_STATE_DEGRADED) { if (zfs_toplevel_state(zhp) >= VDEV_STATE_DEGRADED) {
list_remove(&g_pool_list, pool); list_remove(&g_pool_list, pool);
(void) taskq_dispatch(g_taskq, zfs_enable_ds, (void) tpool_dispatch(g_tpool, zfs_enable_ds,
pool, TQ_SLEEP); pool);
break; break;
} }
} }
@ -1347,9 +1347,9 @@ zfs_slm_fini(void)
/* wait for zfs_enum_pools thread to complete */ /* wait for zfs_enum_pools thread to complete */
(void) pthread_join(g_zfs_tid, NULL); (void) pthread_join(g_zfs_tid, NULL);
/* destroy the thread pool */ /* destroy the thread pool */
if (g_taskq != NULL) { if (g_tpool != NULL) {
taskq_wait(g_taskq); tpool_wait(g_tpool);
taskq_destroy(g_taskq); tpool_destroy(g_tpool);
} }
while ((pool = list_remove_head(&g_pool_list)) != NULL) { while ((pool = list_remove_head(&g_pool_list)) != NULL) {


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
zedconfdir = $(sysconfdir)/zfs/zed.d zedconfdir = $(sysconfdir)/zfs/zed.d
dist_zedconf_DATA = \ dist_zedconf_DATA = \
%D%/zed-functions.sh \ %D%/zed-functions.sh \


@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
sbin_PROGRAMS += zfs sbin_PROGRAMS += zfs
CPPCHECKTARGETS += zfs CPPCHECKTARGETS += zfs
@ -13,7 +12,8 @@ zfs_SOURCES = \
zfs_LDADD = \ zfs_LDADD = \
libzfs.la \ libzfs.la \
libzfs_core.la \ libzfs_core.la \
libnvpair.la libnvpair.la \
libuutil.la
zfs_LDADD += $(LTLIBINTL) zfs_LDADD += $(LTLIBINTL)


@ -28,6 +28,7 @@
*/ */
#include <libintl.h> #include <libintl.h>
#include <libuutil.h>
#include <stddef.h> #include <stddef.h>
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
@ -49,16 +50,14 @@
* When finished, we have an AVL tree of ZFS handles. We go through and execute * When finished, we have an AVL tree of ZFS handles. We go through and execute
* the provided callback for each one, passing whatever data the user supplied. * the provided callback for each one, passing whatever data the user supplied.
*/ */
typedef struct callback_data callback_data_t;
typedef struct zfs_node { typedef struct zfs_node {
zfs_handle_t *zn_handle; zfs_handle_t *zn_handle;
callback_data_t *zn_callback; uu_avl_node_t zn_avlnode;
avl_node_t zn_avlnode;
} zfs_node_t; } zfs_node_t;
struct callback_data { typedef struct callback_data {
avl_tree_t cb_avl; uu_avl_t *cb_avl;
int cb_flags; int cb_flags;
zfs_type_t cb_types; zfs_type_t cb_types;
zfs_sort_column_t *cb_sortcol; zfs_sort_column_t *cb_sortcol;
@ -66,7 +65,9 @@ struct callback_data {
int cb_depth_limit; int cb_depth_limit;
int cb_depth; int cb_depth;
uint8_t cb_props_table[ZFS_NUM_PROPS]; uint8_t cb_props_table[ZFS_NUM_PROPS];
}; } callback_data_t;
uu_avl_pool_t *avl_pool;
/* /*
* Include snaps if they were requested or if this a zfs list where types * Include snaps if they were requested or if this a zfs list where types
@ -98,12 +99,13 @@ zfs_callback(zfs_handle_t *zhp, void *data)
if ((zfs_get_type(zhp) & cb->cb_types) || if ((zfs_get_type(zhp) & cb->cb_types) ||
((zfs_get_type(zhp) == ZFS_TYPE_SNAPSHOT) && include_snaps)) { ((zfs_get_type(zhp) == ZFS_TYPE_SNAPSHOT) && include_snaps)) {
avl_index_t idx; uu_avl_index_t idx;
zfs_node_t *node = safe_malloc(sizeof (zfs_node_t)); zfs_node_t *node = safe_malloc(sizeof (zfs_node_t));
node->zn_handle = zhp; node->zn_handle = zhp;
node->zn_callback = cb; uu_avl_node_init(node, &node->zn_avlnode, avl_pool);
if (avl_find(&cb->cb_avl, node, &idx) == NULL) { if (uu_avl_find(cb->cb_avl, node, cb->cb_sortcol,
&idx) == NULL) {
if (cb->cb_proplist) { if (cb->cb_proplist) {
if ((*cb->cb_proplist) && if ((*cb->cb_proplist) &&
!(*cb->cb_proplist)->pl_all) !(*cb->cb_proplist)->pl_all)
@ -118,7 +120,7 @@ zfs_callback(zfs_handle_t *zhp, void *data)
return (-1); return (-1);
} }
} }
avl_insert(&cb->cb_avl, node, idx); uu_avl_insert(cb->cb_avl, node, idx);
should_close = B_FALSE; should_close = B_FALSE;
} else { } else {
free(node); free(node);
@ -284,7 +286,7 @@ zfs_compare(const void *larg, const void *rarg)
if (rat != NULL) if (rat != NULL)
*rat = '\0'; *rat = '\0';
ret = TREE_ISIGN(strcmp(lname, rname)); ret = strcmp(lname, rname);
if (ret == 0 && (lat != NULL || rat != NULL)) { if (ret == 0 && (lat != NULL || rat != NULL)) {
/* /*
* If we're comparing a dataset to one of its snapshots, we * If we're comparing a dataset to one of its snapshots, we
@ -338,11 +340,11 @@ zfs_compare(const void *larg, const void *rarg)
* with snapshots grouped under their parents. * with snapshots grouped under their parents.
*/ */
static int static int
zfs_sort(const void *larg, const void *rarg) zfs_sort(const void *larg, const void *rarg, void *data)
{ {
zfs_handle_t *l = ((zfs_node_t *)larg)->zn_handle; zfs_handle_t *l = ((zfs_node_t *)larg)->zn_handle;
zfs_handle_t *r = ((zfs_node_t *)rarg)->zn_handle; zfs_handle_t *r = ((zfs_node_t *)rarg)->zn_handle;
zfs_sort_column_t *sc = ((zfs_node_t *)larg)->zn_callback->cb_sortcol; zfs_sort_column_t *sc = (zfs_sort_column_t *)data;
zfs_sort_column_t *psc; zfs_sort_column_t *psc;
for (psc = sc; psc != NULL; psc = psc->sc_next) { for (psc = sc; psc != NULL; psc = psc->sc_next) {
@ -412,7 +414,7 @@ zfs_sort(const void *larg, const void *rarg)
return (-1); return (-1);
if (lstr) if (lstr)
ret = TREE_ISIGN(strcmp(lstr, rstr)); ret = strcmp(lstr, rstr);
else if (lnum < rnum) else if (lnum < rnum)
ret = -1; ret = -1;
else if (lnum > rnum) else if (lnum > rnum)
@ -436,6 +438,13 @@ zfs_for_each(int argc, char **argv, int flags, zfs_type_t types,
callback_data_t cb = {0}; callback_data_t cb = {0};
int ret = 0; int ret = 0;
zfs_node_t *node; zfs_node_t *node;
uu_avl_walk_t *walk;
avl_pool = uu_avl_pool_create("zfs_pool", sizeof (zfs_node_t),
offsetof(zfs_node_t, zn_avlnode), zfs_sort, UU_DEFAULT);
if (avl_pool == NULL)
nomem();
cb.cb_sortcol = sortcol; cb.cb_sortcol = sortcol;
cb.cb_flags = flags; cb.cb_flags = flags;
@ -480,8 +489,8 @@ zfs_for_each(int argc, char **argv, int flags, zfs_type_t types,
sizeof (cb.cb_props_table)); sizeof (cb.cb_props_table));
} }
avl_create(&cb.cb_avl, zfs_sort, if ((cb.cb_avl = uu_avl_create(avl_pool, NULL, UU_DEFAULT)) == NULL)
sizeof (zfs_node_t), offsetof(zfs_node_t, zn_avlnode)); nomem();
if (argc == 0) { if (argc == 0) {
/* /*
@ -522,20 +531,25 @@ zfs_for_each(int argc, char **argv, int flags, zfs_type_t types,
* At this point we've got our AVL tree full of zfs handles, so iterate * At this point we've got our AVL tree full of zfs handles, so iterate
* over each one and execute the real user callback. * over each one and execute the real user callback.
*/ */
for (node = avl_first(&cb.cb_avl); node != NULL; for (node = uu_avl_first(cb.cb_avl); node != NULL;
node = AVL_NEXT(&cb.cb_avl, node)) node = uu_avl_next(cb.cb_avl, node))
ret |= callback(node->zn_handle, data); ret |= callback(node->zn_handle, data);
/* /*
* Finally, clean up the AVL tree. * Finally, clean up the AVL tree.
*/ */
void *cookie = NULL; if ((walk = uu_avl_walk_start(cb.cb_avl, UU_WALK_ROBUST)) == NULL)
while ((node = avl_destroy_nodes(&cb.cb_avl, &cookie)) != NULL) { nomem();
while ((node = uu_avl_walk_next(walk)) != NULL) {
uu_avl_remove(cb.cb_avl, node);
zfs_close(node->zn_handle); zfs_close(node->zn_handle);
free(node); free(node);
} }
avl_destroy(&cb.cb_avl); uu_avl_walk_end(walk);
uu_avl_destroy(cb.cb_avl);
uu_avl_pool_destroy(avl_pool);
return (ret); return (ret);
} }
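Review note: the `zfs_sort()` hunks above show two ways for a comparator to reach its sort columns. One column goes through a back pointer stored in every node (`zn_callback`); the other takes a third `void *data` argument. They also show two treatments of the `strcmp()` result: wrapped in `TREE_ISIGN()` or returned raw. The sketch below combines a context-taking comparator with sign normalization. It is a standalone illustration only: `demo_sort_ctx_t`, `demo_compare_t`, `demo_sign()`, `demo_name_compare()` and `demo_sort()` are hypothetical names, and the insertion sort merely stands in for whatever container ends up calling the comparator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sort context, playing the role of the `void *data` argument. */
typedef struct demo_sort_ctx {
	int dsc_reverse;	/* nonzero to sort in descending order */
} demo_sort_ctx_t;

/* Comparator shape: two elements plus caller-supplied context. */
typedef int (*demo_compare_t)(const void *, const void *, void *);

/* Collapse any integer to exactly -1, 0, or 1. */
static int
demo_sign(int v)
{
	return ((v > 0) - (v < 0));
}

/* Compare two `const char *` elements, honoring the context's direction. */
static int
demo_name_compare(const void *larg, const void *rarg, void *data)
{
	const char *l = *(const char *const *)larg;
	const char *r = *(const char *const *)rarg;
	const demo_sort_ctx_t *ctx = data;
	int ret = demo_sign(strcmp(l, r));

	return (ctx->dsc_reverse ? -ret : ret);
}

/* Insertion sort over an array of strings, forwarding ctx to cmp. */
static void
demo_sort(const char **names, size_t n, demo_compare_t cmp, void *ctx)
{
	assert(cmp != NULL);
	for (size_t i = 1; i < n; i++) {
		const char *key = names[i];
		size_t j = i;

		while (j > 0 && cmp(&names[j - 1], &key, ctx) > 0) {
			names[j] = names[j - 1];
			j--;
		}
		names[j] = key;
	}
}
```

Passing the context as an argument keeps the node free of a per-element back pointer, at the cost of every lookup having to supply the same context. Normalizing the sign matters whenever the caller requires a result of exactly -1, 0, or 1, because `strcmp()` is only specified to return a negative, zero, or positive value.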


@ -42,6 +42,7 @@
#include <getopt.h> #include <getopt.h>
#include <libgen.h> #include <libgen.h>
#include <libintl.h> #include <libintl.h>
#include <libuutil.h>
#include <libnvpair.h> #include <libnvpair.h>
#include <locale.h> #include <locale.h>
#include <stddef.h> #include <stddef.h>
@ -439,8 +440,8 @@ get_usage(zfs_help_t idx)
return (gettext("\tredact <snapshot> <bookmark> " return (gettext("\tredact <snapshot> <bookmark> "
"<redaction_snapshot> ...\n")); "<redaction_snapshot> ...\n"));
case HELP_REWRITE: case HELP_REWRITE:
return (gettext("\trewrite [-CPSrvx] [-o <offset>] " return (gettext("\trewrite [-Prvx] [-o <offset>] [-l <length>] "
"[-l <length>] <directory|file ...>\n")); "<directory|file ...>\n"));
case HELP_JAIL: case HELP_JAIL:
return (gettext("\tjail <jailid|jailname> <filesystem>\n")); return (gettext("\tjail <jailid|jailname> <filesystem>\n"));
case HELP_UNJAIL: case HELP_UNJAIL:
@ -2852,27 +2853,31 @@ static int us_type_bits[] = {
static const char *const us_type_names[] = { "posixgroup", "posixuser", static const char *const us_type_names[] = { "posixgroup", "posixuser",
"smbgroup", "smbuser", "all" }; "smbgroup", "smbuser", "all" };
typedef struct us_cbdata us_cbdata_t;
typedef struct us_node { typedef struct us_node {
nvlist_t *usn_nvl; nvlist_t *usn_nvl;
us_cbdata_t *usn_cbdata; uu_avl_node_t usn_avlnode;
avl_node_t usn_avlnode; uu_list_node_t usn_listnode;
list_node_t usn_listnode;
} us_node_t; } us_node_t;
struct us_cbdata { typedef struct us_cbdata {
nvlist_t **cb_nvlp; nvlist_t **cb_nvlp;
avl_tree_t cb_avl; uu_avl_pool_t *cb_avl_pool;
uu_avl_t *cb_avl;
boolean_t cb_numname; boolean_t cb_numname;
boolean_t cb_nicenum; boolean_t cb_nicenum;
boolean_t cb_sid2posix; boolean_t cb_sid2posix;
zfs_userquota_prop_t cb_prop; zfs_userquota_prop_t cb_prop;
zfs_sort_column_t *cb_sortcol; zfs_sort_column_t *cb_sortcol;
size_t cb_width[USFIELD_LAST]; size_t cb_width[USFIELD_LAST];
}; } us_cbdata_t;
static boolean_t us_populated = B_FALSE; static boolean_t us_populated = B_FALSE;
typedef struct {
zfs_sort_column_t *si_sortcol;
boolean_t si_numname;
} us_sort_info_t;
static int static int
us_field_index(const char *field) us_field_index(const char *field)
{ {
@ -2885,12 +2890,13 @@ us_field_index(const char *field)
} }
static int static int
us_compare(const void *larg, const void *rarg) us_compare(const void *larg, const void *rarg, void *unused)
{ {
const us_node_t *l = larg; const us_node_t *l = larg;
const us_node_t *r = rarg; const us_node_t *r = rarg;
zfs_sort_column_t *sortcol = l->usn_cbdata->cb_sortcol; us_sort_info_t *si = (us_sort_info_t *)unused;
boolean_t numname = l->usn_cbdata->cb_numname; zfs_sort_column_t *sortcol = si->si_sortcol;
boolean_t numname = si->si_numname;
nvlist_t *lnvl = l->usn_nvl; nvlist_t *lnvl = l->usn_nvl;
nvlist_t *rnvl = r->usn_nvl; nvlist_t *rnvl = r->usn_nvl;
int rc = 0; int rc = 0;
@ -3024,22 +3030,25 @@ userspace_cb(void *arg, const char *domain, uid_t rid, uint64_t space,
const char *propname; const char *propname;
char sizebuf[32]; char sizebuf[32];
us_node_t *node; us_node_t *node;
avl_tree_t *avl = &cb->cb_avl; uu_avl_pool_t *avl_pool = cb->cb_avl_pool;
avl_index_t idx; uu_avl_t *avl = cb->cb_avl;
uu_avl_index_t idx;
nvlist_t *props; nvlist_t *props;
us_node_t *n; us_node_t *n;
zfs_sort_column_t *sortcol = cb->cb_sortcol;
unsigned type = 0; unsigned type = 0;
const char *typestr; const char *typestr;
size_t namelen; size_t namelen;
size_t typelen; size_t typelen;
size_t sizelen; size_t sizelen;
int typeidx, nameidx, sizeidx; int typeidx, nameidx, sizeidx;
us_sort_info_t sortinfo = { sortcol, cb->cb_numname };
boolean_t smbentity = B_FALSE; boolean_t smbentity = B_FALSE;
if (nvlist_alloc(&props, NV_UNIQUE_NAME, 0) != 0) if (nvlist_alloc(&props, NV_UNIQUE_NAME, 0) != 0)
nomem(); nomem();
node = safe_malloc(sizeof (us_node_t)); node = safe_malloc(sizeof (us_node_t));
node->usn_cbdata = cb; uu_avl_node_init(node, &node->usn_avlnode, avl_pool);
node->usn_nvl = props; node->usn_nvl = props;
if (domain != NULL && domain[0] != '\0') { if (domain != NULL && domain[0] != '\0') {
@ -3141,8 +3150,8 @@ userspace_cb(void *arg, const char *domain, uid_t rid, uint64_t space,
* Check if this type/name combination is in the list and update it; * Check if this type/name combination is in the list and update it;
* otherwise add new node to the list. * otherwise add new node to the list.
*/ */
if ((n = avl_find(avl, node, &idx)) == NULL) { if ((n = uu_avl_find(avl, node, &sortinfo, &idx)) == NULL) {
avl_insert(avl, node, idx); uu_avl_insert(avl, node, idx);
} else { } else {
nvlist_free(props); nvlist_free(props);
free(node); free(node);
@ -3316,7 +3325,7 @@ print_us_node(boolean_t scripted, boolean_t parsable, int *fields, int types,
static void static void
print_us(boolean_t scripted, boolean_t parsable, int *fields, int types, print_us(boolean_t scripted, boolean_t parsable, int *fields, int types,
size_t *width, boolean_t rmnode, avl_tree_t *avl) size_t *width, boolean_t rmnode, uu_avl_t *avl)
{ {
us_node_t *node; us_node_t *node;
const char *col; const char *col;
@ -3341,7 +3350,7 @@ print_us(boolean_t scripted, boolean_t parsable, int *fields, int types,
(void) printf("\n"); (void) printf("\n");
} }
for (node = avl_first(avl); node; node = AVL_NEXT(avl, node)) { for (node = uu_avl_first(avl); node; node = uu_avl_next(avl, node)) {
print_us_node(scripted, parsable, fields, types, width, node); print_us_node(scripted, parsable, fields, types, width, node);
if (rmnode) if (rmnode)
nvlist_free(node->usn_nvl); nvlist_free(node->usn_nvl);
@ -3353,6 +3362,9 @@ zfs_do_userspace(int argc, char **argv)
{ {
zfs_handle_t *zhp; zfs_handle_t *zhp;
zfs_userquota_prop_t p; zfs_userquota_prop_t p;
uu_avl_pool_t *avl_pool;
uu_avl_t *avl_tree;
uu_avl_walk_t *walk;
char *delim; char *delim;
char deffields[] = "type,name,used,quota,objused,objquota"; char deffields[] = "type,name,used,quota,objused,objquota";
char *ofield = NULL; char *ofield = NULL;
@ -3371,8 +3383,10 @@ zfs_do_userspace(int argc, char **argv)
us_cbdata_t cb; us_cbdata_t cb;
us_node_t *node; us_node_t *node;
us_node_t *rmnode; us_node_t *rmnode;
list_t list; uu_list_pool_t *listpool;
avl_index_t idx = 0; uu_list_t *list;
uu_avl_index_t idx = 0;
uu_list_index_t idx2 = 0;
if (argc < 2) if (argc < 2)
usage(B_FALSE); usage(B_FALSE);
@ -3506,6 +3520,12 @@ zfs_do_userspace(int argc, char **argv)
return (1); return (1);
} }
if ((avl_pool = uu_avl_pool_create("us_avl_pool", sizeof (us_node_t),
offsetof(us_node_t, usn_avlnode), us_compare, UU_DEFAULT)) == NULL)
nomem();
if ((avl_tree = uu_avl_create(avl_pool, NULL, UU_DEFAULT)) == NULL)
nomem();
/* Always add default sorting columns */ /* Always add default sorting columns */
(void) zfs_add_sort_column(&sortcol, "type", B_FALSE); (void) zfs_add_sort_column(&sortcol, "type", B_FALSE);
(void) zfs_add_sort_column(&sortcol, "name", B_FALSE); (void) zfs_add_sort_column(&sortcol, "name", B_FALSE);
@ -3513,12 +3533,10 @@ zfs_do_userspace(int argc, char **argv)
cb.cb_sortcol = sortcol; cb.cb_sortcol = sortcol;
cb.cb_numname = prtnum; cb.cb_numname = prtnum;
cb.cb_nicenum = !parsable; cb.cb_nicenum = !parsable;
cb.cb_avl_pool = avl_pool;
cb.cb_avl = avl_tree;
cb.cb_sid2posix = sid2posix; cb.cb_sid2posix = sid2posix;
avl_create(&cb.cb_avl, us_compare,
sizeof (us_node_t), offsetof(us_node_t, usn_avlnode));
for (i = 0; i < USFIELD_LAST; i++) for (i = 0; i < USFIELD_LAST; i++)
cb.cb_width[i] = strlen(gettext(us_field_hdr[i])); cb.cb_width[i] = strlen(gettext(us_field_hdr[i]));
@ -3533,52 +3551,59 @@ zfs_do_userspace(int argc, char **argv)
cb.cb_prop = p; cb.cb_prop = p;
if ((ret = zfs_userspace(zhp, p, userspace_cb, &cb)) != 0) { if ((ret = zfs_userspace(zhp, p, userspace_cb, &cb)) != 0) {
zfs_close(zhp); zfs_close(zhp);
avl_destroy(&cb.cb_avl);
return (ret); return (ret);
} }
} }
zfs_close(zhp); zfs_close(zhp);
/* Sort the list */ /* Sort the list */
if ((node = avl_first(&cb.cb_avl)) == NULL) { if ((node = uu_avl_first(avl_tree)) == NULL)
avl_destroy(&cb.cb_avl);
return (0); return (0);
}
us_populated = B_TRUE; us_populated = B_TRUE;
list_create(&list, sizeof (us_node_t), listpool = uu_list_pool_create("tmplist", sizeof (us_node_t),
offsetof(us_node_t, usn_listnode)); offsetof(us_node_t, usn_listnode), NULL, UU_DEFAULT);
list_link_init(&node->usn_listnode); list = uu_list_create(listpool, NULL, UU_DEFAULT);
uu_list_node_init(node, &node->usn_listnode, listpool);
while (node != NULL) { while (node != NULL) {
rmnode = node; rmnode = node;
node = AVL_NEXT(&cb.cb_avl, node); node = uu_avl_next(avl_tree, node);
avl_remove(&cb.cb_avl, rmnode); uu_avl_remove(avl_tree, rmnode);
list_insert_head(&list, rmnode); if (uu_list_find(list, rmnode, NULL, &idx2) == NULL)
uu_list_insert(list, rmnode, idx2);
} }
for (node = list_head(&list); node != NULL; for (node = uu_list_first(list); node != NULL;
node = list_next(&list, node)) { node = uu_list_next(list, node)) {
if (avl_find(&cb.cb_avl, node, &idx) == NULL) us_sort_info_t sortinfo = { sortcol, cb.cb_numname };
avl_insert(&cb.cb_avl, node, idx);
if (uu_avl_find(avl_tree, node, &sortinfo, &idx) == NULL)
uu_avl_insert(avl_tree, node, idx);
} }
while ((node = list_remove_head(&list)) != NULL) { } uu_list_destroy(list);
list_destroy(&list); uu_list_pool_destroy(listpool);
/* Print and free node nvlist memory */ /* Print and free node nvlist memory */
print_us(scripted, parsable, fields, types, cb.cb_width, B_TRUE, print_us(scripted, parsable, fields, types, cb.cb_width, B_TRUE,
&cb.cb_avl); cb.cb_avl);
zfs_free_sort_columns(sortcol); zfs_free_sort_columns(sortcol);
/* Clean up the AVL tree */ /* Clean up the AVL tree */
void *cookie = NULL; if ((walk = uu_avl_walk_start(cb.cb_avl, UU_WALK_ROBUST)) == NULL)
while ((node = avl_destroy_nodes(&cb.cb_avl, &cookie)) != NULL) { nomem();
while ((node = uu_avl_walk_next(walk)) != NULL) {
uu_avl_remove(cb.cb_avl, node);
free(node); free(node);
} }
avl_destroy(&cb.cb_avl);
uu_avl_walk_end(walk);
uu_avl_destroy(avl_tree);
uu_avl_pool_destroy(avl_pool);
return (ret); return (ret);
} }
@ -5384,7 +5409,7 @@ typedef struct deleg_perm {
typedef struct deleg_perm_node { typedef struct deleg_perm_node {
deleg_perm_t dpn_perm; deleg_perm_t dpn_perm;
avl_node_t dpn_avl_node; uu_avl_node_t dpn_avl_node;
} deleg_perm_node_t; } deleg_perm_node_t;
typedef struct fs_perm fs_perm_t; typedef struct fs_perm fs_perm_t;
@ -5396,13 +5421,13 @@ typedef struct who_perm {
char who_ug_name[256]; /* user/group name */ char who_ug_name[256]; /* user/group name */
fs_perm_t *who_fsperm; /* uplink */ fs_perm_t *who_fsperm; /* uplink */
avl_tree_t who_deleg_perm_avl; /* permissions */ uu_avl_t *who_deleg_perm_avl; /* permissions */
} who_perm_t; } who_perm_t;
/* */ /* */
typedef struct who_perm_node { typedef struct who_perm_node {
who_perm_t who_perm; who_perm_t who_perm;
avl_node_t who_avl_node; uu_avl_node_t who_avl_node;
} who_perm_node_t; } who_perm_node_t;
typedef struct fs_perm_set fs_perm_set_t; typedef struct fs_perm_set fs_perm_set_t;
@ -5410,8 +5435,8 @@ typedef struct fs_perm_set fs_perm_set_t;
struct fs_perm { struct fs_perm {
const char *fsp_name; const char *fsp_name;
avl_tree_t fsp_sc_avl; /* sets,create */ uu_avl_t *fsp_sc_avl; /* sets,create */
avl_tree_t fsp_uge_avl; /* user,group,everyone */ uu_avl_t *fsp_uge_avl; /* user,group,everyone */
fs_perm_set_t *fsp_set; /* uplink */ fs_perm_set_t *fsp_set; /* uplink */
}; };
@ -5419,14 +5444,19 @@ struct fs_perm {
/* */ /* */
typedef struct fs_perm_node { typedef struct fs_perm_node {
fs_perm_t fspn_fsperm; fs_perm_t fspn_fsperm;
avl_tree_t fspn_avl; uu_avl_t *fspn_avl;
list_node_t fspn_list_node; uu_list_node_t fspn_list_node;
} fs_perm_node_t; } fs_perm_node_t;
/* top level structure */ /* top level structure */
struct fs_perm_set { struct fs_perm_set {
list_t fsps_list; /* list of fs_perms */ uu_list_pool_t *fsps_list_pool;
uu_list_t *fsps_list; /* list of fs_perms */
uu_avl_pool_t *fsps_named_set_avl_pool;
uu_avl_pool_t *fsps_who_perm_avl_pool;
uu_avl_pool_t *fsps_deleg_perm_avl_pool;
}; };
static inline const char * static inline const char *
@ -5489,8 +5519,9 @@ who_type2weight(zfs_deleg_who_type_t who_type)
} }
static int static int
who_perm_compare(const void *larg, const void *rarg) who_perm_compare(const void *larg, const void *rarg, void *unused)
{ {
(void) unused;
const who_perm_node_t *l = larg; const who_perm_node_t *l = larg;
const who_perm_node_t *r = rarg; const who_perm_node_t *r = rarg;
zfs_deleg_who_type_t ltype = l->who_perm.who_type; zfs_deleg_who_type_t ltype = l->who_perm.who_type;
@ -5501,24 +5532,63 @@ who_perm_compare(const void *larg, const void *rarg)
if (res == 0) if (res == 0)
res = strncmp(l->who_perm.who_name, r->who_perm.who_name, res = strncmp(l->who_perm.who_name, r->who_perm.who_name,
ZFS_MAX_DELEG_NAME-1); ZFS_MAX_DELEG_NAME-1);
return (TREE_ISIGN(res));
if (res == 0)
return (0);
if (res > 0)
return (1);
else
return (-1);
} }
static int static int
deleg_perm_compare(const void *larg, const void *rarg) deleg_perm_compare(const void *larg, const void *rarg, void *unused)
{ {
(void) unused;
const deleg_perm_node_t *l = larg; const deleg_perm_node_t *l = larg;
const deleg_perm_node_t *r = rarg; const deleg_perm_node_t *r = rarg;
return (TREE_ISIGN(strncmp(l->dpn_perm.dp_name, r->dpn_perm.dp_name, int res = strncmp(l->dpn_perm.dp_name, r->dpn_perm.dp_name,
ZFS_MAX_DELEG_NAME-1))); ZFS_MAX_DELEG_NAME-1);
if (res == 0)
return (0);
if (res > 0)
return (1);
else
return (-1);
} }
static inline void static inline void
fs_perm_set_init(fs_perm_set_t *fspset) fs_perm_set_init(fs_perm_set_t *fspset)
{ {
memset(fspset, 0, sizeof (fs_perm_set_t)); memset(fspset, 0, sizeof (fs_perm_set_t));
list_create(&fspset->fsps_list, sizeof (fs_perm_node_t),
offsetof(fs_perm_node_t, fspn_list_node)); if ((fspset->fsps_list_pool = uu_list_pool_create("fsps_list_pool",
sizeof (fs_perm_node_t), offsetof(fs_perm_node_t, fspn_list_node),
NULL, UU_DEFAULT)) == NULL)
nomem();
if ((fspset->fsps_list = uu_list_create(fspset->fsps_list_pool, NULL,
UU_DEFAULT)) == NULL)
nomem();
if ((fspset->fsps_named_set_avl_pool = uu_avl_pool_create(
"named_set_avl_pool", sizeof (who_perm_node_t), offsetof(
who_perm_node_t, who_avl_node), who_perm_compare,
UU_DEFAULT)) == NULL)
nomem();
if ((fspset->fsps_who_perm_avl_pool = uu_avl_pool_create(
"who_perm_avl_pool", sizeof (who_perm_node_t), offsetof(
who_perm_node_t, who_avl_node), who_perm_compare,
UU_DEFAULT)) == NULL)
nomem();
if ((fspset->fsps_deleg_perm_avl_pool = uu_avl_pool_create(
"deleg_perm_avl_pool", sizeof (deleg_perm_node_t), offsetof(
deleg_perm_node_t, dpn_avl_node), deleg_perm_compare, UU_DEFAULT))
== NULL)
nomem();
} }
static inline void fs_perm_fini(fs_perm_t *); static inline void fs_perm_fini(fs_perm_t *);
@ -5527,13 +5597,21 @@ static inline void who_perm_fini(who_perm_t *);
static inline void static inline void
fs_perm_set_fini(fs_perm_set_t *fspset) fs_perm_set_fini(fs_perm_set_t *fspset)
{ {
fs_perm_node_t *node; fs_perm_node_t *node = uu_list_first(fspset->fsps_list);
while ((node = list_remove_head(&fspset->fsps_list)) != NULL) {
while (node != NULL) {
fs_perm_node_t *next_node =
uu_list_next(fspset->fsps_list, node);
fs_perm_t *fsperm = &node->fspn_fsperm; fs_perm_t *fsperm = &node->fspn_fsperm;
fs_perm_fini(fsperm); fs_perm_fini(fsperm);
uu_list_remove(fspset->fsps_list, node);
free(node); free(node);
node = next_node;
} }
list_destroy(&fspset->fsps_list);
uu_avl_pool_destroy(fspset->fsps_named_set_avl_pool);
uu_avl_pool_destroy(fspset->fsps_who_perm_avl_pool);
uu_avl_pool_destroy(fspset->fsps_deleg_perm_avl_pool);
} }
static inline void static inline void
@ -5548,11 +5626,14 @@ static inline void
who_perm_init(who_perm_t *who_perm, fs_perm_t *fsperm, who_perm_init(who_perm_t *who_perm, fs_perm_t *fsperm,
zfs_deleg_who_type_t type, const char *name) zfs_deleg_who_type_t type, const char *name)
{ {
uu_avl_pool_t *pool;
pool = fsperm->fsp_set->fsps_deleg_perm_avl_pool;
memset(who_perm, 0, sizeof (who_perm_t)); memset(who_perm, 0, sizeof (who_perm_t));
avl_create(&who_perm->who_deleg_perm_avl, deleg_perm_compare, if ((who_perm->who_deleg_perm_avl = uu_avl_create(pool, NULL,
sizeof (deleg_perm_node_t), UU_DEFAULT)) == NULL)
offsetof(deleg_perm_node_t, dpn_avl_node)); nomem();
who_perm->who_type = type; who_perm->who_type = type;
who_perm->who_name = name; who_perm->who_name = name;
@ -5562,26 +5643,35 @@ who_perm_init(who_perm_t *who_perm, fs_perm_t *fsperm,
static inline void static inline void
who_perm_fini(who_perm_t *who_perm) who_perm_fini(who_perm_t *who_perm)
{ {
deleg_perm_node_t *node; deleg_perm_node_t *node = uu_avl_first(who_perm->who_deleg_perm_avl);
void *cookie = NULL;
while ((node = avl_destroy_nodes(&who_perm->who_deleg_perm_avl, while (node != NULL) {
&cookie)) != NULL) { deleg_perm_node_t *next_node =
uu_avl_next(who_perm->who_deleg_perm_avl, node);
uu_avl_remove(who_perm->who_deleg_perm_avl, node);
free(node); free(node);
node = next_node;
} }
avl_destroy(&who_perm->who_deleg_perm_avl); uu_avl_destroy(who_perm->who_deleg_perm_avl);
} }
static inline void static inline void
fs_perm_init(fs_perm_t *fsperm, fs_perm_set_t *fspset, const char *fsname) fs_perm_init(fs_perm_t *fsperm, fs_perm_set_t *fspset, const char *fsname)
{ {
uu_avl_pool_t *nset_pool = fspset->fsps_named_set_avl_pool;
uu_avl_pool_t *who_pool = fspset->fsps_who_perm_avl_pool;
memset(fsperm, 0, sizeof (fs_perm_t)); memset(fsperm, 0, sizeof (fs_perm_t));
avl_create(&fsperm->fsp_sc_avl, who_perm_compare, if ((fsperm->fsp_sc_avl = uu_avl_create(nset_pool, NULL, UU_DEFAULT))
sizeof (who_perm_node_t), offsetof(who_perm_node_t, who_avl_node)); == NULL)
avl_create(&fsperm->fsp_uge_avl, who_perm_compare, nomem();
sizeof (who_perm_node_t), offsetof(who_perm_node_t, who_avl_node));
if ((fsperm->fsp_uge_avl = uu_avl_create(who_pool, NULL, UU_DEFAULT))
== NULL)
nomem();
fsperm->fsp_set = fspset; fsperm->fsp_set = fspset;
fsperm->fsp_name = fsname; fsperm->fsp_name = fsname;
@ -5590,41 +5680,46 @@ fs_perm_init(fs_perm_t *fsperm, fs_perm_set_t *fspset, const char *fsname)
static inline void static inline void
fs_perm_fini(fs_perm_t *fsperm) fs_perm_fini(fs_perm_t *fsperm)
{ {
who_perm_node_t *node; who_perm_node_t *node = uu_avl_first(fsperm->fsp_sc_avl);
void *cookie = NULL; while (node != NULL) {
who_perm_node_t *next_node = uu_avl_next(fsperm->fsp_sc_avl,
while ((node = avl_destroy_nodes(&fsperm->fsp_sc_avl, node);
&cookie)) != NULL) {
who_perm_t *who_perm = &node->who_perm; who_perm_t *who_perm = &node->who_perm;
who_perm_fini(who_perm); who_perm_fini(who_perm);
uu_avl_remove(fsperm->fsp_sc_avl, node);
free(node); free(node);
node = next_node;
} }
cookie = NULL; node = uu_avl_first(fsperm->fsp_uge_avl);
while ((node = avl_destroy_nodes(&fsperm->fsp_uge_avl, while (node != NULL) {
&cookie)) != NULL) { who_perm_node_t *next_node = uu_avl_next(fsperm->fsp_uge_avl,
node);
who_perm_t *who_perm = &node->who_perm; who_perm_t *who_perm = &node->who_perm;
who_perm_fini(who_perm); who_perm_fini(who_perm);
uu_avl_remove(fsperm->fsp_uge_avl, node);
free(node); free(node);
node = next_node;
} }
avl_destroy(&fsperm->fsp_sc_avl); uu_avl_destroy(fsperm->fsp_sc_avl);
avl_destroy(&fsperm->fsp_uge_avl); uu_avl_destroy(fsperm->fsp_uge_avl);
} }
static void static void
set_deleg_perm_node(avl_tree_t *avl, deleg_perm_node_t *node, set_deleg_perm_node(uu_avl_t *avl, deleg_perm_node_t *node,
zfs_deleg_who_type_t who_type, const char *name, char locality) zfs_deleg_who_type_t who_type, const char *name, char locality)
{ {
avl_index_t idx = 0; uu_avl_index_t idx = 0;
deleg_perm_node_t *found_node = NULL; deleg_perm_node_t *found_node = NULL;
deleg_perm_t *deleg_perm = &node->dpn_perm; deleg_perm_t *deleg_perm = &node->dpn_perm;
deleg_perm_init(deleg_perm, who_type, name); deleg_perm_init(deleg_perm, who_type, name);
if ((found_node = avl_find(avl, node, &idx)) == NULL) if ((found_node = uu_avl_find(avl, node, NULL, &idx))
avl_insert(avl, node, idx); == NULL)
uu_avl_insert(avl, node, idx);
else { else {
node = found_node; node = found_node;
deleg_perm = &node->dpn_perm; deleg_perm = &node->dpn_perm;
@ -5649,17 +5744,20 @@ static inline int
parse_who_perm(who_perm_t *who_perm, nvlist_t *nvl, char locality) parse_who_perm(who_perm_t *who_perm, nvlist_t *nvl, char locality)
{ {
nvpair_t *nvp = NULL; nvpair_t *nvp = NULL;
avl_tree_t *avl = &who_perm->who_deleg_perm_avl; fs_perm_set_t *fspset = who_perm->who_fsperm->fsp_set;
uu_avl_t *avl = who_perm->who_deleg_perm_avl;
zfs_deleg_who_type_t who_type = who_perm->who_type; zfs_deleg_who_type_t who_type = who_perm->who_type;
while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) { while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
const char *name = nvpair_name(nvp); const char *name = nvpair_name(nvp);
data_type_t type = nvpair_type(nvp); data_type_t type = nvpair_type(nvp);
uu_avl_pool_t *avl_pool = fspset->fsps_deleg_perm_avl_pool;
deleg_perm_node_t *node = deleg_perm_node_t *node =
safe_malloc(sizeof (deleg_perm_node_t)); safe_malloc(sizeof (deleg_perm_node_t));
VERIFY(type == DATA_TYPE_BOOLEAN); VERIFY(type == DATA_TYPE_BOOLEAN);
uu_avl_node_init(node, &node->dpn_avl_node, avl_pool);
set_deleg_perm_node(avl, node, who_type, name, locality); set_deleg_perm_node(avl, node, who_type, name, locality);
} }
@ -5670,11 +5768,13 @@ static inline int
parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl) parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl)
{ {
nvpair_t *nvp = NULL; nvpair_t *nvp = NULL;
fs_perm_set_t *fspset = fsperm->fsp_set;
while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) { while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
nvlist_t *nvl2 = NULL; nvlist_t *nvl2 = NULL;
const char *name = nvpair_name(nvp); const char *name = nvpair_name(nvp);
avl_tree_t *avl = NULL; uu_avl_t *avl = NULL;
uu_avl_pool_t *avl_pool = NULL;
zfs_deleg_who_type_t perm_type = name[0]; zfs_deleg_who_type_t perm_type = name[0];
char perm_locality = name[1]; char perm_locality = name[1];
const char *perm_name = name + 3; const char *perm_name = name + 3;
@ -5690,7 +5790,8 @@ parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl)
case ZFS_DELEG_CREATE_SETS: case ZFS_DELEG_CREATE_SETS:
case ZFS_DELEG_NAMED_SET: case ZFS_DELEG_NAMED_SET:
case ZFS_DELEG_NAMED_SET_SETS: case ZFS_DELEG_NAMED_SET_SETS:
avl = &fsperm->fsp_sc_avl; avl_pool = fspset->fsps_named_set_avl_pool;
avl = fsperm->fsp_sc_avl;
break; break;
case ZFS_DELEG_USER: case ZFS_DELEG_USER:
case ZFS_DELEG_USER_SETS: case ZFS_DELEG_USER_SETS:
@ -5698,7 +5799,8 @@ parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl)
case ZFS_DELEG_GROUP_SETS: case ZFS_DELEG_GROUP_SETS:
case ZFS_DELEG_EVERYONE: case ZFS_DELEG_EVERYONE:
case ZFS_DELEG_EVERYONE_SETS: case ZFS_DELEG_EVERYONE_SETS:
avl = &fsperm->fsp_uge_avl; avl_pool = fspset->fsps_who_perm_avl_pool;
avl = fsperm->fsp_uge_avl;
break; break;
default: default:
@ -5709,12 +5811,14 @@ parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl)
who_perm_node_t *node = safe_malloc( who_perm_node_t *node = safe_malloc(
sizeof (who_perm_node_t)); sizeof (who_perm_node_t));
who_perm = &node->who_perm; who_perm = &node->who_perm;
avl_index_t idx = 0; uu_avl_index_t idx = 0;
uu_avl_node_init(node, &node->who_avl_node, avl_pool);
who_perm_init(who_perm, fsperm, perm_type, perm_name); who_perm_init(who_perm, fsperm, perm_type, perm_name);
if ((found_node = avl_find(avl, node, &idx)) == NULL) { if ((found_node = uu_avl_find(avl, node, NULL, &idx))
if (avl == &fsperm->fsp_uge_avl) { == NULL) {
if (avl == fsperm->fsp_uge_avl) {
uid_t rid = 0; uid_t rid = 0;
struct passwd *p = NULL; struct passwd *p = NULL;
struct group *g = NULL; struct group *g = NULL;
@ -5753,7 +5857,7 @@ parse_fs_perm(fs_perm_t *fsperm, nvlist_t *nvl)
} }
} }
avl_insert(avl, node, idx); uu_avl_insert(avl, node, idx);
} else { } else {
node = found_node; node = found_node;
who_perm = &node->who_perm; who_perm = &node->who_perm;
@ -5770,6 +5874,7 @@ static inline int
parse_fs_perm_set(fs_perm_set_t *fspset, nvlist_t *nvl) parse_fs_perm_set(fs_perm_set_t *fspset, nvlist_t *nvl)
{ {
nvpair_t *nvp = NULL; nvpair_t *nvp = NULL;
uu_avl_index_t idx = 0;
while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) { while ((nvp = nvlist_next_nvpair(nvl, nvp)) != NULL) {
nvlist_t *nvl2 = NULL; nvlist_t *nvl2 = NULL;
@ -5782,6 +5887,10 @@ parse_fs_perm_set(fs_perm_set_t *fspset, nvlist_t *nvl)
VERIFY(DATA_TYPE_NVLIST == type); VERIFY(DATA_TYPE_NVLIST == type);
uu_list_node_init(node, &node->fspn_list_node,
fspset->fsps_list_pool);
idx = uu_list_numnodes(fspset->fsps_list);
fs_perm_init(fsperm, fspset, fsname); fs_perm_init(fsperm, fspset, fsname);
if (nvpair_value_nvlist(nvp, &nvl2) != 0) if (nvpair_value_nvlist(nvp, &nvl2) != 0)
@ -5789,7 +5898,7 @@ parse_fs_perm_set(fs_perm_set_t *fspset, nvlist_t *nvl)
(void) parse_fs_perm(fsperm, nvl2); (void) parse_fs_perm(fsperm, nvl2);
list_insert_tail(&fspset->fsps_list, node); uu_list_insert(fspset->fsps_list, node, idx);
} }
return (0); return (0);
@ -6341,7 +6450,7 @@ construct_fsacl_list(boolean_t un, struct allow_opts *opts, nvlist_t **nvlp)
} }
static void static void
print_set_creat_perms(avl_tree_t *who_avl) print_set_creat_perms(uu_avl_t *who_avl)
{ {
const char *sc_title[] = { const char *sc_title[] = {
gettext("Permission sets:\n"), gettext("Permission sets:\n"),
@ -6351,9 +6460,9 @@ print_set_creat_perms(avl_tree_t *who_avl)
who_perm_node_t *who_node = NULL; who_perm_node_t *who_node = NULL;
int prev_weight = -1; int prev_weight = -1;
for (who_node = avl_first(who_avl); who_node != NULL; for (who_node = uu_avl_first(who_avl); who_node != NULL;
who_node = AVL_NEXT(who_avl, who_node)) { who_node = uu_avl_next(who_avl, who_node)) {
avl_tree_t *avl = &who_node->who_perm.who_deleg_perm_avl; uu_avl_t *avl = who_node->who_perm.who_deleg_perm_avl;
zfs_deleg_who_type_t who_type = who_node->who_perm.who_type; zfs_deleg_who_type_t who_type = who_node->who_perm.who_type;
const char *who_name = who_node->who_perm.who_name; const char *who_name = who_node->who_perm.who_name;
int weight = who_type2weight(who_type); int weight = who_type2weight(who_type);
@ -6370,8 +6479,8 @@ print_set_creat_perms(avl_tree_t *who_avl)
else else
(void) printf("\t%s ", who_name); (void) printf("\t%s ", who_name);
for (deleg_node = avl_first(avl); deleg_node != NULL; for (deleg_node = uu_avl_first(avl); deleg_node != NULL;
deleg_node = AVL_NEXT(avl, deleg_node)) { deleg_node = uu_avl_next(avl, deleg_node)) {
if (first) { if (first) {
(void) printf("%s", (void) printf("%s",
deleg_node->dpn_perm.dp_name); deleg_node->dpn_perm.dp_name);
@ -6386,24 +6495,28 @@ print_set_creat_perms(avl_tree_t *who_avl)
} }
static void static void
print_uge_deleg_perms(avl_tree_t *who_avl, boolean_t local, boolean_t descend, print_uge_deleg_perms(uu_avl_t *who_avl, boolean_t local, boolean_t descend,
const char *title) const char *title)
{ {
who_perm_node_t *who_node = NULL; who_perm_node_t *who_node = NULL;
boolean_t prt_title = B_TRUE; boolean_t prt_title = B_TRUE;
uu_avl_walk_t *walk;
for (who_node = avl_first(who_avl); who_node != NULL; if ((walk = uu_avl_walk_start(who_avl, UU_WALK_ROBUST)) == NULL)
who_node = AVL_NEXT(who_avl, who_node)) { nomem();
while ((who_node = uu_avl_walk_next(walk)) != NULL) {
const char *who_name = who_node->who_perm.who_name; const char *who_name = who_node->who_perm.who_name;
const char *nice_who_name = who_node->who_perm.who_ug_name; const char *nice_who_name = who_node->who_perm.who_ug_name;
avl_tree_t *avl = &who_node->who_perm.who_deleg_perm_avl; uu_avl_t *avl = who_node->who_perm.who_deleg_perm_avl;
zfs_deleg_who_type_t who_type = who_node->who_perm.who_type; zfs_deleg_who_type_t who_type = who_node->who_perm.who_type;
char delim = ' '; char delim = ' ';
deleg_perm_node_t *deleg_node; deleg_perm_node_t *deleg_node;
boolean_t prt_who = B_TRUE; boolean_t prt_who = B_TRUE;
for (deleg_node = avl_first(avl); deleg_node != NULL; for (deleg_node = uu_avl_first(avl);
deleg_node = AVL_NEXT(avl, deleg_node)) { deleg_node != NULL;
deleg_node = uu_avl_next(avl, deleg_node)) {
if (local != deleg_node->dpn_perm.dp_local || if (local != deleg_node->dpn_perm.dp_local ||
descend != deleg_node->dpn_perm.dp_descend) descend != deleg_node->dpn_perm.dp_descend)
continue; continue;
@ -6453,6 +6566,8 @@ print_uge_deleg_perms(avl_tree_t *who_avl, boolean_t local, boolean_t descend,
if (!prt_who) if (!prt_who)
(void) printf("\n"); (void) printf("\n");
} }
uu_avl_walk_end(walk);
} }
static void static void
@ -6462,10 +6577,10 @@ print_fs_perms(fs_perm_set_t *fspset)
char buf[MAXNAMELEN + 32]; char buf[MAXNAMELEN + 32];
const char *dsname = buf; const char *dsname = buf;
for (node = list_head(&fspset->fsps_list); node != NULL; for (node = uu_list_first(fspset->fsps_list); node != NULL;
node = list_next(&fspset->fsps_list, node)) { node = uu_list_next(fspset->fsps_list, node)) {
avl_tree_t *sc_avl = &node->fspn_fsperm.fsp_sc_avl; uu_avl_t *sc_avl = node->fspn_fsperm.fsp_sc_avl;
avl_tree_t *uge_avl = &node->fspn_fsperm.fsp_uge_avl; uu_avl_t *uge_avl = node->fspn_fsperm.fsp_uge_avl;
int left = 0; int left = 0;
(void) snprintf(buf, sizeof (buf), (void) snprintf(buf, sizeof (buf),
@ -6487,7 +6602,7 @@ print_fs_perms(fs_perm_set_t *fspset)
} }
} }
static fs_perm_set_t fs_perm_set = {}; static fs_perm_set_t fs_perm_set = { NULL, NULL, NULL, NULL };
struct deleg_perms { struct deleg_perms {
boolean_t un; boolean_t un;
@ -7347,14 +7462,15 @@ append_options(char *mntopts, char *newopts)
static enum sa_protocol static enum sa_protocol
sa_protocol_decode(const char *protocol) sa_protocol_decode(const char *protocol)
{ {
for (enum sa_protocol i = 0; i < SA_PROTOCOL_COUNT; ++i) for (enum sa_protocol i = 0; i < ARRAY_SIZE(sa_protocol_names); ++i)
if (strcmp(protocol, zfs_share_protocol_name(i)) == 0) if (strcmp(protocol, sa_protocol_names[i]) == 0)
return (i); return (i);
(void) fputs(gettext("share type must be one of: "), stderr); (void) fputs(gettext("share type must be one of: "), stderr);
for (enum sa_protocol i = 0; i < SA_PROTOCOL_COUNT; ++i) for (enum sa_protocol i = 0;
i < ARRAY_SIZE(sa_protocol_names); ++i)
(void) fprintf(stderr, "%s%s", (void) fprintf(stderr, "%s%s",
i != 0 ? ", " : "", zfs_share_protocol_name(i)); i != 0 ? ", " : "", sa_protocol_names[i]);
(void) fputc('\n', stderr); (void) fputc('\n', stderr);
usage(B_FALSE); usage(B_FALSE);
} }
@ -7618,16 +7734,17 @@ zfs_do_share(int argc, char **argv)
typedef struct unshare_unmount_node { typedef struct unshare_unmount_node {
zfs_handle_t *un_zhp; zfs_handle_t *un_zhp;
char *un_mountp; char *un_mountp;
avl_node_t un_avlnode; uu_avl_node_t un_avlnode;
} unshare_unmount_node_t; } unshare_unmount_node_t;
static int static int
unshare_unmount_compare(const void *larg, const void *rarg) unshare_unmount_compare(const void *larg, const void *rarg, void *unused)
{ {
(void) unused;
const unshare_unmount_node_t *l = larg; const unshare_unmount_node_t *l = larg;
const unshare_unmount_node_t *r = rarg; const unshare_unmount_node_t *r = rarg;
return (TREE_ISIGN(strcmp(l->un_mountp, r->un_mountp))); return (strcmp(l->un_mountp, r->un_mountp));
} }
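The two sides of this hunk differ in whether the `strcmp()` result is normalized before being returned from the comparator. The sketch below illustrates the normalizing variant in isolation. `SKETCH_ISIGN`, `sketch_node`, and `sketch_compare` are hypothetical names for this example; the macro body is an assumption about what a sign-normalizing helper such as `TREE_ISIGN` does, not a copy of the project's definition.

```c
#include <assert.h>
#include <string.h>

/*
 * Assumed sign-normalizing helper: maps any integer to -1, 0, or 1.
 * The argument is evaluated twice, so it should be free of side effects.
 */
#define	SKETCH_ISIGN(a)	(((a) > 0) - ((a) < 0))

/* Hypothetical node type holding only the key used for ordering. */
struct sketch_node {
	const char *mountp;
};

/* Two-argument comparator returning exactly -1, 0, or 1. */
static int
sketch_compare(const void *larg, const void *rarg)
{
	const struct sketch_node *l = larg;
	const struct sketch_node *r = rarg;

	return (SKETCH_ISIGN(strcmp(l->mountp, r->mountp)));
}
```

Raw `strcmp()` is only guaranteed to return a negative, zero, or positive value, so a tree implementation that expects exactly -1, 0, or 1 needs the normalized form, while one that only inspects the sign can accept either.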
/* /*
@ -7809,9 +7926,11 @@ unshare_unmount(int op, int argc, char **argv)
*/ */
FILE *mnttab; FILE *mnttab;
struct mnttab entry; struct mnttab entry;
avl_tree_t tree; uu_avl_pool_t *pool;
uu_avl_t *tree = NULL;
unshare_unmount_node_t *node; unshare_unmount_node_t *node;
avl_index_t idx; uu_avl_index_t idx;
uu_avl_walk_t *walk;
enum sa_protocol *protocol = NULL, enum sa_protocol *protocol = NULL,
single_protocol[] = {SA_NO_PROTOCOL, SA_NO_PROTOCOL}; single_protocol[] = {SA_NO_PROTOCOL, SA_NO_PROTOCOL};
@ -7827,12 +7946,16 @@ unshare_unmount(int op, int argc, char **argv)
usage(B_FALSE); usage(B_FALSE);
} }
avl_create(&tree, unshare_unmount_compare, if (((pool = uu_avl_pool_create("unmount_pool",
sizeof (unshare_unmount_node_t), sizeof (unshare_unmount_node_t),
offsetof(unshare_unmount_node_t, un_avlnode)); offsetof(unshare_unmount_node_t, un_avlnode),
unshare_unmount_compare, UU_DEFAULT)) == NULL) ||
((tree = uu_avl_create(pool, NULL, UU_DEFAULT)) == NULL))
nomem();
if ((mnttab = fopen(MNTTAB, "re")) == NULL) { if ((mnttab = fopen(MNTTAB, "re")) == NULL) {
avl_destroy(&tree); uu_avl_destroy(tree);
uu_avl_pool_destroy(pool);
return (ENOENT); return (ENOENT);
} }
@ -7897,8 +8020,10 @@ unshare_unmount(int op, int argc, char **argv)
node->un_zhp = zhp; node->un_zhp = zhp;
node->un_mountp = safe_strdup(entry.mnt_mountp); node->un_mountp = safe_strdup(entry.mnt_mountp);
if (avl_find(&tree, node, &idx) == NULL) { uu_avl_node_init(node, &node->un_avlnode, pool);
avl_insert(&tree, node, idx);
if (uu_avl_find(tree, node, NULL, &idx) == NULL) {
uu_avl_insert(tree, node, idx);
} else { } else {
zfs_close(node->un_zhp); zfs_close(node->un_zhp);
free(node->un_mountp); free(node->un_mountp);
@ -7911,10 +8036,14 @@ unshare_unmount(int op, int argc, char **argv)
* Walk the AVL tree in reverse, unmounting each filesystem and * Walk the AVL tree in reverse, unmounting each filesystem and
* removing it from the AVL tree in the process. * removing it from the AVL tree in the process.
*/ */
while ((node = avl_last(&tree)) != NULL) { if ((walk = uu_avl_walk_start(tree,
UU_WALK_REVERSE | UU_WALK_ROBUST)) == NULL)
nomem();
while ((node = uu_avl_walk_next(walk)) != NULL) {
const char *mntarg = NULL; const char *mntarg = NULL;
avl_remove(&tree, node); uu_avl_remove(tree, node);
switch (op) { switch (op) {
case OP_SHARE: case OP_SHARE:
if (zfs_unshare(node->un_zhp, if (zfs_unshare(node->un_zhp,
@ -7937,7 +8066,9 @@ unshare_unmount(int op, int argc, char **argv)
if (op == OP_SHARE) if (op == OP_SHARE)
zfs_commit_shares(protocol); zfs_commit_shares(protocol);
avl_destroy(&tree); uu_avl_walk_end(walk);
uu_avl_destroy(tree);
uu_avl_pool_destroy(pool);
} else { } else {
if (argc != 1) { if (argc != 1) {
@ -9080,17 +9211,11 @@ zfs_do_rewrite(int argc, char **argv)
zfs_rewrite_args_t args; zfs_rewrite_args_t args;
memset(&args, 0, sizeof (args)); memset(&args, 0, sizeof (args));
while ((c = getopt(argc, argv, "CPSl:o:rvx")) != -1) { while ((c = getopt(argc, argv, "Pl:o:rvx")) != -1) {
switch (c) { switch (c) {
case 'C':
args.flags |= ZFS_REWRITE_SKIP_BRT;
break;
case 'P': case 'P':
args.flags |= ZFS_REWRITE_PHYSICAL; args.flags |= ZFS_REWRITE_PHYSICAL;
break; break;
case 'S':
args.flags |= ZFS_REWRITE_SKIP_SNAPSHOT;
break;
case 'l': case 'l':
args.len = strtoll(optarg, NULL, 0); args.len = strtoll(optarg, NULL, 0);
break; break;

@ -56,7 +56,6 @@
#include <zfeature_common.h> #include <zfeature_common.h>
#include <libzutil.h> #include <libzutil.h>
#include <sys/metaslab_impl.h> #include <sys/metaslab_impl.h>
#include <libzpool.h>
static importargs_t g_importargs; static importargs_t g_importargs;
static char *g_pool; static char *g_pool;
@ -745,11 +744,8 @@ zhack_do_metaslab_leak(int argc, char **argv)
&start, &size), ==, 2); &start, &size), ==, 2);
ASSERT(vd); ASSERT(vd);
size_t idx; metaslab_t *cur =
idx = start >> vd->vdev_ms_shift; vd->vdev_ms[start >> vd->vdev_ms_shift];
if (idx >= vd->vdev_ms_count)
continue;
metaslab_t *cur = vd->vdev_ms[idx];
if (prev != cur) { if (prev != cur) {
if (prev) { if (prev) {
dmu_tx_commit(tx); dmu_tx_commit(tx);

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
sbin_PROGRAMS += zinject sbin_PROGRAMS += zinject
CPPCHECKTARGETS += zinject CPPCHECKTARGETS += zinject

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
zpool_CFLAGS = $(AM_CFLAGS) zpool_CFLAGS = $(AM_CFLAGS)
zpool_CFLAGS += $(LIBBLKID_CFLAGS) $(LIBUUID_CFLAGS) zpool_CFLAGS += $(LIBBLKID_CFLAGS) $(LIBUUID_CFLAGS)
@ -29,6 +28,7 @@ zpool_LDADD = \
libzfs.la \ libzfs.la \
libzfs_core.la \ libzfs_core.la \
libnvpair.la \ libnvpair.la \
libuutil.la \
libzutil.la libzutil.la
zpool_LDADD += $(LTLIBINTL) zpool_LDADD += $(LTLIBINTL)

@ -30,10 +30,12 @@
*/ */
#include <libintl.h> #include <libintl.h>
#include <libuutil.h>
#include <stddef.h> #include <stddef.h>
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include <thread_pool.h>
#include <libzfs.h> #include <libzfs.h>
#include <libzutil.h> #include <libzutil.h>
@ -50,28 +52,30 @@
typedef struct zpool_node { typedef struct zpool_node {
zpool_handle_t *zn_handle; zpool_handle_t *zn_handle;
avl_node_t zn_avlnode; uu_avl_node_t zn_avlnode;
hrtime_t zn_last_refresh; hrtime_t zn_last_refresh;
} zpool_node_t; } zpool_node_t;
struct zpool_list { struct zpool_list {
boolean_t zl_findall; boolean_t zl_findall;
boolean_t zl_literal; boolean_t zl_literal;
avl_tree_t zl_avl; uu_avl_t *zl_avl;
uu_avl_pool_t *zl_pool;
zprop_list_t **zl_proplist; zprop_list_t **zl_proplist;
zfs_type_t zl_type; zfs_type_t zl_type;
hrtime_t zl_last_refresh; hrtime_t zl_last_refresh;
}; };
static int static int
zpool_compare(const void *larg, const void *rarg) zpool_compare(const void *larg, const void *rarg, void *unused)
{ {
(void) unused;
zpool_handle_t *l = ((zpool_node_t *)larg)->zn_handle; zpool_handle_t *l = ((zpool_node_t *)larg)->zn_handle;
zpool_handle_t *r = ((zpool_node_t *)rarg)->zn_handle; zpool_handle_t *r = ((zpool_node_t *)rarg)->zn_handle;
const char *lname = zpool_get_name(l); const char *lname = zpool_get_name(l);
const char *rname = zpool_get_name(r); const char *rname = zpool_get_name(r);
return (TREE_ISIGN(strcmp(lname, rname))); return (strcmp(lname, rname));
} }
/* /*
@ -82,11 +86,12 @@ static int
add_pool(zpool_handle_t *zhp, zpool_list_t *zlp) add_pool(zpool_handle_t *zhp, zpool_list_t *zlp)
{ {
zpool_node_t *node, *new = safe_malloc(sizeof (zpool_node_t)); zpool_node_t *node, *new = safe_malloc(sizeof (zpool_node_t));
avl_index_t idx; uu_avl_index_t idx;
new->zn_handle = zhp; new->zn_handle = zhp;
uu_avl_node_init(new, &new->zn_avlnode, zlp->zl_pool);
node = avl_find(&zlp->zl_avl, new, &idx); node = uu_avl_find(zlp->zl_avl, new, NULL, &idx);
if (node == NULL) { if (node == NULL) {
if (zlp->zl_proplist && if (zlp->zl_proplist &&
zpool_expand_proplist(zhp, zlp->zl_proplist, zpool_expand_proplist(zhp, zlp->zl_proplist,
@ -96,7 +101,7 @@ add_pool(zpool_handle_t *zhp, zpool_list_t *zlp)
return (-1); return (-1);
} }
new->zn_last_refresh = zlp->zl_last_refresh; new->zn_last_refresh = zlp->zl_last_refresh;
avl_insert(&zlp->zl_avl, new, idx); uu_avl_insert(zlp->zl_avl, new, idx);
} else { } else {
zpool_refresh_stats_from_handle(node->zn_handle, zhp); zpool_refresh_stats_from_handle(node->zn_handle, zhp);
node->zn_last_refresh = zlp->zl_last_refresh; node->zn_last_refresh = zlp->zl_last_refresh;
@ -134,8 +139,15 @@ pool_list_get(int argc, char **argv, zprop_list_t **proplist, zfs_type_t type,
zlp = safe_malloc(sizeof (zpool_list_t)); zlp = safe_malloc(sizeof (zpool_list_t));
avl_create(&zlp->zl_avl, zpool_compare, zlp->zl_pool = uu_avl_pool_create("zfs_pool", sizeof (zpool_node_t),
sizeof (zpool_node_t), offsetof(zpool_node_t, zn_avlnode)); offsetof(zpool_node_t, zn_avlnode), zpool_compare, UU_DEFAULT);
if (zlp->zl_pool == NULL)
zpool_no_memory();
if ((zlp->zl_avl = uu_avl_create(zlp->zl_pool, NULL,
UU_DEFAULT)) == NULL)
zpool_no_memory();
zlp->zl_proplist = proplist; zlp->zl_proplist = proplist;
zlp->zl_type = type; zlp->zl_type = type;
@ -182,8 +194,8 @@ pool_list_refresh(zpool_list_t *zlp)
* state. * state.
*/ */
int navail = 0; int navail = 0;
for (zpool_node_t *node = avl_first(&zlp->zl_avl); for (zpool_node_t *node = uu_avl_first(zlp->zl_avl);
node != NULL; node = AVL_NEXT(&zlp->zl_avl, node)) { node != NULL; node = uu_avl_next(zlp->zl_avl, node)) {
boolean_t missing; boolean_t missing;
zpool_refresh_stats(node->zn_handle, &missing); zpool_refresh_stats(node->zn_handle, &missing);
navail += !missing; navail += !missing;
@ -197,8 +209,8 @@ pool_list_refresh(zpool_list_t *zlp)
/* Walk the list of existing pools, and update or remove them. */ /* Walk the list of existing pools, and update or remove them. */
zpool_node_t *node, *next; zpool_node_t *node, *next;
for (node = avl_first(&zlp->zl_avl); node != NULL; node = next) { for (node = uu_avl_first(zlp->zl_avl); node != NULL; node = next) {
next = AVL_NEXT(&zlp->zl_avl, node); next = uu_avl_next(zlp->zl_avl, node);
/* /*
* Skip any that were refreshed and are online; they were added * Skip any that were refreshed and are online; they were added
@ -212,7 +224,7 @@ pool_list_refresh(zpool_list_t *zlp)
boolean_t missing; boolean_t missing;
zpool_refresh_stats(node->zn_handle, &missing); zpool_refresh_stats(node->zn_handle, &missing);
if (missing) { if (missing) {
avl_remove(&zlp->zl_avl, node); uu_avl_remove(zlp->zl_avl, node);
zpool_close(node->zn_handle); zpool_close(node->zn_handle);
free(node); free(node);
} else { } else {
@ -220,7 +232,7 @@ pool_list_refresh(zpool_list_t *zlp)
} }
} }
return (avl_numnodes(&zlp->zl_avl)); return (uu_avl_numnodes(zlp->zl_avl));
} }
/* /*
@ -233,8 +245,8 @@ pool_list_iter(zpool_list_t *zlp, int unavail, zpool_iter_f func,
zpool_node_t *node, *next_node; zpool_node_t *node, *next_node;
int ret = 0; int ret = 0;
for (node = avl_first(&zlp->zl_avl); node != NULL; node = next_node) { for (node = uu_avl_first(zlp->zl_avl); node != NULL; node = next_node) {
next_node = AVL_NEXT(&zlp->zl_avl, node); next_node = uu_avl_next(zlp->zl_avl, node);
if (zpool_get_state(node->zn_handle) != POOL_STATE_UNAVAIL || if (zpool_get_state(node->zn_handle) != POOL_STATE_UNAVAIL ||
unavail) unavail)
ret |= func(node->zn_handle, data); ret |= func(node->zn_handle, data);
@ -249,15 +261,25 @@ pool_list_iter(zpool_list_t *zlp, int unavail, zpool_iter_f func,
void void
pool_list_free(zpool_list_t *zlp) pool_list_free(zpool_list_t *zlp)
{ {
uu_avl_walk_t *walk;
zpool_node_t *node; zpool_node_t *node;
void *cookie = NULL;
while ((node = avl_destroy_nodes(&zlp->zl_avl, &cookie)) != NULL) { if ((walk = uu_avl_walk_start(zlp->zl_avl, UU_WALK_ROBUST)) == NULL) {
(void) fprintf(stderr,
gettext("internal error: out of memory"));
exit(1);
}
while ((node = uu_avl_walk_next(walk)) != NULL) {
uu_avl_remove(zlp->zl_avl, node);
zpool_close(node->zn_handle); zpool_close(node->zn_handle);
free(node); free(node);
} }
avl_destroy(&zlp->zl_avl); uu_avl_walk_end(walk);
uu_avl_destroy(zlp->zl_avl);
uu_avl_pool_destroy(zlp->zl_pool);
free(zlp); free(zlp);
} }
@ -267,7 +289,7 @@ pool_list_free(zpool_list_t *zlp)
int int
pool_list_count(zpool_list_t *zlp) pool_list_count(zpool_list_t *zlp)
{ {
return (avl_numnodes(&zlp->zl_avl)); return (uu_avl_numnodes(zlp->zl_avl));
} }
/* /*
@ -652,21 +674,21 @@ all_pools_for_each_vdev_gather_cb(zpool_handle_t *zhp, void *cb_vcdl)
static void static void
all_pools_for_each_vdev_run_vcdl(vdev_cmd_data_list_t *vcdl) all_pools_for_each_vdev_run_vcdl(vdev_cmd_data_list_t *vcdl)
{ {
taskq_t *tq = taskq_create("vdev_run_cmd", tpool_t *t;
5 * sysconf(_SC_NPROCESSORS_ONLN), minclsyspri, 1, INT_MAX,
TASKQ_DYNAMIC); t = tpool_create(1, 5 * sysconf(_SC_NPROCESSORS_ONLN), 0, NULL);
if (tq == NULL) if (t == NULL)
return; return;
/* Spawn off the command for each vdev */ /* Spawn off the command for each vdev */
for (int i = 0; i < vcdl->count; i++) { for (int i = 0; i < vcdl->count; i++) {
(void) taskq_dispatch(tq, vdev_run_cmd_thread, (void) tpool_dispatch(t, vdev_run_cmd_thread,
(void *) &vcdl->data[i], TQ_SLEEP); (void *) &vcdl->data[i]);
} }
/* Wait for threads to finish */ /* Wait for threads to finish */
taskq_wait(tq); tpool_wait(t);
taskq_destroy(tq); tpool_destroy(t);
} }
/* /*

@ -46,12 +46,14 @@
#include <inttypes.h> #include <inttypes.h>
#include <libgen.h> #include <libgen.h>
#include <libintl.h> #include <libintl.h>
#include <libuutil.h>
#include <locale.h> #include <locale.h>
#include <pthread.h> #include <pthread.h>
#include <stdio.h> #include <stdio.h>
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
#include <termios.h> #include <termios.h>
#include <thread_pool.h>
#include <time.h> #include <time.h>
#include <unistd.h> #include <unistd.h>
#include <pwd.h> #include <pwd.h>
@ -2388,7 +2390,7 @@ zpool_do_destroy(int argc, char **argv)
} }
typedef struct export_cbdata { typedef struct export_cbdata {
taskq_t *taskq; tpool_t *tpool;
pthread_mutex_t mnttab_lock; pthread_mutex_t mnttab_lock;
boolean_t force; boolean_t force;
boolean_t hardforce; boolean_t hardforce;
@ -2413,12 +2415,12 @@ zpool_export_one(zpool_handle_t *zhp, void *data)
* zpool_disable_datasets() is not thread-safe for mnttab access. * zpool_disable_datasets() is not thread-safe for mnttab access.
* So we serialize access here for 'zpool export -a' parallel case. * So we serialize access here for 'zpool export -a' parallel case.
*/ */
if (cb->taskq != NULL) if (cb->tpool != NULL)
(void) pthread_mutex_lock(&cb->mnttab_lock); (void) pthread_mutex_lock(&cb->mnttab_lock);
int retval = zpool_disable_datasets(zhp, cb->force); int retval = zpool_disable_datasets(zhp, cb->force);
if (cb->taskq != NULL) if (cb->tpool != NULL)
(void) pthread_mutex_unlock(&cb->mnttab_lock); (void) pthread_mutex_unlock(&cb->mnttab_lock);
if (retval) if (retval)
@ -2462,7 +2464,7 @@ zpool_export_task(void *arg)
static int static int
zpool_export_one_async(zpool_handle_t *zhp, void *data) zpool_export_one_async(zpool_handle_t *zhp, void *data)
{ {
taskq_t *tq = ((export_cbdata_t *)data)->taskq; tpool_t *tpool = ((export_cbdata_t *)data)->tpool;
async_export_args_t *aea = safe_malloc(sizeof (async_export_args_t)); async_export_args_t *aea = safe_malloc(sizeof (async_export_args_t));
/* save pool name since zhp will go out of scope */ /* save pool name since zhp will go out of scope */
@ -2470,8 +2472,7 @@ zpool_export_one_async(zpool_handle_t *zhp, void *data)
aea->aea_cbdata = data; aea->aea_cbdata = data;
/* ship off actual export to another thread */ /* ship off actual export to another thread */
if (taskq_dispatch(tq, zpool_export_task, (void *)aea, if (tpool_dispatch(tpool, zpool_export_task, (void *)aea) != 0)
TQ_SLEEP) == TASKQID_INVALID)
return (errno); /* unlikely */ return (errno); /* unlikely */
else else
return (0); return (0);
@ -2517,7 +2518,7 @@ zpool_do_export(int argc, char **argv)
cb.force = force; cb.force = force;
cb.hardforce = hardforce; cb.hardforce = hardforce;
cb.taskq = NULL; cb.tpool = NULL;
cb.retval = 0; cb.retval = 0;
argc -= optind; argc -= optind;
argv += optind; argv += optind;
@ -2531,17 +2532,16 @@ zpool_do_export(int argc, char **argv)
usage(B_FALSE); usage(B_FALSE);
} }
cb.taskq = taskq_create("zpool_export", cb.tpool = tpool_create(1, 5 * sysconf(_SC_NPROCESSORS_ONLN),
5 * sysconf(_SC_NPROCESSORS_ONLN), minclsyspri, 1, INT_MAX, 0, NULL);
TASKQ_DYNAMIC);
(void) pthread_mutex_init(&cb.mnttab_lock, NULL); (void) pthread_mutex_init(&cb.mnttab_lock, NULL);
/* Asynchronously call zpool_export_one using thread pool */ /* Asynchronously call zpool_export_one using thread pool */
ret = for_each_pool(argc, argv, B_TRUE, NULL, ZFS_TYPE_POOL, ret = for_each_pool(argc, argv, B_TRUE, NULL, ZFS_TYPE_POOL,
B_FALSE, zpool_export_one_async, &cb); B_FALSE, zpool_export_one_async, &cb);
taskq_wait(cb.taskq); tpool_wait(cb.tpool);
taskq_destroy(cb.taskq); tpool_destroy(cb.tpool);
(void) pthread_mutex_destroy(&cb.mnttab_lock); (void) pthread_mutex_destroy(&cb.mnttab_lock);
return (ret | cb.retval); return (ret | cb.retval);
@ -3456,7 +3456,7 @@ show_import(nvlist_t *config, boolean_t report_error)
case ZPOOL_STATUS_CORRUPT_POOL: case ZPOOL_STATUS_CORRUPT_POOL:
(void) printf_color(ANSI_YELLOW, gettext("The pool metadata is " (void) printf_color(ANSI_YELLOW, gettext("The pool metadata is "
"incomplete or corrupted.\n")); "corrupted.\n"));
break; break;
case ZPOOL_STATUS_VERSION_OLDER: case ZPOOL_STATUS_VERSION_OLDER:
@ -3704,12 +3704,6 @@ show_import(nvlist_t *config, boolean_t report_error)
(void) printf(gettext("Set a unique system hostid with " (void) printf(gettext("Set a unique system hostid with "
"the zgenhostid(8) command.\n")); "the zgenhostid(8) command.\n"));
break; break;
case ZPOOL_STATUS_CORRUPT_POOL:
(void) printf(gettext("The pool cannot be imported due "
"to missing or damaged devices. Ensure\n"
"\t%sall devices are present and not in use by "
"another subsystem.\n"), indent);
break;
default: default:
(void) printf(gettext("The pool cannot be imported due " (void) printf(gettext("The pool cannot be imported due "
"to damaged devices or data.\n")); "to damaged devices or data.\n"));
@ -3955,11 +3949,10 @@ import_pools(nvlist_t *pools, nvlist_t *props, char *mntopts, int flags,
uint_t npools = 0; uint_t npools = 0;
taskq_t *tq = NULL; tpool_t *tp = NULL;
if (import->do_all) { if (import->do_all) {
tq = taskq_create("zpool_import_all", tp = tpool_create(1, 5 * sysconf(_SC_NPROCESSORS_ONLN),
5 * sysconf(_SC_NPROCESSORS_ONLN), minclsyspri, 1, INT_MAX, 0, NULL);
TASKQ_DYNAMIC);
} }
/* /*
@ -4008,8 +4001,8 @@ import_pools(nvlist_t *pools, nvlist_t *props, char *mntopts, int flags,
ip->ip_mntthreads = mount_tp_nthr / npools; ip->ip_mntthreads = mount_tp_nthr / npools;
ip->ip_err = &err; ip->ip_err = &err;
(void) taskq_dispatch(tq, do_import_task, (void) tpool_dispatch(tp, do_import_task,
(void *)ip, TQ_SLEEP); (void *)ip);
} else { } else {
/* /*
* If we're importing from cachefile, then * If we're importing from cachefile, then
@ -4058,8 +4051,8 @@ import_pools(nvlist_t *pools, nvlist_t *props, char *mntopts, int flags,
} }
} }
if (import->do_all) { if (import->do_all) {
taskq_wait(tq); tpool_wait(tp);
taskq_destroy(tq); tpool_destroy(tp);
} }
/* /*
@ -6960,19 +6953,7 @@ collect_vdev_prop(zpool_prop_t prop, uint64_t value, const char *str,
switch (prop) { switch (prop) {
case ZPOOL_PROP_SIZE: case ZPOOL_PROP_SIZE:
case ZPOOL_PROP_NORMAL_SIZE:
case ZPOOL_PROP_SPECIAL_SIZE:
case ZPOOL_PROP_DEDUP_SIZE:
case ZPOOL_PROP_LOG_SIZE:
case ZPOOL_PROP_ELOG_SIZE:
case ZPOOL_PROP_SELOG_SIZE:
case ZPOOL_PROP_EXPANDSZ: case ZPOOL_PROP_EXPANDSZ:
case ZPOOL_PROP_NORMAL_EXPANDSZ:
case ZPOOL_PROP_SPECIAL_EXPANDSZ:
case ZPOOL_PROP_DEDUP_EXPANDSZ:
case ZPOOL_PROP_LOG_EXPANDSZ:
case ZPOOL_PROP_ELOG_EXPANDSZ:
case ZPOOL_PROP_SELOG_EXPANDSZ:
case ZPOOL_PROP_CHECKPOINT: case ZPOOL_PROP_CHECKPOINT:
case ZPOOL_PROP_DEDUPRATIO: case ZPOOL_PROP_DEDUPRATIO:
case ZPOOL_PROP_DEDUPCACHED: case ZPOOL_PROP_DEDUPCACHED:
@ -6983,12 +6964,6 @@ collect_vdev_prop(zpool_prop_t prop, uint64_t value, const char *str,
format); format);
break; break;
case ZPOOL_PROP_FRAGMENTATION: case ZPOOL_PROP_FRAGMENTATION:
case ZPOOL_PROP_NORMAL_FRAGMENTATION:
case ZPOOL_PROP_SPECIAL_FRAGMENTATION:
case ZPOOL_PROP_DEDUP_FRAGMENTATION:
case ZPOOL_PROP_LOG_FRAGMENTATION:
case ZPOOL_PROP_ELOG_FRAGMENTATION:
case ZPOOL_PROP_SELOG_FRAGMENTATION:
if (value == ZFS_FRAG_INVALID) { if (value == ZFS_FRAG_INVALID) {
(void) strlcpy(propval, "-", sizeof (propval)); (void) strlcpy(propval, "-", sizeof (propval));
} else if (format == ZFS_NICENUM_RAW) { } else if (format == ZFS_NICENUM_RAW) {
@ -7000,12 +6975,6 @@ collect_vdev_prop(zpool_prop_t prop, uint64_t value, const char *str,
} }
break; break;
case ZPOOL_PROP_CAPACITY: case ZPOOL_PROP_CAPACITY:
case ZPOOL_PROP_NORMAL_CAPACITY:
case ZPOOL_PROP_SPECIAL_CAPACITY:
case ZPOOL_PROP_DEDUP_CAPACITY:
case ZPOOL_PROP_LOG_CAPACITY:
case ZPOOL_PROP_ELOG_CAPACITY:
case ZPOOL_PROP_SELOG_CAPACITY:
/* capacity value is in parts-per-10,000 (aka permyriad) */ /* capacity value is in parts-per-10,000 (aka permyriad) */
if (format == ZFS_NICENUM_RAW) if (format == ZFS_NICENUM_RAW)
(void) snprintf(propval, sizeof (propval), "%llu", (void) snprintf(propval, sizeof (propval), "%llu",
@ -10646,8 +10615,7 @@ print_status_reason(zpool_handle_t *zhp, status_cbdata_t *cbp,
case ZPOOL_STATUS_CORRUPT_POOL: case ZPOOL_STATUS_CORRUPT_POOL:
(void) snprintf(status, ST_SIZE, gettext("The pool metadata is " (void) snprintf(status, ST_SIZE, gettext("The pool metadata is "
"incomplete or corrupted and the pool cannot be " "corrupted and the pool cannot be opened.\n"));
"opened.\n"));
zpool_explain_recover(zpool_get_handle(zhp), zpool_explain_recover(zpool_get_handle(zhp),
zpool_get_name(zhp), reason, zpool_get_config(zhp, NULL), zpool_get_name(zhp), reason, zpool_get_config(zhp, NULL),
action, AC_SIZE); action, AC_SIZE);

@ -114,3 +114,29 @@ array64_max(uint64_t array[], unsigned int len)
return (max); return (max);
} }
/*
* Find highest one bit set.
* Returns bit number + 1 of highest bit that is set, otherwise returns 0.
*/
int
highbit64(uint64_t i)
{
if (i == 0)
return (0);
return (NBBY * sizeof (uint64_t) - __builtin_clzll(i));
}
/*
* Find lowest one bit set.
* Returns bit number + 1 of lowest bit that is set, otherwise returns 0.
*/
int
lowbit64(uint64_t i)
{
if (i == 0)
return (0);
return (__builtin_ffsll(i));
}
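The comments on `highbit64()` and `lowbit64()` describe a 1-based bit index with 0 reserved for "no bits set". A minimal worked sketch of that convention follows. The `sketch_` names are hypothetical, `CHAR_BIT` stands in for `NBBY`, and the code assumes a GCC- or Clang-compatible compiler that provides `__builtin_clzll` and `__builtin_ffsll`.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* 1-based index of the highest set bit, or 0 when no bit is set. */
static int
sketch_highbit64(uint64_t i)
{
	if (i == 0)
		return (0);
	/* Total bit width minus the number of leading zero bits. */
	return ((int)(CHAR_BIT * sizeof (uint64_t)) - __builtin_clzll(i));
}

/* 1-based index of the lowest set bit, or 0 when no bit is set. */
static int
sketch_lowbit64(uint64_t i)
{
	if (i == 0)
		return (0);
	/* ffsll already reports a 1-based position. */
	return (__builtin_ffsll((long long)i));
}
```

For example, 0x18 is binary 11000, so the highest set bit has index 5 and the lowest has index 4. The explicit zero check matters for the high-bit case because `__builtin_clzll(0)` is undefined.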

@ -45,6 +45,8 @@ void *safe_realloc(void *, size_t);
void zpool_no_memory(void); void zpool_no_memory(void);
uint_t num_logs(nvlist_t *nv); uint_t num_logs(nvlist_t *nv);
uint64_t array64_max(uint64_t array[], unsigned int len); uint64_t array64_max(uint64_t array[], unsigned int len);
int highbit64(uint64_t i);
int lowbit64(uint64_t i);
/* /*
* Misc utility functions * Misc utility functions

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
zfsexec_PROGRAMS += zpool_influxdb zfsexec_PROGRAMS += zpool_influxdb
CPPCHECKTARGETS += zpool_influxdb CPPCHECKTARGETS += zpool_influxdb

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
zstream_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS) zstream_CPPFLAGS = $(AM_CPPFLAGS) $(LIBZPOOL_CPPFLAGS)
sbin_PROGRAMS += zstream sbin_PROGRAMS += zstream

View File

@ -191,9 +191,9 @@ zfs_redup_stream(int infd, int outfd, boolean_t verbose)
#ifdef _ILP32 #ifdef _ILP32
uint64_t max_rde_size = SMALLEST_POSSIBLE_MAX_RDT_MB << 20; uint64_t max_rde_size = SMALLEST_POSSIBLE_MAX_RDT_MB << 20;
#else #else
uint64_t physbytes = sysconf(_SC_PHYS_PAGES) * sysconf(_SC_PAGESIZE); uint64_t physmem = sysconf(_SC_PHYS_PAGES) * sysconf(_SC_PAGESIZE);
uint64_t max_rde_size = uint64_t max_rde_size =
MAX((physbytes * MAX_RDT_PHYSMEM_PERCENT) / 100, MAX((physmem * MAX_RDT_PHYSMEM_PERCENT) / 100,
SMALLEST_POSSIBLE_MAX_RDT_MB << 20); SMALLEST_POSSIBLE_MAX_RDT_MB << 20);
#endif #endif
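
The sizing rule in this hunk (take a percentage of physical memory, but never go below a fixed floor) can be sketched in isolation as follows. This is an illustration only: `physmem_bytes` and `table_size_limit` are hypothetical helper names, and the percentage and floor are passed as parameters because the values of `MAX_RDT_PHYSMEM_PERCENT` and `SMALLEST_POSSIBLE_MAX_RDT_MB` are not shown in the diff. `_SC_PHYS_PAGES` is assumed to be available, as it is on Linux and FreeBSD.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Physical memory in bytes, or 0 if sysconf() cannot report it. */
static uint64_t
physmem_bytes(void)
{
	long pages = sysconf(_SC_PHYS_PAGES);
	long pagesize = sysconf(_SC_PAGESIZE);

	if (pages <= 0 || pagesize <= 0)
		return (0);
	return ((uint64_t)pages * (uint64_t)pagesize);
}

/*
 * Larger of (percent of physmem) and a floor given in MiB, using the same
 * multiply-then-divide order as the hunk above.
 */
static uint64_t
table_size_limit(uint64_t physmem, uint64_t percent, uint64_t floor_mb)
{
	uint64_t scaled, floor;

	assert(percent <= 100);
	scaled = (physmem * percent) / 100;
	floor = floor_mb << 20;
	return (scaled > floor ? scaled : floor);
}
```

Casting each `sysconf()` result to `uint64_t` before multiplying keeps the product in 64-bit arithmetic even where `long` is 32 bits wide, which is the case the `_ILP32` branch of the original sidesteps by using the floor directly.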

View File

@ -139,10 +139,9 @@
#include <sys/crypto/icp.h> #include <sys/crypto/icp.h>
#include <sys/zfs_impl.h> #include <sys/zfs_impl.h>
#include <sys/backtrace.h> #include <sys/backtrace.h>
#include <libzpool.h>
#include <libspl.h>
static int ztest_fd_data = -1; static int ztest_fd_data = -1;
static int ztest_fd_rand = -1;
typedef struct ztest_shared_hdr { typedef struct ztest_shared_hdr {
uint64_t zh_hdr_size; uint64_t zh_hdr_size;
@ -903,10 +902,13 @@ ztest_random(uint64_t range)
{ {
uint64_t r; uint64_t r;
ASSERT3S(ztest_fd_rand, >=, 0);
if (range == 0) if (range == 0)
return (0); return (0);
random_get_pseudo_bytes((uint8_t *)&r, sizeof (r)); if (read(ztest_fd_rand, &r, sizeof (r)) != sizeof (r))
fatal(B_TRUE, "short read from /dev/urandom");
return (r % range); return (r % range);
} }
@ -8148,8 +8150,10 @@ ztest_raidz_expand_run(ztest_shared_t *zs, spa_t *spa)
/* Setup a 1 MiB buffer of random data */ /* Setup a 1 MiB buffer of random data */
uint64_t bufsize = 1024 * 1024; uint64_t bufsize = 1024 * 1024;
void *buffer = umem_alloc(bufsize, UMEM_NOFAIL); void *buffer = umem_alloc(bufsize, UMEM_NOFAIL);
random_get_pseudo_bytes((uint8_t *)buffer, bufsize);
if (read(ztest_fd_rand, buffer, bufsize) != bufsize) {
fatal(B_TRUE, "short read from /dev/urandom");
}
/* /*
* Put some data in the pool and then attach a vdev to initiate * Put some data in the pool and then attach a vdev to initiate
* reflow. * reflow.
@ -8955,13 +8959,13 @@ main(int argc, char **argv)
exit(EXIT_FAILURE); exit(EXIT_FAILURE);
} }
libspl_init();
/* /*
* Force random_get_bytes() to use /dev/urandom in order to prevent * Force random_get_bytes() to use /dev/urandom in order to prevent
* ztest from needlessly depleting the system entropy pool. * ztest from needlessly depleting the system entropy pool.
*/ */
random_force_pseudo(B_TRUE); random_path = "/dev/urandom";
ztest_fd_rand = open(random_path, O_RDONLY | O_CLOEXEC);
ASSERT3S(ztest_fd_rand, >=, 0);
if (!fd_data_str) { if (!fd_data_str) {
process_options(argc, argv); process_options(argc, argv);

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
# #
# cppcheck for userspace nodist_*_SOURCES are kernel code and cppcheck goes crazy on them. # cppcheck for userspace nodist_*_SOURCES are kernel code and cppcheck goes crazy on them.
# #

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
# #
# Default build rules for all user space components, every Makefile.am # Default build rules for all user space components, every Makefile.am
# should include these rules and override or extend them as needed. # should include these rules and override or extend them as needed.
@ -9,9 +8,9 @@ AM_CPPFLAGS = \
-include $(top_builddir)/zfs_config.h \ -include $(top_builddir)/zfs_config.h \
-I$(top_builddir)/include \ -I$(top_builddir)/include \
-I$(top_srcdir)/include \ -I$(top_srcdir)/include \
-I$(top_srcdir)/module/icp/include \
-I$(top_srcdir)/lib/libspl/include \ -I$(top_srcdir)/lib/libspl/include \
-I$(top_srcdir)/lib/libspl/include/os/@ac_system_l@ \ -I$(top_srcdir)/lib/libspl/include/os/@ac_system_l@
-I$(top_srcdir)/lib/libzpool/include
AM_LIBTOOLFLAGS = --silent AM_LIBTOOLFLAGS = --silent

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
# Global ShellCheck exclusions: # Global ShellCheck exclusions:
# #
# ShellCheck can't follow non-constant source. Use a directive to specify location. [SC1090] # ShellCheck can't follow non-constant source. Use a directive to specify location. [SC1090]

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
subst_sed_cmd = \ subst_sed_cmd = \
-e 's|@abs_top_srcdir[@]|$(abs_top_srcdir)|g' \ -e 's|@abs_top_srcdir[@]|$(abs_top_srcdir)|g' \
-e 's|@bindir[@]|$(bindir)|g' \ -e 's|@bindir[@]|$(bindir)|g' \

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Set the target cpu architecture. This allows the dnl # Set the target cpu architecture. This allows the
dnl # following syntax to be used in a Makefile.am. dnl # following syntax to be used in a Makefile.am.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Enabled -fsanitize=address if supported by $CC. dnl # Enabled -fsanitize=address if supported by $CC.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Check if cppcheck is available. dnl # Check if cppcheck is available.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Check if GNU parallel is available. dnl # Check if GNU parallel is available.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # The majority of the python scripts are written to be compatible dnl # The majority of the python scripts are written to be compatible
dnl # with Python 3.6. This option is intended to dnl # with Python 3.6. This option is intended to

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # ZFS_AC_PYTHON_MODULE(module_name, [action-if-true], [action-if-false]) dnl # ZFS_AC_PYTHON_MODULE(module_name, [action-if-true], [action-if-false])
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Set the flags used for sed in-place edits. dnl # Set the flags used for sed in-place edits.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Check if shellcheck and/or checkbashisms are available. dnl # Check if shellcheck and/or checkbashisms are available.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Set the target system dnl # Set the target system
dnl # dnl #

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFAP
# =========================================================================== # ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_compare_version.html # https://www.gnu.org/software/autoconf-archive/ax_compare_version.html
# =========================================================================== # ===========================================================================

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFAP
# =========================================================================== # ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_count_cpus.html # https://www.gnu.org/software/autoconf-archive/ax_count_cpus.html
# =========================================================================== # ===========================================================================

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: GPL-3.0-or-later WITH Autoconf-exception-macro
# =========================================================================== # ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_python_devel.html # https://www.gnu.org/software/autoconf-archive/ax_python_devel.html
# =========================================================================== # ===========================================================================

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFAP
# =========================================================================== # ===========================================================================
# http://www.gnu.org/software/autoconf-archive/ax_restore_flags.html # http://www.gnu.org/software/autoconf-archive/ax_restore_flags.html
# =========================================================================== # ===========================================================================

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFAP
# =========================================================================== # ===========================================================================
# http://www.gnu.org/software/autoconf-archive/ax_save_flags.html # http://www.gnu.org/software/autoconf-archive/ax_save_flags.html
# =========================================================================== # ===========================================================================

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: CDDL-1.0
PHONY += deb-kmod deb-dkms deb-utils deb deb-local native-deb-local \ PHONY += deb-kmod deb-dkms deb-utils deb deb-local native-deb-local \
native-deb-utils native-deb-kmod native-deb native-deb-utils native-deb-kmod native-deb
@ -58,21 +57,22 @@ deb-utils: deb-local rpm-utils-initramfs
debarch=`$(DPKG) --print-architecture`; \ debarch=`$(DPKG) --print-architecture`; \
pkg1=$${name}-$${version}.$${arch}.rpm; \ pkg1=$${name}-$${version}.$${arch}.rpm; \
pkg2=libnvpair3-$${version}.$${arch}.rpm; \ pkg2=libnvpair3-$${version}.$${arch}.rpm; \
pkg3=libzfs7-$${version}.$${arch}.rpm; \ pkg3=libuutil3-$${version}.$${arch}.rpm; \
pkg4=libzpool7-$${version}.$${arch}.rpm; \ pkg4=libzfs7-$${version}.$${arch}.rpm; \
pkg5=libzfs7-devel-$${version}.$${arch}.rpm; \ pkg5=libzpool7-$${version}.$${arch}.rpm; \
pkg6=$${name}-test-$${version}.$${arch}.rpm; \ pkg6=libzfs7-devel-$${version}.$${arch}.rpm; \
pkg7=$${name}-dracut-$${version}.noarch.rpm; \ pkg7=$${name}-test-$${version}.$${arch}.rpm; \
pkg8=$${name}-initramfs-$${version}.$${arch}.rpm; \ pkg8=$${name}-dracut-$${version}.noarch.rpm; \
pkg9=`ls python3-pyzfs-$${version}.noarch.rpm 2>/dev/null`; \ pkg9=$${name}-initramfs-$${version}.$${arch}.rpm; \
pkg10=`ls pam_zfs_key-$${version}.$${arch}.rpm 2>/dev/null`; \ pkg10=`ls python3-pyzfs-$${version}.noarch.rpm 2>/dev/null`; \
pkg11=`ls pam_zfs_key-$${version}.$${arch}.rpm 2>/dev/null`; \
## Arguments need to be passed to dh_shlibdeps. Alien provides no mechanism ## Arguments need to be passed to dh_shlibdeps. Alien provides no mechanism
## to do this, so we install a shim onto the path which calls the real ## to do this, so we install a shim onto the path which calls the real
## dh_shlibdeps with the required arguments. ## dh_shlibdeps with the required arguments.
path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \ path_prepend=`mktemp -d /tmp/intercept.XXXXXX`; \
echo "#!$(SHELL)" > $${path_prepend}/dh_shlibdeps; \ echo "#!$(SHELL)" > $${path_prepend}/dh_shlibdeps; \
echo "`which dh_shlibdeps` -- \ echo "`which dh_shlibdeps` -- \
-xlibnvpair3linux -xlibzfs7linux -xlibzpool7linux" \ -xlibuutil3linux -xlibnvpair3linux -xlibzfs7linux -xlibzpool7linux" \
>> $${path_prepend}/dh_shlibdeps; \ >> $${path_prepend}/dh_shlibdeps; \
## These -x arguments are passed to dpkg-shlibdeps, which exclude the ## These -x arguments are passed to dpkg-shlibdeps, which exclude the
## Debianized packages from the auto-generated dependencies of the new debs, ## Debianized packages from the auto-generated dependencies of the new debs,

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
# find_system_lib.m4 - Macros to search for a system library. -*- Autoconf -*- # find_system_lib.m4 - Macros to search for a system library. -*- Autoconf -*-
dnl requires pkg.m4 from pkg-config dnl requires pkg.m4 from pkg-config

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFULLR
# gettext.m4 serial 70 (gettext-0.20) # gettext.m4 serial 70 (gettext-0.20)
dnl Copyright (C) 1995-2014, 2016, 2018 Free Software Foundation, Inc. dnl Copyright (C) 1995-2014, 2016, 2018 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation dnl This file is free software; the Free Software Foundation

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFULLR
# host-cpu-c-abi.m4 serial 11 # host-cpu-c-abi.m4 serial 11
dnl Copyright (C) 2002-2019 Free Software Foundation, Inc. dnl Copyright (C) 2002-2019 Free Software Foundation, Inc.
dnl This file is free software; the Free Software Foundation dnl This file is free software; the Free Software Foundation

View File

@ -1,4 +1,3 @@
# SPDX-License-Identifier: FSFULLR
# iconv.m4 serial 21 # iconv.m4 serial 21
dnl Copyright (C) 2000-2002, 2007-2014, 2016-2019 Free Software Foundation, dnl Copyright (C) 2000-2002, 2007-2014, 2016-2019 Free Software Foundation,
dnl Inc. dnl Inc.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Linux 5.0: access_ok() drops 'type' parameter: dnl # Linux 5.0: access_ok() drops 'type' parameter:
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 3.1 API change, dnl # 3.1 API change,
dnl # posix_acl_equiv_mode now wants an umode_t instead of a mode_t dnl # posix_acl_equiv_mode now wants an umode_t instead of a mode_t
@ -22,35 +21,6 @@ AC_DEFUN([ZFS_AC_KERNEL_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T], [
]) ])
]) ])
dnl #
dnl # 7.0 API change
dnl # posix_acl_to_xattr() now allocates and returns the value.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_POSIX_ACL_TO_XATTR_ALLOC], [
ZFS_LINUX_TEST_SRC([posix_acl_to_xattr_alloc], [
#include <linux/fs.h>
#include <linux/posix_acl_xattr.h>
], [
struct user_namespace *ns = NULL;
struct posix_acl *acl = NULL;
size_t size = 0;
gfp_t gfp = 0;
void *xattr = NULL;
xattr = posix_acl_to_xattr(ns, acl, &size, gfp);
])
])
AC_DEFUN([ZFS_AC_KERNEL_POSIX_ACL_TO_XATTR_ALLOC], [
AC_MSG_CHECKING([whether posix_acl_to_xattr() allocates its result]);
ZFS_LINUX_TEST_RESULT([posix_acl_to_xattr_alloc], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_POSIX_ACL_TO_XATTR_ALLOC, 1,
[posix_acl_to_xattr() allocates its result])
], [
AC_MSG_RESULT(no)
])
])
dnl # dnl #
dnl # 3.1 API change, dnl # 3.1 API change,
dnl # Check if inode_operations contains the function get_acl dnl # Check if inode_operations contains the function get_acl
@ -203,14 +173,12 @@ AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL], [
AC_DEFUN([ZFS_AC_KERNEL_SRC_ACL], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_ACL], [
ZFS_AC_KERNEL_SRC_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T ZFS_AC_KERNEL_SRC_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T
ZFS_AC_KERNEL_SRC_POSIX_ACL_TO_XATTR_ALLOC
ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL
ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_SET_ACL
]) ])
AC_DEFUN([ZFS_AC_KERNEL_ACL], [ AC_DEFUN([ZFS_AC_KERNEL_ACL], [
ZFS_AC_KERNEL_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T ZFS_AC_KERNEL_POSIX_ACL_EQUIV_MODE_WANTS_UMODE_T
ZFS_AC_KERNEL_POSIX_ACL_TO_XATTR_ALLOC
ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL
ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL ZFS_AC_KERNEL_INODE_OPERATIONS_SET_ACL
]) ])

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.16 API change dnl # 5.16 API change
dnl # add_disk grew a must-check return code dnl # add_disk grew a must-check return code

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.10 kernel, check number of args of __assign_str() for trace: dnl # 6.10 kernel, check number of args of __assign_str() for trace:
dnl dnl

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.37 API change dnl # 2.6.37 API change
dnl # The dops->d_automount() dentry operation was added as a clean dnl # The dops->d_automount() dentry operation was added as a clean

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Linux 4.8 API, dnl # Linux 4.8 API,
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.12 API change removes BIO_MAX_PAGES in favor of bio_max_segs() dnl # 5.12 API change removes BIO_MAX_PAGES in favor of bio_max_segs()
dnl # which will handle the logic of setting the upper-bound to a dnl # which will handle the logic of setting the upper-bound to a

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.39 API change, dnl # 2.6.39 API change,
dnl # blk_start_plug() and blk_finish_plug() dnl # blk_start_plug() and blk_finish_plug()
@ -226,30 +225,6 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS], [
]) ])
]) ])
dnl #
dnl # 7.0 API change
dnl # blk_queue_rot() replaces blk_queue_nonrot() (inverted meaning)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_ROT], [
ZFS_LINUX_TEST_SRC([blk_queue_rot], [
#include <linux/blkdev.h>
], [
struct request_queue *q __attribute__ ((unused)) = NULL;
(void) blk_queue_rot(q);
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_ROT], [
AC_MSG_CHECKING([whether blk_queue_rot() is available])
ZFS_LINUX_TEST_RESULT([blk_queue_rot], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_ROT, 1,
[blk_queue_rot() is available])
],[
AC_MSG_RESULT(no)
])
])
dnl # dnl #
dnl # 2.6.34 API change dnl # 2.6.34 API change
dnl # blk_queue_max_segments() consolidates blk_queue_max_hw_segments() dnl # blk_queue_max_segments() consolidates blk_queue_max_hw_segments()
@ -303,7 +278,6 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE], [
ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS ZFS_AC_KERNEL_SRC_BLK_QUEUE_MAX_SEGMENTS
ZFS_AC_KERNEL_SRC_BLK_QUEUE_ROT
ZFS_AC_KERNEL_SRC_BLK_MQ_RQ_HCTX ZFS_AC_KERNEL_SRC_BLK_MQ_RQ_HCTX
]) ])
@ -316,6 +290,5 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE], [
ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS ZFS_AC_KERNEL_BLK_QUEUE_MAX_HW_SECTORS
ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS
ZFS_AC_KERNEL_BLK_QUEUE_ROT
ZFS_AC_KERNEL_BLK_MQ_RQ_HCTX ZFS_AC_KERNEL_BLK_MQ_RQ_HCTX
]) ])
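
The `blk_queue_rot` check above only records whether the helper exists. A consumer of `HAVE_BLK_QUEUE_ROT` still has to account for the inverted meaning noted in its comment. The sketch below shows one way such a define could be used. It is a userspace mock under stated assumptions: `struct request_queue` and both helpers are stand-ins for what `<linux/blkdev.h>` provides in a real kernel build, and `queue_is_rotational` is a hypothetical wrapper name that does not come from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-ins so the sketch compiles outside the kernel. In a real build the
 * structure and helpers come from <linux/blkdev.h>, and HAVE_BLK_QUEUE_ROT
 * from the configure check.
 */
struct request_queue {
	bool rotational;
};

#ifdef HAVE_BLK_QUEUE_ROT
static bool
blk_queue_rot(struct request_queue *q)
{
	return (q->rotational);
}
#else
static bool
blk_queue_nonrot(struct request_queue *q)
{
	return (!q->rotational);
}
#endif

/* Hypothetical wrapper: one name for callers, whichever helper exists. */
static bool
queue_is_rotational(struct request_queue *q)
{
	assert(q != NULL);
#ifdef HAVE_BLK_QUEUE_ROT
	return (blk_queue_rot(q));
#else
	return (!blk_queue_nonrot(q));	/* older helper has inverted meaning */
#endif
}
```

Keeping the negation inside a single wrapper means callers never see which helper the running kernel provides, so only this one spot has to change when the API does.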

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.38 API change, dnl # 2.6.38 API change,
dnl # Added blkdev_get_by_path() dnl # Added blkdev_get_by_path()

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.38 API change dnl # 2.6.38 API change
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.33 API change dnl # 2.6.33 API change
dnl # Added eops->commit_metadata() callback to allow the underlying dnl # Added eops->commit_metadata() callback to allow the underlying

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Certain kernel build options are not supported. These must be dnl # Certain kernel build options are not supported. These must be
dnl # detected at configure time and cause a build failure. Otherwise dnl # detected at configure time and cause a build failure. Otherwise

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # On certain architectures `__copy_from_user_inatomic` dnl # On certain architectures `__copy_from_user_inatomic`
dnl # is a GPL exported variable and cannot be used by OpenZFS. dnl # is a GPL exported variable and cannot be used by OpenZFS.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # cpu_has_feature() may referencing GPL-only cpu_feature_keys on powerpc dnl # cpu_has_feature() may referencing GPL-only cpu_feature_keys on powerpc
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Ensure the DECLARE_EVENT_CLASS macro is available to non-GPL modules. dnl # Ensure the DECLARE_EVENT_CLASS macro is available to non-GPL modules.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.28 API change dnl # 2.6.28 API change
dnl # Added d_obtain_alias() helper function. dnl # Added d_obtain_alias() helper function.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.33 API change dnl # 2.6.33 API change
dnl # Discard granularity and alignment restrictions may now be set. dnl # Discard granularity and alignment restrictions may now be set.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.18 API change dnl # 6.18 API change
dnl # - generic_drop_inode() renamed to inode_generic_drop() dnl # - generic_drop_inode() renamed to inode_generic_drop()

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.12 removed f_version from struct file dnl # 6.12 removed f_version from struct file
dnl # dnl #

View File

@ -1,23 +0,0 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl #
dnl # 6.3 API change
dnl # locking support functions (eg generic_setlease) were moved out of
dnl # linux/fs.h to linux/filelock.h
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_FILELOCK_HEADER], [
ZFS_LINUX_TEST_SRC([filelock_header], [
#include <linux/fs.h>
#include <linux/filelock.h>
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_FILELOCK_HEADER], [
AC_MSG_CHECKING([for standalone filelock header])
ZFS_LINUX_TEST_RESULT([filelock_header], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FILELOCK_HEADER, 1, [linux/filelock.h exists])
], [
AC_MSG_RESULT(no)
])
])

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_COPY_SPLICE_READ], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_COPY_SPLICE_READ], [
dnl # dnl #
dnl # Kernel 6.5 - generic_file_splice_read was removed in favor dnl # Kernel 6.5 - generic_file_splice_read was removed in favor

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Starting from Linux 5.13, flush_dcache_page() becomes an inline dnl # Starting from Linux 5.13, flush_dcache_page() becomes an inline
dnl # function and may indirectly referencing GPL-only symbols: dnl # function and may indirectly referencing GPL-only symbols:

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.28 API change, dnl # 2.6.28 API change,
dnl # check if fmode_t typedef is defined dnl # check if fmode_t typedef is defined

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.38 API change dnl # 2.6.38 API change
dnl # follow_down() renamed follow_down_one(). The original follow_down() dnl # follow_down() renamed follow_down_one(). The original follow_down()

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Handle differences in kernel FPU code. dnl # Handle differences in kernel FPU code.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Linux 5.2 API change dnl # Linux 5.2 API change
dnl # dnl #

View File

@ -1,33 +0,0 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl #
dnl # 2.6.38 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure.
dnl #
dnl # 5.2 API change
dnl # The new fs_context-based filesystem API is introduced, with the old
dnl # one (via file_system_type.mount) preserved as a compatibility shim.
dnl #
dnl # 7.0 API change
dnl # Compatibility shim removed, so all callers must go through the mount API.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_FS_CONTEXT], [
ZFS_LINUX_TEST_SRC([fs_context], [
#include <linux/fs.h>
#include <linux/fs_context.h>
],[
static struct fs_context fs __attribute__ ((unused)) = { 0 };
static struct fs_context *fsp __attribute__ ((unused));
fsp = vfs_dup_fs_context(&fs);
])
])
AC_DEFUN([ZFS_AC_KERNEL_FS_CONTEXT], [
AC_MSG_CHECKING([whether fs_context exists])
ZFS_LINUX_TEST_RESULT([fs_context], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FS_CONTEXT, 1, [fs_context exists])
],[
AC_MSG_RESULT(no)
])
])

View File

@ -0,0 +1,30 @@
dnl #
dnl # 2.6.38 API change
dnl # The .get_sb callback has been replaced by a .mount callback
dnl # in the file_system_type structure.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_FST_MOUNT], [
ZFS_LINUX_TEST_SRC([file_system_type_mount], [
#include <linux/fs.h>
static struct dentry *
mount(struct file_system_type *fs_type, int flags,
const char *osname, void *data) {
struct dentry *d = NULL;
return (d);
}
static struct file_system_type fst __attribute__ ((unused)) = {
.mount = mount,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_FST_MOUNT], [
AC_MSG_CHECKING([whether fst->mount() exists])
ZFS_LINUX_TEST_RESULT([file_system_type_mount], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([fst->mount()])
])
])

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.6 API change, dnl # 6.6 API change,
dnl # fsync_bdev was removed in favor of sync_blockdev dnl # fsync_bdev was removed in favor of sync_blockdev

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.3 API change dnl # 5.3 API change
dnl # The generic_fadvise() function is present since 4.19 kernel dnl # The generic_fadvise() function is present since 4.19 kernel

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.12 API dnl # 5.12 API
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # Check for generic io accounting interface. dnl # Check for generic io accounting interface.
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.17 API change, dnl # 5.17 API change,
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.x API change dnl # 2.6.x API change
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.0 API change dnl # 6.0 API change
dnl # struct iattr has two unions for the uid and gid dnl # struct iattr has two unions for the uid and gid

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 5.12 API dnl # 5.12 API
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_CREATE], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_CREATE], [
dnl # dnl #
dnl # 6.3 API change dnl # 6.3 API change

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_GETATTR], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_GETATTR], [
dnl # dnl #
dnl # Linux 6.3 API dnl # Linux 6.3 API

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 3.6 API change dnl # 3.6 API change
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_PERMISSION], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_PERMISSION], [
dnl # dnl #
dnl # 6.3 API change dnl # 6.3 API change

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_SETATTR], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_SETATTR], [
dnl # dnl #
dnl # Linux 6.3 API dnl # Linux 6.3 API

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.19 API change. inode->i_state no longer accessible directly; helper dnl # 6.19 API change. inode->i_state no longer accessible directly; helper
dnl # functions exist. dnl # functions exist.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_TIMES], [ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_TIMES], [
dnl # dnl #

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.28 API change dnl # 2.6.28 API change
dnl # Added insert_inode_locked() helper function. dnl # Added insert_inode_locked() helper function.

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 2.6.39 API change, dnl # 2.6.39 API change,
dnl # The is_owner_or_cap() macro was renamed to inode_owner_or_capable(), dnl # The is_owner_or_cap() macro was renamed to inode_owner_or_capable(),

View File

@ -1,4 +1,3 @@
dnl # SPDX-License-Identifier: CDDL-1.0
dnl # dnl #
dnl # 6.18: some architectures and config option causes the kasan_ inline dnl # 6.18: some architectures and config option causes the kasan_ inline
dnl # functions to reference the GPL-only symbol 'kasan_flag_enabled', dnl # functions to reference the GPL-only symbol 'kasan_flag_enabled',

Some files were not shown because too many files have changed in this diff.