Compare commits

160 Commits

Author SHA1 Message Date
Tony Hutter c840612ee1 Tag zfs-2.3.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2026-02-19 14:58:21 -08:00
Tony Hutter 65579f4cba CI: Test & fix Linux ZFS built-in build
ZFS can be built directly into the Linux kernel.  Add a test build
of this to the CI to verify it works.  The test build is only enabled
on Fedora runners (since they run the newest kernels) and is done in
parallel with ZTS.  The test build is done on vm2, since it typically
finishes ~15min before vm1 and thus has time to spare.

In addition:

- Update 'copy-builtin' to check that $1 is a directory
- Fix some VERIFYs that were causing the built-in build to fail

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18234
2026-02-19 14:58:21 -08:00
Alexx Saver 2032f21857 chksum: run 256K benchmark on demand, preserve chksum_stat_data
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexx Saver <lzsaver.eth@ethermail.io>
Co-authored-by: Adam Moss <c@yotes.com>
Closes #17945
Closes #17946
2026-02-17 10:18:14 -08:00
Tony Hutter dc58baf9d1 Linux 6.19 compat: META
Update the META file to reflect compatibility with the 6.19
kernel.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18197
2026-02-11 16:18:01 -08:00
Brooks Davis 06a88f9d13 nvpair: chase FreeBSD xdrproc_t definition
As of FreeBSD 16, xdrproc_t will take exactly two arguments in both
kernel and userspace in line with the Linux kernel.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Alan Somers <asomers@freebsd.org>
Signed-off-by: Brooks Davis <brooks@capabilitieslimited.co.uk>
Closes #18154
2026-02-11 16:18:01 -08:00
Alek P 88ce22ed95 remove thread unsafe debug code causing FreeBSD double free panic
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Alek Pinchuk <apinchuk@axcient.com>
Closes #18140
2026-02-11 16:18:01 -08:00
Mark Johnston 366dad1cac FreeBSD: Remove references to DEBUG_VFS_LOCKS
This option is removed upstream in favour of plain INVARIANTS.

VNASSERT is always defined so I see no reason to use it conditionally.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #18136
2026-02-11 16:18:01 -08:00
Alexander Motin 1dc5088e6a FreeBSD: Remove HAVE_INLINE_FLSL use
These macros have been deprecated in the FreeBSD kernel for several
years, and unneeded for much longer.  Instead, similar to Linux, let
the kernel let the compiler do the right thing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #18004
2026-02-11 16:18:01 -08:00
Rob Norris 135fffbc3e Linux 6.19: replace i_state access with inode_state_read_once()
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #18053
2026-02-11 16:18:01 -08:00
Rob Norris 18065e9296 Linux 6.18: generic_drop_inode() and generic_delete_inode() renamed
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2026-02-11 16:18:01 -08:00
Rob Norris 00ee7f9430 linux/super: add tunable to request immediate reclaim of unused dentries
Traditionally, unused dentries would be cached in the dentry cache until
the associated entry is no longer on disk. The cached dentry continues
to hold an inode reference, causing the inode to be pinned (see previous
commit).

Here we implement the dentry op d_delete, which is roughly analogous to
the drop_inode superblock op, and add a zfs_delete_dentry tunable to
control its behaviour. By default it continues the traditional
behaviour, but when the tunable is enabled, we signal that an unused
dentry should be freed immediately, releasing its inode reference, and
so allowing that inode to be deleted if no longer in use.

Sponsored-by: Klara, Inc.
Sponsored-by: Fastmail Pty Ltd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17746
2026-02-11 16:18:01 -08:00
Rob Norris 3662c7f33c linux/super: add tunable to request immediate reclaim of unused inodes
Traditionally, unused inodes would be held on the superblock inode cache
until the associated on-disk file is removed or the kernel requests
reclaim.  On filesystems with millions of rarely-used files, this can be
a lot of unusable memory.

Here we implement the superblock drop_inode method, and add a
zfs_delete_inode tunable to control its behaviour. By default it
continues the traditional behaviour, but when the tunable is enabled, we
signal that the inode should be deleted immediately when the last
reference is dropped, rather than cached. This releases the associated
data to the dbuf cache and ARC, allowing them to be reclaimed normally.

Sponsored-by: Klara, Inc.
Sponsored-by: Fastmail Pty Ltd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17746
2026-02-11 16:18:01 -08:00
Rob Norris 29bda86d7b config: restore ZFS_AC_KERNEL_DENTRY tests
Accidentally removed calls in ed048fdc5b.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17621
2026-02-11 16:18:01 -08:00
Alex 6788dcd47c Fix a declaration position of the nth_page.
Compile-time bug introduced by commit 87df5e4.
Fix for the compilation error (Linux kernel 6.18.0):
"zfs/module/os/linux/zfs/abd_os.c:920:32: error: implicit declaration
of function ‘nth_page’; did you mean ‘pte_page’?
[-Werror=implicit-function-declaration]".

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: agiUnderground <alex.dev.cv@gmail.com>
Closes #18034
2026-02-11 13:33:19 -08:00
Erik Larsson beb25b936b Fix build for Linux 6.18 with PowerPC/RISC-V kernels. (#18145)
The macro 'flush_dcache_page(...)' modifies the page flags, but in Linux
6.18 the type of the page flags changed from 'unsigned long' to the
struct type 'memdesc_flags_t' with a single member 'f' which is the page
flags field.

Signed-off-by: Erik Larsson <catacombae@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2026-02-11 13:33:19 -08:00
John Cabaj d857aea6d4 Linux 6.19: handle --werror with CONFIG_OBJTOOL_WERROR=y
Linux upstream commit 56754f0f46f6: "objtool: Rename
--Werror to --werror" did just that, so we should check for
either "--Werror" or "--werror", else the build will fail

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: John Cabaj <john.cabaj@canonical.com>
Closes #18152
2026-02-11 13:33:19 -08:00
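The "--Werror" or "--werror" check described in d857aea6d4 above can be sketched as follows. This is a minimal Python illustration only: the function name is hypothetical, the example argument strings in the usage note are assumptions, and the real check lives in the OpenZFS build system rather than in Python.

```python
def objtool_werror_enabled(objtool_args: str) -> bool:
    """Return True if an objtool argument string enables warnings-as-errors.

    Accepts both the older spelling (--Werror) and the spelling used after
    the upstream rename (--werror). Hypothetical helper, not OpenZFS code.
    """
    return any(arg in ("--Werror", "--werror") for arg in objtool_args.split())
```

For example, `objtool_werror_enabled("--uaccess --werror")` and `objtool_werror_enabled("--Werror")` are both true, so a build script using this test would behave the same on kernels before and after the rename.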
Brian Behlendorf f7ab47908b ZTS: update the relevant mmp test cases
- mmp_concurrent_import: added a test case to verify concurrent
  import correctness.  The pool may only be imported once.

- mmp_exported_import: an activity check is now required for pools
  which were cleanly exported if the system and pool hostids don't
  match.

- mmp_inactive_import: an activity check is now required for any
  pool which wasn't cleanly exported, even if the system and pool
  hostids match.

- mmp_on_uberblocks: updated expected uberblocks to take into account
  the value MMP_INTERVAL_DEFAULT is set to.

- mmp_reset_interval: reduce the number of iterations from 10 to 3.
  This is sufficient to verify functionality and significantly speeds
  up the test.

- mmp_on_uberblocks: adjust the thresholds and increase the runtime
  to avoid false positives observed in CI.

- Update tests to use 'zhack action idle' instead of ztest to improve
  the reliability of the tests.

- Add additional log_note messages to test cases which have multiple
  verification steps to make it clear which portion of a test failed
  when reviewing the logs.

- Replace default_setup/cleanup_noexit calls with 'zpool create' and
  'zpool destroy' calls to avoid additional unnecessary dataset
  creation work.

- Update activity/noactivity check helper functions to use the
  ZFS_LOAD_INFO_DEBUG information now available from 'zpool import'
  to determine if this activity check ran and why.  This is more
  reliable in the CI than measuring the runtime.

- Removed all mmp tests from the zts-report.py exceptions list.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
Brian Behlendorf d56f3cb331 zhack: add "action idle" subcommand
In order to reliably test the multihost protection we need two (or more)
systems attempting to import the pool at the same time.  Historically, we've
used ztest running in userspace to simulate an active pool and attempted to
import the pool with the kernel modules.  This works but ztest is a bit
unwieldy for this and if it crashes for unrelated reasons it can result
in false positives.

All we really need is the pool imported in userspace so the MMP thread is
active and writing out uberblocks.  We can extend zhack which already knows
how to import the pool read/write and add an option to leave the pool open
and idle.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
Brian Behlendorf d8594ba2b8 zhack: add -G option to dump debug buffer
Add a -G option to zhack to dump the internal debug buffer on exit.
We were able to use the same code from zdb for this which was nice.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
Brian Behlendorf a65bb7c518 mmp: claim sequence id before final import
As part of SPA_LOAD_IMPORT add an additional activity check to
detect simultaneous imports from different hosts.  This check is
only required when the timing is such that there's no activity
for the read-only tryimport check to detect.  This extra
safety check operates as follows:

1. Repeats the following MMP check 10 times:
  a. Write out an MMP uberblock with the best txg and a random
     sequence id to all primary pool vdevs.
  b. Verify a minimum number of good writes such that even if
     the pool appears degraded on the remote host it will see
     at least one of the updated MMP uberblocks.
  c. Wait for the MMP interval; this leaves a window for other
     racing hosts to make similar modifications which can be
     detected.
  d. Call vdev_uberblock_load() to determine the best uberblock
     to use, this should be the MMP uberblock just written.
  e. Verify the txg and random sequence number match the MMP
     uberblock written in 1a.

2. Restore the original MMP uberblocks.  This allows the check
   to be performed again if the pool fails to import for an
   unrelated reason.

This change also includes some refactoring and minor improvements.

- Never try loading earlier txgs during import when the import
  fails with EREMOTEIO or EINTR.  These errors don't indicate
  the txg is damaged but instead that it's either in use on a
  remote host or the import was interactively cancelled.  A
  rewind is also not performed for EBADD, which can result from a
  stale trusted config when doing a verbatim import.

- Refactor the code for consistent logging of the multihost
  activity check using spa_load_note() and console messages
  indicating when the activity check was triggered and its result.

- Added MMP_*_MASK and MMP_SEQ_CLEAR() macros to allow easier
  modification of the sequence number in an uberblock.

- Added ZFS_LOAD_INFO_DEBUG environment variable which can be
  set to dump to stdout the spa_load_info nvlist returned
  during import.  This is used by the updated mmp test cases
  to determine if an activity check was run and its result.

- Standardize the mmp messages similarly to make it easier to
  find all the relevant mmp lines in the debug log.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
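The claim procedure listed in a65bb7c518 above (steps 1a to 1e, then step 2) can be modelled with a small simulation. This is a sketch under stated assumptions, not the kernel implementation: vdevs are plain dicts, every toy write succeeds, a counter stands in for uberblock timestamps, "best uberblock" is simplified to highest txg then most recent write, and all function names are hypothetical. Restoring the original uberblocks only after a successful claim is also an assumption of this model.

```python
import itertools
import random

CLAIM_ROUNDS = 10            # "Repeats the following MMP check 10 times"
_clock = itertools.count(1)  # stand-in for uberblock timestamps (assumption)


def write_uberblock(vdev, txg, seq):
    """Overwrite a vdev's toy MMP uberblock slot."""
    vdev.update(txg=txg, seq=seq, stamp=next(_clock))


def best_uberblock(vdevs):
    """Simplified selection: highest txg wins, then the latest write."""
    return max(vdevs, key=lambda v: (v["txg"], v["stamp"]))


def claim_pool(vdevs, best_txg, min_good_writes, wait_interval):
    """Return True if no racing writer was detected during the claim.

    wait_interval is a callable standing in for sleeping one MMP
    interval; a racing host may modify the vdevs while it runs.
    """
    saved = [dict(v) for v in vdevs]
    for _ in range(CLAIM_ROUNDS):
        seq = random.getrandbits(32)
        for v in vdevs:                       # 1a: write txg + random seq
            write_uberblock(v, best_txg, seq)
        if len(vdevs) < min_good_writes:      # 1b: enough good writes?
            return False
        wait_interval()                       # 1c: window for racing hosts
        best = best_uberblock(vdevs)          # 1d: reload best uberblock
        if (best["txg"], best["seq"]) != (best_txg, seq):
            return False                      # 1e: someone else wrote
    for v, old in zip(vdevs, saved):          # 2: restore the originals
        v.clear()
        v.update(old)
    return True
```

With a no-op `wait_interval` the claim succeeds and the vdevs end up unchanged. If the callable overwrites any vdev with a different sequence id, the mismatch is detected in step 1e and the claim fails.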
Brian Behlendorf 328a823848 mmp: add spa_load_name() for tryimport
Tryimport adds a unique prefix to the pool name to avoid name
collisions.  This makes it awkward to log user-friendly info
during a tryimport.  Add a spa_load_name() function which can
be used to report the unmodified pool name.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
Brian Behlendorf 8bbd86693e mmp: move "Starting import" log message
Move the "Starting import" log message in to the import block so
it's matched with the "Finished importing" debug message.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
Brian Behlendorf 36c315571c mmp: further restrict mmp exported pool check
For cleanly exported pools there exists a small window where
both systems may determine it's safe to import the pool and skip
the activity check.  Only allow the check to be skipped when the
last imported hostid matches the system's hostid and the pool was
cleanly exported.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
2026-02-11 13:33:19 -08:00
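The rule stated in 36c315571c above reduces to a single predicate. The sketch below is illustrative Python with a hypothetical function name; it deliberately ignores every other input the real import path considers (for example whether multihost is enabled at all) and models only the two conditions named in the commit message.

```python
def activity_check_required(cleanly_exported: bool,
                            pool_hostid: int,
                            system_hostid: int) -> bool:
    """Toy model: the check may be skipped only when the pool was
    cleanly exported AND it was last imported by this same host."""
    can_skip = cleanly_exported and pool_hostid == system_hostid
    return not can_skip
```

This matches the updated test descriptions: a cleanly exported pool with a different hostid needs the check, and so does any pool that was not cleanly exported, even when the hostids match.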
Rob Norris 8010a8a3ca spa_activity_check: narrow scope of MMP vars
They aren't used outside these very small blocks, and their initial
values are never used at all.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #17551
2026-02-11 13:33:19 -08:00
Paul Dagnelie a1d839eddd Enable zhack to work properly with 4k sector size disks
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17576
2026-02-11 13:33:12 -08:00
Paul Dagnelie 411249498e Add allocation profile export and zhack subcommand for import
When attempting to debug performance problems on large systems, one of
the major factors that affect performance is free space
fragmentation. This heavily affects the allocation process, which is an
area of active development in ZFS. Unfortunately, fragmenting a large
pool for testing purposes is time consuming; it usually involves filling
the pool and then repeatedly overwriting data until the free space
becomes fragmented, which can take many hours. And even if the time is
available, artificial workloads rarely generate the same fragmentation
patterns as the natural workloads they're attempting to mimic.

This patch has two parts. First, in zdb, we add the ability to export
the full allocation map of the pool. It iterates over each vdev,
printing every allocated segment in the ms_allocatable range tree. This
can be done while the pool is online, though in that case the allocation
map may actually be from several different TXGs as new ones are loaded
on demand.

The second is a new subcommand for zhack, zhack metaslab leak (and its
supporting kernel changes). This is a zhack subcommand that imports a
pool and then modifies the range trees of the metaslabs, allowing the
sync process to write them out normally. It does not currently store
those allocations anywhere to make them reversible, and there is no
corresponding free subcommand (which would be extremely dangerous); this
is an irreversible process, only intended for performance testing. The
only way to reclaim the space afterwards is to destroy the pool or roll
back to a checkpoint.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17576
2026-02-11 10:27:01 -08:00
Tony Hutter 9a5027ccce CI: Test build Lustre against ZFS
The Lustre filesystem calls a number of exported ZFS functions.  Do a
test build on the Almalinux runners to make sure we're not breaking
Lustre.  We do the Lustre build in parallel with the normal ZTS test
for efficiency, since ZTS isn't very CPU intensive. The full Lustre
build takes around 15min when run on its own.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18161
2026-02-11 10:26:41 -08:00
Tony Hutter a50b8a727c CI: Fix qemu-1-setup failure, remove debug stuff
- For whatever reason, the runner will now start up with either two 75GB
  disks or one 150GB disk.  Previously the runner was always booting
  with two 75GB, but about a quarter of the time it now starts up
  with a single 150GB disk.  This caused qemu-1-setup.sh to fail
  since it expected the two 75GB disks.  This commit updates
  qemu-1-setup.sh to work with either disk config.

- Remove the watchdog from qemu-1-setup.sh.  It didn't turn out to be
  useful.

- Remove the timestamps that zfs-qemu.yml added to the qemu-1-setup.sh
  output.  The timestamps were redundant, since you can already
  download timestamped logs from the Github web interface.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18166
2026-02-11 10:26:36 -08:00
Alexander Moch bf4b271af1 CI: Add Alpine Linux 3.23 runner to the pipeline (#18087)
Add an Alpine Linux 3.23 runner to the CI chain to run OpenZFS builds
and tests against musl libc.

Currently, zfs_send_sparse is killed after 10 minutes on Alpine, causing
cascading EBUSY failures in the test suite. With zfs_send_sparse
disabled, the ZFS test suite reaches a pass rate of 94.62%.

This commit introduces the required Alpine-specific setup and a small
set of shell and cloud-init compatibility fixes that also apply to
existing Linux runners.

The Alpine runner is not enabled by default and is not executed for new
pull requests.

Sponsored-by: ERNW Research GmbH - https://ernw-research.de/

Signed-off-by: Alexander Moch <amoch@ernw.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
2026-02-11 10:26:30 -08:00
Tony Hutter 38ed094954 ZTS: add mount_loopback to test zfs behind loop dev
Add a test case to reproduce issue #17277:

1. Make a pool
2. Write a file to the pool
3. Mount the file as a loopback device
4. Make an XFS filesystem on the loopback device
5. Mount the XFS filesystem... <hangs>

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Issue #17277
Closes #17329
2026-02-11 10:26:24 -08:00
Tony Hutter 497b9291d1 CI: Test 2.4.x in qemu-test-repo-vm.sh, quick mode
The qemu-test-repo-vm.sh script tests installing ZFS from different
repos.  Have it test from the new 2.4.x repos as well.

Also add a checkbox to run in "lookup mode".  This just does a
quick lookup to see what version is installed in each repo.  It does
not do a test install and module load.  It only takes 3min to run vs
over an hour for the full version.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18070
2026-02-11 10:25:56 -08:00
Brian Behlendorf 60b37bc647 CI: Add smatch static analysis workflow
Smatch is an actively maintained kernel-aware static analyzer
for C with a low false positive rate.  Since the code checker
can be run relatively quickly against the entire OpenZFS code
base (15 min) it makes sense to add it as a GitHub Actions
workflow.  Today smatch reports a significant number of warnings
so the workflow is configured to always pass as long as the
analysis was run.  The results are available for reference.
Long term it would be ideal to resolve all of the errors/warnings
at which point the workflow can be updated to fail when new
problems are detected.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Toomas Soome <tsoome@me.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17935
2026-02-11 10:25:48 -08:00
Alexander Motin 44e6a07bff ZIO: Set minimum number of free issue threads to 32
Free issue threads might block waiting for synchronous DDT, BRT or
GANG header reads. So unlike other taskqs using ZTI_SCALE to scale
with number of CPUs, here we also need some amount of threads to
potentially saturate pool reads.  I am not sure we always want the
96 threads we had before ZTI_SCALE introduction at #11966 on small
systems, but let's make it at least 32.

While here, make free taskqs configurable, similar to read and
write ones.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17903
2025-12-19 19:55:14 -08:00
Alexander Motin 08d34f28f1 Suppress some ashift warnings
Do not warn about vdev ashifts being smaller than physical ashifts
in a pool status if the pool ashift property is set and vdev ashift
satisfies it (bigger or equal), since the user explicitly requested
this.  The ashift of individual vdevs is still reported.

Do not warn about vdev ashifts in zpool import, since it doesn't
matter much, and we don't even report individual vdev ashifts
there.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17830
2025-12-19 19:55:14 -08:00
Alexander Motin 1397bf1e0e Explicitly set ashift for non-leaf vdevs
Before this change the ashift property was applied only to leaf
vdevs.  As a result, it worked only as a minimal value for parent
vdevs, since a bigger physical_ashift value reported by any child
could be used instead when deciding parent's ashift, as if the
ashift property was never set.

This change explicitly passes ZPOOL_CONFIG_ASHIFT to all vdevs,
allowing override for parents only if the passed value is below
logical_ashift and so unacceptable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17826
2025-12-19 19:55:14 -08:00
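The selection rule described in 1397bf1e0e above can be illustrated with a toy function. This is a sketch under assumptions, not the vdev code: the function name and parameters are hypothetical, and the fallback used when the requested value is missing or rejected (the larger of the logical and physical ashift) is an assumption, since the commit message only says the requested value may be overridden when it is below logical_ashift.

```python
from typing import Optional


def parent_ashift(requested: Optional[int],
                  logical_ashift: int,
                  physical_ashift: int) -> int:
    """Toy model of choosing a non-leaf vdev's ashift.

    requested       -- value of the ashift property, or None if unset
    logical_ashift  -- smallest acceptable ashift
    physical_ashift -- largest physical ashift reported by any child
    """
    if requested is not None and requested >= logical_ashift:
        # An acceptable explicit request is honoured as-is.
        return requested
    # Assumed fallback when nothing usable was requested.
    return max(logical_ashift, physical_ashift)
```

In this model an explicit ashift of 12 is kept even when a child reports a physical ashift of 13, which is the behaviour the commit message describes for parents after the change.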
Alexander Motin c87a1f7137 raidz_test: Restore rand_data protection
It feels dirty to modify the protection of memory allocated via libc,
but at least we should try to restore it before freeing.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-19 19:55:14 -08:00
Alexander Motin b2d052e617 raidz_test: Fix ZIO ABDs initialization
- When filling ABDs of several segments, consider offset.
- "Corrupt" ABDs with actually different data to fail something.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-19 19:55:14 -08:00
Alexander Motin 23d4ce66f8 raidz_test: Set io_offset reasonably
- io_offset of 1 makes no sense.  Set default to 0.
- Initialize io_offset in all cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-19 19:55:14 -08:00
Alexander Motin af9ae623e0 ZFS: Enable more logs for raidz_001_neg
The output is not so big here, so let's collect something useful.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17977
2025-12-19 19:55:14 -08:00
Tony Hutter e35fdeb411 CI: Use Ubuntu mirrors instead of azure (#18057)
Use the official Ubuntu apt mirrors instead of
azure.archive.ubuntu.com, since that mirror can be slow:

    https://github.com/actions/runner-images/issues/7048

This can help speed up the 'Setup QEMU' stage.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18057
2025-12-19 19:55:14 -08:00
Tony Hutter 00ab445b51 CI: Change timeout values
The 'Setup QEMU' CI step updates and installs all packages necessary to
start up QEMU.  Typically the step takes a little over a minute, but
we've seen cases where it can legitimately take more than 45
minutes.  Change the timeout to 60 minutes.

In addition, change the 'Install dependencies' timeout to 60min since
we've also seen timeouts there.

Lastly, remove all timeouts from the zfs-qemu-packages workflow.
We do this so that we can always build packages from a branch, even if
the time it takes to do a CI step changes over time.  It's ok to
eliminate the timeouts from the zfs-qemu-packages completely since that
workflow is only run manually.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18056
2025-12-19 19:55:14 -08:00
Tony Hutter e3fe4293f7 CI: zfs-test-packages: Add in new repos
Test install from our new repos: zfs-latest, zfs-legacy,
zfs-2.3, zfs-2.2, from the zfs-test-packages workflow.
This on-demand workflow is used to verify that the zfs RPMs
in the repos are correct.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17956
2025-12-19 19:55:14 -08:00
Tony Hutter e51c8c0e83 CI: Fix Ubuntu 22.01 rsend failures
For whatever reason, the single `log_note` in the `directory_diff`
function causes the function to stop executing on Ubuntu 22.  This
causes most of the rsend tests to fail.  Remove the line since it's only
informational.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #18032
2025-12-19 19:55:14 -08:00
Brian Behlendorf 0afe9b67c2 CI: exclude signed-off-by/reviewed-by from 72 char limit
Allow an author or reviewer's name and email address to exceed
the 72 character limit enforced by the commitcheck target.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #18030
2025-12-19 19:55:14 -08:00
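The exemption described in 0afe9b67c2 above can be sketched as a line filter. This is illustrative Python only: the function and constant names are hypothetical, the real commitcheck target is not implemented this way, and it may exempt other lines that this sketch does not model.

```python
import re

MAX_LEN = 72  # limit named in the commit message
# Trailers whose names and addresses may exceed the limit.
TRAILER_RE = re.compile(r"^(signed-off-by|reviewed-by):", re.IGNORECASE)


def overlong_lines(message: str) -> list:
    """Return the lines of a commit message that break the length
    limit, ignoring Signed-off-by and Reviewed-by trailers."""
    return [
        line
        for line in message.splitlines()
        if len(line) > MAX_LEN and not TRAILER_RE.match(line)
    ]
```

A message passes this toy check when `overlong_lines(message)` returns an empty list, regardless of how long its trailer lines are.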
bspengler-oss 4f77b30135 Fix HIGHMEM/kmap API violation in zfs_uiomove_bvec_impl()
Fix another instance where ZFS assumes multiple pages can be
mapped at once via zfs_kmap_local(), resulting in crashes and
potential memory corruption on HIGHMEM-enabled (typically 32-bit)
systems.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-19 19:55:14 -08:00
bspengler-oss 445879656b Preserve LIFO ordering of kmap ops in abd_raidz_gen_iterate()
ZFS typically preserves proper LIFO ordering regarding map/unmap
operations that wrap the Linux kernel's kmap interfaces that
require such ordering, but one instance in abd_raidz_gen_iterate()
did not.

Similar issues have been fixed in the Linux kernel in the past,
see for instance CVE-2025-39899 for userfaultfd.

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-19 19:55:14 -08:00
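The LIFO requirement mentioned in 445879656b above means mappings must be released in the reverse order they were taken. The sketch below is a toy Python checker with hypothetical names; it models only the ordering rule and has nothing to do with the actual kmap interfaces or abd_raidz_gen_iterate().

```python
class KmapOrderChecker:
    """Tracks active mappings and rejects out-of-order unmaps."""

    def __init__(self):
        self._active = []  # most recent mapping is last

    def map(self, page):
        self._active.append(page)
        return page

    def unmap(self, page):
        # Only the most recently mapped page may be released.
        if not self._active or self._active[-1] != page:
            raise RuntimeError(f"unmap of {page!r} violates LIFO order")
        self._active.pop()

    @property
    def depth(self):
        return len(self._active)
```

Mapping "a" then "b" and unmapping "b" then "a" is accepted. Unmapping "a" while "b" is still mapped raises, which is the kind of ordering mistake the commit describes fixing.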
bspengler-oss 0dcb882037 Fix interaction of abd_iter_map()/abd_iter_unmap() with HIGHMEM
HIGHMEM kmap interfaces operate on only a single page at a time
yet ZFS hadn't accounted for this, resulting in crashes and
potential memory corruption on HIGHMEM (typically 32-bit) systems.
This was caught by PaX's KERNSEAL feature as it makes use of
HIGHMEM functionality on x64.

On typical 64-bit systems, this issue wouldn't have been observed,
as the map interfaces simply fall back to returning an address in
lowmem where the contiguous pages can be accessed directly.

Joint work with the PaX Team, tested by Mark van Dijk

Reviewed-by: RageLtMan <rageltman@sempervictus>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Closes #15668
Closes #18030
2025-12-19 19:55:14 -08:00
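The constraint described in 0dcb882037 above, that only a single page can be mapped at a time, implies that a byte range has to be processed in pieces that never cross a page boundary. The following Python sketch shows that splitting; the function name and the 4096-byte page size are assumptions for illustration, and it is not the ABD iterator code.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def page_chunks(offset, length, page_size=PAGE_SIZE):
    """Yield (page_index, offset_in_page, nbytes) tuples covering
    [offset, offset + length) so that no chunk spans two pages."""
    while length > 0:
        in_page = offset % page_size
        nbytes = min(length, page_size - in_page)
        yield offset // page_size, in_page, nbytes
        offset += nbytes
        length -= nbytes
```

A caller would map one page per yielded tuple, copy `nbytes`, and unmap before moving on, instead of assuming the whole range is addressable through one mapping.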
Rob Norris 2fec0e3add Linux 6.18: META
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris b8c5d43f34 config/kmap_atomic: initialise test data
6.18 changes kmap_atomic() to take a const pointer. This is no problem
for the places we use it, but Clang fails the test due to a warning
about being unable to guarantee that uninitialised data will definitely
not change. Easily solved by forcibly initialising it.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris b3922eb8c1 zvol_id: make array length properly known at compile time
Using strlen() in a static array declaration is a GCC extension. Clang
calls it "gnu-folding-constant" and warns about it, which breaks the
build. If it were widespread we could just turn off the warning, but
since there's only one case, let's just change the array to an explicit
size.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris 8ebb586e0e Linux: bump -std to gnu11
Linux switched from -std=gnu89 to -std=gnu11 in 5.18
(torvalds/linux@e8c07082a8). We've always overridden that with gnu99
because we use some newer features.

More recent kernels are using C11 features in headers that we include.
GCC generally doesn't seem to care, but more recent versions of Clang
seem to be enforcing our gnu99 override more strictly, which breaks the
build in some configurations.

Just bumping our "override" to match the kernel seems to be the easiest
workaround. It's an effective no-op since 5.18, while still allowing us
to build on older kernels.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris f4ead6682c sha256_generic: make internal functions a little more private
Linux 6.18 has conflicting prototypes for various sha256_* and sha512_*
functions, which we get through a very long include chain. That's tough
to fix right now; easier is just to rename our internal functions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris a82b804250 Linux 6.18: namespace type moved to ns_common
The namespace type has moved from the namespace ops struct to the
"common" base namespace struct. Detect this and define a macro that does
the right thing for both versions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris 8757506930 Linux 6.18: replace write_cache_pages()
Linux 6.18 removed write_cache_pages() without a usable replacement.
Here we implement a minimal zpl_write_cache_pages() that finds the dirty
pages within the mapping, gets them into the expected state and hands
them off to zfs_putpage(), which handles the rest.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris c1f1464525 Linux 6.18: block_device_operations->getgeo takes struct gendisk*
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris 51ab0e2185 Linux 6.18: convert ida_simple_* calls
ida_simple_get() and ida_simple_remove() are removed in 6.18. However,
since 4.19 they have been simple wrappers around ida_alloc() and
ida_free(), so we can just use those directly.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Rob Norris 51421ecbe8 Linux 6.18: replace nth_page()
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-12-17 11:24:44 -08:00
Alexander Moch bbbf438d66 linux: use sys/stat.h instead of linux/stat.h
glibc includes linux/stat.h for statx, but musl defines its own statx
struct and associated constants, which does not include STATX_MNT_ID
yet. Thus, including linux/stat.h directly should be avoided for
maximum libc compatibility.

Tested on:
  - glibc: x86_64, i686, aarch64, armv7l, armv6l
  - musl: x86_64, aarch64, armv7l, armv6l

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-By: Achill Gilgenast <achill@achill.org>

Closes #17675
(cherry picked from commit ccf5a8a6fc)

Signed-off-by: classabbyamp <dev@placeviolette.net>
Signed-off-by: Alexander Moch <mail@alexmoch.com>
Co-authored-by: classabbyamp <5366828+classabbyamp@users.noreply.github.com>
2025-12-09 11:58:45 -08:00
Alexander Moch 2d9ba1e3c8 config: Fix LLVM-21 -Wuninitialized-const-pointer warning (#17997)
LLVM-21 enables -Wuninitialized-const-pointer which results in the
following compiler warning and the bdev_file_open_by_path() interface
not being detected for 6.9 and newer kernels.  The blk_holder_ops
are not used by the ZFS code so we can safely use a NULL argument
for this check.

    bdev_file_open_by_path/bdev_file_open_by_path.c:110:54: error:
    variable 'h' is uninitialized when passed as a const pointer
    argument here [-Werror,-Wuninitialized-const-pointer]
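
A minimal sketch of the pattern behind this warning, using hypothetical
stand-ins (struct holder_ops, open_by_path(), probe_open()) rather than
the real kernel interface or the actual configure test. Passing an
uninitialised local as a pointer-to-const argument is what newer Clang
rejects under -Werror; when the callee never needs the data, passing NULL
sidesteps the problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel type used by the check. */
struct holder_ops {
    int unused;
};

/* Stand-in for the probed interface; it never dereferences hops. */
int
open_by_path(const char *path, const struct holder_ops *hops)
{
    assert(path != NULL);
    return (hops == NULL ? 0 : 1);
}

int
probe_open(void)
{
    /*
     * Form that triggers the warning:
     *     struct holder_ops h;        (never initialised)
     *     return (open_by_path("/dev/null", &h));
     * Form that avoids it: pass NULL, since the ops are not needed.
     */
    return (open_by_path("/dev/null", NULL));
}
```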

Reviewed-by: Rob Norris <robn@despairlabs.com>

Closes #17682
Closes #17684
(cherry picked from commit 9acedbacee)

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Moch <mail@alexmoch.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
2025-12-03 11:30:55 -08:00
Tony Hutter ab38521f31 Tag zfs-2.3.5
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-11-13 09:13:28 -08:00
Alexander Motin 12d3e1fc61 FreeBSD: Satisfy ASSERT_VOP_IN_SEQC()
zfs_aclset_common() might be called for newly created or not even
created vnodes, which triggers assertions on newer FreeBSD versions
with DEBUG_VFS_LOCKS included into INVARIANTS.  In the first case
make sure to call vn_seqc_write_begin()/_end(), in the second just
skip the assertion.

The same has to be done for the project management IOCTL and
file-based extended attributes, since those are not going through VFS.

Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17722
2025-11-13 09:13:28 -08:00
Tino Reichardt a68a9c726d CI: Update FreeBSD versions and ci-type handling
Update FreeBSD versions:
- add FreeBSD 15.0-STABLE
- add FreeBSD 16.0-CURRENT

So we use the latest versions of each line now:
  - FreeBSD 14.3 (RELEASE)
  - FreeBSD 15.0 (STABLE)
  - FreeBSD 16.0 (CURRENT)

In commits - you may specify which type of CI should run:
- ZFS-CI-Type: quick
- ZFS-CI-Type: linux
- ZFS-CI-Type: freebsd
- ZFS-CI-Type: full

Reviewed-by: Alexx Saver <lzsaver@users.noreply.github.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17896
2025-11-12 10:59:16 -08:00
Alexander Motin 19b9d93970 BRT: Fix ranges to blocks conversion math
BRT_RANGESIZE_TO_NBLOCKS() takes number of ranges as its argument.
To get number of blocks we should multiply it by the entry size,
not divide by it, as it was due to missing parentheses.

Before #17875 this could cause small memory corruptions for vdevs
bigger than 64TB, but the change made the bug more noticeable.
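
To illustrate the class of bug, here is a minimal sketch with
hypothetical macro names and sizes (EX_BLOCKSIZE, EX_ENTRYSIZE,
EX_NBLOCKS_*); these are not the real BRT definitions. Because division
is left-associative, omitting the parentheses divides by the entry size
where a multiplication was intended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes: 4096-byte blocks holding 2-byte entries. */
#define EX_BLOCKSIZE    4096ULL
#define EX_ENTRYSIZE    sizeof (uint16_t)

/*
 * Buggy: '/' is left-associative, so this divides by the block size and
 * then by the entry size, which undercounts the blocks needed.
 */
#define EX_NBLOCKS_BAD(n) \
    (((n) - 1) / EX_BLOCKSIZE / EX_ENTRYSIZE + 1)

/*
 * Fixed: the parentheses compute entries per block first, which is
 * equivalent to multiplying the entry count by the entry size.
 */
#define EX_NBLOCKS_GOOD(n) \
    (((n) - 1) / (EX_BLOCKSIZE / EX_ENTRYSIZE) + 1)

uint64_t
ex_nblocks_bad(uint64_t nentries)
{
    assert(nentries > 0);
    return (EX_NBLOCKS_BAD(nentries));
}

uint64_t
ex_nblocks_good(uint64_t nentries)
{
    assert(nentries > 0);
    return (EX_NBLOCKS_GOOD(nentries));
}
```

With 10000 two-byte entries, 20000 bytes are needed, which is five
4096-byte blocks; the unparenthesised form yields two.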

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17886
Closes #17915
2025-11-12 09:52:16 -08:00
Brian Behlendorf 54d76c8d1e zstd: disable intrinsics
Disable the aarch64 NEON SIMD intrinsics for kernel builds.  Safely
using them in the kernel context requires saving/restoring the FPU
registers which is not currently done.

Additionally, remove the aarch64 optimized PREFETCH_L1 and PREFETCH_L2
instructions.  Rely on the more portable compiler built-ins.

This lets us remove the problematic workaround in the aarch64_compat.h
header which undefines the __aarch64__ macro.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17904
Closes #17852
2025-11-12 09:52:16 -08:00
Tony Hutter b8ee796945 Linux 6.17 compat: Fix broken projectquota on 6.17
We need to specifically use the FS_XFLAG_* macros in zpl_ioctl_*attr()
codepaths, and the FS_*_FL macros in the zpl_ioctl_*flags() codepaths.
The earlier code just assumed the FS_*_FL macros for both codepaths.
The 6.17 kernel adds a bitmask check in copy_fsxattr_from_user() that
exposed this error via failing 'projectquota' ZTS tests.
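
A minimal sketch of why mixing the two flag families fails a mask check.
The EX_* constants are illustrative copies written from memory of
<linux/fs.h> and should be treated as assumptions; ex_fl_to_xflags() and
ex_xflags_valid() are hypothetical helpers, and the latter only loosely
mimics the kind of check the kernel performs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative copies of a few flag values (assumed, not authoritative;
 * check <linux/fs.h> before relying on them).
 */
#define EX_FS_IMMUTABLE_FL      0x00000010u
#define EX_FS_APPEND_FL         0x00000020u
#define EX_FS_NODUMP_FL         0x00000040u
#define EX_FS_PROJINHERIT_FL    0x20000000u

#define EX_FS_XFLAG_IMMUTABLE   0x00000008u
#define EX_FS_XFLAG_APPEND      0x00000010u
#define EX_FS_XFLAG_NODUMP      0x00000080u
#define EX_FS_XFLAG_PROJINHERIT 0x00000200u

/* The set of xflags bits this sketch considers valid. */
#define EX_XFLAG_KNOWN \
    (EX_FS_XFLAG_IMMUTABLE | EX_FS_XFLAG_APPEND | \
    EX_FS_XFLAG_NODUMP | EX_FS_XFLAG_PROJINHERIT)

/* Translate FS_*_FL style bits into FS_XFLAG_* style bits. */
uint32_t
ex_fl_to_xflags(uint32_t fl)
{
    uint32_t x = 0;

    if (fl & EX_FS_IMMUTABLE_FL)
        x |= EX_FS_XFLAG_IMMUTABLE;
    if (fl & EX_FS_APPEND_FL)
        x |= EX_FS_XFLAG_APPEND;
    if (fl & EX_FS_NODUMP_FL)
        x |= EX_FS_XFLAG_NODUMP;
    if (fl & EX_FS_PROJINHERIT_FL)
        x |= EX_FS_XFLAG_PROJINHERIT;
    return (x);
}

/* Reject any bit outside the known xflags set. */
int
ex_xflags_valid(uint32_t xflags)
{
    return ((xflags & ~EX_XFLAG_KNOWN) == 0);
}
```

Under these assumed values, a raw FS_*_FL project-inherit bit handed to
an xflags interface lands outside the valid mask, while the translated
value passes.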

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17884
Closes #17869
2025-11-12 09:52:16 -08:00
youzhongyang 8e7a310860 Synchronize the update of feature refcount
The concurrent execution of feature_sync() can lead to a panic due 
to an unprotected update of the feature refcount.  Resolve this by
using the spa->spa_feat_stats_lock to synchronize the update of the 
refcount.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Youzhong Yang <yyang@mathworks.com>
Closes #17184
Closes #17632
2025-11-05 08:53:06 -08:00
Alexander Motin 9f2cbea1dc zdb: Fix asize overflow in verify_livelist_allocs()
Spacemap entry might be too big to fit into a block pointer ashift.
We hit an assertion trying to run `zdb -bvy` on a large pool.  But
it seems the code does not really need size there, since we only
need to search for a range of offsets, so setting it to zero should
just make btree return position just before the first entry.  I
suspect the previous code could actually miss the first entry
due to this if its size was smaller.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17764
2025-10-22 10:36:30 -07:00
Alexander Motin 81ceee0cff Fix two infinite loops if dmu_prefetch_max set to zero
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17692
Closes #17729
2025-10-22 10:36:30 -07:00
Robert Evans a60214e792 dnode_next_offset: backtrack if lower level does not match
This changes the basic search algorithm from a single search up and down
the tree to a full depth-first traversal to handle conditions where the
tree matches at a higher level but not a lower level.

Normally higher level blocks always point to matching blocks, but there
are cases where this does not happen:

1. Racing block pointer updates from dbuf_write_ready.

   Before f664f1ee7f (#8946), both dbuf_write_ready and
   dnode_next_offset held dn_struct_rwlock which protected against
   pointer writes from concurrent syncs.

   This no longer applies, so sync context can, e.g., clear or fill all
   L1->L0 BPs before the L2->L1 BP and higher BPs are updated.

   dnode_free_range in particular can reach this case and skip over L1
   blocks that need to be dirtied. Later, sync will panic in
   free_children when trying to clear a non-dirty indirect block.

   This case was found with ztest.

2. txg > 0, non-hole case. This is #11196.

   Freeing blocks/dnodes breaks the assumption that a match at a higher
   level implies a match at a lower level when filtering txg > 0.

   Whenever some but not all L0 blocks are freed, the parent L1 block is
   rewritten. Its updated L2->L1 BP reflects a newer birth txg.

   Later when searching by txg, if the L1 block matches since the txg is
   newer, it is possible that none of the remaining L1->L0 BPs match if
   none have been updated.

   The same behavior is possible with dnode search at L0.

   This is reachable from dsl_destroy_head for synchronous freeing.
   When this happens open context fails to free objects leaving sync
   context stuck freeing potentially many objects.

   This is also reachable from traverse_pool for extreme rewind where it
   is theoretically possible that datasets not dirtied after txg are
   skipped if the MOS has high enough indirection to trigger this case.

In both of these cases, without backtracking the search ends prematurely
as ESRCH result implies no more matches in the entire object.
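
The difference between the two search strategies can be sketched on a
toy two-level tree. Everything here (struct ex_tree, the ex_* functions,
the fixed fan-out) is hypothetical and far simpler than
dnode_next_offset(); it only shows why a stale match at the upper level
must not end the search.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_FANOUT       4
#define EX_NOT_FOUND    (-1)

/* Hypothetical two-level tree: one hint per parent, EX_FANOUT leaves each. */
struct ex_tree {
    int nparents;
    const bool *hint;               /* parent claims "a child matches" */
    const bool (*leaf)[EX_FANOUT];  /* actual per-leaf match state */
};

/* Scan one parent's leaves starting at index 'from'. */
static int
ex_scan_leaves(const struct ex_tree *t, int parent, int from)
{
    for (int j = from; j < EX_FANOUT; j++) {
        if (t->leaf[parent][j])
            return (parent * EX_FANOUT + j);
    }
    return (EX_NOT_FOUND);
}

/*
 * Single pass: descend into the first parent whose hint matches and
 * trust the result, even if that parent has no matching leaf.
 */
int
ex_next_single_pass(const struct ex_tree *t, int start)
{
    assert(start >= 0);
    for (int p = start / EX_FANOUT; p < t->nparents; p++) {
        int from = (p == start / EX_FANOUT) ? start % EX_FANOUT : 0;

        if (t->hint[p])
            return (ex_scan_leaves(t, p, from));
    }
    return (EX_NOT_FOUND);
}

/*
 * With backtracking: if the lower level has no match, go back up and
 * continue with the next parent instead of giving up.
 */
int
ex_next_backtrack(const struct ex_tree *t, int start)
{
    assert(start >= 0);
    for (int p = start / EX_FANOUT; p < t->nparents; p++) {
        int from = (p == start / EX_FANOUT) ? start % EX_FANOUT : 0;
        int off;

        if (!t->hint[p])
            continue;
        off = ex_scan_leaves(t, p, from);
        if (off != EX_NOT_FOUND)
            return (off);
    }
    return (EX_NOT_FOUND);
}
```

When the first parent's hint is stale (it claims a match but none of its
leaves match), the single-pass version reports no match at all, while the
backtracking version goes on to find a matching leaf under a later
parent.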

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Akash B <akash-b@hpe.com>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #16025
Closes #11196
2025-10-21 11:30:30 -07:00
Tony Hutter 2a5349bf93 zvol: verify IO type is supported
ZVOLs don't support all block layer IO request types.  Add a check for
the IO types we do support.  Also, remove references to
io_is_secure_erase() since they are not supported on ZVOLs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17803
2025-10-21 11:02:42 -07:00
Tony Hutter 0bb5950e72 zvol: Fix blk-mq sync
The zvol blk-mq codepaths would erroneously send FLUSH and TRIM
commands down the read codepath, rather than write.  This fixes
the issue, and updates the zvol_misc_fua test to verify that
sync writes are actually happening.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17761
Closes #17765
2025-10-21 11:02:42 -07:00
Brian Behlendorf 1baecd3a78 Linux 6.17 compat: META
Update the META file to reflect compatibility with the 6.17
kernel.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17789
2025-10-21 11:02:42 -07:00
Brian Behlendorf 2b0502c578 Add interface to spa_get_worst_case_min_alloc() function
Provide an interface to retrieve the lowest and highest minimum
allocation size for the normal allocation class.  This can be used
by external consumers of the DMU to estimate potential wasted
capacity when setting the recordsize for an object.

The new "min_alloc" and "max_alloc" keys are added to the pool
configuration and used by default_volblocksize() to warn when
an inefficient block size is requested.  For older kmods which
don't yet include the new keys fall back to the previous logic.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17758
2025-10-21 11:02:42 -07:00
Rob Norris 660077ffee contrib/initramfs/scripts/zfs: shellcheck fixup
I got a newer shellcheck, and it pointed out that read without a target
variable is not POSIXly. The var was removed in c3ef9f7528, so I put it
back, and now shellcheck complains about an unused var. That's actually
correct, but necessary, so I've added a suppression for that, probably
better.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17626
2025-10-21 11:02:42 -07:00
Brian Behlendorf e5ca94f21e Fix 'zpool add' safety check corner cases
Three cases were discovered where 'zpool add' would fail to
warn when adding vdevs to a pool with a mismatched replication
level.  These are:

  1. When a pool contains mixed file and disk vdevs.
  2. When a pool contains an active dRAID distributed spare
  3. When a pool contains an active hot spare

The lack of warnings is caused by get_replication() assessing
the current pool configuration as inconsistent and disabling
the mismatched replication check for the new pool configuration
after 'zpool add'.  This change updates get_replication() to
be slightly more tolerant in the non-fatal case.

The zpool_add_010_pos.ksh test case was split into separate
tests: zpool_add_warn_create.ksh, zpool_add_warn_degraded.ksh,
and zpool_add_warn_removal.ksh.  These tests were extended to
include coverage for dRAID pools and the three scenarios
described above.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17780
2025-10-21 11:02:35 -07:00
Rob Norris 365926a37c Linux 6.17: d_set_d_op() is no longer available
We only have extremely narrow uses, so move it all into a single
function that does only what we need, with and without d_set_d_op().

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17621
2025-10-21 10:36:35 -07:00
Ameer Hamza 4b24cbba80 CI: Fix FreeBSD 15.0 by staying on ALPHA4 due to broken ALPHA5 image
FreeBSD 15.0-ALPHA5 image fails to boot on cloud VMs due to missing
/boot/efi mount point, causing the system to drop to single user mode
where SSH cannot start. Work around this by staying on ALPHA4 and
setting IGNORE_OSVERSION=yes to bypass pkg's kernel version mismatch
prompt during bootstrap. This allows CI to proceed with ALPHA4 until we
have a stable FreeBSD 15.0 image.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17846
2025-10-21 10:35:58 -07:00
Tino Reichardt 007f325e1b CI: Switch FreeBSD 15 to 15.0-ALPHA4 and add FreeBSD 16
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17815
2025-10-21 10:35:58 -07:00
Shreshth3 03c956e806 docs: fix a few small typos (#17804)
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-10-21 10:35:58 -07:00
Tony Hutter 2a331e411d CI: Add ZTS -O option, log Setup Testing Machines step
Add a -O option to zfs-test.sh to dump debug information on test
timeout.  The debug info includes:

- 30 lines from 'top'
- /proc/<PID>/stack output of process with highest CPU usage
- Last lines strace-ing process with highest CPU usage
- /proc/sysrq-trigger kernel stack traces

All debug information gets dumped to /dev/kmsg (Linux only).

In addition, print out the VM console lines from the "Setup Testing
Machines" step.  We have often seen VMs time out at this step and don't
know why.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17753
2025-10-21 10:35:58 -07:00
Brian Behlendorf 39b9b62a96 CI: Switch FreeBSD 15 to 15.0-ALPHA3
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17795
2025-10-21 10:35:58 -07:00
Brian Behlendorf 395f099323 CI: Remove Buildbot references
The Buildbot CI infrastructure has been fully replaced by GitHub
Actions.  Remove any lingering references from the repository.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17794
2025-10-21 10:35:58 -07:00
Brian Behlendorf 78b21d8d19 CI: update perf and bpftools with the kernel packages
When updating a Fedora instance to an experimental kernel make sure
to include the matching versioned perf and bpftool packages.  This
helps ensure there are no unexpected conflicts which would prevent
the new packages from being installed.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17791
2025-10-21 10:35:58 -07:00
Alexander Motin a83eefee1e CI: Switch FreeBSD 15 to 15.0-ALPHA2
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17749
2025-10-21 10:35:58 -07:00
Tony Hutter cb66796f87 CI: Increase setup timeout to 20min, add timestamps
- Increase qemu-1-setup.sh timeout to 20min since it sometimes
  fails to complete after 15min.

- Timestamp all qemu-1-setup.sh lines to look for hangs.

- Add a 'watchdog' process to print out the top running process every
  30sec to help with debugging.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17714
2025-10-21 10:35:58 -07:00
Shengqi Chen f4a85d6f2c ci: fix syntax issues in zfs-qemu.yml
Otherwise it might become `if [ == "" ]` which is ill-formed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17695
2025-10-21 10:35:58 -07:00
Shengqi Chen 0c9302b042 ci: use real head sha instead of GITHUB_SHA when generating CI type
GitHub creates a merge commit on top of the real head, so the check
on HEAD will fail regardless.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17695
2025-10-21 10:35:58 -07:00
Tony Hutter 4f33cd2350 CI: Increase 'Setup QEMU' timeout to 15 minutes
We've seen Fedora 42 still setting up after 10 min.  Change the timeout
to 15 min.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17697
2025-10-21 10:35:58 -07:00
Tony Hutter 34f96a15c7 Tag zfs-2.3.4
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-08-20 09:29:36 -07:00
Tino Reichardt fdb5078d82 CI: Add Debian 13 to the FULL_OS runner list
This commit adds Debian 13 alias Trixie to the checked operating
systems. The image needs to be run with UEFI support.

Current Debian version overview:
- Debian 11 (Bullseye) -> "oldoldstable"
- Debian 12 (Bookworm) -> "oldstable"
- Debian 13 (Trixie) -> new "stable"

The CI will be run on Debian 12 and Debian 13 now.
Debian 11 is kept, but won't be used automatically.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17648
2025-08-20 09:29:36 -07:00
Shengqi Chen 8cca55f18b Debian rules: install scripts/objtool-wrapper.in into dkms tree
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #17633
Closes #17646
2025-08-20 09:20:45 -07:00
Attila Fülöp ef1ee9421d objtool-wrapper: Update Debian packaging
6cf17f65 (#17456) introduced a change to `configure.ac` which
breaks the patching done in the Debian packages DKMS source
installation phase. This results in a failed module build.

Adapt the awk script doing the patching to handle the added
`AC_CONFIG_FILE` entry.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #17633
Closes #17646
2025-08-20 09:12:26 -07:00
shodanshok 435006d81d add uncompressed_size to arc_summary
Add uncompressed ARC size to statistics reported by arc_summary.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #17556
2025-08-19 10:30:04 -07:00
rmacklem 725886d67a FreeBSD: Add support for _PC_HAS_HIDDENSYSTEM
In FreeBSD there is now a pathconf name _PC_HAS_HIDDENSYSTEM.
This patch adds support for it to OpenZFS.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #17518
2025-08-19 10:30:04 -07:00
Meriel Luna Mittelbach bce049389d Add templated zfs-mount@.service
Runs `zfs mount -R <dataset>` at boot, after `zfs mount -a`.
Intended to replace `mountpoint=legacy` in certain mount setups.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Meriel Luna Mittelbach <lunarlambda@gmail.com>
Closes #17483
2025-08-19 10:30:04 -07:00
Mark Johnston c3d74a0d6f FreeBSD: Ensure that z_pflags is initialized for new znodes
The field is subsequently accessed in zfs_mknode(), in
zfs_inherit_projid().  The Linux implementation of zfs_create_fs() has
this initialization already; there is no counterpart to
zfs_create_share_dir() that I can see.

Reported-by: KMSAN
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #17486
2025-08-19 10:30:04 -07:00
Tony Hutter 9cf069b366 CI: Add optional patch level, fix hostname on F42
In the past there have been times when we need to generate new RPMs
for an existing ZFS release.  Typically this happens when a new RHEL
version comes out and the kernel symbols no longer match.  To get
users to auto-update we just bump the patch number.  For example, we
had to create zfs-2.1.13-1 for EL8.8 and zfs-2.1.13-2 for EL8.9.

This commit adds an optional patch level text box to the github
package builder runner.

In addition, this commit also uses `hostnamectl` instead of `hostname`
for F42+ compatibility, if available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17638
2025-08-18 17:06:55 -07:00
Richard Yao 3f87c9c276 Add CodeQL mismatched dsl_dataset_hold/_rele pairs check
This check is currently limited to checking mismatches that occur in the
same stack frame. It does not detect across stack frames.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard@ryao.dev>
Closes #17352
2025-08-18 17:06:55 -07:00
Patrick Fasano 9d14ce4db7 Add conflict/replacement with older SONAME libzfs and libzpool packages
In e8f0aa143e, the SONAMEs and package
names for libzfs and libzpool were bumped. The `contrib/debian/control`
file did not declare a conflict/replacement with the old package name.
This can cause dpkg to leave a system in an inconsistent state if the
old package is not manually uninstalled first.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Patrick Fasano <patrick@patrickfasano.com>
Closes #17586
2025-08-18 16:09:44 -07:00
Rob Norris 3b64a9619f FreeBSD: zfs_putpages: don't undirty pages until after write completes
In syncing mode, zfs_putpages() would put the entire range of pages onto
the ZIL, then return VM_PAGER_OK for each page to the kernel. However,
an associated zil_commit() or txg sync had not happened at this point,
so the write may not actually be on disk.

So, we rework that case to use a ZIL commit callback, and do the
post-write work of undirtying the page and signaling completion there.
We return VM_PAGER_PEND to the kernel instead so it knows that we will
take care of it.

The original version of this (238eab7dc1) copied the Linux model and did
the cleanup in a ZIL callback for both async and sync. This was a
mistake, as FreeBSD does not have a separate "busy for writeback" flag
like Linux which keeps the page usable. The full sbusy flag locks the
entire page out until the itx callback fires, which for async is after
txg sync, which could be literal seconds in the future.

For the async case, the data is already on the DMU and the in-memory
ZIL, which is sufficient for async writeback, so the old method of
logging it without a callback, undirtying the page and returning is more
than sufficient and reclaims that lost performance.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17533
2025-08-12 22:41:17 -04:00
Mark Johnston a072611eef Revert "FreeBSD: zfs_putpages: don't undirty pages until after write completes"
This causes async putpages to leave the pages sbusied for a long time,
which hurts concurrency.  Revert for now until we have a better
approach.

This reverts commit 238eab7dc1.

Reported by:    Ihor Antonov <ngor@hugpoint.tech>
Discussed with: Rob Norris <rob.norris@klarasystems.com>

References: freebsd/freebsd-src@738a9a7
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Johnston <markj@FreeBSD.org>
Ported-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17533
2025-08-12 22:41:17 -04:00
Brian Behlendorf 0fe10361ba Allow vmem_alloc backed multilists
Systems with a large number of CPU cores (192+) may trigger the large
allocation warning in multilist_create() on Linux.  Silence the warning
by converting the allocation to vmem_alloc().

On Linux this results in a call to kvmalloc() which will alloc vmem
for large allocations and kmem for small allocations.

On FreeBSD both vmem_alloc and kmem_alloc internally use the same
allocator so there is no functional change.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17616
2025-08-12 17:24:30 -07:00
Brian Behlendorf 3e78905ffb Silence zstd large allocation warning
Allow zstd_mempool_init() to allocate using vmem_alloc() instead
of kmem_alloc() to silence the large allocation warning on Linux
during module load when the system has a large number of CPUs.

It's not at all clear to me that scaling the allocation size with
the number of CPUs is beneficial and that should be evaluated.
But for the moment this should resolve the warning without
introducing any unexpected side effects.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17620
Closes #11557
2025-08-12 17:24:26 -07:00
Colin Percival 46de04d2e9 FreeBSD 15.0 is now "PRERELEASE"
Chase URL change from the FreeBSD project.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Colin Percival <cperciva@tarsnap.com>
Closes #17617
2025-08-12 17:24:22 -07:00
achill 41ca2296cd Linux 6.16 compat: META
Update the META file to reflect compatibility with the 6.16
kernel.

Tested with 6.16.0-0-stable of Alpine Linux edge, see
<https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/87929>.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Achill Gilgenast <achill@achill.org>
Closes #17578
2025-08-12 17:24:19 -07:00
René Wirnata 9651668457 zed: prettify slack notification message
This converts the body of a ZED slack notification from
plain text to code block style to help with readability.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: René Wirnata <rene.wirnata@pandascience.net>
Closes #17610
2025-08-12 17:24:15 -07:00
Rob Norris a49c957299 linux/zvol_os: fix crash with blk-mq on Linux 4.19
03987f71e3 (#16069) added a workaround to get the blk-mq hardware
context for older kernels that don't cache it in the struct request.
However, this workaround appears to be incomplete.

In 4.19, the rq data context is optional. If it's not initialised, then
the cached rq->cpu will be -1, and so using it to index into mq_map
causes a crash.

Given that the upstream 4.19 is now in extended LTS and rarely seen,
RHEL8 4.18+ has long carried "modern" blk-mq support, and the cached
hardware context has been available since 5.1, I'm not going to huge
lengths to get queue selection correct for the very few people that are
likely to feel it. To that end, we simply call raw_smp_processor_id() to
get a valid CPU id and use that instead.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17597
2025-08-12 17:24:11 -07:00
Todd Zullinger d1d706350e rpm: don't list /sbin/zgenhostid twice in %files
The location of zgenhostid was changed in 0ae733c7a (Install zgenhostid
to sbindir, 2021-01-21).  We include all files within sbindir two lines
earlier, which causes rpmbuild to report:

    File listed twice: /sbin/zgenhostid

Drop the redundant entry from the %files section.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Closes #17601
2025-08-12 17:24:08 -07:00
Attila Fülöp 11f844175e config: Avoid void main() in toolchain-simd.m4
Be standard-compliant by using `int main()`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13303
Closes #17590
2025-08-12 17:23:57 -07:00
Attila Fülöp 57b614e025 SIMD: Don't require definition of HAVE_XSAVE
Currently we fail the compilation via the #error directive if
`HAVE_XSAVE` isn't defined. This breaks i586 builds since we check
the toolchain's SIMD support only on i686 and onward.

Remove the requirement to fix the build on i586.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13303
Closes #17590
2025-08-12 17:23:51 -07:00
Rob Norris 0c7d6e20e6 Linux: zfs_putpage: document (and fix!) confusing sync/commit modes
The structure of zfs_putpage() and its callers is tricky to follow.
There's a lot more we could do to improve it, but at least now we have
some description of one of the trickier bits.

Writing this exposed a very subtle bug: most async pages pushed out
through zpl_putpages() would go to the ZIL with commit=false, which can
yield a less-efficient write policy. So this commit updates that too.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17584
2025-08-12 17:23:46 -07:00
Rob Norris b9c45fe68c Linux: zfs_putpage: complete async page writeback immediately
For async page writeback, we do not need to wait for the page to be on
disk before returning to the caller; it's enough that the data from the
dirty page be on the DMU and in the in-memory ZIL, just like any other
write.

So, if this is not a syncing write, don't add a callback to the itx, and
instead just unlock the page immediately.

(This is effectively the same concept used for FreeBSD in d323fbf49c).

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17584
Closes #14290
2025-08-12 17:23:43 -07:00
Rob Norris f72226a75c Linux: sync: remove async/sync accounting
All this machinery is there to try to understand when there is an async
writeback waiting to complete because the intent log callbacks are still
outstanding, and force them with a timely zil_commit(). The next commit
fixes this properly, so there's no need for all this extra housekeeping.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17584
2025-08-12 17:23:39 -07:00
Rob Norris 97fe86837c ZTS: mmap_ftruncate test to confirm async writeback behaviour
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17584
2025-08-12 17:23:35 -07:00
Rob Norris df5e02d253 CI: match and trim out internal timestamp for test prefix
Adjust the regexes to match the test line with timestamps, then remove
them for the summary. The internal timestamp is still in the full logs.
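
The regex adjustment can be pictured with a small Python sketch. The timestamp layout (`HH:MM:SS.ffffff` followed by a space) and the helper name `summary_line` are assumptions made for illustration; they are not taken from the CI scripts themselves.

```python
import re

# Assumed line shape: an optional "HH:MM:SS.ffffff " prefix, then the usual
# "Test: ..." text. The prefix format is hypothetical.
TEST_LINE = re.compile(r"^(?:\d{2}:\d{2}:\d{2}\.\d+\s+)?(Test: .*)$")


def summary_line(line):
    """Return the test line without its timestamp, or None if it is not a test line."""
    match = TEST_LINE.match(line)
    return match.group(1) if match else None
```

Lines that do not carry the `Test:` marker are left out of the summary, while the full log keeps the original, timestamped text.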

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17045
2025-08-12 17:23:28 -07:00
Rob Norris 245adb6a4f ZTS: include microsecond timestamps on all output
When reviewing test output after a failure, it's often quite difficult
to work out the order and timing of events, and to correlate test suite
output with kernel logs.

This adds timestamps to ZTS output to help with this, in three places:

- all of the standard log_XXX functions ultimately end up in _printline,
  which now prefixes output with a timestamp. An escape hatch
  environment variable is provided for user_cmd, which often calls the
  logging functions while also depending on the captured output.

- the test runner logging function log() also now prefixes its output
  with a timestamp.

- on failure, when capturing the kernel log in zfs_dmesg.ksh, the "iso"
  time format is requested.
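
A rough Python sketch of the prefixing idea follows. The format string and the environment variable name `ZTS_NO_TIMESTAMPS` are hypothetical stand-ins; the commit message does not name the actual escape hatch variable.

```python
import os
from datetime import datetime


def stamp(line, now=None):
    """Prefix a log line with a microsecond-resolution wall-clock timestamp."""
    if now is None:
        now = datetime.now()
    # %f expands to six digits of microseconds
    return f"{now.strftime('%H:%M:%S.%f')} {line}"


def printline(line, env=os.environ):
    """Return the line as it would be logged.

    ZTS_NO_TIMESTAMPS is a hypothetical name for the escape hatch used by
    callers that parse the captured output.
    """
    if env.get("ZTS_NO_TIMESTAMPS"):
        return line
    return stamp(line)
```

Keeping the timestamp at the very start of each line makes it straightforward to sort or correlate the output with kernel logs that use a comparable time format.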

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17045
2025-08-12 17:23:07 -07:00
Brian Behlendorf 82a0868ce4 CI: Remove Debian backports
The latest Debian 11 image includes bullseye-backports as a default
repository in the /etc/apt/sources.list.  However, this repository
has gone end of life which effectively breaks the default install.

We shouldn't need anything in backports so let's unconditionally
remove backports on all Debian builders to resolve the issue.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17569
2025-08-12 17:22:45 -07:00
Coleman Kane e7e0bb3b61 linux: Fix out-of-src builds
The Linux kernel modules haven't been building successfully when the
build occurs in a directory separate from the source code, which is a
common build pattern in Linux. I was not able to determine the root cause,
but the %.o targets in subdirectories are no longer being matched by the
pattern targets in the Linux Kbuild system. This change fixes the issue
by dynamically creating the missing ones inside our Kbuild.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #17517
2025-08-12 17:22:40 -07:00
Paul Dagnelie 6af1f61ad4 Fix zdb pool/ with -k
When examining the root dataset with zdb -k, we get into a mismatched
state. main() knows we are not examining the whole pool, but it strips
off the trailing slash. import_checkpointed_state() then thinks we are
examining the whole pool, and does not update the target path
appropriately. The fix is to directly inform import_checkpointed_state
that we are examining a filesystem, and not the whole pool.
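
The mismatch can be illustrated with a short Python sketch. The function names and the `_checkpoint` suffix are hypothetical and only model the logic described above; they are not zdb's actual identifiers.

```python
SUFFIX = "_checkpoint"  # hypothetical; stands in for zdb's internal suffix


def strip_target(arg):
    """Mimic main(): note whether a dataset was requested, then strip the slash."""
    wants_dataset = "/" in arg
    return arg.rstrip("/"), wants_dataset


def checkpointed_names(target, wants_dataset):
    """Return (pool to import, dataset path to open or None for the whole pool).

    The explicit wants_dataset flag is the point: after stripping, "tank/"
    and "tank" are the same string, so the string alone cannot tell them apart.
    """
    pool, _, rest = target.partition("/")
    renamed = pool + SUFFIX
    if not wants_dataset:
        return renamed, None
    dataset = renamed + ("/" + rest if rest else "")
    return renamed, dataset
```

With the flag passed along, `tank/` yields a dataset path under the renamed pool, whereas guessing from the stripped string would treat it as a whole-pool request.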

Sponsored-by: Klara, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17536
2025-08-12 17:21:47 -07:00
Carl George 8c4f625c12 CI: Add CentOS Stream 9/10 to the FULL_OS runner list
Testing on CentOS Stream provides several months advance notice of
changes coming to the RHEL kernel.  This should help OpenZFS be
proactive instead of reactive to new RHEL minor versions.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Carl George <carlwgeorge@gmail.com>
ZFS-CI-Type: full
Closes #16904
Closes #17526
2025-08-12 17:20:16 -07:00
Tino Reichardt 7882e85a9b Delete unused .cirrus.yml
Cirrus CI was planned for testing FreeBSD but, I think, was never really
used. It is no longer needed, so remove it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17155
Closes #17535
2025-08-12 17:19:43 -07:00
Tino Reichardt 6b38d0f7ff ZTS: Fix FreeBSD 15.0 ksh errors
The package ksh93 is replaced by ksh now.
This works for FreeBSD 13 and 14 also.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17523
2025-08-12 17:19:32 -07:00
Alexander Motin 80b6457fcd CI: Switch from FreeBSD 13.4 to 13.5
FreeBSD 13.4 has been EOL since June 30, 2025.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Closes #17519
2025-08-12 17:19:07 -07:00
Brian Behlendorf 2518f4b124 Revert "Fix incorrect expected error in ztest"
This reverts commit 2076011e0c.  The
comment which explains EINVAL should be expected for this case was
wrong, not the code.  The kernel will return ENOTSUP when attaching
a distributed spare to the wrong top-level dRAID vdev.  See the
check for this in spa_vdev_attach().

Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17503
2025-08-12 17:18:40 -07:00
Igor Ostapenko 90d2c4407a ztest: Fix false positive of ENOSPC handling
Before running a pass zs_enospc_count is checked to free up some space
by destroying a random dataset. But the space freed may still not be
reusable during the TXG_DEFER window, breaking the next dataset creation
in ztest_generic_run().
    
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Igor Ostapenko <igor.ostapenko@klarasystems.com>
Closes #17506
2025-08-12 17:18:34 -07:00
Brian Behlendorf f7698f47e8 CI: run ztest on compressed zpool
When running ztest under the CI a common failure mode is for the
underlying filesystem to run out of available free space.  Since
the storage associated with a GitHub-hosted runner is fixed, we
instead create a pool and use a compressed ZFS dataset to store
the ztest vdev files.  This significantly increases the available
capacity since the data written by ztest is highly compressible.
A compression ratio of over 40:1 is conservatively achieved using
the default lz4 compression.  Autotrimming is enabled to ensure
freed blocks are discarded from the backing cipool vdev file.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17501
2025-08-12 17:17:49 -07:00
Martin Rüegg 6c1130a730 pyzfs: Adapt python lib directory evaluation from ax_python_devel.m4
71216b91d2 introduced a regression
on debian/ubuntu systems during build.

The reason is that building the RPM for pyzfs used
a different library path than building the library itself.
This is now harmonized.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #16155
Closes #17480
2025-08-12 17:17:24 -07:00
Martin Rüegg 74b539d3dc pyzfs: Update ax_python_devel.m4 to serial 37
Fixes an obvious typo, where a variable was missing the required
leading dollar sign ($)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Rüegg <martin.rueegg@metaworx.ch>
Closes #17480
2025-08-12 17:17:17 -07:00
Chunwei Chen 024e60b927 Missing tests in make pkg
```
Warning: TestGroup '/var/tmp/tests/functional/ctime' not added to this
run. Auxiliary script '/var/tmp/tests/functional/ctime/setup' failed
verification.
```

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #17491
2025-08-12 17:17:09 -07:00
Olivier Certner 5289f6f961 spa: ZIO_TASKQ_ISSUE: Use symbolic priority
This allows changing the meaning of priority differences in FreeBSD
without requiring code changes in ZFS.

This upstreams commit fd141584cf89d7d2 from FreeBSD src.

Sponsored-by: The FreeBSD Foundation
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Olivier Certner <olce@FreeBSD.org>
Closes #17489
2025-08-12 17:16:00 -07:00
Paul Dagnelie 094305c937 Fix TestGroup warning due to missing tags
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17473
2025-08-12 15:59:59 -07:00
Tino Reichardt a826f7a993 ZTS: Use FreeBSD cloudinit images
FreeBSD has provided CI-IMAGES for some time. These images are
based on nuageinit, which does not support fqdn and sudo, for
example. So we currently need some workarounds to get them
working.

The FreeBSD images will become more compatible with cloud-init in
the near future. Then we can remove the workarounds.

These versions are used for testing:
- freebsd13-4r (RELEASE)
- freebsd14-3s (STABLE)
- freebsd15-0c (CURRENT)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17462
2025-08-12 15:58:46 -07:00
Attila Fülöp 86bf73c1eb objtool wrapper: use absolute path to call the wrapper
Older kernel versions run make outside of the build directory. This
works since all paths are absolute. Relative paths will fail in such
a scenario.

Use an absolute path to the objtool wrapper as well, since the
relative path breaks the build on older kernels.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #17541
2025-08-07 13:10:58 -04:00
Attila Fülöp 1d293b377a Linux build: handle CONFIG_OBJTOOL_WERROR=y
Linux 5.16 by default fails the build on objtool warnings. We have
known and understood objtool warnings we can't fix without
involving Linux maintainers.

To work around this we introduce an objtool wrapper script which
removes the `--Werror` flag.
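
The filtering step such a wrapper performs can be sketched in Python as below. This is only an illustration of the idea; the function name is hypothetical and this is not the wrapper script from the commit.

```python
def filter_objtool_args(argv):
    """Return argv without the flag that turns objtool warnings into errors."""
    return [arg for arg in argv if arg != "--Werror"]
```

A wrapper built on this would then run the real objtool binary with the filtered argument list, so every other option reaches objtool unchanged.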

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #17456
2025-08-07 13:10:33 -04:00
Alexander Motin 22eb2bdce3 Make TX abort after assign safer
It is not right, but there are a few cases where a TX is aborted
after being assigned in case of error.  To handle this better on
production systems, add extra cleanup steps.

While here, replace a couple of dmu_tx_abort() calls in simple cases.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17438
2025-08-07 12:41:40 -04:00
Alexander Motin 809b553940 Introduce zfs rewrite subcommand (#17246)
This allows rewriting the content of the specified file(s) as-is, without
modifications, but at a different location and with different compression,
checksum, dedup, copies and other parameter values.  It is faster than read
plus write, since it does not require copying data to user-space.
It is also faster for sync=always datasets, since without data
modification it does not require ZIL writing.  Also, since it is
protected by normal range locks, it can be done under any
other load.  Also, it does not affect the file's modification time or
other properties.

Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
2025-08-07 12:34:28 -04:00
Rob Norris abb6211e7a Linux 6.16: remove writepage and readahead_page
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17443
2025-08-07 12:29:45 -04:00
khoang98 c405a7a35c Skip dbuf_evict_one() from dbuf_evict_notify() for reclaim thread
Avoid calling dbuf_evict_one() from memory reclaim contexts (e.g. Linux
kswapd, FreeBSD pagedaemon). This prevents deadlock caused by reclaim
threads waiting for the dbuf hash lock in the call sequence:
dbuf_evict_one -> dbuf_destroy -> arc_buf_destroy

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Kaitlin Hoang <kthoang@amazon.com>
Closes #17561
2025-08-07 12:15:14 -04:00
shodanshok 4808641e71 enforce arc_dnode_limit
Linux kernel shrinker in the context of null/root memcg does not scan
dentry and inode caches added by a task running in non-root memcg. For
ZFS this means that dnode cache routinely overflows, evicting valuable
meta/data and putting additional memory pressure on the system.

This patch restores zfs_prune_aliases as fallback when the kernel
shrinker does nothing, enabling zfs to actually free dnodes. Moreover,
it (indirectly) calls arc_evict when dnode_size > dnode_limit.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #17487
Closes #17542
2025-08-07 12:11:34 -04:00
Alexander Motin 30fa92bff3 Increase meta-dnode redundancy in "some" mode
Loss of one indirect block of the meta dnode likely means loss of
the whole dataset.  It is worse than one file that the man page
promises, and in my opinion is not much better than "none" mode.

This change restores redundancy of the meta-dnode indirect blocks,
while at the same time still correcting expectations in the man page.

Reviewed-by: Akash B <akash-b@hpe.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17339
2025-08-05 13:15:44 -04:00
Paul Dagnelie fd5a27c9db Ensure that gang_copies is always at least as large as copies
As discussed in the comments of PR #17004, you can theoretically run
into a case where a gang child has more copies than the gang header,
which can lead to some odd accounting behavior (and even trip a
VERIFY). While the accounting code could be changed to handle this, it
fundamentally doesn't seem to make a lot of sense to allow this to
happen. If the data is supposed to have a certain level of reliability,
that isn't actually achieved unless the gang_copies property is set to
match it.
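
The invariant itself is small enough to state as a Python sketch. The function name is hypothetical, and this only shows the intended relationship between the two values, not the actual property-handling code.

```python
def effective_gang_copies(copies, gang_copies):
    """Never let gang headers be less redundant than the data they point to."""
    return max(copies, gang_copies)
```

For example, a block written with three copies keeps three copies of its gang header even if the gang header setting asked for fewer.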

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17484
2025-08-05 13:14:45 -04:00
Rob Norris 3ad3f439bb zts: add spdx license tags to gang_blocks tests (#17160)
Missed in #17073, probably because that PR was branched before #17001
was landed and never rebased.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-08-05 13:14:11 -04:00
Paul Dagnelie a46ce73ca8 Make ganging redundancy respect redundant_metadata property (#17073)
The redundant_metadata setting in ZFS allows users to trade resilience
for performance and space savings. This applies to all data and metadata
blocks in zfs, with one exception: gang blocks. Gang blocks currently
just take the copies property of the IO being ganged and, if it's 1,
sets it to 2. This means that we always make at least two copies of a
gang header, which is good for resilience. However, if the users care
more about performance than resilience, their gang blocks will be even
more of a penalty than usual.

We add logic to calculate the number of gang header copies directly,
and store it as a separate IO property. This is stored in the IO
properties and not calculated when we decide to gang because by that
point we may not have easy access to the relevant information about what
kind of block is being stored. We also check the redundant_metadata
property when doing so, and use that to decide whether to store an extra
copy of the gang headers, compared to the underlying blocks.
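
The difference between the old and new behaviour can be sketched in Python. The `extra_metadata_copy` flag is a hypothetical stand-in for whatever the redundant_metadata setting decides for a given block, and the function names are illustrative only.

```python
MAX_COPIES = 3  # a block pointer holds at most three DVAs


def gang_copies_old(copies):
    """Previous behaviour: always keep at least two copies of a gang header."""
    return 2 if copies == 1 else copies


def gang_copies_new(copies, extra_metadata_copy):
    """Sketch of the new behaviour.

    extra_metadata_copy is a hypothetical boolean derived from the
    redundant_metadata property; when false, gang headers get no more
    copies than the underlying blocks.
    """
    wanted = copies + 1 if extra_metadata_copy else copies
    return min(wanted, MAX_COPIES)
```

Computing this value up front and carrying it as an IO property means the ganging code does not have to rediscover what kind of block it is handling.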

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Co-authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2025-08-05 13:10:40 -04:00
Ameer Hamza 90790955a6 SPDX: Add missing CDDL-1.0 license
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
2025-08-05 12:51:35 -04:00
Igor Ostapenko 95abbc71c3 range_tree: Provide more debug details upon unexpected add/remove
Sponsored-by: Klara, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Igor Ostapenko <igor.ostapenko@klarasystems.com>
Closes #17581
2025-08-05 12:34:54 -04:00
Tino Reichardt fc658b9935 Faster checksum benchmark on system boot
While booting, only the needed 256KiB benchmarks are done now.

The delay for checking all checksums occurs when requested via:
- Linux: cat /proc/spl/kstat/zfs/chksum_bench
- FreeBSD: sysctl kstat.zfs.misc.chksum_bench
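
The boot-time versus on-demand split can be modelled with a small Python sketch. The class, the `ALL_SIZES` list and the `run_benchmark` callable are hypothetical; only the 256KiB boot-time case comes from the description above.

```python
# Hypothetical set of block sizes covered by the full benchmark
ALL_SIZES = (1024, 4096, 16384, 65536, 262144, 1048576)
BOOT_SIZE = 256 * 1024


class ChecksumBench:
    def __init__(self, run_benchmark):
        self._run = run_benchmark  # callable(size_in_bytes) -> result
        self._results = {}

    def _measure(self, size):
        # Each size is benchmarked at most once and then cached
        if size not in self._results:
            self._results[size] = self._run(size)

    def boot(self):
        """At module load, measure only the 256KiB case."""
        self._measure(BOOT_SIZE)

    def report(self):
        """On demand, fill in the remaining sizes and return all results."""
        for size in ALL_SIZES:
            self._measure(size)
        return dict(self._results)
```

The cost of the full benchmark is paid only by whoever asks for the report, and the result measured at boot is reused rather than recomputed.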

Reported by: Lahiru Gunathilake <gunathilakebllg@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Co-authored-by: Colin Percival <cperciva@tarsnap.com>
Closes #17563
Closes #17560
2025-08-05 12:34:13 -04:00
Paul Dagnelie 271b9797c5 Don't use wrong weight when passivating group
When we're passivating a metaslab group we start by passivating the 
metaslabs that have been activated for each of the allocators.  To do 
that, we need to provide a weight. However, currently this erroneously 
always uses a segment-based weight, even if segment-based weighting is 
disabled.

Use the normal weight function, which will decide which type of weight 
to use.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17566
2025-08-05 12:33:52 -04:00
Brian Behlendorf 582e7847f6 Default to zfs_bclone_wait_dirty=1
Update the default FICLONE and FICLONERANGE ioctl behavior to wait
on dirty blocks.  While this does remove some control from the
application, in practice ZFS is better positioned to do the optimal
thing and immediately force a TXG sync.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17455
2025-08-05 12:33:30 -04:00
Andriy Tkachuk 6d378564b4 zdb: fix checksum calculation for decompressed blocks
Currently, when reading compressed blocks with -R and decompressing
them with :d option and specifying lsize, which is normally bigger
than psize for compressed blocks, the checksum is calculated on
decompressed data. But it makes no sense since zfs always calculates
checksum on physical, i.e. compressed data. So reading the same block
produces different checksum results depending on how we read it,
whether we decompress it or not, which, again, makes no sense.

Fix: use psize instead of lsize when calculating the checksum so that
it is always calculated on the physical block size, whether or not the
block was compressed.
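
The principle can be demonstrated with a self-contained Python sketch. Here zlib and SHA-256 merely stand in for the pool's compression and checksum algorithms, and the helper names are made up for the example.

```python
import hashlib
import zlib

logical = b"A" * 4096              # lsize bytes, as the application wrote them
physical = zlib.compress(logical)  # psize bytes, as stored on disk


def block_checksum(data):
    # SHA-256 stands in for whichever checksum the dataset uses
    return hashlib.sha256(data).hexdigest()


stored = block_checksum(physical)  # what the block pointer would record


def read_block(physical_bytes, decompress):
    """Verify against the physical bytes, whether or not we then decompress."""
    ok = block_checksum(physical_bytes) == stored
    payload = zlib.decompress(physical_bytes) if decompress else physical_bytes
    return ok, payload
```

Checksumming the decompressed payload instead would give a value that never matches the stored one, which is the inconsistency the fix removes.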

Signed-off-by: Andriy Tkachuk <andriy.tkachuk@seagate.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17547
2025-08-05 12:33:00 -04:00
Ameer Hamza 0c928f7a37 ZED: Fix device type detection and pool iteration logic
During hotplug REMOVED events, devid matching fails for partition-based
spares because devid information is not stored in pool config for
partitioned devices. However, when devid is populated by the hotplug
event, the original code skipped the search logic entirely, skipping
vdev_guid matching and resulting in wrong device type detection that
caused spares to be incorrectly identified as l2arc devices.
Additionally, fix zfs_agent_iter_pool() to use the return value from
zfs_agent_iter_vdev() instead of relying on search parameters, which
was previously ignored. Also add pool_guid optimization to enable
targeted pool searching when pool_guid is available.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17545
2025-08-05 12:32:28 -04:00
Chunwei Chen c79d5e4f33 Define sops->free_inode() to prevent use-after-free during lookup
On Linux, when doing path lookup with LOOKUP_RCU, dentry and inode can
be dereferenced without refcounts and locks. For this reason, dentry and
inode must only be freed after RCU grace period.

However, zfs currently frees inode in zfs_inode_destroy synchronously
and we can't use GPL-only call_rcu() in zfs directly. Fortunately, on
Linux 5.2 and after, if we define sops->free_inode(), the kernel will do
call_rcu() for us.

This issue may be triggered more easily with init_on_free=1 boot
parameter:

BUG: kernel NULL pointer dereference, address: 0000000000000020
RIP: 0010:selinux_inode_permission+0x10e/0x1c0
Call Trace:
 ? show_trace_log_lvl+0x1be/0x2d9
 ? show_trace_log_lvl+0x1be/0x2d9
 ? show_trace_log_lvl+0x1be/0x2d9
 ? security_inode_permission+0x37/0x60
 ? __die_body.cold+0x8/0xd
 ? no_context+0x113/0x220
 ? exc_page_fault+0x6d/0x130
 ? asm_exc_page_fault+0x1e/0x30
 ? selinux_inode_permission+0x10e/0x1c0
 security_inode_permission+0x37/0x60
 link_path_walk.part.0.constprop.0+0xb5/0x360
 ? path_init+0x27d/0x3c0
 path_lookupat+0x3e/0x1a0
 filename_lookup+0xc0/0x1d0
 ? __check_object_size.part.0+0x123/0x150
 ? strncpy_from_user+0x4e/0x130
 ? getname_flags.part.0+0x4b/0x1c0
 vfs_statx+0x72/0x120
 ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
 __do_sys_newlstat+0x39/0x70
 ? __x64_sys_ioctl+0x8d/0xd0
 do_syscall_64+0x30/0x40
 entry_SYSCALL_64_after_hwframe+0x62/0xc7

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Co-authored-by: Chunwei Chen <david.chen@nutanix.com>
Closes #17546
2025-08-05 12:30:23 -04:00
Alexander Motin 347d68048a ZIL: Force writing of open LWB on suspend
Under parallel workloads ZIL may delay writes of open LWBs that
are not full enough.  On suspend we do not expect anything new to
appear since zil_get_commit_list() will not let it pass, only
returning TXG number to wait for.  But I suspect that waiting for
the TXG commit without having the last LWB issued may not wait for
its completion, resulting in panic described in #17509.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17521
2025-08-05 12:28:41 -04:00
Paul Dagnelie acf3871ef8 Correct weight recalculation of space-based metaslabs
Currently, after a failed allocation, the metaslab code recalculates the
weight for a metaslab. However, for space-based metaslabs, it uses the
maximum free segment size instead of the normal weighting
algorithm. This is presumably because the normal metaslab weight is
(roughly) intended to estimate the size of the largest free segment, but
it doesn't do that reliably at most fragmentation levels. This means
that recalculated metaslabs are forced to a weight that isn't really
using the same units as the rest of them, resulting in undesirable
behaviors. We switch this to use the normal space-weighting function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Wasabi Technology, Inc.
Sponsored-by: Klara, Inc.
Closes #17531
2025-08-05 12:28:34 -04:00
Ameer Hamza 21d5f25724 Validate mountpoint on path-based unmount using statx
Use statx to verify that path-based unmounts proceed only if the
mountpoint reported by statx matches the MNTTAB entry reported by
libzfs, aborting the operation if they differ. Align
`zfs umount /path` behavior with `zfs umount dataset`.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #17481
2025-08-05 12:27:25 -04:00
Paul Dagnelie 7e945a5b3f Fix other nonrot bugs
There are still a variety of bugs involving the vdev_nonrot property
that will cause problems if you try to run the test suite with
segment-based weighting disabled, and with other things in the weighting
code. Parents' nonrot property needs to be updated when children are
added. When vdevs are expanded and more metaslabs are added, the weights
have to be recalculated (since the number of metaslabs is an input to
the lba bias function). When opening, faulted or unopenable children
should not be considered for whether a vdev is nonrot or not (since the
nonrot property is determined during a successful open, this can cause
false negatives). And draid spares need to have the nonrot property set
correctly.

Sponsored-by: Eshtek, creators of HexOS
Sponsored-by: Klara, Inc.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17469
2025-08-05 12:25:26 -04:00
Alexander Motin 85ce6b8ab2 Polish db_rwlock scope
dbuf_verify(): Don't need the lock, since we only compare pointers.

dbuf_findbp(): Don't need the lock, since aside from an unneeded assert
we only produce the pointer, but don't de-reference it.

dnode_next_offset_level(): When working on top level indirection
should lock dnode buffer's db_rwlock, since it is our parent.  If
dnode has no buffer, then it is meta-dnode or one of quotas and we
should lock the dataset's ds_bp_rwlock instead.

Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17441
2025-08-05 12:23:32 -04:00
Mariusz Zaborski 954894ee53 scrub: generate scrub_finish event
The `scn_min_txg` can now be used not only with resilver. Instead
of checking `scn_min_txg` to determine whether it’s a resilver or
a scrub, simply check which function is defined. Thanks to this
change, a scrub_finish event is generated when performing a scrub
from the saved txg.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Closes #17432
2025-08-05 12:21:46 -04:00
Alexander Motin a4e775d2ca Some arc_release() cleanup
- Don't drop L2ARC header if we have more buffers in this header.
Since we leave them the header, leave them the L2ARC header also.
Honestly we are not required to drop it even if there are no other
buffers, but then we'd need to allocate it a separate header, which
we might drop soon if the old block is really deleted.  Multiple
buffers in a header likely mean active snapshots or dedup, so we
know that the block in L2ARC will remain valid.  It might be rare,
but why not?
- Remove some impossible assertions and conditions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17126
2025-08-05 12:16:27 -04:00
Paul Dagnelie 661310ff5c FDT dedup log sync -- remove incremental
This PR condenses the FDT dedup log syncing into a single sync
pass. This reduces the overhead of modifying indirect blocks for the
dedup table multiple times per txg. In addition, changes were made to
the formula for how much to sync per txg. We now also consider the
backlog we have to clear, to prevent it from growing too large, or
remaining large on an idle system.

Sponsored-by: Klara, Inc.
Sponsored-by: iXsystems, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Authored-by: Don Brady <don.brady@klarasystems.com>
Authored-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Closes #17038
2025-08-05 12:15:21 -04:00
Alexander Motin f9d59b579e ZIL: Relax parallel write ZIOs processing
ZIL introduced dependencies between its write ZIOs to permit flush
defer, when we flush vdev caches only once all the write ZIOs have
completed.  But it was recently spotted that it serializes not only
ZIO completions handling, but also their ready stage.  It means ZIO
pipeline can't calculate checksums for the following ZIOs until all
the previous are checksummed, even though it is not required.  On
systems where memory throughput of a single CPU core is limited,
it creates a single-core CPU bottleneck, which is difficult to see
due to ZIO pipeline design with many taskqueue threads.

While it would be great to bypass the ready stage waits, it would
require changes to ZIO code, and I haven't found a clean way to do
it.  But I've noticed that we don't need any dependency between
the write ZIOs if the previous one has some waiters, which means
it won't defer any flushes and work as a barrier for the earlier
ones.

Bypassing it won't help large single-thread writes, since all the
write ZIOs except the last in that case won't have waiters, and
so will be dependent.  But in that case the ZIO processing might
not be a bottleneck, since there will be only one thread populating
the write buffers, that will likely be the bottleneck.

But bypassing the ZIO dependency on multi-threaded write workloads
really allows them to scale beyond the checksumming throughput of
one CPU core.

My tests with writing 12 files on the same dataset on a pool with
4 striped NVMes as SLOGs from 12 threads with 1MB blocks on a
system with Xeon Silver 4114 CPU show total throughput increase
from 4.3GB/s to 8.5GB/s, increasing the SLOGs busy from ~30% to
~70%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #17458
2025-08-05 12:14:18 -04:00
243 changed files with 6951 additions and 1970 deletions
-21
@@ -1,21 +0,0 @@
env:
CIRRUS_CLONE_DEPTH: 1
ARCH: amd64
build_task:
matrix:
freebsd_instance:
image_family: freebsd-13-5
freebsd_instance:
image_family: freebsd-14-2
freebsd_instance:
image_family: freebsd-15-0-snap
prepare_script:
- pkg install -y autoconf automake libtool gettext-runtime gmake ksh93 py311-packaging py311-cffi py311-sysctl
configure_script:
- env MAKE=gmake ./autogen.sh
- env MAKE=gmake ./configure --with-config="user" --with-python=3.11
build_script:
- gmake -j `sysctl -n kern.smp.cpus`
install_script:
- gmake install
+1 -1
@@ -14,7 +14,7 @@ Please check our issue tracker before opening a new feature request.
Filling out the following template will help other contributors better understand your proposed feature.
-->
### Describe the feature would like to see added to OpenZFS
### Describe the feature you would like to see added to OpenZFS
<!--
Provide a clear and concise description of the feature.
-5
@@ -2,11 +2,6 @@
<!--- Provide a general summary of your changes in the Title above -->
<!---
Documentation on ZFS Buildbot options can be found at
https://openzfs.github.io/openzfs-docs/Developer%20Resources/Buildbot%20Options.html
-->
### Motivation and Context
<!--- Why is this change required? What problem does it solve? -->
<!--- If it fixes an open issue, please link to the issue here. -->
+1
@@ -2,3 +2,4 @@ name: "Custom CodeQL Analysis"
queries:
- uses: ./.github/codeql/custom-queries/cpp/deprecatedFunctionUsage.ql
- uses: ./.github/codeql/custom-queries/cpp/dslDatasetHoldReleMismatch.ql
@@ -0,0 +1,34 @@
/**
* @name Detect mismatched dsl_dataset_hold/_rele pairs
* @description Flags instances of issue #12014 where
* - a dataset held with dsl_dataset_hold_obj() ends up in dsl_dataset_rele_flags(), or
* - a dataset held with dsl_dataset_hold_obj_flags() ends up in dsl_dataset_rele().
* @kind problem
* @severity error
* @tags correctness
* @id cpp/dslDatasetHoldReleMismatch
*/
import cpp
from Variable ds, Call holdCall, Call releCall, string message
where
ds.getType().toString() = "dsl_dataset_t *" and
holdCall.getASuccessor*() = releCall and
(
(holdCall.getTarget().getName() = "dsl_dataset_hold_obj_flags" and
holdCall.getArgument(4).(AddressOfExpr).getOperand().(VariableAccess).getTarget() = ds and
releCall.getTarget().getName() = "dsl_dataset_rele" and
releCall.getArgument(0).(VariableAccess).getTarget() = ds and
message = "Held with dsl_dataset_hold_obj_flags but released with dsl_dataset_rele")
or
(holdCall.getTarget().getName() = "dsl_dataset_hold_obj" and
holdCall.getArgument(3).(AddressOfExpr).getOperand().(VariableAccess).getTarget() = ds and
releCall.getTarget().getName() = "dsl_dataset_rele_flags" and
releCall.getArgument(0).(VariableAccess).getTarget() = ds and
message = "Held with dsl_dataset_hold_obj but released with dsl_dataset_rele_flags")
)
select releCall,
"Mismatched release: held with $@ but released with " + releCall.getTarget().getName() + " for dataset $@",
holdCall, holdCall.getTarget().getName(),
ds, ds.toString()
@@ -7,7 +7,7 @@ Prints "quick" if (explicitly required by user):
- the *last* commit message contains 'ZFS-CI-Type: quick'
or if (heuristics):
- the files changed are not in the list of specified directories, and
- all commit messages do not contain 'ZFS-CI-Type: full'
- all commit messages do not contain 'ZFS-CI-Type: (full|linux|freebsd)'
Otherwise prints "full".
"""
@@ -65,12 +65,12 @@ if __name__ == '__main__':
# check last (HEAD) commit message
last_commit_message_raw = subprocess.run([
'git', 'show', '-s', '--format=%B', 'HEAD'
'git', 'show', '-s', '--format=%B', head
], check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for line in last_commit_message_raw.stdout.decode().splitlines():
if line.strip().lower() == 'zfs-ci-type: quick':
output_type('quick', f'explicitly requested by HEAD commit {head}')
output_type('quick', f'requested by HEAD commit {head}')
# check all commit messages
all_commit_message_raw = subprocess.run([
@@ -83,8 +83,12 @@ if __name__ == '__main__':
for line in all_commit_message:
if line.startswith('ZFS-CI-Commit:'):
commit_ref = line.lstrip('ZFS-CI-Commit:').rstrip()
if line.strip().lower() == 'zfs-ci-type: freebsd':
output_type('freebsd', f'requested by commit {commit_ref}')
if line.strip().lower() == 'zfs-ci-type: linux':
output_type('linux', f'requested by commit {commit_ref}')
if line.strip().lower() == 'zfs-ci-type: full':
output_type('full', f'explicitly requested by commit {commit_ref}')
output_type('full', f'requested by commit {commit_ref}')
# check changed files
changed_files_raw = subprocess.run([
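The commit-message check above can be approximated in shell for local experimentation. This is only a sketch under stated assumptions: `ci_type_from_message` is a hypothetical helper, it reads one commit message on stdin, and it ignores the separate HEAD-only handling of `quick` that the Python script performs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the CI scripts): print the CI type
# requested by a commit message read from stdin, or return 1 if none.
ci_type_from_message() {
    local line
    while IFS= read -r line; do
        # Trim surrounding whitespace and lowercase the line, mirroring
        # the line.strip().lower() comparison in the Python script.
        line="$(printf '%s' "$line" \
            | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
            | tr '[:upper:]' '[:lower:]')"
        case "$line" in
            'zfs-ci-type: freebsd'|'zfs-ci-type: linux'|'zfs-ci-type: full')
                # Strip the tag name and print only the requested type.
                echo "${line#zfs-ci-type: }"
                return 0
                ;;
        esac
    done
    return 1
}
```

For example, `git show -s --format=%B HEAD | ci_type_from_message` would print the requested type for the current commit, if it carries one of these tags.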
+91 -23
@@ -6,6 +6,20 @@
set -eu
# The default 'azure.archive.ubuntu.com' mirrors can be really slow.
# Prioritize the official Ubuntu mirrors.
#
# The normal apt-mirrors.txt will look like:
#
# http://azure.archive.ubuntu.com/ubuntu/ priority:1
# https://archive.ubuntu.com/ubuntu/ priority:2
# https://security.ubuntu.com/ubuntu/ priority:3
#
# Just delete the 'azure.archive.ubuntu.com' line.
sudo sed -i '/azure.archive.ubuntu.com/d' /etc/apt/apt-mirrors.txt
echo "Using mirrors:"
cat /etc/apt/apt-mirrors.txt
# install needed packages
export DEBIAN_FRONTEND="noninteractive"
sudo apt-get -y update
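The mirror cleanup above can be tried safely on a copy of the mirror list before touching `/etc/apt`. A minimal sketch, assuming a hypothetical helper `drop_azure_mirror` that takes the path of any file in the same format:

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete every line mentioning the Azure mirror
# from the given mirror list, leaving the other entries untouched.
drop_azure_mirror() {
    local file="$1" tmp
    tmp="$(mktemp)"
    # Same deletion the CI script performs with 'sed -i', written via a
    # temporary file so the sketch does not depend on in-place editing.
    sed '/azure\.archive\.ubuntu\.com/d' "$file" > "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Running it against a scratch copy, for example `cp /etc/apt/apt-mirrors.txt /tmp/m.txt && drop_azure_mirror /tmp/m.txt`, shows which mirrors would remain.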
@@ -20,35 +34,89 @@ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -q -N ""
sudo systemctl stop docker.socket
sudo systemctl stop multipathd.socket
# remove default swapfile and /mnt
sudo swapoff -a
sudo umount -l /mnt
DISK="/dev/disk/cloud/azure_resource-part1"
sudo sed -e "s|^$DISK.*||g" -i /etc/fstab
sudo wipefs -aq $DISK
sudo systemctl daemon-reload
# Special case:
#
# For reasons unknown, the runner can boot-up with two different block device
# configurations. On one config you get two 75GB block devices, and on the
# other you get a single 150GB block device. Here's what both look like:
#
# --- One 150GB block device ---
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sda 8:0 0 150G 0 disk
# ├─sda1 8:1 0 149G 0 part /
# ├─sda14 8:14 0 4M 0 part
# ├─sda15 8:15 0 106M 0 part /boot/efi
# └─sda16 259:0 0 913M 0 part /boot
#
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_root -> ../../sda
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_root-part1 -> ../../sda1
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part14 -> ../../sda14
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part15 -> ../../sda15
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part16 -> ../../sda16
#
# --- Two 75GB block devices ---
# NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
# sda 8:0 0 75G 0 disk
# ├─sda1 8:1 0 74G 0 part /
# ├─sda14 8:14 0 4M 0 part
# ├─sda15 8:15 0 106M 0 part /boot/efi
# └─sda16 259:0 0 913M 0 part /boot
# sdb 8:16 0 75G 0 disk
# └─sdb1 8:17 0 75G 0 part
#
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_resource -> ../../sdb
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_resource-part1 -> ../../sdb1
# lrwxrwxrwx 1 root root 9 Jan 29 18:07 azure_root -> ../../sda
# lrwxrwxrwx 1 root root 10 Jan 29 18:07 azure_root-part1 -> ../../sda1
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part14 -> ../../sda14
# lrwxrwxrwx 1 root root 11 Jan 29 18:07 azure_root-part15 -> ../../sda15
#
# If we have the azure_resource-part1 partition, umount it, partition it, and
# use it as our ZFS disk and swap partition. If not, just create a file VDEV
# and swap file and use that instead.
# remove default swapfile and /mnt
if [ -e /dev/disk/cloud/azure_resource-part1 ] ; then
sudo umount -l /mnt
DISK="/dev/disk/cloud/azure_resource-part1"
sudo sed -e "s|^$DISK.*||g" -i /etc/fstab
sudo wipefs -aq $DISK
sudo systemctl daemon-reload
fi
sudo modprobe loop
sudo modprobe zfs
# partition the disk as needed
DISK="/dev/disk/cloud/azure_resource"
sudo sgdisk --zap-all $DISK
sudo sgdisk -p \
-n 1:0:+16G -c 1:"swap" \
-n 2:0:0 -c 2:"tests" \
$DISK
sync
sleep 1
if [ -e /dev/disk/cloud/azure_resource-part1 ] ; then
echo "We have two 75GB block devices"
# partition the disk as needed
DISK="/dev/disk/cloud/azure_resource"
sudo sgdisk --zap-all $DISK
sudo sgdisk -p \
-n 1:0:+16G -c 1:"swap" \
-n 2:0:0 -c 2:"tests" \
$DISK
sync
sleep 1
sudo fallocate -l 12G /test.ssd2
DISKS="$DISK-part2 /test.ssd2"
SWAP=$DISK-part1
else
echo "We have a single 150GB block device"
sudo fallocate -l 72G /test.ssd2
SWAP=/swapfile.ssd
sudo fallocate -l 16G $SWAP
sudo chmod 600 $SWAP
DISKS="/test.ssd2"
fi
# swap with same size as RAM (16GiB)
sudo mkswap $DISK-part1
sudo swapon $DISK-part1
# JBOD 2xdisk for OpenZFS storage (test vm's)
SSD1="$DISK-part2"
sudo fallocate -l 12G /test.ssd2
SSD2=$(sudo losetup -b 4096 -f /test.ssd2 --show)
sudo mkswap $SWAP
sudo swapon $SWAP
# adjust zfs module parameter and create pool
exec 1>/dev/null
@@ -57,7 +125,7 @@ ARC_MAX=$((1024*1024*512))
echo $ARC_MIN | sudo tee /sys/module/zfs/parameters/zfs_arc_min
echo $ARC_MAX | sudo tee /sys/module/zfs/parameters/zfs_arc_max
echo 1 | sudo tee /sys/module/zfs/parameters/zvol_use_blk_mq
sudo zpool create -f -o ashift=12 zpool $SSD1 $SSD2 -O relatime=off \
sudo zpool create -f -o ashift=12 zpool $DISKS -O relatime=off \
-O atime=off -O xattr=sa -O compression=lz4 -O sync=disabled \
-O redundant_metadata=none -O mountpoint=/mnt/tests
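The two-layout decision in this script reduces to one existence test on the resource partition. The sketch below isolates that decision; `choose_layout` is a hypothetical helper, and it only prints the values the real script assigns to `DISKS` and `SWAP`, without partitioning or allocating anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the path of the resource partition
# (e.g. /dev/disk/cloud/azure_resource-part1), print the pool vdevs and
# the swap target the setup script would pick for that runner layout.
choose_layout() {
    local part="$1"
    local disk="${part%-part1}"   # whole-disk path without the suffix
    if [ -e "$part" ]; then
        # Second block device present: partition 1 becomes swap,
        # partition 2 plus a file vdev back the pool.
        echo "DISKS=\"$disk-part2 /test.ssd2\""
        echo "SWAP=\"$disk-part1\""
    else
        # Single large device: fall back to a file vdev and a swap file.
        echo "DISKS=\"/test.ssd2\""
        echo "SWAP=\"/swapfile.ssd\""
    fi
}
```

Calling `choose_layout /dev/disk/cloud/azure_resource-part1` on a runner shows which branch the setup script would take.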
+162 -68
@@ -12,10 +12,10 @@ OS="$1"
# OS variant (virt-install --os-variant list)
OSv=$OS
# compressed with .zst extension
REPO="https://github.com/mcmilk/openzfs-freebsd-images"
FREEBSD="$REPO/releases/download/v2025-04-13"
URLzs=""
# FreeBSD URLs
FREEBSD_REL="https://download.freebsd.org/releases/CI-IMAGES"
FREEBSD_SNAP="https://download.freebsd.org/snapshots/CI-IMAGES"
URLxz=""
# Ubuntu mirrors
UBMIRROR="https://cloud-images.ubuntu.com"
@@ -25,6 +25,10 @@ UBMIRROR="https://cloud-images.ubuntu.com"
# default nic model for vm's
NIC="virtio"
# additional options for virt-install
OPTS[0]=""
OPTS[1]=""
case "$OS" in
almalinux8)
OSNAME="AlmaLinux 8"
@@ -39,20 +43,25 @@ case "$OS" in
OSv="almalinux9"
URL="https://repo.almalinux.org/almalinux/10/cloud/x86_64/images/AlmaLinux-10-GenericCloud-latest.x86_64.qcow2"
;;
alpine3-23)
OSNAME="Alpine Linux 3.23.2"
# Alpine Linux v3.22 and v3.23 are unknown to osinfo as of 2025-12-26.
OSv="alpinelinux3.21"
URL="https://dl-cdn.alpinelinux.org/alpine/v3.23/releases/cloud/generic_alpine-3.23.2-x86_64-bios-cloudinit-r0.qcow2"
;;
archlinux)
OSNAME="Archlinux"
URL="https://geo.mirror.pkgbuild.com/images/latest/Arch-Linux-x86_64-cloudimg.qcow2"
;;
centos-stream10)
OSNAME="CentOS Stream 10"
# TODO: #16903 Overwrite OSv to stream9 for virt-install until it's added to osinfo
OSv="centos-stream9"
URL="https://cloud.centos.org/centos/10-stream/x86_64/images/CentOS-Stream-GenericCloud-10-latest.x86_64.qcow2"
;;
centos-stream9)
OSNAME="CentOS Stream 9"
URL="https://cloud.centos.org/centos/9-stream/x86_64/images/CentOS-Stream-GenericCloud-9-latest.x86_64.qcow2"
;;
centos-stream10)
OSNAME="CentOS Stream 10"
OSv="centos-stream9"
URL="https://cloud.centos.org/centos/10-stream/x86_64/images/CentOS-Stream-GenericCloud-10-latest.x86_64.qcow2"
;;
debian11)
OSNAME="Debian 11"
URL="https://cloud.debian.org/images/cloud/bullseye/latest/debian-11-generic-amd64.qcow2"
@@ -61,6 +70,14 @@ case "$OS" in
OSNAME="Debian 12"
URL="https://cloud.debian.org/images/cloud/bookworm/latest/debian-12-generic-amd64.qcow2"
;;
debian13)
OSNAME="Debian 13"
# TODO: Overwrite OSv to debian12 for virt-install until debian13 is added to osinfo
OSv="debian12"
URL="https://cloud.debian.org/images/cloud/trixie/latest/debian-13-generic-amd64.qcow2"
OPTS[0]="--boot"
OPTS[1]="uefi=on"
;;
fedora41)
OSNAME="Fedora 41"
OSv="fedora-unknown"
@@ -71,50 +88,61 @@ case "$OS" in
OSv="fedora-unknown"
URL="https://download.fedoraproject.org/pub/fedora/linux/releases/42/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-42-1.1.x86_64.qcow2"
;;
freebsd13-4r)
OSNAME="FreeBSD 13.4-RELEASE"
OSv="freebsd13.0"
URLzs="$FREEBSD/amd64-freebsd-13.4-RELEASE.qcow2.zst"
BASH="/usr/local/bin/bash"
NIC="rtl8139"
fedora43)
OSNAME="Fedora 43"
OSv="fedora-unknown"
URL="https://download.fedoraproject.org/pub/fedora/linux/releases/43/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-43-1.6.x86_64.qcow2"
;;
freebsd13-5r)
OSNAME="FreeBSD 13.5-RELEASE"
FreeBSD="13.5-RELEASE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd13.0"
URLzs="$FREEBSD/amd64-freebsd-13.5-RELEASE.qcow2.zst"
BASH="/usr/local/bin/bash"
URLxz="$FREEBSD_REL/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_REL/../amd64/$FreeBSD/src.txz"
NIC="rtl8139"
;;
freebsd14-1r)
OSNAME="FreeBSD 14.1-RELEASE"
OSv="freebsd14.0"
URLzs="$FREEBSD/amd64-freebsd-14.1-RELEASE.qcow2.zst"
BASH="/usr/local/bin/bash"
;;
freebsd14-2r)
OSNAME="FreeBSD 14.2-RELEASE"
FreeBSD="14.2-RELEASE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLzs="$FREEBSD/amd64-freebsd-14.2-RELEASE.qcow2.zst"
BASH="/usr/local/bin/bash"
URLxz="$FREEBSD_REL/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_REL/../amd64/$FreeBSD/src.txz"
;;
freebsd14-3r)
FreeBSD="14.3-RELEASE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_REL/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_REL/../amd64/$FreeBSD/src.txz"
;;
freebsd13-5s)
OSNAME="FreeBSD 13.5-STABLE"
FreeBSD="13.5-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd13.0"
URLzs="$FREEBSD/amd64-freebsd-13.5-STABLE.qcow2.zst"
BASH="/usr/local/bin/bash"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
NIC="rtl8139"
;;
freebsd14-2s)
OSNAME="FreeBSD 14.2-STABLE"
freebsd14-3s)
FreeBSD="14.3-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLzs="$FREEBSD/amd64-freebsd-14.2-STABLE.qcow2.zst"
BASH="/usr/local/bin/bash"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
freebsd15-0c)
OSNAME="FreeBSD 15.0-CURRENT"
freebsd15-0s)
FreeBSD="15.0-STABLE"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLzs="$FREEBSD/amd64-freebsd-15.0-CURRENT.qcow2.zst"
BASH="/usr/local/bin/bash"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
freebsd16-0c)
FreeBSD="16.0-CURRENT"
OSNAME="FreeBSD $FreeBSD"
OSv="freebsd14.0"
URLxz="$FREEBSD_SNAP/$FreeBSD/amd64/Latest/FreeBSD-$FreeBSD-amd64-BASIC-CI-ufs.raw.xz"
KSRC="$FREEBSD_SNAP/../amd64/$FreeBSD/src.txz"
;;
tumbleweed)
OSNAME="openSUSE Tumbleweed"
@@ -168,46 +196,73 @@ echo "CPU=\"$CPU\"" >> $ENV
sudo mkdir -p "/mnt/tests"
sudo chown -R $(whoami) /mnt/tests
DISK="/dev/zvol/zpool/openzfs"
sudo zfs create -ps -b 64k -V 80g zpool/openzfs
while true; do test -b $DISK && break; sleep 1; done
# we are downloading via axel; curl and wget are mostly slower and
# require more return value checking
IMG="/mnt/tests/cloudimg.qcow2"
if [ ! -z "$URLzs" ]; then
echo "Loading image $URLzs ..."
time axel -q -o "$IMG.zst" "$URLzs"
zstd -q -d --rm "$IMG.zst"
IMG="/mnt/tests/cloud-image"
if [ ! -z "$URLxz" ]; then
echo "Loading $URLxz ..."
time axel -q -o "$IMG" "$URLxz"
echo "Loading $KSRC ..."
time axel -q -o ~/src.txz $KSRC
else
echo "Loading image $URL ..."
echo "Loading $URL ..."
time axel -q -o "$IMG" "$URL"
fi
DISK="/dev/zvol/zpool/openzfs"
FORMAT="raw"
sudo zfs create -ps -b 64k -V 80g zpool/openzfs
while true; do test -b $DISK && break; sleep 1; done
echo "Importing VM image to zvol..."
sudo qemu-img dd -f qcow2 -O raw if=$IMG of=$DISK bs=4M
if [ ! -z "$URLxz" ]; then
xzcat -T0 $IMG | sudo dd of=$DISK bs=4M
else
sudo qemu-img dd -f qcow2 -O raw if=$IMG of=$DISK bs=4M
fi
rm -f $IMG
PUBKEY=$(cat ~/.ssh/id_ed25519.pub)
cat <<EOF > /tmp/user-data
if [ ${OS:0:7} != "freebsd" ]; then
cat <<EOF > /tmp/user-data
#cloud-config
fqdn: $OS
hostname: $OS
users:
- name: root
shell: $BASH
- name: zfs
sudo: ALL=(ALL) NOPASSWD:ALL
shell: $BASH
ssh_authorized_keys:
- $PUBKEY
- name: root
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
- name: zfs
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
ssh_authorized_keys:
- $PUBKEY
# Workaround for Alpine Linux.
lock_passwd: false
passwd: '*'
packages:
- sudo
- bash
growpart:
mode: auto
devices: ['/']
ignore_growroot_disabled: false
EOF
else
cat <<EOF > /tmp/user-data
#cloud-config
hostname: $OS
# minimized config without sudo for nuageinit of FreeBSD
growpart:
mode: auto
devices: ['/']
ignore_growroot_disabled: false
EOF
fi
sudo virsh net-update default add ip-dhcp-host \
"<host mac='52:54:00:83:79:00' ip='192.168.122.10'/>" --live --config
@@ -223,15 +278,8 @@ sudo virt-install \
--graphics none \
--network bridge=virbr0,model=$NIC,mac='52:54:00:83:79:00' \
--cloud-init user-data=/tmp/user-data \
--disk $DISK,bus=virtio,cache=none,format=$FORMAT,driver.discard=unmap \
--import --noautoconsole >/dev/null
# enable KSM on Linux
if [ ${OS:0:7} != "freebsd" ]; then
sudo virsh dommemstat --domain "openzfs" --period 5
sudo virsh node-memory-tune 100 50 1
echo 1 | sudo tee /sys/kernel/mm/ksm/run > /dev/null
fi
--disk $DISK,bus=virtio,cache=none,format=raw,driver.discard=unmap \
--import --noautoconsole ${OPTS[0]} ${OPTS[1]} >/dev/null
# Give the VMs hostnames so we don't have to refer to them with
# hardcoded IP addresses.
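The `${OPTS[0]} ${OPTS[1]}` arguments are deliberately left unquoted: an empty element then disappears from the command line instead of being passed to `virt-install` as an empty argument. A small sketch of that behaviour, using a hypothetical `count_args` helper in place of `virt-install`:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for virt-install: report how many arguments
# actually reach the command.
count_args() { echo "$#"; }

# Default case: both elements empty, so nothing is appended.
OPTS=("" "")
count_args ${OPTS[0]} ${OPTS[1]}

# Debian 13 case from the script: two extra arguments are appended.
OPTS=("--boot" "uefi=on")
count_args ${OPTS[0]} ${OPTS[1]}
```

Quoting the expansions would instead pass two empty strings in the default case, which a command-line parser would most likely reject.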
@@ -252,3 +300,49 @@ StrictHostKeyChecking no
# small timeout, used in while loops later
ConnectTimeout 1
EOF
if [ ${OS:0:7} != "freebsd" ]; then
# enable KSM on Linux
sudo virsh dommemstat --domain "openzfs" --period 5
sudo virsh node-memory-tune 100 50 1
echo 1 | sudo tee /sys/kernel/mm/ksm/run > /dev/null
else
# on FreeBSD we need some more init stuff, because of nuageinit
BASH="/usr/local/bin/bash"
while pidof /usr/bin/qemu-system-x86_64 >/dev/null; do
ssh 2>/dev/null root@vm0 "uname -a" && break
done
ssh root@vm0 "env IGNORE_OSVERSION=yes pkg install -y bash ca_root_nss git qemu-guest-agent python3 py311-cloud-init"
ssh root@vm0 "chsh -s $BASH root"
ssh root@vm0 'sysrc qemu_guest_agent_enable="YES"'
ssh root@vm0 'sysrc cloudinit_enable="YES"'
ssh root@vm0 "pw add user zfs -w no -s $BASH"
ssh root@vm0 'mkdir -p ~zfs/.ssh'
ssh root@vm0 'echo "zfs ALL=(ALL:ALL) NOPASSWD: ALL" >> /usr/local/etc/sudoers'
ssh root@vm0 'echo "PubkeyAuthentication yes" >> /etc/ssh/sshd_config'
scp ~/.ssh/id_ed25519.pub "root@vm0:~zfs/.ssh/authorized_keys"
ssh root@vm0 'chown -R zfs ~zfs'
ssh root@vm0 'service sshd restart'
scp ~/src.txz "root@vm0:/tmp/src.txz"
ssh root@vm0 'tar -C / -zxf /tmp/src.txz'
fi
#
# Config for Alpine Linux similar to FreeBSD.
#
if [ ${OS:0:6} == "alpine" ]; then
while pidof /usr/bin/qemu-system-x86_64 >/dev/null; do
ssh 2>/dev/null zfs@vm0 "uname -a" && break
done
# Enable community and testing repositories.
ssh zfs@vm0 "sudo rm -rf /etc/apk/repositories"
ssh zfs@vm0 "sudo setup-apkrepos -c1"
ssh zfs@vm0 "echo '@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing' | sudo tee -a /etc/apk/repositories"
# Upgrade to edge or latest-stable.
#ssh zfs@vm0 "sudo sed -i 's#/v[0-9]\+\.[0-9]\+/#/edge/#g' /etc/apk/repositories"
#ssh zfs@vm0 "sudo sed -i 's#/v[0-9]\+\.[0-9]\+/#/latest-stable/#g' /etc/apk/repositories"
# Update and upgrade after repository setup.
ssh zfs@vm0 "sudo apk update"
ssh zfs@vm0 "sudo apk add --upgrade apk-tools"
ssh zfs@vm0 "sudo apk upgrade --available"
fi
+58 -8
@@ -10,6 +10,32 @@
set -eu
function alpine() {
echo "##[group]Install Development Tools"
sudo apk add \
acl alpine-sdk attr autoconf automake bash build-base clang21 coreutils \
cpio cryptsetup curl curl-dev dhcpcd eudev eudev-dev eudev-libs findutils \
fio gawk gdb gettext-dev git grep jq libaio libaio-dev libcurl \
libtirpc-dev libtool libunwind libunwind-dev linux-headers linux-tools \
linux-virt linux-virt-dev lsscsi m4 make nfs-utils openssl-dev parted \
pax procps py3-cffi py3-distlib py3-packaging py3-setuptools python3 \
python3-dev qemu-guest-agent rng-tools rsync samba samba-server sed \
strace sysstat util-linux util-linux-dev wget words xfsprogs xxhash \
zlib-dev pamtester@testing
echo "##[endgroup]"
echo "##[group]Switch to eudev"
sudo setup-devd udev
echo "##[endgroup]"
echo "##[group]Install ksh93 from Source"
git clone --depth 1 https://github.com/ksh93/ksh.git /tmp/ksh
cd /tmp/ksh
./bin/package make
sudo ./bin/package install /
echo "##[endgroup]"
}
function archlinux() {
echo "##[group]Running pacman -Syu"
sudo btrfs filesystem resize max /
@@ -20,14 +46,19 @@ function archlinux() {
sudo pacman -Sy --noconfirm base-devel bc cpio cryptsetup dhclient dkms \
fakeroot fio gdb inetutils jq less linux linux-headers lsscsi nfs-utils \
parted pax perf python-packaging python-setuptools qemu-guest-agent ksh \
samba sysstat rng-tools rsync wget xxhash
samba strace sysstat rng-tools rsync wget xxhash
echo "##[endgroup]"
}
function debian() {
export DEBIAN_FRONTEND="noninteractive"
echo "##[group]Wait for cloud-init to finish"
cloud-init status --wait
echo "##[endgroup]"
echo "##[group]Running apt-get update+upgrade"
sudo sed -i '/[[:alpha:]]-backports/d' /etc/apt/sources.list
sudo apt-get update -y
sudo apt-get upgrade -y
echo "##[endgroup]"
@@ -40,9 +71,10 @@ function debian() {
libelf-dev libffi-dev libmount-dev libpam0g-dev libselinux-dev libssl-dev \
libtool libtool-bin libudev-dev libunwind-dev linux-headers-$(uname -r) \
lsscsi nfs-kernel-server pamtester parted python3 python3-all-dev \
python3-cffi python3-dev python3-distlib python3-packaging \
python3-cffi python3-dev python3-distlib python3-packaging libtirpc-dev \
python3-setuptools python3-sphinx qemu-guest-agent rng-tools rpm2cpio \
rsync samba sysstat uuid-dev watchdog wget xfslibs-dev xxhash zlib1g-dev
rsync samba strace sysstat uuid-dev watchdog wget xfslibs-dev xxhash \
zlib1g-dev
echo "##[endgroup]"
}
@@ -51,7 +83,7 @@ function freebsd() {
echo "##[group]Install Development Tools"
sudo pkg install -y autoconf automake autotools base64 checkbashisms fio \
gdb gettext gettext-runtime git gmake gsed jq ksh93 lcov libtool lscpu \
gdb gettext gettext-runtime git gmake gsed jq ksh lcov libtool lscpu \
pkgconf python python3 pamtester pamtester qemu-guest-agent rsync xxhash
sudo pkg install -xy \
'^samba4[[:digit:]]+$' \
@@ -86,8 +118,13 @@ function rhel() {
libuuid-devel lsscsi mdadm nfs-utils openssl-devel pam-devel pamtester \
parted perf python3 python3-cffi python3-devel python3-packaging \
kernel-devel python3-setuptools qemu-guest-agent rng-tools rpcgen \
rpm-build rsync samba sysstat systemd watchdog wget xfsprogs-devel xxhash \
zlib-devel
rpm-build rsync samba strace sysstat systemd watchdog wget xfsprogs-devel \
xxhash zlib-devel
# These are needed for building Lustre. We only install these on EL VMs since
# we don't plan to test build Lustre on other platforms.
sudo dnf install -y libnl3-devel libyaml-devel libmount-devel
echo "##[endgroup]"
}
@@ -103,7 +140,7 @@ function install_fedora_experimental_kernel {
our_version="$1"
sudo dnf -y copr enable @kernel-vanilla/stable
sudo dnf -y copr enable @kernel-vanilla/mainline
all="$(sudo dnf list --showduplicates kernel-*)"
all="$(sudo dnf list --showduplicates kernel-* python3-perf* perf* bpftool*)"
echo "Available versions:"
echo "$all"
@@ -138,6 +175,9 @@ case "$1" in
sudo dnf install -y kernel-abi-stablelists
echo "##[endgroup]"
;;
alpine*)
alpine
;;
archlinux)
archlinux
;;
@@ -186,6 +226,16 @@ test -z "${ONLY_DEPS:-}" || exit 0
# Start services
echo "##[group]Enable services"
case "$1" in
alpine*)
sudo -E rc-update add qemu-guest-agent
sudo -E rc-update add nfs
sudo -E rc-update add samba
sudo -E rc-update add dhcpcd
# Remove services related to cloud-init.
sudo -E rc-update del cloud-init default
sudo -E rc-update del cloud-final default
sudo -E rc-update del cloud-config default
;;
freebsd*)
# add virtio things
echo 'virtio_load="YES"' | sudo -E tee -a /boot/loader.conf
@@ -241,7 +291,7 @@ case "$1" in
esac
case "$1" in
archlinux|freebsd*)
alpine*|archlinux|freebsd*)
true
;;
*)
+20 -3
@@ -5,12 +5,13 @@
#
# Usage:
#
# qemu-4-build-vm.sh OS [--enable-debug][--dkms][--poweroff]
# [--release][--repo][--tarball]
# qemu-4-build-vm.sh OS [--enable-debug][--dkms][--patch-level NUM]
# [--poweroff][--release][--repo][--tarball]
#
# OS: OS name like 'fedora41'
# --enable-debug: Build RPMs with '--enable-debug' (for testing)
# --dkms: Build DKMS RPMs as well
# --patch-level NUM: Use a custom patch level number for packages.
# --poweroff: Power-off the VM after building
# --release Build zfs-release*.rpm as well
# --repo After building everything, copy RPMs into /tmp/repo
@@ -21,6 +22,7 @@
ENABLE_DEBUG=""
DKMS=""
PATCH_LEVEL=""
POWEROFF=""
RELEASE=""
REPO=""
@@ -35,6 +37,11 @@ while [[ $# -gt 0 ]]; do
DKMS=1
shift
;;
--patch-level)
PATCH_LEVEL=$2
shift
shift
;;
--poweroff)
POWEROFF=1
shift
@@ -215,6 +222,10 @@ function rpm_build_and_install() {
run ./autogen.sh
echo "##[endgroup]"
if [ -n "$PATCH_LEVEL" ] ; then
sed -i -E 's/(Release:\s+)1/\1'$PATCH_LEVEL'/g' META
fi
echo "##[group]Configure"
run ./configure --enable-debuginfo $extra
echo "##[endgroup]"
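The `--patch-level` handling rewrites the `Release:` field of `META` before configuring. The sketch below shows the same substitution on a scratch file; `set_patch_level` is a hypothetical helper, and the sample `META` contents in the usage note are illustrative rather than copied from the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace a 'Release: 1' value in a META-style file
# with the given patch level, keeping the original column alignment.
set_patch_level() {
    local meta="$1" level="$2" tmp
    tmp="$(mktemp)"
    # Capture 'Release:' plus its padding, then swap the trailing 1.
    sed -E "s/(Release:[[:space:]]+)1/\1${level}/" "$meta" > "$tmp"
    cat "$tmp" > "$meta"
    rm -f "$tmp"
}
```

For a file containing `Release:       1`, calling `set_patch_level META 7` leaves `Release:       7` and does not touch the other fields.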
@@ -328,7 +339,13 @@ fi
# almalinux9.5
# fedora42
source /etc/os-release
sudo hostname "$ID$VERSION_ID"
if which hostnamectl &> /dev/null ; then
# Fedora 42+ use hostnamectl
sudo hostnamectl set-hostname "$ID$VERSION_ID"
sudo hostnamectl set-hostname --pretty "$ID$VERSION_ID"
else
sudo hostname "$ID$VERSION_ID"
fi
# save some sysinfo
uname -a > /var/tmp/uname.txt
+45 -16
@@ -12,16 +12,26 @@ source /var/tmp/env.txt
# wait for poweroff to succeed
PID=$(pidof /usr/bin/qemu-system-x86_64)
tail --pid=$PID -f /dev/null
sudo virsh undefine openzfs
sudo virsh undefine --nvram openzfs
# cpu pinning
CPUSET=("0,1" "2,3")
# additional options for virt-install
OPTS[0]=""
OPTS[1]=""
case "$OS" in
freebsd*)
# FreeBSD needs only 6GiB
RAM=6
;;
debian13)
RAM=8
# Boot Debian 13 with uefi=on and secureboot=off (ZFS Kernel Module not signed)
OPTS[0]="--boot"
OPTS[1]="firmware=efi,firmware.feature0.name=secure-boot,firmware.feature0.enabled=no"
;;
*)
# Linux needs more memory, but can be optimized to share it via KSM
RAM=8
@@ -48,13 +58,21 @@ for ((i=1; i<=VMs; i++)); do
fqdn: vm$i
users:
- name: root
shell: $BASH
- name: zfs
sudo: ALL=(ALL) NOPASSWD:ALL
shell: $BASH
ssh_authorized_keys:
- $PUBKEY
- name: root
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
- name: zfs
shell: /bin/bash
sudo: ['ALL=(ALL) NOPASSWD:ALL']
ssh_authorized_keys:
- $PUBKEY
# Workaround for Alpine Linux.
lock_passwd: false
passwd: '*'
packages:
- sudo
- bash
growpart:
mode: auto
@@ -79,7 +97,7 @@ EOF
--network bridge=virbr0,model=$NIC,mac="52:54:00:83:79:0$i" \
--disk $DISK-system,bus=virtio,cache=none,format=$FORMAT,driver.discard=unmap \
--disk $DISK-tests,bus=virtio,cache=none,format=$FORMAT,driver.discard=unmap \
--import --noautoconsole >/dev/null
--import --noautoconsole ${OPTS[0]} ${OPTS[1]}
done
# generate some memory stats
@@ -98,19 +116,30 @@ echo '*/5 * * * * /root/cronjob.sh' > crontab.txt
sudo crontab crontab.txt
rm crontab.txt
# check if the machines are okay
echo "Waiting for vm's to come up... (${VMs}x CPU=$CPU RAM=$RAM)"
for ((i=1; i<=VMs; i++)); do
.github/workflows/scripts/qemu-wait-for-vm.sh vm$i
done
echo "All $VMs VMs are up now."
# Save the VM's serial output (ttyS0) to /var/tmp/console.txt
# - ttyS0 on the VM corresponds to a local /dev/pts/N entry
# - use 'virsh ttyconsole' to lookup the /dev/pts/N entry
for ((i=1; i<=VMs; i++)); do
mkdir -p $RESPATH/vm$i
read "pty" <<< $(sudo virsh ttyconsole vm$i)
# Create the file so we can tail it, even if there's no output.
touch $RESPATH/vm$i/console.txt
sudo nohup bash -c "cat $pty > $RESPATH/vm$i/console.txt" &
# Write all VM boot lines to the console to aid in debugging failed boots.
# The boot lines from all the VMs will be munged together, so prepend each
# line with the vm hostname (like 'vm1:').
(while IFS=$'\n' read -r line; do echo "vm$i: $line" ; done < <(sudo tail -f $RESPATH/vm$i/console.txt)) &
done
echo "Console logging for ${VMs}x $OS started."
# check if the machines are okay
echo "Waiting for vm's to come up... (${VMs}x CPU=$CPU RAM=$RAM)"
for ((i=1; i<=VMs; i++)); do
.github/workflows/scripts/qemu-wait-for-vm.sh vm$i
done
echo "All $VMs VMs are up now."
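Because the console output of all VMs is interleaved on the runner, each line is tagged with its VM name. A minimal sketch of that tagging step, assuming a hypothetical `prefix_lines` helper that reads from stdin instead of tailing `console.txt`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend "<host>: " to every line read from stdin.
prefix_lines() {
    local host="$1" line
    while IFS= read -r line; do
        # printf avoids any backslash interpretation in the console text.
        printf '%s: %s\n' "$host" "$line"
    done
}
```

In the spirit of the loop above, something like `sudo tail -f "$RESPATH/vm1/console.txt" | prefix_lines vm1 &` would stream the tagged boot log of one VM.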
+51
@@ -0,0 +1,51 @@
#!/usr/bin/env bash
######################################################################
# 6) Test if Lustre can still build against ZFS
######################################################################
set -e
# Build from the latest Lustre tag rather than the master branch. We do this
# under the assumption that master is going to have a lot of churn and thus
# will be more prone to breaking the build than a point release. We don't want
# ZFS PRs reporting bad test results simply because upstream Lustre
# accidentally broke their build.
#
# Skip any RC tags, or any tags where the last version digit is 50 or more.
# Versions with 50 or more are development versions of Lustre.
repo=https://github.com/lustre/lustre-release.git
tag="$(git ls-remote --refs --exit-code --sort=version:refname --tags $repo | \
awk -F '_' '/-RC/{next}; /refs\/tags\/v/{if ($NF < 50){print}}' | \
tail -n 1 | sed 's/.*\///')"
echo "Cloning Lustre tag $tag"
git clone --depth 1 --branch "$tag" "$repo"
cd lustre-release
# Include Lustre patches to build against master/zfs-2.4.x. Once these
# patches are merged we can remove these lines.
patches=('https://review.whamcloud.com/changes/fs%2Flustre-release~62101/revisions/2/patch?download'
'https://review.whamcloud.com/changes/fs%2Flustre-release~63267/revisions/9/patch?download')
for p in "${patches[@]}" ; do
curl $p | base64 -d > patch
patch -p1 < patch || true
done
echo "Configure Lustre"
./autogen.sh
# EL 9 needs '--disable-gss-keyring'
./configure --with-zfs --disable-gss-keyring
echo "Building Lustre RPMs"
make rpms
ls *.rpm
# There's only a handful of Lustre RPMs we actually need to install
lustrerpms="$(ls *.rpm | grep -E 'kmod-lustre-osd-zfs-[0-9]|kmod-lustre-[0-9]|lustre-osd-zfs-mount-[0-9]')"
echo "Installing: $lustrerpms"
sudo dnf -y install $lustrerpms
sudo modprobe -v lustre
# Should see some Lustre lines in dmesg
sudo dmesg | grep -Ei 'lnet|lustre'
+127 -13
@@ -4,7 +4,10 @@
# 6) load openzfs module and run the tests
#
# called on runner: qemu-6-tests.sh
# called on qemu-vm: qemu-6-tests.sh $OS $2/$3
# called on qemu-vm: qemu-6-tests.sh $OS $2 $3 [--lustre|--builtin] [quick|default]
#
# --lustre: Test build lustre in addition to the normal tests
# --builtin: Test build ZFS as a kernel built-in in addition to the normal tests
######################################################################
set -eu
@@ -21,11 +24,13 @@ function prefix() {
S=$((DIFF-(M*60)))
CTR=$(cat /tmp/ctr)
echo $LINE| grep -q "^Test[: ]" && CTR=$((CTR+1)) && echo $CTR > /tmp/ctr
echo $LINE| grep -q '^\[.*] Test[: ]' && CTR=$((CTR+1)) && echo $CTR > /tmp/ctr
BASE="$HOME/work/zfs/zfs"
COLOR="$BASE/scripts/zfs-tests-color.sh"
CLINE=$(echo $LINE| grep "^Test[ :]" | sed -e 's|/usr/local|/usr|g' \
CLINE=$(echo $LINE| grep '^\[.*] Test[: ]' \
| sed -e 's|^\[.*] Test|Test|g' \
| sed -e 's|/usr/local|/usr|g' \
| sed -e 's| /usr/share/zfs/zfs-tests/tests/| |g' | $COLOR)
if [ -z "$CLINE" ]; then
printf "vm${ID}: %s\n" "$LINE"
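The updated patterns expect each test result line to start with a bracketed prefix before `Test:`. The sketch below applies the same `grep` and `sed` steps to a single line; `clean_test_line` is a hypothetical helper, the colorizer stage is left out, and the exact contents of the bracketed prefix in the usage note are an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only test result lines from stdin, drop the
# leading "[...]" prefix, and shorten the test path as prefix() does.
clean_test_line() {
    grep '^\[.*] Test[: ]' \
        | sed -e 's|^\[.*] Test|Test|g' \
              -e 's|/usr/local|/usr|g' \
              -e 's| /usr/share/zfs/zfs-tests/tests/| |g'
}
```

A line such as `[00:01:02] Test: /usr/local/share/zfs/zfs-tests/tests/functional/foo/bar (run as root) [00:03] [PASS]` comes out as `Test: functional/foo/bar (run as root) [00:03] [PASS]`, while lines without the prefix produce no output.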
@@ -36,6 +41,54 @@ function prefix() {
fi
}
function do_lustre_build() {
local rc=0
$HOME/zfs/.github/workflows/scripts/qemu-6-lustre-tests-vm.sh &> /var/tmp/lustre.txt || rc=$?
echo "$rc" > /var/tmp/lustre-exitcode.txt
if [ "$rc" != "0" ] ; then
echo "$rc" > /var/tmp/tests-exitcode.txt
fi
}
export -f do_lustre_build
# Test build ZFS into the kernel directly
function do_builtin_build() {
local rc=0
# Get the current full kernel version (like '6.18.8')
fullver=$(uname -r | grep -Eo '^[0-9]+\.[0-9]+\.[0-9]+')
# Get just the major ('6')
major=$(echo $fullver | grep -Eo '^[0-9]+')
(
set -e
wget https://cdn.kernel.org/pub/linux/kernel/v${major}.x/linux-$fullver.tar.xz
tar -xf $HOME/linux-$fullver.tar.xz
cd $HOME/linux-$fullver
make tinyconfig
./scripts/config --enable EFI_PARTITION
./scripts/config --enable BLOCK
# BTRFS_FS is easiest config option to enable CONFIG_ZLIB_INFLATE|DEFLATE
./scripts/config --enable BTRFS_FS
yes "" | make oldconfig
make prepare
cd $HOME/zfs
./configure --with-linux=$HOME/linux-$fullver --enable-linux-builtin --enable-debug
./copy-builtin $HOME/linux-$fullver
cd $HOME/linux-$fullver
./scripts/config --enable ZFS
yes "" | make oldconfig
make -j `nproc`
) &> /var/tmp/builtin.txt || rc=$?
echo "$rc" > /var/tmp/builtin-exitcode.txt
if [ "$rc" != "0" ] ; then
echo "$rc" > /var/tmp/tests-exitcode.txt
fi
}
export -f do_builtin_build
# called directly on the runner
if [ -z ${1:-} ]; then
cd "/var/tmp"
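The version handling in `do_builtin_build` can be read as a small pure function: trim the running kernel release to `X.Y.Z`, then derive the `vN.x` directory from the major number. The sketch below is illustrative only and not part of the commit; `kernel_tarball_url` is a hypothetical helper name, and it takes the release string as an argument instead of calling `uname -r`.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not part of the commit.
# kernel_tarball_url is a hypothetical helper; $1 is a `uname -r`-style string.
kernel_tarball_url() {
    local release="$1" fullver major
    # Keep only the leading X.Y.Z (e.g. '6.18.8' from '6.18.8-200.fc43.x86_64')
    fullver=$(echo "$release" | grep -Eo '^[0-9]+\.[0-9]+\.[0-9]+') || return 1
    # The major number selects the vN.x directory on cdn.kernel.org
    major=${fullver%%.*}
    echo "https://cdn.kernel.org/pub/linux/kernel/v${major}.x/linux-${fullver}.tar.xz"
}
```

A release string without three numeric components makes the helper return non-zero instead of building a bogus URL. A patch level of 0 would produce a `linux-X.Y.0.tar.xz` name, which may not match how the initial release tarball is named; the sketch does not handle that case.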
@@ -47,8 +100,24 @@ if [ -z ${1:-} ]; then
for ((i=1; i<=VMs; i++)); do
IP="192.168.122.1$i"
# We do an additional test build of Lustre against ZFS if we're vm2
# on almalinux*. At the time of writing, the vm2 tests were
# completing roughly 15min before the vm1 tests, so it makes sense
# to have vm2 do the build.
#
# In addition, we do an additional test build of ZFS as a Linux
# kernel built-in on Fedora. Again, we do it on vm2 to exploit vm2's
# early finish time.
extra=""
if [[ "$OS" == almalinux* ]] && [[ "$i" == "2" ]] ; then
extra="--lustre"
elif [[ "$OS" == fedora* ]] && [[ "$i" == "2" ]] ; then
extra="--builtin"
fi
daemonize -c /var/tmp -p vm${i}.pid -o vm${i}log.txt -- \
$SSH zfs@$IP $TESTS $OS $i $VMs $CI_TYPE
$SSH zfs@$IP $TESTS $OS $i $VMs $extra $CI_TYPE
# handle output line by line and add info prefix
stdbuf -oL tail -fq vm${i}log.txt \
| while read -r line; do prefix "$i" "$line"; done &
@@ -68,9 +137,35 @@ if [ -z ${1:-} ]; then
exit 0
fi
# this part runs inside qemu vm
#############################################
# Everything from here on runs inside qemu vm
#############################################
# Process cmd line args
OS="$1"
shift
NUM="$1"
shift
DEN="$1"
shift
BUILD_LUSTRE=0
BUILD_BUILTIN=0
if [ "$1" == "--lustre" ] ; then
BUILD_LUSTRE=1
shift
elif [ "$1" == "--builtin" ] ; then
BUILD_BUILTIN=1
shift
fi
if [ "$1" == "quick" ] ; then
export RUNFILES="sanity.run"
fi
export PATH="$PATH:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/sbin:/usr/local/bin"
case "$1" in
case "$OS" in
freebsd*)
TDIR="/usr/local/share/zfs"
sudo kldstat -n zfs 2>/dev/null && sudo kldunload zfs
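The new in-VM argument layout is `OS NUM DEN [--lustre|--builtin] [quick|default]`. A minimal sketch of the same parsing is shown below, wrapped in a hypothetical function named `parse_vm_args` so it can be exercised in isolation. It is illustrative only and not part of the commit; the use of `${1:-}` is a defensive variation assumed here because the script runs under `set -eu`.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not part of the commit.
# parse_vm_args is a hypothetical wrapper around the positional layout:
#   OS NUM DEN [--lustre|--builtin] [quick|default]
parse_vm_args() {
    OS="$1"; NUM="$2"; DEN="$3"
    shift 3
    BUILD_LUSTRE=0
    BUILD_BUILTIN=0
    RUNFILES=""
    # ${1:-} keeps the checks safe under `set -u` when no optional args remain
    case "${1:-}" in
    --lustre)  BUILD_LUSTRE=1;  shift ;;
    --builtin) BUILD_BUILTIN=1; shift ;;
    esac
    if [ "${1:-}" = "quick" ]; then
        RUNFILES="sanity.run"
    fi
    # NUM/DEN becomes the tag passed to zfs-tests.sh via -T
    TAGS="$NUM/$DEN"
}
```

For example, `parse_vm_args fedora43 2 2 --builtin quick` sets `BUILD_BUILTIN=1`, `RUNFILES=sanity.run` and `TAGS=2/2`. At most one of the two build flags is consumed, which mirrors the `if`/`elif` in the hunk.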
@@ -93,23 +188,42 @@ case "$1" in
;;
esac
# enable io_uring on el9/el10
case "$1" in
# Distribution-specific settings.
case "$OS" in
almalinux9|almalinux10|centos-stream*)
# Enable io_uring on Enterprise Linux 9 and 10.
sudo sysctl kernel.io_uring_disabled=0 > /dev/null
;;
alpine*)
# Ensure `/etc/zfs/zpool.cache` exists.
sudo mkdir -p /etc/zfs
sudo touch /etc/zfs/zpool.cache
sudo chmod 644 /etc/zfs/zpool.cache
;;
esac
# Lustre calls a number of exported ZFS module symbols. To make sure we don't
# change the symbols and break Lustre, do a quick Lustre build of the latest
# released Lustre against ZFS.
#
# Note that we do the Lustre test build in parallel with ZTS. ZTS isn't very
# CPU intensive, so we can use idle CPU cycles "guilt free" for the build.
# The Lustre build on its own takes ~15min.
if [ "$BUILD_LUSTRE" == "1" ] ; then
do_lustre_build &
elif [ "$BUILD_BUILTIN" == "1" ] ; then
# Try building ZFS directly into the Linux kernel (not as a module)
do_builtin_build &
fi
# run functional tests and save exit code
cd /var/tmp
TAGS=$2/$3
if [ "$4" == "quick" ]; then
export RUNFILES="sanity.run"
fi
TAGS=$NUM/$DEN
sudo dmesg -c > dmesg-prerun.txt
mount > mount.txt
df -h > df-prerun.txt
$TDIR/zfs-tests.sh -vK -s 3GB -T $TAGS
$TDIR/zfs-tests.sh -vKO -s 3GB -T $TAGS
RV=$?
df -h > df-postrun.txt
echo $RV > tests-exitcode.txt
@@ -31,6 +31,12 @@ EOF
rm -f tmp$$
}
function showfile_tail() {
echo "##[group]$2 (final lines)"
tail -n 80 $1
echo "##[endgroup]"
}
# overview
cat /tmp/summary.txt
echo ""
@@ -46,6 +52,32 @@ fi
echo -e "\nFull logs for download:\n $1\n"
for ((i=1; i<=VMs; i++)); do
# Print Lustre build test results (the build is only done on vm2)
if [ -f vm$i/lustre-exitcode.txt ] ; then
rv=$(< vm$i/lustre-exitcode.txt)
if [ $rv = 0 ]; then
vm="vm$i"
else
vm="vm$i"
touch /tmp/have_failed_tests
fi
file="vm$i/lustre.txt"
test -s "$file" && showfile_tail "$file" "$vm: Lustre build"
fi
if [ -f vm$i/builtin-exitcode.txt ] ; then
rv=$(< vm$i/builtin-exitcode.txt)
if [ $rv = 0 ]; then
vm="vm$i"
else
vm="vm$i"
touch /tmp/have_failed_tests
fi
file="vm$i/builtin.txt"
test -s "$file" && showfile_tail "$file" "$vm: Linux built-in build"
fi
rv=$(cat vm$i/tests-exitcode.txt)
if [ $rv = 0 ]; then
+33 -8
@@ -4,7 +4,11 @@
#
# USAGE:
#
# ./qemu-test-repo-vm [URL]
# ./qemu-test-repo-vm [--lookup] [URL]
#
# --lookup: When testing a repo, only look up the latest package versions,
# don't try to install them. Installing all of them takes over
# an hour, so this is much quicker.
#
# URL: URL to use instead of http://download.zfsonlinux.org
# If blank, use the default repo from zfs-release RPM.
@@ -15,6 +19,13 @@ source /etc/os-release
OS="$ID"
VERSION="$VERSION_ID"
LOOKUP=""
if [ -n "$1" ] && [ "$1" == "--lookup" ] ; then
LOOKUP=1
shift
fi
ALTHOST=""
if [ -n "$1" ] ; then
ALTHOST="$1"
@@ -42,7 +53,19 @@ function test_install {
sudo sed -i "s;baseurl=http://download.zfsonlinux.org;baseurl=$host;g" /etc/yum.repos.d/zfs.repo
fi
sudo dnf -y install $args zfs zfs-test
baseurl=$(grep -A 5 "\[$repo\]" /etc/yum.repos.d/zfs.repo | awk -F'=' '/baseurl=/{print $2; exit}')
# Just do a version lookup - don't try to install any RPMs
if [ "$LOOKUP" == "1" ] ; then
package="$(dnf list $args zfs | tail -n 1 | awk '{print $2}')"
echo "$repo ${package} $baseurl" >> $SUMMARY
return
fi
if ! sudo dnf -y install $args zfs zfs-test ; then
echo "$repo ${package}...[FAILED] $baseurl" >> $SUMMARY
return
fi
# Load modules and create a simple pool as a sanity test.
sudo /usr/share/zfs/zfs.sh -r
@@ -51,7 +74,6 @@ function test_install {
sudo zpool status
# Print out repo name, rpm installed (kmod or dkms), and repo URL
baseurl=$(grep -A 5 "\[$repo\]" /etc/yum.repos.d/zfs.repo | awk -F'=' '/baseurl=/{print $2; exit}')
package=$(sudo rpm -qa | grep zfs | grep -E 'kmod|dkms')
echo "$repo $package $baseurl" >> $SUMMARY
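`test_install` now resolves `baseurl` before the lookup/install branch, so both paths can report it. The extraction is a `grep -A 5` on the section header followed by an `awk` field split. The sketch below isolates that step. It is illustrative only and not part of the commit; `repo_baseurl` is a hypothetical helper name, and it reads an arbitrary file path instead of `/etc/yum.repos.d/zfs.repo`.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not part of the commit.
# repo_baseurl is a hypothetical helper: print the first baseurl= value found
# within five lines after the "[repo]" section header of a .repo file.
repo_baseurl() {
    local repo="$1" file="$2"
    grep -A 5 "\[$repo\]" "$file" | awk -F'=' '/baseurl=/{print $2; exit}'
}
```

Because the header pattern includes the closing bracket, `zfs` does not match the `[zfs-kmod]` section. Two limits of this approach are worth keeping in mind: a `baseurl` more than five lines below its header is missed, and a URL that itself contains `=` would be cut at that character.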
@@ -70,16 +92,19 @@ almalinux*)
name=$(curl -Ls $url | grep 'dnf install' | grep -Eo 'zfs-release-[0-9]+-[0-9]+')
sudo dnf -y install https://zfsonlinux.org/epel/$name$(rpm --eval "%{dist}").noarch.rpm 2>&1
sudo rpm -qi zfs-release
test_install zfs $ALTHOST
test_install zfs-kmod $ALTHOST
test_install zfs-testing $ALTHOST
test_install zfs-testing-kmod $ALTHOST
for i in zfs zfs-kmod zfs-testing zfs-testing-kmod zfs-latest \
zfs-latest-kmod zfs-legacy zfs-legacy-kmod zfs-2.2 \
zfs-2.2-kmod zfs-2.3 zfs-2.3-kmod zfs-2.4 zfs-2.4-kmod; do
test_install $i $ALTHOST
done
;;
fedora*)
url='https://raw.githubusercontent.com/openzfs/openzfs-docs/refs/heads/master/docs/Getting%20Started/Fedora/index.rst'
name=$(curl -Ls $url | grep 'dnf install' | grep -Eo 'zfs-release-[0-9]+-[0-9]+')
sudo dnf -y install -y https://zfsonlinux.org/fedora/$name$(rpm --eval "%{dist}").noarch.rpm
test_install zfs $ALTHOST
for i in zfs zfs-latest zfs-legacy zfs-2.2 zfs-2.3 zfs-2.4 ; do
test_install $i $ALTHOST
done
;;
esac
echo "##[endgroup]"
+52
@@ -0,0 +1,52 @@
name: smatch
on:
push:
pull_request:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
smatch:
runs-on: ubuntu-24.04
steps:
- name: Checkout smatch
uses: actions/checkout@v4
with:
repository: error27/smatch
ref: master
path: smatch
- name: Install smatch dependencies
run: |
sudo apt-get install -y llvm gcc make sqlite3 libsqlite3-dev libdbd-sqlite3-perl libssl-dev libtry-tiny-perl
- name: Make smatch
run: |
cd $GITHUB_WORKSPACE/smatch
make -j$(nproc)
- name: Checkout OpenZFS
uses: actions/checkout@v4
with:
ref: ${{ github.event.pull_request.head.sha }}
path: zfs
- name: Install OpenZFS dependencies
run: |
cd $GITHUB_WORKSPACE/zfs
sudo apt-get purge -y snapd google-chrome-stable firefox
ONLY_DEPS=1 .github/workflows/scripts/qemu-3-deps-vm.sh ubuntu24
- name: Autogen.sh OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
./autogen.sh
- name: Configure OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
./configure --enable-debug
- name: Make OpenZFS
run: |
cd $GITHUB_WORKSPACE/zfs
make -j$(nproc) CHECK="$GITHUB_WORKSPACE/smatch/smatch" CC=$GITHUB_WORKSPACE/smatch/cgcc | tee $GITHUB_WORKSPACE/smatch.log
- name: Smatch results log
run: |
grep -E 'error:|warn:|warning:' $GITHUB_WORKSPACE/smatch.log
+25 -8
@@ -32,11 +32,22 @@ on:
options:
- "Build RPMs"
- "Test repo"
patch_level:
type: string
required: false
default: ""
description: "(optional) patch level number"
repo_url:
type: string
required: false
default: ""
description: "(optional) repo URL (blank: use http://download.zfsonlinux.org)"
lookup:
type: boolean
required: false
default: false
description: "(optional) do version lookup only on repo test"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
@@ -47,7 +58,7 @@ jobs:
strategy:
fail-fast: false
matrix:
os: ['almalinux8', 'almalinux9', 'almalinux10', 'fedora41', 'fedora42']
os: ['almalinux8', 'almalinux9', 'almalinux10', 'fedora41', 'fedora42', 'fedora43']
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
@@ -55,20 +66,16 @@ jobs:
ref: ${{ github.event.pull_request.head.sha }}
- name: Setup QEMU
timeout-minutes: 10
run: .github/workflows/scripts/qemu-1-setup.sh
- name: Start build machine
timeout-minutes: 10
run: .github/workflows/scripts/qemu-2-start.sh ${{ matrix.os }}
- name: Install dependencies
timeout-minutes: 20
run: |
.github/workflows/scripts/qemu-3-deps.sh ${{ matrix.os }}
- name: Build modules or Test repo
timeout-minutes: 30
run: |
set -e
if [ "${{ github.event.inputs.test_type }}" == "Test repo" ] ; then
@@ -76,14 +83,24 @@ jobs:
.github/workflows/scripts/qemu-prepare-for-build.sh
mkdir -p /tmp/repo
ssh zfs@vm0 '$HOME/zfs/.github/workflows/scripts/qemu-test-repo-vm.sh' ${{ github.event.inputs.repo_url }}
EXTRA=""
if [ "${{ github.event.inputs.lookup }}" == 'true' ] ; then
EXTRA="--lookup"
fi
ssh zfs@vm0 '$HOME/zfs/.github/workflows/scripts/qemu-test-repo-vm.sh' $EXTRA ${{ github.event.inputs.repo_url }}
else
.github/workflows/scripts/qemu-4-build.sh --repo --release --dkms --tarball ${{ matrix.os }}
EXTRA=""
if [ -n "${{ github.event.inputs.patch_level }}" ] ; then
EXTRA="--patch-level ${{ github.event.inputs.patch_level }}"
fi
.github/workflows/scripts/qemu-4-build.sh $EXTRA \
--repo --release --dkms --tarball ${{ matrix.os }}
fi
- name: Prepare artifacts
if: always()
timeout-minutes: 10
run: |
rsync -a zfs@vm0:/tmp/repo /tmp || true
.github/workflows/scripts/replace-dupes-with-symlinks.sh /tmp/repo
+39 -40
@@ -5,21 +5,16 @@ on:
pull_request:
workflow_dispatch:
inputs:
include_stream9:
type: boolean
required: false
default: false
description: 'Test on CentOS 9 stream'
include_stream10:
type: boolean
required: false
default: false
description: 'Test on CentOS 10 stream'
fedora_kernel_ver:
type: string
required: false
default: ""
description: "(optional) Experimental kernel version to install on Fedora (like '6.14' or '6.13.3-0.rc3')"
specific_os:
type: string
required: false
default: ""
description: "(optional) Only run on this specific OS (like 'fedora42' or 'alpine3-23')"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -39,41 +34,45 @@ jobs:
- name: Generate OS config and CI type
id: os
run: |
FULL_OS='["almalinux8", "almalinux9", "almalinux10", "debian11", "debian12", "fedora41", "fedora42", "freebsd13-4r", "freebsd14-2s", "freebsd15-0c", "ubuntu22", "ubuntu24"]'
QUICK_OS='["almalinux8", "almalinux9", "almalinux10", "debian12", "fedora42", "freebsd14-2r", "ubuntu24"]'
ci_type="default"
# determine CI type when running on PR
ci_type="full"
if ${{ github.event_name == 'pull_request' }}; then
head=${{ github.event.pull_request.head.sha }}
base=${{ github.event.pull_request.base.sha }}
ci_type=$(python3 .github/workflows/scripts/generate-ci-type.py $head $base)
fi
if [ "$ci_type" == "quick" ]; then
os_selection="$QUICK_OS"
else
os_selection="$FULL_OS"
fi
if [ ${{ github.event.inputs.fedora_kernel_ver }} != "" ] ; then
# They specified a custom kernel version for Fedora. Use only
# Fedora runners.
case "$ci_type" in
quick)
os_selection='["almalinux8", "almalinux9", "almalinux10", "debian12", "fedora42", "freebsd15-0s", "ubuntu24"]'
;;
linux)
os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian11", "debian12", "debian13", "fedora41", "fedora42", "fedora43", "ubuntu22", "ubuntu24"]'
;;
freebsd)
os_selection='["freebsd13-5r", "freebsd14-2r", "freebsd14-3r", "freebsd13-5s", "freebsd14-3s", "freebsd15-0s", "freebsd16-0c"]'
;;
*)
# default list
os_selection='["almalinux8", "almalinux9", "almalinux10", "centos-stream9", "centos-stream10", "debian12", "debian13", "fedora42", "fedora43", "freebsd14-3r", "freebsd15-0s", "freebsd16-0c", "ubuntu22", "ubuntu24"]'
;;
esac
if ${{ github.event.inputs.fedora_kernel_ver != '' }}; then
# They specified a custom kernel version for Fedora.
# Use only Fedora runners.
os_json=$(echo ${os_selection} | jq -c '[.[] | select(startswith("fedora"))]')
elif ${{ github.event.inputs.specific_os != '' }}; then
# Use only the specified runner.
os_json=$(jq -cn --arg os "${{ github.event.inputs.specific_os }}" '[ $os ]')
else
# Normal case
os_json=$(echo ${os_selection} | jq -c)
fi
# Add optional runners
if [ "${{ github.event.inputs.include_stream9 }}" == 'true' ]; then
os_json=$(echo $os_json | jq -c '. += ["centos-stream9"]')
fi
if [ "${{ github.event.inputs.include_stream10 }}" == 'true' ]; then
os_json=$(echo $os_json | jq -c '. += ["centos-stream10"]')
fi
echo $os_json
echo "os=$os_json" >> $GITHUB_OUTPUT
echo "ci_type=$ci_type" >> $GITHUB_OUTPUT
echo "os=$os_json" | tee -a $GITHUB_OUTPUT
echo "ci_type=$ci_type" | tee -a $GITHUB_OUTPUT
qemu-vm:
name: qemu-x86
@@ -81,13 +80,13 @@ jobs:
strategy:
fail-fast: false
matrix:
# rhl: almalinux8, almalinux9, centos-stream9, fedora41
# debian: debian11, debian12, ubuntu22, ubuntu24
# rhl: almalinux8, almalinux9, centos-streamX, fedora4x
# debian: debian12, debian13, ubuntu22, ubuntu24
# misc: archlinux, tumbleweed
# FreeBSD variants of 2024-12:
# FreeBSD Release: freebsd13-4r, freebsd14-2r
# FreeBSD Stable: freebsd13-4s, freebsd14-2s
# FreeBSD Current: freebsd15-0c
# FreeBSD variants of November 2025:
# FreeBSD Release: freebsd13-5r, freebsd14-2r, freebsd14-3r
# FreeBSD Stable: freebsd13-5s, freebsd14-3s, freebsd15-0s
# FreeBSD Current: freebsd16-0c
os: ${{ fromJson(needs.test-config.outputs.test_os) }}
runs-on: ubuntu-24.04
steps:
@@ -96,7 +95,7 @@ jobs:
ref: ${{ github.event.pull_request.head.sha }}
- name: Setup QEMU
timeout-minutes: 10
timeout-minutes: 60
run: .github/workflows/scripts/qemu-1-setup.sh
- name: Start build machine
@@ -104,7 +103,7 @@ jobs:
run: .github/workflows/scripts/qemu-2-start.sh ${{ matrix.os }}
- name: Install dependencies
timeout-minutes: 20
timeout-minutes: 60
run: .github/workflows/scripts/qemu-3-deps.sh ${{ matrix.os }} ${{ github.event.inputs.fedora_kernel_ver }}
- name: Build modules
+12 -12
@@ -12,7 +12,8 @@ jobs:
zloop:
runs-on: ubuntu-24.04
env:
TEST_DIR: /var/tmp/zloop
WORK_DIR: /mnt/zloop
CORE_DIR: /mnt/zloop/cores
steps:
- uses: actions/checkout@v4
with:
@@ -40,38 +41,37 @@ jobs:
sudo modprobe zfs
- name: Tests
run: |
sudo mkdir -p $TEST_DIR
# run for 10 minutes or at most 6 iterations for a maximum runner
# time of 60 minutes.
sudo /usr/share/zfs/zloop.sh -t 600 -I 6 -l -m 1 -- -T 120 -P 60
sudo truncate -s 256G /mnt/vdev
sudo zpool create cipool -m $WORK_DIR -O compression=on -o autotrim=on /mnt/vdev
sudo /usr/share/zfs/zloop.sh -t 600 -I 6 -l -m 1 -c $CORE_DIR -f $WORK_DIR -- -T 120 -P 60
- name: Prepare artifacts
if: failure()
run: |
sudo chmod +r -R $TEST_DIR/
sudo chmod +r -R $WORK_DIR/
- name: Ztest log
if: failure()
run: |
grep -B10 -A1000 'ASSERT' $TEST_DIR/*/ztest.out || tail -n 1000 $TEST_DIR/*/ztest.out
grep -B10 -A1000 'ASSERT' $CORE_DIR/*/ztest.out || tail -n 1000 $CORE_DIR/*/ztest.out
- name: Gdb log
if: failure()
run: |
sed -n '/Backtraces (full)/q;p' $TEST_DIR/*/ztest.gdb
sed -n '/Backtraces (full)/q;p' $CORE_DIR/*/ztest.gdb
- name: Zdb log
if: failure()
run: |
cat $TEST_DIR/*/ztest.zdb
cat $CORE_DIR/*/ztest.zdb
- uses: actions/upload-artifact@v4
if: failure()
with:
name: Logs
path: |
/var/tmp/zloop/*/
!/var/tmp/zloop/*/vdev/
/mnt/zloop/*/
!/mnt/zloop/cores/*/vdev/
if-no-files-found: ignore
- uses: actions/upload-artifact@v4
if: failure()
with:
name: Pool files
path: |
/var/tmp/zloop/*/vdev/
/mnt/zloop/cores/*/vdev/
if-no-files-found: ignore
+2 -2
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.3.3
Version: 2.3.6
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 6.15
Linux-Maximum: 6.19
Linux-Minimum: 4.18
+3
@@ -559,6 +559,7 @@ def section_arc(kstats_dict):
print()
compressed_size = arc_stats['compressed_size']
uncompressed_size = arc_stats['uncompressed_size']
overhead_size = arc_stats['overhead_size']
bonus_size = arc_stats['bonus_size']
dnode_size = arc_stats['dnode_size']
@@ -671,6 +672,8 @@ def section_arc(kstats_dict):
print()
print('ARC misc:')
prt_i2('Uncompressed size:', f_perc(uncompressed_size, compressed_size),
f_bytes(uncompressed_size))
prt_i1('Memory throttles:', arc_stats['memory_throttle_count'])
prt_i1('Memory direct reclaims:', arc_stats['memory_direct_count'])
prt_i1('Memory indirect reclaims:', arc_stats['memory_indirect_count'])
+19 -4
@@ -264,9 +264,21 @@ cmp_data(raidz_test_opts_t *opts, raidz_map_t *rm)
static int
init_rand(void *data, size_t size, void *private)
{
size_t *offsetp = (size_t *)private;
size_t offset = *offsetp;
VERIFY3U(offset + size, <=, SPA_MAXBLOCKSIZE);
memcpy(data, (char *)rand_data + offset, size);
*offsetp = offset + size;
return (0);
}
static int
corrupt_rand_fill(void *data, size_t size, void *private)
{
(void) private;
memcpy(data, rand_data, size);
memset(data, 0xAA, size);
return (0);
}
@@ -278,7 +290,7 @@ corrupt_colums(raidz_map_t *rm, const int *tgts, const int cnt)
for (int i = 0; i < cnt; i++) {
raidz_col_t *col = &rr->rr_col[tgts[i]];
abd_iterate_func(col->rc_abd, 0, col->rc_size,
init_rand, NULL);
corrupt_rand_fill, NULL);
}
}
}
@@ -286,7 +298,8 @@ corrupt_colums(raidz_map_t *rm, const int *tgts, const int cnt)
void
init_zio_abd(zio_t *zio)
{
abd_iterate_func(zio->io_abd, 0, zio->io_size, init_rand, NULL);
size_t offset = 0;
abd_iterate_func(zio->io_abd, 0, zio->io_size, init_rand, &offset);
}
static void
@@ -373,7 +386,7 @@ init_raidz_map(raidz_test_opts_t *opts, zio_t **zio, const int parity)
*zio = umem_zalloc(sizeof (zio_t), UMEM_NOFAIL);
(*zio)->io_offset = 0;
(*zio)->io_offset = opts->rto_offset;
(*zio)->io_size = alloc_dsize;
(*zio)->io_abd = raidz_alloc(alloc_dsize);
init_zio_abd(*zio);
@@ -834,6 +847,8 @@ main(int argc, char **argv)
err = run_test(NULL);
}
mprotect(rand_data, SPA_MAXBLOCKSIZE, PROT_READ | PROT_WRITE);
umem_free(rand_data, SPA_MAXBLOCKSIZE);
kernel_fini();
+1 -1
@@ -72,7 +72,7 @@ typedef struct raidz_test_opts {
static const raidz_test_opts_t rto_opts_defaults = {
.rto_ashift = 9,
.rto_offset = 1ULL << 0,
.rto_offset = 0,
.rto_dcols = 8,
.rto_dsize = 1<<19,
.rto_v = D_ALL,
+68 -28
@@ -107,7 +107,9 @@ extern uint_t zfs_reconstruct_indirect_combinations_max;
extern uint_t zfs_btree_verify_intensity;
static const char cmdname[] = "zdb";
uint8_t dump_opt[256];
uint8_t dump_opt[512];
#define ALLOCATED_OPT 256
typedef void object_viewer_t(objset_t *, uint64_t, void *data, size_t size);
@@ -381,7 +383,7 @@ verify_livelist_allocs(metaslab_verify_t *mv, uint64_t txg,
sublivelist_verify_block_t svb = {{{0}}};
DVA_SET_VDEV(&svb.svb_dva, mv->mv_vdid);
DVA_SET_OFFSET(&svb.svb_dva, offset);
DVA_SET_ASIZE(&svb.svb_dva, size);
DVA_SET_ASIZE(&svb.svb_dva, 0);
zfs_btree_index_t where;
uint64_t end_offset = offset + size;
@@ -619,8 +621,9 @@ livelist_metaslab_validate(spa_t *spa)
metaslab_calculate_range_tree_type(vd, m,
&start, &shift);
metaslab_verify_t mv;
mv.mv_allocated = zfs_range_tree_create(NULL,
type, NULL, start, shift);
mv.mv_allocated = zfs_range_tree_create_flags(
NULL, type, NULL, start, shift,
0, "livelist_metaslab_validate:mv_allocated");
mv.mv_vdid = vd->vdev_id;
mv.mv_msid = m->ms_id;
mv.mv_start = m->ms_start;
@@ -1650,6 +1653,16 @@ dump_metaslab_stats(metaslab_t *msp)
dump_histogram(rt->rt_histogram, ZFS_RANGE_TREE_HISTOGRAM_SIZE, 0);
}
static void
dump_allocated(void *arg, uint64_t start, uint64_t size)
{
uint64_t *off = arg;
if (*off != start)
(void) printf("ALLOC: %"PRIu64" %"PRIu64"\n", *off,
start - *off);
*off = start + size;
}
static void
dump_metaslab(metaslab_t *msp)
{
@@ -1666,13 +1679,24 @@ dump_metaslab(metaslab_t *msp)
(u_longlong_t)msp->ms_id, (u_longlong_t)msp->ms_start,
(u_longlong_t)space_map_object(sm), freebuf);
if (dump_opt['m'] > 2 && !dump_opt['L']) {
if (dump_opt[ALLOCATED_OPT] ||
(dump_opt['m'] > 2 && !dump_opt['L'])) {
mutex_enter(&msp->ms_lock);
VERIFY0(metaslab_load(msp));
}
if (dump_opt['m'] > 2 && !dump_opt['L']) {
zfs_range_tree_stat_verify(msp->ms_allocatable);
dump_metaslab_stats(msp);
metaslab_unload(msp);
mutex_exit(&msp->ms_lock);
}
if (dump_opt[ALLOCATED_OPT]) {
uint64_t off = msp->ms_start;
zfs_range_tree_walk(msp->ms_allocatable, dump_allocated,
&off);
if (off != msp->ms_start + msp->ms_size)
(void) printf("ALLOC: %"PRIu64" %"PRIu64"\n", off,
msp->ms_size - off);
}
if (dump_opt['m'] > 1 && sm != NULL &&
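The `--allocated-map` output is derived by walking the allocatable (free) range tree with a running offset: any gap between the offset and the next free segment is reported as allocated. The sketch below models that walk on plain integers. It is illustrative only and not part of the commit; `print_allocated` and its argument layout are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not part of the commit.
# print_allocated is a hypothetical helper. Arguments: metaslab start,
# metaslab size, then "start size" pairs of free segments in ascending order.
print_allocated() {
    local ms_start="$1" ms_size="$2"
    shift 2
    local off="$ms_start" start size
    while [ "$#" -ge 2 ]; do
        start="$1"; size="$2"; shift 2
        # A gap between the cursor and the next free segment is allocated
        if [ "$off" -ne "$start" ]; then
            echo "ALLOC: $off $((start - off))"
        fi
        # Advance the cursor past the free segment
        off=$((start + size))
    done
    # Space between the last free segment and the metaslab end is allocated
    if [ "$off" -ne $((ms_start + ms_size)) ]; then
        echo "ALLOC: $off $((ms_start + ms_size - off))"
    fi
}
```

For a metaslab at 1000 with size 100 and free segments `1010+20` and `1050+10`, this prints the allocated ranges `1000 10`, `1030 20` and `1060 40`. The trailing length is computed against the metaslab end (`ms_start + ms_size - off`); that is a choice made for this sketch and differs from the `ms_size - off` expression in the hunk.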
@@ -1687,6 +1711,12 @@ dump_metaslab(metaslab_t *msp)
SPACE_MAP_HISTOGRAM_SIZE, sm->sm_shift);
}
if (dump_opt[ALLOCATED_OPT] ||
(dump_opt['m'] > 2 && !dump_opt['L'])) {
metaslab_unload(msp);
mutex_exit(&msp->ms_lock);
}
if (vd->vdev_ops == &vdev_draid_ops)
ASSERT3U(msp->ms_size, <=, 1ULL << vd->vdev_ms_shift);
else
@@ -1723,8 +1753,9 @@ print_vdev_metaslab_header(vdev_t *vd)
}
}
(void) printf("\tvdev %10llu %s",
(u_longlong_t)vd->vdev_id, bias_str);
(void) printf("\tvdev %10llu\t%s metaslab shift %4llu",
(u_longlong_t)vd->vdev_id, bias_str,
(u_longlong_t)vd->vdev_ms_shift);
if (ms_flush_data_obj != 0) {
(void) printf(" ms_unflushed_phys object %llu",
@@ -2545,12 +2576,14 @@ snprintf_blkptr_compact(char *blkbuf, size_t buflen, const blkptr_t *bp,
blkbuf[0] = '\0';
for (i = 0; i < ndvas; i++)
for (i = 0; i < ndvas; i++) {
(void) snprintf(blkbuf + strlen(blkbuf),
buflen - strlen(blkbuf), "%llu:%llx:%llx ",
buflen - strlen(blkbuf), "%llu:%llx:%llx%s ",
(u_longlong_t)DVA_GET_VDEV(&dva[i]),
(u_longlong_t)DVA_GET_OFFSET(&dva[i]),
(u_longlong_t)DVA_GET_ASIZE(&dva[i]));
(u_longlong_t)DVA_GET_ASIZE(&dva[i]),
(DVA_GET_GANG(&dva[i]) ? "G" : ""));
}
if (BP_IS_HOLE(bp)) {
(void) snprintf(blkbuf + strlen(blkbuf),
@@ -6320,8 +6353,9 @@ zdb_claim_removing(spa_t *spa, zdb_cb_t *zcb)
ASSERT0(zfs_range_tree_space(svr->svr_allocd_segs));
zfs_range_tree_t *allocs = zfs_range_tree_create(NULL, ZFS_RANGE_SEG64,
NULL, 0, 0);
zfs_range_tree_t *allocs = zfs_range_tree_create_flags(
NULL, ZFS_RANGE_SEG64, NULL, 0, 0,
0, "zdb_claim_removing:allocs");
for (uint64_t msi = 0; msi < vd->vdev_ms_count; msi++) {
metaslab_t *msp = vd->vdev_ms[msi];
@@ -7704,7 +7738,8 @@ zdb_set_skip_mmp(char *target)
* applies to the new_path parameter if allocated.
*/
static char *
import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
import_checkpointed_state(char *target, nvlist_t *cfg, boolean_t target_is_spa,
char **new_path)
{
int error = 0;
char *poolname, *bogus_name = NULL;
@@ -7712,11 +7747,11 @@ import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
/* If the target is not a pool, then extract the pool name */
char *path_start = strchr(target, '/');
if (path_start != NULL) {
if (target_is_spa || path_start == NULL) {
poolname = target;
} else {
size_t poolname_len = path_start - target;
poolname = strndup(target, poolname_len);
} else {
poolname = target;
}
if (cfg == NULL) {
@@ -7747,10 +7782,11 @@ import_checkpointed_state(char *target, nvlist_t *cfg, char **new_path)
"with error %d\n", bogus_name, error);
}
if (new_path != NULL && path_start != NULL) {
if (asprintf(new_path, "%s%s", bogus_name, path_start) == -1) {
if (new_path != NULL && !target_is_spa) {
if (asprintf(new_path, "%s%s", bogus_name,
path_start != NULL ? path_start : "") == -1) {
free(bogus_name);
if (path_start != NULL)
if (!target_is_spa && path_start != NULL)
free(poolname);
return (NULL);
}
@@ -7979,7 +8015,7 @@ verify_checkpoint_blocks(spa_t *spa)
* name) so we can do verification on it against the current state
* of the pool.
*/
checkpoint_pool = import_checkpointed_state(spa->spa_name, NULL,
checkpoint_pool = import_checkpointed_state(spa->spa_name, NULL, B_TRUE,
NULL);
ASSERT(strcmp(spa->spa_name, checkpoint_pool) != 0);
@@ -8449,8 +8485,9 @@ dump_zpool(spa_t *spa)
if (dump_opt['d'] || dump_opt['i']) {
spa_feature_t f;
mos_refd_objs = zfs_range_tree_create(NULL, ZFS_RANGE_SEG64,
NULL, 0, 0);
mos_refd_objs = zfs_range_tree_create_flags(
NULL, ZFS_RANGE_SEG64, NULL, 0, 0,
0, "dump_zpool:mos_refd_objs");
dump_objset(dp->dp_meta_objset);
if (dump_opt['d'] >= 3) {
@@ -8981,7 +9018,7 @@ zdb_read_block(char *thing, spa_t *spa)
DVA_SET_VDEV(&dva[0], vd->vdev_id);
DVA_SET_OFFSET(&dva[0], offset);
DVA_SET_GANG(&dva[0], !!(flags & ZDB_FLAG_GBH));
DVA_SET_GANG(&dva[0], 0);
DVA_SET_ASIZE(&dva[0], vdev_psize_to_asize(vd, psize));
BP_SET_BIRTH(bp, TXG_INITIAL, TXG_INITIAL);
@@ -8996,7 +9033,7 @@ zdb_read_block(char *thing, spa_t *spa)
BP_SET_BYTEORDER(bp, ZFS_HOST_BYTEORDER);
spa_config_enter(spa, SCL_STATE, FTAG, RW_READER);
zio = zio_root(spa, NULL, NULL, 0);
zio = zio_root(spa, NULL, NULL, ZIO_FLAG_CANFAIL);
if (vd == vd->vdev_top) {
/*
@@ -9118,7 +9155,7 @@ zdb_read_block(char *thing, spa_t *spa)
ck_zio->io_offset =
DVA_GET_OFFSET(&bp->blk_dva[0]);
ck_zio->io_bp = bp;
zio_checksum_compute(ck_zio, ck, pabd, lsize);
zio_checksum_compute(ck_zio, ck, pabd, psize);
printf(
"%12s\t"
"cksum=%016llx:%016llx:%016llx:%016llx\n",
@@ -9311,6 +9348,8 @@ main(int argc, char **argv)
{"all-reconstruction", no_argument, NULL, 'Y'},
{"livelist", no_argument, NULL, 'y'},
{"zstd-headers", no_argument, NULL, 'Z'},
{"allocated-map", no_argument, NULL,
ALLOCATED_OPT},
{0, 0, 0, 0}
};
@@ -9341,6 +9380,7 @@ main(int argc, char **argv)
case 'u':
case 'y':
case 'Z':
case ALLOCATED_OPT:
dump_opt[c]++;
dump_all = 0;
break;
@@ -9695,7 +9735,7 @@ main(int argc, char **argv)
char *checkpoint_target = NULL;
if (dump_opt['k']) {
checkpoint_pool = import_checkpointed_state(target, cfg,
&checkpoint_target);
target_is_spa, &checkpoint_target);
if (checkpoint_target != NULL)
target = checkpoint_target;
+1 -1
@@ -29,6 +29,6 @@
#define _ZDB_H
void dump_intent_log(zilog_t *);
extern uint8_t dump_opt[256];
extern uint8_t dump_opt[512];
#endif /* _ZDB_H */
-2
@@ -48,8 +48,6 @@
#include "zdb.h"
extern uint8_t dump_opt[256];
static char tab_prefix[4] = "\t\t\t";
static void
+36 -31
@@ -134,11 +134,13 @@ zfs_agent_iter_vdev(zpool_handle_t *zhp, nvlist_t *nvl, void *arg)
* of blkid cache and L2ARC VDEV does not contain pool guid in its
* blkid, so this is a special case for L2ARC VDEV.
*/
else if (gsp->gs_vdev_guid != 0 && gsp->gs_devid == NULL &&
else if (gsp->gs_vdev_guid != 0 &&
nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_GUID, &vdev_guid) == 0 &&
gsp->gs_vdev_guid == vdev_guid) {
(void) nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID,
&gsp->gs_devid);
if (gsp->gs_devid == NULL) {
(void) nvlist_lookup_string(nvl, ZPOOL_CONFIG_DEVID,
&gsp->gs_devid);
}
(void) nvlist_lookup_uint64(nvl, ZPOOL_CONFIG_EXPANSION_TIME,
&gsp->gs_vdev_expandtime);
return (B_TRUE);
@@ -156,22 +158,28 @@ zfs_agent_iter_pool(zpool_handle_t *zhp, void *arg)
/*
* For each vdev in this pool, look for a match by devid
*/
if ((config = zpool_get_config(zhp, NULL)) != NULL) {
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvl) == 0) {
(void) zfs_agent_iter_vdev(zhp, nvl, gsp);
}
}
/*
* if a match was found then grab the pool guid
*/
if (gsp->gs_vdev_guid && gsp->gs_devid) {
(void) nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
&gsp->gs_pool_guid);
}
boolean_t found = B_FALSE;
uint64_t pool_guid;
/* Get pool configuration and extract pool GUID */
if ((config = zpool_get_config(zhp, NULL)) == NULL ||
nvlist_lookup_uint64(config, ZPOOL_CONFIG_POOL_GUID,
&pool_guid) != 0)
goto out;
/* Skip this pool if we're looking for a specific pool */
if (gsp->gs_pool_guid != 0 && pool_guid != gsp->gs_pool_guid)
goto out;
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &nvl) == 0)
found = zfs_agent_iter_vdev(zhp, nvl, gsp);
if (found && gsp->gs_pool_guid == 0)
gsp->gs_pool_guid = pool_guid;
out:
zpool_close(zhp);
return (gsp->gs_devid != NULL && gsp->gs_vdev_guid != 0);
return (found);
}
void
@@ -233,20 +241,17 @@ zfs_agent_post_event(const char *class, const char *subclass, nvlist_t *nvl)
* For multipath, spare and l2arc devices ZFS_EV_VDEV_GUID or
* ZFS_EV_POOL_GUID may be missing so find them.
*/
if (devid == NULL || pool_guid == 0 || vdev_guid == 0) {
if (devid == NULL)
search.gs_vdev_guid = vdev_guid;
else
search.gs_devid = devid;
zpool_iter(g_zfs_hdl, zfs_agent_iter_pool, &search);
if (devid == NULL)
devid = search.gs_devid;
if (pool_guid == 0)
pool_guid = search.gs_pool_guid;
if (vdev_guid == 0)
vdev_guid = search.gs_vdev_guid;
devtype = search.gs_vdev_type;
}
search.gs_devid = devid;
search.gs_vdev_guid = vdev_guid;
search.gs_pool_guid = pool_guid;
zpool_iter(g_zfs_hdl, zfs_agent_iter_pool, &search);
if (devid == NULL)
devid = search.gs_devid;
if (pool_guid == 0)
pool_guid = search.gs_pool_guid;
if (vdev_guid == 0)
vdev_guid = search.gs_vdev_guid;
devtype = search.gs_vdev_type;
/*
* We want to avoid reporting "remove" events coming from
+2 -1
@@ -441,8 +441,9 @@ zed_notify_slack_webhook()
"${pathname}")"
# Construct the JSON message for posting.
# shellcheck disable=SC2016
#
msg_json="$(printf '{"text": "*%s*\\n%s"}' "${subject}" "${msg_body}" )"
msg_json="$(printf '{"text": "*%s*\\n```%s```"}' "${subject}" "${msg_body}" )"
# Send the POST request and check for errors.
#
+240 -11
@@ -37,6 +37,7 @@
#include <assert.h>
#include <ctype.h>
#include <sys/debug.h>
#include <dirent.h>
#include <errno.h>
#include <getopt.h>
#include <libgen.h>
@@ -121,6 +122,7 @@ static int zfs_do_change_key(int argc, char **argv);
static int zfs_do_project(int argc, char **argv);
static int zfs_do_version(int argc, char **argv);
static int zfs_do_redact(int argc, char **argv);
static int zfs_do_rewrite(int argc, char **argv);
static int zfs_do_wait(int argc, char **argv);
#ifdef __FreeBSD__
@@ -193,6 +195,7 @@ typedef enum {
HELP_CHANGE_KEY,
HELP_VERSION,
HELP_REDACT,
HELP_REWRITE,
HELP_JAIL,
HELP_UNJAIL,
HELP_WAIT,
@@ -227,7 +230,7 @@ static zfs_command_t command_table[] = {
{ "promote", zfs_do_promote, HELP_PROMOTE },
{ "rename", zfs_do_rename, HELP_RENAME },
{ "bookmark", zfs_do_bookmark, HELP_BOOKMARK },
{ "program", zfs_do_channel_program, HELP_CHANNEL_PROGRAM },
{ "diff", zfs_do_diff, HELP_DIFF },
{ NULL },
{ "list", zfs_do_list, HELP_LIST },
{ NULL },
@@ -249,27 +252,31 @@ static zfs_command_t command_table[] = {
{ NULL },
{ "send", zfs_do_send, HELP_SEND },
{ "receive", zfs_do_receive, HELP_RECEIVE },
{ "redact", zfs_do_redact, HELP_REDACT },
{ NULL },
{ "allow", zfs_do_allow, HELP_ALLOW },
{ NULL },
{ "unallow", zfs_do_unallow, HELP_UNALLOW },
{ NULL },
{ "hold", zfs_do_hold, HELP_HOLD },
{ "holds", zfs_do_holds, HELP_HOLDS },
{ "release", zfs_do_release, HELP_RELEASE },
{ "diff", zfs_do_diff, HELP_DIFF },
{ NULL },
{ "load-key", zfs_do_load_key, HELP_LOAD_KEY },
{ "unload-key", zfs_do_unload_key, HELP_UNLOAD_KEY },
{ "change-key", zfs_do_change_key, HELP_CHANGE_KEY },
{ "redact", zfs_do_redact, HELP_REDACT },
{ NULL },
{ "program", zfs_do_channel_program, HELP_CHANNEL_PROGRAM },
{ "rewrite", zfs_do_rewrite, HELP_REWRITE },
{ "wait", zfs_do_wait, HELP_WAIT },
#ifdef __FreeBSD__
{ NULL },
{ "jail", zfs_do_jail, HELP_JAIL },
{ "unjail", zfs_do_unjail, HELP_UNJAIL },
#endif
#ifdef __linux__
{ NULL },
{ "zone", zfs_do_zone, HELP_ZONE },
{ "unzone", zfs_do_unzone, HELP_UNZONE },
#endif
@@ -432,6 +439,9 @@ get_usage(zfs_help_t idx)
case HELP_REDACT:
return (gettext("\tredact <snapshot> <bookmark> "
"<redaction_snapshot> ...\n"));
case HELP_REWRITE:
return (gettext("\trewrite [-rvx] [-o <offset>] [-l <length>] "
"<directory|file ...>\n"));
case HELP_JAIL:
return (gettext("\tjail <jailid|jailname> <filesystem>\n"));
case HELP_UNJAIL:
@@ -920,19 +930,15 @@ usage:
}
/*
* Return a default volblocksize for the pool which always uses more than
* half of the data sectors. This primarily applies to dRAID which always
* writes full stripe widths.
* Calculate the minimum allocation size based on the top-level vdevs.
*/
static uint64_t
default_volblocksize(zpool_handle_t *zhp, nvlist_t *props)
calculate_volblocksize(nvlist_t *config)
{
uint64_t volblocksize, asize = SPA_MINBLOCKSIZE;
uint64_t asize = SPA_MINBLOCKSIZE;
nvlist_t *tree, **vdevs;
uint_t nvdevs;
nvlist_t *config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE, &tree) != 0 ||
nvlist_lookup_nvlist_array(tree, ZPOOL_CONFIG_CHILDREN,
&vdevs, &nvdevs) != 0) {
@@ -963,6 +969,24 @@ default_volblocksize(zpool_handle_t *zhp, nvlist_t *props)
}
}
return (asize);
}
/*
* Return a default volblocksize for the pool which always uses more than
* half of the data sectors. This primarily applies to dRAID which always
* writes full stripe widths.
*/
static uint64_t
default_volblocksize(zpool_handle_t *zhp, nvlist_t *props)
{
uint64_t volblocksize, asize = SPA_MINBLOCKSIZE;
nvlist_t *config = zpool_get_config(zhp, NULL);
if (nvlist_lookup_uint64(config, ZPOOL_CONFIG_MAX_ALLOC, &asize) != 0)
asize = calculate_volblocksize(config);
/*
* Calculate the target volblocksize such that more than half
* of the asize is used. The following table is for 4k sectors.
@@ -7716,6 +7740,7 @@ unshare_unmount_path(int op, char *path, int flags, boolean_t is_manual)
struct extmnttab entry;
const char *cmdname = (op == OP_SHARE) ? "unshare" : "unmount";
ino_t path_inode;
char *zfs_mntpnt, *entry_mntpnt;
/*
* Search for the given (major,minor) pair in the mount table.
@@ -7757,6 +7782,24 @@ unshare_unmount_path(int op, char *path, int flags, boolean_t is_manual)
goto out;
}
/*
* If the filesystem is mounted, check that the mountpoint matches
* the one in the mnttab entry w.r.t. provided path. If it doesn't,
* then we should not proceed further.
*/
entry_mntpnt = strdup(entry.mnt_mountp);
if (zfs_is_mounted(zhp, &zfs_mntpnt)) {
if (strcmp(zfs_mntpnt, entry_mntpnt) != 0) {
(void) fprintf(stderr, gettext("cannot %s '%s': "
"not an original mountpoint\n"), cmdname, path);
free(zfs_mntpnt);
free(entry_mntpnt);
goto out;
}
free(zfs_mntpnt);
}
free(entry_mntpnt);
if (op == OP_SHARE) {
char nfs_mnt_prop[ZFS_MAXPROPLEN];
char smbshare_prop[ZFS_MAXPROPLEN];
@@ -9013,6 +9056,192 @@ zfs_do_project(int argc, char **argv)
return (ret);
}
static int
zfs_rewrite_file(const char *path, boolean_t verbose, zfs_rewrite_args_t *args)
{
int fd, ret = 0;
fd = open(path, O_WRONLY);
if (fd < 0) {
ret = errno;
(void) fprintf(stderr, gettext("failed to open %s: %s\n"),
path, strerror(errno));
return (ret);
}
if (ioctl(fd, ZFS_IOC_REWRITE, args) < 0) {
ret = errno;
(void) fprintf(stderr, gettext("failed to rewrite %s: %s\n"),
path, strerror(errno));
} else if (verbose) {
printf("%s\n", path);
}
close(fd);
return (ret);
}
static int
zfs_rewrite_dir(const char *path, boolean_t verbose, boolean_t xdev, dev_t dev,
zfs_rewrite_args_t *args, nvlist_t *dirs)
{
struct dirent *ent;
DIR *dir;
int ret = 0, err;
dir = opendir(path);
if (dir == NULL) {
if (errno == ENOENT)
return (0);
ret = errno;
(void) fprintf(stderr, gettext("failed to opendir %s: %s\n"),
path, strerror(errno));
return (ret);
}
size_t plen = strlen(path) + 1;
while ((ent = readdir(dir)) != NULL) {
char *fullname;
struct stat st;
if (ent->d_type != DT_REG && ent->d_type != DT_DIR)
continue;
if (strcmp(ent->d_name, ".") == 0 ||
strcmp(ent->d_name, "..") == 0)
continue;
if (plen + strlen(ent->d_name) >= PATH_MAX) {
(void) fprintf(stderr, gettext("path too long %s/%s\n"),
path, ent->d_name);
ret = ENAMETOOLONG;
continue;
}
if (asprintf(&fullname, "%s/%s", path, ent->d_name) == -1) {
(void) fprintf(stderr,
gettext("failed to allocate memory\n"));
ret = ENOMEM;
continue;
}
if (xdev) {
if (lstat(fullname, &st) < 0) {
ret = errno;
(void) fprintf(stderr,
gettext("failed to stat %s: %s\n"),
fullname, strerror(errno));
free(fullname);
continue;
}
if (st.st_dev != dev) {
free(fullname);
continue;
}
}
if (ent->d_type == DT_REG) {
err = zfs_rewrite_file(fullname, verbose, args);
if (err)
ret = err;
} else { /* DT_DIR */
fnvlist_add_uint64(dirs, fullname, dev);
}
free(fullname);
}
closedir(dir);
return (ret);
}
static int
zfs_rewrite_path(const char *path, boolean_t verbose, boolean_t recurse,
boolean_t xdev, zfs_rewrite_args_t *args, nvlist_t *dirs)
{
struct stat st;
int ret = 0;
if (lstat(path, &st) < 0) {
ret = errno;
(void) fprintf(stderr, gettext("failed to stat %s: %s\n"),
path, strerror(errno));
return (ret);
}
if (S_ISREG(st.st_mode)) {
ret = zfs_rewrite_file(path, verbose, args);
} else if (S_ISDIR(st.st_mode) && recurse) {
ret = zfs_rewrite_dir(path, verbose, xdev, st.st_dev, args,
dirs);
}
return (ret);
}
static int
zfs_do_rewrite(int argc, char **argv)
{
int ret = 0, err, c;
boolean_t recurse = B_FALSE, verbose = B_FALSE, xdev = B_FALSE;
if (argc < 2)
usage(B_FALSE);
zfs_rewrite_args_t args;
memset(&args, 0, sizeof (args));
while ((c = getopt(argc, argv, "l:o:rvx")) != -1) {
switch (c) {
case 'l':
args.len = strtoll(optarg, NULL, 0);
break;
case 'o':
args.off = strtoll(optarg, NULL, 0);
break;
case 'r':
recurse = B_TRUE;
break;
case 'v':
verbose = B_TRUE;
break;
case 'x':
xdev = B_TRUE;
break;
default:
(void) fprintf(stderr, gettext("invalid option '%c'\n"),
optopt);
usage(B_FALSE);
}
}
argv += optind;
argc -= optind;
if (argc == 0) {
(void) fprintf(stderr,
gettext("missing file or directory target(s)\n"));
usage(B_FALSE);
}
nvlist_t *dirs = fnvlist_alloc();
for (int i = 0; i < argc; i++) {
err = zfs_rewrite_path(argv[i], verbose, recurse, xdev, &args,
dirs);
if (err)
ret = err;
}
nvpair_t *dir;
while ((dir = nvlist_next_nvpair(dirs, NULL)) != NULL) {
err = zfs_rewrite_dir(nvpair_name(dir), verbose, xdev,
fnvpair_value_uint64(dir), &args, dirs);
if (err)
ret = err;
fnvlist_remove_nvpair(dirs, dir);
}
fnvlist_free(dirs);
return (ret);
}
static int
zfs_do_wait(int argc, char **argv)
{
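Note on the traversal in zfs_do_rewrite() above: subdirectories are not recursed into directly. zfs_rewrite_dir() queues each one in the `dirs` nvlist, and the caller drains that list until it is empty. The following is a minimal userspace sketch of the same worklist idea, assuming a plain array-backed stack in place of the nvlist and merely counting regular files instead of issuing ZFS_IOC_REWRITE. The names `dir_stack_t`, `stack_push` and `count_regular_files` are hypothetical and do not appear in the change.

```c
#ifndef _POSIX_C_SOURCE
#define	_POSIX_C_SOURCE 200809L
#endif
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stack of pending directories; stands in for the nvlist. */
typedef struct {
	char **paths;
	size_t len, cap;
} dir_stack_t;

static int
stack_push(dir_stack_t *s, const char *path)
{
	if (s->len == s->cap) {
		size_t ncap = s->cap ? s->cap * 2 : 8;
		char **np = realloc(s->paths, ncap * sizeof (char *));
		if (np == NULL)
			return (-1);
		s->paths = np;
		s->cap = ncap;
	}
	if ((s->paths[s->len] = strdup(path)) == NULL)
		return (-1);
	s->len++;
	return (0);
}

/* Count regular files below root without recursion; -1 on ENOMEM. */
static long
count_regular_files(const char *root)
{
	dir_stack_t st = { NULL, 0, 0 };
	long nfiles = 0;

	if (stack_push(&st, root) != 0) {
		free(st.paths);
		return (-1);
	}
	while (st.len > 0) {
		char *path = st.paths[--st.len];
		DIR *dir = opendir(path);
		struct dirent *ent;

		while (dir != NULL && (ent = readdir(dir)) != NULL) {
			char full[PATH_MAX];
			struct stat sb;
			int n;

			if (strcmp(ent->d_name, ".") == 0 ||
			    strcmp(ent->d_name, "..") == 0)
				continue;
			n = snprintf(full, sizeof (full), "%s/%s", path,
			    ent->d_name);
			if (n < 0 || (size_t)n >= sizeof (full))
				continue;	/* path too long, skip */
			if (lstat(full, &sb) != 0)
				continue;
			if (S_ISREG(sb.st_mode))
				nfiles++;
			else if (S_ISDIR(sb.st_mode))
				(void) stack_push(&st, full); /* visit later */
		}
		if (dir != NULL)
			(void) closedir(dir);
		free(path);
	}
	free(st.paths);
	return (nfiles);
}
```

Keeping the pending directories in a list rather than on the call stack means only one DIR handle is open at a time, however deep the tree is.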
+28 -8
@@ -145,11 +145,11 @@ zfs_project_handle_one(const char *name, zfs_project_control_t *zpc)
switch (zpc->zpc_op) {
case ZFS_PROJECT_OP_LIST:
(void) printf("%5u %c %s\n", fsx.fsx_projid,
(fsx.fsx_xflags & ZFS_PROJINHERIT_FL) ? 'P' : '-', name);
(fsx.fsx_xflags & FS_XFLAG_PROJINHERIT) ? 'P' : '-', name);
goto out;
case ZFS_PROJECT_OP_CHECK:
if (fsx.fsx_projid == zpc->zpc_expected_projid &&
fsx.fsx_xflags & ZFS_PROJINHERIT_FL)
fsx.fsx_xflags & FS_XFLAG_PROJINHERIT)
goto out;
if (!zpc->zpc_newline) {
@@ -164,29 +164,30 @@ zfs_project_handle_one(const char *name, zfs_project_control_t *zpc)
"(%u/%u)\n", name, fsx.fsx_projid,
(uint32_t)zpc->zpc_expected_projid);
if (!(fsx.fsx_xflags & ZFS_PROJINHERIT_FL))
if (!(fsx.fsx_xflags & FS_XFLAG_PROJINHERIT))
(void) printf("%s - project inherit flag is not set\n",
name);
goto out;
case ZFS_PROJECT_OP_CLEAR:
if (!(fsx.fsx_xflags & ZFS_PROJINHERIT_FL) &&
if (!(fsx.fsx_xflags & FS_XFLAG_PROJINHERIT) &&
(zpc->zpc_keep_projid ||
fsx.fsx_projid == ZFS_DEFAULT_PROJID))
goto out;
fsx.fsx_xflags &= ~ZFS_PROJINHERIT_FL;
fsx.fsx_xflags &= ~FS_XFLAG_PROJINHERIT;
if (!zpc->zpc_keep_projid)
fsx.fsx_projid = ZFS_DEFAULT_PROJID;
break;
case ZFS_PROJECT_OP_SET:
if (fsx.fsx_projid == zpc->zpc_expected_projid &&
(!zpc->zpc_set_flag || fsx.fsx_xflags & ZFS_PROJINHERIT_FL))
(!zpc->zpc_set_flag ||
fsx.fsx_xflags & FS_XFLAG_PROJINHERIT))
goto out;
fsx.fsx_projid = zpc->zpc_expected_projid;
if (zpc->zpc_set_flag)
fsx.fsx_xflags |= ZFS_PROJINHERIT_FL;
fsx.fsx_xflags |= FS_XFLAG_PROJINHERIT;
break;
default:
ASSERT(0);
@@ -194,11 +195,30 @@ zfs_project_handle_one(const char *name, zfs_project_control_t *zpc)
}
ret = ioctl(fd, ZFS_IOC_FSSETXATTR, &fsx);
if (ret)
if (ret) {
(void) fprintf(stderr,
gettext("failed to set xattr for %s: %s\n"),
name, strerror(errno));
if (errno == ENOTSUP) {
char *kver = zfs_version_kernel();
/*
* Special case: a module/userspace version mismatch can
* return ENOTSUP due to us fixing the XFLAGs bits in
* #17884. In that case give a hint to the user that
* they should take action to make the versions match.
*/
if (strcmp(kver, ZFS_META_ALIAS) != 0) {
fprintf(stderr,
gettext("Warning: The zfs module version "
"(%s) and userspace\nversion (%s) do not "
"match up. This may be the\ncause of the "
"\"Operation not supported\" error.\n"),
kver, ZFS_META_ALIAS);
}
}
}
out:
close(fd);
return (ret);
+348 -6
@@ -52,12 +52,15 @@
#include <sys/zio_compress.h>
#include <sys/zfeature.h>
#include <sys/dmu_tx.h>
#include <sys/backtrace.h>
#include <zfeature_common.h>
#include <libzutil.h>
#include <sys/metaslab_impl.h>
static importargs_t g_importargs;
static char *g_pool;
static boolean_t g_readonly;
static boolean_t g_dump_dbgmsg;
typedef enum {
ZHACK_REPAIR_OP_UNKNOWN = 0,
@@ -69,11 +72,23 @@ static __attribute__((noreturn)) void
usage(void)
{
(void) fprintf(stderr,
"Usage: zhack [-c cachefile] [-d dir] <subcommand> <args> ...\n"
"where <subcommand> <args> is one of the following:\n"
"Usage: zhack [-o tunable] [-c cachefile] [-d dir] [-G] "
"<subcommand> <args> ...\n"
" where <subcommand> <args> is one of the following:\n"
"\n");
(void) fprintf(stderr,
" global options:\n"
" -c <cachefile> reads config from the given cachefile\n"
" -d <dir> directory with vdevs for import\n"
" -o var=value... set global variable to an unsigned "
"32-bit integer\n"
" -G dump zfs_dbgmsg buffer before exiting\n"
"\n"
" action idle <pool> [-f] [-t seconds]\n"
" import the pool for a set time then export it\n"
" -t <seconds> sets the time the pool is imported\n"
"\n"
" feature stat <pool>\n"
" print information about enabled features\n"
" feature enable [-r] [-d desc] <pool> <feature>\n"
@@ -93,10 +108,46 @@ usage(void)
" -c repair corrupted label checksums\n"
" -u restore the label on a detached device\n"
"\n"
" <device> : path to vdev\n");
" <device> : path to vdev\n"
"\n"
" metaslab leak <pool>\n"
" apply allocation map from zdb to specified pool\n");
exit(1);
}
static void
dump_debug_buffer(void)
{
ssize_t ret __attribute__((unused));
if (!g_dump_dbgmsg)
return;
/*
* We use write() instead of printf() so that this function
* is safe to call from a signal handler.
*/
ret = write(STDERR_FILENO, "\n", 1);
zfs_dbgmsg_print(STDERR_FILENO, "zhack");
}
static void sig_handler(int signo)
{
struct sigaction action;
libspl_backtrace(STDERR_FILENO);
dump_debug_buffer();
/*
* Restore default action and re-raise signal so SIGSEGV and
* SIGABRT can trigger a core dump.
*/
action.sa_handler = SIG_DFL;
sigemptyset(&action.sa_mask);
action.sa_flags = 0;
(void) sigaction(signo, &action, NULL);
raise(signo);
}
static __attribute__((format(printf, 3, 4))) __attribute__((noreturn)) void
fatal(spa_t *spa, const void *tag, const char *fmt, ...)
@@ -114,6 +165,8 @@ fatal(spa_t *spa, const void *tag, const char *fmt, ...)
va_end(ap);
(void) fputc('\n', stderr);
dump_debug_buffer();
exit(1);
}
@@ -169,7 +222,7 @@ zhack_import(char *target, boolean_t readonly)
zfeature_checks_disable = B_TRUE;
error = spa_import(target, config, props,
(readonly ? ZFS_IMPORT_SKIP_MMP : ZFS_IMPORT_NORMAL));
(readonly ? ZFS_IMPORT_SKIP_MMP : ZFS_IMPORT_NORMAL));
fnvlist_free(config);
zfeature_checks_disable = B_FALSE;
if (error == EEXIST)
@@ -363,10 +416,12 @@ feature_incr_sync(void *arg, dmu_tx_t *tx)
zfeature_info_t *feature = arg;
uint64_t refcount;
mutex_enter(&spa->spa_feat_stats_lock);
VERIFY0(feature_get_refcount_from_disk(spa, feature, &refcount));
feature_sync(spa, feature, refcount + 1, tx);
spa_history_log_internal(spa, "zhack feature incr", tx,
"name=%s", feature->fi_guid);
mutex_exit(&spa->spa_feat_stats_lock);
}
static void
@@ -376,10 +431,12 @@ feature_decr_sync(void *arg, dmu_tx_t *tx)
zfeature_info_t *feature = arg;
uint64_t refcount;
mutex_enter(&spa->spa_feat_stats_lock);
VERIFY0(feature_get_refcount_from_disk(spa, feature, &refcount));
feature_sync(spa, feature, refcount - 1, tx);
spa_history_log_internal(spa, "zhack feature decr", tx,
"name=%s", feature->fi_guid);
mutex_exit(&spa->spa_feat_stats_lock);
}
static void
@@ -496,6 +553,259 @@ zhack_do_feature(int argc, char **argv)
return (0);
}
static void
zhack_do_action_idle(int argc, char **argv)
{
spa_t *spa;
char *target, *tmp;
int idle_time = 0;
int c;
optind = 1;
while ((c = getopt(argc, argv, "+t:")) != -1) {
switch (c) {
case 't':
idle_time = strtol(optarg, &tmp, 0);
if (*tmp) {
(void) fprintf(stderr, "error: time must "
"be an integer in seconds: %s\n", tmp);
usage();
}
if (idle_time < 0) {
(void) fprintf(stderr, "error: time must "
"not be negative: %d\n", idle_time);
usage();
}
break;
default:
usage();
break;
}
}
argc -= optind;
argv += optind;
if (argc < 1) {
(void) fprintf(stderr, "error: missing pool name\n");
usage();
}
target = argv[0];
zhack_spa_open(target, B_FALSE, FTAG, &spa);
fprintf(stdout, "Imported pool %s, idle for %d seconds\n",
target, idle_time);
sleep(idle_time);
spa_close(spa, FTAG);
}
static int
zhack_do_action(int argc, char **argv)
{
char *subcommand;
argc--;
argv++;
if (argc == 0) {
(void) fprintf(stderr,
"error: no action operation specified\n");
usage();
}
subcommand = argv[0];
if (strcmp(subcommand, "idle") == 0) {
zhack_do_action_idle(argc, argv);
} else {
(void) fprintf(stderr, "error: unknown subcommand: %s\n",
subcommand);
usage();
}
return (0);
}
static boolean_t
strstarts(const char *a, const char *b)
{
return (strncmp(a, b, strlen(b)) == 0);
}
static void
metaslab_force_alloc(metaslab_t *msp, uint64_t start, uint64_t size,
dmu_tx_t *tx)
{
ASSERT(msp->ms_disabled);
ASSERT(MUTEX_HELD(&msp->ms_lock));
uint64_t txg = dmu_tx_get_txg(tx);
uint64_t off = start;
while (off < start + size) {
uint64_t ostart, osize;
boolean_t found = zfs_range_tree_find_in(msp->ms_allocatable,
off, start + size - off, &ostart, &osize);
if (!found)
break;
zfs_range_tree_remove(msp->ms_allocatable, ostart, osize);
if (zfs_range_tree_is_empty(msp->ms_allocating[txg & TXG_MASK]))
vdev_dirty(msp->ms_group->mg_vd, VDD_METASLAB, msp,
txg);
zfs_range_tree_add(msp->ms_allocating[txg & TXG_MASK], ostart,
osize);
msp->ms_allocating_total += osize;
off = ostart + osize;
}
}
static void
zhack_do_metaslab_leak(int argc, char **argv)
{
int c;
char *target;
spa_t *spa;
optind = 1;
boolean_t force = B_FALSE;
while ((c = getopt(argc, argv, "f")) != -1) {
switch (c) {
case 'f':
force = B_TRUE;
break;
default:
usage();
break;
}
}
argc -= optind;
argv += optind;
if (argc < 1) {
(void) fprintf(stderr, "error: missing pool name\n");
usage();
}
target = argv[0];
zhack_spa_open(target, B_FALSE, FTAG, &spa);
spa_config_enter(spa, SCL_VDEV | SCL_ALLOC, FTAG, RW_READER);
char *line = NULL;
size_t cap = 0;
vdev_t *vd = NULL;
metaslab_t *prev = NULL;
dmu_tx_t *tx = NULL;
while (getline(&line, &cap, stdin) > 0) {
if (strstarts(line, "\tvdev ")) {
uint64_t vdev_id, ms_shift;
if (sscanf(line,
"\tvdev %10"PRIu64"\t%*s metaslab shift %4"PRIu64,
&vdev_id, &ms_shift) == 1) {
VERIFY3U(sscanf(line, "\tvdev %"PRIu64
"\t metaslab shift %4"PRIu64,
&vdev_id, &ms_shift), ==, 2);
}
vd = vdev_lookup_top(spa, vdev_id);
if (vd == NULL) {
fprintf(stderr, "error: no such vdev with "
"id %"PRIu64"\n", vdev_id);
break;
}
if (tx) {
dmu_tx_commit(tx);
mutex_exit(&prev->ms_lock);
metaslab_enable(prev, B_FALSE, B_FALSE);
tx = NULL;
prev = NULL;
}
if (vd->vdev_ms_shift != ms_shift) {
fprintf(stderr, "error: ms_shift mismatch: %"
PRIu64" != %"PRIu64"\n", vd->vdev_ms_shift,
ms_shift);
break;
}
} else if (strstarts(line, "\tmetaslabs ")) {
uint64_t ms_count;
VERIFY3U(sscanf(line, "\tmetaslabs %"PRIu64, &ms_count),
==, 1);
ASSERT(vd);
if (!force && vd->vdev_ms_count != ms_count) {
fprintf(stderr, "error: ms_count mismatch: %"
PRIu64" != %"PRIu64"\n", vd->vdev_ms_count,
ms_count);
break;
}
} else if (strstarts(line, "ALLOC:")) {
uint64_t start, size;
VERIFY3U(sscanf(line, "ALLOC: %"PRIu64" %"PRIu64"\n",
&start, &size), ==, 2);
ASSERT(vd);
metaslab_t *cur =
vd->vdev_ms[start >> vd->vdev_ms_shift];
if (prev != cur) {
if (prev) {
dmu_tx_commit(tx);
mutex_exit(&prev->ms_lock);
metaslab_enable(prev, B_FALSE, B_FALSE);
}
ASSERT(cur);
metaslab_disable(cur);
mutex_enter(&cur->ms_lock);
metaslab_load(cur);
prev = cur;
tx = dmu_tx_create_dd(
spa_get_dsl(vd->vdev_spa)->dp_root_dir);
dmu_tx_assign(tx, DMU_TX_WAIT);
}
metaslab_force_alloc(cur, start, size, tx);
} else {
continue;
}
}
if (tx) {
dmu_tx_commit(tx);
mutex_exit(&prev->ms_lock);
metaslab_enable(prev, B_FALSE, B_FALSE);
tx = NULL;
prev = NULL;
}
if (line)
free(line);
spa_config_exit(spa, SCL_VDEV | SCL_ALLOC, FTAG);
spa_close(spa, FTAG);
}
static int
zhack_do_metaslab(int argc, char **argv)
{
char *subcommand;
argc--;
argv++;
if (argc == 0) {
(void) fprintf(stderr,
"error: no metaslab operation specified\n");
usage();
}
subcommand = argv[0];
if (strcmp(subcommand, "leak") == 0) {
zhack_do_metaslab_leak(argc, argv);
} else {
(void) fprintf(stderr, "error: unknown subcommand: %s\n",
subcommand);
usage();
}
return (0);
}
#define ASHIFT_UBERBLOCK_SHIFT(ashift) \
MIN(MAX(ashift, UBERBLOCK_SHIFT), \
MAX_UBERBLOCK_SHIFT)
@@ -971,17 +1281,35 @@ zhack_do_label(int argc, char **argv)
int
main(int argc, char **argv)
{
struct sigaction action;
char *path[MAX_NUM_PATHS];
const char *subcommand;
int rv = 0;
int c;
/*
* Set up signal handlers, so if we crash due to bad on-disk data we
* can get more info. Unlike ztest, we don't bail out if we can't set
* up signal handlers, because zhack is very useful without them.
*/
action.sa_handler = sig_handler;
sigemptyset(&action.sa_mask);
action.sa_flags = 0;
if (sigaction(SIGSEGV, &action, NULL) < 0) {
(void) fprintf(stderr, "zhack: cannot catch SIGSEGV: %s\n",
strerror(errno));
}
if (sigaction(SIGABRT, &action, NULL) < 0) {
(void) fprintf(stderr, "zhack: cannot catch SIGABRT: %s\n",
strerror(errno));
}
g_importargs.path = path;
dprintf_setup(&argc, argv);
zfs_prop_init();
while ((c = getopt(argc, argv, "+c:d:")) != -1) {
while ((c = getopt(argc, argv, "+c:d:Go:")) != -1) {
switch (c) {
case 'c':
g_importargs.cachefile = optarg;
@@ -990,6 +1318,13 @@ main(int argc, char **argv)
assert(g_importargs.paths < MAX_NUM_PATHS);
g_importargs.path[g_importargs.paths++] = optarg;
break;
case 'G':
g_dump_dbgmsg = B_TRUE;
break;
case 'o':
if (set_global_var(optarg) != 0)
exit(1);
break;
default:
usage();
break;
@@ -1007,10 +1342,14 @@ main(int argc, char **argv)
subcommand = argv[0];
if (strcmp(subcommand, "feature") == 0) {
if (strcmp(subcommand, "action") == 0) {
rv = zhack_do_action(argc, argv);
} else if (strcmp(subcommand, "feature") == 0) {
rv = zhack_do_feature(argc, argv);
} else if (strcmp(subcommand, "label") == 0) {
return (zhack_do_label(argc, argv));
} else if (strcmp(subcommand, "metaslab") == 0) {
rv = zhack_do_metaslab(argc, argv);
} else {
(void) fprintf(stderr, "error: unknown subcommand: %s\n",
subcommand);
@@ -1022,6 +1361,9 @@ main(int argc, char **argv)
"changes may not be committed to disk\n");
}
if (g_dump_dbgmsg)
dump_debug_buffer();
kernel_fini();
return (rv);
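Note on the zhack signal handling above: sig_handler() reports with write() rather than printf(), then restores the default disposition and re-raises so the process still dies from the original signal. Below is a standalone sketch of that pattern, under the assumption that a fixed message stands in for the backtrace and zfs_dbgmsg dump. The names `crash_handler`, `install_crash_handler` and `child_death_signal` are hypothetical helpers for illustration only.

```c
#ifndef _POSIX_C_SOURCE
#define	_POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static void
crash_handler(int signo)
{
	static const char msg[] = "fatal signal caught\n";
	struct sigaction dfl;
	ssize_t r;

	/* write() is async-signal-safe; stdio functions are not. */
	r = write(STDERR_FILENO, msg, sizeof (msg) - 1);
	(void) r;

	/*
	 * Restore the default action and re-raise. The signal stays
	 * blocked while this handler runs, so it is delivered with the
	 * default action as soon as the handler returns.
	 */
	dfl.sa_handler = SIG_DFL;
	sigemptyset(&dfl.sa_mask);
	dfl.sa_flags = 0;
	(void) sigaction(signo, &dfl, NULL);
	raise(signo);
}

static int
install_crash_handler(int signo)
{
	struct sigaction sa;

	sa.sa_handler = crash_handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	return (sigaction(signo, &sa, NULL));
}

/*
 * Fork a child that installs the handler and raises signo. Returns the
 * signal that terminated the child, or -1 if it exited another way.
 */
static int
child_death_signal(int signo)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return (-1);
	if (pid == 0) {
		struct rlimit no_core = { 0, 0 };

		/* Avoid leaving core files behind in this demo. */
		(void) setrlimit(RLIMIT_CORE, &no_core);
		if (install_crash_handler(signo) != 0)
			_exit(2);
		raise(signo);
		_exit(3);	/* only reached if the re-raise failed */
	}
	if (waitpid(pid, &status, 0) != pid)
		return (-1);
	return (WIFSIGNALED(status) ? WTERMSIG(status) : -1);
}
```

Because the handler re-raises instead of calling exit(), a parent or shell still sees the process as killed by SIGSEGV or SIGABRT, which is what allows a core dump to be produced.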
+3
@@ -3883,6 +3883,9 @@ do_import(nvlist_t *config, const char *newname, const char *mntopts,
hostid, ctime(&timestamp));
}
if (getenv("ZFS_LOAD_INFO_DEBUG"))
dump_nvlist(nvinfo, 4);
return (1);
}
+50 -42
@@ -270,14 +270,13 @@ is_spare(nvlist_t *config, const char *path)
* draid* Virtual dRAID spare
*/
static nvlist_t *
make_leaf_vdev(nvlist_t *props, const char *arg, boolean_t is_primary)
make_leaf_vdev(const char *arg, boolean_t is_primary, uint64_t ashift)
{
char path[MAXPATHLEN];
struct stat64 statbuf;
nvlist_t *vdev = NULL;
const char *type = NULL;
boolean_t wholedisk = B_FALSE;
uint64_t ashift = 0;
int err;
/*
@@ -381,31 +380,6 @@ make_leaf_vdev(nvlist_t *props, const char *arg, boolean_t is_primary)
verify(nvlist_add_uint64(vdev, ZPOOL_CONFIG_WHOLE_DISK,
(uint64_t)wholedisk) == 0);
/*
* Override defaults if custom properties are provided.
*/
if (props != NULL) {
const char *value = NULL;
if (nvlist_lookup_string(props,
zpool_prop_to_name(ZPOOL_PROP_ASHIFT), &value) == 0) {
if (zfs_nicestrtonum(NULL, value, &ashift) != 0) {
(void) fprintf(stderr,
gettext("ashift must be a number.\n"));
return (NULL);
}
if (ashift != 0 &&
(ashift < ASHIFT_MIN || ashift > ASHIFT_MAX)) {
(void) fprintf(stderr,
gettext("invalid 'ashift=%" PRIu64 "' "
"property: only values between %" PRId32 " "
"and %" PRId32 " are allowed.\n"),
ashift, ASHIFT_MIN, ASHIFT_MAX);
return (NULL);
}
}
}
/*
* If the device is known to incorrectly report its physical sector
* size explicitly provide the known correct value.
@@ -610,22 +584,28 @@ get_replication(nvlist_t *nvroot, boolean_t fatal)
ZPOOL_CONFIG_PATH, &path) == 0);
/*
* If we have a raidz/mirror that combines disks
* with files, report it as an error.
* Skip active spares; they should never cause
* the pool to be evaluated as inconsistent.
*/
if (!dontreport && type != NULL &&
if (is_spare(NULL, path))
continue;
/*
* If we have a raidz/mirror that combines disks
* with files, only report it as an error when
* fatal is set to ensure all the replication
* checks aren't skipped in check_replication().
*/
if (fatal && !dontreport && type != NULL &&
strcmp(type, childtype) != 0) {
if (ret != NULL)
free(ret);
ret = NULL;
if (fatal)
vdev_error(gettext(
"mismatched replication "
"level: %s contains both "
"files and devices\n"),
rep.zprl_type);
else
return (NULL);
vdev_error(gettext(
"mismatched replication "
"level: %s contains both "
"files and devices\n"),
rep.zprl_type);
dontreport = B_TRUE;
}
@@ -1496,6 +1476,29 @@ construct_spec(nvlist_t *props, int argc, char **argv)
const char *type, *fulltype;
boolean_t is_log, is_special, is_dedup, is_spare;
boolean_t seen_logs;
uint64_t ashift = 0;
if (props != NULL) {
const char *value = NULL;
if (nvlist_lookup_string(props,
zpool_prop_to_name(ZPOOL_PROP_ASHIFT), &value) == 0) {
if (zfs_nicestrtonum(NULL, value, &ashift) != 0) {
(void) fprintf(stderr,
gettext("ashift must be a number.\n"));
return (NULL);
}
if (ashift != 0 &&
(ashift < ASHIFT_MIN || ashift > ASHIFT_MAX)) {
(void) fprintf(stderr,
gettext("invalid 'ashift=%" PRIu64 "' "
"property: only values between %" PRId32 " "
"and %" PRId32 " are allowed.\n"),
ashift, ASHIFT_MIN, ASHIFT_MAX);
return (NULL);
}
}
}
top = NULL;
toplevels = 0;
@@ -1602,9 +1605,9 @@ construct_spec(nvlist_t *props, int argc, char **argv)
children * sizeof (nvlist_t *));
if (child == NULL)
zpool_no_memory();
if ((nv = make_leaf_vdev(props, argv[c],
if ((nv = make_leaf_vdev(argv[c],
!(is_log || is_special || is_dedup ||
is_spare))) == NULL) {
is_spare), ashift)) == NULL) {
for (c = 0; c < children - 1; c++)
nvlist_free(child[c]);
free(child);
@@ -1668,6 +1671,10 @@ construct_spec(nvlist_t *props, int argc, char **argv)
ZPOOL_CONFIG_ALLOCATION_BIAS,
VDEV_ALLOC_BIAS_DEDUP) == 0);
}
if (ashift > 0) {
fnvlist_add_uint64(nv,
ZPOOL_CONFIG_ASHIFT, ashift);
}
if (strcmp(type, VDEV_TYPE_RAIDZ) == 0) {
verify(nvlist_add_uint64(nv,
ZPOOL_CONFIG_NPARITY,
@@ -1695,8 +1702,9 @@ construct_spec(nvlist_t *props, int argc, char **argv)
* We have a device. Pass off to make_leaf_vdev() to
* construct the appropriate nvlist describing the vdev.
*/
if ((nv = make_leaf_vdev(props, argv[0], !(is_log ||
is_special || is_dedup || is_spare))) == NULL)
if ((nv = make_leaf_vdev(argv[0], !(is_log ||
is_special || is_dedup || is_spare),
ashift)) == NULL)
goto spec_out;
verify(nvlist_add_uint64(nv,
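Note on the ashift handling above: the change moves the parsing and range check out of make_leaf_vdev() and into construct_spec(), so the property is validated once and then passed down as a plain uint64_t. A minimal sketch of that validate-once step follows. It assumes strtoull() in place of zfs_nicestrtonum() and assumes bounds of 9 and 16 for ASHIFT_MIN and ASHIFT_MAX, since the diff does not show their values. `parse_ashift` and the `SKETCH_` macros are hypothetical names.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed bounds; the diff only references ASHIFT_MIN/ASHIFT_MAX. */
#define	SKETCH_ASHIFT_MIN	9
#define	SKETCH_ASHIFT_MAX	16

/*
 * Parse an ashift property string. As in the diff, 0 is let through
 * without a range check. Returns 0 on success, or -1 if the string is
 * not a plain number or the value is out of range.
 */
static int
parse_ashift(const char *value, uint64_t *ashift)
{
	char *end;
	unsigned long long v;

	if (value == NULL || *value == '\0')
		return (-1);
	errno = 0;
	v = strtoull(value, &end, 10);
	if (errno != 0 || *end != '\0')
		return (-1);
	if (v != 0 && (v < SKETCH_ASHIFT_MIN || v > SKETCH_ASHIFT_MAX))
		return (-1);
	*ashift = v;
	return (0);
}
```

With the value parsed up front, every call that builds a leaf vdev receives the same number, and the top-level vdev can record it too, as the `if (ashift > 0)` block in the diff does.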
+25 -18
@@ -3881,7 +3881,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
* If newvd is too small, it should fail with EOVERFLOW.
*
* If newvd is a distributed spare and it's being attached to a
* dRAID which is not its parent it should fail with EINVAL.
* dRAID which is not its parent it should fail with ENOTSUP.
*/
if (pvd->vdev_ops != &vdev_mirror_ops &&
pvd->vdev_ops != &vdev_root_ops && (!replacing ||
@@ -3900,7 +3900,7 @@ ztest_vdev_attach_detach(ztest_ds_t *zd, uint64_t id)
else if (ashift > oldvd->vdev_top->vdev_ashift)
expected_error = EDOM;
else if (newvd_is_dspare && pvd != vdev_draid_spare_get_parent(newvd))
expected_error = EINVAL;
expected_error = ENOTSUP;
else
expected_error = 0;
@@ -7812,6 +7812,9 @@ ztest_dataset_open(int d)
ztest_dataset_name(name, ztest_opts.zo_pool, d);
if (ztest_opts.zo_verbose >= 6)
(void) printf("Opening %s\n", name);
(void) pthread_rwlock_rdlock(&ztest_name_lock);
error = ztest_dataset_create(name);
@@ -8307,41 +8310,44 @@ static void
ztest_generic_run(ztest_shared_t *zs, spa_t *spa)
{
kthread_t **run_threads;
int t;
int i, ndatasets;
run_threads = umem_zalloc(ztest_opts.zo_threads * sizeof (kthread_t *),
UMEM_NOFAIL);
/*
* Actual number of datasets to be used.
*/
ndatasets = MIN(ztest_opts.zo_datasets, ztest_opts.zo_threads);
/*
* Prepare the datasets first.
*/
for (i = 0; i < ndatasets; i++)
VERIFY0(ztest_dataset_open(i));
/*
* Kick off all the tests that run in parallel.
*/
for (t = 0; t < ztest_opts.zo_threads; t++) {
if (t < ztest_opts.zo_datasets && ztest_dataset_open(t) != 0) {
umem_free(run_threads, ztest_opts.zo_threads *
sizeof (kthread_t *));
return;
}
run_threads[t] = thread_create(NULL, 0, ztest_thread,
(void *)(uintptr_t)t, 0, NULL, TS_RUN | TS_JOINABLE,
for (i = 0; i < ztest_opts.zo_threads; i++) {
run_threads[i] = thread_create(NULL, 0, ztest_thread,
(void *)(uintptr_t)i, 0, NULL, TS_RUN | TS_JOINABLE,
defclsyspri);
}
/*
* Wait for all of the tests to complete.
*/
for (t = 0; t < ztest_opts.zo_threads; t++)
VERIFY0(thread_join(run_threads[t]));
for (i = 0; i < ztest_opts.zo_threads; i++)
VERIFY0(thread_join(run_threads[i]));
/*
* Close all datasets. This must be done after all the threads
* are joined so we can be sure none of the datasets are in-use
* by any of the threads.
*/
for (t = 0; t < ztest_opts.zo_threads; t++) {
if (t < ztest_opts.zo_datasets)
ztest_dataset_close(t);
}
for (i = 0; i < ndatasets; i++)
ztest_dataset_close(i);
txg_wait_synced(spa_get_dsl(spa), 0);
@@ -8464,6 +8470,7 @@ ztest_run(ztest_shared_t *zs)
int d = ztest_random(ztest_opts.zo_datasets);
ztest_dataset_destroy(d);
txg_wait_synced(spa_get_dsl(spa), 0);
}
zs->zs_enospc_count = 0;
+2 -2
@@ -72,7 +72,7 @@
# modified version of the Autoconf Macro, you may extend this special
# exception to the GPL to apply to your modified version as well.
#serial 36
#serial 37
AU_ALIAS([AC_PYTHON_DEVEL], [AX_PYTHON_DEVEL])
AC_DEFUN([AX_PYTHON_DEVEL],[
@@ -316,7 +316,7 @@ EOD`
PYTHON_LIBS="-L$ac_python_libdir -lpython$ac_python_version"
fi
if test -z "PYTHON_LIBS"; then
if test -z "$PYTHON_LIBS"; then
AC_MSG_WARN([
Cannot determine location of your Python DSO. Please check it was installed with
dynamic libraries enabled, or try setting PYTHON_LIBS by hand.
+3 -6
@@ -29,9 +29,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH_4ARG], [
const char *path = "path";
fmode_t mode = 0;
void *holder = NULL;
struct blk_holder_ops h;
bdev = blkdev_get_by_path(path, mode, holder, &h);
bdev = blkdev_get_by_path(path, mode, holder, NULL);
])
])
@@ -48,9 +47,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_OPEN_BY_PATH], [
const char *path = "path";
fmode_t mode = 0;
void *holder = NULL;
struct blk_holder_ops h;
bdh = bdev_open_by_path(path, mode, holder, &h);
bdh = bdev_open_by_path(path, mode, holder, NULL);
])
])
@@ -68,9 +66,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BDEV_FILE_OPEN_BY_PATH], [
const char *path = "path";
fmode_t mode = 0;
void *holder = NULL;
struct blk_holder_ops h;
file = bdev_file_open_by_path(path, mode, holder, &h);
file = bdev_file_open_by_path(path, mode, holder, NULL);
])
])
+34
@@ -119,15 +119,49 @@ AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK], [
])
])
dnl #
dnl # 6.18 API change
dnl # block_device_operations->getgeo takes struct gendisk* as first arg
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_GETGEO_GENDISK], [
ZFS_LINUX_TEST_SRC([block_device_operations_getgeo_gendisk], [
#include <linux/blkdev.h>
static int blk_getgeo(struct gendisk *disk, struct hd_geometry *geo)
{
(void) disk, (void) geo;
return (0);
}
static const struct block_device_operations
bops __attribute__ ((unused)) = {
.getgeo = blk_getgeo,
};
], [], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_GETGEO_GENDISK], [
AC_MSG_CHECKING([whether bops->getgeo() takes gendisk as first arg])
ZFS_LINUX_TEST_RESULT([block_device_operations_getgeo_gendisk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLOCK_DEVICE_OPERATIONS_GETGEO_GENDISK], [1],
[Define if getgeo() in block_device_operations takes struct gendisk * as its first arg])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_RELEASE_1ARG
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
ZFS_AC_KERNEL_SRC_BLOCK_DEVICE_OPERATIONS_GETGEO_GENDISK
])
AC_DEFUN([ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS], [
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_CHECK_EVENTS
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_RELEASE_VOID
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_REVALIDATE_DISK
ZFS_AC_KERNEL_BLOCK_DEVICE_OPERATIONS_GETGEO_GENDISK
])
+32 -5
@@ -24,6 +24,9 @@ dnl #
dnl # 2.6.38 API change
dnl # Added d_set_d_op() helper function.
dnl #
dnl # 6.17 API change
dnl # d_set_d_op() removed. No direct replacement.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_D_SET_D_OP], [
ZFS_LINUX_TEST_SRC([d_set_d_op], [
#include <linux/dcache.h>
@@ -34,22 +37,46 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_D_SET_D_OP], [
AC_DEFUN([ZFS_AC_KERNEL_D_SET_D_OP], [
AC_MSG_CHECKING([whether d_set_d_op() is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([d_set_d_op],
[d_set_d_op], [fs/dcache.c], [
ZFS_LINUX_TEST_RESULT([d_set_d_op], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_D_SET_D_OP, 1,
[Define if d_set_d_op() is available])
], [
ZFS_LINUX_TEST_ERROR([d_set_d_op])
AC_MSG_RESULT(no)
])
])
dnl #
dnl # 6.17 API change
dnl # sb->s_d_op removed; set_default_d_op(sb, dop) added
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SET_DEFAULT_D_OP], [
ZFS_LINUX_TEST_SRC([set_default_d_op], [
#include <linux/dcache.h>
], [
set_default_d_op(NULL, NULL);
])
])
AC_DEFUN([ZFS_AC_KERNEL_SET_DEFAULT_D_OP], [
AC_MSG_CHECKING([whether set_default_d_op() is available])
ZFS_LINUX_TEST_RESULT([set_default_d_op], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SET_DEFAULT_D_OP, 1,
[Define if set_default_d_op() is available])
], [
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_DENTRY], [
ZFS_AC_KERNEL_SRC_D_OBTAIN_ALIAS
ZFS_AC_KERNEL_SRC_D_SET_D_OP
ZFS_AC_KERNEL_SRC_S_D_OP
ZFS_AC_KERNEL_SRC_SET_DEFAULT_D_OP
])
AC_DEFUN([ZFS_AC_KERNEL_DENTRY], [
ZFS_AC_KERNEL_D_OBTAIN_ALIAS
ZFS_AC_KERNEL_D_SET_D_OP
ZFS_AC_KERNEL_S_D_OP
ZFS_AC_KERNEL_SET_DEFAULT_D_OP
])
+24
@@ -0,0 +1,24 @@
dnl #
dnl # 6.18 API change
dnl # - generic_drop_inode() renamed to inode_generic_drop()
dnl # - generic_delete_inode() renamed to inode_just_drop()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_GENERIC_DROP], [
ZFS_LINUX_TEST_SRC([inode_generic_drop], [
#include <linux/fs.h>
],[
struct inode *ip = NULL;
inode_generic_drop(ip);
])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_GENERIC_DROP], [
AC_MSG_CHECKING([whether inode_generic_drop() exists])
ZFS_LINUX_TEST_RESULT([inode_generic_drop], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_GENERIC_DROP, 1,
[inode_generic_drop() exists])
],[
AC_MSG_RESULT(no)
])
])
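A minimal sketch of how the `HAVE_INODE_GENERIC_DROP` define produced by this check might be consumed. The wrapper name `zpl_generic_drop_inode`, the simplified `struct inode`, and both stub functions are hypothetical stand-ins so the snippet builds in userspace; the call sites actually touched by this change are not shown in this section.

```c
/* Userspace stand-in for the kernel type; only what the stubs need. */
struct inode {
	int i_nlink;
};

/*
 * Simplified stubs for the two kernel names. A real kernel exports only
 * one of them: inode_generic_drop() on 6.18+, generic_drop_inode() before.
 */
static int __attribute__((unused))
inode_generic_drop(struct inode *ip)
{
	return (ip->i_nlink == 0);
}

static int __attribute__((unused))
generic_drop_inode(struct inode *ip)
{
	return (ip->i_nlink == 0);
}

/* Hypothetical compat wrapper keyed off the configure result. */
#ifdef HAVE_INODE_GENERIC_DROP
#define	zpl_generic_drop_inode(ip)	inode_generic_drop(ip)
#else
#define	zpl_generic_drop_inode(ip)	generic_drop_inode(ip)
#endif
```

Callers would use only the wrapper name, so the rename stays confined to one header.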
+24
@@ -0,0 +1,24 @@
dnl #
dnl # Linux 5.2 API change
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_SOPS_FREE_INODE], [
ZFS_LINUX_TEST_SRC([super_operations_free_inode], [
#include <linux/fs.h>
static void free_inode(struct inode *) { }
static struct super_operations sops __attribute__ ((unused)) = {
.free_inode = free_inode,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_SOPS_FREE_INODE], [
AC_MSG_CHECKING([whether sops->free_inode() exists])
ZFS_LINUX_TEST_RESULT([super_operations_free_inode], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_SOPS_FREE_INODE, 1, [sops->free_inode() exists])
],[
AC_MSG_RESULT(no)
])
])
+23
@@ -0,0 +1,23 @@
dnl #
dnl # 6.19 API change. inode->i_state no longer accessible directly; helper
dnl # functions exist.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_STATE_READ_ONCE], [
ZFS_LINUX_TEST_SRC([inode_state_read_once], [
#include <linux/fs.h>
], [
struct inode i = {};
inode_state_read_once(&i);
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_STATE_READ_ONCE], [
AC_MSG_CHECKING([whether inode_state_read_once() exists])
ZFS_LINUX_TEST_RESULT([inode_state_read_once], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_INODE_STATE_READ_ONCE, 1,
[inode_state_read_once() exists])
],[
AC_MSG_RESULT(no)
])
])
+1 -1
@@ -7,7 +7,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_KMAP_ATOMIC_ARGS], [
ZFS_LINUX_TEST_SRC([kmap_atomic], [
#include <linux/pagemap.h>
],[
struct page page;
struct page page = {};
kmap_atomic(&page);
])
])
+27
@@ -16,9 +16,36 @@ AC_DEFUN([ZFS_AC_KERNEL_MM_PAGE_FLAG_ERROR], [
])
])
dnl #
dnl # Linux 6.18+ uses a struct typedef (memdesc_flags_t) instead of an
dnl # 'unsigned long' for the 'flags' field in 'struct page'.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_MM_PAGE_FLAGS_STRUCT], [
ZFS_LINUX_TEST_SRC([mm_page_flags_struct], [
#include <linux/mm.h>
static const struct page p __attribute__ ((unused)) = {
.flags = { .f = 0 }
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_MM_PAGE_FLAGS_STRUCT], [
AC_MSG_CHECKING([whether 'flags' in 'struct page' is a struct])
ZFS_LINUX_TEST_RESULT([mm_page_flags_struct], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_MM_PAGE_FLAGS_STRUCT, 1,
['flags' in 'struct page' is a struct])
],[
AC_MSG_RESULT([no])
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_MM_PAGE_FLAGS], [
ZFS_AC_KERNEL_SRC_MM_PAGE_FLAG_ERROR
ZFS_AC_KERNEL_SRC_MM_PAGE_FLAGS_STRUCT
])
AC_DEFUN([ZFS_AC_KERNEL_MM_PAGE_FLAGS], [
ZFS_AC_KERNEL_MM_PAGE_FLAG_ERROR
ZFS_AC_KERNEL_MM_PAGE_FLAGS_STRUCT
])
+31
@@ -0,0 +1,31 @@
dnl #
dnl # 6.18 API change
dnl # ns->ops->type was moved to ns->ns.ns_type (struct ns_common)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_NS_COMMON_TYPE], [
ZFS_LINUX_TEST_SRC([ns_common_type], [
#include <linux/user_namespace.h>
],[
struct user_namespace ns;
ns.ns.ns_type = 0;
])
])
AC_DEFUN([ZFS_AC_KERNEL_NS_COMMON_TYPE], [
AC_MSG_CHECKING([whether ns_type is accessible through ns_common])
ZFS_LINUX_TEST_RESULT([ns_common_type], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_NS_COMMON_TYPE], 1,
[Define if ns_type is accessible through ns_common])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_NAMESPACE], [
ZFS_AC_KERNEL_SRC_NS_COMMON_TYPE
])
AC_DEFUN([ZFS_AC_KERNEL_NAMESPACE], [
ZFS_AC_KERNEL_NS_COMMON_TYPE
])
+17
@@ -49,6 +49,15 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_OBJTOOL], [
#error "STACK_FRAME_NON_STANDARD is not defined."
#endif
])
dnl # 6.15 made CONFIG_OBJTOOL_WERROR=y the default. We need to handle
dnl # this or our build will fail.
ZFS_LINUX_TEST_SRC([config_objtool_werror], [
#if !defined(CONFIG_OBJTOOL_WERROR)
#error "CONFIG_OBJTOOL_WERROR is not defined."
#endif
])
])
AC_DEFUN([ZFS_AC_KERNEL_OBJTOOL], [
@@ -84,6 +93,14 @@ AC_DEFUN([ZFS_AC_KERNEL_OBJTOOL], [
],[
AC_MSG_RESULT(no)
])
AC_MSG_CHECKING([whether CONFIG_OBJTOOL_WERROR is defined])
ZFS_LINUX_TEST_RESULT([config_objtool_werror],[
AC_MSG_RESULT(yes)
CONFIG_OBJTOOL_WERROR_DEFINED=yes
],[
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)
])
+23
@@ -0,0 +1,23 @@
dnl #
dnl # Linux 6.16 removed readahead_page
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PAGEMAP_READAHEAD_PAGE], [
ZFS_LINUX_TEST_SRC([pagemap_has_readahead_page], [
#include <linux/pagemap.h>
], [
struct page *p __attribute__ ((unused)) = NULL;
struct readahead_control *ractl __attribute__ ((unused)) = NULL;
p = readahead_page(ractl);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PAGEMAP_READAHEAD_PAGE], [
AC_MSG_CHECKING([whether readahead_page() exists])
ZFS_LINUX_TEST_RESULT([pagemap_has_readahead_page], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_PAGEMAP_READAHEAD_PAGE, 1,
[readahead_page() exists])
],[
AC_MSG_RESULT([no])
])
])
-79
@@ -1,79 +0,0 @@
dnl #
dnl # 2.6.38 API change
dnl # ns_capable() was introduced
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_NS_CAPABLE], [
ZFS_LINUX_TEST_SRC([ns_capable], [
#include <linux/capability.h>
],[
ns_capable((struct user_namespace *)NULL, CAP_SYS_ADMIN);
])
])
AC_DEFUN([ZFS_AC_KERNEL_NS_CAPABLE], [
AC_MSG_CHECKING([whether ns_capable exists])
ZFS_LINUX_TEST_RESULT([ns_capable], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([ns_capable()])
])
])
dnl #
dnl # 2.6.39 API change
dnl # struct user_namespace was added to struct cred_t as cred->user_ns member
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CRED_USER_NS], [
ZFS_LINUX_TEST_SRC([cred_user_ns], [
#include <linux/cred.h>
],[
struct cred cr;
cr.user_ns = (struct user_namespace *)NULL;
])
])
AC_DEFUN([ZFS_AC_KERNEL_CRED_USER_NS], [
AC_MSG_CHECKING([whether cred_t->user_ns exists])
ZFS_LINUX_TEST_RESULT([cred_user_ns], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([cred_t->user_ns()])
])
])
dnl #
dnl # 3.4 API change
dnl # kuid_has_mapping() and kgid_has_mapping() were added to distinguish
dnl # between internal kernel uids/gids and user namespace uids/gids.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_KUID_HAS_MAPPING], [
ZFS_LINUX_TEST_SRC([kuid_has_mapping], [
#include <linux/uidgid.h>
],[
kuid_has_mapping((struct user_namespace *)NULL, KUIDT_INIT(0));
kgid_has_mapping((struct user_namespace *)NULL, KGIDT_INIT(0));
])
])
AC_DEFUN([ZFS_AC_KERNEL_KUID_HAS_MAPPING], [
AC_MSG_CHECKING([whether kuid_has_mapping/kgid_has_mapping exist])
ZFS_LINUX_TEST_RESULT([kuid_has_mapping], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([kuid_has_mapping()])
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_USERNS_CAPABILITIES], [
ZFS_AC_KERNEL_SRC_NS_CAPABLE
ZFS_AC_KERNEL_SRC_HAS_CAPABILITY
ZFS_AC_KERNEL_SRC_CRED_USER_NS
ZFS_AC_KERNEL_SRC_KUID_HAS_MAPPING
])
AC_DEFUN([ZFS_AC_KERNEL_USERNS_CAPABILITIES], [
ZFS_AC_KERNEL_NS_CAPABLE
ZFS_AC_KERNEL_HAS_CAPABILITY
ZFS_AC_KERNEL_CRED_USER_NS
ZFS_AC_KERNEL_KUID_HAS_MAPPING
])
+24
@@ -0,0 +1,24 @@
dnl #
dnl # Linux 6.16 removes address_space_operations ->writepage
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_WRITEPAGE], [
ZFS_LINUX_TEST_SRC([vfs_has_writepage], [
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.writepage = NULL,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_WRITEPAGE], [
AC_MSG_CHECKING([whether aops->writepage exists])
ZFS_LINUX_TEST_RESULT([vfs_has_writepage], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_WRITEPAGE, 1,
[address_space_operations->writepage exists])
],[
AC_MSG_RESULT([no])
])
])
+58
@@ -0,0 +1,58 @@
AC_DEFUN([ZFS_AC_KERNEL_SRC_WRITEPAGE_T], [
dnl #
dnl # 6.3 API change
dnl # The writepage_t function type now has its first argument as
dnl # struct folio* instead of struct page*
dnl #
ZFS_LINUX_TEST_SRC([writepage_t_folio], [
#include <linux/writeback.h>
static int putpage(struct folio *folio,
struct writeback_control *wbc, void *data)
{ return 0; }
writepage_t func = putpage;
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_WRITEPAGE_T], [
AC_MSG_CHECKING([whether int (*writepage_t)() takes struct folio*])
ZFS_LINUX_TEST_RESULT([writepage_t_folio], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_WRITEPAGE_T_FOLIO, 1,
[int (*writepage_t)() takes struct folio*])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_WRITE_CACHE_PAGES], [
dnl #
dnl # 6.18 API change
dnl # write_cache_pages() has been removed.
dnl #
ZFS_LINUX_TEST_SRC([write_cache_pages], [
#include <linux/writeback.h>
], [
(void) write_cache_pages(NULL, NULL, NULL, NULL);
])
])
AC_DEFUN([ZFS_AC_KERNEL_WRITE_CACHE_PAGES], [
AC_MSG_CHECKING([whether write_cache_pages() is available])
ZFS_LINUX_TEST_RESULT([write_cache_pages], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_WRITE_CACHE_PAGES, 1,
[write_cache_pages() is available])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_WRITEBACK], [
ZFS_AC_KERNEL_SRC_WRITEPAGE_T
ZFS_AC_KERNEL_SRC_WRITE_CACHE_PAGES
])
AC_DEFUN([ZFS_AC_KERNEL_WRITEBACK], [
ZFS_AC_KERNEL_WRITEPAGE_T
ZFS_AC_KERNEL_WRITE_CACHE_PAGES
])
-26
@@ -1,26 +0,0 @@
AC_DEFUN([ZFS_AC_KERNEL_SRC_WRITEPAGE_T], [
dnl #
dnl # 6.3 API change
dnl # The writepage_t function type now has its first argument as
dnl # struct folio* instead of struct page*
dnl #
ZFS_LINUX_TEST_SRC([writepage_t_folio], [
#include <linux/writeback.h>
static int putpage(struct folio *folio,
struct writeback_control *wbc, void *data)
{ return 0; }
writepage_t func = putpage;
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_WRITEPAGE_T], [
AC_MSG_CHECKING([whether int (*writepage_t)() takes struct folio*])
ZFS_LINUX_TEST_RESULT([writepage_t_folio], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_WRITEPAGE_T_FOLIO, 1,
[int (*writepage_t)() takes struct folio*])
],[
AC_MSG_RESULT(no)
])
])
+16 -2
@@ -59,6 +59,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_ACL
ZFS_AC_KERNEL_SRC_INODE_SETATTR
ZFS_AC_KERNEL_SRC_INODE_GETATTR
ZFS_AC_KERNEL_SRC_INODE_STATE_READ_ONCE
ZFS_AC_KERNEL_SRC_SHOW_OPTIONS
ZFS_AC_KERNEL_SRC_SHRINKER
ZFS_AC_KERNEL_SRC_MKDIR
@@ -70,6 +71,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_COMMIT_METADATA
ZFS_AC_KERNEL_SRC_SETATTR_PREPARE
ZFS_AC_KERNEL_SRC_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_SRC_DENTRY
ZFS_AC_KERNEL_SRC_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SRC_SECURITY_INODE
ZFS_AC_KERNEL_SRC_FST_MOUNT
@@ -82,6 +84,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_VFS_MIGRATEPAGE
ZFS_AC_KERNEL_SRC_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_SRC_VFS_READPAGES
ZFS_AC_KERNEL_SRC_VFS_WRITEPAGE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_VFS_IOV_ITER
ZFS_AC_KERNEL_SRC_VFS_GENERIC_COPY_FILE_RANGE
@@ -111,6 +114,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_SRC_STRLCPY
ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_SRC_PAGEMAP_READAHEAD_PAGE
ZFS_AC_KERNEL_SRC_ADD_DISK
ZFS_AC_KERNEL_SRC_KTHREAD
ZFS_AC_KERNEL_SRC_ZERO_PAGE
@@ -118,7 +122,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_IDMAP_MNT_API
ZFS_AC_KERNEL_SRC_IDMAP_NO_USERNS
ZFS_AC_KERNEL_SRC_IATTR_VFSID
ZFS_AC_KERNEL_SRC_WRITEPAGE_T
ZFS_AC_KERNEL_SRC_WRITEBACK
ZFS_AC_KERNEL_SRC_RECLAIMED
ZFS_AC_KERNEL_SRC_REGISTER_SYSCTL_TABLE
ZFS_AC_KERNEL_SRC_REGISTER_SYSCTL_SZ
@@ -132,6 +136,9 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_PIN_USER_PAGES
ZFS_AC_KERNEL_SRC_TIMER
ZFS_AC_KERNEL_SRC_SUPER_BLOCK_S_WB_ERR
ZFS_AC_KERNEL_SRC_SOPS_FREE_INODE
ZFS_AC_KERNEL_SRC_NAMESPACE
ZFS_AC_KERNEL_SRC_INODE_GENERIC_DROP
case "$host_cpu" in
powerpc*)
ZFS_AC_KERNEL_SRC_CPU_HAS_FEATURE
@@ -174,6 +181,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_ACL
ZFS_AC_KERNEL_INODE_SETATTR
ZFS_AC_KERNEL_INODE_GETATTR
ZFS_AC_KERNEL_INODE_STATE_READ_ONCE
ZFS_AC_KERNEL_SHOW_OPTIONS
ZFS_AC_KERNEL_SHRINKER
ZFS_AC_KERNEL_MKDIR
@@ -185,6 +193,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_COMMIT_METADATA
ZFS_AC_KERNEL_SETATTR_PREPARE
ZFS_AC_KERNEL_INSERT_INODE_LOCKED
ZFS_AC_KERNEL_DENTRY
ZFS_AC_KERNEL_TRUNCATE_SETSIZE
ZFS_AC_KERNEL_SECURITY_INODE
ZFS_AC_KERNEL_FST_MOUNT
@@ -197,6 +206,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_VFS_MIGRATEPAGE
ZFS_AC_KERNEL_VFS_FSYNC_2ARGS
ZFS_AC_KERNEL_VFS_READPAGES
ZFS_AC_KERNEL_VFS_WRITEPAGE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_VFS_IOV_ITER
ZFS_AC_KERNEL_VFS_GENERIC_COPY_FILE_RANGE
@@ -226,6 +236,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_STRLCPY
ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_PAGEMAP_READAHEAD_PAGE
ZFS_AC_KERNEL_ADD_DISK
ZFS_AC_KERNEL_KTHREAD
ZFS_AC_KERNEL_ZERO_PAGE
@@ -233,7 +244,7 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_IDMAP_MNT_API
ZFS_AC_KERNEL_IDMAP_NO_USERNS
ZFS_AC_KERNEL_IATTR_VFSID
ZFS_AC_KERNEL_WRITEPAGE_T
ZFS_AC_KERNEL_WRITEBACK
ZFS_AC_KERNEL_RECLAIMED
ZFS_AC_KERNEL_REGISTER_SYSCTL_TABLE
ZFS_AC_KERNEL_REGISTER_SYSCTL_SZ
@@ -248,6 +259,9 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_PIN_USER_PAGES
ZFS_AC_KERNEL_TIMER
ZFS_AC_KERNEL_SUPER_BLOCK_S_WB_ERR
ZFS_AC_KERNEL_SOPS_FREE_INODE
ZFS_AC_KERNEL_NAMESPACE
ZFS_AC_KERNEL_INODE_GENERIC_DROP
case "$host_cpu" in
powerpc*)
ZFS_AC_KERNEL_CPU_HAS_FEATURE
+46 -23
@@ -38,9 +38,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSE], [
AC_MSG_CHECKING([whether host toolchain supports SSE])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
__asm__ __volatile__("xorps %xmm0, %xmm1");
return (0);
}
]])], [
AC_DEFINE([HAVE_SSE], 1, [Define if host toolchain supports SSE])
@@ -57,9 +58,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSE2], [
AC_MSG_CHECKING([whether host toolchain supports SSE2])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
__asm__ __volatile__("pxor %xmm0, %xmm1");
return (0);
}
]])], [
AC_DEFINE([HAVE_SSE2], 1, [Define if host toolchain supports SSE2])
@@ -76,10 +78,11 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSE3], [
AC_MSG_CHECKING([whether host toolchain supports SSE3])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
char v[16];
__asm__ __volatile__("lddqu %0,%%xmm0" :: "m"(v[0]));
return (0);
}
]])], [
AC_DEFINE([HAVE_SSE3], 1, [Define if host toolchain supports SSE3])
@@ -96,9 +99,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSSE3], [
AC_MSG_CHECKING([whether host toolchain supports SSSE3])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
__asm__ __volatile__("pshufb %xmm0,%xmm1");
return (0);
}
]])], [
AC_DEFINE([HAVE_SSSE3], 1, [Define if host toolchain supports SSSE3])
@@ -115,9 +119,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSE4_1], [
AC_MSG_CHECKING([whether host toolchain supports SSE4.1])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
__asm__ __volatile__("pmaxsb %xmm0,%xmm1");
return (0);
}
]])], [
AC_DEFINE([HAVE_SSE4_1], 1, [Define if host toolchain supports SSE4.1])
@@ -134,9 +139,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_SSE4_2], [
AC_MSG_CHECKING([whether host toolchain supports SSE4.2])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
__asm__ __volatile__("pcmpgtq %xmm0, %xmm1");
return (0);
}
]])], [
AC_DEFINE([HAVE_SSE4_2], 1, [Define if host toolchain supports SSE4.2])
@@ -153,10 +159,11 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX], [
AC_MSG_CHECKING([whether host toolchain supports AVX])
AC_LINK_IFELSE([AC_LANG_SOURCE([[
void main()
int main()
{
char v[32];
__asm__ __volatile__("vmovdqa %0,%%ymm0" :: "m"(v[0]));
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -174,9 +181,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX2], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpshufb %ymm0,%ymm1,%ymm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -194,9 +202,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512F], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpandd %zmm0,%zmm1,%zmm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -214,9 +223,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512CD], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vplzcntd %zmm0,%zmm1");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -234,9 +244,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512DQ], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vandpd %zmm0,%zmm1,%zmm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -254,9 +265,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512BW], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpshufb %zmm0,%zmm1,%zmm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -274,9 +286,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512IFMA], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpmadd52luq %zmm0,%zmm1,%zmm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -294,9 +307,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512VBMI], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpermb %zmm0,%zmm1,%zmm2");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -314,9 +328,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512PF], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vgatherpf0dps (%rsi,%zmm0,4){%k1}");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -334,9 +349,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512ER], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vexp2pd %zmm0,%zmm1");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -354,9 +370,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AVX512VL], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("vpabsq %zmm0,%zmm1");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -374,9 +391,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AES], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("aesenc %xmm0, %xmm1");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -394,9 +412,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_PCLMULQDQ], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("pclmulqdq %0, %%xmm0, %%xmm1" :: "i"(0));
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -414,9 +433,10 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
__asm__ __volatile__("movbe 0(%eax), %eax");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -434,10 +454,11 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVE], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsave %[b]\n" : : [b] "m" (*b) : "memory");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -455,10 +476,11 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVEOPT], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsaveopt %[b]\n" : : [b] "m" (*b) : "memory");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
@@ -476,10 +498,11 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVES], [
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
int main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsaves %[b]\n" : : [b] "m" (*b) : "memory");
return (0);
}
]])], [
AC_MSG_RESULT([yes])
+34
@@ -0,0 +1,34 @@
dnl #
dnl # Check for statx() function and STATX_MNT_ID availability
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_STATX], [
AC_CHECK_HEADERS([sys/stat.h],
[have_stat_headers=yes],
[have_stat_headers=no])
AS_IF([test "x$have_stat_headers" = "xyes"], [
AC_CHECK_FUNC([statx], [
AC_DEFINE([HAVE_STATX], [1], [statx() is available])
dnl Check for STATX_MNT_ID availability
AC_MSG_CHECKING([for STATX_MNT_ID])
AC_COMPILE_IFELSE([
AC_LANG_PROGRAM([[
#include <sys/stat.h>
]], [[
struct statx stx;
int mask = STATX_MNT_ID;
(void)mask;
(void)stx.stx_mnt_id;
]])
], [
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_STATX_MNT_ID], [1], [STATX_MNT_ID is available])
], [
AC_MSG_RESULT([no])
])
])
], [
AC_MSG_WARN([sys/stat.h not found; skipping statx support])
])
]) dnl end AC_DEFUN
+1
@@ -17,6 +17,7 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBUDEV
ZFS_AC_CONFIG_USER_LIBUUID
ZFS_AC_CONFIG_USER_LIBBLKID
ZFS_AC_CONFIG_USER_STATX
])
ZFS_AC_CONFIG_USER_LIBTIRPC
ZFS_AC_CONFIG_USER_LIBCRYPTO
+40
@@ -205,6 +205,46 @@ AC_DEFUN([ZFS_AC_DEBUG_INVARIANTS], [
AC_MSG_RESULT([$enable_invariants])
])
dnl # Disabled by default. If enabled, allows a configured "turn objtool
dnl # warnings into errors" (CONFIG_OBJTOOL_WERROR) behavior to take effect.
dnl # If disabled, objtool warnings are never turned into errors. It can't
dnl # be enabled if the kernel wasn't compiled with CONFIG_OBJTOOL_WERROR=y.
dnl #
AC_DEFUN([ZFS_AC_OBJTOOL_WERROR], [
AC_MSG_CHECKING([whether objtool error on warning behavior is enabled])
AC_ARG_ENABLE([objtool-werror],
[AS_HELP_STRING([--enable-objtool-werror],
[Enable objtool's error on warning behaviour if present @<:@default=no@:>@])],
[enable_objtool_werror=$enableval],
[enable_objtool_werror=no])
AC_MSG_RESULT([$enable_objtool_werror])
AS_IF([test x$CONFIG_OBJTOOL_WERROR_DEFINED = xyes],[
AS_IF([test x$enable_objtool_werror = xyes],[
AC_MSG_NOTICE([enable-objtool-werror defined, keeping -Werror ])
],[
AC_MSG_NOTICE([enable-objtool-werror undefined, disabling -Werror ])
OBJTOOL_DISABLE_WERROR=y
abs_objtool_binary=$kernelsrc/tools/objtool/objtool
AS_IF([test -x $abs_objtool_binary],[],[
AC_MSG_ERROR([*** objtool binary $abs_objtool_binary not found])
])
dnl # The path to the wrapper is defined in modules/Makefile.in.
])
],[
dnl # We can't enable --Werror if it's not there.
AS_IF([test x$enable_objtool_werror = xyes],[
AC_MSG_ERROR([
*** Cannot enable objtool-werror,
*** a kernel built with CONFIG_OBJTOOL_WERROR=y is required.
])
],[])
])
AC_SUBST(OBJTOOL_DISABLE_WERROR)
AC_SUBST(abs_objtool_binary)
])
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
AX_COUNT_CPUS([])
AC_SUBST(CPU_COUNT)
+2
@@ -65,6 +65,7 @@ ZFS_AC_DEBUGINFO
ZFS_AC_DEBUG_KMEM
ZFS_AC_DEBUG_KMEM_TRACKING
ZFS_AC_DEBUG_INVARIANTS
ZFS_AC_OBJTOOL_WERROR
AC_CONFIG_FILES([
contrib/debian/rules
@@ -86,6 +87,7 @@ AC_CONFIG_FILES([
zfs.release
])
AC_CONFIG_FILES([scripts/objtool-wrapper], [chmod +x scripts/objtool-wrapper])
AC_OUTPUT
+4 -4
@@ -100,8 +100,8 @@ Depends: ${misc:Depends}, ${shlibs:Depends}
# The libcurl4 is loaded through dlopen("libcurl.so.4").
# https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=988521
Recommends: libcurl4
Breaks: libzfs2, libzfs4, libzfs4linux, libzfs6linux
Replaces: libzfs2, libzfs4, libzfs4linux, libzfs6linux
Breaks: libzfs2, libzfs4, libzfs4linux, libzfs6linux, openzfs-libzfs4
Replaces: libzfs2, libzfs4, libzfs4linux, libzfs6linux, openzfs-libzfs4
Conflicts: libzfs6linux
Description: OpenZFS filesystem library for Linux - general support
OpenZFS is a storage platform that encompasses the functionality of
@@ -128,8 +128,8 @@ Package: openzfs-libzpool6
Section: contrib/libs
Architecture: linux-any
Depends: ${misc:Depends}, ${shlibs:Depends}
Breaks: libzpool2, libzpool5, libzpool5linux, libzpool6linux
Replaces: libzpool2, libzpool5, libzpool5linux, libzpool6linux
Breaks: libzpool2, libzpool5, libzpool6linux
Replaces: libzpool2, libzpool5, libzpool6linux
Conflicts: libzpool6linux
Description: OpenZFS pool library for Linux
OpenZFS is a storage platform that encompasses the functionality of
+2
@@ -8,6 +8,7 @@ lib/systemd/system/zfs-import-scan.service
lib/systemd/system/zfs-import.target
lib/systemd/system/zfs-load-key.service
lib/systemd/system/zfs-mount.service
lib/systemd/system/zfs-mount@.service
lib/systemd/system/zfs-scrub-monthly@.timer
lib/systemd/system/zfs-scrub-weekly@.timer
lib/systemd/system/zfs-scrub@.service
@@ -73,6 +74,7 @@ usr/share/man/man8/zfs-recv.8
usr/share/man/man8/zfs-redact.8
usr/share/man/man8/zfs-release.8
usr/share/man/man8/zfs-rename.8
usr/share/man/man8/zfs-rewrite.8
usr/share/man/man8/zfs-rollback.8
usr/share/man/man8/zfs-send.8
usr/share/man/man8/zfs-set.8
+3 -3
@@ -93,7 +93,7 @@ override_dh_auto_install:
@# Install the DKMS source.
@# We only want the files needed to build the modules
install -D -t '$(CURDIR)/debian/tmp/usr/src/$(NAME)-$(DEB_VERSION_UPSTREAM)/scripts' \
'$(CURDIR)/scripts/dkms.postbuild'
'$(CURDIR)/scripts/dkms.postbuild' '$(CURDIR)/scripts/objtool-wrapper.in'
$(foreach file,$(DKMSFILES),mv '$(CURDIR)/$(NAME)-$(DEB_VERSION_UPSTREAM)/$(file)' '$(CURDIR)/debian/tmp/usr/src/$(NAME)-$(DEB_VERSION_UPSTREAM)' || exit 1;)
@# Only ever build Linux modules
@@ -108,8 +108,8 @@ override_dh_auto_install:
@# - zfs.release$
@# * Takes care of spaces and tabs
@# * Remove reference to ZFS_AC_PACKAGE
awk '/^AC_CONFIG_FILES\(\[/,/^\]\)/ {\
if ($$0 !~ /^(AC_CONFIG_FILES\(\[([ \t]+)?$$|\]\)([ \t]+)?$$|([ \t]+)?(include\/(Makefile|sys|os\/(Makefile|linux))|module\/|Makefile([ \t]+)?$$|zfs\.release([ \t]+)?$$))/) \
awk '/^AC_CONFIG_FILES\(\[/,/\]\)/ {\
if ($$0 !~ /^(AC_CONFIG_FILES\(\[([ \t]+)?$$|\]\)([ \t]+)?$$|([ \t]+)?(include\/(Makefile|sys|os\/(Makefile|linux))|module\/|Makefile([ \t]+)?$$|zfs\.release([ \t]+)?$$))|scripts\/objtool-wrapper.*\]\)$$/) \
{next} } {print}' \
'$(CURDIR)/$(NAME)-$(DEB_VERSION_UPSTREAM)/configure.ac' | sed '/ZFS_AC_PACKAGE/d' > '$(CURDIR)/debian/tmp/usr/src/$(NAME)-$(DEB_VERSION_UPSTREAM)/configure.ac'
@# Set "SUBDIRS = module include" for CONFIG_KERNEL and remove SUBDIRS for all other configs.
+2 -1
@@ -979,7 +979,8 @@ mountroot()
touch /run/zfs_unlock_complete
if [ -e /run/zfs_unlock_complete_notify ]; then
read -r < /run/zfs_unlock_complete_notify
# shellcheck disable=SC2034
read -r zfs_unlock_complete_notify < /run/zfs_unlock_complete_notify
fi
# ------------
+1 -1
@@ -8,7 +8,7 @@ This contrib contains community compatibility patches to get Intel QAT working o
These patches are based on the following Intel QAT version:
[1.7.l.4.10.0-00014](https://01.org/sites/default/files/downloads/qat1.7.l.4.10.0-00014.tar.gz)
When using QAT with above kernels versions, the following patches needs to be applied using:
When using QAT with the above kernel versions, the following patches need to be applied using:
patch -p1 < _$PATCH_
_Where $PATCH refers to the path of the patch in question_
-1
@@ -604,5 +604,4 @@ class RaidzExpansionRunning(ZFSError):
errno = ZFS_ERR_RAIDZ_EXPAND_IN_PROGRESS
message = "A raidz device is currently expanding"
# vim: softtabstop=4 tabstop=4 expandtab shiftwidth=4
@@ -4223,7 +4223,7 @@ class _TempPool(object):
self.getRoot().reset()
return
# On the Buildbot builders this may fail with "pool is busy"
# On the CI builders this may fail with "pool is busy"
# Retry 5 times before raising an error
retry = 0
while True:
+4 -1
@@ -8,7 +8,9 @@ usage()
exit 1
}
[ "$#" -eq 1 ] || usage
if ! [ -d "$1" ] ; then
usage
fi
KERNEL_DIR="$1"
if ! [ -e 'zfs_config.h' ]
@@ -31,6 +33,7 @@ cat > "$KERNEL_DIR/fs/zfs/Kconfig" <<EOF
config ZFS
tristate "ZFS filesystem support"
depends on EFI_PARTITION
depends on BLOCK
select ZLIB_INFLATE
select ZLIB_DEFLATE
help
+1
@@ -56,6 +56,7 @@ systemdunit_DATA = \
%D%/systemd/system/zfs-import-scan.service \
%D%/systemd/system/zfs-import.target \
%D%/systemd/system/zfs-mount.service \
%D%/systemd/system/zfs-mount@.service \
%D%/systemd/system/zfs-scrub-monthly@.timer \
%D%/systemd/system/zfs-scrub-weekly@.timer \
%D%/systemd/system/zfs-scrub@.service \
+1 -1
@@ -1,5 +1,5 @@
DESCRIPTION
These script were written with the primary intention of being portable and
These scripts were written with the primary intention of being portable and
usable on as many systems as possible.
This is, in practice, usually not possible. But the intention is there.
+26
@@ -0,0 +1,26 @@
[Unit]
Description=Mount ZFS filesystem %I
Documentation=man:zfs(8)
DefaultDependencies=no
After=systemd-udev-settle.service
After=zfs-import.target
After=zfs-mount.service
After=systemd-remount-fs.service
Before=local-fs.target
ConditionPathIsDirectory=/sys/module/zfs
# This merely tells the service manager
# that unmounting everything undoes the
# effect of this service. No extra logic
# is run as a result of these settings.
Conflicts=umount.target
Before=umount.target
[Service]
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=-@initconfdir@/zfs
ExecStart=@sbindir@/zfs mount -R %I
[Install]
WantedBy=zfs.target
+5
@@ -56,4 +56,9 @@ struct opensolaris_utsname {
#define task_io_account_read(n)
#define task_io_account_write(n)
/*
* Check if the current thread is a memory reclaim thread.
*/
extern int current_is_reclaim_thread(void);
#endif /* _OPENSOLARIS_SYS_MISC_H_ */
+3
@@ -104,6 +104,9 @@
#define spa_taskq_write_param_set_args(var) \
CTLTYPE_STRING, NULL, 0, spa_taskq_write_param, "A"
#define spa_taskq_free_param_set_args(var) \
CTLTYPE_STRING, NULL, 0, spa_taskq_free_param, "A"
#define fletcher_4_param_set_args(var) \
CTLTYPE_STRING, NULL, 0, fletcher_4_param, "A"
+3 -1
@@ -45,7 +45,9 @@
#ifdef _KERNEL
#define CPU curcpu
#define minclsyspri PRIBIO
#define defclsyspri minclsyspri
#define defclsyspri minclsyspri
/* Write issue taskq priority. */
#define wtqclsyspri ((PVM + PRIBIO) / 2)
#define maxclsyspri PVM
#define max_ncpus (mp_maxid + 1)
#define boot_max_ncpus (mp_maxid + 1)
+4 -73
@@ -290,80 +290,11 @@ extern unsigned char bcd_to_byte[256];
#define offsetof(type, field) __offsetof(type, field)
#endif
/*
* Find highest one bit set.
* Returns bit number + 1 of highest bit that is set, otherwise returns 0.
* High order bit is 31 (or 63 in _LP64 kernel).
*/
static __inline int
highbit(ulong_t i)
{
#if defined(HAVE_INLINE_FLSL)
return (flsl(i));
#else
int h = 1;
#define highbit(x) flsl(x)
#define lowbit(x) ffsl(x)
if (i == 0)
return (0);
#ifdef _LP64
if (i & 0xffffffff00000000ul) {
h += 32; i >>= 32;
}
#endif
if (i & 0xffff0000) {
h += 16; i >>= 16;
}
if (i & 0xff00) {
h += 8; i >>= 8;
}
if (i & 0xf0) {
h += 4; i >>= 4;
}
if (i & 0xc) {
h += 2; i >>= 2;
}
if (i & 0x2) {
h += 1;
}
return (h);
#endif
}
/*
* Find highest one bit set.
* Returns bit number + 1 of highest bit that is set, otherwise returns 0.
*/
static __inline int
highbit64(uint64_t i)
{
#if defined(HAVE_INLINE_FLSLL)
return (flsll(i));
#else
int h = 1;
if (i == 0)
return (0);
if (i & 0xffffffff00000000ULL) {
h += 32; i >>= 32;
}
if (i & 0xffff0000) {
h += 16; i >>= 16;
}
if (i & 0xff00) {
h += 8; i >>= 8;
}
if (i & 0xf0) {
h += 4; i >>= 4;
}
if (i & 0xc) {
h += 2; i >>= 2;
}
if (i & 0x2) {
h += 1;
}
return (h);
#endif
}
#define highbit64(x) flsll(x)
#define lowbit64(x) ffsll(x)
#ifdef __cplusplus
}
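The removed open-coded helpers and the replacement macros share one contract: return the 1-based index of the highest (or lowest) set bit, and 0 when the argument is zero. The following is a portable reference sketch for checking that contract outside the kernel. The names `highbit64_ref` and `lowbit64_ref` are hypothetical, and the loops are an assumption chosen for clarity, not the code used by the tree.

```c
#include <stdint.h>

/* 1-based index of the most significant set bit; 0 if no bit is set. */
static int
highbit64_ref(uint64_t i)
{
	int h = 0;

	while (i != 0) {
		h++;
		i >>= 1;
	}
	return (h);
}

/* 1-based index of the least significant set bit; 0 if no bit is set. */
static int
lowbit64_ref(uint64_t i)
{
	int l = 1;

	if (i == 0)
		return (0);
	while ((i & 1) == 0) {
		l++;
		i >>= 1;
	}
	return (l);
}
```

On a platform that provides `flsll()` and `ffsll()`, comparing their results against these reference functions is a quick way to confirm the macros preserve the old behavior.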
+1
@@ -8,6 +8,7 @@ kernel_linux_HEADERS = \
%D%/kernel/linux/mm_compat.h \
%D%/kernel/linux/mod_compat.h \
%D%/kernel/linux/page_compat.h \
%D%/kernel/linux/pagemap_compat.h \
%D%/kernel/linux/simd.h \
%D%/kernel/linux/simd_aarch64.h \
%D%/kernel/linux/simd_arm.h \
@@ -542,24 +542,6 @@ blk_generic_alloc_queue(make_request_fn make_request, int node_id)
}
#endif /* !HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS */
/*
* All the io_*() helper functions below can operate on a bio, or a rq, but
* not both. The older submit_bio() codepath will pass a bio, and the
* newer blk-mq codepath will pass a rq.
*/
static inline int
io_data_dir(struct bio *bio, struct request *rq)
{
if (rq != NULL) {
if (op_is_write(req_op(rq))) {
return (WRITE);
} else {
return (READ);
}
}
return (bio_data_dir(bio));
}
static inline int
io_is_flush(struct bio *bio, struct request *rq)
{
+15 -30
@@ -34,6 +34,17 @@
#define d_alias d_u.d_alias
#ifdef HAVE_MM_PAGE_FLAGS_STRUCT
/*
* Starting from Linux 6.18, the 'flags' field in 'struct page' is defined
* as a struct ('memdesc_flags_t' typedef) instead of an unsigned long for
* improved type safety.
*/
#define page_flags flags.f
#else
#define page_flags flags
#endif
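The member-alias macro above can be exercised in userspace with a stand-in type. This is a sketch under stated assumptions: the `memdesc_flags_t` and `struct page` definitions below are simplified mock-ups rather than the kernel's, `HAVE_MM_PAGE_FLAGS_STRUCT` is defined by hand to simulate a 6.18+ configure result, and the helper names `page_set_flag` and `page_test_flag` are hypothetical.

```c
/* Mock of the 6.18+ layout: 'flags' is a struct wrapping the bit word. */
typedef struct {
	unsigned long f;
} memdesc_flags_t;

struct page {
	memdesc_flags_t flags;
};

/* Pretend configure detected the struct-typed 'flags' field. */
#define	HAVE_MM_PAGE_FLAGS_STRUCT 1

#ifdef HAVE_MM_PAGE_FLAGS_STRUCT
#define	page_flags	flags.f
#else
#define	page_flags	flags
#endif

/* Both helpers compile unchanged against either layout. */
static void
page_set_flag(struct page *pg, unsigned int bit)
{
	pg->page_flags |= 1UL << bit;
}

static int
page_test_flag(const struct page *pg, unsigned int bit)
{
	return (((pg->page_flags >> bit) & 1UL) != 0);
}
```

Because the macro expands to a member path, `&(page)->page_flags` still yields a pointer to the underlying `unsigned long`, which is what `test_bit()` and `clear_bit()` expect.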
/*
* Starting from Linux 5.13, flush_dcache_page() becomes an inline function
* and under some configurations, may indirectly referencing GPL-only
@@ -44,8 +55,8 @@
#include <linux/simd_powerpc.h>
#define flush_dcache_page(page) do { \
if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE) && \
test_bit(PG_dcache_clean, &(page)->flags)) \
clear_bit(PG_dcache_clean, &(page)->flags); \
test_bit(PG_dcache_clean, &(page)->page_flags)) \
clear_bit(PG_dcache_clean, &(page)->page_flags);\
} while (0)
#endif
/*
@@ -55,37 +66,11 @@
*/
#if defined __riscv && defined HAVE_FLUSH_DCACHE_PAGE_GPL_ONLY
#define flush_dcache_page(page) do { \
if (test_bit(PG_dcache_clean, &(page)->flags)) \
clear_bit(PG_dcache_clean, &(page)->flags); \
if (test_bit(PG_dcache_clean, &(page)->page_flags)) \
clear_bit(PG_dcache_clean, &(page)->page_flags);\
} while (0)
#endif
/*
* 2.6.30 API change,
* The const keyword was added to the 'struct dentry_operations' in
* the dentry structure. To handle this we define an appropriate
* dentry_operations_t typedef which can be used.
*/
typedef const struct dentry_operations dentry_operations_t;
/*
* 2.6.38 API addition,
* Added d_clear_d_op() helper function which clears some flags and the
* registered dentry->d_op table. This is required because d_set_d_op()
* issues a warning when the dentry operations table is already set.
* For the .zfs control directory to work properly we must be able to
* override the default operations table and register custom .d_automount
* and .d_revalidate callbacks.
*/
static inline void
d_clear_d_op(struct dentry *dentry)
{
dentry->d_op = NULL;
dentry->d_flags &= ~(
DCACHE_OP_HASH | DCACHE_OP_COMPARE |
DCACHE_OP_REVALIDATE | DCACHE_OP_DELETE);
}
/*
* Walk and invalidate all dentry aliases of an inode
* unless it's a mountpoint
@@ -0,0 +1,36 @@
// SPDX-License-Identifier: CDDL-1.0
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* or https://opensource.org/licenses/CDDL-1.0.
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright (c) 2025, Rob Norris <robn@despairlabs.com>
*/
#ifndef _ZFS_PAGEMAP_COMPAT_H
#define _ZFS_PAGEMAP_COMPAT_H
#include <linux/pagemap.h>
#ifndef HAVE_PAGEMAP_READAHEAD_PAGE
#define readahead_page(ractl) (&(__readahead_folio(ractl)->page))
#endif
#endif
+10 -11
@@ -139,15 +139,6 @@
*/
#if defined(HAVE_KERNEL_FPU_INTERNAL)
/*
* For kernels not exporting *kfpu_{begin,end} we have to use inline assembly
* with the XSAVE{,OPT,S} instructions, so we need the toolchain to support at
* least XSAVE.
*/
#if !defined(HAVE_XSAVE)
#error "Toolchain needs to support the XSAVE assembler instruction"
#endif
#ifndef XFEATURE_MASK_XTILE
/*
* For kernels where this doesn't exist yet, we still don't want to break
@@ -335,9 +326,13 @@ kfpu_begin(void)
return;
}
#endif
#if defined(HAVE_XSAVE)
if (static_cpu_has(X86_FEATURE_XSAVE)) {
kfpu_do_xsave("xsave", state, ~XFEATURE_MASK_XTILE);
} else if (static_cpu_has(X86_FEATURE_FXSR)) {
return;
}
#endif
if (static_cpu_has(X86_FEATURE_FXSR)) {
kfpu_save_fxsr(state);
} else {
kfpu_save_fsave(state);
@@ -390,9 +385,13 @@ kfpu_end(void)
goto out;
}
#endif
#if defined(HAVE_XSAVE)
if (static_cpu_has(X86_FEATURE_XSAVE)) {
kfpu_do_xrstor("xrstor", state, ~XFEATURE_MASK_XTILE);
} else if (static_cpu_has(X86_FEATURE_FXSR)) {
goto out;
}
#endif
if (static_cpu_has(X86_FEATURE_FXSR)) {
kfpu_restore_fxsr(state);
} else {
kfpu_restore_fsave(state);
@@ -23,6 +23,7 @@
/*
* Copyright (C) 2011 Lawrence Livermore National Security, LLC.
* Copyright (C) 2015 Jörg Thalheim.
* Copyright (c) 2025, Rob Norris <robn@despairlabs.com>
*/
#ifndef _ZFS_VFS_H
@@ -262,4 +263,18 @@ zpl_is_32bit_api(void)
#define zpl_generic_fillattr(user_ns, ip, sp) generic_fillattr(ip, sp)
#endif
#ifdef HAVE_INODE_GENERIC_DROP
/* 6.18 API change. These were renamed, alias the old names to the new. */
#define generic_delete_inode(ip) inode_just_drop(ip)
#define generic_drop_inode(ip) inode_generic_drop(ip)
#endif
#ifndef HAVE_INODE_STATE_READ_ONCE
/*
* 6.19 API change. We should no longer access i_state directly. If the new
* helper function doesn't exist, define our own.
*/
#define inode_state_read_once(ip) READ_ONCE(ip->i_state)
#endif
#endif /* _ZFS_VFS_H */
+6
@@ -24,7 +24,13 @@
#define _OS_LINUX_SPL_MISC_H
#include <linux/kobject.h>
#include <linux/swap.h>
extern void spl_signal_kobj_evt(struct block_device *bdev);
/*
* Check if the current thread is a memory reclaim thread.
*/
extern int current_is_reclaim_thread(void);
#endif
+1 -1
@@ -25,6 +25,6 @@
#ifndef _SPL_STAT_H
#define _SPL_STAT_H
#include <linux/stat.h>
#include <sys/stat.h>
#endif /* SPL_STAT_H */
+3 -1
@@ -92,8 +92,10 @@
* Treat shim tasks as SCHED_NORMAL tasks
*/
#define minclsyspri (MAX_PRIO-1)
#define maxclsyspri (MAX_RT_PRIO)
#define defclsyspri (DEFAULT_PRIO)
/* Write issue taskq priority. */
#define wtqclsyspri (MAX_RT_PRIO + 1)
#define maxclsyspri (MAX_RT_PRIO)
#ifndef NICE_TO_PRIO
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
+1 -6
@@ -59,8 +59,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__field(uint64_t, z_size)
__field(uint64_t, z_pflags)
__field(uint32_t, z_sync_cnt)
__field(uint32_t, z_sync_writes_cnt)
__field(uint32_t, z_async_writes_cnt)
__field(mode_t, z_mode)
__field(boolean_t, z_is_sa)
__field(boolean_t, z_is_ctldir)
@@ -92,8 +90,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_size = zn->z_size;
__entry->z_pflags = zn->z_pflags;
__entry->z_sync_cnt = zn->z_sync_cnt;
__entry->z_sync_writes_cnt = zn->z_sync_writes_cnt;
__entry->z_async_writes_cnt = zn->z_async_writes_cnt;
__entry->z_mode = zn->z_mode;
__entry->z_is_sa = zn->z_is_sa;
__entry->z_is_ctldir = zn->z_is_ctldir;
@@ -117,7 +113,7 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
TP_printk("zn { id %llu unlinked %u atime_dirty %u "
"zn_prefetch %u blksz %u seq %u "
"mapcnt %llu size %llu pflags %llu "
"sync_cnt %u sync_writes_cnt %u async_writes_cnt %u "
"sync_cnt %u "
"mode 0x%x is_sa %d is_ctldir %d "
"inode { uid %u gid %u ino %lu nlink %u size %lli "
"blkbits %u bytes %u mode 0x%x generation %x } } "
@@ -126,7 +122,6 @@ DECLARE_EVENT_CLASS(zfs_ace_class,
__entry->z_zn_prefetch, __entry->z_blksz,
__entry->z_seq, __entry->z_mapcnt, __entry->z_size,
__entry->z_pflags, __entry->z_sync_cnt,
__entry->z_sync_writes_cnt, __entry->z_async_writes_cnt,
__entry->z_mode, __entry->z_is_sa, __entry->z_is_ctldir,
__entry->i_uid, __entry->i_gid, __entry->i_ino, __entry->i_nlink,
__entry->i_size, __entry->i_blkbits,
@@ -157,6 +157,7 @@ struct znode;
extern int zfs_sync(struct super_block *, int, cred_t *);
extern int zfs_inode_alloc(struct super_block *, struct inode **ip);
extern void zfs_inode_free(struct inode *);
extern void zfs_inode_destroy(struct inode *);
extern void zfs_mark_inode_dirty(struct inode *);
extern boolean_t zfs_relatime_need_update(const struct inode *);
+1
@@ -55,6 +55,7 @@ extern const struct file_operations zpl_dir_file_operations;
extern void zpl_prune_sb(uint64_t nr_to_scan, void *arg);
extern const struct super_operations zpl_super_operations;
extern const struct dentry_operations zpl_dentry_operations;
extern const struct export_operations zpl_export_operations;
extern struct file_system_type zpl_fs_type;
+1 -1
@@ -954,7 +954,7 @@ typedef struct arc_sums {
wmsum_t arcstat_data_size;
wmsum_t arcstat_metadata_size;
wmsum_t arcstat_dbuf_size;
wmsum_t arcstat_dnode_size;
aggsum_t arcstat_dnode_size;
wmsum_t arcstat_bonus_size;
wmsum_t arcstat_l2_hits;
wmsum_t arcstat_l2_misses;
+1 -1
@@ -65,7 +65,7 @@ _Static_assert(BRT_RANGESIZE / SPA_MINBLOCKSIZE <= UINT16_MAX,
*/
#define BRT_BLOCKSIZE (32 * 1024)
#define BRT_RANGESIZE_TO_NBLOCKS(size) \
(((size) - 1) / BRT_BLOCKSIZE / sizeof (uint16_t) + 1)
(((size) - 1) / (BRT_BLOCKSIZE / sizeof (uint16_t)) + 1)
#define BRT_LITTLE_ENDIAN 0
#define BRT_BIG_ENDIAN 1
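The one-line change to `BRT_RANGESIZE_TO_NBLOCKS` above is an operator-precedence fix. A minimal sketch of the difference, assuming `size` counts `uint16_t` entries; the `NBLOCKS_OLD`/`NBLOCKS_NEW` names and the helper functions are illustrative only and are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BRT_BLOCKSIZE (32 * 1024)

/*
 * Pre-change form: division is left-associative, so this divides by
 * BRT_BLOCKSIZE and then by sizeof (uint16_t), i.e. by 65536 in total.
 */
#define NBLOCKS_OLD(size) \
	(((size) - 1) / BRT_BLOCKSIZE / sizeof (uint16_t) + 1)

/*
 * Post-change form: divides by the number of uint16_t entries that fit
 * in one block (32768 / 2 = 16384), then rounds up.
 */
#define NBLOCKS_NEW(size) \
	(((size) - 1) / (BRT_BLOCKSIZE / sizeof (uint16_t)) + 1)

/* Illustrative wrappers (hypothetical names) so the macros can be called. */
static uint64_t
nblocks_old(uint64_t size)
{
	return (NBLOCKS_OLD(size));
}

static uint64_t
nblocks_new(uint64_t size)
{
	return (NBLOCKS_NEW(size));
}

/* Small self-check showing where the two forms diverge. */
void
brt_nblocks_selfcheck(void)
{
	/* Up to 16384 entries both forms agree on one block. */
	assert(nblocks_old(16384) == 1 && nblocks_new(16384) == 1);
	/* 65536 entries need four 32 KiB blocks; the old form reports one. */
	assert(nblocks_old(65536) == 1);
	assert(nblocks_new(65536) == 4);
}
```

Under that assumption the old expression undercounts by roughly a factor of four for any range larger than one block.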
+1
@@ -174,6 +174,7 @@ typedef struct dbuf_dirty_record {
arc_buf_t *dr_data;
override_states_t dr_override_state;
uint8_t dr_copies;
uint8_t dr_gang_copies;
boolean_t dr_nopwrite;
boolean_t dr_brtwrite;
boolean_t dr_diowrite;
+2 -5
@@ -286,14 +286,11 @@ typedef struct {
ddt_log_t *ddt_log_active; /* pointers into ddt_log */
ddt_log_t *ddt_log_flushing; /* swapped when flush starts */
hrtime_t ddt_flush_start; /* log flush start this txg */
uint32_t ddt_flush_pass; /* log flush pass this txg */
int32_t ddt_flush_count; /* entries flushed this txg */
int32_t ddt_flush_min; /* min rem entries to flush */
int32_t ddt_log_ingest_rate; /* rolling log ingest rate */
int32_t ddt_log_flush_rate; /* rolling log flush rate */
int32_t ddt_log_flush_time_rate; /* avg time spent flushing */
uint32_t ddt_log_flush_pressure; /* pressure to apply for cap */
uint32_t ddt_log_flush_prev_backlog; /* prev backlog size */
uint64_t ddt_flush_force_txg; /* flush hard before this txg */
+2 -2
@@ -144,9 +144,9 @@ typedef enum dmu_object_byteswap {
#define DMU_OT_IS_DDT(ot) \
((ot) == DMU_OT_DDT_ZAP)
#define DMU_OT_IS_CRITICAL(ot) \
#define DMU_OT_IS_CRITICAL(ot, level) \
(DMU_OT_IS_METADATA(ot) && \
(ot) != DMU_OT_DNODE && \
((ot) != DMU_OT_DNODE || (level) > 0) && \
(ot) != DMU_OT_DIRECTORY_CONTENTS && \
(ot) != DMU_OT_SA)
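The new `level` argument changes how `DMU_OT_IS_CRITICAL` treats dnode blocks: they were previously always excluded, and are now excluded only at level 0. A rough sketch of the truth table, using hypothetical stand-in types (`ot_t`, `is_metadata()`) instead of the real DMU object type table, and assuming level 0 means a leaf block and higher levels mean indirect blocks:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a handful of DMU object types. */
typedef enum {
	OT_DNODE,
	OT_DIRECTORY_CONTENTS,
	OT_SA,
	OT_OTHER_METADATA,
	OT_PLAIN_FILE
} ot_t;

/* Toy classification (assumption): everything but file data is metadata. */
static bool
is_metadata(ot_t ot)
{
	return (ot != OT_PLAIN_FILE);
}

/* Shape of the pre-change macro: dnode blocks are never critical. */
static bool
critical_old(ot_t ot)
{
	return (is_metadata(ot) &&
	    ot != OT_DNODE &&
	    ot != OT_DIRECTORY_CONTENTS &&
	    ot != OT_SA);
}

/* Shape of the post-change macro: dnode blocks above level 0 qualify. */
static bool
critical_new(ot_t ot, int level)
{
	return (is_metadata(ot) &&
	    (ot != OT_DNODE || level > 0) &&
	    ot != OT_DIRECTORY_CONTENTS &&
	    ot != OT_SA);
}

/* Small self-check covering the cases that changed and a few that did not. */
void
critical_selfcheck(void)
{
	assert(!critical_old(OT_DNODE));
	assert(!critical_new(OT_DNODE, 0));	/* unchanged at level 0 */
	assert(critical_new(OT_DNODE, 1));	/* newly critical above it */
	assert(!critical_new(OT_SA, 1));	/* level does not affect SA */
	assert(critical_new(OT_OTHER_METADATA, 0));
}
```

Only the `OT_DNODE` row depends on `level`; every other type behaves as before.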
+15
@@ -740,6 +740,8 @@ typedef struct zpool_load_policy {
#define ZPOOL_CONFIG_METASLAB_SHIFT "metaslab_shift"
#define ZPOOL_CONFIG_ASHIFT "ashift"
#define ZPOOL_CONFIG_ASIZE "asize"
#define ZPOOL_CONFIG_MIN_ALLOC "min_alloc"
#define ZPOOL_CONFIG_MAX_ALLOC "max_alloc"
#define ZPOOL_CONFIG_DTL "DTL"
#define ZPOOL_CONFIG_SCAN_STATS "scan_stats" /* not stored on disk */
#define ZPOOL_CONFIG_REMOVAL_STATS "removal_stats" /* not stored on disk */
@@ -861,6 +863,10 @@ typedef struct zpool_load_policy {
#define ZPOOL_CONFIG_MMP_SEQ "mmp_seq" /* not stored on disk */
#define ZPOOL_CONFIG_MMP_HOSTNAME "mmp_hostname" /* not stored on disk */
#define ZPOOL_CONFIG_MMP_HOSTID "mmp_hostid" /* not stored on disk */
#define ZPOOL_CONFIG_MMP_RESULT "mmp_result" /* not stored on disk */
#define ZPOOL_CONFIG_MMP_TRYIMPORT_NS "mmp_tryimport_ns" /* not stored */
#define ZPOOL_CONFIG_MMP_IMPORT_NS "mmp_import_ns" /* not stored on disk */
#define ZPOOL_CONFIG_MMP_CLAIM_NS "mmp_claim_ns" /* not stored on disk */
#define ZPOOL_CONFIG_ALLOCATION_BIAS "alloc_bias" /* not stored on disk */
#define ZPOOL_CONFIG_EXPANSION_TIME "expansion_time" /* not stored */
#define ZPOOL_CONFIG_REBUILD_STATS "org.openzfs:rebuild_stats"
@@ -1614,6 +1620,15 @@ typedef enum zfs_ioc {
#endif
typedef struct zfs_rewrite_args {
uint64_t off;
uint64_t len;
uint64_t flags;
uint64_t arg;
} zfs_rewrite_args_t;
#define ZFS_IOC_REWRITE _IOW(0x83, 3, zfs_rewrite_args_t)
/*
* ZFS-specific error codes used for returning descriptive errors
* to the userland through zfs ioctls.
+2
@@ -568,6 +568,8 @@ typedef struct metaslab_unflushed_phys {
uint64_t msp_unflushed_txg;
} metaslab_unflushed_phys_t;
char *metaslab_rt_name(metaslab_group_t *, metaslab_t *, const char *);
#ifdef __cplusplus
}
#endif
+5
@@ -33,6 +33,7 @@ extern "C" {
#define MMP_DEFAULT_IMPORT_INTERVALS 20
#define MMP_DEFAULT_FAIL_INTERVALS 10
#define MMP_MIN_FAIL_INTERVALS 2 /* min if != 0 */
#define MMP_IMPORT_VERIFY_ITERS 10
#define MMP_IMPORT_SAFETY_FACTOR 200 /* pct */
#define MMP_INTERVAL_OK(interval) MAX(interval, MMP_MIN_INTERVAL)
#define MMP_FAIL_INTVS_OK(fails) (fails == 0 ? 0 : MAX(fails, \
@@ -53,6 +54,9 @@ typedef struct mmp_thread {
vdev_t *mmp_last_leaf; /* last mmp write sent here */
uint64_t mmp_leaf_last_gen; /* last mmp write sent here */
uint32_t mmp_seq; /* intra-second update counter */
uint64_t mmp_tryimport_ns; /* tryimport activity check time */
uint64_t mmp_import_ns; /* import activity check time */
uint64_t mmp_claim_ns; /* claim activity check time */
} mmp_thread_t;
@@ -62,6 +66,7 @@ extern void mmp_thread_start(struct spa *spa);
extern void mmp_thread_stop(struct spa *spa);
extern void mmp_update_uberblock(struct spa *spa, struct uberblock *ub);
extern void mmp_signal_all_threads(void);
extern int mmp_claim_uberblock(spa_t *spa, vdev_t *vd, uberblock_t *ub);
/* Global tuning */
extern int param_set_multihost_interval(ZFS_MODULE_PARAM_ARGS);
+9
@@ -49,6 +49,9 @@ typedef enum zfs_range_seg_type {
ZFS_RANGE_SEG_NUM_TYPES,
} zfs_range_seg_type_t;
#define ZFS_RT_NAME(rt) (((rt)->rt_name != NULL) ? (rt)->rt_name : "")
#define ZFS_RT_F_DYN_NAME (1ULL << 0) /* if rt_name must be freed */
/*
* Note: the range_tree may not be accessed concurrently; consumers
* must provide external locking if required.
@@ -68,6 +71,9 @@ typedef struct zfs_range_tree {
void *rt_arg;
uint64_t rt_gap; /* allowable inter-segment gap */
uint64_t rt_flags;
const char *rt_name; /* details for debugging */
/*
* The rt_histogram maintains a histogram of ranges. Each bucket,
* rt_histogram[i], contains the number of ranges whose size is:
@@ -281,6 +287,9 @@ zfs_range_tree_t *zfs_range_tree_create_gap(const zfs_range_tree_ops_t *ops,
uint64_t gap);
zfs_range_tree_t *zfs_range_tree_create(const zfs_range_tree_ops_t *ops,
zfs_range_seg_type_t type, void *arg, uint64_t start, uint64_t shift);
zfs_range_tree_t *zfs_range_tree_create_flags(const zfs_range_tree_ops_t *ops,
zfs_range_seg_type_t type, void *arg, uint64_t start, uint64_t shift,
uint64_t flags, const char *name);
void zfs_range_tree_destroy(zfs_range_tree_t *rt);
boolean_t zfs_range_tree_contains(zfs_range_tree_t *rt, uint64_t start,
uint64_t size);
+2
@@ -1044,6 +1044,7 @@ extern void spa_set_rootblkptr(spa_t *spa, const blkptr_t *bp);
extern void spa_altroot(spa_t *, char *, size_t);
extern uint32_t spa_sync_pass(spa_t *spa);
extern char *spa_name(spa_t *spa);
extern char *spa_load_name(spa_t *spa);
extern uint64_t spa_guid(spa_t *spa);
extern uint64_t spa_load_guid(spa_t *spa);
extern uint64_t spa_last_synced_txg(spa_t *spa);
@@ -1055,6 +1056,7 @@ extern pool_state_t spa_state(spa_t *spa);
extern spa_load_state_t spa_load_state(spa_t *spa);
extern uint64_t spa_freeze_txg(spa_t *spa);
extern uint64_t spa_get_worst_case_asize(spa_t *spa, uint64_t lsize);
extern void spa_get_min_alloc_range(spa_t *spa, uint64_t *min, uint64_t *max);
extern uint64_t spa_get_dspace(spa_t *spa);
extern uint64_t spa_get_checkpoint_space(spa_t *spa);
extern uint64_t spa_get_slop_space(spa_t *spa);
+3
@@ -224,6 +224,7 @@ struct spa {
* Fields protected by spa_namespace_lock.
*/
char spa_name[ZFS_MAX_DATASET_NAME_LEN]; /* pool name */
char *spa_load_name; /* unmodified pool name */
char *spa_comment; /* comment */
avl_node_t spa_avl; /* node in spa_namespace_avl */
nvlist_t *spa_config; /* last synced config */
@@ -267,6 +268,7 @@ struct spa {
uint64_t spa_min_ashift; /* of vdevs in normal class */
uint64_t spa_max_ashift; /* of vdevs in normal class */
uint64_t spa_min_alloc; /* of vdevs in normal class */
uint64_t spa_max_alloc; /* of vdevs in normal class */
uint64_t spa_gcd_alloc; /* of vdevs in normal class */
uint64_t spa_config_guid; /* config pool guid */
uint64_t spa_load_guid; /* spa_load initialized guid */
@@ -302,6 +304,7 @@ struct spa {
void *spa_cksum_tmpls[ZIO_CHECKSUM_FUNCTIONS];
uberblock_t spa_ubsync; /* last synced uberblock */
uberblock_t spa_uberblock; /* current uberblock */
boolean_t spa_activity_check; /* activity check required */
boolean_t spa_extreme_rewind; /* rewind past deferred frees */
kmutex_t spa_scrub_lock; /* resilver/scrub lock */
uint64_t spa_scrub_inflight; /* in-flight scrub bytes */
+16 -6
@@ -51,6 +51,12 @@ extern "C" {
#define MMP_SEQ_VALID_BIT 0x02
#define MMP_FAIL_INT_VALID_BIT 0x04
#define MMP_INTERVAL_MASK 0x00000000FFFFFF00
#define MMP_SEQ_MASK 0x0000FFFF00000000
#define MMP_FAIL_INT_MASK 0xFFFF000000000000
#define MMP_SEQ_MAX UINT16_MAX
#define MMP_VALID(ubp) ((ubp)->ub_magic == UBERBLOCK_MAGIC && \
(ubp)->ub_mmp_magic == MMP_MAGIC)
#define MMP_INTERVAL_VALID(ubp) (MMP_VALID(ubp) && ((ubp)->ub_mmp_config & \
@@ -60,21 +66,25 @@ extern "C" {
#define MMP_FAIL_INT_VALID(ubp) (MMP_VALID(ubp) && ((ubp)->ub_mmp_config & \
MMP_FAIL_INT_VALID_BIT))
#define MMP_INTERVAL(ubp) (((ubp)->ub_mmp_config & 0x00000000FFFFFF00) \
#define MMP_INTERVAL(ubp) (((ubp)->ub_mmp_config & MMP_INTERVAL_MASK) \
>> 8)
#define MMP_SEQ(ubp) (((ubp)->ub_mmp_config & 0x0000FFFF00000000) \
#define MMP_SEQ(ubp) (((ubp)->ub_mmp_config & MMP_SEQ_MASK) \
>> 32)
#define MMP_FAIL_INT(ubp) (((ubp)->ub_mmp_config & 0xFFFF000000000000) \
#define MMP_FAIL_INT(ubp) (((ubp)->ub_mmp_config & MMP_FAIL_INT_MASK) \
>> 48)
#define MMP_INTERVAL_SET(write) \
(((uint64_t)(write & 0xFFFFFF) << 8) | MMP_INTERVAL_VALID_BIT)
(((uint64_t)((write) & 0xFFFFFF) << 8) | MMP_INTERVAL_VALID_BIT)
#define MMP_SEQ_SET(seq) \
(((uint64_t)(seq & 0xFFFF) << 32) | MMP_SEQ_VALID_BIT)
(((uint64_t)((seq) & 0xFFFF) << 32) | MMP_SEQ_VALID_BIT)
#define MMP_FAIL_INT_SET(fail) \
(((uint64_t)(fail & 0xFFFF) << 48) | MMP_FAIL_INT_VALID_BIT)
(((uint64_t)((fail) & 0xFFFF) << 48) | MMP_FAIL_INT_VALID_BIT)
#define MMP_SEQ_CLEAR(ubp) \
((ubp)->ub_mmp_config &= ~(MMP_SEQ_MASK | MMP_SEQ_VALID_BIT))
/*
* RAIDZ expansion reflow information.
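The `MMP_*_SET` changes in this hunk parenthesize the macro argument before masking. A minimal sketch of why that matters, restricted to the sequence field; the mask values are copied from the hunk, while the `SEQ_SET_OLD`/`SEQ_SET_NEW` names, the helper functions, and the conditional-expression caller are hypothetical and only illustrate the pitfall:

```c
#include <assert.h>
#include <stdint.h>

#define MMP_SEQ_VALID_BIT	0x02
#define MMP_SEQ_MASK		0x0000FFFF00000000
#define MMP_FAIL_INT_MASK	0xFFFF000000000000

/* Pre-change form: the mask binds to only part of a compound argument. */
#define SEQ_SET_OLD(seq) \
	(((uint64_t)(seq & 0xFFFF) << 32) | MMP_SEQ_VALID_BIT)

/* Post-change form: the whole argument is masked to 16 bits. */
#define SEQ_SET_NEW(seq) \
	(((uint64_t)((seq) & 0xFFFF) << 32) | MMP_SEQ_VALID_BIT)

/*
 * Hypothetical callers passing a conditional expression. With the old
 * macro this expands to (use_a ? a : b & 0xFFFF), so 'a' is not masked.
 */
static uint64_t
pack_seq_old(int use_a, uint32_t a, uint32_t b)
{
	return (SEQ_SET_OLD(use_a ? a : b));
}

static uint64_t
pack_seq_new(int use_a, uint32_t a, uint32_t b)
{
	return (SEQ_SET_NEW(use_a ? a : b));
}

/* Field extractors mirroring MMP_SEQ() and MMP_FAIL_INT(). */
static uint64_t
get_seq(uint64_t cfg)
{
	return ((cfg & MMP_SEQ_MASK) >> 32);
}

static uint64_t
get_fail_int(uint64_t cfg)
{
	return ((cfg & MMP_FAIL_INT_MASK) >> 48);
}

/* Mirrors MMP_SEQ_CLEAR(): drop the sequence field and its valid bit. */
static uint64_t
clear_seq(uint64_t cfg)
{
	cfg &= ~(MMP_SEQ_MASK | MMP_SEQ_VALID_BIT);
	return (cfg);
}

void
mmp_seq_selfcheck(void)
{
	/* 0x10001 does not fit in 16 bits; the new macro truncates it to 1. */
	assert(get_seq(pack_seq_new(1, 0x10001, 0)) == 1);
	assert(get_fail_int(pack_seq_new(1, 0x10001, 0)) == 0);
	/* The old macro lets the extra bit spill into the fail-interval field. */
	assert(get_fail_int(pack_seq_old(1, 0x10001, 0)) == 1);
	assert(clear_seq(pack_seq_new(1, 0x10001, 0)) == 0);
}
```

Callers that pass a plain variable or constant see no difference between the two forms; the unparenthesized macro only misbehaves with compound arguments.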
+3
@@ -173,6 +173,7 @@ extern void vdev_queue_change_io_priority(zio_t *zio, zio_priority_t priority);
extern uint32_t vdev_queue_length(vdev_t *vd);
extern uint64_t vdev_queue_last_offset(vdev_t *vd);
extern uint64_t vdev_queue_class_length(vdev_t *vq, zio_priority_t p);
extern boolean_t vdev_queue_pool_busy(spa_t *spa);
extern void vdev_config_dirty(vdev_t *vd);
extern void vdev_config_clean(vdev_t *vd);
@@ -211,6 +212,8 @@ extern void vdev_label_write(zio_t *zio, vdev_t *vd, int l, abd_t *buf, uint64_t
extern int vdev_label_read_bootenv(vdev_t *, nvlist_t *);
extern int vdev_label_write_bootenv(vdev_t *, nvlist_t *);
extern int vdev_uberblock_sync_list(vdev_t **, int, struct uberblock *, int);
extern int vdev_uberblock_compare(const struct uberblock *,
const struct uberblock *);
extern int vdev_check_boot_reserve(spa_t *, vdev_t *);
typedef enum {
+1
@@ -651,6 +651,7 @@ uint64_t vdev_best_ashift(uint64_t logical, uint64_t a, uint64_t b);
int param_get_raidz_impl(char *buf, zfs_kernel_param_t *kp);
#endif
int param_set_raidz_impl(ZFS_MODULE_PARAM_ARGS);
char *vdev_rt_name(vdev_t *vd, const char *name);
/*
* Vdev ashift optimization tunables
+8 -1
@@ -236,6 +236,11 @@ typedef pthread_t kthread_t;
#define thread_join(t) pthread_join((pthread_t)(t), NULL)
#define newproc(f, a, cid, pri, ctp, pid) (ENOSYS)
/*
* Check if the current thread is a memory reclaim thread.
* Always returns false in userspace (no memory reclaim thread).
*/
#define current_is_reclaim_thread() (0)
/* in libzpool, p0 exists only to have its address taken */
typedef struct proc {
@@ -623,8 +628,10 @@ extern void delay(clock_t ticks);
* Process priorities as defined by setpriority(2) and getpriority(2).
*/
#define minclsyspri 19
#define maxclsyspri -20
#define defclsyspri 0
/* Write issue taskq priority. */
#define wtqclsyspri -19
#define maxclsyspri -20
#define CPU_SEQID ((uintptr_t)pthread_self() & (max_ncpus - 1))
#define CPU_SEQID_UNSTABLE CPU_SEQID
+1
@@ -60,6 +60,7 @@ extern int zfs_dbgmsg_enable;
#define ZFS_DEBUG_METASLAB_ALLOC (1 << 13)
#define ZFS_DEBUG_BRT (1 << 14)
#define ZFS_DEBUG_RAIDZ_RECONSTRUCT (1 << 15)
#define ZFS_DEBUG_DDT (1 << 16)
extern void __set_error(const char *file, const char *func, int line, int err);
extern void __zfs_dbgmsg(char *buf);

Some files were not shown because too many files have changed in this diff.