Commit Graph

10297 Commits

Author SHA1 Message Date
Mariusz Zaborski
1e8c96d7d5 Add knob to disable slow io notifications
Introduce a new vdev property `VDEV_PROP_SLOW_IO_REPORTING` that
allows users to disable notifications for slow devices.
This prevents ZED and/or ZFSD from degrading the pool due to slow
I/O.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <oshogbo@FreeBSD.org>
Closes 17477
2025-11-12 13:07:14 -08:00
Alexander Motin
41878d57ea Add BRT support to zpool prefetch command
Implement BRT (Block Reference Table) prefetch functionality similar
to existing DDT prefetch.  This allows preloading BRT metadata into
ARC to improve performance for block cloning operations and frees
of earlier cloned blocks.

Make -t parameter optional.  When omitted, prefetch all supported
metadata types (both DDT and BRT now).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17890
2025-11-12 13:07:09 -08:00
Alexander Motin
002bc3da6a BRT: Increase block size from 4KB to 8KB
According to my observations, BRT ZAPs are typically compressible
3:1 for data and 2:1 for indirects.  With ashift=12, typical these
days, it means increasing the block sizes to 8KB we may get most
of possible compression, reducing on-disk and in-ARC BRT footprint
in half by the cost of some compression/decompression overhead,
but without real write inflation, only some dirty data increase.

Increase to 32KB similar to DDT could further increase compression
and storage efficiency, but at the cost of write inflation and
much bigger dirty data increase, which we can not properly control
now.  So lets leave this for a time when BRT log gets implemented.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17916
2025-11-12 13:07:04 -08:00
Alexander Motin
e895c76194 ZAP: Remove dmu_object_info_from_dnode() call
dmu_object_info_from_dnode() takes two locks and copies plenty of
data that we don't need in zap_lockdir_impl().  Just read dn_type
directly in this hot path.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17921
2025-11-12 13:07:00 -08:00
Rob Norris
ac0bc4cc00 spa_misc: add an API for spa_namespace_lock
This is useful as debugging support, as it lets namespace lock
operations be traced directly. It will also be useful for future work to
reduce the use of spa_namespace_lock, traditionally a source of
difficult deadlocks.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17906
2025-11-12 13:06:54 -08:00
Alexander Motin
e305c7d596 BRT: Fix ranges to blocks conversion math
BRT_RANGESIZE_TO_NBLOCKS() takes number of ranges as its argument.
To get number of blocks we should multiply it by the entry size,
not divide by it, as it was due to missing parentheses.

Before #17875 this could cause small memory corruptions for vdevs
bigger than 64TB, but the change made the bug more noticeable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17886
Closes #17915
2025-11-12 13:06:48 -08:00
Adi-Goll
e1734111fd Update man page description of zpool rewind
Update description of zpool import --rewind-to-checkpoint in
man/man7/zpoolconcepts.7 to explain that rewinding automatically
discards a checkpoint.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adi Gollamudi <adigollamudi@gmail.com>
Closes #12646
Closes #17918
2025-11-12 13:06:43 -08:00
Alexander Motin
aaf374bd40 ZIO: Set minimum number of free issue threads to 32
Free issue threads might block waiting for synchronous DDT, BRT or
GANG header reads. So unlike other taskqs using ZTI_SCALE to scale
with number of CPUs, here we also need some amount of threads to
potentially saturate pool reads.  I am not sure we always want the
96 threads we had before ZTI_SCALE introduction at #11966 on small
systems, but lets make it at least 32.

While here, make free taskqs configurable, similar to read and
write ones.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17903
2025-11-12 13:06:39 -08:00
rmacklem
583db40030 FreeBSD: Add support for _PC_CASE_INSENSITIVE
FreeBSD now has a pathconf name called _PC_CASE_INSENSITIVE
used to check if a file system performs case insensitive
name lookups.

This patch adds support for this name.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #17908
2025-11-12 13:06:36 -08:00
Brian Behlendorf
84dd55510b zstd: disable intrinsics
Disable the aarch64 NEON SIMD intrinsics for kernel builds.  Safely
using them in the kernel context requires saving/restoring the FPU
registers which is not currently done.

Additionally, remove the aarch64 optimized PREFETCH_L1 and PREFETCH_L2
instruction.  Rely on the more portable compiler built ins.

This lets us remove the problematic workaround in the aarch64_compat.h
header which undefines the __aarch64__ macro.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17904
Closes #17852
2025-11-12 13:06:22 -08:00
Adi-Goll
015729a11b Fix typo in vdev_raidz.c
Change the spelling of "begining" on line 4875 to
"beginning".

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Adi Gollamudi <adigollamudi@gmail.com>
Closes #17905
2025-11-12 13:06:19 -08:00
Toomas Soome
4fd926ab40 libzfs: ignoring unreachable code
We have infinite loop and on certain condition, we exit this loop
and thread with pthread_exit(). But also after this loop,
we have a code to perform pthread_cleanup_pop() and return from the
thread.

The  problem is that modern compilers are able to recognize that we
actually never get to the statements after loop and therefore
it is dead code there.

I think, instead of pthread_exit(), it is better to break out of loop
and let the last statements to work as intended. This is because
we do need to keep pthread_cleanup_pop() anyhow. Of course,
it is matter of taste if we want to use return or pthread_exit as very
last statement in this function.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #17900
2025-11-12 13:06:15 -08:00
Rob Norris
7b121388fb man: describe zfs-rewrite method and properties
We've heard anecdotes that suggest some
confusion/surprise/disappointment that a changed recordsize is not
applied during rewrite. Until such time as we actually can do that, we
can at least explicitly mention it at something that doesn't work.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17898
2025-11-12 13:06:10 -08:00
Alexander Ziaee
055e908d47 zfs-jail.8: Add introductory sentence, refactor
Add an introductory sentance explaining why the reader may want to use
this command, and establishing the requirement that the jail must be
running. Move other requirements from the description of the subcommands
to follow this for flow and structure. Move the caveat that this is for
FreeBSD down to a cannonical CAVEATS section, and crossreference Linux's
equivelant functionality. Mention that this utility can not be used to
delegate the root directory of the jail to that section also.

Reported by: Jan Brankamp <crest@rlwinm.de>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Alexander Ziaee <ziaee@FreeBSD.org>
Closes #17883
2025-11-12 13:06:06 -08:00
Tony Hutter
a2a34d9212 Linux 6.17 compat: Fix broken projectquota on 6.17
We need to specifically use the FX_XFLAG_* macros in zpl_ioctl_*attr()
codepaths, and the FS_*_FL macros in the zpl_ioctl_*flags() codepaths.
The earlier code just assumes the FS_*_FL macros for both codepaths.
The 6.17 kernel add a bitmask check in copy_fsxattr_from_user() that
exposed this error via failing 'projectquota' ZTS tests.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #17884
Closes #17869
2025-11-12 13:06:01 -08:00
Paul Dagnelie
dda711dbb5 Fix gang write late_arrival bug
When a write comes in via dmu_sync_late_arrival, its txg is equal to the
open TXG. If that write gangs, and we have not yet activated the new
gang header feature, and the gang header we pick can store a larger gang
header, we will try to schedule the upgrade for the open TXG + 1. In
debug mode, this causes an assertion to trip. This PR sets the TXG for
activating the feature to be the larger of either the current open TXG
or the syncing TXG + 1.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #17824
2025-11-12 13:05:54 -08:00
Tino Reichardt
be1e5d599b CI: Update FreeBSD versions and ci-type handling
Update FreeBSD versions:
- add FreeBSD 15.0-STABLE
- add FreeBSD 16.0-CURRENT

So we use the latest versions of each line now:
  - Freebsd 14.3 (RELEASE)
  - FreeBSD 15.0 (STABLE)
  - FreeBSD 16.0 (CURRENT)

In commits - you may specify which type of CI should run:
- ZFS-CI-Type: quick
- ZFS-CI-Type: linux
- ZFS-CI-Type: freebsd
- ZFS-CI-Type: full

Reviewed-by: Alexx Saver <lzsaver@users.noreply.github.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #17896
2025-11-12 13:05:49 -08:00
Toomas Soome
612e8f1e57 get_key_material_https: label 'kfdok' defined but not used
The label 'kfdok' is only used with O_TMPFILE, we need to use
the same #ifdef around this label.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Toomas Soome <tsoome@me.com>
Closes #17894
2025-11-12 13:05:45 -08:00
Robert Evans
5582e8b08e Update dnode_next_offset_level to accept blkid instead of offset
Currently this function uses L0 offsets which:
1. is hard to read since it maps offsets to blkid and back each call
2. necessitates dnode_next_block to handle edge cases at limits
3. makes it hard to tell if the traversal can loop infinitely

Instead, update this and dnode_next_offset to work in (blkid, index).
This way the blkid manipulations are clear, and it's also clear that
the traversal always terminates since blkid goes one direction.

I've also considered updating dnode_next_offset to operate on blkid.
Callers use both patterns, so maybe another PR can split the cases?

While here tidy up dnode_next_offset_level comments.

Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Robert Evans <evansr@google.com>
Closes #17792
2025-11-12 13:05:40 -08:00
jamisiveshkumar
c9835dab1f Fix capitalization typo in README.md
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Sivesh Kumar <siveshjami@gmail.com>
Closes #17889
2025-11-12 13:05:35 -08:00
Alexander Motin
67fc49433f Cleanup ZIO_FLAG_IO_RETRY vs TRYHARD usage
In cases where all issued ZIOs must succeed, and we can't do
anything clever about the errors, we should just explicitly set
ZIO_FLAG_TRYHARD and let OS to do all the reasonable retries.

In other cases, where retries can be different from the original,
for example, some ZIOs are allowed to fail due to redundancy, or
we can disable aggregation on retrial to get at least some of
the data, we can do first pass without TRYHARD, and only if needed
retry with ZIO_FLAG_IO_RETRY (which implies TRYHARD semantics).

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17877
2025-11-12 13:05:31 -08:00
Alexander Motin
e3acd0a728 Fix caching of DDT log and BRT
Both DDT log and BRT counters we read on pool import and then only
append or overwrite in full blocks.  We don't need them in DMU or
ARC caches.  Fortunately we have DMU_UNCACHEDIO for this now.

Even more we don't need BRT in non-evictable metadata DMU caches,
since it will likely never fit there, while block the cache from
its original users.  Since DMU_OT_IS_METADATA_CACHED() has no way
to differentiate the new metadata types, mark BRT with storage
type of DMU_OT_DDT_ZAP.  As side effect it will also put it on
dedup device, but that should actually be right.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17875
2025-11-12 13:05:25 -08:00
Alexander Motin
178a8be216 BRT: Round bv_entcount up to BRT_BLOCKSIZE
Since we set bv_mos_brtvdev block size, and since we keep dirty
bitmap at the same granularity, we should keep the allocations
and writes done with.  Otherwise it makes the last block write
short, that will be odd once we implement writing of only dirty
blocks, but also requires read-modify-write on DMU layer.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17875
2025-11-12 13:05:21 -08:00
Joseph Anthony Pasquale Holsten
29567f13f6 autogen.sh: remove workaround for automake <1.14, needed for EL <=7
Ultimately this is a revert of 779ac93, which according to
@nabijaczleweli is to paper over automake <1.14's lack of
%reldir% support.

As I understand it, EL8 is the lowest current build target.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Joseph Holsten <joseph@josephholsten.com>
Closes #17878
2025-11-12 13:05:16 -08:00
Brian Behlendorf
e8d2e08345 Retire ZoL patch scripts
Remove the out of date helper scripts originally used to port
Illumos commits to the ZoL repository.  Due to layout changes
made to this repository they're no longer entirely correct.
Remove them to make it clear they're no longer being used or
actively maintained.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17880
2025-11-12 13:05:11 -08:00
Alexander Motin
5847626175 Pass flags to more DMU write/hold functions
Over the time many of DMU functions got flags argument to control
prefetch, caching, etc.  Few functions though left without it, even
though closer look shown that many of them do not require prefetch
due to their access pattern.  This patch adds the flags argument to
dmu_write(), dmu_buf_hold_array() and dmu_buf_hold_array_by_bonus(),
passing DMU_READ_NO_PREFETCH where applicable.

I am going to also pass DMU_UNCACHEDIO to some of them later.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17872
2025-11-12 13:04:58 -08:00
Quartz
9a9e06e5dd man: Update zpool-event subclass names and document new types
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Quartz <yyhran@163.com>
Closes #17868
2025-11-12 13:04:51 -08:00
Toomas Soome
82d59f7666 ZTS: autotrim_config.ksh is missing pool type
functional/trim tests do create pools of different types to test
trim, autotrim_config.ksh is missing the type from zpool
create command line while we are looping over different pool
types.

Sponsored-by: Edgecast Cloud LLC.
Signed-off-by: Toomas Soome <tsoome@me.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17874
2025-11-12 13:04:47 -08:00
Ryan Libby
672fea2a50 FreeBSD zio_crypt.c: initialize uio variables before access
In zio_crypt_key_wrap and zio_crypt_key_unwrap, the cuio_s variable was
not initialized before the calls to zfs_uio_init, leading to
uninitialized access to cuio_s.uio_offset.  Initialize it to avoid gcc
warnings.

Similar issue as fixed in 2bf152021 ("Fix gcc uninitialized warning in
FreeBSD zio_crypt.c")

Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #17863
2025-11-12 13:04:42 -08:00
Rob Norris
ad6eee2b9b mailmap/AUTHORS: update with recent new contributors
We’re not always on the same page, but at least we’re in the same book.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #17860
2025-11-12 13:04:35 -08:00
Rob Norris
f43839e7fd ZTS: fail test run if test runner crashes unexpectedly
zfs-tests.sh executes test-runner.py to do the actual test work. Any
exit code < 4 is interpreted as success, with the actual value
describing the outcome of the tests inside.

If a Python program crashes in some way (eg an uncaught exception), the
process exit code is 1.

Taken together, this means that test-runner.py can crash during setup,
but return a "success" error code to zfs-tests.sh, which will report and
exit 0. This in turn causes the CI runner to believe the test run
completed successfully.

This commit addresses this by making zfs-tests.sh interpret an exit code
of 255 as a failure in the runner itself. Then, in test-runner.py, the
"fail()" function defaults to a 255 return, and the main function gets
wrapped in a generic exception handler, which prints it and calls
fail().

All together, this should mean that any unexpected failure in the test
runner itself will be propagated out of zfs-tests.sh for CI or any other
calling program to deal with.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17858
2025-11-12 12:57:59 -08:00
Tony Hutter
814f9afba7 Tag 2.4.0-rc3
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2025-10-21 09:52:08 -07:00
Jean-Sébastien Pédron
6f6e1c90ae FreeBSD: zfs_getpages: Don't zero freshly allocated pages
Initially, `zfs_getpages()` is provided with an array of busy pages by
the vnode pager. It then tries to acquire the range lock, but if there
is a concurrent `zfs_write()` running and fails to acquire that range
lock, it "unbusies" the pages to avoid a deadlock with `zfs_write()`.
After that, it grabs the pages again and retries to acquire the range
lock, and so on.

Once it got the range lock, it filters out valid pages, then copy DMU
data to the remaining invalid pages.

The problem is that freshly allocated zero'd pages it grabbed itself are
marked as valid. Therefore they are skipped by the second part of the
function and DMU data is never copied to these pages. This causes mapped
pages to contain zeros instead of the expected file content.

This was discovered while working on RabbitMQ on FreeBSD. I could
reproduce the problem easily with the following commands:

    git clone https://github.com/rabbitmq/rabbitmq-server.git
    cd rabbitmq-server/deps/rabbit

    gmake distclean-ct RABBITMQ_METADATA_STORE=mnesia \
      ct-amqp_client t=cluster_size_3:leader_transfer_stream_send

The testsuite fails because there is a sendfile(2) that can happen
concurrently to a write(2) on the same file. This leads to sendfile(2)
or read(2) (after the sendfile) sending/returning data with zeros, which
causes a function to crash.

The patch consists of not setting the `VM_ALLOC_ZERO` flag when
`zfs_getpages()` grabs pages again. Then, the last page is zero'd if it
is invalid, in case it would be partially filled with the end of the
file content. Other pages are either valid (and will be skipped) or they
will be entirely overwritten by the file content.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Mark Johnston <markj@FreeBSD.org>
Signed-off-by: Jean-Sébastien Pédron <dumbbell@FreeBSD.org>
Closes #17851
2025-10-21 09:50:43 -07:00
Rob Norris
aeff23939a Linux 6.18: generic_drop_inode() and generic_delete_inode() renamed
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
7730109762 sha256_generic: make internal functions a little more private
Linux 6.18 has conflicting prototypes for various sha256_* and sha512_*
functions, which we get through a very long include chain. That's tough
to fix right now; easier is just to rename our internal functions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
2778832e22 Linux 6.18: namespace type moved to ns_common
The namespace type has moved from the namespace ops struct to the
"common" base namespace struct. Detect this and define a macro that does
the right thing for both versions.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
005c631499 Linux 6.18: replace write_cache_pages()
Linux 6.18 removed write_cache_pages() without a usable replacement.
Here we implement a minimal zpl_write_cache_pages() that find the dirty
pages within the mapping, gets them into the expected state and hands
them off to zfs_putpage(), which handles the rest.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
04d0f83f4e Linux 6.18: block_device_operations->getgeo takes struct gendisk*
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
49f078997a Linux 6.18: convert ida_simple_* calls
ida_simple_get() and ida_simple_remove() are removed in 6.18. However,
since 4.19 they have been simple wrappers around ida_alloc() and
ida_free(), so we can just use those directly.

Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Rob Norris
3fb241157f Linux 6.18: replace nth_page()
Sponsored-by: https://despairlabs.com/sponsor/
Signed-off-by: Rob Norris <robn@despairlabs.com>
2025-10-21 09:50:43 -07:00
Andrew Walker
799bda73e2 Fix return value for setting zvol threading
We must return -1 instead of ENOENT if the special zvol threading
property set function can't locate the dataset (this would typically
happen with an encypted and unmounted zvol) so that the operation
gets inserted properly into the nvlist for operations to set. This
is because we want the property to be set once the zvol is
decrypted again.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Closes #17836
2025-10-21 09:50:43 -07:00
Shreshth3
b0106a1b74 zdb: fix bug with -A flag
Fixes #10544.

According to the manpage, zdb -A should
ignore all assertions. But it currently
does not do that. This commit fixes
this bug.

Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17825
2025-10-21 09:50:43 -07:00
Andrew Walker
084f8d0077 Fix ZFS_READONLY implementation on Linux
MS-FSCC 2.6 is the governing document for
DOS attribute behavior. It specifies the following:

For a file, applications can read the file but
cannot write to it or delete it. For a directory,
applications cannot delete it, but applications can
create and delete files from the directory.

Signed-off-by: Andrew Walker <awalker@ixsystems.com>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17837
2025-10-21 09:50:43 -07:00
Brian Behlendorf
7987d4deb4 Update device removal documentation
Make a minor update to the 'zpool remove' man page to clarify both
raidz and draid pools do not support removal, and change sector to
ashift which is what we actually care about.

Update the big theory comment in vdev_removal.c to accurately reflect
which types of vdevs can be removed.  Furthermore, I've added some
discussion for the casual reader to briefly explain the top-level
vdev removal restrictions.  This has been a common area of confusion
and it's not intuitive where they come from without understanding
the implementation details.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17847
2025-10-21 09:50:43 -07:00
Rob Norris
1956417b54 mmap_seek: print error code and text on failure
If lseek() returns an unexpected error, it's useful to know the error
code to help connect it to the trouble spot inside the module.

Since the two seek functions should be basically identical, lift them
into a single generic function.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Robert Evans <evansr@google.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17843
2025-10-21 09:50:43 -07:00
Ameer Hamza
3378a324df CI: Fix FreeBSD 15.0 by staying on ALPHA4 due to broken ALPHA5 image
FreeBSD 15.0-ALPHA5 image fails to boot on cloud VMs due to missing
/boot/efi mount point, causing the system to drop to single user mode
where SSH cannot start. Work around this by staying on ALPHA4 and
setting IGNORE_OSVERSION=yes to bypass pkg's kernel version mismatch
prompt during bootstrap. This allows CI to proceed with ALPHA4 until we
have a stable FreeBSD 15.0 image.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17846
2025-10-21 09:50:43 -07:00
Shreshth3
f16fa115d1 arc: fix small typos
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Shreshth Srivastava <shreshthsrivastava2@gmail.com>
Closes #17840
2025-10-21 09:50:43 -07:00
Rob Norris
f0c76f8a7b libzpool/cmn_err: remove suppression, add stop option, cleanup
A small uplift of the cmn_err() and panic() calls in userspace:

- remove the suppression on CE_NOTE. We have very few of these calls in
  a standard build, it's convenient for "print debugging".

- make prefixes clear and consistent.

- add LIBZPOOL_PANIC_STOP environment variable to send SIGSTOP to the
  process group on a panic, rather than abort(), so all threads remain
  alive for inspection.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #17834
2025-10-21 09:50:43 -07:00
Mark Johnston
c1f55bff8b Fix the type of the raidz_outlier_check_interval_ms parameter
It's an hrtime_t, which is an unsigned long long.  In practice this is
just a U64.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #17833
2025-10-21 09:50:43 -07:00
Alexander Motin
f0bff230f9 Suppress some ashift warnings
Do not warn about vdev ashifts being smaller then physical ashifts
in a pool status if the pool ashift property set and vdev ashift
satisfies it (bigger or equal), since user explicitly requested
this.  The ashift of individual vdevs are still reported.

Do not warn about vdev ashifts in zpool import, since it doesn't
matter much, and we don't even report individual vdevs ashifts
there.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com>
Closes #17830
2025-10-21 09:50:43 -07:00