Changed zfs_k(un)map_atomic to zfs_k(un)map_local
Signed-off-by: Jason Lee <jasonlee@lanl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Currently, if a minor is in use when we try to remove it, we'll skip it
and never come back to it again. Since the zvol state is hung off the
minor in the kernel, this can get us into weird situations if something
tries to use it after the removal fails. It's even worse at pool export,
as there's now a vestigial zvol state with no pool under it. It's
weirder again if the pool is subsequently reimported, as the zvol code
(reasonably) assumes the zvol state has been properly setup, when it's
actually left over from the previous import of the pool.
This commit attempts to tackle that by setting a flag on the zvol if its
minor can't be removed, and then checking that flag when a request is
made and rejecting it, thus stopping new work coming in.
The flag also causes a condvar to be signaled when the last client
finishes. For the case where a single minor is being removed (eg
changing volmode), it will wait for this signal before proceeding.
Meanwhile, when removing all minors, a background task is created for
each minor that couldn't be removed on the spot, and those tasks then
wake and clean up.
Since any new tasks are queued on to the pool's spa_zvol_taskq,
spa_export_common() will continue to wait at export until all minors are
removed.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes#14872Closes#16364
When process got SIGSTOP/SIGTSTP, issig() dequeue them and return 0.
But process could still have another signal pending after dequeue. So,
after dequeue, check and return 1, if signal_pending.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jitendra Patidar <jitendra.patidar@nutanix.com>
Closes#16464
We always call it twice with JUSTLOOKING and then FORREAL.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes#16225
Even though block cloning is much faster than regular copying,
it is not instantaneous - the file might be large and the recordsize
small. It would be nice to be able to interrupt it with a signal
(e.g., SIGINFO on FreeBSD to see the progress).
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes#16208
Linux provides SLAB_RECLAIM_ACCOUNT and __GFP_RECLAIMABLE flags to
mark memory allocations that can be freed via shinker calls. It
should allow kernel to tune and group such allocations for lower
memory fragmentation and better reclamation under pressure.
This patch marks as reclaimable most of ARC memory, directly
evictable via ZFS shrinker, plus also dnode/znode/sa memory,
indirectly evictable via kernel's superblock shrinker.
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
This commit replaces current usages of schedule_timeout() with
schedule_timeout_interruptible() in code paths that expect the running
task to sleep for a short period of time. When schedule_timeout() is
called without previously calling set_current_state(), the running
task never sleeps because the task state remains in TASK_RUNNING.
By calling schedule_timeout_interruptible() to set the task state to
TASK_INTERRUPTIBLE before calling schedule_timeout() we achieve the
intended/desired behavior of putting the task to sleep for the
specified timeout.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Perry <dtperry@amazon.com>
Closes#16150
Freeing an ABD can take sleeping locks to update various stats. We
aren't allowed to sleep on an interrupt handler. So, move the free off
to the io_done callback.
We should never have been freeing things in the interrupt handler, but
we got away with it because we were usually freeing a linear ABD, which
at most is returning two objects to a cache and never sleeping. Scatter
ABDs can be used now, and those have more complex locking.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes#16687
It seems out our notion of "properly" aligned IO was incomplete. In
particular, dm-crypt does its own splitting, and assumes that a logical
block will never cross an order-0 page boundary (ie, the physical page
size, not compound size). This effectively means that it needs to be
possible to split a BIO at any page or block size boundary and have it
work correctly.
This updates the alignment check function to enforce these rules (to the
extent possible).
Our response to misaligned data is to make some new allocation that is
properly aligned, and copy the data into it. It turns out that
linearising (via abd_borrow_buf()) is not enough, because we allocate eg
4K blocks from a general purpose slab, and so may receive (or already
have) a 4K block that crosses pages.
So instead, we allocate a new ABD, which is guaranteed to be aligned
properly to block sizes, and then copy everything into it, and back out
on the way back.
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes#16687#16631#15646#15533#14533
(cherry picked from commit 63bafe60ec741c269d29e26b192a8a5c4f6acf92)
If on the first open device's logical ashift is bigger than set
by pool's ashift property, ignore the last as unusable instead of
creating vdev that will fail most of I/Os due to misalignment.
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#16690
In FreeBSD's `zio_do_crypt_data()`, ensure that two `struct uio`
variables are cleared before copying data out of them. This avoids
accessing garbage data, and fixes gcc `-Wuninitialized` warnings.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Toomas Soome <tsoome@me.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Dimitry Andric <dimitry@andric.com>
Closes#16688
While mounting ZFS root during boot on Linux distributions from initrd,
mount from busybox is effectively used which executes mount system call
directly. This skips the ZFS helper mount.zfs, which checks and enables
the mount options as specified in dataset properties. As a result,
datasets mounted during boot from initrd do not have correct mount
options as specified in ZFS dataset properties.
There has been an attempt to use mount.zfs in zfs initrd script,
responsible for mounting the ZFS root filesystem (PR#13305). This was
later reverted (PR#14908) after discovering that using mount.zfs breaks
mounting of snapshots on root (/) and other child datasets of root have
the same issue (Issue#9461).
This happens because switching from busybox mount to mount.zfs correctly
parses the mount options but also adds 'mntpoint=/root' to the mount
options, which is then prepended to the snapshot mountpoint in
'.zfs/snapshot'. '/root' is the directory on Debian with initramfs-tools
where root filesystem is mounted before pivot_root. When Linux runtime
is reached, trying to access the snapshots on root results in
automounting the snapshot on '/root/.zfs/*', which fails.
This commit attempts to fix the automounting of snapshots on root, while
using mount.zfs in initrd script. Since the mountpoint of dataset is
stored in vfs_mntpoint field, we can check if current mountpoint of
dataset and vfs_mntpoint are same or not. If they are not same, reset
the vfs_mntpoint field with current mountpoint. This fixes the
mountpoints of root dataset and children in respective vfs_mntpoint
fields when we try to access the snapshots of root dataset or its
children. With correct mountpoint for root dataset and children stored
in vfs_mntpoint, all snapshots of root dataset are mounted correctly
and become accessible.
This fix will come into play only if current process, that is trying to
access the snapshots is not in chroot context. The Linux kernel API
that is used to convert struct path into char format (d_path), returns
the complete path for given struct path. It works in chroot environment
as well and returns the correct path from original filesystem root.
However d_path fails to return the complete path if any directory from
original root filesystem is mounted using --bind flag or --rbind flag
in chroot environment. In this case, if we try to access the snapshot
from outside the chroot environment, d_path returns the path correctly,
i.e. it returns the correct path to the directory that is mounted with
--bind flag. However inside the chroot environment, it only returns the
path inside chroot.
For now, there is not a better way in my understanding that gives the
complete path in char format and handles the case where directories from
root filesystem are mounted with --bind or --rbind on another path which
user will later chroot into. So this fix gets enabled if current
process trying to access the snapshot is not in chroot context.
With the snapshots issue fixed for root filesystem, using mount.zfs in
ZFS initrd script, mounts the datasets with correct mount options.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes#16646
`zvol_rename_minors()` needs to be given the full path not just the
snapshot name. Use code removed in a0bd735ad as a guide
to providing the necessary values.
Add ZTS check for /dev changes after snapshot rename. After
renaming a snapshot with 'snapdev=visible' ensure that the /dev
entries are updated to reflect the rename.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Dingwall <james@dingwall.me.uk>
Closes#14223Closes#16600
Since arc_evict() run can take some time, arc_c change during it
may result in undesired shift in ARC states balance. Primarily in
case of arc_c reduction it may cause eviction from MFU data state
despite its being below the target already. Instead we should
evict as originally planned and if needed do another round after.
Reviewed-by: Theera K. <tkittich@hotmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#16576Closes#16605
Compiling with -O0 (no proper optimizations), strlen() call
in loops for comparing the size, isn't being called/initialized
before the actual loop gets started, which causes n-numbers of
strlen() calls (as long as the string is). Keeping the length
before entering in the loop is a good idea.
On some places, even with -O2, both GCC and Clang can't
recognize this pattern, which seem to happen in an array
of char pointer.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: rilysh <nightquick@proton.me>
Closes#16584
Linux 6.10+ with CONFIG_FORTIFY_SOURCE notices memcpy() accessing past
the end of TString, because it has no indication that there there may be
an additional allocation there.
There's no appropriate upstream change for this (ancient) version of
Lua, so this is the narrowest change I could come up with to add a flex
array field to the end of TString to satisfy the check. It's loosely
based on changes from lua/lua@ca41b43f and lua/lua@9514abc2.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes#16541Closes#16583
Since dsl_crypto_key_open() references the key, 0d23f5e2e4 should
have called dsl_crypto_key_rele() to drop it first instead of
calling dsl_crypto_key_free() directly. The final result should
actually be the same, but without triggering dck_holds assertion.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#16567
Couple places in the code depend on 0 returned only if the task was
actually cancelled. Doing otherwise could lead to extra references
being dropped. The race could be small, but I believe CI hit it
from time to time.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#16565
This adds the HAVE_KERNEL_NEON and HAVE_KERNEL_FPU_INTERNAL
guards to simd_stat.c defaulted to 0 to make it build again.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Sebastian Wuerl <s.wuerl@mailbox.org>
Closes#16558
Evidently while reworking it on aarch64, I broke it on x86 and
didn't notice.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#16556
Too many times, people's performance problems have amounted to
"somehow your SIMD support isn't working", and determining that
at runtime is difficult to describe to people.
This adds a /proc/spl/kstat/zfs/simd node, which exposes
metadata about which instructions ZFS thinks it can use,
on AArch64 and x86_64 Linux, to make investigating things
like this much easier.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#16530
Without updating 'm' we evict from MFU metadata all that we wanted
to evict from all metadata, including already evicted MRU metadata
('m' is the total amount of metadata we had at the beginning,
and 'w' is the total amount of metadata we want to have).
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Theera K. <tkittich@hotmail.com>
Closes#16521Closes#16546
zfs_acl_node_alloc allocates an uninitialized data buffer, but upstack
zfs_acl_chmod only partially initializes it. KMSAN reported that this
memory remained uninitialized at the point when it was read by
lzjb_compress, which suggests a possible kernel memory disclosure bug.
The full KMSAN warning may be found in the PR.
https://github.com/openzfs/zfs/pull/16511
Signed-off-by: Alan Somers <asomers@gmail.com>
Sponsored by: Axcient
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Optionally turn off disk's enclosure slot if an I/O is hung
triggering the deadman.
It's possible for outstanding I/O to a misbehaving SCSI disk to
neither promptly complete or return an error. This can occur due
to retry and recovery actions taken by the SCSI layer, driver, or
disk. When it occurs the pool will be unresponsive even though
there may be sufficient redundancy configured to proceeded without
this single disk.
When a hung I/O is detected by the kmods it will be posted as a
deadman event. By default an I/O is considered to be hung after
5 minutes. This value can be changed with the zfs_deadman_ziotime_ms
module parameter. If ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is set
the disk's enclosure slot will be powered off causing the outstanding
I/O to fail. The ZED will then handle this like a normal disk failure.
By default ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is not set.
As part of this change `zfs_deadman_events_per_second` is added
to control the ratelimitting of deadman events independantly of
delay events. In practice, a single deadman event is sufficient
and more aren't particularly useful.
Alphabetize the zfs_deadman_* entries in zfs.4.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#16226
ZIL log record structs (lr_XX_t) are frequently allocated with extra
space after the struct to carry variable-sized "payload" items.
Linux 6.10+ compiled with CONFIG_FORTIFY_SOURCE has been doing runtime
bounds checking on memcpy() calls. Because these types had no indicator
that they might use more space than their simple definition,
__fortify_memcpy_chk will frequently complain about overruns eg:
memcpy: detected field-spanning write (size 7) of single field
"lr + 1" at zfs_log.c:425 (size 0)
memcpy: detected field-spanning write (size 9) of single field
"(char *)(lr + 1)" at zfs_log.c:593 (size 0)
memcpy: detected field-spanning write (size 4) of single field
"(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)
memcpy: detected field-spanning write (size 7) of single field
"lr + 1" at zfs_log.c:425 (size 0)
memcpy: detected field-spanning write (size 9) of single field
"(char *)(lr + 1)" at zfs_log.c:593 (size 0)
memcpy: detected field-spanning write (size 4) of single field
"(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)
memcpy: detected field-spanning write (size 7) of single field
"lr + 1" at zfs_log.c:425 (size 0)
memcpy: detected field-spanning write (size 9) of single field
"(char *)(lr + 1)" at zfs_log.c:593 (size 0)
memcpy: detected field-spanning write (size 4) of single field
"(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)
To fix this, this commit adds flex array fields to all lr_XX_t structs
that require them, and then uses those fields to access that
end-of-struct area rather than more complicated casts and pointer
addition.
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes#16501Closes#16539
- Use the macros in few places it was missed.
- Reduce scope of DB_DNODE_ENTER/EXIT() and inline some DB_DNODE()
uses to make it more obvious what exactly is protected there and
make unprotected accesses by mistake more difficult.
- Make use of zrl_owner().
Reviewed-by: Rob Wing <rob.wing@klarasystems.com
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#16374
`l2arc_mfuonly` was added to avoid wasting L2 ARC on read-once MRU
data and metadata. However it can be useful to cache as much
metadata as possible while, at the same time, restricting data
cache to MFU buffers only.
This patch allow for such behavior by setting `l2arc_mfuonly` to 2
(or higher). The list of possible values is the following:
0: cache both MRU and MFU for both data and metadata;
1: cache only MFU for both data and metadata;
2: cache both MRU and MFU for metadata, but only MFU for data.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes#16343Closes#16402
zvol queue limits initialization depends on `zv_volblocksize`, but it is
initialized later, leading to several limits being initialized with
incorrect values, including `max_discard_*` limits. This also causes
`blkdiscard` command to consistently fail, as `blk_ioctl_discard` reads
`bdev_max_discard_sectors()` limits as 0, leading to failure. The fix is
straightforward: initialize `zv->zv_volblocksize` early, before setting
the queue limits. This PR should fix `zvol/zvol_misc/zvol_misc_trim`
failure on recent PRs, as the test case issues `blkdiscard` for a zvol.
Additionally, `zvol_misc_trim` was recently enabled in `6c7d41a`,
which is why the issue wasn't identified earlier.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#16454
It gets hairier again in Linux 6.11, so I want some actual theory of
operation laid out for next time.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400
My merged pull request #15557 fixes compilation of sha2 kernels on arm
v5/6. However, the compiler guards only allows sha256/512_armv7_impl to
be used when __ARM_ARCH > 6. This patch enables these ASM kernels on all
arm architectures. Some compiler guards are adjusted accordingly to
avoid the unnecessary compilation of SIMD (e.g., neon, armv8ce) kernels
on old architectures.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes#15623
This patch uses __ARM_ARCH set by compiler (both
GCC and Clang have this) whenever possible instead
of hardcoding it to 7. This change allows code to
compile on earlier ARM architectures such as armv5te.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes#15557
Rob Noris suggested that we could clean up redundant limits for the case
of non-blk mq scenario.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#16462
In kernels 6.8 and later, the zvol block device is allocated with
qlimits passed during initialization. However, the zvol driver does not
set `max_hw_discard_sectors`, which is necessary to properly
initialize `max_discard_sectors`. This causes the `zvol_misc_trim` test
to fail on 6.8+ kernels when invoking the `blkdiscard` command. Setting
`max_hw_discard_sectors` in the `HAVE_BLK_ALLOC_DISK_2ARG` case resolve
the issue.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#16462
If a zvol is renamed, and it has one or more snapshots, and
snapdev=visible is true for the zvol, then the rename causes a kernel
null pointer dereference error. This has the effect (on Linux, anyway)
of killing the z_zvol taskq kthread, with locks still held; which in
turn causes a variety of zvol-related operations afterward to hang
indefinitely (such as udev workers, among other things).
The problem occurs because of an oversight in #15486
(e36ff84c33). As documented in
dataset_kstats_create, some datasets may not actually have kstats
allocated for them; and at least at the present time, this is true for
snapshots. In practical terms, this means that for snapshots,
dk->dk_kstats will be NULL. The dataset_kstats_rename function
introduced in the patch above does not first check whether dk->dk_kstats
is NULL before proceeding, unlike e.g. the nearby
dataset_kstats_update_* functions.
In the very particular circumstance in which a zvol is renamed, AND that
zvol has one or more snapshots, AND that zvol also has snapdev=visible,
zvol_rename_minors_impl will loop over not just the zvol dataset itself,
but each of the zvol's snapshots as well, so that their device nodes
will be renamed as well. This results in dataset_kstats_create being
called for snapshots, where, as we've established, dk->dk_kstats is
NULL.
Fix this by simply adding a NULL check before doing anything in
dataset_kstats_rename.
This still allows the dataset_name kstat value for the zvol to be
updated (as was the intent of the original patch), and merely blocks
attempts by the code to act upon the zvol's non-kstat-having snapshots.
If at some future time, kstats are added for snapshots, then things
should work as intended in that case as well.
Signed-off-by: Justin Gottula <justin@jgottula.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
zvol_alloc_non_blk_mq()->blk_queue_set_write_cache() needs the disk
queue setup to prevent a NULL pointer deference.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#16453
The 6.10 kernel broke our rpm-kmod builds. The 6.10 kernel really
wants the source files in the same directory as the object files.
This workaround makes rpm-kmod work again. It also updates
the builtin kernel codepath to work correctly with 6.10.
See kernel commits:
b1992c3772e6 kbuild: use $(src) instead of $(srctree)/$(src) for source
directory
9a0ebe5011f4 kbuild: use $(obj)/ instead of $(src)/ for common pattern
rules
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes#16439Closes#16450
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400
Since the change to folios it has just been a wrapper anyway. Linux has
removed their wrapper, so we add one.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400
These fields are very old, so no detection necessary; we just move them
into the limit setup functions.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400
Apply them with with the rest of the settings.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400
Detect it, and use a macro to make sure we always match the prototype.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/Closes#16400