Commit Graph

9146 Commits

Author SHA1 Message Date
Shengqi Chen
92c384cbac CI: run only sanity check on limited OSes for nonbehavioral changes
The commit uses heuristics to determine whether a PR is behavioral:

It runs "quick" CI (i.e., only use sanity.run on fewer OSes)
if (explicitly requested by user):
- the *last* commit message contains a line 'ZFS-CI-Type: quick',
or if (by heuristics):
- the files changed are not in the list of specified directory, and
- all commit messages does not contain 'ZFS-CI-Type: full'.

It runs "full" CI otherwise.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16564
2024-11-04 11:08:44 -08:00
Brian Behlendorf
040be2504a CI: cancel workflows when PRs are updated (#16562)
For checkstyle, zloop, zfs-qemu, and codeql workflows cancel
in-progress jobs when the PR is updated.

Relevant GitHub Actions documentation:

  The following concurrency group cancels in-progress jobs or run
  on pull_request events only; if github.head_ref is undefined, the
  concurrency group will fallback to the run ID, which is guaranteed
  to be both unique and defined for the run.

https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#example-using-a-fallback-value

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16562
2024-11-04 11:08:38 -08:00
Brian Behlendorf
1b27b76f6f ZTS: CI Documentation Updates
Update the CONTRIBUTING.md documentation to refer to the GitHub Actions
workflows which have replaced the buildbot infrastructure.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16561
2024-11-04 11:08:30 -08:00
Brian Behlendorf
af0a2bef40 ZTS: CodeQL Action v3 update
Switch from v2 to v3 CodeQL Actions.  The v2 actions will no longer
be supported as of Dec '24 so we need to move to v3.  According to
the release notes they should be functionally equivalent.

    Note that the only difference between v2 and v3 of the CodeQL
    Action is the node version they support, ... For example 3.22.11
    was the first v3 release and is functionally identical to 2.22.11.

https://github.com/github/codeql-action/blob/main/CHANGELOG.md

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16560
2024-11-04 11:08:24 -08:00
Brian Behlendorf
ce3ad59c27 ZTS: Add additional exceptions
The following tests have been observed to occasionally fail when
running under the CI.  Updated our exceptions list to track them.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:49 -08:00
Brian Behlendorf
46b0d5c486 ZTS: Retire "tmpfile_reason" exception
All supported Linux kernels, 4.18 and newer, provide O_TMPFILE.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:43 -08:00
Brian Behlendorf
b69c7f3847 ZTS: Retire "ci_reason" exceptions
There is no longer be a need for the ci_reason exception with
the update CI GitHub Actions infrastruture.  Retire it.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:37 -08:00
Tino Reichardt
484f92b8fa ZTS: Fix Summary Page
The qemu-9-summary-page.sh script reads the file env.txt in the
first lines. When the module didn't build, this file was not copied
into the tarfile - causing the scipt to abort.

Fix: copy needed files into the tarfile in case of module build
failures. The fix ignores also empty tarfiles in future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16555
2024-11-04 11:07:29 -08:00
Umer Saleem
e2bded28ea ZTS: Fix skipping over comment lines in zpool_create.shlib
In zpool_create.shlib, check_feature_set iterates over all features
mentioned in provided compatibility file to check if only those
features are enabled on the pool.

This commit fixes skipping over comment lines correctly. Otherwise,
the test case fails as comment lines are also treated as feature names
by check_feature_set function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15909
2024-11-04 11:07:19 -08:00
Tino Reichardt
ca5652c420 ZTS: Remove functional tests via matrix
This commit changes the workflow of the github actions.

- Ubuntu 20.04, 22.04, 24.04 will be tested via QEMU now
- remove unused scripts of this commit: b7bc334d1
- re-add the zloop standalone testings via zloop.yml

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16549
2024-11-04 11:07:04 -08:00
Tino Reichardt
53879c4ecc ZTS: Fix Test Summary page generation
Fix that error: "cat /tmp/failed.txt: No such file or directory"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16549
2024-11-04 11:06:56 -08:00
Tino Reichardt
1cbb695dcc ZTS: use openssl for md5digest and sha256digest
On larger files this should improve the speed.

Sample values of my system:

[mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k | sha256sum
254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917  -
real    0m1,050s
user    0m0,985s
sys     0m0,153s

[mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k | openssl sha256 -r
254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917 *stdin
real    0m0,254s
user    0m0,206s
sys     0m0,160s

I think cli_root/zdb/zdb_backup.ksh runs also an FreeBSD and I needed to
include the sysutils/coreutils package for the FreeBSD tests within the
QEMU patchset.

This could be reverted, when this pull request gets upstream

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16543
2024-11-04 11:06:47 -08:00
Tino Reichardt
66db223778 ZTS: Use QEMU for tests on Linux and FreeBSD
This commit adds functional tests for these systems:
- AlmaLinux 8, AlmaLinux 9, ArchLinux
- CentOS Stream 9, Fedora 39, Fedora 40
- Debian 11, Debian 12
- FreeBSD 13, FreeBSD 14, FreeBSD 15
- Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04

- enabled by default:
 - AlmaLinux 8, AlmaLinux 9
 - Debian 11, Debian 12
 - Fedora 39, Fedora 40
 - FreeBSD 13, FreeBSD 14

Workflow for each operating system:
- install qemu on the github runner
- download current cloud image of operating system
- start and init that image via cloud-init
- install dependencies and poweroff system
- start system and build openzfs and then poweroff again
- clone build system and start 2 instances of it
- run functional testings and complete in around 3h
- when tests are done, do some logfile preparing
- show detailed results for each system
- in the end, generate the job summary

Real-world benefits from this PR:

1. The github runner scripts are in the zfs repo itself. That means
   you can just open a PR against zfs, like "Add Fedora 41 tester", and
   see the results directly in the PR. ZFS admins no longer need
   manually to login to the buildbot server to update the buildbot config
   with new version of Fedora/Almalinux.

2. Github runners allow you to run the entire test suite against your
   private branch before submitting a formal PR to openzfs. Just open a
   PR against your private zfs repo, and the exact same
   Fedora/Alma/FreeBSD runners will fire up and run ZTS. This can be
   useful if you want to iterate on a ZTS change before submitting a
   formal PR.

3. buildbot is incredibly cumbersome. Our buildbot config files alone
   are ~1500 lines (not including any build/setup scripts)!
   It's a huge pain to setup.

4. We're running the super ancient buildbot 0.8.12. It's so ancient
   it requires python2. We actually have to build python2 from source
   for almalinux9 just to get it to run. Ugrading to a more modern
   buildbot is a huge undertaking, and the UI on the newer versions is
   worse.

5. Buildbot uses EC2 instances. EC2 is a pain because:
   * It costs money
   * They throttle IOPS and CPU usage, leading to mysterious,
   * hard-to-diagnose, failures and timeouts in ZTS.
   * EC2 is high maintenance. We have to setup security groups, SSH
   * keys, networking, users, etc, in AWS and it's a pain. We also
   * have to periodically go in an kill zombie EC2 instances that
   * buildbot is unable to kill off.

6. Buildbot doesn't always handle failures well. One of the things we
   saw in the past was the FreeBSD builders would often die, and each
   builder death would take up a "slot" in buildbot. So we would
   periodically have to restart buildbot via a cron job to get the slots
   back.

7. This PR divides up the ZTS test list into two parts, launches two
   VMs, and on each VM runs half the test suite. The test results are
   then merged and shown in the sumary page. So we're basically
   parallelizing ZTS on the same github runner. This leads to lower
   overall ZTS runtimes (2.5-3 hours vs 4+ hours on buildbot), and one
   unified set of results per runner, which is nice.

8. Since the tests are running on a VM, we have much more control over
   what happens. We can capture the serial console output even if the
   test completely brings down the VM. In the future, we could also
   restart the test on the VM where it left off, so that if a single test
   panics the VM, we can just restart it and run the remaining ZTS tests
   (this functionaly is not yet implemented though, just an idea).

9. Using the runners, users can manually kill or restart a test run
   via the github IU. That really isn't possible with buildbot unless
   you're an admin.

10. Anecdotally, the tests seem to be more stable and constant under
    the QEMU runners.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16537
2024-11-04 11:06:02 -08:00
Tino Reichardt
36bac59fbc ZTS: increase timeout of mmap_sync_001_pos
On load the test needs sometimes a bit more time then just one second.
Doubling the time will help on the QEMU based testings.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16537
2024-11-04 11:05:55 -08:00
Tino Reichardt
e29e7a0502 ZTS: fix zpool_status_008_pos test on qemu vm's
The test needs some adjusting within the timings.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16537
2024-11-04 11:05:42 -08:00
Rob Norris
d6ec55963a zts-report: don't crash on non-UTF-8 chars in the log (#16497)
The report generator expects the log to be clean and tidy UTF-8. That
can be a problem if you use some of the verbose/debug test runner
options, which sends all sorts of weird output from arbitrary programs
to the log.

This just makes Python a little more relaxed about such things. It
shouldn't matter in practice, as those lines didn't match the test
result regex anyway, and are discarded immediately.


Sponsored-by: https://despairlabs.com/sponsor/

Signed-off-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-11-04 11:05:33 -08:00
Ameer Hamza
66bb2a8389 Github workflow: fix typo in zloop artifact
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #16432
2024-11-04 11:03:30 -08:00
Mateusz Piotrowski
00dd5ea9e6 tests: user_property_001_pos: Remove unnecessary evals
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #16248
2024-11-04 11:02:28 -08:00
Mateusz Piotrowski
b4502ac342 tests: user_property: Clarify comments
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #16248
2024-11-04 11:02:19 -08:00
Tino Reichardt
55468ccc3e ZTS: small fix for SEEK_DATA/SEEK_HOLE tests (#16413)
Some libc's like uClibc lag the proper definition of SEEK_DATA
and SEEK_HOLE. Since we have only two files in ZTS which use
these definitons, let's define them by hand:

```
#ifndef SEEK_DATA
#define SEEK_DATA 3
#endif
#ifndef SEEK_HOLE
#define SEEK_HOLE 4
#endif
```

There should be no failures, because:
- FreeBSD has support for SEEK_DATA/SEEK_HOLE since FreeBSD 8
- Linux has it since Linux 3.1
- the libc will submit the parameters unchanged to the kernel

Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2024-11-04 10:57:05 -08:00
Brian Behlendorf
b0cfb480ca zed: Add deadman-slot_off.sh zedlet
Optionally turn off disk's enclosure slot if an I/O is hung
triggering the deadman.

It's possible for outstanding I/O to a misbehaving SCSI disk to
neither promptly complete or return an error.  This can occur due
to retry and recovery actions taken by the SCSI layer, driver, or
disk.  When it occurs the pool will be unresponsive even though
there may be sufficient redundancy configured to proceeded without
this single disk.

When a hung I/O is detected by the kmods it will be posted as a
deadman event.  By default an I/O is considered to be hung after
5 minutes.  This value can be changed with the zfs_deadman_ziotime_ms
module parameter.  If ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is set
the disk's enclosure slot will be powered off causing the outstanding
I/O to fail.  The ZED will then handle this like a normal disk failure.
By default ZED_POWER_OFF_ENCLOSURE_SLOT_ON_DEADMAN is not set.

As part of this change `zfs_deadman_events_per_second` is added
to control the ratelimitting of deadman events independantly of
delay events.  In practice, a single deadman event is sufficient
and more aren't particularly useful.

Alphabetize the zfs_deadman_* entries in zfs.4.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16226
2024-11-04 10:49:53 -08:00
Rob Norris
8ca4319f60 ZTS: remove skips for zvol_misc tests
Last commit should fix the underlying problem, so these should be
passing reliably again.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #16364
2024-11-04 10:37:57 -08:00
Brian Behlendorf
d66412884a Linux 6.11 compat: META
Update the META file to reflect compatibility with the 6.11 kernel.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16586
2024-11-04 10:34:48 -08:00
Rob Norris
3fb6647561 META: set Linux minimum version to 4.18
This sets RHEL8's base kernel[1] as the floor, and includes the oldest
kernel.org LTS (4.19).

1. https://access.redhat.com/articles/3078#RHEL8

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #16479
2024-11-04 10:34:48 -08:00
Rob Norris
30ea442961 zfs_log: add flex array fields to log record structs
ZIL log record structs (lr_XX_t) are frequently allocated with extra
space after the struct to carry variable-sized "payload" items.

Linux 6.10+ compiled with CONFIG_FORTIFY_SOURCE has been doing runtime
bounds checking on memcpy() calls. Because these types had no indicator
that they might use more space than their simple definition,
__fortify_memcpy_chk will frequently complain about overruns eg:

    memcpy: detected field-spanning write (size 7) of single field
        "lr + 1" at zfs_log.c:425 (size 0)
    memcpy: detected field-spanning write (size 9) of single field
        "(char *)(lr + 1)" at zfs_log.c:593 (size 0)
    memcpy: detected field-spanning write (size 4) of single field
        "(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)
    memcpy: detected field-spanning write (size 7) of single field
        "lr + 1" at zfs_log.c:425 (size 0)
    memcpy: detected field-spanning write (size 9) of single field
        "(char *)(lr + 1)" at zfs_log.c:593 (size 0)
    memcpy: detected field-spanning write (size 4) of single field
        "(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)
    memcpy: detected field-spanning write (size 7) of single field
        "lr + 1" at zfs_log.c:425 (size 0)
    memcpy: detected field-spanning write (size 9) of single field
        "(char *)(lr + 1)" at zfs_log.c:593 (size 0)
    memcpy: detected field-spanning write (size 4) of single field
        "(char *)(lr + 1) + snamesize" at zfs_log.c:594 (size 0)

To fix this, this commit adds flex array fields to all lr_XX_t structs
that require them, and then uses those fields to access that
end-of-struct area rather than more complicated casts and pointer
addition.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #16501
Closes #16539
2024-11-04 10:34:48 -08:00
Alexander Motin
56608daa8d Cleanup DB_DNODE() macros usage
- Use the macros in few places it was missed.
 - Reduce scope of DB_DNODE_ENTER/EXIT() and inline some DB_DNODE()
uses to make it more obvious what exactly is protected there and
make unprotected accesses by mistake more difficult.
 - Make use of zrl_owner().

Reviewed-by: Rob Wing <rob.wing@klarasystems.com
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16374
2024-11-04 10:34:48 -08:00
Tony Hutter
baa5031456 Tag zfs-2.2.6
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2024-08-27 14:53:03 -07:00
shodanshok
cd42e992b5 Enable L2 cache of all (MRU+MFU) metadata but MFU data only
`l2arc_mfuonly` was added to avoid wasting L2 ARC on read-once MRU
data and metadata. However it can be useful to cache as much
metadata as possible while, at the same time, restricting data
cache to MFU buffers only.

This patch allow for such behavior by setting `l2arc_mfuonly` to 2
(or higher). The list of possible values is the following:
0: cache both MRU and MFU for both data and metadata;
1: cache only MFU for both data and metadata;
2: cache both MRU and MFU for metadata, but only MFU for data.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #16343 
Closes #16402
2024-08-27 14:53:03 -07:00
Ameer Hamza
c60df6a801 linux/zvol_os: fix zvol queue limits initialization
zvol queue limits initialization depends on `zv_volblocksize`, but it is
initialized later, leading to several limits being initialized with
incorrect values, including `max_discard_*` limits. This also causes
`blkdiscard` command to consistently fail, as `blk_ioctl_discard` reads
`bdev_max_discard_sectors()` limits as 0, leading to failure. The fix is
straightforward: initialize `zv->zv_volblocksize` early, before setting
the queue limits. This PR should fix `zvol/zvol_misc/zvol_misc_trim`
failure on recent PRs, as the test case issues `blkdiscard` for a zvol.
Additionally, `zvol_misc_trim` was recently enabled in `6c7d41a`,
which is why the issue wasn't identified earlier.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #16454
2024-08-26 15:10:16 -07:00
Rob Norris
d8fa32a79d linux/zvol_os: tidy and document queue limit/config setup
It gets hairier again in Linux 6.11, so I want some actual theory of
operation laid out for next time.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-26 15:10:16 -07:00
Tino Reichardt
88a5ee0706 ZTS: fix zfs_copies_006_pos test on Ubuntu 20.04 (#16409)
This test was failing before:
- FAIL cli_root/zfs_copies/zfs_copies_006_pos (expected PASS)

Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
2024-08-26 15:10:16 -07:00
Tino Reichardt
0465fbecd7 ZTS: fix history_007_pos test on Ubuntu 24.04 (#16410)
The timezone "US/Mountain" isn't supported on newer linux versions.
Using the correct timezone "America/Denver" like it's done in FreeBSD
will fix this. Older Linux distros should behave also okay with this.

Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
2024-08-26 15:10:16 -07:00
Shengqi Chen
a99a37991e contrib: link zpool to zfs in bash-completion (#16376)
Currently user won't have completion of `zpool` command until they
trigger completion of `zfs` first. This patch adds a link to `zfs`,
thus user can use both to initialize the completion.

Fixes: #16320

Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
2024-08-26 15:10:16 -07:00
Shengqi Chen
2ca1515374 module/icp/asm-arm/sha2: enable non-SIMD asm kernels on armv5/6
My merged pull request #15557 fixes compilation of sha2 kernels on arm
v5/6. However, the compiler guards only allows sha256/512_armv7_impl to
be used when __ARM_ARCH > 6. This patch enables these ASM kernels on all
arm architectures. Some compiler guards are adjusted accordingly to
avoid the unnecessary compilation of SIMD (e.g., neon, armv8ce) kernels
on old architectures.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15623
2024-08-26 15:10:16 -07:00
Shengqi Chen
bce36e21ca module/icp/asm-arm/sha2: auto detect __ARM_ARCH
This patch uses __ARM_ARCH set by compiler (both
GCC and Clang have this) whenever possible instead
of hardcoding it to 7. This change allows code to
compile on earlier ARM architectures such as armv5te.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #15557
2024-08-26 15:10:16 -07:00
Tony Hutter
86492e3c96 Linux 6.10 compat: META
Update the META file to reflect compatibility with the 6.10 kernel.

Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16466
2024-08-22 15:43:46 -07:00
Ameer Hamza
07f0465742 linux/zvol_os.c: cleanup limits for non-blk mq case
Rob Noris suggested that we could clean up redundant limits for the case
of non-blk mq scenario.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #16462
2024-08-22 15:43:34 -07:00
Ameer Hamza
0f9457d1dd linux/zvol_os.c: Fix max_discard_sectors limit for 6.8+ kernel
In kernels 6.8 and later, the zvol block device is allocated with
qlimits passed during initialization. However, the zvol driver does not
set `max_hw_discard_sectors`, which is necessary to properly
initialize `max_discard_sectors`. This causes the `zvol_misc_trim` test
to fail on 6.8+ kernels when invoking the `blkdiscard` command. Setting
`max_hw_discard_sectors` in the `HAVE_BLK_ALLOC_DISK_2ARG` case resolve
the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #16462
2024-08-22 15:43:34 -07:00
Justin Gottula
859f906a4b Fix null ptr deref when renaming a zvol with snaps and snapdev=visible (#16316)
If a zvol is renamed, and it has one or more snapshots, and
snapdev=visible is true for the zvol, then the rename causes a kernel
null pointer dereference error. This has the effect (on Linux, anyway)
of killing the z_zvol taskq kthread, with locks still held; which in
turn causes a variety of zvol-related operations afterward to hang
indefinitely (such as udev workers, among other things).

The problem occurs because of an oversight in #15486
(e36ff84c33). As documented in
dataset_kstats_create, some datasets may not actually have kstats
allocated for them; and at least at the present time, this is true for
snapshots. In practical terms, this means that for snapshots,
dk->dk_kstats will be NULL. The dataset_kstats_rename function
introduced in the patch above does not first check whether dk->dk_kstats
is NULL before proceeding, unlike e.g. the nearby
dataset_kstats_update_* functions.

In the very particular circumstance in which a zvol is renamed, AND that
zvol has one or more snapshots, AND that zvol also has snapdev=visible,
zvol_rename_minors_impl will loop over not just the zvol dataset itself,
but each of the zvol's snapshots as well, so that their device nodes
will be renamed as well. This results in dataset_kstats_create being
called for snapshots, where, as we've established, dk->dk_kstats is
NULL.

Fix this by simply adding a NULL check before doing anything in
dataset_kstats_rename.

This still allows the dataset_name kstat value for the zvol to be
updated (as was the intent of the original patch), and merely blocks
attempts by the code to act upon the zvol's non-kstat-having snapshots.
If at some future time, kstats are added for snapshots, then things
should work as intended in that case as well.

Signed-off-by: Justin Gottula <justin@jgottula.com>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-08-22 15:42:49 -07:00
Tony Hutter
84a9861536 Linux 6.10 compat: Fix zvol NULL pointer deference
zvol_alloc_non_blk_mq()->blk_queue_set_write_cache() needs the disk
queue setup to prevent a NULL pointer deference.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16453
2024-08-22 15:42:49 -07:00
Tony Hutter
9d64d1bfad Linux 6.10 compat: fix rpm-kmod and builtin
The 6.10 kernel broke our rpm-kmod builds.  The 6.10 kernel really
wants the source files in the same directory as the object files.
This workaround makes rpm-kmod work again.  It also updates
the builtin kernel codepath to work correctly with 6.10.

See kernel commits:

b1992c3772e6 kbuild: use $(src) instead of $(srctree)/$(src) for source
                     directory
9a0ebe5011f4 kbuild: use $(obj)/ instead of $(src)/ for common pattern
                     rules

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16439
Closes #16450
2024-08-22 15:42:49 -07:00
Tony Hutter
ce22dc2589 ZTS: Use /dev/urandom instead of /dev/random
Use /dev/urandom so we never have to wait on entropy.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16442
2024-08-22 15:42:14 -07:00
Rob Norris
8479a45abe Linux 6.11: avoid passing "end" sentinel to register_sysctl()
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
8156099cf2 Linux 6.11: add compat macro for page_mapping()
Since the change to folios it has just been a wrapper anyway. Linux has
removed their wrapper, so we add one.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
11ad6124c3 Linux 6.11: add more queue_limit fields with removed setters
These fields are very old, so no detection necessary; we just move them
into the limit setup functions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
11de432c8b Linux 6.11: IO stats is now a queue feature flag
Apply them with with the rest of the settings.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
464747ffd3 Linux 6.11: first arg to proc_handler is now const
Detect it, and use a macro to make sure we always match the prototype.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
92a8af0f8b Linux 6.11: get backing_dev_info through queue gendisk
It's no longer available directly on the request queue, but its easy to
get from the attached disk.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Rob Norris
4fa84563b8 Linux 6.11: enable queue flush through queue limits
In 6.11 struct queue_limits gains a 'features' field, where, among other
things, flush and write-cache are enabled. Detect it and use it.

Along the way, the blk_queue_set_write_cache() compat wrapper gets a
little cleanup. Since both flags are alway set together, its now a
single bool. Also the very very ancient version that sets q->flush_flags
directly couldn't actually turn it off, so I've fixed that. Not that we
use it, but still.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Sponsored-by: https://despairlabs.com/sponsor/
Closes #16400
2024-08-22 15:42:14 -07:00
Mark Johnston
6961d4fb57 ZTS: Add a test to verify that copy_file_range obeys RLIMIT_FSIZE
Signed-off-by: Mark Johnston <markj@FreeBSD.org>

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-08-22 15:31:56 -07:00