Commit Graph

9331 Commits

Author SHA1 Message Date
JKDingwall
f53f3c3e6a Fix generation of kernel uevents for snapshot rename on linux
`zvol_rename_minors()` needs to be given the full path not just the
snapshot name.  Use code removed in a0bd735ad as a guide
to providing the necessary values.

Add ZTS check for /dev changes after snapshot rename.  After
renaming a snapshot with 'snapdev=visible' ensure that the /dev
entries are updated to reflect the rename.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: James Dingwall <james@dingwall.me.uk>
Closes #14223 
Closes #16600
2024-11-05 15:43:53 -08:00
Umer Saleem
d29f257b03 Update path for zed in zfs-zed.service for native debian packages
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes#15638
2024-11-05 15:43:53 -08:00
Umer Saleem
e63023738e Disable parallel build for native-deb* targets
Running native-deb* targets in parallel via make is not supported.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes#14736
2024-11-05 15:43:52 -08:00
Umer Saleem
18e355670d Fix missing packaging files from release tarballs
Properly distribute files for native Debian packages. This fixes the
issue with broken release tarballs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes#15404
Closes#15586
2024-11-05 15:43:52 -08:00
Alexander Motin
acc8a31863 ARC: Cache arc_c value during arc_evict()
Since arc_evict() run can take some time, arc_c change during it
may result in undesired shift in ARC states balance. Primarily in
case of arc_c reduction it may cause eviction from MFU data state
despite its being below the target already.  Instead we should
evict as originally planned and if needed do another round after.

Reviewed-by: Theera K. <tkittich@hotmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16576
Closes #16605
2024-11-05 15:43:52 -08:00
rilysh
e8f4592a19 Avoid computing strlen() inside loops
Compiling with -O0 (no proper optimizations), strlen() call
in loops for comparing the size, isn't being called/initialized
before the actual loop gets started, which causes n-numbers of
strlen() calls (as long as the string is). Keeping the length
before entering in the loop is a good idea.

On some places, even with -O2, both GCC and Clang can't
recognize this pattern, which seem to happen in an array
of char pointer.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: rilysh <nightquick@proton.me>
Closes #16584
2024-11-05 15:43:52 -08:00
Rob Norris
4adc97ae15 lua: add flex array field to TString type
Linux 6.10+ with CONFIG_FORTIFY_SOURCE notices memcpy() accessing past
the end of TString, because it has no indication that there there may be
an additional allocation there.

There's no appropriate upstream change for this (ancient) version of
Lua, so this is the narrowest change I could come up with to add a flex
array field to the end of TString to satisfy the check. It's loosely
based on changes from lua/lua@ca41b43f and lua/lua@9514abc2.

Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <robn@despairlabs.com>
Closes #16541
Closes #16583
2024-11-05 15:43:52 -08:00
George Melikov
2bd540d273 man: update recordsize max size info
Reflect f2330bd156
change in our man pages and add some context.

Wording is primarily copy-pasted from code comments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by:  Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #16581
2024-11-05 15:43:52 -08:00
Alexander Motin
48482bb2f4 Properly release key in spa_keystore_dsl_key_hold_dd()
Since dsl_crypto_key_open() references the key, 0d23f5e2e4 should
have called dsl_crypto_key_rele() to drop it first instead of
calling dsl_crypto_key_free() directly.  The final result should
actually be the same, but without triggering dck_holds assertion.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16567
2024-11-05 15:43:52 -08:00
Alexander Motin
21c40e6d9e FreeBSD: Sync taskq_cancel_id() returns with Linux
Couple places in the code depend on 0 returned only if the task was
actually cancelled.  Doing otherwise could lead to extra references
being dropped.  The race could be small, but I believe CI hit it
from time to time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #16565
2024-11-05 15:43:52 -08:00
w0xel
b9658f9a67 Add missing guard defines for simd_stat
This adds the HAVE_KERNEL_NEON and HAVE_KERNEL_FPU_INTERNAL
guards to simd_stat.c defaulted to 0 to make it build again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Shengqi Chen <harry-chen@outlook.com>
Signed-off-by: Sebastian Wuerl <s.wuerl@mailbox.org>
Closes #16558
2024-11-05 15:43:52 -08:00
Rich Ercolani
948704cb37 Fix /proc/spl/kstat/simd on x86
Evidently while reworking it on aarch64, I broke it on x86 and
didn't notice.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #16556
2024-11-05 15:43:52 -08:00
Rich Ercolani
9616275021 Add SIMD metadata in /proc on Linux
Too many times, people's performance problems have amounted to
"somehow your SIMD support isn't working", and determining that
at runtime is difficult to describe to people.

This adds a /proc/spl/kstat/zfs/simd node, which exposes
metadata about which instructions ZFS thinks it can use,
on AArch64 and x86_64 Linux, to make investigating things
like this much easier.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #16530
2024-11-05 15:43:52 -08:00
Theera K.
0987892160 Evicting too many bytes from MFU metadata
Without updating 'm' we evict from MFU metadata all that we wanted
to evict from all metadata, including already evicted MRU metadata
('m' is the total amount of metadata we had at the beginning,
and 'w' is the total amount of metadata we want to have). 

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Theera K. <tkittich@hotmail.com>
Closes #16521
Closes #16546
2024-11-05 15:43:52 -08:00
Alexander Motin
54278533a4 Reduce and handle EAGAIN errors on AIO label reads
At least FreeBSD has a limit of 256 simultaneous AIO requests per
process. Attempt to issue more results in EAGAIN errors. Since we
issue 4 requests per disk/partition from 2xCPUs threads, it is
quite easy to reach that limit on large systems, that results in
random pool import failures.  It annoyed me for quite a while on
a system with 64 CPUs and 70+ partitioned disks.

This patch from one side limits the number of threads to avoid the
error, while from another should softly fall back to sync reads in
case of error.  It takes into account _SC_AIO_MAX as a system-wide
AIO limit and _SC_AIO_LISTIO_MAX as a closest value to per-process
limit.  The last not exactly right, but it is the best I found.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes #16551
2024-11-05 15:43:52 -08:00
Umer Saleem
94a571b3d9 Add compatibility file for GRUB versions up to v2.06
GRUB is not able to detect ZFS pool if snaphsot of top level boot
pool is created. This issue is observed with GRUB versions up to
v2.06 if extensible_dataset feature is enabled on ZFS boot pool.

compatibility=grub2-2.06 would enable all read-only compatible
zpool features except extensible_dataset and other features that
depend on it.

The existing grub2 compatibility file is now renamed to grub2-2.12 to
reflect the appropriate grub2 version. grub2-2.12 lists all read-only
features that can be enabled on boot pool for grub2 with version 2.12
onwards.

A new symlink grub2 is created that currently points to the grub2-2.12
compatibility file.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #13873
Closes #15261
Closes #15909
2024-11-05 15:43:52 -08:00
rmacklem
2dc8529d9a Fix handling of DNS names with '-' in them for sharenfs
An old FreeBSD bugzilla report PR#168158 notes that DNS
names with '-'s in them cannot be used for the sharenfs
property.  This patch fixes the parsing of these DNS names.
The only negative affect this patch might have is that,
if a user has incorrectly separated options with a '-'
the sharenfs setting will no longer work once this patch
is applied.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Rick Macklem <rmacklem@uoguelph.ca>
Closes #16529
2024-11-05 15:43:52 -08:00
Rob Norris
c25d5140b0 sa_impl: fix SA header bitfield docs
Off by one, confused me a while!

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #16500
2024-11-05 15:43:52 -08:00
Alan Somers
bc0d89bfc1 Fix an uninitialized data access (#16511)
zfs_acl_node_alloc allocates an uninitialized data buffer, but upstack
zfs_acl_chmod only partially initializes it.  KMSAN reported that this
memory remained uninitialized at the point when it was read by
lzjb_compress, which suggests a possible kernel memory disclosure bug.

The full KMSAN warning may be found in the PR.
https://github.com/openzfs/zfs/pull/16511

Signed-off-by:	Alan Somers <asomers@gmail.com>
Sponsored by:	Axcient
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-11-05 15:43:52 -08:00
Rob Norris
25ec9a9034 zdb: fix BRT dump (#16335)
BRT refcounts are stored as eight uint8_ts rather than a single
uint64_t. This means that za_first_integer is only the first byte, so
max 256. This fixes it by doing a lookup for the whole value.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.

Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
2024-11-05 15:43:52 -08:00
Tony Hutter
bb946ff232 ZTS: Add Fedora 41, remove Fedora 39
Fedora 41 was released 10/29/24, and Fedora 39 will be EOL on 11/12/24.
Update Fedora runners in the test suite.  Some minor tweaks also needed
to support ksh 1.0.10.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16700
2024-11-05 11:09:22 -08:00
Tony Hutter
f2633144e9 ZTS: Add LUKS sanity test
Add a LUKS sanity test to trigger: #16631

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16681
2024-11-05 11:09:22 -08:00
Tino Reichardt
bdd1f72d9d Fix dependency install on Debian 11 (#16683)
Adding cryptsetup breaks some dialog things on Debian 11.
Apply some workaround for it.

Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2024-11-05 11:09:22 -08:00
Brian Behlendorf
51996dcaee ZTS: Add additional exceptions
The following tests have been observed to occasionally fail when
running under the CI.  Updated our exceptions list to track them.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16670
2024-11-05 11:09:22 -08:00
Tino Reichardt
aa32ee6838 ZTS: Make use of optimal CPU pinning
With CPU pinning, we should get some speedup because of better
cpu cache re-use.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16641
2024-11-05 11:09:22 -08:00
Tino Reichardt
34537bd9c7 ZTS: Optimize Kernel Same-page Merging (KSM)
Kernel same-page Merging (KSM) allows KVM guests to share identical
memory pages. These shared pages are usually common libraries or other
identical, high-use data.

The current configuration was a bit to lazy - so KSM didn't work very
well. With the new configuration I could run 3 Linux VMs in parralel.

FreeBSD can't benefit from it. But FreeBSD is not so memory hungry in
general, so there is no need for it ;)

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16641
2024-11-05 11:09:22 -08:00
Brian Behlendorf
09f7432f83 CI: Stick with ubuntu-22.04 for CodeQL analysis
The ubuntu-latest alias now refers to ubuntu-24.04 instead of
ubuntu-22.04 which causes CodeQL's autobuild to fail with:

    cpp/autobuilder: deptrace not supported in ubuntu 24.04

Until deptrace is supported by ubuntu-24.04 hosted runners request
ubuntu-22.04 which is supported.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Closes #16639
2024-11-05 11:09:22 -08:00
Tino Reichardt
8457bab268 ZTS: Fix summary page creation again - second try
In PR #16599 I used 'return' like in C - which is wrong :/
This fix generates the summary as needed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16611
2024-11-04 11:11:45 -08:00
Tino Reichardt
f52b725e64 ZTS: Remove FreeBSD 13.4-STABLE
Current CI is failing on FreeBSD 13.4-STABLE, because samba4 can't be
installed there. Lets remove it for now.

Update also the FreeBSD version definitions a bit.

The naming is like this now:

FreeBSD variants:
- freebsd13-3r, freebsd13-4r, freebsd14-0r, freebsd14-1r (RELEASE)
- freebsd13-4s, freebsd14-1s (STABLE)
- freebsd15-0c (CURRENT)

RHL based distros:
- almalinux8, almalinux9, centos-stream9, fedora39, fedora40

Debian based:
- debian11, debian12, ubuntu20, ubuntu22, ubuntu24

Misc Linux distros:
- archlinux, tumbleweed

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16610
2024-11-04 11:11:38 -08:00
Tino Reichardt
903db66793 ZTS: Fix summary page creation
There are cases, where some needed files for the summary page aren't
created. Currently the whole Summary Page creation will fail then.
Sample run: https://github.com/openzfs/zfs/actions/runs/11148248072/job/30999748588

Fix this, by properly checking for existence of the needed files.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rob Norris <robn@despairlabs.com>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16599
2024-11-04 11:11:29 -08:00
Tino Reichardt
e55e9468f7 ZTS: Replace MD5 and SHA256 wit XXH128
For data integrity checks as done in ZTS, the verification for
unintended data corruption with xxhash128 should be a lot faster
and perfectly usable.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16577
2024-11-04 11:10:24 -08:00
Brian Behlendorf
a44a15d318 ZTS: Fix zpool_import_hostid_changed_unclean_export
Update the test case to freeze the pool then export it to better
simulate a hard failure.  This is preferable to copying the vdev
while the pool's imported since with a copy we're not guaranteed
the on-disk state will be consistent.  That can in turn result
in a pool import failure and a spurious test failure.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16578
2024-11-04 11:09:21 -08:00
Brian Behlendorf
54244d263a ZTS: Update deadman_sync threshold
Lower the minimum number of expected deadman events from 4 to 3.  All
that is strictly required is a single event to consider the test a
pass.  However, since I've never seen a count of less than 3 reported
by the CI that should be sufficient.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16575
2024-11-04 11:09:16 -08:00
Brian Behlendorf
ad739614cc CI: Add logs to zloop workflow
On failure attempt to include the most relevant portions of the
ztest logs in the CI output.  This full logs are still available
for download but often a backtrace and the last output is enough.

Install libunwind to improve the odds of a useful backtrace.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16573
2024-11-04 11:09:09 -08:00
Brian Behlendorf
f407add033 ZTS: Fix zpool_import_hostid_changed_cachefile_unclean_export
Update the test case to freeze the pool then export it to better
simulate a hard failure.  This is preferable to copying the vdev
while the pool's imported since with a copy we're not guaranteed
the on-disk state will be consistent.  That can in turn result
in a pool import failure and a spurious test failure.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16570
2024-11-04 11:08:57 -08:00
Shengqi Chen
92c384cbac CI: run only sanity check on limited OSes for nonbehavioral changes
The commit uses heuristics to determine whether a PR is behavioral:

It runs "quick" CI (i.e., only use sanity.run on fewer OSes)
if (explicitly requested by user):
- the *last* commit message contains a line 'ZFS-CI-Type: quick',
or if (by heuristics):
- the files changed are not in the list of specified directory, and
- all commit messages does not contain 'ZFS-CI-Type: full'.

It runs "full" CI otherwise.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
Closes #16564
2024-11-04 11:08:44 -08:00
Brian Behlendorf
040be2504a CI: cancel workflows when PRs are updated (#16562)
For checkstyle, zloop, zfs-qemu, and codeql workflows cancel
in-progress jobs when the PR is updated.

Relevant GitHub Actions documentation:

  The following concurrency group cancels in-progress jobs or run
  on pull_request events only; if github.head_ref is undefined, the
  concurrency group will fallback to the run ID, which is guaranteed
  to be both unique and defined for the run.

https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#example-using-a-fallback-value

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16562
2024-11-04 11:08:38 -08:00
Brian Behlendorf
1b27b76f6f ZTS: CI Documentation Updates
Update the CONTRIBUTING.md documentation to refer to the GitHub Actions
workflows which have replaced the buildbot infrastructure.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16561
2024-11-04 11:08:30 -08:00
Brian Behlendorf
af0a2bef40 ZTS: CodeQL Action v3 update
Switch from v2 to v3 CodeQL Actions.  The v2 actions will no longer
be supported as of Dec '24 so we need to move to v3.  According to
the release notes they should be functionally equivalent.

    Note that the only difference between v2 and v3 of the CodeQL
    Action is the node version they support, ... For example 3.22.11
    was the first v3 release and is functionally identical to 2.22.11.

https://github.com/github/codeql-action/blob/main/CHANGELOG.md

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16560
2024-11-04 11:08:24 -08:00
Brian Behlendorf
ce3ad59c27 ZTS: Add additional exceptions
The following tests have been observed to occasionally fail when
running under the CI.  Updated our exceptions list to track them.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:49 -08:00
Brian Behlendorf
46b0d5c486 ZTS: Retire "tmpfile_reason" exception
All supported Linux kernels, 4.18 and newer, provide O_TMPFILE.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:43 -08:00
Brian Behlendorf
b69c7f3847 ZTS: Retire "ci_reason" exceptions
There is no longer be a need for the ci_reason exception with
the update CI GitHub Actions infrastruture.  Retire it.

Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #16553
2024-11-04 11:07:37 -08:00
Tino Reichardt
484f92b8fa ZTS: Fix Summary Page
The qemu-9-summary-page.sh script reads the file env.txt in the
first lines. When the module didn't build, this file was not copied
into the tarfile - causing the scipt to abort.

Fix: copy needed files into the tarfile in case of module build
failures. The fix ignores also empty tarfiles in future.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16555
2024-11-04 11:07:29 -08:00
Umer Saleem
e2bded28ea ZTS: Fix skipping over comment lines in zpool_create.shlib
In zpool_create.shlib, check_feature_set iterates over all features
mentioned in provided compatibility file to check if only those
features are enabled on the pool.

This commit fixes skipping over comment lines correctly. Otherwise,
the test case fails as comment lines are also treated as feature names
by check_feature_set function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15909
2024-11-04 11:07:19 -08:00
Tino Reichardt
ca5652c420 ZTS: Remove functional tests via matrix
This commit changes the workflow of the github actions.

- Ubuntu 20.04, 22.04, 24.04 will be tested via QEMU now
- remove unused scripts of this commit: b7bc334d1
- re-add the zloop standalone testings via zloop.yml

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16549
2024-11-04 11:07:04 -08:00
Tino Reichardt
53879c4ecc ZTS: Fix Test Summary page generation
Fix that error: "cat /tmp/failed.txt: No such file or directory"

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16549
2024-11-04 11:06:56 -08:00
Tino Reichardt
1cbb695dcc ZTS: use openssl for md5digest and sha256digest
On larger files this should improve the speed.

Sample values of my system:

[mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k | sha256sum
254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917  -
real    0m1,050s
user    0m0,985s
sys     0m0,153s

[mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k | openssl sha256 -r
254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917 *stdin
real    0m0,254s
user    0m0,206s
sys     0m0,160s

I think cli_root/zdb/zdb_backup.ksh runs also an FreeBSD and I needed to
include the sysutils/coreutils package for the FreeBSD tests within the
QEMU patchset.

This could be reverted, when this pull request gets upstream

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16543
2024-11-04 11:06:47 -08:00
Tino Reichardt
66db223778 ZTS: Use QEMU for tests on Linux and FreeBSD
This commit adds functional tests for these systems:
- AlmaLinux 8, AlmaLinux 9, ArchLinux
- CentOS Stream 9, Fedora 39, Fedora 40
- Debian 11, Debian 12
- FreeBSD 13, FreeBSD 14, FreeBSD 15
- Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04

- enabled by default:
 - AlmaLinux 8, AlmaLinux 9
 - Debian 11, Debian 12
 - Fedora 39, Fedora 40
 - FreeBSD 13, FreeBSD 14

Workflow for each operating system:
- install qemu on the github runner
- download current cloud image of operating system
- start and init that image via cloud-init
- install dependencies and poweroff system
- start system and build openzfs and then poweroff again
- clone build system and start 2 instances of it
- run functional testings and complete in around 3h
- when tests are done, do some logfile preparing
- show detailed results for each system
- in the end, generate the job summary

Real-world benefits from this PR:

1. The github runner scripts are in the zfs repo itself. That means
   you can just open a PR against zfs, like "Add Fedora 41 tester", and
   see the results directly in the PR. ZFS admins no longer need
   manually to login to the buildbot server to update the buildbot config
   with new version of Fedora/Almalinux.

2. Github runners allow you to run the entire test suite against your
   private branch before submitting a formal PR to openzfs. Just open a
   PR against your private zfs repo, and the exact same
   Fedora/Alma/FreeBSD runners will fire up and run ZTS. This can be
   useful if you want to iterate on a ZTS change before submitting a
   formal PR.

3. buildbot is incredibly cumbersome. Our buildbot config files alone
   are ~1500 lines (not including any build/setup scripts)!
   It's a huge pain to setup.

4. We're running the super ancient buildbot 0.8.12. It's so ancient
   it requires python2. We actually have to build python2 from source
   for almalinux9 just to get it to run. Ugrading to a more modern
   buildbot is a huge undertaking, and the UI on the newer versions is
   worse.

5. Buildbot uses EC2 instances. EC2 is a pain because:
   * It costs money
   * They throttle IOPS and CPU usage, leading to mysterious,
   * hard-to-diagnose, failures and timeouts in ZTS.
   * EC2 is high maintenance. We have to setup security groups, SSH
   * keys, networking, users, etc, in AWS and it's a pain. We also
   * have to periodically go in an kill zombie EC2 instances that
   * buildbot is unable to kill off.

6. Buildbot doesn't always handle failures well. One of the things we
   saw in the past was the FreeBSD builders would often die, and each
   builder death would take up a "slot" in buildbot. So we would
   periodically have to restart buildbot via a cron job to get the slots
   back.

7. This PR divides up the ZTS test list into two parts, launches two
   VMs, and on each VM runs half the test suite. The test results are
   then merged and shown in the sumary page. So we're basically
   parallelizing ZTS on the same github runner. This leads to lower
   overall ZTS runtimes (2.5-3 hours vs 4+ hours on buildbot), and one
   unified set of results per runner, which is nice.

8. Since the tests are running on a VM, we have much more control over
   what happens. We can capture the serial console output even if the
   test completely brings down the VM. In the future, we could also
   restart the test on the VM where it left off, so that if a single test
   panics the VM, we can just restart it and run the remaining ZTS tests
   (this functionaly is not yet implemented though, just an idea).

9. Using the runners, users can manually kill or restart a test run
   via the github IU. That really isn't possible with buildbot unless
   you're an admin.

10. Anecdotally, the tests seem to be more stable and constant under
    the QEMU runners.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #16537
2024-11-04 11:06:02 -08:00
Tino Reichardt
36bac59fbc ZTS: increase timeout of mmap_sync_001_pos
On load the test needs sometimes a bit more time then just one second.
Doubling the time will help on the QEMU based testings.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16537
2024-11-04 11:05:55 -08:00
Tino Reichardt
e29e7a0502 ZTS: fix zpool_status_008_pos test on qemu vm's
The test needs some adjusting within the timings.

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Co-authored-by: Tino Reichardt <milky-zfs@mcmilk.de>
Closes #16537
2024-11-04 11:05:42 -08:00