Compare commits

243 Commits

Author SHA1 Message Date
Tony Hutter ef83e07db5 Tag zfs-2.1.3
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2022-03-09 07:10:55 -08:00
Brian Behlendorf 145af480d3 Fix ENOSPC when unlinking multiple files from full pool
When unlinking multiple files from a pool at 100% capacity, it was
possible for ENOSPC to be returned after the first unlink.  e.g.

    rm -f /mnt/fs/test1.0.0 /mnt/fs/test1.1.0 /mnt/fs/test1.2.0
    rm: cannot remove '/mnt/fs/test1.1.0': No space left on device
    rm: cannot remove '/mnt/fs/test1.2.0': No space left on device

After waiting for the pending deferred frees from the first unlink to
be processed the remaining files can then be unlinked.  This is caused
by the quota limit in dsl_dir_tempreserve_impl() being temporarily
decreased to the allocatable pool capacity less any deferred free
space.

This is resolved using the existing mechanism of returning ERESTART
when over quota as long as we know enough space will shortly be
available after processing the pending deferred frees.
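
A minimal sketch of the decision described above, written in Python
rather than the kernel's C.  The helper name, its parameters, and the
fallback ERESTART value are illustrative assumptions and are not taken
from dsl_dir_tempreserve_impl():

```python
import errno

# ERESTART is not exposed by Python's errno module on every platform,
# so fall back to a stand-in value used only for this illustration.
ERESTART = getattr(errno, "ERESTART", 85)


def reserve_result(used, requested, quota, pending_deferred_frees):
    """Model of the retry decision (hypothetical helper, not ZFS code)."""
    if used + requested <= quota:
        return 0  # enough room: the reservation succeeds
    # Over quota.  If the deferred frees that are about to be processed
    # would make room, ask the caller to retry instead of failing.
    if used + requested - pending_deferred_frees <= quota:
        return ERESTART
    return errno.ENOSPC
```

With a full pool and pending frees from the first unlink, the second
unlink would see ERESTART and be retried rather than failing outright.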

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13172
2022-03-08 11:46:03 -08:00
Brian Behlendorf b3b6491ce9 ZTS: deadman_sync fix
In the CI environment it's possible for events to be slightly
delayed resulting in 4, instead of 5, events appearing in the
log file.  This isn't a problem and should be considered a
success to avoid false positive test results.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12625
2022-03-07 15:17:49 -08:00
Mark Johnston b3427b18b1 zfs: Fix a deadlock between page busy and the teardown lock
When rolling back a dataset, ZFS has to purge file data resident in the
system page cache.  To do this, it loops over all vnodes for the
mountpoint and calls vn_pages_remove() to purge pages associated with
the vnode's VM object.  Each page is thus exclusively busied while the
dataset's teardown write lock is held.

When handling a page fault on a mapped ZFS file, FreeBSD's page fault
handler busies newly allocated pages and then uses VOP_GETPAGES to fill
them.  The ZFS getpages VOP acquires the teardown read lock with vnode
pages already busied.  This represents a lock order reversal which can
lead to deadlock.

To break the deadlock, observe that zfs_rezget() need only purge those
pages marked valid, and that pages busied by the page fault handler are,
by definition, invalid.  Furthermore, ZFS pages always transition from
invalid to valid with the teardown lock held, and ZFS never creates
partially valid pages.  Thus, zfs_rezget() can use the new
vn_pages_remove_valid() to skip over pages busied by the fault handler.

PR:		258208
Tested by:	pho
Reviewed by:	avg, sef, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32931

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828
2022-03-04 15:37:41 -08:00
Alexander Motin 0e2bb1a3ee Really zero the zero page
While switching abd_zero_buf allocation KPI I've missed the fact
that kmem_zalloc() zeroed the allocation, while kmem_cache_alloc()
does not.  Add explicit bzero() after it.

I don't think it should have caused real problems, but leaking one
memory page content all over the pool is not good.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12569
2022-03-04 15:37:33 -08:00
Brian Behlendorf 037434e4fc ZTS: Fix import_devices_missing.ksh
Related to commit 90b77a036.  Retry the `zpool export` if the pool
is "busy" indicating there is a process accessing the mount point.
This can happen after an import; allowing it to be retried will
avoid spurious test failures.
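
The retry idea can be sketched as follows.  The actual change lives in
the ksh test script; this Python helper, its name, the default of five
tries, and the (returncode, stderr) convention are assumptions made
for illustration only:

```python
import time


def retry_while_busy(attempt, tries=5, delay=0.0):
    """Call attempt() until it stops reporting "busy" (illustrative).

    attempt must return a (returncode, stderr_text) tuple.
    """
    rc = 1
    for _ in range(tries):
        rc, err = attempt()
        # Stop on success, or on any failure that is not the "busy" case.
        if rc == 0 or "busy" not in err:
            return rc
        time.sleep(delay)
    return rc
```

Failures other than "busy" are returned immediately, so a genuinely
broken export would still fail the test.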

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13169
2022-03-02 11:27:05 -08:00
Brian Behlendorf 190516f0c5 ZTS: Retry in import_rewind_config_changed.ksh
As explained by the disclaimer in the test case,

    "This test can fail since nothing guarantees that old
    MOS blocks aren't overwritten."

This behavior is expected and correct, but results in a
flaky test case which is problematic for the CI.  The best
we can do to resolve this is to retry the sub-test which
failed when the MOS blocks have clearly been overwritten.

During testing, failures were rare enough that a single retry
should normally be sufficient.  However, we allow up to
five for good measure.

Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13119
2022-03-02 11:25:35 -08:00
Brian Behlendorf e2fddf07bd ZTS: Modify receive-o-x_props_override.ksh exception
As previously noted in #12272 the receive-o-x_props_override.ksh test
reliably fails on FreeBSD.  Since we don't expect this test to pass
move the exception from the "maybe" to "known" section.  This way we
don't retry the FAILED test when it is not expected to pass.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13167
2022-03-01 13:16:43 -08:00
Brian Behlendorf 4cb88d7fdc ZTS: Move largest_pool_001_pos.ksh to Linux runfile
On FreeBSD pools are not allowed to be created using vdevs which are
backed by ZFS volumes.  This configuration is not recommended for any
supported platform, nevertheless the largest_pool_001_pos.ksh test
case makes use of it as a convenience.  This causes the test case to
fail reliably on FreeBSD.  The layout is still tolerated on Linux
so only perform this test on Linux.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13166
2022-03-01 13:16:43 -08:00
Paul Dagnelie ddcdccbcc4 Fix erroneous zstreamdump warning
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #13154
2022-03-01 09:45:48 -08:00
наб f2eaa97840 Fix FreeBSD reporting on reruns
Turns out, when your test-suite fails on FreeBSD the rerun logic
would fail as follows:

Results Summary
PASS	 1358
FAIL	   7
SKIP	  47

Running Time:	04:00:02
Percent passed:	96.2%
Log directory:	/var/tmp/test_results/20220225T092538
mktemp: illegal option -- p
usage: mktemp [-d] [-q] [-t prefix] [-u] template ...
       mktemp [-d] [-q] [-u] -t prefix
mktemp: illegal option -- p
usage: mktemp [-d] [-q] [-t prefix] [-u] template ...
       mktemp [-d] [-q] [-u] -t prefix
/usr/local/share/zfs/zfs-tests.sh: cannot create :
                                   No such file or directory
...

This change resolves a flaw from the original commit, 2320e6eb4
("Add zfs-test facility to automatically rerun failing tests")

Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13156
2022-02-15 21:52:49 -08:00
Paul Dagnelie 7bd292e59b Fix cpu hotplug atomic sleep issue
We move the spinlock unlock before the thread creation. This should be
safe because the thread creation code doesn't actually manipulate any
taskq data structures; that's done by the thread once it's created.

We also remove the assertion that the maxthreads is the current threads
plus one; that assertion could fail if multiple hotplug events come in
quick succession, and the first new taskq thread hasn't had a chance to
start processing yet.
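
The ordering can be illustrated with a small user-space model.  This is
a sketch only: threading.Lock stands in for the kernel spinlock, and
the class and attribute names are hypothetical, not the SPL taskq
structures:

```python
import threading


class TaskQ:
    """Toy model of the hotplug path (names are illustrative)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.maxthreads = 0
        self.nthreads = 0
        self.workers = []

    def _worker(self):
        # The new thread registers itself; the creator never updates
        # the thread count on its behalf.
        with self.lock:
            self.nthreads += 1

    def cpu_online(self):
        with self.lock:
            self.maxthreads += 1
        # The lock is released before the creation call, which in the
        # kernel may sleep.
        t = threading.Thread(target=self._worker)
        t.start()
        self.workers.append(t)
```

Between two back-to-back cpu_online() calls maxthreads can exceed
nthreads by more than one, which is why an assertion of "current
threads plus one" does not hold in general.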

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12714
2022-02-15 21:52:45 -08:00
Damian Szuberski 5c80a25653 Fix directory detection in dkms.mkconf
Fix `zfs-dkms` installation on Debian-derived distributions by
aligning the directory detection logic to #13096.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #11449
Closes #13141
2022-02-24 11:33:02 -08:00
Attila Fülöp 1d70698174 Linux 5.11 compat: x86 SIMD: fix kernel_fpu_{begin,end}() detection
Linux 5.11 changed kernel_fpu_begin() to an inlined function and
moved the functionality to kernel_fpu_begin_mask(). This breaks the
existing detection mechanism since it checks if kernel_fpu_begin is
an exported kernel symbol, which isn't the case for an inlined
function.

To avoid assumptions about internal implementation, replace
ZFS_LINUX_TEST_RESULT_SYMBOL with ZFS_LINUX_TEST_RESULT,
which already makes sure kernel_fpu_{begin,end}() is usable by us.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13147
2022-02-24 11:33:02 -08:00
Damian Szuberski b55ed8df92 Fix Linux kernel directories detection
Most modern Linux distributions have separate locations for bare
source and prebuilt ("build") files. Additionally, there are `source`
and `build` symlinks in `/lib/modules/$(KERNEL_VERSION)` pointing to
them. The order of directory search is now:
- `configure` command line values if both `--with-linux` and
  `--with-linux-obj` were defined
- If only `--with-linux` was defined, `--with-linux-obj` is assumed
  to have the same value as `--with-linux`
- If neither `--with-linux` nor `--with-linux-obj` were defined
  autodetection is used:
  - `/lib/modules/$(uname -r)/{source,build}` respectively, if exist
  - The first directory in `/lib/modules` with the highest version
    number according to `sort -V` which contains `source` and `build`
    symlinks/directories
  - The first directory matching `/usr/src/kernels/*` and
    `/usr/src/linux-*` with the highest version number according to
    `sort -V`. Here the source and prebuilt directories are assumed
    to be the same.
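
The search order above can be sketched as a pure function.  The real
logic is in the autoconf macros; the function name, its arguments, and
the simplified stand-in for `sort -V` below are assumptions made for
illustration:

```python
import re


def version_key(name):
    """Rough stand-in for `sort -V`: compare digit runs numerically."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", name)]


def pick_kernel_dirs(with_linux, with_linux_obj, running, modules, usr_src):
    """Return (source, build) following the documented search order.

    modules maps a version directory name under /lib/modules to True
    when it contains both `source` and `build` entries; usr_src lists
    candidates matching /usr/src/kernels/* and /usr/src/linux-*.
    """
    if with_linux and with_linux_obj:
        return with_linux, with_linux_obj
    if with_linux:
        return with_linux, with_linux
    # Autodetection, step 1: the running kernel.
    if modules.get(running):
        base = "/lib/modules/" + running
        return base + "/source", base + "/build"
    # Step 2: highest version under /lib/modules with both entries.
    usable = [v for v, ok in modules.items() if ok]
    if usable:
        base = "/lib/modules/" + max(usable, key=version_key)
        return base + "/source", base + "/build"
    # Step 3: highest version under /usr/src; source and build coincide.
    if usr_src:
        best = max(usr_src, key=version_key)
        return best, best
    return None, None
```

The case where only `--with-linux-obj` is given is not described
above; this sketch simply falls through to autodetection for it.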

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #9935
Closes #13096
2022-02-23 16:47:44 -08:00
George Amanakis bcddb18bae Enable encrypted raw sending to pools with greater ashift
Raw sending from pool1/encrypted with ashift=9 to pool2/encrypted with
ashift=12 results in failure when mounting pool2/encrypted (Input/Output
error). Notably, the opposite, raw sending from a greater ashift to a
lower one does not fail.

This happens because zio_compress_write() falsely checks only
ZIO_FLAG_RAW_COMPRESS and not ZIO_FLAG_RAW_ENCRYPT which is also set in
encrypted raw send streams. In this case it rounds up the psize and if
not equal to the zio->io_size it modifies the block by zeroing out
the extra bytes. Because this happens in a SA attr. registration object
(type=46), the decryption fails upon mounting the filesystem, and zpool
status falsely reports an error.

Fix this by checking both ZIO_FLAG_RAW_COMPRESS and ZIO_FLAG_RAW_ENCRYPT
before deciding whether to zero-pad a block.
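
A minimal model of the corrected check, in Python rather than C.  The
flag bit values are placeholders, rounding to 1 << ashift is a
simplification, and the helper names are hypothetical; none of this is
copied from zio.c:

```python
# Placeholder bit values for this sketch, not the real ZIO_FLAG_* values.
ZIO_FLAG_RAW_COMPRESS = 1 << 0
ZIO_FLAG_RAW_ENCRYPT = 1 << 1


def round_up(size, ashift):
    """Round size up to the next multiple of 1 << ashift."""
    unit = 1 << ashift
    return (size + unit - 1) & ~(unit - 1)


def should_zero_pad(flags, psize, ashift):
    """Decide whether a raw-compressed block gets zero-padded (sketch)."""
    if not flags & ZIO_FLAG_RAW_COMPRESS:
        return False  # other write paths are outside this sketch
    if flags & ZIO_FLAG_RAW_ENCRYPT:
        return False  # encrypted payload must not be modified
    return round_up(psize, ashift) != psize
```

A 512-byte raw block sent to an ashift=12 pool would be padded when
only ZIO_FLAG_RAW_COMPRESS is set, and left untouched once
ZIO_FLAG_RAW_ENCRYPT is taken into account.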

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13067 
Closes #13074
2022-02-23 16:47:37 -08:00
George Amanakis 6c6153e5b8 Avoid dirtying the final TXGs when exporting a pool
There are two codepaths that can dirty final TXGs:

1) If calling spa_export_common()->spa_unload()->
   spa_unload_log_sm_flush_all() after the spa_final_txg is set, then
   spa_sync()->spa_flush_metaslabs() may end up dirtying the final
   TXGs. Then we have the following panic:
   Call Trace:
    <TASK>
    dump_stack_lvl+0x46/0x62
    spl_panic+0xea/0x102 [spl]
    dbuf_dirty+0xcd6/0x11b0 [zfs]
    zap_lockdir_impl+0x321/0x590 [zfs]
    zap_lockdir+0xed/0x150 [zfs]
    zap_update+0x69/0x250 [zfs]
    feature_sync+0x5f/0x190 [zfs]
    space_map_alloc+0x83/0xc0 [zfs]
    spa_generate_syncing_log_sm+0x10b/0x2f0 [zfs]
    spa_flush_metaslabs+0xb2/0x350 [zfs]
    spa_sync_iterate_to_convergence+0x15a/0x320 [zfs]
    spa_sync+0x2e0/0x840 [zfs]
    txg_sync_thread+0x2b1/0x3f0 [zfs]
    thread_generic_wrapper+0x62/0xa0 [spl]
    kthread+0x127/0x150
    ret_from_fork+0x22/0x30
    </TASK>

2) Calling vdev_*_stop_all() for a second time in spa_unload() after
   spa_export_common() unnecessarily delays the final TXGs beyond what
   spa_final_txg is set at.

Fix this by performing the check and call for
spa_unload_log_sm_flush_all() before the spa_final_txg is set in
spa_export_common(). Also check if the spa_final_txg has already been
set in spa_unload() and skip those calls in this case.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
External-issue: https://www.illumos.org/issues/9081
Closes #13048 
Closes #13098
2022-02-23 16:47:33 -08:00
наб 336c6d5f54 zfs-receive.8: properly unlight = in option setting
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13101
2022-02-16 17:58:56 -08:00
наб 4b3fbf3c16 zfs-receive.8: fix Op Fl x Ar encryption in running text
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13101
2022-02-16 17:58:56 -08:00
Tomohiro Kusumi 02309af096 Remove unneeded "extern inline" function declarations
All of these externs are already #included as static inline
functions via corresponding headers.

Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Closes #13073
2022-02-16 17:58:56 -08:00
наб 94a4b7ec3d module: zfs: fix unused, remove argsused
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12844
2022-02-16 17:58:56 -08:00
Brian Behlendorf ccbe9efd6b ZTS: Fix checkpoint_ro_rewind.ksh
Related to commit 90b77a036.  Retry the `zpool export` if the pool is
"busy" indicating there is a process accessing the mount point.  This
can happen after an import and allowing it to be retried will avoid
spurious test failures.

Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13092
2022-02-16 17:58:56 -08:00
Brian Behlendorf 882bc4ad61 ZTS: Fix zpool_expand_001_pos
The dRAID section of the zpool_expand_001_pos test would reliably fail
because the calculated expansion size assumed the dRAID top-level vdev
was created with a distributed spare.  Create the vdev as expected to
resolve the test failure.

This test case flaw was accidentally caused by changing the default
number of dRAID distributed spares from one to zero while dRAID was
being developed.

Additionally, remove zpool_expand_005_pos from the list of possible
faulty tests.  It appears to be passing consistently in my testing.

Reviewed by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13091
2022-02-16 17:58:56 -08:00
Brian Behlendorf f4c2b21823 Fix gcc warning in kfpu_begin()
Observed when building on CentOS 8 Stream.  Remove the `out`
label at the end of the function and instead return.

  linux/simd_x86.h: In function 'kfpu_begin':
  linux/simd_x86.h:337:1: error: label at end of compound statement

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13089
2022-02-16 17:58:56 -08:00
наб d24bdf4ee4 zpool-import.8: WARNING should be emphasised
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13082
2022-02-16 17:58:56 -08:00
наб 11bd8cd002 zpool-import.8: newpool is Ar, not Sy
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13082
2022-02-16 17:58:56 -08:00
наб a38e7bc922 zpoolprops.7: document leaked
It's noted very scarcely in the code as it stands, indeed the only
actual comment on this is

  /*
   * We have finished background destroying, but there is still
   * some space left in the dp_free_dir. Transfer this leaked
   * space to the dp_leak_dir.
   */

Introduced in fbeddd60b7 ("Illumos 4390 -
I/O errors can corrupt space map when deleting fs/vol"),
which explains, alongside the references, that this can only happen
with a corrupted pool

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13081
2022-02-16 17:58:56 -08:00
Zhu Chuang d4e8dcf07e Correct a typo in zfs-receive.8
Should be `-o keyformat=passphrase` instead of `-o -keyformat=passphrase`

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chuang Zhu <chuang@melty.land>
Closes #13072
2022-02-16 17:58:56 -08:00
Brian Behlendorf f03cf651ec ZTS: Fix zvol_misc_volmode test
Changing volmode may need to remove minors, which could be open, so
call udev_wait() before we "zfs set volmode=<value>".  This ensures
no udev process has the zvol open (i.e. blkid) and the kernel
zvol_remove_minor_impl() function won't skip removing the in use
device.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13075
2022-02-16 17:58:56 -08:00
drowfx bc99c809d5 Add dataset_kstats_update.. to mmap read/write paths
This allows reads/writes caused by accesses to mmap files to be
accounted correctly in the per-dataset kstats for both Linux and
FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Signed-off-by: Matthias Blankertz <matthias@blankertz.org>
Closes #12994 
Closes #13044
2022-02-16 17:58:56 -08:00
Attila Fülöp 5c19af07d4 Receive checks should allow unencrypted child datasets
dmu_recv_begin_check() unconditionally sets the DS_HOLD_FLAG_DECRYPT
flag before calling dsl_dataset_hold_flags(). If the key on the
receiving side isn't loaded or the send stream contains embedded
blocks, the receive check fails for a stream which is perfectly
valid and could be received without any problem. This seems like
a remnant of the initial design, where unencrypted datasets below
encrypted ones weren't allowed.

Add a condition to set `DS_HOLD_FLAG_DECRYPT` only for encrypted
datasets, modify an existing test to detect this regression and add
a test for raw replication streams.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Co-authored-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13033 
Closes #13076
2022-02-16 17:58:55 -08:00
Damian Szuberski 2681f8a5b8 Propagate KERNEL_* to *.spec
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Authored-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Peter Levine <plevine457@gmail.com>
Closes #13046
2022-02-16 17:58:55 -08:00
Peter Levine c7fcf00917 Add support for $KERNEL_{CC,LD,LLVM} variables
Currently, $(CC), $(LD), and $(LLVM) variables aren't passed to kbuild
while building modules.  This causes modules to build with the default
GNU GCC toolchain and prevents experimenting with other toolchains such
as CLANG/LLVM.  It can also lead to build failure if the CFLAGS/LDFLAGS
passed are incompatible with gcc/ld.

Pass $KERNEL_CC, $KERNEL_LD, and $KERNEL_LLVM as $(CC), $(LD), and
$(LLVM), respectively, to kbuild for each that is defined in the
environment.  This should take care of the majority of alternative
toolchain use cases.
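
The mapping can be sketched as below.  The actual change is made in
the build system; this Python helper and its name are hypothetical and
only show which variables are forwarded and when:

```python
def kbuild_toolchain_args(env):
    """Map KERNEL_{CC,LD,LLVM} to CC=, LD=, LLVM= make arguments."""
    args = []
    for name in ("CC", "LD", "LLVM"):
        value = env.get("KERNEL_" + name)
        if value:  # only forward variables that are actually defined
            args.append(name + "=" + value)
    return args
```

For example, an environment with KERNEL_CC=clang and KERNEL_LLVM=1
yields the arguments CC=clang and LLVM=1, while an empty environment
adds nothing and kbuild keeps its defaults.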

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Peter Levine <plevine457@gmail.com>
Closes #13046
2022-02-16 17:58:55 -08:00
наб 52aae04c6a module: Makefile: simplify clean and install jobs
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12979
2022-02-16 17:58:55 -08:00
наб 77ae804f9e module: Makefile: flatten subdir loop, use $PWD instead of pwd
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12899
2022-02-16 17:58:55 -08:00
Attila Fülöp 3b52ccd7d7 Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU state
Linux 5.16 moved XSTATE_XSAVE and XSTATE_XRESTORE out of our reach,
so add our own XSAVE{,OPT,S} code and use it for Linux 5.16.

Please note that this differs from previous behavior in that it
won't handle exceptions created by XSAVE and XRSTOR. This is sensible
for three reasons.

 - Exceptions during XSAVE and XRSTOR can only occur if the feature
   is not supported or enabled or the memory operand isn't aligned
   on a 64 byte boundary. If this happens something else went
   terribly wrong, and it may be better to stop execution.

 - Previously we just printed a warning and didn't handle the fault,
   this is arguable for the above reason.

 - All other *SAVE instructions also don't handle exceptions, so this
   at least aligns behavior.

Finally add a test to catch such a regression in the future.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes #13042
Closes #13059
2022-02-16 17:58:55 -08:00
Damian Szuberski bb271d67e8 mount.zfs -o zfsutil leverages zfs_mount_at()
Using `zfs_mount_at()` gives opportunity to properly propagate
mountopts from what's stored in a pool to the `mount(2)` syscall
invocation. It fixes cases when mount options are set to incorrect
values and rectification is impossible (e. g. Linux initrd boot
sequence in #7947).
Moved debug information printing after all variables are
initialized - printed text reflects what is passed to `mount(2)`.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Issue #7947 
Closes #13021
2022-02-16 17:58:55 -08:00
Christian Schwarz a61915e086 dsl_dir_tempreserve_impl: remove unused deferred variable
The following commit moved the users of `deferred` into function
dsl_pool_unreserved_space:

    commit d2734cce68
    Author: Serapheim Dimitropoulos <serapheim.dimitro@delphix.com>
    Date:   Fri Dec 16 14:11:29 2016 -0800

        OpenZFS 9166 - zfs storage pool checkpoint

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <freqlabs@FreeBSD.org>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Christian Schwarz <christian.schwarz@nutanix.com>
Closes #13056
2022-02-16 17:58:55 -08:00
наб 765be36006 libfetch: unquote @LIBFETCH_SONAME@ subst
@LIBFETCH_SONAME@ is no longer quoted. The C define still is.

Ref: 153f7c9f72
Ref: https://github.com/openzfs/zfs/pull/12835#discussion_r776833743
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12922
2022-02-16 17:58:55 -08:00
наб 0cb2d8a60b contrib/initrd hooks: properly quote @LIBFETCH_SONAME@
Bullseye shellcheck picks these up as SC2140, and it's right!
@LIBFETCH_SONAME@ is already quoted, so dracut had
  "$d/"libcurl.so.4""
and i-t had
  ""libcurl.so.4""

Partially reverts 34eef3e9a7 (#12760),
which broke this

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12835
2022-02-16 17:58:55 -08:00
наб 745a7f78da Remove basename(1). Clean up/shorten some coreutils pipelines
Basenames that remain, in cmd/zed/zed.d/statechange-led.sh:
	dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
	vdev=$(basename "$ZEVENT_VDEV_PATH")
I don't wanna interfere with #11988

scripts/zfs-tests.sh:
	SINGLETESTFILE=$(basename "$SINGLETEST")
tests/zfs-tests/tests/functional/cli_user/zfs_list/zfs_list.kshlib:
	ACTUAL=$(basename $dataset)
	ACTUAL=$(basename $dataset)
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
	zpool_iostat_-c_homedir.ksh:
	typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
	zpool_iostat_-c_searchpath.ksh:
	typeset CMD_1=$(basename "$SCRIPT_1")
	typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
	zpool_status_-c_homedir.ksh:
	typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
	zpool_status_-c_searchpath.ksh
	typeset CMD_1=$(basename "$SCRIPT_1")
	typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/migration/migration.cfg:
	export BNAME=`basename $TESTFILE`
tests/zfs-tests/tests/perf/perf.shlib:
	typeset logbase="$(get_perf_output_dir)/$(basename \
tests/zfs-tests/tests/perf/perf.shlib:
	typeset logbase="$(get_perf_output_dir)/$(basename \

These are potentially Of Directories, where basename is actually
useful
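
To illustrate why directories matter here: assuming the replacement for
basename(1) is a "strip everything up to the last slash" expansion such
as ${path##*/} (an assumption, the message does not name it), the two
differ for paths with a trailing slash.  Both helpers below are
hypothetical Python models:

```python
def strip_to_last_slash(path):
    """Mimics the shell expansion ${path##*/} (assumed replacement)."""
    return path.rsplit("/", 1)[-1]


def basename_1(path):
    """Approximates basename(1) for non-empty paths."""
    stripped = path.rstrip("/")  # trailing slashes are dropped first
    return stripped.rsplit("/", 1)[-1] if stripped else "/"
```

For a plain file path both return the same name; for "/a/b/dir/" the
expansion yields an empty string while basename(1) yields "dir".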

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12652
2022-02-16 17:58:55 -08:00
Jorgen Lundman d6b7903032 autoconf: allow Release to contain hyphen
To avoid clashing with tags and releases, we'll use "zfs-macOS".

Meta:          1
Name:          zfs-macOS

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12437
2022-02-16 17:58:55 -08:00
Brian Behlendorf cd0e238049 ZTS: Update enospc_002_pos test case
The on-disk cost of creating a snapshot or bookmark is sufficiently low
that it is difficult to make it reliably fail even when the pool is
"full".  In order to avoid false positives remove these two checks from
the test case.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #13060
2022-02-16 17:58:55 -08:00
Pawel Jakub Dawidek 3e27b589cf Fix clearing set-uid and set-gid bits on a file when replaying a write
POSIX requires set-uid and set-gid bits to be removed when an
unprivileged user writes to a file and ZFS does that during normal
operation.

The problem arises when the write is stored in the ZIL and replayed.
During replay we have no access to original credentials of the process
doing the write, so zfs_write() will be performed with the root
credentials. When root is doing the write set-uid and set-gid bits
are not removed from the file.

To correct that, log a separate TX_SETATTR entry that removes those bits
on the first write to such a file.

Idea from:	Christian Schwarz

Add test for ZIL replay of setuid/setgid clearing.

Improve various edge cases when clearing setid bits:
- The setid bits can be readded during a single write, so make sure to check
  for them on every chunk write.
- Log TX_SETATTR record at most once per transaction group (if the setid bits
  keep coming back).
- Move zfs_log_setattr() outside of zp->z_acl_lock.
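
The replay problem and the fix can be modelled in a few lines.  This is
a user-space sketch: the function names, the record tuples, and the
order of the two records are assumptions, not the ZIL on-disk format:

```python
import stat

SETID = stat.S_ISUID | stat.S_ISGID


def write_chunk(mode, privileged):
    """Return the file mode after one write chunk (illustrative model)."""
    if not privileged and mode & SETID:
        return mode & ~SETID
    return mode


def log_records(mode, privileged):
    """Records a live write would log so that replay matches it."""
    records = []
    if not privileged and mode & SETID:
        # Replay runs with root credentials, so the bit clearing has to
        # be recorded explicitly instead of being re-derived.
        records.append(("TX_SETATTR", mode & ~SETID))
    records.append(("TX_WRITE", None))
    return records


def replay(mode, records):
    """Apply logged records the way replay would, acting as root."""
    for kind, new_mode in records:
        if kind == "TX_SETATTR":
            mode = new_mode
        else:
            mode = write_chunk(mode, privileged=True)
    return mode
```

Replaying only the write record leaves mode 04755 unchanged, whereas
replaying the records produced by log_records() ends at 0755, matching
what the live unprivileged write did.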

Reviewed-by: Dan McDonald <danmcd@joyent.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Christian Schwarz <me@cschwarz.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #13027
2022-02-16 17:58:55 -08:00
Akash B 9221ff1888 Add enumerated vdev names to 'zpool iostat -v' and 'zpool list -v'
This commit adds enumerated names to disambiguate between the
different vdevs. Previously only 'zpool status' showed enumerated
vdev names, now 'zpool list -v' and 'zpool iostat -v' also show
the enumerated vdev names.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes #12510
Closes #13031
2022-02-16 17:58:55 -08:00
George Amanakis 72a82f312f Report dnodes with faulty bonuslen
In files created/modified before 4254acb there may be a corruption of
xattrs which is not reported during scrub and normal send/receive. It
manifests only as an error when raw sending/receiving. This happens
because currently only the raw receive path checks for discrepancies
between the dnode bonus length and the spill pointer flag.

In case we encounter a dnode whose bonus length is greater than the
predicted one, we should report an error. Modify in this regard
dnode_sync() with an assertion at the end, dump_dnode() to error out,
dsl_scan_recurse() to report errors during a scrub, and zstream to
report a warning when dumping. Also added a test to verify spill blocks
are sent correctly in a raw send.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12720 
Closes #13014
2022-02-16 17:58:55 -08:00
ColMelvin 5753e7a7c5 RPM: Add missing BuildRequires for PAM component
When the optional PAM binaries are included in a build, ./configure will
look for security/pam_modules.h and - if it doesn't find it - recommend
the user install `libpam0g-dev`.  On Red Hat systems, `pam-devel` is the
package that supplies this requirement; `libpam0g-dev` does not exist.

By encoding this requirement in the spec file, we give packagers more
appropriate (and timely) recommendations for completing the build.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Lindee <chris.lindee+github@gmail.com>
Closes #13001
2022-02-16 17:58:55 -08:00
Brian Behlendorf 7f4f461bcf Clarify failmode=wait documentation
Nowhere in the description of the failmode property does it
clearly state how to bring a suspended pool back online.
Add a few words to property description and the zpool-clear(8)
man page.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12907
Closes #9395
2022-02-16 17:58:55 -08:00
Ryan Hirasaki f601ee1e43 README: Update OpenZFS website url
This change is to first replace the OpenZFS website in the README to
point to openzfs.org as this is what open-zfs.org redirects to.
Along with replacing the URL, the protocol is also upgraded
from http to https.

These changes should prevent web browsers such as Firefox from
complaining about visiting a http site, if the proper security
settings are enabled, when it will still end up on a https page
after the redirect.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Hirasaki <ryanhirasaki@gmail.com>
Closes #12939
2022-02-16 17:58:55 -08:00
chrisrd 5987838a3f man: speling
Fix spelling.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Closes #12911
2022-02-16 17:58:55 -08:00
Brian Behlendorf 8285e1b09d ZTS: Fix enospc_002_pos.ksh again
This is a follow up commit for e03a41a60 which aimed to resolve
this same test failure.  The core "problem" here is that it takes
very little space to perform a clone/snapshot/bookmark, which
means if we want these commands to reliably fail the pool must
truly have exhausted all free space.

This commit increases the number of fill iterations to try and
consume every block which we can.  This still can't guarantee
the clone/snapshot/bookmark will fail, but it significantly
improves the odds.  The exception was kept since it's still
not a sure thing.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12903
2022-02-16 17:58:55 -08:00
Brian Behlendorf c454e46336 ZTS: Fix rollback_003_pos.ksh
Under Linux when rolling back a mounted filesystem negative dentries
may not be dropped from the cache.  This can result in an ENOENT
being incorrectly returned on first access.  Issuing a `df` before
the unmount results in the negative dentries being invalidated and
side steps the issue.

This is solely a workaround for the test case on Linux and not
correct behavior.  The core issue of invalidating negative dentries
needs to be handled with a kernel side change.  This is being
tracked as issue #6143.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12898
Issue #6143
2022-02-16 17:58:55 -08:00
Brian Behlendorf 306cccca27 Update zts-report.py with additional tests
The following test cases may still occasionally fail and are being
added to the "maybe" list for Linux until they can be updated to be
entirely reliable.

  cli_root/zfs_rename/zfs_rename_002_pos.ksh
  cli_root/zpool_reopen/zpool_reopen_003_pos.ksh
  refreserv/refreserv_raidz

These 6 tests consistently fail only on Fedora 31+, the failures
are related to the kernel rescanning the partition table on loopback
devices which is no longer reliable unless partprobe is used.  In
order to enable the Fedora bot by default they are also being added
to the list until the tests can be updated.  Any significant regression
in functionality covered by these tests will still be detected by the
FreeBSD builders.

  alloc_class/alloc_class_009_pos
  alloc_class/alloc_class_010_pos
  cli_root/zpool_expand/zpool_expand_001_pos
  cli_root/zpool_expand/zpool_expand_005_pos
  rsend/rsend_007_pos
  rsend/rsend_010_pos
  rsend/rsend_011_pos
  snapshot/rollback_003_pos

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #10489
2022-02-16 17:58:55 -08:00
Rich Ercolani 4730c3f249 Exclude zvol_misc_volmode for now
It keeps failing, on changes which aren't related at all.

So until someone runs down why, I'd like it to stop being the
sole reason for CI failures.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12733
2022-02-16 17:58:55 -08:00
Brian Behlendorf 4fea6a6737 ZTS: Add known exceptions
Add the following test failures to the exception list for FreeBSD
to ensure we notice new unexpected failures.

   pool_checkpoint/checkpoint_big_rewind
   pool_checkpoint/checkpoint_indirect

And the following for Linux.

   zvol/zvol_misc/zvol_misc_snapdev

Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12621
Issue #12622
Issue #12623
Closes #12624
2022-02-16 17:58:55 -08:00
Ryan Moeller fc3230a781 ZTS: Minimize udev_wait in zvol_misc tests
The zvol_misc tests, in particular zvol_misc_volmode, make use of a
common udev_wait function to wait for zvol devices in /dev to quiesce
on Linux.  On other platforms this function currently only sleeps for
one second before returning.  This is insufficient, and
zvol_misc_volmode has been flaky on FreeBSD as a result.

Replace udev_wait with block_device_wait, passing through the optional
device parameter where possible.  Rearrange a few checks to strengthen
the verifications we are making and avoid unnecessarily sleeping.  We
must keep udev_wait in a couple places to pass in GitHub CI workflows.
Remove zvol_misc_volmode from the maybe failing tests on FreeBSD in
zts-report.py.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12583
2022-02-16 17:58:55 -08:00
Ka Ho Ng ed064ed596 ZTS: Enable punch-hole tests on FreeBSD
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ka Ho Ng <khng@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12458
2022-02-16 17:58:55 -08:00
Brian Behlendorf 74bba85423 ZTS: Fix refreserv_raidz.ksh
The refreserv_raidz test was failing on Linux because the sync being
issued doesn't guarantee a pool sync.  Switch to using the sync_pool
function and remove the ZTS exception for Linux.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12897
2022-02-16 17:58:55 -08:00
Georgy Yakovlev f22ebf8fa6 zfs-test/mmap_seek: fix build on musl
The build on musl needs linux/fs.h for SEEK_DATA and friends,
and sys/sysmacros.h for P2ROUNDUP.  Add the needed headers.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #12891
2022-02-16 17:58:55 -08:00
Brian Behlendorf 1fb5566a25 ZTS: speed up rsend tests
With some minor tweaks several of the rsend tests can be sped up
considerably without significantly reducing test coverage.

* send-c_verify_ratio:  ~120s -> ~60s
* send_realloc_*_files: ~330s -> ~65s

For the send_realloc* tests this also has the advantage of removing
(most of) the linux/freebsd conditional logic.  Note that for this
test more passes, and thus more incremental send/recvs, are preferable
to a larger number of files.

Total run time of the rsend test group was reduced from roughly 20 to
11 minutes in an environment similar to what's used by the CI.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12876
2022-02-16 17:58:55 -08:00
Brian Behlendorf be01ee8629 ZTS: rsend_007_pos failures
The rsend_007_pos test reliably fails on Linux in the cleanup
function.  This is caused by an unmount error when attempting to
recursively destroy the newly received datasets.  Invoking `df`
prior to the `zfs destroy` interestingly avoids the unmount error.

Why this should matter is unclear and should be investigated.
However, this minor tweak may allow us to remove the ZTS rsend
exceptions.  The subsequent rsend_010_pos and rsend_011_pos
failures were a result of this initial failure.  The other
"maybe" failures I was unable to reproduce and have not been
recently observed in the master branch.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #5665
Closes #6086
Closes #6087
Closes #6446
Closes #12876
2022-02-16 17:58:55 -08:00
наб efbed102f0 zfs-share.8: document -l flag
Description stolen from zfs-mount.8

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12067
2022-02-16 17:58:55 -08:00
наб 19a4bf445f contrib/initrd: systemd-ask-password --no-tty before argument
In systemd 249 (sid), sd-a-p processes its arguments in getopt + mode,
so "systemd-ask-password zupa --no-tty" prompts for "zupa --no-tty",
not "zupa" not on the tty, as expected (bullseye, 247).

Ref: https://github.com/systemd/systemd/commit/4b1c842d95bfd6ab352ade1a4655f9e512f35185
Ref: https://github.com/systemd/systemd/pull/19806
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12870
2022-02-16 17:58:55 -08:00
наб f9baf968b8 dracut: 90zfs: zfs-load-key: wait for key to appear for up to 10 seconds
Also reduce password retries to 3 to match i-t

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12065
Closes #12108
2022-02-16 17:58:55 -08:00
наб 9cbc2ed20f libzfs: add keylocation=https://, backed by fetch(3) or libcurl
Add support for http and https to the keylocation property to
allow encryption keys to be fetched from the specified URL.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #9543
Closes #9947
Closes #11956
2022-02-16 17:58:37 -08:00
наб 9b185de6fa ZTS: cli_root/zfs_load-key: add separate key files
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue: #11956
Closes #11976
2022-02-15 16:20:12 -08:00
D. Ebdrup 4d4f0d1a05 zfsprops.7: Add note about comma-separation
This change primarily seeks to make implicit documentation explicit, as
it is not outright stated that options should be comma-separated, nor is
there a reason given for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Daniel Ebdrup Jensen <debdrup@FreeBSD.org>
Closes #12579
2022-02-15 16:20:12 -08:00
Rich Ercolani 687de107b7 Add explicit timeout to test step
If we die from timeout of the whole GH action run, we don't run the
collect step afterward, which can make it hard to investigate the
timeout.

If we timeout first in the test action, though, it qualifies as
failure, and collects appropriately.

(330 minutes seems like an acceptable tradeoff between the 6h
timeout by default on the action and the 4h and change "functional"
usually takes.)

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12999
2022-02-15 16:20:12 -08:00
Rich Ercolani 2e3b3e3a2e Workaround Debian's fake System.map behavior
Debian ships fake System.map files by default, leading to the
invocation of depmod with them to flood you with errors about
missing symbols.

Let's notice and not do that.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12862
2022-02-10 11:18:38 -08:00
José Luis Salvador Rufo a35125e3d5 Proper support for DESTDIR and INSTALL_MOD_PATH
The environment variables DESTDIR and INSTALL_MOD_PATH must
be mutually exclusive.

https://www.gnu.org/prep/standards/html_node/DESTDIR.html
https://www.kernel.org/doc/Documentation/kbuild/modules.txt

This issue was discussed in this Buildroot thread:
https://lists.buildroot.org/pipermail/buildroot/2021-August/621350.html

I have seen this behavior in other projects, such as:

- Yocto Project:
  https://www.yoctoproject.org/pipermail/meta-freescale/2013-August/004307.html

- Google IA Coral:
  https://coral.googlesource.com/linux-imx-debian/+/refs/heads/master/debian/rules

For the above reasons, INSTALL_MOD_PATH will be set as DESTDIR
by default.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com>
Signed-off-by: Romain Naour <romain.naour@gmail.com>
Closes #12577
2022-02-10 11:18:29 -08:00
Brian Behlendorf fe8b0a33d4 ZTS: alloc_class.ksh must wait for the process to exit
The alloc_class_* tests may fail on Linux with an EBUSY error if
`zfs destroy` is run before the `dd` process has had a chance to
terminate.  Wait on the pid after the `kill -9` to make sure.

When testing I didn't observe any failures for the alloc_class
tests.  Remove them from the exceptions list; the CI was used to
verify the tests pass on all platforms.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12873
2022-02-10 11:05:07 -08:00
Rich Ercolani d4794c8204 ZTS: Avoid piping send directly to /dev/null
Unfortunately, #11445 means while we fail gracefully now, we still
fail, unless people want to implement a complex workaround just to
support /dev/null.

So let's just use the cheap workaround in a test for now.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12872
2022-02-10 11:04:57 -08:00
Tony Hutter 29e05d5345 ZTS: Fix zpool_reopen_[1-5] on Fedora 35
The zpool_reopen_[1-5] tests are failing on Fedora 35 with:

zpool_reopen_001_pos.ksh[64]: log_must[67]: log_pos[270]:
wait_for_resilver_end[98]: wait_for_action: line 71: func: is read only

Renaming 'func' -> 'funct' fixes the issue.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12871
2022-02-10 11:04:46 -08:00
Georgy Yakovlev f471a0a0a7 systemd: add weekly and monthly scrub timers
Timers can be enabled as follows:

systemctl enable zfs-scrub-weekly@rpool.timer --now
systemctl enable zfs-scrub-monthly@datapool.timer --now

Each timer will pull in zfs-scrub@${poolname}.service, which is not
schedule-specific.

Added PERIODIC SCRUB section to zpool-scrub.8.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>
Closes #12193
2022-02-10 11:04:35 -08:00
ogelpre d76917b2ec Add init script to load keys
Add new init scripts which allow automatic loading of keys if
keylocation property is set to a URI.

Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Benedikt Neuffer <ogelpre@itfriend.de>
Closes #11659
Closes #11662
2022-02-10 11:04:26 -08:00
Francesco Mazzoli 487bb77623 Notify on UNAVAIL statechange
`UNAVAIL` is maybe not quite as concerning as `DEGRADED`, but still an
event of notice, in my opinion. For example it is triggered when a
drive goes missing.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Francesco Mazzoli <f@mazzo.li>
Closes #12629
Closes #12630
2022-02-10 11:04:16 -08:00
Jorgen Lundman f31b45176c Upstream: Add snapshot and zvol events
Allow the kernel to send snapshot mount/unmount events to zed.

Allow the kernel to send symlink creates/removes on zvol plumbing.
(/dev/run/dsk/zvol/$pool/$zvol -> /dev/diskX)

If zed misses the ENODEV, all errors after are EINVAL. Treat any error
as kernel module failure.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12416
2022-02-10 11:04:06 -08:00
Scott Colby 4613504809 zed: Add Pushover notifier
Add zed_notify_pushover to zed-functions.sh, along with the necessary
configuration variables in zed.rc.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Scott Colby <scott@scolby.com>
Closes #12012
2022-02-10 11:03:38 -08:00
Phil Kauffman 44bb2fcf38 zed-functions.sh: escape newline to produce valid json
This was discovered when using Discord's Slack-compatible webhook.

Slack webhooks work without the escape; however, Discord rightly refuses
the POST as it contains invalid JSON.

https://discord.com/developers/docs/resources/webhook#execute-slackcompatible-webhook

Valid (with the newline escaped):
```
+ msg_json='{"text": "*ZFS scrub_finish error for test on quartz*\nZFS has detected a data error:\n\n   eid: 124\n class: scrub_finish\n  host: quartz\n  time: \n error: \n objid: :\n  pool: test\n"}'
```

Invalid (no escape):
```
+ msg_json='{"text": "*ZFS scrub_finish error for test on quartz*
ZFS has detected a data error:\n\n   eid: 124\n class: scrub_finish\n  host: quartz\n  time: \n error: \n objid: :\n  pool: test\n"}'
```
The newline is rendered literally instead of being sent inside the JSON as an escape, as intended.

```
++ curl -X POST https://discord.com/api/webhooks/{webhook.id}/{webhook.token}/slack --header 'Content-Type: application/json' --data-binary '{"text": "*ZFS scrub_finish error for test on quartz*
ZFS has detected a data error:\n\n   eid: 124\n class: scrub_finish\n  host: quartz\n  time: \n error: \n objid: :\n  pool: test\n"}'
+ msg_out='{"message": "Cannot send an empty message", "code": 50006}'
```

Test method:
`root@quartz:/etc/zfs/zed.d# export ZED_ZEDLET_DIR=/etc/zfs/zed.d; export ZEVENT_EID=124; export ZEVENT_SUBCLASS=scrub_finish; export ZEVENT_POOL=test; export ZED_NOTIFY_DATA=1; bash -x ./data-notify.sh`
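
The escaping idea can be sketched in C as follows.  This is only an
illustration under stated assumptions: the actual change is to the shell
code in zed-functions.sh, and escape_newlines() is a hypothetical helper
name.  It handles raw newlines only; a complete JSON encoder would also
have to escape quotes, backslashes and other control characters.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of zed-functions.sh: copy `in` to `out`,
 * replacing every raw newline with the two characters '\' and 'n'.
 * Returns 0 on success, or -1 if `out` is too small.
 */
int
escape_newlines(const char *in, char *out, size_t outsz)
{
	size_t o = 0;

	if (outsz == 0)
		return (-1);

	for (; *in != '\0'; in++) {
		size_t need = (*in == '\n') ? 2 : 1;

		/* Always keep one byte free for the terminating NUL. */
		if (o + need >= outsz)
			return (-1);

		if (*in == '\n') {
			out[o++] = '\\';
			out[o++] = 'n';
		} else {
			out[o++] = *in;
		}
	}
	out[o] = '\0';
	return (0);
}
```

Passing the message text through such a helper before embedding it in the
"text" field keeps the payload on a single line.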

Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Philip Kauffman <philip@kauffman.me>
Closes #13049
2022-02-07 14:05:41 -08:00
shodanshok e56dffe4b5 zed: send notification email by default
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Gionatan Danti <g.danti@assyoma.it>
Closes #12806
2022-02-07 14:05:14 -08:00
George Amanakis e257bd481b Introduce a flag to skip comparing the local mac when raw sending
Raw receiving a snapshot back to the originating dataset is currently
impossible because of user accounting being present in the originating
dataset.

One solution would be resetting user accounting when raw receiving on
the receiving dataset. However, to recalculate it we would have to dirty
all dnodes, which may not be preferable on big datasets.

Instead, we rely on the os_phys flag
OBJSET_FLAG_USERACCOUNTING_COMPLETE to indicate that user accounting is
incomplete when raw receiving. Thus, on the next mount of the receiving
dataset the local mac protecting user accounting is zeroed out.
The flag is then cleared when user accounting of the raw received
snapshot is calculated.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12981
Closes #10523
Closes #11221
Closes #11294
Closes #12594
Issue #11300
2022-02-04 16:14:56 -08:00
Finix1979 1009e60992 Linux <4.8 compat: submit_bio() rw arg
When using the two argument version of submit_bio() in kernels prior
to 4.8 the first argument should be specified.  It's used by block
dump to report the bio direction.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Finix Yan <yancw@info2soft.com>
Closes #13006
2022-02-04 08:33:52 -08:00
наб 4f6599416a Linux 5.17 compat: PDE_DATA() renamed to pde_data()
Upstream commit 359745d78351c6f5442435f81549f0207ece28aa
("proc: remove PDE_DATA() completely")

Link: https://lore.kernel.org/all/20211124081956.87711-2-songmuchun@bytedance.com/T/#u

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #13004
Closes #12989
2022-02-04 08:33:52 -08:00
наб f42c126029 Linux 5.17 compat: dequeue_signal() takes a 4th argument
Linux 5.17's dequeue_signal() takes an additional enum pid_type *
output argument

Upstream commit 5768d8906bc23d512b1a736c1e198aa833a6daa4
("signal: Requeue signals in the appropriate queue")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12989
2022-02-04 08:33:52 -08:00
наб 2ce06d93a8 Linux 5.17 compat: detect complete_and_exit() rename
Linux 5.17 sees a rename from complete_and_exit()
to kthread_complete_and_exit()

Upstream commit cead18552660702a4a46f58e65188fe5f36e9dfe
("exit: Rename complete_and_exit to kthread_complete_and_exit")

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12989
2022-02-04 08:33:52 -08:00
Rich Ercolani 8ef01afbfc Add support for FALLOC_FL_ZERO_RANGE
For us, I think it's always just FALLOC_FL_PUNCH_HOLE with a fake
mustache on.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12975
2022-02-04 08:33:52 -08:00
Rich Ercolani 70b7b1975d Linux 5.16 compat: Added mapping for iov_iter_fault_in_readable
Linux decided to rename this for some reason. At some point, we
should probably invert this mapping, but for now...

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12975
2022-02-04 08:33:52 -08:00
Rich Ercolani c31c1146b6 Linux 5.16 compat: Added add_disk check for return
add_disk went from void to must-check int return.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12975
2022-02-04 08:33:52 -08:00
Rich Ercolani b3e0853951 Linux 5.16 compat: Check slab.h for kvmalloc
As it says on the tin - the folio work moved a bunch out of mm.h.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12975
2022-02-04 08:33:52 -08:00
Mark Johnston 0da15f9194 Fix handling of errors from dmu_write_uio_dbuf() on FreeBSD
FreeBSD's implementation of zfs_uio_fault_move() returns EFAULT when a
page fault occurs while copying data in or out of user buffers.  The VFS
treats such errors specially and will retry the I/O operation (which may
have made some partial progress).

When the FreeBSD and Linux implementations of zfs_write() were merged,
the handling of errors from dmu_write_uio_dbuf() changed such that
EFAULT is not handled as a partial write.  For example, when appending
to a file, the z_size field of the znode is not updated after a partial
write resulting in EFAULT.

Restore the old handling of errors from dmu_write_uio_dbuf() to fix
this.  This should have no impact on Linux, which has special handling
for EFAULT already.

Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12964
2022-02-03 15:30:52 -08:00
Mark Johnston 5303fc4c95 Avoid memory allocations in the ARC eviction thread
When the eviction thread goes to shrink an ARC state, it allocates a set
of marker buffers used to hold its place in the state's sublists.

This can be problematic in low memory conditions, since
1) the allocation can be substantial, as we allocate NCPU markers;
2) on at least FreeBSD, page reclamation can block in
   arc_wait_for_eviction()

In particular, in stress tests it's possible to hit a deadlock on
FreeBSD when the number of free pages is very low, wherein the system is
waiting for the page daemon to reclaim memory, the page daemon is
waiting for the ARC eviction thread to finish, and the ARC eviction
thread is blocked waiting for more memory.

Try to reduce the likelihood of such deadlocks by pre-allocating markers
for the eviction thread at ARC initialization time.  When evicting
buffers from an ARC state, check to see if the current thread is the ARC
eviction thread, and use the pre-allocated markers for that purpose
rather than dynamically allocating them.
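
A minimal sketch of the pre-allocation pattern, assuming a fixed marker
count and hypothetical names (NMARKERS, markers_init(), markers_get(),
markers_put()); the real ARC code and its marker type are not reproduced
here:

```c
#include <stdlib.h>

#define	NMARKERS	4	/* hypothetical stand-in for one marker per CPU */

typedef struct marker {
	int m_placeholder;
} marker_t;

/* Allocated once at init time and reserved for the eviction thread. */
static marker_t *evict_markers;

int
markers_init(void)
{
	evict_markers = calloc(NMARKERS, sizeof (marker_t));
	return (evict_markers != NULL ? 0 : -1);
}

/*
 * The eviction thread reuses the pre-allocated set, so it never has to
 * allocate while memory is low.  Any other caller allocates dynamically.
 */
marker_t *
markers_get(int is_evict_thread)
{
	if (is_evict_thread)
		return (evict_markers);
	return (calloc(NMARKERS, sizeof (marker_t)));
}

/* Only dynamically allocated sets are freed. */
void
markers_put(marker_t *m)
{
	if (m != evict_markers)
		free(m);
}
```

The cost is a small amount of memory held for the lifetime of the ARC in
exchange for an eviction path that cannot block on allocation.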

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12985
2022-02-03 15:30:52 -08:00
Ryan Moeller 4aceda0497 libzfs_sendrecv: Fix leaked holds nvlist
There is no need to allocate a holds nvlist.  lzc_get_holds does that
for us.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12967
2022-02-03 15:28:01 -08:00
Ryan Moeller ddb5a7a182 libzfs_sendrecv: Avoid extra avl_find
avl_add does avl_find internally, then avl_insert.  We're already doing
the avl_find, so using avl_insert directly avoids repeating the search.
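
The find-then-insert pattern can be illustrated with a sorted array
standing in for the AVL tree.  All names below (intset_t, set_find(),
set_insert(), set_add_unique()) are hypothetical; only the idea of
reusing the position returned by the search carries over.

```c
#include <stddef.h>
#include <string.h>

#define	SET_CAP	16

typedef struct intset {
	int v[SET_CAP];
	size_t n;
} intset_t;

/*
 * Binary search.  Returns 1 if `key` is present.  In either case *where
 * is the index at which `key` is, or would have to be inserted.
 */
int
set_find(const intset_t *s, int key, size_t *where)
{
	size_t lo = 0, hi = s->n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (s->v[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	*where = lo;
	return (lo < s->n && s->v[lo] == key);
}

/* Insert at a position obtained from set_find(); no second search. */
void
set_insert(intset_t *s, int key, size_t where)
{
	memmove(&s->v[where + 1], &s->v[where],
	    (s->n - where) * sizeof (int));
	s->v[where] = key;
	s->n++;
}

/* Returns 1 if inserted, 0 if already present or the set is full. */
int
set_add_unique(intset_t *s, int key)
{
	size_t where;

	if (set_find(s, key, &where) || s->n == SET_CAP)
		return (0);
	set_insert(s, key, where);
	return (1);
}
```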

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12967
2022-02-03 15:28:01 -08:00
Ryan Moeller af1630c883 FreeBSD: Fix zvol_cdev_open locking
First open locking changes were correctly applied to zvol_geom_open but
incorrectly applied to zvol_cdev_open, causing spa_namespace_lock to be
held indefinitely.

Make the first open locking in zvol_cdev_open match zvol_geom_open.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #13016
2022-02-03 15:28:01 -08:00
Ryan Moeller 1828b68a0b FreeBSD: Fix zvol_*_open() locking
These are the changes for FreeBSD corresponding to the changes made for
Linux in #12863, see that PR for details.

Changes from #12863 are applied for zvol_geom_open and zvol_cdev_open
on FreeBSD.  This also adds a check for the zvol dying which we had
in zvol_geom_open but was missing in zvol_cdev_open.  The check causes
the open to fail early with ENXIO when we are in the middle of changing
volmode.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12934
2022-02-03 15:28:01 -08:00
Ryan Moeller f4def7ec6c FreeBSD: Fix leaked strings in libspl mnttab
The FreeBSD implementations of various libspl functions for getting
mounted device information were found to leak several strings which
were being allocated in statfs2mnttab but never freed.

The Solaris getmntany(3C) and related interfaces are expected to return
strings residing in static buffers that need to be copied rather than
freed by the caller.

Use static thread-local storage to stash the mnttab structure strings
from FreeBSD's statfs info rather than strings allocated on the heap by
strdup(3).
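
A minimal sketch of the static thread-local approach, assuming C11
_Thread_local and a hypothetical helper name and buffer size; it is not
the libspl code itself:

```c
#include <stdio.h>

#define	STASH_LEN	256	/* hypothetical buffer size */

/*
 * Copy `src` into a per-thread static buffer and return that buffer.
 * The caller must copy the string if it needs it to outlive the next
 * call on the same thread, and must never free it.
 */
const char *
stash_mount_name(const char *src)
{
	static _Thread_local char buf[STASH_LEN];

	(void) snprintf(buf, sizeof (buf), "%s", src);
	return (buf);
}
```

Because each call overwrites the previous contents, nothing is leaked,
which matches the getmntany(3C) contract described above.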

While here, remove some stray commented out lines.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12961
2022-02-03 15:28:01 -08:00
наб c9c9d634aa linux: libzfs: mount: fix uninitialised flags
They're later |=d with constants, but never reset

Caught by valgrind while investigating
https://github.com/openzfs/zfs/pull/12928#issuecomment-1007496550

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12954
2022-02-03 15:28:01 -08:00
наб 36a91d6cef FreeBSD: vfsops: use setgen for error case
Fix from https://github.com/openzfs/zfs/pull/12844#discussion_r774179413

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12905
2022-02-03 15:28:01 -08:00
chrisrd 1259dc6e6a zfs_prune: reset sc.nr_to_scan
sc.nr_to_scan is an input to super_cache_scan (via
shrinker->scan_objects), used to set the number of objects to scan
in the various caches. However super_cache_scan also modifies
sc.nr_to_scan, so when used in a loop we need to reset
sc.nr_to_scan back to our desired nr_to_scan for the next
iteration.
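
The pitfall can be sketched with a stand-in callback that clobbers its
input.  struct scan_ctl, fake_scan() and prune_all() are hypothetical
names for illustration, not the kernel's shrinker interface:

```c
struct scan_ctl {
	unsigned long nr_to_scan;
};

/*
 * Stand-in for a scan callback: it reports how many objects it handled
 * and, as a side effect, modifies the request it was given.
 */
unsigned long
fake_scan(struct scan_ctl *sc)
{
	unsigned long done = sc->nr_to_scan;

	sc->nr_to_scan = 0;
	return (done);
}

/* Call the callback `passes` times, asking for nr_to_scan each time. */
unsigned long
prune_all(int passes, unsigned long nr_to_scan)
{
	struct scan_ctl sc;
	unsigned long total = 0;

	for (int i = 0; i < passes; i++) {
		/* Reset on every iteration; the callee may have changed it. */
		sc.nr_to_scan = nr_to_scan;
		total += fake_scan(&sc);
	}
	return (total);
}
```

If the assignment were hoisted out of the loop, every pass after the
first would run with whatever value the callback left behind.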

Issue discovered and solution suggested by
Tenzin Lhakhang @tlhakhan.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chris Dunlop <chris@onthe.net.au>
Issue #12433
Closes #12908
2022-02-03 15:28:01 -08:00
Brian Behlendorf 6575defc52 Verify dRAID empty sectors
Verify that all empty sectors are zero filled before using them to
calculate parity.  Failure to do so can result in incorrect parity
columns being generated and written to disk if the contents of an
empty sector are non-zero.  This was possible because the checksum
only protects the data portions of the buffer, not the empty sector
padding.

This issue has been addressed by updating raidz_parity_verify() to
check that all dRAID empty sectors are zero filled.  Any sectors
which are non-zero will be fixed, repair IO issued, and a checksum
error logged.  They can then be safely used to verify the parity.

This specific type of damage is unlikely to occur since it requires
a disk to have silently returned bad data, for an empty sector, while
performing a scrub.  However, if a pool were to have been damaged
in this way, scrubbing the pool with this change applied will repair
both the empty sector and parity columns as long as the data checksum
is valid.  Checksum errors will be reported in the `zpool status`
output for any repairs which are made.
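
The check itself is simple and can be sketched as below.  This is an
illustration only, with a hypothetical function name; the real logic
lives in raidz_parity_verify() and also issues repair I/O and logs a
checksum error.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Verify that an empty (padding) sector is zero filled.  If any byte is
 * non-zero the sector is zeroed in place and 1 is returned so the caller
 * can account for the repair; otherwise 0 is returned.
 */
int
repair_empty_sector(uint8_t *sector, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (sector[i] != 0) {
			memset(sector, 0, len);
			return (1);
		}
	}
	return (0);
}
```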

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12857
2022-02-03 15:28:01 -08:00
наб 5d8c081193 FreeBSD: fix unpropagated error
When performing I/O on FreeBSD using a file based vdev ensure all
errors encountered when reading/writing are propagated through the
zio pipeline.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12904
2022-02-03 15:28:01 -08:00
Martin Matuška 14bf91a043 FreeBSD: fix world build after 143476ce8
Do not redefine the fallthrough macro when building with libcpp.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #12880
2022-02-03 15:28:01 -08:00
Philipp Riederer 1833de8103 Fix error propagation from lzc_send_redacted
Any error from lzc_send_redacted is overwritten by the error of
send_conclusion_record; skip writing the conclusion record if there
was an earlier error.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Philipp Riederer <philipp@riederer.email>
Closes #12766
2022-02-03 15:28:01 -08:00
наб a1a52a356b freebsd/libshare: nfs: don't send SIGHUP to all processes
pidfile_open() sets *pidptr to -1 if the process currently holding
the lock is between pidfile_open() and pidfile_write().  The
subsequent kill(mountdpid) would then potentially SIGHUP all
non-system processes except init.  In that case, just sleep for
half a millisecond and try again.
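
The guard can be sketched in portable C.  signal_daemon() is a
hypothetical helper; the retry after a short sleep is left to the
caller and is not shown:

```c
#include <sys/types.h>
#include <signal.h>

/*
 * Send `sig` to a single process.  Refuses pid values <= 0, because
 * kill(2) treats -1 as "every process the caller may signal" and 0 as
 * "the caller's process group".  Returns 0 on success, -1 otherwise.
 */
int
signal_daemon(pid_t pid, int sig)
{
	if (pid <= 0)
		return (-1);
	return (kill(pid, sig));
}
```

A caller that gets -1 back for a pid of -1 can sleep briefly and read
the pid again instead of signalling.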

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12067
2022-02-03 15:28:01 -08:00
Brian Behlendorf 9ec630ff2c Fix zvol_open() lock inversion
When restructuring the zvol_open() logic for the Linux 5.13 kernel
a lock inversion was accidentally introduced.  In the updated code
the spa_namespace_lock is now taken before the zv_suspend_lock
allowing the following scenario to occur:

    down_read <=== waiting for zv_suspend_lock
    zvol_open <=== holds spa_namespace_lock
    __blkdev_get
    blkdev_get_by_dev
    blkdev_open
    ...

     mutex_lock <== waiting for spa_namespace_lock
     spa_open_common
     spa_open
     dsl_pool_hold
     dmu_objset_hold_flags
     dmu_objset_hold
     dsl_prop_get
     dsl_prop_get_integer
     zvol_create_minor
     dmu_recv_end
     zfs_ioc_recv_impl <=== holds zv_suspend_lock via zvol_suspend()
     zfs_ioc_recv
     ...

This commit resolves the issue by moving the acquisition of the
spa_namespace_lock back to after the zv_suspend_lock which restores
the original ordering.
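
Why the two orderings conflict can be shown with a small checker that
compares the order in which two code paths take the same locks.  The
lock identifiers and function names are hypothetical and exist only for
this sketch:

```c
#include <stddef.h>

enum { LOCK_SUSPEND = 1, LOCK_NAMESPACE = 2 };

/* Position of `lock` in an acquisition sequence, or -1 if absent. */
static long
order_index(const int *seq, size_t n, int lock)
{
	for (size_t i = 0; i < n; i++) {
		if (seq[i] == lock)
			return ((long)i);
	}
	return (-1);
}

/*
 * Returns 1 if path `a` takes some pair of locks in the opposite order
 * from path `b`, which is the precondition for a lock inversion.
 */
int
orders_conflict(const int *a, size_t na, const int *b, size_t nb)
{
	for (size_t i = 0; i < na; i++) {
		for (size_t j = i + 1; j < na; j++) {
			long bi = order_index(b, nb, a[i]);
			long bj = order_index(b, nb, a[j]);

			if (bi >= 0 && bj >= 0 && bj < bi)
				return (1);
		}
	}
	return (0);
}
```

With the receive path modelled as { LOCK_SUSPEND, LOCK_NAMESPACE }, an
open path of { LOCK_NAMESPACE, LOCK_SUSPEND } is reported as a conflict,
while the restored order { LOCK_SUSPEND, LOCK_NAMESPACE } is not.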

Additionally, as part of this change the error exit paths were
simplified where possible.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12863
2022-02-03 15:28:01 -08:00
Alan Somers 4b2bac5fe9 FreeBSD: Update argument types for VOP_READDIR
A recent commit to FreeBSD changed the type of
vop_readdir_args.a_cookies to a uint64_t**.  There is no functional
impact to ZFS because ZFS only uses 32-bit cookies, which will be
zero-extended to 64-bits by the existing code.
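
As a small illustration of why the wider type is harmless, converting an unsigned 32-bit value to 64 bits in C zero-extends it. widen_cookie() is a hypothetical name used only for this sketch.

```c
#include <stdint.h>

/* An unsigned 32-bit cookie keeps its value when widened to 64 bits. */
uint64_t widen_cookie(uint32_t cookie)
{
	return ((uint64_t)cookie);
}
```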

https://github.com/freebsd/freebsd-src/commit/b214fcceacad6b842545150664bd2695c1c2b34f

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Alan Somers <asomers@gmail.com>
Closes #12874
2022-02-03 15:28:01 -08:00
Alexander Motin 786abf5321 Reduce number of arc_prune threads
On FreeBSD vnode reclamation is single-threaded, protected by a single
global lock.  Linux seems to be able to use a thread per mount point,
but at this time it does more harm than good.

Reduce number of threads to 1, adding tunable in case somebody wants
to try more.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chris Dunlop <chris@onthe.net.au>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12896
Issue #9966
2022-02-03 15:28:01 -08:00
Ryan Moeller 913ae45218 FreeBSD: Provide correct file generation number
va_seq was actually a thin veil over va_gen, so z_gen is a more
appropriate value than z_seq to populate the field with.

Drop the unnecessary compat obfuscation and provide the correct
file generation number.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@freebsd.org>
Closes #12851
2022-02-03 15:28:01 -08:00
Tony Hutter af88d47f1e Tag zfs-2.1.2
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
2021-12-13 15:00:39 -08:00
Till Maas 24221589dd zfs-dkms rpm: Fix scriptlets dependencies
To ensure that the necessary packages are available during the %post and
%preun scriptlets, require them properly.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Till Maas <opensource@till.name>
Closes #12822
Closes #12832
2021-12-13 13:23:48 -08:00
Ryan Moeller def73c0735 FreeBSD: Add vop_standard_writecount_nomsync
https://cgit.freebsd.org/src/commit?id=3ffcfa599e29686cf2b3c1a6087408c37acaed78

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828
2021-12-13 13:23:07 -08:00
Ryan Moeller effe984148 FreeBSD: Catch up with more VFS changes
Unused thread argument was removed from NDINIT*

https://cgit.freebsd.org/src/commit?id=7e1d3eefd410ca0fbae5a217422821244c3eeee4

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12828
2021-12-13 13:23:01 -08:00
Mark Johnston 19337332cc Fix several bugs in the FreeBSD rename VOP implementation
- To avoid a use-after-free, zfsvfs->z_log needs to be loaded after the
  teardown lock is acquired with ZFS_ENTER().
- Avoid leaking vnode locks in zfs_rename_relock() and zfs_rename_()
  when the ZFS_ENTER() macro forces an early return.

Refactor the rename implementation so that ZFS_ENTER() can be used
safely.  As a bonus, this lets us use the ZFS_VERIFY_ZP() macro instead
of open-coding its implementation.

Reported-by: Peter Holm <pho@FreeBSD.org>
Tested-by: Peter Holm <pho@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Sponsored-by: The FreeBSD Foundation
Closes #12717
2021-12-13 13:22:54 -08:00
Pawel Jakub Dawidek b96737b83e Remove (now unused) td argument from zfs_lookup()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #12748
2021-12-13 13:22:47 -08:00
Mark Johnston 4b7bfcf8a0 Exit the teardown section later in rename on FreeBSD
We have to hold the teardown lock while dereferencing zfsvfs->z_os and,
I believe, when committing to the ZIL.

Note that when jumping to the "out" label, "error" is always non-zero.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704
2021-12-13 13:22:41 -08:00
Mark Johnston 07165ce540 Fix potential use-after-frees in FreeBSD getpages and setattr VOPs
The objset object is reallocated during certain dataset operations, such
as rollbacks, so the objset pointer must be loaded after acquiring the
teardown lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12704
2021-12-13 13:22:34 -08:00
Brian Behlendorf 6ed7d77b44 ZTS: import_rewind_device_replaced reliably fails
The import_rewind_device_replaced.ksh test was never entirely reliable
because it depends on MOS data not being overwritten.  The MOS data is
not protected by the snapshot so occasional failures were always
expected.  However, this test is now failing reliably on all platforms
indicating something has changed in the code since the test was marked
"maybe".  Convert the test to a "known" failure until the root cause
is identified and resolved.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12821
2021-12-08 13:28:09 -08:00
Damian Szuberski 64e88992b6 Update checkstyle workflow env to ubuntu-20.04
- `checkstyle` workflow uses ubuntu-20.04 environment
- improved `mancheck.sh` readability

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12713
2021-12-08 13:27:56 -08:00
Brian Behlendorf ad15fb430a Linux 5.15 compat: META (#12824)
The final 5.15 kernel is available and has been tested.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2021-12-07 17:05:04 -08:00
Paul Dagnelie 57f6a050e6 ZFS send/recv with ashift 9->12 leads to data corruption
Improve the ability of zfs send to determine if a block is compressed
or not by using information contained in the blkptr.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: Matthew Ahrens <matthew.ahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12770
2021-12-07 17:04:34 -08:00
Coleman Kane b3b293c9fc Linux 5.16: Resolve ZSTD_isError symbol collision in Linux kernel
Newer zstd code introduced in the main kernel tree now creates a symbol
collision with ZSTD_isError in our ZSTD code. This change relabels our
implementation with a ZFS-specific symbol name, and undoes some
macro-based micro-optimizations that conflict with the attempt to rename
our internal-use version.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:14:24 -08:00
Coleman Kane bef7c02c81 Linux 5.16: The blk-cgroup.h header is where struct blkcg_gq is defined
The definition of struct blkcg_gq was moved into blk-cgroup.h, which is
a header that's been in Linux since 2015. This is used by
vdev_blkg_tryget() in module/os/linux/zfs/vdev_disk.c. Since the kernel
for CentOS 7 and similar-generation releases doesn't have this header,
its inclusion is guarded by a configure test.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:14:23 -08:00
Coleman Kane ea61e07413 Linux 5.16: bio_set_dev is no longer a helper macro
This change adds a configure check to determine if bio_set_dev is a
helper macro or not. If not, then the attempt to override its internal
call to bio_associate_blkg(), with a macro definition to our own
version, is no longer possible, as the compiler won't use it when
compiling the new inline function replacement implemented in the header.
This change also creates a new vdev_bio_set_dev() function that performs
the same work, and also performs the work implemented in
vdev_bio_associate_blkg(), as it is the only thing calling that function
in our code. Our custom vdev_bio_associate_blkg() is now only compiled
if the bio_set_dev() is a macro in the Linux headers.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:14:23 -08:00
Coleman Kane 9519fe1ff8 Linux 5.16: type member of iov_iter renamed iter_type
The iov_iter->type member was renamed iov_iter->iter_type. However,
while looking into this, I realized that in 2018 an iov_iter_type(*iov)
accessor function was introduced. So if that is present, use it,
otherwise fall back to trying the existing behavior of directly
accessing type from iov_iter.
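
The accessor-or-fallback approach can be sketched with a mock structure. Everything here is an assumption made for illustration: the struct is a stand-in for the kernel's iov_iter, and HAVE_MOCK_IOV_ITER_TYPE is a hypothetical macro name, not the one the configure check defines.

```c
/* Mock of the kernel structure; the real one lives in <linux/uio.h>. */
struct mock_iov_iter {
	unsigned int iter_type;
};

/* Pretend configure detected the accessor (hypothetical macro name). */
#define	HAVE_MOCK_IOV_ITER_TYPE 1

#if defined(HAVE_MOCK_IOV_ITER_TYPE)
/* Stand-in for the accessor function. */
unsigned int mock_iov_iter_type(const struct mock_iov_iter *i)
{
	return (i->iter_type);
}
#define	compat_iter_type(i)	mock_iov_iter_type(i)
#else
/* Older layout: read the member directly. */
#define	compat_iter_type(i)	((i)->type)
#endif
```

Callers use compat_iter_type() everywhere, so a later rename of the member only affects the fallback branch.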

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:14:23 -08:00
Coleman Kane 0c40ff56f2 Linux 5.16: block_device_operations->submit_bio now returns void
The return type for the submit_bio member of struct
block_device_operations was changed to no longer return a value.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12819
2021-12-07 13:14:23 -08:00
Coleman Kane 806c3777e7 Linux 5.16 compat: asm/fpu/xcr.h is new location for xgetbv/xsetbv
Linux 5.16 moved these functions into this new header in commit
1b4fb8545f2b00f2844c4b7619d64d98440a477c. This change adds code to look
for the presence of this header, and include it so that the code using
xgetbv & xsetbv will compile again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800
2021-12-07 13:14:23 -08:00
наб ac9b1aa1bf tests/file_check: remove unused variable
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12187
2021-12-06 13:52:34 -08:00
John Wren Kennedy e20186f5d5 Strip colons from all test result filenames
The upload artifact functionality in github can't handle colons in
filenames. The current code handles this for files under the most
recent set of results. With the ability to rerun failed tests, now
there can be multiple sets of results, and they all need to be
processed in the same way.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12815
2021-12-06 12:23:02 -08:00
Brian Behlendorf 16da688f25 Linux 5.13 compat: retry zvol_open() when contended
Due to a possible lock inversion the zvol open call path on Linux
needs to be able to retry in the case where the spa_namespace_lock
cannot be acquired.

For Linux 5.12 and older kernels this was accomplished by returning
-ERESTARTSYS from zvol_open() to request that blkdev_get() drop
the bdev->bd_mutex lock, reacquire it, then call the open callback
again.  However, as of the 5.13 kernel this behavior was removed.

Therefore, for 5.12 and older kernels we preserved the existing
retry logic, but for 5.13 and newer kernels we retry internally in
zvol_open().  This should always succeed except in the case where
a pool's vdevs are layered on zvols, in which case it may fail.  To
handle this case vdev_disk_open() has been updated to retry when
opening a device when -ERESTARTSYS is returned.
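
The internal retry can be pictured as a bounded trylock loop. This is a userspace sketch, not the zvol_open() code: the callback type, open_with_retry() and the succeed_after() helper are hypothetical, and the retry limit is an arbitrary parameter.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true once the lock was obtained. */
typedef bool (*trylock_fn)(void *ctx);

/* Try up to max_tries times; 0 on success, -1 if still contended. */
int open_with_retry(trylock_fn trylock, void *ctx, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (trylock(ctx))
			return (0);
		/* A real implementation would yield or sleep briefly here. */
	}
	return (-1);
}

/* Demo helper: fails until the counter behind ctx reaches zero. */
bool succeed_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return (true);
	(*remaining)--;
	return (false);
}
```

A caller that still sees -1 after the loop is in the position described above for vdevs layered on zvols, where an outer layer has to retry the open.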

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12759
2021-12-06 12:22:57 -08:00
John Wren Kennedy e9ee57f682 Temporarily remove tests from sanity runfile
With the addition of functionality to rerun failing tests, some
tests that fail only sometimes still fail often enough to degrade
the reliability of the sanity runs. Remove them from the runfile
until they reliably pass.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12814
2021-12-06 12:22:51 -08:00
Paul Dagnelie d346361515 Add zfs-test facility to automatically rerun failing tests
This was a project proposed as part of the Quality theme for the
hackathon for the 2021 OpenZFS Developer Summit. The idea is to improve
the usability of the automated tests that get run when a PR is created
by having failing tests automatically rerun in order to make flaky
tests less impactful.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12740
2021-12-06 12:22:43 -08:00
Coleman Kane 12d27e7134 Linux 5.16: wait_on_page_bit() no longer available to modules
Instead, linux/pagemap.h offers a number of folio-specific functions to
be called instead. In this case, module/os/linux/zfs/zfs_vnops_os.c
wants to call wait_on_page_bit(pp, PG_writeback). This gets replaced
with folio_wait_bit(page_folio(pp), PG_writeback). This change modifies
the code to conditionally compile that if configure identifies the
presence of the folio_wait_bit() function.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12800
2021-12-06 12:22:38 -08:00
Jorgen Lundman a1a29bf8fc Iterate encrypted clones at zvol_create_minor
Userland figures out which encryption-root keys are required to load,
and issues ZFS_IOC_LOAD_KEY.
The tail section of spa_keystore_load_wkey() will call
zvol_create_minors() on the encryption-root object.

Any clones of the encrypted zvol will not be plumbed. This commit
adds additional logic to detect if zvol has clones, and is encrypted,
then adds these to the list of zvols to call zvol_create_minors() on.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12471
2021-12-06 12:22:32 -08:00
Brian Behlendorf ea0dda5999 Exclude zfs_copies_003_pos on Linux
This test case may fail on 5.13 and newer Linux kernels if the
/dev/zvol/ device is not created by udev.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12738
2021-11-12 16:20:15 -08:00
Brian Behlendorf d7e640cf95 Restore dirty dnode detection logic
In addition to flushing memory mapped regions when checking holes,
commit de198f2d95 modified the dirty dnode detection logic to check
the dn->dn_dirty_records instead of the dn->dn_dirty_link.  Relying
on the dirty record has not been reliable, so switch back to the previous
method.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12745
2021-11-05 09:45:04 -07:00
Brian Behlendorf 664d487a5d Fix lseek(SEEK_DATA/SEEK_HOLE) mmap consistency
When using lseek(2) to report data/holes, memory mapped regions of
the file were ignored.  This could produce incorrect results.
To handle this zfs_holey_common() was updated to asynchronously
writeback any dirty mmap(2) regions prior to reporting holes.

Additionally, while not strictly required, the dn_struct_rwlock is
now held over the dirty check to prevent the dnode structure from
changing.  This ensures that a clean dnode can't be dirtied before
the data/hole is located.  The range lock is now also taken to
ensure the call cannot race with zfs_write().

Furthermore, the code was refactored to provide a dnode_is_dirty()
helper function which checks the dnode for any dirty records to
determine its dirtiness.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900
Closes #12724
2021-11-05 08:08:55 -07:00
Dimitri John Ledkov 5bf81fea2f Upgrade to libabigail 2.0.0
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
Closes #12722
Closes #12739
2021-11-05 07:59:40 -07:00
Tony Hutter 1fca958615 zed: Control NVMe fault LEDs
The ZED code currently can only turn on the fault LED for
a faulted disk in a JBOD enclosure.  This extends support
for faulted NVMe disks as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12648
Closes #12695
2021-11-05 07:51:21 -07:00
Brian Behlendorf 22b0891dbb Linux 5.16 compat: submit_bio()
The submit_bio() prototype has changed again.  The version in 5.16
still only expects a single argument but the return type has changed
to void.  Since we never used the returned value before, update the
configure check to detect both single arg versions.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-05 07:51:21 -07:00
Brian Behlendorf 0e537a0195 Linux 5.16 compat: linux/elevator.h
Commit https://github.com/torvalds/linux/commit/2e9bc346 moved
the elevator.h header under the block/ directory as part of some
refactoring.  This turns out not to be a problem since there's
no longer anything we need from the header.  This has been the
case for some time, so this change removes the elevator.h include
and replaces it with a major.h include.

Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12725
2021-11-05 07:51:21 -07:00
Tony Hutter db9e1c907a vdev_id: Fix PHY sorting
One of our developers noticed a bug in vdev_id where we were incorrectly
sorting PHYs using alphabetical sorting (which usually works) instead
of natural sorting (-v).  For example:

	[port-0:0]# ls -d phy*
	phy-0:10  phy-0:11  phy-0:8  phy-0:9

	[port-0:0]# ls -vd phy*
	phy-0:8  phy-0:9  phy-0:10  phy-0:11

This fixes the issue.
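
The difference between the two orderings can be reproduced with a small comparator. This is an illustrative C sketch of natural ordering, not the vdev_id script change (which relies on ls -v); natural_cmp() is a hypothetical helper.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two names, treating runs of digits as numbers.
 * Returns a negative, zero or positive value like strcmp().
 */
int natural_cmp(const char *a, const char *b)
{
	while (*a != '\0' && *b != '\0') {
		if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
			char *end_a, *end_b;
			unsigned long na = strtoul(a, &end_a, 10);
			unsigned long nb = strtoul(b, &end_b, 10);

			if (na != nb)
				return ((na < nb) ? -1 : 1);
			a = end_a;
			b = end_b;
		} else {
			if (*a != *b)
				return (((unsigned char)*a <
				    (unsigned char)*b) ? -1 : 1);
			a++;
			b++;
		}
	}
	/* The shorter string sorts first when one is a prefix. */
	return ((*a != '\0') - (*b != '\0'));
}
```

strcmp() places "phy-0:10" before "phy-0:8" because '1' sorts before '8', whereas natural_cmp() compares 10 and 8 as numbers and places it after.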

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12699
2021-11-02 16:31:17 -07:00
Tony Hutter d11b03ed81 vdev_id: Fix enclosure_symlinks feature
The vdev_id.conf "enclosure_symlinks" option persistently creates
and maps /dev/by-enclosure symlinks to dynamic /dev/sg* devices.

This patch fixes two issues:

1. The enclosure_symlinks feature was accidentally broken in:

   vdev_id: Support daisy-chained JBODs in multipath mode

2. Even when working, the feature numbered the enclosure
   sequentially rather than by HBA port number.  That meant that
   if a port was down or didn't appear in sysfs, then the
   enclosure_symlinks numbers would be numbered wrong.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12660
2021-11-02 16:31:11 -07:00
Tony Hutter 586b5d366e Rescan enclosure sysfs path on import
When you create a pool, zfs writes vd->vdev_enc_sysfs_path with the
enclosure sysfs path to the fault LEDs, like:

    vdev_enc_sysfs_path = /sys/class/enclosure/0:0:1:0/SLOT8

However, this enclosure path doesn't get updated on successive imports
even if enclosure path to the disk changes.  This patch fixes the issue.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #11950
Closes #12095
2021-11-02 16:31:05 -07:00
Ryan Moeller 27d9c6ae2b FreeBSD: Catch up with recent VFS changes
cn_thread is always curthread.

https://cgit.freebsd.org/src/commit?id=b4a58fbf640409a1e507d9f7b411c83a3f83a2f3
https://cgit.freebsd.org/src/commit?id=2b68eb8e1dbbdaf6a0df1c83b26f5403ca52d4c3

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Alan Somers <asomers@gmail.com>
Signed-off-by: Ryan Moeller <freqlabs@FreeBSD.org>
Closes #12668
2021-11-02 13:48:54 -07:00
Martin Matuška b7ecb4ff0d FreeBSD: fix compilation of FreeBSD world after 29274c9f6
prng32_bounded() is available to kernel only on FreeBSD 13+.

Call inline random_get_pseudo_bytes() with correct pointer type.
To be consistent, apply to Linux as well.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Martin Matuska <mm@FreeBSD.org>
Closes #12282
2021-11-02 13:35:47 -07:00
Brian Behlendorf af9aa4a216 ZTS: Standardize use of destroy_dataset in cleanup
When cleaning up a test case standardize on using the convention:

    datasetexists $ds && destroy_dataset $ds <flags>

By using 'destroy_dataset' instead of 'log_must zfs destroy' we ensure
that the destroy is retried in the event that a ZFS volume is busy.
This helps ensure tests are fully cleaned up and prevents false
positive test failures on Linux.

Note that all of the tests which used 'zfs destroy' in cleanup have
been updated even if they don't use volumes.  This was done to
clearly establish the expected convention.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12663
2021-11-02 09:51:32 -07:00
Rich Ercolani 55ab3773d7 Workaround cloud-init hotplug issue
cloud-init added a hook which triggers on every device add/rm
event, which results in holding open devices for a while after
they're created/destroyed.

So let's shove an exclusion rule for that into the GH workflows
until it gets fixed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12644
Closes #12669
2021-11-02 09:51:32 -07:00
Brian Behlendorf 143476ce8d Use fallthrough macro
As of the Linux 5.9 kernel a fallthrough macro has been added which
should be used to annotate all intentional fallthrough paths.  Once
all of the kernel code paths have been updated to use fallthrough
the -Wimplicit-fallthrough option will become the default.  To
avoid warnings in the OpenZFS code base when this happens apply
the fallthrough macro.
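
A userspace sketch of the annotation follows. The macro is deliberately given a hypothetical name, my_fallthrough, so it is not mistaken for the kernel's own definition; it assumes a compiler that provides __has_attribute and falls back to a no-op otherwise.

```c
/* Use the statement attribute where the compiler supports it. */
#if defined(__has_attribute)
#  if __has_attribute(__fallthrough__)
#    define my_fallthrough __attribute__((__fallthrough__))
#  endif
#endif
#ifndef my_fallthrough
#  define my_fallthrough ((void)0)	/* no-op on other compilers */
#endif

/* Counts how many cases run when entering the switch at `start`. */
int count_from(int start)
{
	int n = 0;

	switch (start) {
	case 0:
		n++;
		my_fallthrough;
	case 1:
		n++;
		my_fallthrough;
	case 2:
		n++;
		break;
	default:
		break;
	}
	return (n);
}
```

Each intentional fall into the next case is marked explicitly, so building with -Wimplicit-fallthrough stays quiet while an unmarked fallthrough would still be reported.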

Additional reading: https://lwn.net/Articles/794944/

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12441
2021-11-02 09:50:30 -07:00
Kevin Bowling d8a97a7be2 Detect HAVE_LARGE_STACKS at compile time (#12584)
Move HAVE_LARGE_STACKS definitions to header and set when appropriate.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Kevin Bowling <kbowling@FreeBSD.org>
Closes #12350
2021-11-01 14:56:18 -07:00
Rich Ercolani 8cd9f20a34 Correct a flaw in the Python 3 version checking (#12636)
It turns out the ax_python_devel.m4 version check assumes that
("3.X+1.0" >= "3.X.0") is True in Python, which is not the case when
X+1 is 10 or above and X is not. (Also presumably X+1=100 and ...)

So let's remake the check to behave consistently, using the
"packaging" or (if absent) the "distlib" modules.
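
The same pitfall exists in any language that compares version strings character by character. The sketch below shows it in C rather than Python, as an illustration only; version_cmp() is a hypothetical helper and is unrelated to the "packaging" or "distlib" modules.

```c
#include <stdio.h>
#include <string.h>

/*
 * Compare "major.minor.patch" strings numerically.
 * Returns a negative, zero or positive value like strcmp().
 * Missing components are treated as 0.
 */
int version_cmp(const char *a, const char *b)
{
	int va[3] = { 0, 0, 0 };
	int vb[3] = { 0, 0, 0 };

	sscanf(a, "%d.%d.%d", &va[0], &va[1], &va[2]);
	sscanf(b, "%d.%d.%d", &vb[0], &vb[1], &vb[2]);
	for (int i = 0; i < 3; i++) {
		if (va[i] != vb[i])
			return ((va[i] < vb[i]) ? -1 : 1);
	}
	return (0);
}
```

strcmp("3.10.0", "3.9.0") is negative because '1' sorts before '9', so a plain string comparison wrongly treats 3.10 as older than 3.9; the numeric comparison does not.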

(Also, update the Github workflows to use the new packages.)

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12073
2021-11-01 14:54:47 -07:00
Brian Behlendorf 71c6098526 Tag 2.1.1
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-09-15 13:37:50 -07:00
Brian Behlendorf a3da79d582 Linux 5.14 compat: META
Increase the Linux-Maximum version in the META file to 5.14.
All of the required compatibility patches have been merged
and the 5.14 kernel has been officially released.

Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12565
2021-09-15 13:24:34 -07:00
Arun KV bb80b4649a Fixed data integrity issue when underlying disk returns error
Errors in zil_lwb_write_done() are not propagated to
zil_lwb_flush_vdevs_done() which can result in zil_commit_impl()
not returning an error to applications even when zfs was not able
to write data to the disk.

Remove the ZIO_FLAG_DONT_PROPAGATE flag from zio_rewrite() to
allow errors to propagate and consolidate the error handling for
flush and write errors to a single location (rather than having
error handling split between the "write done" and "flush done"
handlers).

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Arun KV <arun.kv@datacore.com>
Closes #12391
Closes #12443
2021-09-14 15:45:30 -07:00
Brian Behlendorf 7816a6b85b ZTS: Waiting for zvols to be available
This is a follow up patch for PR #12515 which addresses some
additional ZTS tests which are unreliable and should explicitly
wait for the required zvols to be available.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: @Theo13111
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12553
2021-09-14 15:45:11 -07:00
Brian Behlendorf 9183321501 Verify embedded blkptr's in arc_read()
The block pointer verification check in arc_read() should also
cover embedded block pointers.  While highly unlikely, accessing
a damaged block pointer can result in panic.  To further harden
the code extend the existing check to include embedded block
pointers and add a comment explaining the rationale for this
sanity check.  Lastly, correct a flaw in zfs_blkptr_verify()
so the error count is checked even when checking an untrusted
config to verify the non-pool-specific portions of a block
pointer.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12535
2021-09-14 15:43:18 -07:00
Brian Behlendorf 32512acbc0 Linux 5.15 compat: get_acl()
Kernel commits

332f606b32b6 ovl: enable RCU'd ->get_acl()
0cad6246621b vfs: add rcu argument to ->get_acl() callback

Added compatibility code to detect the new ->get_acl() interface
and correctly handle the case where the new rcu argument is set.

Reviewed-by: Coleman Kane <ckane@colemankane.org>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12548
2021-09-14 15:42:59 -07:00
Allan Jude cea0752f8d Allow sending corrupt snapshots even if metadata is corrupted
When zfs_send_corrupt_data is set, use the TRAVERSE_HARD flag,
so traverse_visitbp() will not fail with ECKSUM if a blockpointer
cannot be read, but rather will continue and send the objects it can.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Sponsored-By: Klara Inc.
Sponsored-By: WHC Online Solutions Inc.
Closes #12541
2021-09-14 15:42:49 -07:00
Rich Ercolani 7d70f1e099 arc: Drop an incorrect assert
Unfortunately, there was an overzealous assertion that was (in pretty
specific circumstances) false, causing failure.  This assertion was
added in error, so we're removing it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #9897
Closes #12020
Closes #12246
2021-09-14 15:42:33 -07:00
Paul Dagnelie fd92825445 Compressed receive with different ashift can result in incorrect PSIZE on disk
We round up the psize to the nearest multiple of the asize or to the
lsize, whichever is smaller. Once that's done, we allocate a new
buffer of the appropriate size, zero the tail, and copy the data
into it. This adds a small performance cost to these kinds of writes,
but fixes the bookkeeping problems.
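
The rounding rule in the first sentence can be written out as a short helper. This is a sketch under the assumption that "asize" here means the allocation unit implied by the destination's ashift (for example 4096 for ashift 12); rounded_psize() is a hypothetical name, not a function from the ZFS source.

```c
#include <stdint.h>

/*
 * Round psize up to a multiple of the allocation unit, but never
 * beyond the logical size of the block.
 */
uint64_t rounded_psize(uint64_t psize, uint64_t asize, uint64_t lsize)
{
	uint64_t up = ((psize + asize - 1) / asize) * asize;

	return ((up < lsize) ? up : lsize);
}
```

For example, a 1536-byte compressed payload received into a 4096-byte allocation unit with a 128K logical size becomes 4096 bytes, with the tail zeroed as described above.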

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Co-authored-by: Matthew Ahrens <matthew.ahrens@delphix.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #12522
Closes #8462
2021-09-14 15:42:17 -07:00
Alexander 7bf68e9806 Linux 5.15 compat: standalone <linux/stdarg.h>
Kernel commits

39f75da7bcc8 ("isystem: trim/fixup stdarg.h and other headers")
c0891ac15f04 ("isystem: ship and use stdarg.h")
564f963eabd1 ("isystem: delete global -isystem compile option")

(for now can be found in linux-next.git tree, will land into the
 Linus' tree during the ongoing 5.15 cycle with one of akpm merges)

removed the -isystem flag and disallowed the inclusion of any
compiler header files. They also introduced a minimal
<linux/stdarg.h> as a replacement for <stdarg.h>.
include/os/linux/spl/sys/cmn_err.h in the ZFS source tree includes
<stdarg.h> unconditionally. Introduce a test for <linux/stdarg.h>
and include it instead of the compiler's one to prevent module
build breakage.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #12531
2021-09-14 15:42:01 -07:00
Brian Behlendorf ad8dc99ed2 Linux 5.15 compat: block device readahead
The 5.15 kernel moved the backing_dev_info structure out of
the request queue structure which causes a build failure.

Rather than look in the new location for the BDI we instead
detect this upstream refactoring by the existence of either
the blk_queue_update_readahead() or disk_update_readahead()
functions.  In either case, there's no longer any reason to
manually set the ra_pages value since it will be overridden
with a reasonable default (2x the block size) when
blk_queue_io_opt() is called.

Therefore, we update the compatibility wrapper to do nothing
for 5.9 and newer kernels.  While it's tempting to do the
same for older kernels we want to keep the compatibility
code to preserve the existing behavior.  Removing it would
effectively increase the default readahead to 128k.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12532
2021-09-14 15:41:42 -07:00
Don Brady 6ca1f30708 Detect iSCSI in the zpool cmd vdev media script
Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Don Brady <don.brady@delphix.com>
Closes #12206
2021-09-14 15:40:52 -07:00
George Melikov e16e05c9cf CI: don't install abigail-tools
We use docker image instead.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-09-14 15:40:36 -07:00
George Melikov 5331e2d216 Update ABI files via new libabigail version
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-09-14 15:40:12 -07:00
George Melikov d6dae00982 Libabigail: make .abi files more consistent
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-09-14 15:38:55 -07:00
George Melikov 993d4b28af CI: use fresh libabigail via docker image
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-09-14 15:13:33 -07:00
George Melikov 004e7d3f9a Check for libabigail version
We need to use version 1.8.0 or newer; older versions
may segfault and give inconsistent results.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12529
2021-09-14 15:13:11 -07:00
Ryan Moeller aef8a72afe ZTS: Remove exceptions for flaky zhack on FreeBSD
Issue #11854 has been resolved, so we can remove the exceptions for it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12527
2021-09-14 15:12:14 -07:00
Ryan Moeller 81611683c8 FreeBSD: Don't remove SA xattr if not SA znode
We attempt to remove an existing SA xattr when setting a dir xattr, but
this only makes sense if the znode has been upgraded to the SA format.
Otherwise, we will hit an assert in zfs_sa_get_xattr.

Make sure this is an SA znode before attempting to remove the SA xattr.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12514
2021-09-14 15:11:56 -07:00
Rich Ercolani 72a989cf60 Fix cross-endian interoperability of zstd
It turns out that layouts of union bitfields are a pain, and the
current code results in an inconsistent layout between BE and LE
systems, leading to zstd-active datasets on one erroring out on
the other.

Switch everyone over to the LE layout, and add compatibility code
to read both.
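
A minimal sketch of the general technique, not the actual ZFS change: rather than relying on the compiler-defined layout of union bitfields, pack the fields with explicit shifts and masks and serialize the word in a fixed (little-endian) byte order. The field widths (24-bit version, 8-bit level) and the names `hdr_pack`, `hdr_store_le` and `hdr_load_le` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header word: version in the low 24 bits, level in the top 8. */
uint32_t
hdr_pack(uint32_t version, uint8_t level)
{
    assert(version <= 0xFFFFFFu);
    return ((version & 0xFFFFFFu) | ((uint32_t)level << 24));
}

uint32_t
hdr_version(uint32_t raw)
{
    return (raw & 0xFFFFFFu);
}

uint8_t
hdr_level(uint32_t raw)
{
    return ((uint8_t)(raw >> 24));
}

/* Store the word least-significant byte first, whatever the host order is. */
void
hdr_store_le(uint8_t out[4], uint32_t raw)
{
    out[0] = (uint8_t)(raw & 0xFF);
    out[1] = (uint8_t)((raw >> 8) & 0xFF);
    out[2] = (uint8_t)((raw >> 16) & 0xFF);
    out[3] = (uint8_t)((raw >> 24) & 0xFF);
}

/* Rebuild the word from little-endian bytes on any host. */
uint32_t
hdr_load_le(const uint8_t in[4])
{
    return ((uint32_t)in[0] | ((uint32_t)in[1] << 8) |
        ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24));
}
```

Because only shifts and byte stores are used, the on-disk bytes are identical on BE and LE hosts; reading a legacy layout would need an additional, separate decode path.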

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12008
Closes #12022
2021-09-14 15:05:55 -07:00
Brian Behlendorf 6bb6410570 ZTS: Waiting for zvols to be available
The ZTS block_device_wait helper function should use -e when waiting
for a file to appear since it will be either a block special device
or a symlink.  This didn't cause any failures but when a device path
was specified the function would wait longer than needed.

Additionally update the most flaky test cases to pass the file path
to block_device_wait to try and improve the test reliability.  The
udev behavior on Fedora in particular can result in frequent false
positives.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12515
2021-09-14 14:37:50 -07:00
Ryan Moeller 6c3c7dc846 Correct checking bdev_check_media_change message
We're not looking for bdev_disk_changed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12492
2021-09-14 14:36:42 -07:00
Tony Hutter 2904ec57f0 Make 'zpool labelclear -f' work on offlined disks
This patch allows you to clear the label on offlined disks in an active
pool with `-f`.  Previously, labelclear wouldn't let you do that.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes #12511
2021-09-14 14:36:37 -07:00
Anton Gubarkov bc371b2806 vdev_id: Return an error if config file is not found
Signed-off-by: Anton Gubarkov <anton.gubarkov@gmail.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
2021-09-14 14:36:32 -07:00
Sam Hathaway e78d06f89b zpool-remove.8: describe top-level vdev sector size limitation
Document that top-level vdevs cannot be removed unless all top-level
vdevs have the same sector size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Sam Hathaway <sam@sam-hathaway.com>
Closes #11339
Closes #12472
2021-09-14 14:32:16 -07:00
Mark Johnston 2016d7fb9c Initialize parity blocks before RAID-Z reconstruction benchmarking
benchmark_raidz() allocates a row to benchmark parity calculation and
reconstruction.  In the latter case, the parity blocks are left
uninitialized, leading to reports from KMSAN.

Initialize parity blocks to 0xAA as we do for the data earlier in the
function.  This does not affect the selected RAID-Z implementation on
any of several systems tested.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12473
2021-09-14 14:32:16 -07:00
Ryan Moeller 584b7a214e ZTS: Add tests for creation time
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12432
2021-09-14 14:32:16 -07:00
Richard Yao 1655ce5619 Linux 4.11 compat: statx support
Linux 4.11 added a new statx system call that allows us to expose crtime
as btime. We do this by caching crtime in the znode to match how atime,
ctime and mtime are cached in the inode.

statx also introduced a new way of reporting whether the immutable,
append and nodump bits have been set. It adds support for reporting
compression and encryption, but the semantics on other filesystems are
not just to report compression/encryption, but to allow it to be turned
on/off at the file level. We do not support that.

We could implement semantics where we refuse to allow user modification
of the bit, but we would need to do a dnode_hold() in zfs_znode_alloc()
to find out encryption/compression information. That would introduce
locking that will have a minor (although unmeasured) performance cost.
It also would be inferior to zdb, which reports far more detailed
information. We therefore omit reporting of encryption/compression
through statx in favor of recommending that users interested in such
information use zdb.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #8507
2021-09-14 14:31:50 -07:00
Gordon Bergling 5de6e4ec94 zfs.4: Fix typo s/compatiblity/compatibility/
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Gordon Bergling <gbergling@googlemail.com>
Closes #12464
2021-09-14 14:31:50 -07:00
Alexander Motin a4862125b8 Remove b_pabd/b_rabd allocation from arc_hdr_alloc()
When a header is allocated for full overwrite it is a waste of time
to allocate b_pabd/b_rabd for it, since arc_write() will free them
without ever being touched.  If it is a read or a partial overwrite
then arc_read() and arc_hdr_decrypt() allocate them explicitly.

Reduced memory allocation in user threads also reduces ARC eviction
throttling there, proportionally increasing it in ZIO threads, which
is not good.  To minimize or even avoid it, introduce an ARC allocation
reserve, allowing certain arc_get_data_abd() callers to allocate a
bit longer in situations where user threads will already throttle.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12398
2021-09-14 14:31:50 -07:00
Alexander Motin 61773f41b8 Optimize arc_l2c_only lists assertions
It is very expensive and not informative to call multilist_is_empty()
for each arc_change_state() on debug builds to check for impossible.
Instead implement special index function for arc_l2c_only->arcs_list,
multilists, panicking on any attempt to use it.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12421
2021-09-14 14:31:22 -07:00
Alexander Motin 40e02f49e9 Fix/improve dbuf hits accounting
Instead of clearing stats inside arc_buf_alloc_impl() do it inside
arc_hdr_alloc() and arc_release().  It fixes statistics being wiped
every time a new dbuf is filled from the ARC.

Remove b_l1hdr.b_l2_hits. L2ARC hits are accounted at b_l2hdr.b_hits.
Since the hits are accounted under hash lock, replace atomics with
simple increments.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12422
2021-09-14 14:31:22 -07:00
Alexander Motin c600f0687f Avoid vq_lock drop in vdev_queue_aggregate()
vq_lock is already too congested for two more operations per I/O.
Instead of dropping and reacquiring it inside vdev_queue_aggregate()
delegate the zio_vdev_io_bypass() and zio_execute() calls for parent
I/Os to callers, which drop the lock anyway to execute the new I/O.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12297
2021-09-14 14:31:22 -07:00
Alexander Motin 5afc35b698 Use more atomics in refcounts
Use atomic_load_64() for zfs_refcount_count() to prevent torn reads
on 32-bit platforms.  On 64-bit ones it should not change anything.

When built with ZFS_DEBUG but running without tracking enabled use
atomics instead of mutexes same as for builds without ZFS_DEBUG.
Since rc_tracked can't change live we can check it without lock.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12420
2021-09-14 14:31:01 -07:00
Ryan Moeller c6c0d30016 ZTS: Avoid unset $tmpdir in redacted_panic
The redacted_send tests make use of a $tmpdir variable, except in
redacted_send/redacted_panic the variable is never defined.

Use $TEST_BASE_DIR instead.

Clean up the stream file after the test.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12455
2021-09-14 14:31:01 -07:00
Allan Jude 24e51e3749 Restore FreeBSD sysctl processing for arc.min and arc.max
Before OpenZFS 2.0, trying to set the FreeBSD sysctl vfs.zfs.arc_max
to a disallowed value would return an error.
Since the switch, it instead only generates WARN_IF_TUNING_IGNORED.

Keep the ability to set the sysctls specifically to 0, even though
that is less than the minimum, because some tests depend on this.

Also lost was the ability to set vfs.zfs.arc_max to a value less
than the default vfs.zfs.arc_min at boot time. Restore this as well.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes #12161
2021-09-14 14:31:01 -07:00
Ryan Moeller 744f3009fc zfs: add missed dependency of zfs module on zlib
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Martin Matuska <mm@FreeBSD.org>
Co-authored-by: Konstantin Belousov <kib@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
External-issue: https://reviews.freebsd.org/D31207
Closes #12442
2021-09-14 14:30:39 -07:00
Ryan Moeller cacc48702b Add zfs.sh -r flag to reload modules
zfs.sh already can load and unload, so why not both?

This is convenient when developing changes to the module and you want
to rapidly make some changes, rebuild the module, reload the module,
and test the changes.

Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12450
2021-09-14 14:30:39 -07:00
Ryan Moeller cc55271681 Fix usage of find in tests/Makefile.am
The path is not optional on FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12453
2021-09-14 14:30:39 -07:00
Tony Nguyen 477edd642c Run arc_evict thread at higher priority
Run the arc_evict thread at a higher priority, nice=0, to give it more
CPU time, which can improve performance for workloads with high ARC
eviction activity.

On mixed read/write and sequential read workloads, I've seen between
10-40% better performance.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Tony Nguyen <tony.nguyen@delphix.com>
Closes #12397
2021-09-14 14:30:13 -07:00
Rich Ercolani 23184b172a Make get_key_material_file fail more verbosely
It turns out, there are a lot of possible reasons for fopen to fail.
Let's share which reason we failed for today.
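
A minimal sketch of the idea, assuming a hypothetical helper `open_key_file` (this is not the real `get_key_material_file` code): capture `errno` right after `fopen()` fails and include `strerror()` in the message so the caller sees why the open failed.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open a key file for reading.  On failure, return
 * NULL and write a message naming the path and the errno reason to errbuf.
 */
FILE *
open_key_file(const char *path, char *errbuf, size_t errlen)
{
    assert(path != NULL && errbuf != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL) {
        int err = errno;    /* save before any other call can clobber it */
        (void) snprintf(errbuf, errlen,
            "failed to open key material file '%s': %s",
            path, strerror(err));
    }
    return (f);
}
```

The exact wording of the message is an assumption; the point is only that the errno-derived reason is reported instead of a generic failure.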

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12410
2021-09-14 14:30:13 -07:00
Brian Behlendorf 32a971e749 Enable /proc/diskstats for zvols
The /proc/diskstats accounting needs to be explicitly enabled
for block devices which do not use multi-queue.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12440
Closes #12066
2021-09-14 14:30:13 -07:00
George Melikov c07ed69577 Man zpool-scrub.8: describe sequential scrub
Describe sequential scrub and add examples of scrub status.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12429
2021-09-14 14:29:46 -07:00
hedongzhang ddb732e2c8 Modify checksum obtain method of QAT
The CpaDcGeneratefooter function that obtains the checksum does not
support the CPA_DC_STATELESS mode, so we get the adler32 checksum
for the end of the zlib stream from dc_results.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Chengfei Zhu <chengfeix.zhu@intel.com>
Signed-off-by: hedong.zhang <h_d_zhang@163.com>
Closes #12343
2021-09-14 14:29:46 -07:00
Mark Johnston 451d6da988 Allow disabling of unmapped I/O on FreeBSD
We have a tunable which permits one to disable the use of unmapped I/O
for the buffer cache.  Respect it in ZFS as well.  This is useful for
KMSAN, which cannot easily maintain shadow state for unmapped pages.

No functional change intended, as unmapped I/O is permitted by default
and there's no real reason to disable it in practice except for
debugging.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12446
2021-09-14 14:29:46 -07:00
Alexander Motin e298ac5d04 Add comment on metaslab_class_throttle_reserve() locking
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Issue #12314
Closes #12419
2021-09-14 13:09:40 -07:00
John Wren Kennedy 9429910781 Assorted fixes for the performance tests
- Bail out early if we're running the perf tests and forget to
  specify disks.
- Allow perf tests to run with any number of disks.
- Remove weekly vs. nightly settings
- Move variables with common values to perf.shlib
- Use zinject to clear the ARC over export/import
- Fix dbuf cache size calculation

When the meaning of `dbuf_cache_max_bytes` changed, the performance
test that covers the dbuf cache started to fail. The test would try to
write files for the test using the max possible size of the cache,
inevitably filling the pool and failing. This change uses
`dbuf_cache_shift` to correctly calculate the dbuf cache size.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: John Kennedy <john.kennedy@delphix.com>
Closes #12408
2021-09-14 13:09:24 -07:00
Matthew Ahrens 8a969f3e2d Read past end of argv array in zpool_do_import()
`zpool_do_import()` passes `argv[0]`, (optionally) `argv[1]`, and
`pool_specified` to `import_pools()`.  If `pool_specified==FALSE`, the
`argv[]` arguments are not used.  However, these values may be off the
end of the `argv[]` array, so loading them could dereference unmapped
memory.  This error is reported by the asan build:

```
=================================================================
==6003==ERROR: AddressSanitizer: heap-buffer-overflow
READ of size 8 at 0x6030000004a8 thread T0
    #0 0x562a078b50eb in zpool_do_import zpool_main.c:3796
    #1 0x562a078858c5 in main zpool_main.c:10709
    #2 0x7f5115231bf6 in __libc_start_main
    #3 0x562a07885eb9 in _start

0x6030000004a8 is located 0 bytes to the right of 24-byte region
allocated by thread T0 here:
    #0 0x7f5116ac6b40 in __interceptor_malloc
    #1 0x562a07885770 in main zpool_main.c:10699
    #2 0x7f5115231bf6 in __libc_start_main
```

This commit passes NULL for these arguments if they are off the end
of the `argv[]` array.
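
The bounds check described above can be sketched as follows. The helper name `arg_or_null` is hypothetical and not taken from `zpool_main.c`; it only illustrates returning NULL instead of loading a slot past the end of the array.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return argv[i] when i is a valid index into an
 * array of argc entries, otherwise NULL, so callers never read past
 * the end of argv[].
 */
const char *
arg_or_null(int argc, char **argv, int i)
{
    assert(argc >= 0);
    return ((i >= 0 && i < argc) ? argv[i] : NULL);
}
```

A caller would then pass `arg_or_null(argc, argv, 0)` and `arg_or_null(argc, argv, 1)` rather than indexing `argv` directly.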

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #12339
2021-09-14 13:08:53 -07:00
Václav Skála 898b1e173c Add missing properties to zfs allow manpage
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Václav Skála <skala@vshosting.cz>
Closes #12402
2021-09-14 13:08:19 -07:00
George Amanakis 406534f807 Fixes in persistent L2ARC
In l2arc_add_vdev() first decide whether the device is eligible for
L2ARC rebuild or whole device trim and then add it to the list of cache
devices. Otherwise l2arc_feed_thread() might already start writing on
the device invalidating previous content as l2ad_hand = l2ad_start.
However l2arc_rebuild_vdev() needs the device present in the cache
device list to figure out its l2arc_dev_t. Fix this by moving most of
l2arc_rebuild_vdev() in a new function l2arc_rebuild_dev() which does
not need to search in the cache device list.

In contrast to l2arc_add_vdev() we do not have to worry about
l2arc_feed_thread() invalidating previous content when onlining a
cache device. The device parameters (l2ad*) are not cleared when
offlining the device and writing new buffers will not invalidate
all previous content. In the worst case only buffers that have not had
their log block written to the device will be lost.

Retire persist_l2arc_00{4,5,8} tests since they cover code already
covered by the remaining ones. Test persist_l2arc_006 is renamed to
persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005.

Fix a typo in persist_l2arc_004, and remove an assertion that is not
always true from l2arc_arcstats_pos. Also update an assertion in
persist_l2arc_005 and explain why in a comment.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #12365
2021-09-14 13:07:44 -07:00
Mark Johnston ac573e3105 Initialize dn_next_type[] in the dnode constructor
It seems nothing ensures that this array is zeroed when a dnode is
freshly allocated, so in principle it retains the values from the
previous allocation.  In practice it seems to be the case that the
fields should end up zeroed, but we can zero the field anyway for
consistency.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-14 13:07:44 -07:00
Mark Johnston 99df200ffc Zero pad bytes following TX_WRITE log data
When logging a TX_WRITE record in the case where file data has to be
copied from the DMU, we pad the log record size to a multiple of 8
bytes.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.
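
A minimal sketch of the pattern, with hypothetical names (`roundup8`, `record_fill`) rather than the real ZIL code: round the payload length up to a multiple of 8 and explicitly zero the bytes between the payload and the padded end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round n up to the next multiple of 8. */
size_t
roundup8(size_t n)
{
    return ((n + 7) & ~(size_t)7);
}

/*
 * Hypothetical helper: copy len payload bytes into rec (capacity cap),
 * zero the padding up to the 8-byte boundary, and return the padded size.
 */
size_t
record_fill(unsigned char *rec, size_t cap, const void *payload, size_t len)
{
    size_t padded = roundup8(len);

    assert(padded <= cap);
    memcpy(rec, payload, len);
    /* Without this, whatever was in rec[len..padded) would be written out. */
    memset(rec + len, 0, padded - len);
    return (padded);
}
```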

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-14 12:42:21 -07:00
Mark Johnston bd910fdeb0 Zero pad bytes when allocating a ZIL record
When allocating a record, we round up the allocation size to a multiple
of 8.  In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-14 12:42:21 -07:00
Mark Johnston 9cc9821014 Initialize all fields in zfs_log_xvattr()
When logging TX_SETATTR, we could otherwise fail to initialize part of
the corresponding ZIL record depending on which fields are present in
the xvattr.  Initialize the creation time and the AV scan timestamp to
zero so that uninitialized bytes are not written to the ZIL.

This was found using KMSAN.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-14 12:42:21 -07:00
Mark Johnston fceda40c1e Initialize "autoreplace" in spa_ld_get_props()
spa_prop_find() may fail to find the specified property, in which case
it suppresses ENOENT from zap_lookup().  In this case, the return value
is left uninitialized, so spa_autoreplace was being initialized using an
uninitialized stack variable.

This was found using KMSAN.  It appears to be a regression from commit
9eb7b46ed0, which removed the initialization of "autoreplace" from the
definition.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mark Johnston <markj@FreeBSD.org>
Closes #12383
2021-09-14 12:41:10 -07:00
Coleman Kane 4434baab11 Linux 5.14 compat: explicitly assign set_page_dirty
Kernel 5.14 introduced a change where set_page_dirty of
struct address_space_operations is no longer implicitly set to
__set_page_dirty_buffers(), which ended up resulting in a NULL
pointer deref in the kernel when it is attempted to be called.
This change sets .set_page_dirty in the structure to
__set_page_dirty_nobuffers(), which was introduced with the
related patch set. The breaking change was introduced in commit
0af573780b0b13fceb7fabd49dc1b073cee9a507 to torvalds/linux.git.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes #12427
2021-09-14 12:41:10 -07:00
Rich Ercolani 6385f4e70e Fix unfortunate NULL in spa_update_dspace
After 1325434b, we can in certain circumstances end up calling
spa_update_dspace with vd->vdev_mg NULL, which ends poorly during
vdev removal.

So let's not do that further space adjustment when we can't.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12380
Closes #12428
2021-09-14 12:41:10 -07:00
Brian Behlendorf 2f073cc9c6 Linux 5.14 compat: blk_alloc_disk()
In Linux 5.14, blk_alloc_queue is no longer exported, and its usage
has been superseded by blk_alloc_disk, which returns a gendisk struct
from which we can still retrieve the struct request_queue* that is
needed in the one place where it is used. This also replaces the call
to alloc_disk(minors), and minors is now set via struct member
assignment.

Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Coleman Kane <ckane@colemankane.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #12362
Closes #12409
2021-09-14 12:40:45 -07:00
Ryan Moeller 729eb48666 zloop: Add a max iterations option, use default run/pass times
It is useful to have control over the number of iterations of zloop so
we can easily produce "x core dumps found *in y iterations*" metrics.

Using random values for run/pass times doesn't improve coverage in a
meaningful way.

Randomizing run time could be seen as a compromise between running a
greater variety of shorter tests versus a smaller variety of longer
tests within a fixed time span.  However, it is not desirable when
running a fixed number of iterations.

Pass time already incorporates randomness within ztest.

Either parameter can be passed to ztest explicitly if the defaults are
not satisfactory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12411
2021-09-14 12:40:45 -07:00
Alexander Motin 93e11e257b FreeBSD: Ignore make_dev_s() errors
Since errors returned by zvol_create_minor_impl() are ignored by the
common code, it is more convenient to ignore make_dev_s() errors there.
It allows, for example, to get device created for the zvol after later
rename instead of having it further stuck in half-created state.
zvol_rename_minor() already ignores those errors.

While there, switch from MAXPHYS to maxphys in FreeBSD 13+.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12375
2021-09-14 12:40:45 -07:00
Jorgen Lundman eaa10257ca Remove old orig_fd variable from zfs send
Possibly required in the past, but it currently serves no purpose.
Ordinarily such tiny cleanup is not generally worth it, however
on the macOS port, in a future commit, we do unspeakable things to the
"fd" for send/recv, and it would be easier to only have to deal with
one "fd" instead of two.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12404
2021-09-14 12:40:16 -07:00
Alexander Motin 32c0b6468c Optimize allocation throttling
Remove mc_lock use from metaslab_class_throttle_*().  The math there
is based on refcounts and so atomic, so the only race possible there
is between zfs_refcount_count() and zfs_refcount_add().  But in most
cases metaslab_class_throttle_reserve() is called with the allocator
lock held, which covers the race.  In cases where the lock is not
held, GANG_ALLOCATION() or METASLAB_MUST_RESERVE are set, and so we
do not use zfs_refcount_count().  And even if we assume some other
non-existing scenario, the worst that may happen from this race is
that a few more I/Os get to allocation earlier, which is not a problem.

Move locks and data of different allocators into different cache
lines to avoid false sharing.  Group spa_alloc_* arrays together
into single array of aligned struct spa_alloc spa_allocs.  Align
struct metaslab_class_allocator.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12314
2021-09-14 12:40:15 -07:00
George Melikov 7c61e1ef9d CI: generate ABI files if changed
So commit author can just download them as
artifacts and commit.

Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12379
2021-09-14 12:40:15 -07:00
Alexander Motin 6a49948c73 Minor ARC optimizations
Remove unneeded global, practically constant, state pointer variables
(arc_anon, arc_mru, etc.), replacing them with macros of real state
variables addresses (&ARC_anon, &ARC_mru, etc.).

Change ARC_EVICT_ALL from -1ULL to UINT64_MAX, not requiring special
handling in inner loop of ARC reclamation.  Respectively change bytes
argument of arc_evict_state() from int64_t to uint64_t.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12348
2021-09-14 12:39:48 -07:00
Jorgen Lundman 4dfb698aac dmu_redact.c does not call bqueue_destroy
Ensure all calls to bqueue_init() have a corresponding call to bqueue_destroy()

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Co-authored-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12118
2021-09-14 12:39:48 -07:00
Alexander 4affa09f3e A few fixes of callback typecasting (for the upcoming ClangCFI)
* zio: avoid callback typecasting
* zil: avoid zil_itxg_clean() callback typecasting
* zpl: decouple zpl_readpage() into two separate callbacks
* nvpair: explicitly declare callbacks for xdr_array()
* linux/zfs_nvops: don't use external iput() as a callback
* zcp_synctask: don't use fnvlist_free() as a callback
* zvol: don't use ops->zv_free() as a callback for taskq_dispatch()

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Closes #12260
2021-09-14 12:39:48 -07:00
Ryan Moeller 0ca9558561 Remove unused fields from zvol_task_t
We don't use or need the pool name or value source in the zvol tasks.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #12361
2021-09-14 12:39:17 -07:00
Alexander Motin c2c4d05700 FreeBSD: Switch from MAXPHYS to maxphys on FreeBSD 13+
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12378
2021-09-14 12:39:17 -07:00
George Melikov f8c2e91db5 zpool_influxdb: fix -Werror=stringop-truncation
Use strlcpy instead of problematic strncpy
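
To illustrate why this matters: `strncpy` leaves the destination without a terminating NUL when the source does not fit, which is what the compiler warning flags. The sketch below is a portable stand-in with `strlcpy`-like semantics; the name `copy_bounded` is hypothetical and is used only because `strlcpy` is not available in every libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in with strlcpy-like semantics: copy at most
 * dstsize - 1 bytes, always NUL-terminate when dstsize > 0, and return
 * strlen(src) so callers can detect truncation (result >= dstsize).
 */
size_t
copy_bounded(char *dst, const char *src, size_t dstsize)
{
    assert(dst != NULL && src != NULL);

    size_t srclen = strlen(src);
    if (dstsize != 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return (srclen);
}
```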

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Elling <Richard.Elling@RichardElling.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: George Melikov <mail@gmelikov.ru>
Closes #12344
2021-09-14 12:39:17 -07:00
Rich Ercolani 056c273939 Correct zfs-send(8) on readonly sends
zfs-send(8) claimed in the flags list you could use -pR when sending
a readonly filesystem or volume. You cannot.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12336
2021-09-14 12:38:51 -07:00
Alexander Motin ba76bb30a6 Introduce dsl_dir_diduse_transfer_space()
Most of dsl_dir_diduse_space() and dsl_dir_transfer_space() CPU time
is a dd_lock overhead and time spent in dmu_buf_will_dirty(). Calling
them one after another is a waste of time and even more contention.
Doing that twice for each rewritten block within dbuf_write_done()
via dsl_dataset_block_kill() and dsl_dataset_block_born() created one
of the biggest CPU overheads in case of small blocks rewrite.

dsl_dir_diduse_transfer_space() combines functionality of these two
functions for cases where it is needed, but without double overhead,
practically for the cost of dsl_dir_diduse_space() or even cheaper.

While there, optimize dsl_dir_phys() calls in dsl_dir_diduse_space()
and dsl_dir_transfer_space().  It seems Clang detects some aliasing
there, repeating dd->dd_dbuf->db_data dereference multiple times,
increasing dd_lock scope and contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Author: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12300
2021-09-14 12:38:51 -07:00
наб 968dc13572 config/libatomic: require -latomic iff atomic.c doesn't link w/o it
In absence of LTO, and dynamic libatomic, la.so ends up in the needs
section of every toolchain executable; some consider this an issue.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12345
Closes #12359
2021-09-14 12:38:51 -07:00
Rich Ercolani 960a5a557b Tinker with slop space accounting with dedup
* Tinker with slop space accounting with dedup

Do not include the deduplicated space usage in the slop space
reservation, it leads to surprising outcomes.

* Update spa_dedup_dspace sometimes

Sometimes, we get into spa_get_slop_space() with
spa_dedup_dspace=~0ULL, AKA "unset", while spa_dspace is correctly set.

So call the code to update it before we use it if we hit that case.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #12271
2021-09-14 12:38:05 -07:00
Alexander Motin 45305a067f Fix ARC ghost states eviction accounting
arc_evict_hdr() returns number of evicted bytes in scope of specific
state.  For ghost states it does not mean the amount of really freed
memory, but the logical buffer size.  It is correct for the eviction
process, but not for waking up threads waiting for ARC size reduction,
as added in "Revise ARC shrinker algorithm" commit, causing premature
wakeups while ARC is still overflowed, allowing even bigger overflow,
plus processing overhead when next allocation will also get blocked,
probably also for too short time.

To fix that make arc_evict_hdr() also return the amount of really
freed memory, which for the ghost states is only the header, and use
it to update arc_evict_count instead.  Originally I was thinking to
not return it at all, since arc_get_data_impl() does not account for
the headers, but decided that some slow allocation progress is better
than long waits, reaching on my tests up to 100ms.

To reduce negative latency effects of long time periods when reclaim
thread can free little real memory, start reclamation process earlier,
before we actually reached the overflow threshold, when we have to
throttle new allocations.  We can also do it without taking global
arc_evict_lock, reducing the contention.

Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12279
2021-09-14 12:38:05 -07:00
Brian Behlendorf a5e68f0478 Update bug report template
- Remove the "SPL Version" line, the repositories have been merged
  since the 0.8 release and we no longer need to ask about this.

- Simply ask for the kernel version / patch level and add a hint
  about how to get this information on Linux and FreeBSD.

- Remove "Status: Triage Needed" from the template, in practice
  we really haven't been using this label so let's stop setting it.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes: #12340
2021-09-14 12:38:05 -07:00
George Wilson 8415c3c170 file reference counts can get corrupted
Callers of zfs_file_get and zfs_file_put can corrupt the reference
counts for the file structure resulting in a panic or a soft lockup.
When zfs send/recv runs, it will add a reference count to the
open file, and begin to send or recv the stream. If the file descriptor
is closed, then when dmu_recv_stream() or dmu_send() return we will
call zfs_file_put to remove the reference we placed on the file
structure. Unfortunately, because zfs_file_put() uses the file
descriptor to lookup the file structure, it may end up finding that
the file descriptor table no longer contains the file struct, thus
leaking the file structure. Or it might end up finding a file
descriptor for a different file and blindly updating its reference
counts. Other failure modes probably exist.

This change reworks the zfs_file_[get|put] interface to not rely
on the file descriptor but instead pass the zfs_file_t pointer around.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: George Wilson <gwilson@delphix.com>
External-issue: DLPX-76119
Closes #12299
2021-09-14 12:37:38 -07:00
Jorgen Lundman 04ebe29188 dprintf_dnode: strcpy -> strlcpy
Missed a couple of strcpy() in earlier commit, this is only used with
--enable-debug.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12311
2021-09-14 12:37:38 -07:00
Jorgen Lundman a0b4da2297 Replace strchrnul() with strrchr()
Could have gone either way with this one, either adding it to
macOS/Windows SPL, or returning it to "classic" usage with strrchr().
Since the new special way isn't really used, and only used once,
we have this commit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12312
2021-09-14 12:37:38 -07:00
Alexander Motin c84670950a FreeBSD: Use unmapped I/O for scattered/gang ABD buffers
Many FreeBSD disk drivers support "unmapped" I/O mode, in which the data
buffer is represented not by a virtually contiguous KVA-mapped address
range, but by a list of physical memory pages.  Originally it was
designed to do I/O from buffers without KVA mapping (unmapped).  But
taking virtual addresses out of the equation allows us to operate even
on non-contiguous data buffers with one condition: all buffer
discontinuities must be aligned to memory page borders.

When doing I/O to a capable GEOM device, this patch traverses
non-linear ABD buffers, validating the chunk borders.  If the condition
is met, it supplies GEOM with the list of original physical memory
pages instead of copying the data into a temporary contiguous buffer.
On capable hardware on pools with ashift=12 and default ABD chunk of
4KB it should handle all the I/O without additional memory copying.
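
The alignment condition can be illustrated with a small shell sketch. It is
not the GEOM or ABD code: chunks_page_aligned, the ADDR:LEN argument format,
and the 4096-byte PAGE_SIZE are all assumptions made for illustration. It
reads "discontinuities must be aligned" as: every chunk except the last must
end on a page border, and every chunk except the first must start on one.

```shell
PAGE_SIZE=4096   # assumed page size, for illustration only

# Hypothetical helper, usage: chunks_page_aligned ADDR:LEN [ADDR:LEN ...]
# Succeeds (status 0) when every internal chunk boundary is page aligned.
chunks_page_aligned() {
    _n=$#
    _i=1
    for _chunk in "$@"; do
        _addr=${_chunk%%:*}
        _len=${_chunk##*:}
        # Every chunk but the first must start on a page border.
        if [ "$_i" -gt 1 ] && [ $(( _addr % PAGE_SIZE )) -ne 0 ]; then
            return 1
        fi
        # Every chunk but the last must end on a page border.
        if [ "$_i" -lt "$_n" ] && [ $(( (_addr + _len) % PAGE_SIZE )) -ne 0 ]; then
            return 1
        fi
        _i=$(( _i + 1 ))
    done
    return 0
}

if chunks_page_aligned 512:3584 8192:4096 12288:100; then
    echo "pass page list through"
else
    echo "copy into contiguous buffer"
fi
```

In this sketch the first chunk may start mid-page and the last may end
mid-page; only the joints between chunks are checked.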

Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12320
2021-09-14 12:37:02 -07:00
Alexander Motin 49bb454120 FreeBSD: Hardcode abd_chunk_size to PAGE_SIZE
It makes no sense to set it below PAGE_SIZE, since it increases all
overheads and makes returning memory to OS problematic.  It makes no
sense to set it above PAGE_SIZE, since such allocations and especially
frees are too expensive and cause KVA fragmentation to benefit from
fewer chunks.  After that it makes no sense to keep more complicated
math here.

What may make sense, though, is a tunable border between linear and
scatter ABDs, previously also controlled by this tunable.  Retain that
functionality by taking abd_scatter_min_size tunable from Linux, just
with different default value.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12328
2021-09-14 12:36:44 -07:00
Alexander Motin 41b33dce44 Move gethrtime() calls out of vdev queue lock
This dramatically reduces the lock contention on systems with slower
(non-TSC) timecounters.  With TSC the difference is minimal, but since
this lock is pretty congested, any improvement counts.  Plus I don't
see any reason to do it under the lock other than the latency of the
lock itself, which this change actually reduces.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12281
2021-09-14 12:35:53 -07:00
Justin Gottula dab147d65a Use substantially more robust program exit status logic in zvol_id
Currently, there are several places in zvol_id where the program logic
returns particular errno values, or even particular ioctl return values,
as the program exit status, rather than a straightforward system of
explicit zero on success and explicit nonzero value(s) on failure.

This is problematic for multiple reasons. One particularly interesting
problem that can arise, is that if any of these values happens to have
all 8 least significant bits unset (i.e., it is a positive or negative
multiple of 256), then although the C program sees a nonzero int value
(presumed to be a failure exit status), the actual exit status as seen
by the system is only the bottom 8 bits of that integer: zero.

This can happen in practice, and I have encountered it myself. In a
particularly weird situation, the zvol_open code in the zfs kernel
module was behaving in such a manner that it caused the open() syscall
to fail and for errno to be set to a kernel-private value (ERESTARTSYS,
which happens to be defined as 512). It turns out that 512 is evenly
divisible by 256; or, in other words, its least significant 8 bits are
all-zero. So even though zvol_id believed it was returning a nonzero
(failure) exit status of 512, the system modulo'd that value by 256,
resulting in the actual exit status visible by other programs being 0!
This actually-zero (non-failure) exit status caused problems: udev
believed that the program was operating successfully, when in fact it
was attempting to indicate failure via a nonzero exit status integer.
Combined with another problem, this led to the creation of nonsense
symlinks for zvol dev nodes by udev.
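
The truncation described above can be reproduced with plain shell
arithmetic. This is a minimal sketch, not part of zvol_id; visible_status
is a hypothetical helper that keeps only the low 8 bits of a value, which
is what a parent process gets to see of an exit status.

```shell
# Hypothetical helper: print the exit status other programs would observe
# if a process called exit() with the given integer.
visible_status() {
    echo $(( $1 & 255 ))
}

visible_status 512   # ERESTARTSYS, the value from this commit message: prints 0
visible_status 1     # EXIT_FAILURE: prints 1
```

Any multiple of 256 collapses to 0 in the same way, which is why returning
a fixed EXIT_FAILURE is the safer convention.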

Let's get rid of all this problematic logic, and simply return
EXIT_SUCCESS (0) if everything went fine, and EXIT_FAILURE (1) if
anything went wrong.

Additionally, let's clarify some of the variable names (error is similar
to errno, etc) and clean up the overall program flow a bit.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-09-14 12:23:38 -07:00
Justin Gottula 7138fe7205 Print zvol_id error messages to stderr rather than stdout
The zvol_id program is invoked by udev, via a PROGRAM key in the
60-zvol.rules.in rule file, to determine the "pretty" /dev/zvol/*
symlink paths that should be generated for each opaquely named
/dev/zd* dev node.

The udev rule uses the PROGRAM key, followed by a SYMLINK+= assignment
containing the %c substitution, to collect the program's stdout and then
"paste" it directly into the name of the symlink(s) to be created.

Unfortunately, as currently written, zvol_id outputs both its intended
output (a single string representing the symlink path that should be
created to refer to the name of the dataset whose /dev/zd* path is
given) AND its error messages (if any) to stdout.

When processing PROGRAM keys (and others, such as IMPORT{program}), udev
uses only the data written to stdout for functional purposes. Any data
written to stderr is used solely for the purposes of logging (if udev's
log_level is set to debug).

The unintended consequence of this is as follows: if zvol_id encounters
an error condition; and then udev fails to halt processing of the
current rule (either because zvol_id didn't return a nonzero exit
status, or because the PROGRAM key in the rule wasn't written properly
to result in a "non-match" condition that would stop the current rule on
a nonzero exit); then udev will create a space-delimited list of symlink
names derived directly from the words of the error message string!

I've observed this exact behavior on my own system, in a situation where
the open() syscall on /dev/zd* dev nodes was failing sporadically (for
reasons that aren't especially relevant here). Because the open() call
failed, zvol_id printed "Unable to open device file: /dev/zd736\n" to
stdout and then exited.

The udev rule finished with SYMLINK+="zvol/%c %c". Assuming a volume
name like pool/foo/bar, this would ordinarily expand to
   SYMLINK+="zvol/pool/foo/bar pool/foo/bar"
and would cause symlinks to be created like this:
   /dev/zvol/pool/foo/bar -> /dev/zd736
   /dev/pool/foo/bar      -> /dev/zd736

But because of the combination of error messages being printed to
stdout, and the udev syntax freely accepting a space-delimited sequence
of names in this context, the error message string
   "Unable to open device file: /dev/zd736\n"
in reality expanded to
   SYMLINK+="zvol/Unable to open device file: /dev/zd736"
which caused the following symlinks to actually be created:
   /dev/zvol/Unable -> /dev/zd736
   /dev/to          -> /dev/zd736
   /dev/open        -> /dev/zd736
   /dev/device      -> /dev/zd736
   /dev/file:       -> /dev/zd736
   /dev//dev/zd736  -> /dev/zd736

(And, because multiple zvols had open() syscall errors, multiple zvols
attempted to claim several of those symlink names, resulting in numerous
udev errors and timeouts and general chaos.)
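
The splitting behavior can be imitated in a few lines of shell. This is only
a simulation of the effect described above, not udev's implementation;
expand_symlink_value is a hypothetical helper that takes a SYMLINK value
after %c substitution and prints one /dev path per space-delimited name.

```shell
# Hypothetical helper: split a SYMLINK value on whitespace, the way the
# commit message describes udev doing, and print the resulting link paths.
expand_symlink_value() {
    _value=$1
    set -f                      # disable globbing for the unquoted split
    for _name in $_value; do
        printf '/dev/%s\n' "$_name"
    done
    set +f
}

# Intended case: stdout holds the volume name.
expand_symlink_value "zvol/pool/foo/bar pool/foo/bar"

# Failure case: stdout holds the error message instead.
expand_symlink_value "zvol/Unable to open device file: /dev/zd736"
```

The second call prints the six bogus paths listed above, from
/dev/zvol/Unable through /dev//dev/zd736.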

This commit rectifies all this silliness by simply printing error
messages to stderr, as Dennis Ritchie originally intended.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-09-14 12:23:38 -07:00
Justin Gottula fd2e4d143d Udev rules: use match (==) rather than assign (=) for PROGRAM
Assignment syntax (=) can be used for the PROGRAM key. But the PROGRAM
key is really a match key, not an assign key. The internal logic used by
udev to decide whether a PROGRAM key "matched" or not (which determines
whether the remainder of the rule is evaluated) depends on whether the
operator was OP_MATCH (==) or OP_NOMATCH (!=). [1]

The man page claims that '"=", ":=", and "+=" have the same effect as
"=="' for PROGRAM keys. And, after a brief perusal, the udev source code
does seem to confirm that operators other than OP_MATCH (==) or
OP_NOMATCH (!=) are implicitly converted to OP_MATCH (==). [2] But it's
not entirely clear that this is definitely the case: anecdotal testing
seems to indicate that when OP_ASSIGN (=) is used, the program's exit
status is disregarded and the remainder of the rule is processed
regardless of whether it was, in fact, a successful exit.

The bottom line here is that, if zvol_id hits some snag and returns a
nonzero exit status, then we almost certainly do NOT want to continue on
with the rule and use whatever the stdout contents may have been to
mindlessly create /dev/zvol/* symlinks. Therefore, let's be extra-sure
and use the match (==) operator explicitly, to eliminate any possibility
that udev might do the wrong thing, and ensure that a nonzero exit
status will definitely short-circuit the rest of the rule, bypassing the
SYMLINK+= assignments.
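
The intended matching behavior, as the man page and notes [1] and [2]
describe it, can be sketched in shell. This models the documented logic
only; program_key_matches is a hypothetical helper, and the anecdotal
behavior mentioned above for the assign operator is deliberately not
modeled.

```shell
# Hypothetical helper, usage: program_key_matches OPERATOR EXIT_STATUS
# Succeeds (status 0) when the PROGRAM key counts as a match, so the
# remainder of the rule would be evaluated.
program_key_matches() {
    _op=$1
    _rc=$2
    case $_op in
        '=='|'!=') ;;
        *) _op='==' ;;   # other operators are converted to match, per [2]
    esac
    if [ "$_rc" -eq 0 ]; then
        [ "$_op" = '==' ]
    else
        [ "$_op" = '!=' ]
    fi
}

if program_key_matches '==' 1; then
    echo "rule continues"
else
    echo "rule stops"
fi
```

With the explicit match operator and a nonzero exit status the sketch
prints "rule stops", which is the short-circuit this commit wants to
guarantee.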

[1]
udev,
 file src/udev/udev-rules.c,
  func udev_rule_apply_token_to_event,
   switch case TK_M_PROGRAM if r != 0 (nonzero exit status):
      return token->op == OP_NOMATCH;
   switch case TK_M_PROGRAM if r == 0 (zero exit status):
      return token->op == OP_MATCH;
   func retval 0 => key is considered to have matched
   func retval 1 => key is considered to have NOT matched

[2]
udev,
 file src/udev/udev-rules.c,
  func parse_token,
   at func start:
      bool is_match = IN_SET(op, OP_MATCH, OP_NOMATCH);
   in else-if case streq(key, "PROGRAM"):
      if (!is_match) op = OP_MATCH;

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-09-14 12:23:10 -07:00
Justin Gottula 0cb122941e Udev rules: replace deprecated $tempnode with $devnode
The $tempnode substitution is so old that it's not even mentioned in the
man page anymore. It is still technically supported by udev, but with
plenty of "deprecated" comments surrounding it.

The preferred modern equivalent of $tempnode is $devnode (or
alternatively, %N).

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-09-14 12:23:10 -07:00
Justin Gottula c20ba9bd7a Udev rules: use non-ancient comma syntax
This file is old as dirt. It's entirely possible that commas were
optional in udev back at that time. But they're definitely supposed to
be there nowadays.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Justin Gottula <justin@jgottula.com>
Closes #12302
2021-09-14 12:23:10 -07:00
Alexander Motin 15177c1aac Compact dbuf/buf hashes and lock arrays
With default dbuf cache size of 1/32 of ARC, it makes no sense to have
hash table of the same size (or even bigger on Linux).  Reduce it to
1/8 of ARC's one, still leaving some slack, assuming higher I/O rate
via dbuf cache than via ARC.

Remove padding from ARC hash locks array.  The idea behind padding
is to avoid false sharing between locks.  It would make sense if
there were a limited number of very busy locks.  But since we
have no limit on the number, using the same memory for more locks we
can achieve even lower lock contention with the same false sharing,
or we can use less memory for the same contention level.

Reduce number of hash locks from 8192 to 2048.  The number is still
big enough to not cause contention, but reduced memory size improves
cache hit rate for mutex_tryenter() in ARC eviction thread, saving
about 1% of the thread time.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes #12289
2021-09-14 12:22:46 -07:00
Jorgen Lundman 035219ee10 Fix abd leak, kmem_free correct size of abd_t
Fix a leak of abd_t that manifested mostly when using
raidzN with at least as many columns as N (e.g. a
four-disk raidz2 but not a three-disk raidz2).
Sufficiently heavy raidz use would eventually run a system
out of memory.

Additionally:

* Switch abd_cache arena to FIRSTFIT, which empirically
improves performance.

* Make abd_chunk_cache more performant and debuggable.

* Allocate the abd_zero_buf from abd_chunk_cache rather
than the heap.

* Don't try to reap non-existent qcaches in abd_cache arena.

* KM_PUSHPAGE->KM_SLEEP when allocating chunks from their
own arena

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Co-authored-by: Sean Doran <smd@use.net>
Closes #12295
2021-09-14 12:22:28 -07:00
Jorgen Lundman 2334bc4efa Upstream: dmu_zfetch_stream_fini leaks refcount
dmu_zfetch_stream_fini() is missing calls to destroy the refcounts,
leaking them and the mutex inside.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes #12294
2021-09-14 12:21:55 -07:00
Ryan Moeller d6c2b89032 ZED: Match added disk by pool/vdev GUID if found (#12217)
This enables ZED to auto-online vdevs that are not wholedisk managed by
ZFS.

Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2021-09-14 12:10:44 -07:00
Alexander Motin f3969ea78b Optimize small random numbers generation
In all places except two spa_get_random() is used for small values,
and the consumers do not require well seeded high quality values.
Switch those two exceptions directly to random_get_pseudo_bytes()
and optimize spa_get_random(), renaming it to random_in_range(),
since it is not related to SPA or ZFS in general.

On FreeBSD directly map random_in_range() to new prng32_bounded() KPI
added in FreeBSD 13.  On Linux and in user-space just reduce the type
used to uint32_t to avoid more expensive 64bit division.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes #12183
2021-09-14 12:10:17 -07:00
Ryan Moeller 6fe6192796 FreeBSD: Implement xattr=sa
FreeBSD historically has not cared about the xattr property; it was
always treated as xattr=on.  With xattr=on, xattrs are stored as files
in a hidden xattr directory.  With xattr=sa, xattrs are stored as
system attributes and get cached in nvlists during xattr operations.
This makes SA xattrs simpler and more efficient to manipulate.  FreeBSD
needs to implement the SA xattr operations for feature parity with
Linux and to ensure that SA xattrs are accessible when migrated or
replicated from Linux.

Following the example set by Linux, refactor our existing extattr vnops
to split off the parts handling dir style xattrs, and add the
corresponding SA handling parts.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11997
2021-09-14 12:09:35 -07:00
Ryan Moeller 1826068523 FreeBSD: Clean up ASSERT/VERIFY use in module
Convert use of ASSERT() to ASSERT0(), ASSERT3U(), ASSERT3S(),
ASSERT3P(), and likewise for VERIFY().  In some cases it ended up
making more sense to change the code, such as VERIFY on nvlist
operations that I have converted to use fnvlist instead.  In one
place I changed an internal struct member from int to boolean_t to
match its use.  Some asserts that combined multiple checks with &&
in a single assert have been split to separate asserts, to make it
apparent which check fails.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes #11971
2021-09-14 12:02:23 -07:00
660 changed files with 24743 additions and 15724 deletions
+9 -7
@@ -2,7 +2,7 @@
name: Bug report
about: Create a report to help us improve OpenZFS
title: ''
labels: 'Type: Defect, Status: Triage Needed'
labels: 'Type: Defect'
assignees: ''
---
@@ -25,14 +25,16 @@ Type | Version/Name
--- | ---
Distribution Name |
Distribution Version |
Linux Kernel |
Kernel Version |
Architecture |
ZFS Version |
SPL Version |
OpenZFS Version |
<!--
Commands to find ZFS/SPL versions:
modinfo zfs | grep -iw version
modinfo spl | grep -iw version
Command to find OpenZFS version:
zfs version
Commands to find kernel version:
uname -r # Linux
freebsd-version -r # FreeBSD
-->
### Describe the problem you're observing
+17 -3
@@ -6,7 +6,7 @@ on:
jobs:
checkstyle:
runs-on: ubuntu-18.04
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
with:
@@ -18,7 +18,7 @@ jobs:
sudo apt-get install --yes -qq zlib1g-dev uuid-dev libattr1-dev libblkid-dev libselinux-dev libudev-dev libssl-dev python-dev python-setuptools python-cffi python3 python3-dev python3-setuptools python3-cffi
# packages for tests
sudo apt-get install --yes -qq parted lsscsi ksh attr acl nfs-kernel-server fio
sudo apt-get install --yes -qq mandoc cppcheck pax-utils devscripts abigail-tools
sudo apt-get install --yes -qq mandoc cppcheck pax-utils devscripts
sudo -E pip --quiet install flake8
- name: Prepare
run: |
@@ -32,5 +32,19 @@ jobs:
run: |
make lint
- name: CheckABI
id: CheckABI
run: |
make checkabi
sudo docker run -v $(pwd):/source ghcr.io/openzfs/libabigail make checkabi
- name: StoreABI
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
sudo docker run -v $(pwd):/source ghcr.io/openzfs/libabigail make storeabi
- name: Prepare artifacts
if: failure() && steps.CheckABI.outcome == 'failure'
run: |
find -name *.abi | tar -cf abi_files.tar -T -
- uses: actions/upload-artifact@v2
if: failure() && steps.CheckABI.outcome == 'failure'
with:
name: New ABI files (use only if you're sure about interface changes)
path: abi_files.tar
+16 -3
@@ -26,7 +26,8 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev pamtester python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python3 python3-dev python3-setuptools python3-cffi python3-packaging \
libcurl4-openssl-dev
- name: Autogen.sh
run: |
sh autogen.sh
@@ -44,6 +45,17 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround for cloud-init bug
# see https://github.com/openzfs/zfs/issues/12644
FILE=/lib/udev/rules.d/10-cloud-init-hook-hotplug.rules
if [ -r "${FILE}" ]; then
HASH=$(md5sum "${FILE}" | awk '{ print $1 }')
if [ "${HASH}" = "121ff0ef1936cd2ef65aec0458a35772" ]; then
# Just shove a zd* exclusion right above the hotplug hook...
sudo sed -i -e s/'LABEL="cloudinit_hook"'/'KERNEL=="zd*", GOTO="cloudinit_end"\n&'/ "${FILE}"
sudo udevadm control --reload-rules
fi
fi
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
@@ -52,7 +64,8 @@ jobs:
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G
/usr/share/zfs/zfs-tests.sh -vR -s 3G
timeout-minutes: 330
- name: Prepare artifacts
if: failure()
run: |
@@ -61,7 +74,7 @@ jobs:
sudo cp /var/log/syslog $RESULTS_PATH/
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find $RESULTS_PATH -name '*:*'); do mv "$f" "${f//:/__}"; done
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
if: failure()
with:
+16 -3
@@ -22,7 +22,8 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev pamtester python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python3 python3-dev python3-setuptools python3-cffi python3-packaging \
libcurl4-openssl-dev
- name: Autogen.sh
run: |
sh autogen.sh
@@ -40,6 +41,17 @@ jobs:
sudo sed -i.bak 's/updates/extra updates/' /etc/depmod.d/ubuntu.conf
sudo depmod
sudo modprobe zfs
# Workaround for cloud-init bug
# see https://github.com/openzfs/zfs/issues/12644
FILE=/lib/udev/rules.d/10-cloud-init-hook-hotplug.rules
if [ -r "${FILE}" ]; then
HASH=$(md5sum "${FILE}" | awk '{ print $1 }')
if [ "${HASH}" = "121ff0ef1936cd2ef65aec0458a35772" ]; then
# Just shove a zd* exclusion right above the hotplug hook...
sudo sed -i -e s/'LABEL="cloudinit_hook"'/'KERNEL=="zd*", GOTO="cloudinit_end"\n&'/ "${FILE}"
sudo udevadm control --reload-rules
fi
fi
# Workaround to provide additional free space for testing.
# https://github.com/actions/virtual-environments/issues/2840
sudo rm -rf /usr/share/dotnet
@@ -48,7 +60,8 @@ jobs:
sudo rm -rf "$AGENT_TOOLSDIRECTORY"
- name: Tests
run: |
/usr/share/zfs/zfs-tests.sh -v -s 3G -r sanity
/usr/share/zfs/zfs-tests.sh -vR -s 3G -r sanity
timeout-minutes: 330
- name: Prepare artifacts
if: failure()
run: |
@@ -57,7 +70,7 @@ jobs:
sudo cp /var/log/syslog $RESULTS_PATH/
sudo chmod +r $RESULTS_PATH/*
# Replace ':' in dir names, actions/upload-artifact doesn't support it
for f in $(find $RESULTS_PATH -name '*:*'); do mv "$f" "${f//:/__}"; done
for f in $(find /var/tmp/test_results -name '*:*'); do mv "$f" "${f//:/__}"; done
- uses: actions/upload-artifact@v2
if: failure()
with:
+3 -3
@@ -22,8 +22,8 @@ jobs:
xfslibs-dev libattr1-dev libacl1-dev libudev-dev libdevmapper-dev \
libssl-dev libffi-dev libaio-dev libelf-dev libmount-dev \
libpam0g-dev \
python-dev python-setuptools python-cffi \
python3 python3-dev python3-setuptools python3-cffi
python-dev python-setuptools python-cffi python-packaging \
python3 python3-dev python3-setuptools python3-cffi python3-packaging
- name: Autogen.sh
run: |
sh autogen.sh
@@ -45,7 +45,7 @@ jobs:
run: |
sudo mkdir -p $TEST_DIR
# run for 20 minutes to have a total runner time of 30 minutes
sudo /usr/share/zfs/zloop.sh -t 1200 -l -m1
sudo /usr/share/zfs/zloop.sh -t 1200 -l -m1 -- -T 120 -P 60
- name: Prepare artifacts
if: failure()
run: |
+2 -2
@@ -1,10 +1,10 @@
Meta: 1
Name: zfs
Branch: 1.0
Version: 2.1.0
Version: 2.1.3
Release: 1
Release-Tags: relext
License: CDDL
Author: OpenZFS
Linux-Maximum: 5.13
Linux-Maximum: 5.16
Linux-Minimum: 3.10
+13 -2
@@ -129,10 +129,21 @@ SHELLCHECKDIRS = cmd contrib etc scripts tests
SHELLCHECKSCRIPTS = autogen.sh
PHONY += checkabi storeabi
checkabi: lib
checklibabiversion:
libabiversion=`abidw -v | $(SED) 's/[^0-9]//g'`; \
if test $$libabiversion -lt "200"; then \
/bin/echo -e "\n" \
"*** Please use libabigail 2.0.0 version or newer;\n" \
"*** otherwise results are not consistent!\n" \
"(or see https://github.com/openzfs/libabigail-docker )\n"; \
exit 1; \
fi;
checkabi: checklibabiversion lib
$(MAKE) -C lib checkabi
storeabi: lib
storeabi: checklibabiversion lib
$(MAKE) -C lib storeabi
PHONY += mancheck
+1 -1
@@ -12,7 +12,7 @@ This repository contains the code for running OpenZFS on Linux and FreeBSD.
* [Documentation](https://openzfs.github.io/openzfs-docs/) - for using and developing this repo
* [ZoL Site](https://zfsonlinux.org) - Linux release info & links
* [Mailing lists](https://openzfs.github.io/openzfs-docs/Project%20and%20Community/Mailing%20Lists.html)
* [OpenZFS site](http://open-zfs.org/) - for conference videos and info on other platforms (illumos, OSX, Windows, etc)
* [OpenZFS site](https://openzfs.org/) - for conference videos and info on other platforms (illumos, OSX, Windows, etc)
# Installation
+35 -14
@@ -246,13 +246,6 @@ main(int argc, char **argv)
}
}
if (verbose)
(void) fprintf(stdout, gettext("mount.zfs:\n"
" dataset: \"%s\"\n mountpoint: \"%s\"\n"
" mountflags: 0x%lx\n zfsflags: 0x%lx\n"
" mountopts: \"%s\"\n mtabopts: \"%s\"\n"),
dataset, mntpoint, mntflags, zfsflags, mntopts, mtabopt);
if (mntflags & MS_REMOUNT) {
nomtab = 1;
remount = 1;
@@ -275,7 +268,10 @@ main(int argc, char **argv)
return (MOUNT_USAGE);
}
zfs_adjust_mount_options(zhp, mntpoint, mntopts, mtabopt);
if (!zfsutil || sloppy ||
libzfs_envvar_is_set("ZFS_MOUNT_HELPER")) {
zfs_adjust_mount_options(zhp, mntpoint, mntopts, mtabopt);
}
/* treat all snapshots as legacy mount points */
if (zfs_get_type(zhp) == ZFS_TYPE_SNAPSHOT)
@@ -293,12 +289,11 @@ main(int argc, char **argv)
if (zfs_version == 0) {
fprintf(stderr, gettext("unable to fetch "
"ZFS version for filesystem '%s'\n"), dataset);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_SYSERR);
}
zfs_close(zhp);
libzfs_fini(g_zfs);
/*
* Legacy mount points may only be mounted using 'mount', never using
* 'zfs mount'. However, since 'zfs mount' actually invokes 'mount'
@@ -316,6 +311,8 @@ main(int argc, char **argv)
"Use 'zfs set mountpoint=%s' or 'mount -t zfs %s %s'.\n"
"See zfs(8) for more information.\n"),
dataset, mntpoint, dataset, mntpoint);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_USAGE);
}
@@ -326,14 +323,38 @@ main(int argc, char **argv)
"Use 'zfs set mountpoint=%s' or 'zfs mount %s'.\n"
"See zfs(8) for more information.\n"),
dataset, "legacy", dataset);
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_USAGE);
}
if (verbose)
(void) fprintf(stdout, gettext("mount.zfs:\n"
" dataset: \"%s\"\n mountpoint: \"%s\"\n"
" mountflags: 0x%lx\n zfsflags: 0x%lx\n"
" mountopts: \"%s\"\n mtabopts: \"%s\"\n"),
dataset, mntpoint, mntflags, zfsflags, mntopts, mtabopt);
if (!fake) {
error = mount(dataset, mntpoint, MNTTYPE_ZFS,
mntflags, mntopts);
if (zfsutil && !sloppy &&
!libzfs_envvar_is_set("ZFS_MOUNT_HELPER")) {
error = zfs_mount_at(zhp, mntopts, mntflags, mntpoint);
if (error) {
(void) fprintf(stderr, "zfs_mount_at() failed: "
"%s", libzfs_error_description(g_zfs));
zfs_close(zhp);
libzfs_fini(g_zfs);
return (MOUNT_SYSERR);
}
} else {
error = mount(dataset, mntpoint, MNTTYPE_ZFS,
mntflags, mntopts);
}
}
zfs_close(zhp);
libzfs_fini(g_zfs);
if (error) {
switch (errno) {
case ENOENT:
@@ -368,7 +389,7 @@ main(int argc, char **argv)
"mount the filesystem again.\n"), dataset);
return (MOUNT_SYSERR);
}
/* fallthru */
fallthrough;
#endif
default:
(void) fprintf(stderr, gettext("filesystem "
+13 -9
@@ -140,7 +140,8 @@ Usage: vdev_id [-h]
-p number of phy's per switch port [default=$PHYS_PER_PORT]
-h show this summary
EOF
exit 0
exit 1
# exit with error to avoid processing usage message by a udev rule
}
map_slot() {
@@ -374,7 +375,7 @@ sas_handler() {
i=$((i + 1))
done
PHY=$(ls -d "$port_dir"/phy* 2>/dev/null | head -1 | awk -F: '{print $NF}')
PHY=$(ls -vd "$port_dir"/phy* 2>/dev/null | head -1 | awk -F: '{print $NF}')
if [ -z "$PHY" ] ; then
PHY=0
fi
@@ -595,7 +596,9 @@ enclosure_handler () {
# DEVPATH=/sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/subsystem/devices/0:0:0:0/scsi_generic/sg0
# Get the enclosure ID ("0:0:0:0")
ENC=$(basename $(readlink -m "/sys/$DEVPATH/../.."))
ENC="${DEVPATH%/*}"
ENC="${ENC%/*}"
ENC="${ENC##*/}"
if [ ! -d "/sys/class/enclosure/$ENC" ] ; then
# Not an enclosure, bail out
return
@@ -615,14 +618,15 @@ enclosure_handler () {
# The PCI directory is two directories up from the port directory
# /sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0
PCI_ID_LONG=$(basename $(readlink -m "/sys/$PORT_DIR/../.."))
PCI_ID_LONG="$(readlink -m "/sys/$PORT_DIR/../..")"
PCI_ID_LONG="${PCI_ID_LONG##*/}"
# Strip down the PCI address from 0000:05:00.0 to 05:00.0
PCI_ID=$(echo "$PCI_ID_LONG" | sed -r 's/^[0-9]+://g')
PCI_ID="${PCI_ID_LONG#[0-9]*:}"
# Name our device according to vdev_id.conf (like "L0" or "U1").
NAME=$(awk '/channel/{if ($1 == "channel" && $2 == "$PCI_ID" && \
$3 == "$PORT_ID") {print ${4}int(count[$4])}; count[$4]++}' $CONFIG)
NAME=$(awk "/channel/{if (\$1 == \"channel\" && \$2 == \"$PCI_ID\" && \
\$3 == \"$PORT_ID\") {print \$4\$3}}" $CONFIG)
echo "${NAME}"
}
@@ -673,7 +677,7 @@ alias_handler () {
link=$(echo "$link" | sed 's/p[0-9][0-9]*$//')
fi
# Check both the fully qualified and the base name of link.
for l in $link $(basename "$link") ; do
for l in $link ${link##*/} ; do
if [ ! -z "$l" ]; then
alias=$(awk -v var="$l" '($1 == "alias") && \
($3 == var) \
@@ -728,7 +732,7 @@ done
if [ ! -r "$CONFIG" ] ; then
echo "Error: Config file \"$CONFIG\" not found"
exit 0
exit 1
fi
if [ -z "$DEV" ] && [ -z "$ENCLOSURE_MODE" ] ; then
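The enclosure_handler hunks above replace `basename`/`sed` pipelines with POSIX parameter expansion. A minimal sketch of those expansions follows, using the example sysfs path from the comment in that hunk. The helper names `enc_from_devpath` and `short_pci_id` are hypothetical and not part of vdev_id. Note that the string-only version does not resolve symlinks, unlike the `readlink -m` call it replaces for ENC.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the expansions used in the hunks above.

# Drop the last two path components, then keep the final one.
enc_from_devpath() {
	enc="${1%/*}"      # strip the trailing "/sg0"
	enc="${enc%/*}"    # strip the trailing "/scsi_generic"
	echo "${enc##*/}"  # keep only the last component
}

# Strip the leading PCI domain from a long PCI address.
short_pci_id() {
	echo "${1#[0-9]*:}"  # shortest prefix matching: digit, anything, colon
}

enc_from_devpath "/sys/devices/pci0000:00/0000:00:03.0/0000:05:00.0/host0/subsystem/devices/0:0:0:0/scsi_generic/sg0"
short_pci_id "0000:05:00.0"
```

Run as written, this prints `0:0:0:0` and then `05:00.0`.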
+6 -4
@@ -2218,7 +2218,8 @@ snprintf_zstd_header(spa_t *spa, char *blkbuf, size_t buflen,
(void) snprintf(blkbuf + strlen(blkbuf),
buflen - strlen(blkbuf),
" ZSTD:size=%u:version=%u:level=%u:EMBEDDED",
zstd_hdr.c_len, zstd_hdr.version, zstd_hdr.level);
zstd_hdr.c_len, zfs_get_hdrversion(&zstd_hdr),
zfs_get_hdrlevel(&zstd_hdr));
return;
}
@@ -2242,7 +2243,8 @@ snprintf_zstd_header(spa_t *spa, char *blkbuf, size_t buflen,
(void) snprintf(blkbuf + strlen(blkbuf),
buflen - strlen(blkbuf),
" ZSTD:size=%u:version=%u:level=%u:NORMAL",
zstd_hdr.c_len, zstd_hdr.version, zstd_hdr.level);
zstd_hdr.c_len, zfs_get_hdrversion(&zstd_hdr),
zfs_get_hdrlevel(&zstd_hdr));
abd_return_buf_copy(pabd, buf, BP_GET_LSIZE(bp));
}
@@ -4094,7 +4096,7 @@ cksum_record_compare(const void *x1, const void *x2)
const cksum_record_t *l = (cksum_record_t *)x1;
const cksum_record_t *r = (cksum_record_t *)x2;
int arraysize = ARRAY_SIZE(l->cksum.zc_word);
int difference;
int difference = 0;
for (int i = 0; i < arraysize; i++) {
difference = TREE_CMP(l->cksum.zc_word[i], r->cksum.zc_word[i]);
@@ -4571,7 +4573,7 @@ dump_path_impl(objset_t *os, uint64_t obj, char *name, uint64_t *retobj)
case DMU_OT_DIRECTORY_CONTENTS:
if (s != NULL && *(s + 1) != '\0')
return (dump_path_impl(os, child_obj, s + 1, retobj));
/*FALLTHROUGH*/
fallthrough;
case DMU_OT_PLAIN_FILE_CONTENTS:
if (retobj != NULL) {
*retobj = child_obj;
+34 -6
@@ -640,6 +640,27 @@ devid_iter(const char *devid, zfs_process_func_t func, boolean_t is_slice)
return (data.dd_found);
}
/*
* Given a device guid, find any vdevs with a matching guid.
*/
static boolean_t
guid_iter(uint64_t pool_guid, uint64_t vdev_guid, const char *devid,
zfs_process_func_t func, boolean_t is_slice)
{
dev_data_t data = { 0 };
data.dd_func = func;
data.dd_found = B_FALSE;
data.dd_pool_guid = pool_guid;
data.dd_vdev_guid = vdev_guid;
data.dd_islabeled = is_slice;
data.dd_new_devid = devid;
(void) zpool_iter(g_zfshdl, zfs_iter_pool, &data);
return (data.dd_found);
}
/*
* Handle an EC_DEV_ADD.ESC_DISK event.
*
@@ -663,15 +684,18 @@ static int
zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
{
char *devpath = NULL, *devid;
uint64_t pool_guid = 0, vdev_guid = 0;
boolean_t is_slice;
/*
* Expecting a devid string and an optional physical location
* Expecting a devid string and an optional physical location and guid
*/
if (nvlist_lookup_string(nvl, DEV_IDENTIFIER, &devid) != 0)
return (-1);
(void) nvlist_lookup_string(nvl, DEV_PHYS_PATH, &devpath);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_POOL_GUID, &pool_guid);
(void) nvlist_lookup_uint64(nvl, ZFS_EV_VDEV_GUID, &vdev_guid);
is_slice = (nvlist_lookup_boolean(nvl, DEV_IS_PART) == 0);
@@ -682,12 +706,16 @@ zfs_deliver_add(nvlist_t *nvl, boolean_t is_lofi)
* Iterate over all vdevs looking for a match in the following order:
* 1. ZPOOL_CONFIG_DEVID (identifies the unique disk)
* 2. ZPOOL_CONFIG_PHYS_PATH (identifies disk physical location).
*
* For disks, we only want to pay attention to vdevs marked as whole
* disks or are a multipath device.
* 3. ZPOOL_CONFIG_GUID (identifies unique vdev).
*/
if (!devid_iter(devid, zfs_process_add, is_slice) && devpath != NULL)
(void) devphys_iter(devpath, devid, zfs_process_add, is_slice);
if (devid_iter(devid, zfs_process_add, is_slice))
return (0);
if (devpath != NULL && devphys_iter(devpath, devid, zfs_process_add,
is_slice))
return (0);
if (vdev_guid != 0)
(void) guid_iter(pool_guid, vdev_guid, devid, zfs_process_add,
is_slice);
return (0);
}
+1
@@ -40,6 +40,7 @@
#include <sys/fm/fs/zfs.h>
#include <libzfs.h>
#include <string.h>
#include <libgen.h>
#include "zfs_agents.h"
#include "fmd_api.h"
+1 -1
@@ -291,7 +291,7 @@ idle:
rv = zed_event_service(&zcp);
/* ENODEV: When kernel module is unloaded (osx) */
if (rv == ENODEV)
if (rv != 0)
break;
}
+1 -1
@@ -21,7 +21,7 @@ if [ "${ZED_SYSLOG_DISPLAY_GUIDS}" = "1" ]; then
[ -n "${ZEVENT_VDEV_GUID}" ] && msg="${msg} vdev_guid=${ZEVENT_VDEV_GUID}"
else
[ -n "${ZEVENT_POOL}" ] && msg="${msg} pool='${ZEVENT_POOL}'"
[ -n "${ZEVENT_VDEV_PATH}" ] && msg="${msg} vdev=$(basename "${ZEVENT_VDEV_PATH}")"
[ -n "${ZEVENT_VDEV_PATH}" ] && msg="${msg} vdev=${ZEVENT_VDEV_PATH##*/}"
fi
# log pool state if state is anything other than 'ACTIVE'
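Several zedlets in this comparison swap `$(basename ...)` for `${VAR##*/}`, which avoids spawning a subprocess. A small sketch of the expansion follows, including one edge case: a path with a trailing slash expands to an empty string, whereas `basename` would return the last component. The function name `last_component` and the sample paths are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: same expansion as ${ZEVENT_VDEV_PATH##*/} above.
last_component() {
	echo "${1##*/}"   # remove the longest prefix ending in "/"
}

last_component "/dev/disk/by-id/ata-FOO-part1"   # path with directories
last_component "sda"                             # no slash: unchanged
last_component "/dev/mapper/"                    # trailing slash: empty line
```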
+1 -1
@@ -23,7 +23,7 @@
# Rate-limit the notification based in part on the filename.
#
rate_limit_tag="${ZEVENT_POOL};${ZEVENT_SUBCLASS};$(basename -- "$0")"
rate_limit_tag="${ZEVENT_POOL};${ZEVENT_SUBCLASS};${0##*/}"
rate_limit_interval="${ZED_NOTIFY_INTERVAL_SECS}"
zed_rate_limit "${rate_limit_tag}" "${rate_limit_interval}" || exit 3
+68 -5
@@ -29,7 +29,8 @@
[ -f "${ZED_ZEDLET_DIR}/zed.rc" ] && . "${ZED_ZEDLET_DIR}/zed.rc"
. "${ZED_ZEDLET_DIR}/zed-functions.sh"
if [ ! -d /sys/class/enclosure ] ; then
if [ ! -d /sys/class/enclosure ] && [ ! -d /sys/bus/pci/slots ] ; then
# No JBOD enclosure or NVMe slots
exit 1
fi
@@ -92,6 +93,29 @@ check_and_set_led()
done
}
# Fault LEDs for JBODs and NVMe drives are handled a little differently.
#
# On JBODs the fault LED is called 'fault' and on a path like this:
#
# /sys/class/enclosure/0:0:1:0/SLOT 10/fault
#
# On NVMe it's called 'attention' and on a path like this:
#
# /sys/bus/pci/slots/0/attention
#
# This function returns the full path to the fault LED file for a given
# enclosure/slot directory.
#
path_to_led()
{
dir=$1
if [ -f "$dir/fault" ] ; then
echo "$dir/fault"
elif [ -f "$dir/attention" ] ; then
echo "$dir/attention"
fi
}
state_to_val()
{
state="$1"
@@ -105,6 +129,38 @@ state_to_val()
esac
}
#
# Given a nvme name like 'nvme0n1', pass back its slot directory
# like "/sys/bus/pci/slots/0"
#
nvme_dev_to_slot()
{
dev="$1"
# Get the address "0000:01:00.0"
address=$(cat "/sys/class/block/$dev/device/address")
# For each /sys/bus/pci/slots subdir that is an actual number
# (rather than weird directories like "1-3/").
# shellcheck disable=SC2010
for i in $(ls /sys/bus/pci/slots/ | grep -E "^[0-9]+$") ; do
this_address=$(cat "/sys/bus/pci/slots/$i/address")
# The format of address is a little different between
# /sys/class/block/$dev/device/address and
# /sys/bus/pci/slots/
#
# address= "0000:01:00.0"
# this_address = "0000:01:00"
#
if echo "$address" | grep -Eq ^"$this_address" ; then
echo "/sys/bus/pci/slots/$i"
break
fi
done
}
# process_pool (pool)
#
# Iterate through a pool and set the vdevs' enclosure slot LEDs to
@@ -134,6 +190,11 @@ process_pool()
# Get dev name (like 'sda')
dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
vdev_enc_sysfs_path=$(realpath "/sys/class/block/$dev/device/enclosure_device"*)
if [ ! -d "$vdev_enc_sysfs_path" ] ; then
# This is not a JBOD disk, but it could be a PCI NVMe drive
vdev_enc_sysfs_path=$(nvme_dev_to_slot "$dev")
fi
current_val=$(echo "$therest" | awk '{print $NF}')
if [ "$current_val" != "0" ] ; then
@@ -145,9 +206,10 @@ process_pool()
continue
fi
if [ ! -e "$vdev_enc_sysfs_path/fault" ] ; then
led_path=$(path_to_led "$vdev_enc_sysfs_path")
if [ ! -e "$led_path" ] ; then
rc=3
zed_log_msg "vdev $vdev '$file/fault' doesn't exist"
zed_log_msg "vdev $vdev '$led_path' doesn't exist"
continue
fi
@@ -158,7 +220,7 @@ process_pool()
continue
fi
if ! check_and_set_led "$vdev_enc_sysfs_path/fault" "$val"; then
if ! check_and_set_led "$led_path" "$val"; then
rc=3
fi
done
@@ -169,7 +231,8 @@ if [ -n "$ZEVENT_VDEV_ENC_SYSFS_PATH" ] && [ -n "$ZEVENT_VDEV_STATE_STR" ] ; the
# Got a statechange for an individual vdev
val=$(state_to_val "$ZEVENT_VDEV_STATE_STR")
vdev=$(basename "$ZEVENT_VDEV_PATH")
check_and_set_led "$ZEVENT_VDEV_ENC_SYSFS_PATH/fault" "$val"
ledpath=$(path_to_led "$ZEVENT_VDEV_ENC_SYSFS_PATH")
check_and_set_led "$ledpath" "$val"
else
# Process the entire pool
poolname=$(zed_guid_to_pool "$ZEVENT_POOL_GUID")
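The `path_to_led` helper added above can be exercised without real hardware. The sketch below copies its logic and points it at temporary directories that stand in for sysfs slot directories. The directory names are assumptions made for the demonstration, not real sysfs paths.

```shell
#!/bin/sh
# Same logic as path_to_led() in the hunk above.
path_to_led() {
	dir=$1
	if [ -f "$dir/fault" ] ; then
		echo "$dir/fault"
	elif [ -f "$dir/attention" ] ; then
		echo "$dir/attention"
	fi
}

# Hypothetical stand-ins for a JBOD slot, an NVMe slot, and neither.
tmp=$(mktemp -d)
mkdir "$tmp/jbod_slot" "$tmp/nvme_slot" "$tmp/other"
: > "$tmp/jbod_slot/fault"
: > "$tmp/nvme_slot/attention"

path_to_led "$tmp/jbod_slot"
path_to_led "$tmp/nvme_slot"
led=$(path_to_led "$tmp/other")
[ -z "$led" ] && echo "no LED file found"
```

When neither file exists the function prints nothing, so a caller receives an empty path and has to treat that as "no LED".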
+3 -2
@@ -15,7 +15,7 @@
# Send notification in response to a fault induced statechange
#
# ZEVENT_SUBCLASS: 'statechange'
# ZEVENT_VDEV_STATE_STR: 'DEGRADED', 'FAULTED' or 'REMOVED'
# ZEVENT_VDEV_STATE_STR: 'DEGRADED', 'FAULTED', 'REMOVED', or 'UNAVAIL'
#
# Exit codes:
# 0: notification sent
@@ -31,7 +31,8 @@
if [ "${ZEVENT_VDEV_STATE_STR}" != "FAULTED" ] \
&& [ "${ZEVENT_VDEV_STATE_STR}" != "DEGRADED" ] \
&& [ "${ZEVENT_VDEV_STATE_STR}" != "REMOVED" ]; then
&& [ "${ZEVENT_VDEV_STATE_STR}" != "REMOVED" ] \
&& [ "${ZEVENT_VDEV_STATE_STR}" != "UNAVAIL" ]; then
exit 3
fi
+86 -4
@@ -77,7 +77,7 @@ zed_log_msg()
zed_log_err()
{
logger -p "${ZED_SYSLOG_PRIORITY}" -t "${ZED_SYSLOG_TAG}" -- "error:" \
"$(basename -- "$0"):""${ZEVENT_EID:+" eid=${ZEVENT_EID}:"}" "$@"
"${0##*/}:""${ZEVENT_EID:+" eid=${ZEVENT_EID}:"}" "$@"
}
@@ -202,6 +202,10 @@ zed_notify()
[ "${rv}" -eq 0 ] && num_success=$((num_success + 1))
[ "${rv}" -eq 1 ] && num_failure=$((num_failure + 1))
zed_notify_pushover "${subject}" "${pathname}"; rv=$?
[ "${rv}" -eq 0 ] && num_success=$((num_success + 1))
[ "${rv}" -eq 1 ] && num_failure=$((num_failure + 1))
[ "${num_success}" -gt 0 ] && return 0
[ "${num_failure}" -gt 0 ] && return 1
return 2
@@ -254,7 +258,7 @@ zed_notify_email()
[ -n "${subject}" ] || return 1
if [ ! -r "${pathname}" ]; then
zed_log_err \
"$(basename "${ZED_EMAIL_PROG}") cannot read \"${pathname}\""
"${ZED_EMAIL_PROG##*/} cannot read \"${pathname}\""
return 1
fi
@@ -266,7 +270,7 @@ zed_notify_email()
eval ${ZED_EMAIL_PROG} ${ZED_EMAIL_OPTS} < "${pathname}" >/dev/null 2>&1
rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "$(basename "${ZED_EMAIL_PROG}") exit=${rv}"
zed_log_err "${ZED_EMAIL_PROG##*/} exit=${rv}"
return 1
fi
return 0
@@ -413,7 +417,7 @@ zed_notify_slack_webhook()
# Construct the JSON message for posting.
#
msg_json="$(printf '{"text": "*%s*\n%s"}' "${subject}" "${msg_body}" )"
msg_json="$(printf '{"text": "*%s*\\n%s"}' "${subject}" "${msg_body}" )"
# Send the POST request and check for errors.
#
@@ -433,6 +437,84 @@ zed_notify_slack_webhook()
return 0
}
# zed_notify_pushover (subject, pathname)
#
# Send a notification via Pushover <https://pushover.net/>.
# The access token (ZED_PUSHOVER_TOKEN) identifies this client to the
# Pushover server. The user token (ZED_PUSHOVER_USER) defines the user or
# group to which the notification will be sent.
#
# Requires curl and sed executables to be installed in the standard PATH.
#
# References
# https://pushover.net/api
#
# Arguments
# subject: notification subject
# pathname: pathname containing the notification message (OPTIONAL)
#
# Globals
# ZED_PUSHOVER_TOKEN
# ZED_PUSHOVER_USER
#
# Return
# 0: notification sent
# 1: notification failed
# 2: not configured
#
zed_notify_pushover()
{
local subject="$1"
local pathname="${2:-"/dev/null"}"
local msg_body
local msg_out
local msg_err
local url="https://api.pushover.net/1/messages.json"
[ -n "${ZED_PUSHOVER_TOKEN}" ] && [ -n "${ZED_PUSHOVER_USER}" ] || return 2
if [ ! -r "${pathname}" ]; then
zed_log_err "pushover cannot read \"${pathname}\""
return 1
fi
zed_check_cmd "curl" "sed" || return 1
# Read the message body in.
#
msg_body="$(cat "${pathname}")"
if [ -z "${msg_body}" ]
then
msg_body=$subject
subject=""
fi
# Send the POST request and check for errors.
#
msg_out="$( \
curl \
--form-string "token=${ZED_PUSHOVER_TOKEN}" \
--form-string "user=${ZED_PUSHOVER_USER}" \
--form-string "message=${msg_body}" \
--form-string "title=${subject}" \
"${url}" \
2>/dev/null \
)"; rv=$?
if [ "${rv}" -ne 0 ]; then
zed_log_err "curl exit=${rv}"
return 1
fi
msg_err="$(echo "${msg_out}" \
| sed -n -e 's/.*"errors" *:.*\[\(.*\)\].*/\1/p')"
if [ -n "${msg_err}" ]; then
zed_log_err "pushover \"${msg_err}"\"
return 1
fi
return 0
}
# zed_rate_limit (tag, [interval])
#
# Check whether an event of a given type [tag] has already occurred within the
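The one-character change to the Slack webhook `printf` format above matters because JSON strings may not contain a raw newline. A minimal sketch of the difference follows. The function names `json_raw` and `json_escaped` and the sample subject and body are hypothetical.

```shell
#!/bin/sh
# '\n' in the format: printf emits a real newline inside the JSON string.
json_raw() {
	printf '{"text": "*%s*\n%s"}' "$1" "$2"
}

# '\\n' in the format: printf emits a backslash followed by 'n',
# which is the JSON escape sequence for a newline.
json_escaped() {
	printf '{"text": "*%s*\\n%s"}' "$1" "$2"
}

json_raw "pool degraded" "vdev sda faulted"; echo
json_escaped "pool degraded" "vdev sda faulted"; echo
```

The first call produces a payload split across two lines, while the second keeps the whole object on one line with a literal `\n` inside the string.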
+21 -4
@@ -13,9 +13,9 @@
# Email address of the zpool administrator for receipt of notifications;
# multiple addresses can be specified if they are delimited by whitespace.
# Email will only be sent if ZED_EMAIL_ADDR is defined.
# Disabled by default; uncomment to enable.
# Enabled by default; comment to disable.
#
#ZED_EMAIL_ADDR="root"
ZED_EMAIL_ADDR="root"
##
# Name or path of executable responsible for sending notifications via email;
@@ -82,6 +82,23 @@
#
#ZED_SLACK_WEBHOOK_URL=""
##
# Pushover token.
# This defines the application from which the notification will be sent.
# <https://pushover.net/api#registration>
# Disabled by default; uncomment to enable.
# ZED_PUSHOVER_USER, below, must also be configured.
#
#ZED_PUSHOVER_TOKEN=""
##
# Pushover user key.
# This defines which user or group will receive Pushover notifications.
# <https://pushover.net/api#identifiers>
# Disabled by default; uncomment to enable.
# ZED_PUSHOVER_TOKEN, above, must also be configured.
#ZED_PUSHOVER_USER=""
##
# Default directory for zed state files.
#
@@ -89,8 +106,8 @@
##
# Turn on/off enclosure LEDs when drives get DEGRADED/FAULTED. This works for
# device mapper and multipath devices as well. Your enclosure must be
# supported by the Linux SES driver for this to work.
# device mapper and multipath devices as well. This works with JBOD enclosures
# and NVMe PCI drives (assuming they're supported by Linux in sysfs).
#
ZED_USE_ENCLOSURE_LEDS=1
+1
@@ -22,6 +22,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <unistd.h>
+2
@@ -72,6 +72,8 @@ zed_udev_event(const char *class, const char *subclass, nvlist_t *nvl)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PATH, strval);
if (nvlist_lookup_string(nvl, DEV_IDENTIFIER, &strval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_IDENTIFIER, strval);
if (nvlist_lookup_boolean(nvl, DEV_IS_PART) == B_TRUE)
zed_log_msg(LOG_INFO, "\t%s: B_TRUE", DEV_IS_PART);
if (nvlist_lookup_string(nvl, DEV_PHYS_PATH, &strval) == 0)
zed_log_msg(LOG_INFO, "\t%s: %s", DEV_PHYS_PATH, strval);
if (nvlist_lookup_uint64(nvl, DEV_SIZE, &numval) == 0)
+2
@@ -26,6 +26,8 @@
#include <time.h>
#include <unistd.h>
#include <pthread.h>
#include <signal.h>
#include "zed_exec.h"
#include "zed_log.h"
#include "zed_strings.h"
+2 -1
@@ -317,7 +317,7 @@ get_usage(zfs_help_t idx)
case HELP_SEND:
return (gettext("\tsend [-DnPpRvLecwhb] [-[i|I] snapshot] "
"<snapshot>\n"
"\tsend [-nvPLecw] [-i snapshot|bookmark] "
"\tsend [-DnvPLecw] [-i snapshot|bookmark] "
"<filesystem|volume|snapshot>\n"
"\tsend [-DnPpvLec] [-i bookmark|snapshot] "
"--redact <bookmark> <snapshot>\n"
@@ -7475,6 +7475,7 @@ unshare_unmount(int op, int argc, char **argv)
if (zfs_prop_get_int(zhp, ZFS_PROP_CANMOUNT) ==
ZFS_CANMOUNT_NOAUTO)
continue;
break;
default:
break;
}
+2 -1
@@ -26,7 +26,8 @@ zpool_LDADD = \
$(abs_top_builddir)/lib/libzfs/libzfs.la \
$(abs_top_builddir)/lib/libzfs_core/libzfs_core.la \
$(abs_top_builddir)/lib/libnvpair/libnvpair.la \
$(abs_top_builddir)/lib/libuutil/libuutil.la
$(abs_top_builddir)/lib/libuutil/libuutil.la \
$(abs_top_builddir)/lib/libzutil/libzutil.la
zpool_LDADD += $(LTLIBINTL)
+4 -6
@@ -16,14 +16,12 @@ if [ -L "$dev" ] ; then
dev=$(readlink "$dev")
fi
dev=$(basename "$dev")
dev="${dev##*/}"
val=""
if [ -d "/sys/class/block/$dev/slaves" ] ; then
# ls -C: output in columns, no newlines
val=$(ls -C "/sys/class/block/$dev/slaves")
# ls -C will print two spaces between files; change to one space.
val=$(echo "$val" | sed -r 's/[[:blank:]]+/ /g')
# ls -C: output in columns, no newlines, two spaces (change to one)
# shellcheck disable=SC2012
val=$(ls -C "/sys/class/block/$dev/slaves" | tr -s '[:space:]' ' ')
fi
echo "dm-deps=$val"
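The rewritten pipeline above squeezes the `ls -C` output with `tr -s '[:space:]' ' '` instead of `sed -r`. A small sketch of what that `tr` invocation does to sample input follows. The device names are made up and the `squeeze` wrapper is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper around the tr call used in the hunk above.
squeeze() {
	tr -s '[:space:]' ' '   # map all whitespace to spaces, collapse runs
}

# Simulated 'ls -C' output: two spaces between names, trailing newline.
val=$(printf 'sda  sdb  sdc\n' | squeeze)
echo "dm-deps=$val"
```

Because the trailing newline is also translated to a space, the captured value ends in a single space. Command substitution only strips trailing newlines, so this may matter to consumers that compare the string exactly.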
+3 -3
@@ -9,7 +9,7 @@ iostat: Show iostat values since boot (summary page).
iostat-1s: Do a single 1-second iostat sample and show values.
iostat-10s: Do a single 10-second iostat sample and show values."
script=$(basename "$0")
script="${0##*/}"
if [ "$1" = "-h" ] ; then
echo "$helpstr" | grep "$script:" | tr -s '\t' | cut -f 2-
exit
@@ -42,7 +42,7 @@ else
${brief:+"-y"} \
${interval:+"$interval"} \
${interval:+"1"} \
"$VDEV_UPATH" | awk NF | tail -n 2)
"$VDEV_UPATH" | grep -v '^$' | tail -n 2)
fi
@@ -61,7 +61,7 @@ fi
cols=$(echo "$out" | head -n 1)
# Get the values and tab separate them to make them cut-able.
vals=$(echo "$out" | tail -n 1 | sed -r 's/[[:blank:]]+/\t/g')
vals=$(echo "$out" | tail -n 1 | tr -s '[:space:]' '\t')
i=0
for col in $cols ; do
+1 -1
@@ -48,7 +48,7 @@ size: Show the disk capacity.
vendor: Show the disk vendor.
lsblk: Show the disk size, vendor, and model number."
script=$(basename "$0")
script="${0##*/}"
if [ "$1" = "-h" ] ; then
echo "$helpstr" | grep "$script:" | tr -s '\t' | cut -f 2-
+12 -8
@@ -4,19 +4,23 @@
#
if [ "$1" = "-h" ] ; then
echo "Show whether a vdev is a file, hdd, or ssd."
echo "Show whether a vdev is a file, hdd, ssd, or iscsi."
exit
fi
if [ -b "$VDEV_UPATH" ]; then
device=$(basename "$VDEV_UPATH")
val=$(cat "/sys/block/$device/queue/rotational" 2>/dev/null)
if [ "$val" = "0" ]; then
MEDIA="ssd"
fi
device="${VDEV_UPATH##*/}"
read -r val 2>/dev/null < "/sys/block/$device/queue/rotational"
case "$val" in
0) MEDIA="ssd" ;;
1) MEDIA="hdd" ;;
esac
if [ "$val" = "1" ]; then
MEDIA="hdd"
vpd_pg83="/sys/block/$device/device/vpd_pg83"
if [ -f "$vpd_pg83" ]; then
if grep -q --binary "iqn." "$vpd_pg83"; then
MEDIA="iscsi"
fi
fi
else
if [ -f "$VDEV_UPATH" ]; then
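The media hunk above reads the sysfs `rotational` attribute with the `read` builtin and maps it through a `case` statement. A minimal sketch of that pattern follows, run against a temporary file rather than sysfs. The function name `media_from_rotational` is hypothetical, and the iSCSI check from the hunk is left out.

```shell
#!/bin/sh
# Hypothetical helper: map a 'rotational' attribute file to a media type.
media_from_rotational() {
	val=""
	# read is a builtin, so no 'cat' subprocess is needed.
	read -r val 2>/dev/null < "$1"
	case "$val" in
	0) echo "ssd" ;;
	1) echo "hdd" ;;
	esac
}

# Temporary file standing in for /sys/block/<dev>/queue/rotational.
demo=$(mktemp)
echo 0 > "$demo"
media_from_rotational "$demo"
rm -f "$demo"
```

Any value other than `0` or `1` prints nothing, which matches the original behaviour of leaving `MEDIA` unset.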
+8 -2
@@ -11,7 +11,7 @@ fault_led: Show value of the disk enclosure slot fault LED.
locate_led: Show value of the disk enclosure slot locate LED.
ses: Show disk's enc, enc device, slot, and fault/locate LED values."
script=$(basename "$0")
script="${0##*/}"
if [ "$1" = "-h" ] ; then
echo "$helpstr" | grep "$script:" | tr -s '\t' | cut -f 2-
exit
@@ -41,7 +41,13 @@ for i in $scripts ; do
val=$(ls "$VDEV_ENC_SYSFS_PATH/../device/scsi_generic" 2>/dev/null)
;;
fault_led)
val=$(cat "$VDEV_ENC_SYSFS_PATH/fault" 2>/dev/null)
# JBODs fault LED is called 'fault', NVMe fault LED is called
# 'attention'.
if [ -f "$VDEV_ENC_SYSFS_PATH/fault" ] ; then
val=$(cat "$VDEV_ENC_SYSFS_PATH/fault" 2>/dev/null)
elif [ -f "$VDEV_ENC_SYSFS_PATH/attention" ] ; then
val=$(cat "$VDEV_ENC_SYSFS_PATH/attention" 2>/dev/null)
fi
;;
locate_led)
val=$(cat "$VDEV_ENC_SYSFS_PATH/locate" 2>/dev/null)
+3 -47
@@ -264,51 +264,6 @@ for_each_pool(int argc, char **argv, boolean_t unavail,
return (ret);
}
static int
for_each_vdev_cb(zpool_handle_t *zhp, nvlist_t *nv, pool_vdev_iter_f func,
void *data)
{
nvlist_t **child;
uint_t c, children;
int ret = 0;
int i;
char *type;
const char *list[] = {
ZPOOL_CONFIG_SPARES,
ZPOOL_CONFIG_L2CACHE,
ZPOOL_CONFIG_CHILDREN
};
for (i = 0; i < ARRAY_SIZE(list); i++) {
if (nvlist_lookup_nvlist_array(nv, list[i], &child,
&children) == 0) {
for (c = 0; c < children; c++) {
uint64_t ishole = 0;
(void) nvlist_lookup_uint64(child[c],
ZPOOL_CONFIG_IS_HOLE, &ishole);
if (ishole)
continue;
ret |= for_each_vdev_cb(zhp, child[c], func,
data);
}
}
}
if (nvlist_lookup_string(nv, ZPOOL_CONFIG_TYPE, &type) != 0)
return (ret);
/* Don't run our function on root vdevs */
if (strcmp(type, VDEV_TYPE_ROOT) != 0) {
ret |= func(zhp, nv, data);
}
return (ret);
}
/*
* This is the equivalent of for_each_pool() for vdevs. It iterates through
* all vdevs in the pool, ignoring root vdevs and holes, calling func() on
@@ -327,7 +282,7 @@ for_each_vdev(zpool_handle_t *zhp, pool_vdev_iter_f func, void *data)
verify(nvlist_lookup_nvlist(config, ZPOOL_CONFIG_VDEV_TREE,
&nvroot) == 0);
}
return (for_each_vdev_cb(zhp, nvroot, func, data));
return (for_each_vdev_cb((void *) zhp, nvroot, func, data));
}
/*
@@ -603,7 +558,7 @@ vdev_run_cmd_thread(void *cb_cmd_data)
/* For each vdev in the pool run a command */
static int
for_each_vdev_run_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_vcdl)
for_each_vdev_run_cb(void *zhp_data, nvlist_t *nv, void *cb_vcdl)
{
vdev_cmd_data_list_t *vcdl = cb_vcdl;
vdev_cmd_data_t *data;
@@ -611,6 +566,7 @@ for_each_vdev_run_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_vcdl)
char *vname = NULL;
char *vdev_enc_sysfs_path = NULL;
int i, match = 0;
zpool_handle_t *zhp = zhp_data;
if (nvlist_lookup_string(nv, ZPOOL_CONFIG_PATH, &path) != 0)
return (1);
+49 -12
@@ -1215,6 +1215,26 @@ zpool_do_remove(int argc, char **argv)
return (ret);
}
/*
* Return 1 if a vdev is active (being used in a pool)
* Return 0 if a vdev is inactive (offlined or faulted, or not in active pool)
*
* This is useful for checking if a disk in an active pool is offlined or
* faulted.
*/
static int
vdev_is_active(char *vdev_path)
{
int fd;
fd = open(vdev_path, O_EXCL);
if (fd < 0) {
return (1); /* can't open O_EXCL - disk is active */
}
close(fd);
return (0); /* disk is inactive in the pool */
}
/*
* zpool labelclear [-f] <vdev>
*
@@ -1324,9 +1344,23 @@ zpool_do_labelclear(int argc, char **argv)
case POOL_STATE_ACTIVE:
case POOL_STATE_SPARE:
case POOL_STATE_L2CACHE:
/*
* We allow the user to call 'zpool labelclear -f'
* on an offlined disk in an active pool. We can check if
* the disk is online by calling vdev_is_active().
*/
if (force && !vdev_is_active(vdev))
break;
(void) fprintf(stderr, gettext(
"%s is a member (%s) of pool \"%s\"\n"),
"%s is a member (%s) of pool \"%s\""),
vdev, zpool_pool_state_to_name(state), name);
if (force) {
(void) fprintf(stderr, gettext(
". Offline the disk first to clear its label."));
}
printf("\n");
ret = 1;
goto errout;
@@ -3764,9 +3798,10 @@ zpool_do_import(int argc, char **argv)
return (1);
}
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
err = import_pools(pools, props, mntopts, flags,
argc >= 1 ? argv[0] : NULL,
argc >= 2 ? argv[1] : NULL,
do_destroyed, pool_specified, do_all, &idata);
/*
* If we're using the cachefile and we failed to import, then
@@ -3786,9 +3821,10 @@ zpool_do_import(int argc, char **argv)
nvlist_free(pools);
pools = zpool_search_import(g_zfs, &idata, &libzfs_config_ops);
err = import_pools(pools, props, mntopts, flags, argv[0],
argc == 1 ? NULL : argv[1], do_destroyed, pool_specified,
do_all, &idata);
err = import_pools(pools, props, mntopts, flags,
argc >= 1 ? argv[0] : NULL,
argc >= 2 ? argv[1] : NULL,
do_destroyed, pool_specified, do_all, &idata);
}
error:
@@ -4789,7 +4825,7 @@ children:
continue;
vname = zpool_vdev_name(g_zfs, zhp, newchild[c],
cb->cb_name_flags);
cb->cb_name_flags | VDEV_NAME_TYPE_ID);
ret += print_vdev_stats(zhp, vname, oldnv ? oldchild[c] : NULL,
newchild[c], cb, depth + 2);
free(vname);
@@ -4832,7 +4868,7 @@ children:
}
vname = zpool_vdev_name(g_zfs, zhp, newchild[c],
cb->cb_name_flags);
cb->cb_name_flags | VDEV_NAME_TYPE_ID);
ret += print_vdev_stats(zhp, vname, oldnv ?
oldchild[c] : NULL, newchild[c], cb, depth + 2);
free(vname);
@@ -5129,11 +5165,12 @@ get_stat_flags(zpool_list_t *list)
* Return 1 if cb_data->cb_vdev_names[0] is this vdev's name, 0 otherwise.
*/
static int
is_vdev_cb(zpool_handle_t *zhp, nvlist_t *nv, void *cb_data)
is_vdev_cb(void *zhp_data, nvlist_t *nv, void *cb_data)
{
iostat_cbdata_t *cb = cb_data;
char *name = NULL;
int ret = 0;
zpool_handle_t *zhp = zhp_data;
name = zpool_vdev_name(g_zfs, zhp, nv, cb->cb_name_flags);
@@ -6145,7 +6182,7 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
continue;
vname = zpool_vdev_name(g_zfs, zhp, child[c],
cb->cb_name_flags);
cb->cb_name_flags | VDEV_NAME_TYPE_ID);
print_list_stats(zhp, vname, child[c], cb, depth + 2, B_FALSE);
free(vname);
}
@@ -6179,7 +6216,7 @@ print_list_stats(zpool_handle_t *zhp, const char *name, nvlist_t *nv,
printed = B_TRUE;
}
vname = zpool_vdev_name(g_zfs, zhp, child[c],
cb->cb_name_flags);
cb->cb_name_flags | VDEV_NAME_TYPE_ID);
print_list_stats(zhp, vname, child[c], cb, depth + 2,
B_FALSE);
free(vname);
+1 -1
@@ -27,6 +27,7 @@
#include <libnvpair.h>
#include <libzfs.h>
#include <libzutil.h>
#ifdef __cplusplus
extern "C" {
@@ -67,7 +68,6 @@ int for_each_pool(int, char **, boolean_t unavail, zprop_list_t **,
boolean_t, zpool_iter_f, void *);
/* Vdev list functions */
typedef int (*pool_vdev_iter_f)(zpool_handle_t *, nvlist_t *, void *);
int for_each_vdev(zpool_handle_t *zhp, pool_vdev_iter_f func, void *data);
typedef struct zpool_list zpool_list_t;
+2 -2
@@ -117,6 +117,7 @@ escape_string(char *s)
case '=':
case '\\':
*d++ = '\\';
fallthrough;
default:
*d = *c;
}
@@ -683,9 +684,8 @@ print_recursive_stats(stat_printer_f func, nvlist_t *nvroot,
if (descend && nvlist_lookup_nvlist_array(nvroot, ZPOOL_CONFIG_CHILDREN,
&child, &children) == 0) {
(void) strncpy(vdev_name, get_vdev_name(nvroot, parent_name),
(void) strlcpy(vdev_name, get_vdev_name(nvroot, parent_name),
sizeof (vdev_name));
vdev_name[sizeof (vdev_name) - 1] = '\0';
for (c = 0; c < children; c++) {
print_recursive_stats(func, child[c], pool_name,
+13
@@ -297,6 +297,7 @@ zstream_do_dump(int argc, char *argv[])
fletcher_4_init();
while (read_hdr(drr, &zc)) {
uint64_t featureflags = 0;
/*
* If this is the first DMU record being processed, check for
@@ -362,6 +363,9 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drrb->drr_fromguid);
}
featureflags =
DMU_GET_FEATUREFLAGS(drrb->drr_versioninfo);
(void) printf("BEGIN record\n");
(void) printf("\thdrtype = %lld\n",
DMU_GET_STREAM_HDRTYPE(drrb->drr_versioninfo));
@@ -461,6 +465,15 @@ zstream_do_dump(int argc, char *argv[])
BSWAP_64(drro->drr_maxblkid);
}
if (featureflags & DMU_BACKUP_FEATURE_RAW &&
drro->drr_bonuslen > drro->drr_raw_bonuslen) {
(void) fprintf(stderr,
"Warning: Object %llu has bonuslen = "
"%u > raw_bonuslen = %u\n\n",
(u_longlong_t)drro->drr_object,
drro->drr_bonuslen, drro->drr_raw_bonuslen);
}
payload_size = DRR_OBJECT_PAYLOAD_SIZE(drro);
if (verbose) {
+31 -27
@@ -38,40 +38,39 @@
static int
ioctl_get_msg(char *var, int fd)
{
int error = 0;
int ret;
char msg[ZFS_MAX_DATASET_NAME_LEN];
error = ioctl(fd, BLKZNAME, msg);
if (error < 0) {
return (error);
ret = ioctl(fd, BLKZNAME, msg);
if (ret < 0) {
return (ret);
}
snprintf(var, ZFS_MAX_DATASET_NAME_LEN, "%s", msg);
return (error);
return (ret);
}
int
main(int argc, char **argv)
{
int fd, error = 0;
int fd = -1, ret = 0, status = EXIT_FAILURE;
char zvol_name[ZFS_MAX_DATASET_NAME_LEN];
char *zvol_name_part = NULL;
char *dev_name;
struct stat64 statbuf;
int dev_minor, dev_part;
int i;
int rc;
if (argc < 2) {
printf("Usage: %s /dev/zvol_device_node\n", argv[0]);
return (EINVAL);
fprintf(stderr, "Usage: %s /dev/zvol_device_node\n", argv[0]);
goto fail;
}
dev_name = argv[1];
error = stat64(dev_name, &statbuf);
if (error != 0) {
printf("Unable to access device file: %s\n", dev_name);
return (errno);
ret = stat64(dev_name, &statbuf);
if (ret != 0) {
fprintf(stderr, "Unable to access device file: %s\n", dev_name);
goto fail;
}
dev_minor = minor(statbuf.st_rdev);
@@ -79,23 +78,23 @@ main(int argc, char **argv)
fd = open(dev_name, O_RDONLY);
if (fd < 0) {
printf("Unable to open device file: %s\n", dev_name);
return (errno);
fprintf(stderr, "Unable to open device file: %s\n", dev_name);
goto fail;
}
error = ioctl_get_msg(zvol_name, fd);
if (error < 0) {
printf("ioctl_get_msg failed:%s\n", strerror(errno));
return (errno);
ret = ioctl_get_msg(zvol_name, fd);
if (ret < 0) {
fprintf(stderr, "ioctl_get_msg failed: %s\n", strerror(errno));
goto fail;
}
if (dev_part > 0)
rc = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
ret = asprintf(&zvol_name_part, "%s-part%d", zvol_name,
dev_part);
else
rc = asprintf(&zvol_name_part, "%s", zvol_name);
ret = asprintf(&zvol_name_part, "%s", zvol_name);
if (rc == -1 || zvol_name_part == NULL)
goto error;
if (ret == -1 || zvol_name_part == NULL)
goto fail;
for (i = 0; i < strlen(zvol_name_part); i++) {
if (isblank(zvol_name_part[i]))
@@ -103,8 +102,13 @@ main(int argc, char **argv)
}
printf("%s\n", zvol_name_part);
free(zvol_name_part);
error:
close(fd);
return (error);
status = EXIT_SUCCESS;
fail:
if (zvol_name_part)
free(zvol_name_part);
if (fd >= 0)
close(fd);
return (status);
}
+5 -1
@@ -25,5 +25,9 @@ checkabi:
storeabi:
cd .libs ; \
for lib in $(lib_LTLIBRARIES) ; do \
abidw $${lib%.la}.so > ../$${lib%.la}.abi ; \
abidw --no-show-locs \
--no-corpus-path \
--no-comp-dir-path \
--type-id-style hash \
$${lib%.la}.so > ../$${lib%.la}.abi ; \
done
+1 -1
@@ -26,6 +26,7 @@ AM_LIBTOOLFLAGS = --silent
AM_CFLAGS = -std=gnu99 -Wall -Wstrict-prototypes -Wmissing-prototypes
AM_CFLAGS += -fno-strict-aliasing
AM_CFLAGS += $(NO_OMIT_FRAME_POINTER)
AM_CFLAGS += $(IMPLICIT_FALLTHROUGH)
AM_CFLAGS += $(DEBUG_CFLAGS)
AM_CFLAGS += $(ASAN_CFLAGS)
AM_CFLAGS += $(CODE_COVERAGE_CFLAGS) $(NO_FORMAT_ZERO_LENGTH)
@@ -39,7 +40,6 @@ AM_CPPFLAGS = -D_GNU_SOURCE
AM_CPPFLAGS += -D_REENTRANT
AM_CPPFLAGS += -D_FILE_OFFSET_BITS=64
AM_CPPFLAGS += -D_LARGEFILE64_SOURCE
AM_CPPFLAGS += -DHAVE_LARGE_STACKS=1
AM_CPPFLAGS += -DLIBEXECDIR=\"$(libexecdir)\"
AM_CPPFLAGS += -DRUNSTATEDIR=\"$(runstatedir)\"
AM_CPPFLAGS += -DSBINDIR=\"$(sbindir)\"
+3 -1
@@ -15,7 +15,9 @@ subst_sed_cmd = \
-e 's|@PYTHON[@]|$(PYTHON)|g' \
-e 's|@PYTHON_SHEBANG[@]|$(PYTHON_SHEBANG)|g' \
-e 's|@DEFAULT_INIT_NFS_SERVER[@]|$(DEFAULT_INIT_NFS_SERVER)|g' \
-e 's|@DEFAULT_INIT_SHELL[@]|$(DEFAULT_INIT_SHELL)|g'
-e 's|@DEFAULT_INIT_SHELL[@]|$(DEFAULT_INIT_SHELL)|g' \
-e 's|@LIBFETCH_DYNAMIC[@]|$(LIBFETCH_DYNAMIC)|g' \
-e 's|@LIBFETCH_SONAME[@]|$(LIBFETCH_SONAME)|g'
SUBSTFILES =
CLEANFILES = $(SUBSTFILES)
+23
@@ -161,6 +161,29 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE], [
AC_SUBST([NO_UNUSED_BUT_SET_VARIABLE])
])
dnl #
dnl # Check if gcc supports -Wimplicit-fallthrough option.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH], [
AC_MSG_CHECKING([whether $CC supports -Wimplicit-fallthrough])
saved_flags="$CFLAGS"
CFLAGS="$CFLAGS -Werror -Wimplicit-fallthrough"
AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])], [
IMPLICIT_FALLTHROUGH=-Wimplicit-fallthrough
AC_DEFINE([HAVE_IMPLICIT_FALLTHROUGH], 1,
[Define if compiler supports -Wimplicit-fallthrough])
AC_MSG_RESULT([yes])
], [
IMPLICIT_FALLTHROUGH=
AC_MSG_RESULT([no])
])
CFLAGS="$saved_flags"
AC_SUBST([IMPLICIT_FALLTHROUGH])
])
dnl #
dnl # Check if gcc supports -fno-omit-frame-pointer option.
dnl #
+1 -1
@@ -28,7 +28,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYTHON], [
dnl #
AM_PATH_PYTHON([], [], [:])
AS_IF([test -z "$PYTHON_VERSION"], [
PYTHON_VERSION=$(basename $PYTHON | tr -cd 0-9.)
PYTHON_VERSION=$(echo ${PYTHON##*/} | tr -cd 0-9.)
])
PYTHON_MINOR=${PYTHON_VERSION#*\.}
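The change above swaps `basename` for the POSIX parameter expansion `${PYTHON##*/}`, which strips the longest prefix ending in `/` without spawning an extra process. A minimal sketch of the idea follows, using a made-up interpreter path; the helper names `python_name` and `python_version` are illustrative only and are not part of the build system.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the expansion used in the macro.

# Strip everything up to and including the last '/'.
python_name() {
	path=$1
	printf '%s\n' "${path##*/}"
}

# Keep only digits and dots from the interpreter name,
# e.g. python3.10 -> 3.10 (tr also drops the trailing newline).
python_version() {
	python_name "$1" | tr -cd '0-9.'
}

python_name /usr/bin/python3.10
python_version /usr/bin/python3.10; echo
```

One difference worth keeping in mind: for a path with a trailing slash the expansion yields an empty string, whereas `basename` would still return the last component.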
+16 -1
@@ -6,7 +6,7 @@ dnl # https://www.gnu.org/software/autoconf-archive/ax_python_module.html
dnl # Required by ZFS_AC_CONFIG_ALWAYS_PYZFS.
dnl #
AC_DEFUN([ZFS_AC_PYTHON_MODULE], [
PYTHON_NAME=$(basename $PYTHON)
PYTHON_NAME=${PYTHON##*/}
AC_MSG_CHECKING([for $PYTHON_NAME module: $1])
AS_IF([$PYTHON -c "import $1" 2>/dev/null], [
AC_MSG_RESULT(yes)
@@ -46,6 +46,21 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_PYZFS], [
])
AC_SUBST(DEFINE_PYZFS)
dnl #
dnl # Python "packaging" (or, failing that, "distlib") module is required to build and install pyzfs
dnl #
AS_IF([test "x$enable_pyzfs" = xcheck -o "x$enable_pyzfs" = xyes], [
ZFS_AC_PYTHON_MODULE([packaging], [], [
ZFS_AC_PYTHON_MODULE([distlib], [], [
AS_IF([test "x$enable_pyzfs" = xyes], [
AC_MSG_ERROR("Python $PYTHON_VERSION packaging and distlib modules are not installed")
], [test "x$enable_pyzfs" != xno], [
enable_pyzfs=no
])
])
])
])
dnl #
dnl # Require python-devel libraries
dnl #
+27 -6
@@ -97,9 +97,18 @@ AC_DEFUN([AX_PYTHON_DEVEL],[
# Check for a version of Python >= 2.1.0
#
AC_MSG_CHECKING([for a version of Python >= '2.1.0'])
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver >= '2.1.0')"`
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = ">= '2.1.0'".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
if test "$ac_supports_python_ver" != "True"; then
if test -z "$PYTHON_NOVERSIONCHECK"; then
AC_MSG_RESULT([no])
@@ -126,9 +135,21 @@ to something else than an empty string.
#
if test -n "$1"; then
AC_MSG_CHECKING([for a version of Python $1])
ac_supports_python_ver=`$PYTHON -c "import sys; \
ver = sys.version.split ()[[0]]; \
print (ver $1)"`
# Why the strip ()? Because if we don't, version.parse
# will, for example, report 3.10.0 >= '3.11.0'
ac_supports_python_ver=`cat<<EOD | $PYTHON -
from __future__ import print_function;
import sys;
try:
from packaging import version;
except ImportError:
from distlib import version;
ver = sys.version.split ()[[0]];
(tst_cmp, tst_ver) = "$1".split ();
tst_ver = tst_ver.strip ("'");
eval ("print (version.LegacyVersion (ver)"+ tst_cmp +"version.LegacyVersion (tst_ver))")
EOD`
if test "$ac_supports_python_ver" = "True"; then
AC_MSG_RESULT([yes])
else
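The rewritten check compares parsed versions rather than raw strings. The sketch below shows, in plain shell, why a string comparison misorders multi-digit components and how a version-aware ordering avoids it. It assumes a `sort` that supports `-V` (GNU coreutils does); `version_ge` is a hypothetical helper, not something the macro defines.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B.
# Assumes sort(1) supports -V (version sort), e.g. GNU coreutils.
version_ge() {
	# The smaller version sorts first, so A >= B exactly when B is first.
	[ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Lexical comparison gets multi-digit components wrong: as strings,
# "3.10.0" sorts before "3.9.0" because '1' < '9'.
lexical_first=$(printf '%s\n%s\n' 3.10.0 3.9.0 | LC_ALL=C sort | head -n 1)
echo "lexical order puts first: $lexical_first"

if version_ge 3.10.0 3.9.0; then
	echo "version order: 3.10.0 >= 3.9.0"
fi
```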
+22 -1
@@ -162,6 +162,9 @@ dnl #
dnl # 3.1 API change,
dnl # Check if inode_operations contains the function get_acl
dnl #
dnl # 5.15 API change,
dnl # Added the bool rcu argument to get_acl for rcu path walk.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
ZFS_LINUX_TEST_SRC([inode_operations_get_acl], [
#include <linux/fs.h>
@@ -174,14 +177,32 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_INODE_OPERATIONS_GET_ACL], [
.get_acl = get_acl_fn,
};
],[])
ZFS_LINUX_TEST_SRC([inode_operations_get_acl_rcu], [
#include <linux/fs.h>
struct posix_acl *get_acl_fn(struct inode *inode, int type,
bool rcu) { return NULL; }
static const struct inode_operations
iops __attribute__ ((unused)) = {
.get_acl = get_acl_fn,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_INODE_OPERATIONS_GET_ACL], [
AC_MSG_CHECKING([whether iops->get_acl() exists])
ZFS_LINUX_TEST_RESULT([inode_operations_get_acl], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL, 1, [iops->get_acl() exists])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl()])
ZFS_LINUX_TEST_RESULT([inode_operations_get_acl_rcu], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_GET_ACL_RCU, 1, [iops->get_acl() takes rcu])
],[
ZFS_LINUX_TEST_ERROR([iops->get_acl()])
])
])
])
+26
@@ -0,0 +1,26 @@
dnl #
dnl # 5.16 API change
dnl # add_disk grew a must-check return code
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_ADD_DISK], [
ZFS_LINUX_TEST_SRC([add_disk_ret], [
#include <linux/genhd.h>
], [
struct gendisk *disk = NULL;
int err = add_disk(disk);
err = err;
])
])
AC_DEFUN([ZFS_AC_KERNEL_ADD_DISK], [
AC_MSG_CHECKING([whether add_disk() returns int])
ZFS_LINUX_TEST_RESULT([add_disk_ret],
[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_ADD_DISK_RET, 1,
[add_disk() returns int])
], [
AC_MSG_RESULT(no)
])
])
+85 -2
@@ -191,6 +191,24 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SET_DEV], [
], [], [ZFS_META_LICENSE])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # bio_set_dev is no longer a helper macro and is now an inline function,
dnl # meaning that the function it calls internally can no longer be overridden
dnl # by our code
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO], [
ZFS_LINUX_TEST_SRC([bio_set_dev_macro], [
#include <linux/bio.h>
#include <linux/fs.h>
],[
#ifndef bio_set_dev
#error Not a macro
#endif
], [], [ZFS_META_LICENSE])
])
AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_MSG_CHECKING([whether bio_set_dev() is available])
ZFS_LINUX_TEST_RESULT([bio_set_dev], [
@@ -205,6 +223,15 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_SET_DEV], [
AC_DEFINE(HAVE_BIO_SET_DEV_GPL_ONLY, 1,
[bio_set_dev() GPL-only])
])
AC_MSG_CHECKING([whether bio_set_dev() is a macro])
ZFS_LINUX_TEST_RESULT([bio_set_dev_macro], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BIO_SET_DEV_MACRO, 1,
[bio_set_dev() is a macro])
],[
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)
])
@@ -294,9 +321,8 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO_SUBMIT_BIO], [
ZFS_LINUX_TEST_SRC([submit_bio], [
#include <linux/bio.h>
],[
blk_qc_t blk_qc;
struct bio *bio = NULL;
blk_qc = submit_bio(bio);
(void) submit_bio(bio);
])
])
@@ -396,6 +422,58 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO_BDEV_DISK], [
])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # The Linux 5.16 API for submit_bio changed the return type to be
dnl # void instead of blk_qc_t
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID], [
ZFS_LINUX_TEST_SRC([bio_bdev_submit_bio_void], [
#include <linux/blkdev.h>
],[
struct block_device_operations *bdev = NULL;
__attribute__((unused)) void(*f)(struct bio *) = bdev->submit_bio;
])
])
AC_DEFUN([ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID], [
AC_MSG_CHECKING(
[whether block_device_operations->submit_bio() returns void])
ZFS_LINUX_TEST_RESULT([bio_bdev_submit_bio_void], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_SUBMIT_BIO_RETURNS_VOID, 1,
[block_device_operations->submit_bio() returns void])
],[
AC_MSG_RESULT(no)
])
])
dnl #
dnl # Linux 5.16 API
dnl #
dnl # The Linux 5.16 API moved struct blkcg_gq into linux/blk-cgroup.h, which
dnl # has been around since 2015. This test looks for the presence of that
dnl # header, so that it can be conditionally included where it exists, but
dnl # still be backward compatible with kernels that pre-date its introduction.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER], [
ZFS_LINUX_TEST_SRC([blk_cgroup_header], [
#include <linux/blk-cgroup.h>
], [])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_CGROUP_HEADER], [
AC_MSG_CHECKING([for existence of linux/blk-cgroup.h])
ZFS_LINUX_TEST_RESULT([blk_cgroup_header],[
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_LINUX_BLK_CGROUP_HEADER, 1,
[linux/blk-cgroup.h exists])
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_REQ
ZFS_AC_KERNEL_SRC_BIO_OPS
@@ -407,6 +485,9 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BIO], [
ZFS_AC_KERNEL_SRC_BIO_CURRENT_BIO_LIST
ZFS_AC_KERNEL_SRC_BLKG_TRYGET
ZFS_AC_KERNEL_SRC_BIO_BDEV_DISK
ZFS_AC_KERNEL_SRC_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_SRC_BIO_SET_DEV_MACRO
ZFS_AC_KERNEL_SRC_BLK_CGROUP_HEADER
])
AC_DEFUN([ZFS_AC_KERNEL_BIO], [
@@ -429,4 +510,6 @@ AC_DEFUN([ZFS_AC_KERNEL_BIO], [
ZFS_AC_KERNEL_BIO_CURRENT_BIO_LIST
ZFS_AC_KERNEL_BLKG_TRYGET
ZFS_AC_KERNEL_BIO_BDEV_DISK
ZFS_AC_KERNEL_BDEV_SUBMIT_BIO_RETURNS_VOID
ZFS_AC_KERNEL_BLK_CGROUP_HEADER
])
+40
@@ -47,6 +47,44 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_BDI], [
])
])
dnl #
dnl # 5.9: added blk_queue_update_readahead(),
dnl # 5.15: renamed to disk_update_readahead()
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE_UPDATE_READAHEAD], [
ZFS_LINUX_TEST_SRC([blk_queue_update_readahead], [
#include <linux/blkdev.h>
],[
struct request_queue q;
blk_queue_update_readahead(&q);
])
ZFS_LINUX_TEST_SRC([disk_update_readahead], [
#include <linux/blkdev.h>
],[
struct gendisk disk;
disk_update_readahead(&disk);
])
])
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD], [
AC_MSG_CHECKING([whether blk_queue_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([blk_queue_update_readahead], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLK_QUEUE_UPDATE_READAHEAD, 1,
[blk_queue_update_readahead() exists])
],[
AC_MSG_CHECKING([whether disk_update_readahead() exists])
ZFS_LINUX_TEST_RESULT([disk_update_readahead], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DISK_UPDATE_READAHEAD, 1,
[disk_update_readahead() exists])
],[
AC_MSG_RESULT(no)
])
])
])
dnl #
dnl # 2.6.32 API,
dnl # blk_queue_discard()
@@ -280,6 +318,7 @@ AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE_MAX_SEGMENTS], [
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE], [
ZFS_AC_KERNEL_SRC_BLK_QUEUE_PLUG
ZFS_AC_KERNEL_SRC_BLK_QUEUE_BDI
ZFS_AC_KERNEL_SRC_BLK_QUEUE_UPDATE_READAHEAD
ZFS_AC_KERNEL_SRC_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_SRC_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_SRC_BLK_QUEUE_FLAG_SET
@@ -292,6 +331,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLK_QUEUE], [
AC_DEFUN([ZFS_AC_KERNEL_BLK_QUEUE], [
ZFS_AC_KERNEL_BLK_QUEUE_PLUG
ZFS_AC_KERNEL_BLK_QUEUE_BDI
ZFS_AC_KERNEL_BLK_QUEUE_UPDATE_READAHEAD
ZFS_AC_KERNEL_BLK_QUEUE_DISCARD
ZFS_AC_KERNEL_BLK_QUEUE_SECURE_ERASE
ZFS_AC_KERNEL_BLK_QUEUE_FLAG_SET
+23 -1
@@ -120,7 +120,7 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
])
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE], [
AC_MSG_CHECKING([whether bdev_disk_changed() exists])
AC_MSG_CHECKING([whether bdev_check_media_change() exists])
ZFS_LINUX_TEST_RESULT([bdev_check_media_change], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BDEV_CHECK_MEDIA_CHANGE, 1,
@@ -294,6 +294,27 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE], [
])
])
dnl #
dnl # 5.13 API change
dnl # blkdev_get_by_path() no longer handles ERESTARTSYS
dnl #
dnl # Unfortunately we're forced to rely solely on the kernel version
dnl # number in order to determine the expected behavior. This was an
dnl # internal change to blkdev_get_by_dev(), see commit a8ed1a0607.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS], [
AC_MSG_CHECKING([whether blkdev_get_by_path() handles ERESTARTSYS])
AS_VERSION_COMPARE([$LINUX_VERSION], [5.13.0], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_BLKDEV_GET_ERESTARTSYS, 1,
[blkdev_get_by_path() handles ERESTARTSYS])
],[
AC_MSG_RESULT(no)
],[
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_BLKDEV], [
ZFS_AC_KERNEL_SRC_BLKDEV_GET_BY_PATH
ZFS_AC_KERNEL_SRC_BLKDEV_PUT
@@ -318,4 +339,5 @@ AC_DEFUN([ZFS_AC_KERNEL_BLKDEV], [
ZFS_AC_KERNEL_BLKDEV_CHECK_DISK_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_CHECK_MEDIA_CHANGE
ZFS_AC_KERNEL_BLKDEV_BDEV_WHOLE
ZFS_AC_KERNEL_BLKDEV_GET_ERESTARTSYS
])
-31
@@ -19,7 +19,6 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
])
])
ZFS_AC_KERNEL_SRC_CONFIG_THREAD_SIZE
ZFS_AC_KERNEL_SRC_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_SRC_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_SRC_CONFIG_ZLIB_INFLATE
@@ -29,42 +28,12 @@ AC_DEFUN([ZFS_AC_KERNEL_CONFIG_DEFINED], [
ZFS_LINUX_TEST_COMPILE_ALL([config])
AC_MSG_RESULT([done])
ZFS_AC_KERNEL_CONFIG_THREAD_SIZE
ZFS_AC_KERNEL_CONFIG_DEBUG_LOCK_ALLOC
ZFS_AC_KERNEL_CONFIG_TRIM_UNUSED_KSYMS
ZFS_AC_KERNEL_CONFIG_ZLIB_INFLATE
ZFS_AC_KERNEL_CONFIG_ZLIB_DEFLATE
])
dnl #
dnl # Check configured THREAD_SIZE
dnl #
dnl # The stack size will vary by architecture, but as of Linux 3.15 on x86_64
dnl # the default thread stack size was increased to 16K from 8K. Therefore,
dnl # on newer kernels and some architectures stack usage optimizations can be
dnl # conditionally applied to improve performance without negatively impacting
dnl # stability.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_CONFIG_THREAD_SIZE], [
ZFS_LINUX_TEST_SRC([config_thread_size], [
#include <linux/module.h>
],[
#if (THREAD_SIZE < 16384)
#error "THREAD_SIZE is less than 16K"
#endif
])
])
AC_DEFUN([ZFS_AC_KERNEL_CONFIG_THREAD_SIZE], [
AC_MSG_CHECKING([whether kernel was built with 16K or larger stacks])
ZFS_LINUX_TEST_RESULT([config_thread_size], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_LARGE_STACKS, 1, [kernel has large stacks])
],[
AC_MSG_RESULT([no])
])
])
dnl #
dnl # Check CONFIG_DEBUG_LOCK_ALLOC
dnl #
+17
@@ -3,6 +3,10 @@ dnl # Linux 2.6.38 - 3.x API
dnl # The fallocate callback was moved from the inode_operations
dnl # structure to the file_operations structure.
dnl #
dnl #
dnl # Linux 3.15+
dnl # fallocate learned a new flag, FALLOC_FL_ZERO_RANGE
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_FALLOCATE], [
ZFS_LINUX_TEST_SRC([file_fallocate], [
#include <linux/fs.h>
@@ -15,12 +19,25 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FALLOCATE], [
.fallocate = test_fallocate,
};
], [])
ZFS_LINUX_TEST_SRC([falloc_fl_zero_range], [
#include <linux/falloc.h>
],[
int flags __attribute__ ((unused));
flags = FALLOC_FL_ZERO_RANGE;
])
])
AC_DEFUN([ZFS_AC_KERNEL_FALLOCATE], [
AC_MSG_CHECKING([whether fops->fallocate() exists])
ZFS_LINUX_TEST_RESULT([file_fallocate], [
AC_MSG_RESULT(yes)
AC_MSG_CHECKING([whether FALLOC_FL_ZERO_RANGE exists])
ZFS_LINUX_TEST_RESULT([falloc_fl_zero_range], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FALLOC_FL_ZERO_RANGE, 1, [FALLOC_FL_ZERO_RANGE is defined])
],[
AC_MSG_RESULT(no)
])
],[
ZFS_LINUX_TEST_ERROR([file_fallocate])
])
+59 -3
@@ -1,7 +1,16 @@
dnl #
dnl #
dnl # Handle differences in kernel FPU code.
dnl #
dnl # Kernel
dnl # 5.16: XCR code put into asm/fpu/xcr.h
dnl # HAVE_KERNEL_FPU_XCR_HEADER
dnl #
dnl # XSTATE_XSAVE and XSTATE_XRESTORE aren't accessible any more
dnl # HAVE_KERNEL_FPU_XSAVE_INTERNAL
dnl #
dnl # 5.11: kernel_fpu_begin() is an inlined function now, so don't check
dnl # for it inside the kernel symbols.
dnl #
dnl # 5.0: Wrappers have been introduced to save/restore the FPU state.
dnl # This change was made to the 4.19.38 and 4.14.120 LTS kernels.
dnl # HAVE_KERNEL_FPU_INTERNAL
@@ -25,6 +34,18 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU_HEADER], [
AC_DEFINE(HAVE_KERNEL_FPU_API_HEADER, 1,
[kernel has asm/fpu/api.h])
AC_MSG_RESULT(asm/fpu/api.h)
AC_MSG_CHECKING([whether fpu/xcr header is available])
ZFS_LINUX_TRY_COMPILE([
#include <linux/module.h>
#include <asm/fpu/xcr.h>
],[
],[
AC_DEFINE(HAVE_KERNEL_FPU_XCR_HEADER, 1,
[kernel has asm/fpu/xcr.h])
AC_MSG_RESULT(asm/fpu/xcr.h)
],[
AC_MSG_RESULT(no asm/fpu/xcr.h)
])
],[
AC_MSG_RESULT(i387.h & xcr.h)
])
@@ -92,6 +113,36 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_FPU], [
struct fxregs_state *fxr __attribute__ ((unused)) = &st->fxsave;
struct xregs_state *xr __attribute__ ((unused)) = &st->xsave;
])
ZFS_LINUX_TEST_SRC([fpu_xsave_internal], [
#include <linux/sched.h>
#if defined(__x86_64) || defined(__x86_64__) || \
defined(__i386) || defined(__i386__)
#if !defined(__x86)
#define __x86
#endif
#endif
#if !defined(__x86)
#error Unsupported architecture
#endif
#include <linux/types.h>
#ifdef HAVE_KERNEL_FPU_API_HEADER
#include <asm/fpu/api.h>
#include <asm/fpu/internal.h>
#else
#include <asm/i387.h>
#include <asm/xcr.h>
#endif
],[
struct fpu *fpu = &current->thread.fpu;
union fpregs_state *st = &fpu->fpstate->regs;
struct fregs_state *fr __attribute__ ((unused)) = &st->fsave;
struct fxregs_state *fxr __attribute__ ((unused)) = &st->fxsave;
struct xregs_state *xr __attribute__ ((unused)) = &st->xsave;
])
])
AC_DEFUN([ZFS_AC_KERNEL_FPU], [
@@ -99,8 +150,7 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU], [
dnl # Legacy kernel
dnl #
AC_MSG_CHECKING([whether kernel fpu is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([kernel_fpu_license],
[kernel_fpu_begin], [arch/x86/kernel/fpu/core.c], [
ZFS_LINUX_TEST_RESULT([kernel_fpu_license], [
AC_MSG_RESULT(kernel_fpu_*)
AC_DEFINE(HAVE_KERNEL_FPU, 1,
[kernel has kernel_fpu_* functions])
@@ -124,7 +174,13 @@ AC_DEFUN([ZFS_AC_KERNEL_FPU], [
AC_DEFINE(HAVE_KERNEL_FPU_INTERNAL, 1,
[kernel fpu internal])
],[
ZFS_LINUX_TEST_RESULT([fpu_xsave_internal], [
AC_MSG_RESULT(internal with internal XSAVE)
AC_DEFINE(HAVE_KERNEL_FPU_XSAVE_INTERNAL, 1,
[kernel fpu and XSAVE internal])
],[
AC_MSG_RESULT(unavailable)
])
])
])
])
+1
@@ -64,6 +64,7 @@ dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_KVMALLOC], [
ZFS_LINUX_TEST_SRC([kvmalloc], [
#include <linux/mm.h>
#include <linux/slab.h>
],[
void *p __attribute__ ((unused));
+68
@@ -0,0 +1,68 @@
AC_DEFUN([ZFS_AC_KERNEL_KTHREAD_COMPLETE_AND_EXIT], [
dnl #
dnl # 5.17 API,
dnl # cead18552660702a4a46f58e65188fe5f36e9dfe ("exit: Rename complete_and_exit to kthread_complete_and_exit")
dnl #
dnl # Also moves the definition from include/linux/kernel.h to include/linux/kthread.h
dnl #
AC_MSG_CHECKING([whether kthread_complete_and_exit() is available])
ZFS_LINUX_TEST_RESULT([kthread_complete_and_exit], [
AC_MSG_RESULT(yes)
AC_DEFINE(SPL_KTHREAD_COMPLETE_AND_EXIT, kthread_complete_and_exit, [kthread_complete_and_exit() available])
], [
AC_MSG_RESULT(no)
AC_DEFINE(SPL_KTHREAD_COMPLETE_AND_EXIT, complete_and_exit, [using complete_and_exit() instead])
])
])
AC_DEFUN([ZFS_AC_KERNEL_KTHREAD_DEQUEUE_SIGNAL_4ARG], [
dnl #
dnl # 5.17 API: enum pid_type * as new 4th dequeue_signal() argument,
dnl # 5768d8906bc23d512b1a736c1e198aa833a6daa4 ("signal: Requeue signals in the appropriate queue")
dnl #
dnl # int dequeue_signal(struct task_struct *task, sigset_t *mask, kernel_siginfo_t *info);
dnl # int dequeue_signal(struct task_struct *task, sigset_t *mask, kernel_siginfo_t *info, enum pid_type *type);
dnl #
AC_MSG_CHECKING([whether dequeue_signal() takes 4 arguments])
ZFS_LINUX_TEST_RESULT([kthread_dequeue_signal], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_DEQUEUE_SIGNAL_4ARG, 1, [dequeue_signal() takes 4 arguments])
], [
AC_MSG_RESULT(no)
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTHREAD_COMPLETE_AND_EXIT], [
ZFS_LINUX_TEST_SRC([kthread_complete_and_exit], [
#include <linux/kthread.h>
], [
struct completion *completion = NULL;
long code = 0;
kthread_complete_and_exit(completion, code);
])
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTHREAD_DEQUEUE_SIGNAL_4ARG], [
ZFS_LINUX_TEST_SRC([kthread_dequeue_signal], [
#include <linux/sched/signal.h>
], [
struct task_struct *task = NULL;
sigset_t *mask = NULL;
kernel_siginfo_t *info = NULL;
enum pid_type *type = NULL;
int error __attribute__ ((unused));
error = dequeue_signal(task, mask, info, type);
])
])
AC_DEFUN([ZFS_AC_KERNEL_KTHREAD], [
ZFS_AC_KERNEL_KTHREAD_COMPLETE_AND_EXIT
ZFS_AC_KERNEL_KTHREAD_DEQUEUE_SIGNAL_4ARG
])
AC_DEFUN([ZFS_AC_KERNEL_SRC_KTHREAD], [
ZFS_AC_KERNEL_SRC_KTHREAD_COMPLETE_AND_EXIT
ZFS_AC_KERNEL_SRC_KTHREAD_DEQUEUE_SIGNAL_4ARG
])
+20
@@ -42,6 +42,13 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_MAKE_REQUEST_FN], [
struct block_device_operations o;
o.submit_bio = NULL;
])
ZFS_LINUX_TEST_SRC([blk_alloc_disk], [
#include <linux/blkdev.h>
],[
struct gendisk *disk __attribute__ ((unused));
disk = blk_alloc_disk(NUMA_NO_NODE);
])
])
AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
@@ -56,6 +63,19 @@ AC_DEFUN([ZFS_AC_KERNEL_MAKE_REQUEST_FN], [
AC_DEFINE(HAVE_SUBMIT_BIO_IN_BLOCK_DEVICE_OPERATIONS, 1,
[submit_bio is member of struct block_device_operations])
dnl #
dnl # Linux 5.14 API Change:
dnl # blk_alloc_queue() + alloc_disk() combo replaced by
dnl # a single call to blk_alloc_disk().
dnl #
AC_MSG_CHECKING([whether blk_alloc_disk() exists])
ZFS_LINUX_TEST_RESULT([blk_alloc_disk], [
AC_MSG_RESULT(yes)
AC_DEFINE([HAVE_BLK_ALLOC_DISK], 1, [blk_alloc_disk() exists])
], [
AC_MSG_RESULT(no)
])
],[
AC_MSG_RESULT(no)
+26
@@ -0,0 +1,26 @@
dnl #
dnl # Linux 5.16 no longer allows directly calling wait_on_page_bit, and
dnl # instead requires you to call folio-specific functions. In this case,
dnl # wait_on_page_bit(pg, PG_writeback) becomes
dnl # folio_wait_bit(pg, PG_writeback)
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT], [
ZFS_LINUX_TEST_SRC([pagemap_has_folio_wait_bit], [
#include <linux/pagemap.h>
],[
static struct folio *f = NULL;
folio_wait_bit(f, PG_writeback);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT], [
AC_MSG_CHECKING([whether folio_wait_bit() exists])
ZFS_LINUX_TEST_RESULT([pagemap_has_folio_wait_bit], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_PAGEMAP_FOLIO_WAIT_BIT, 1,
[folio_wait_bit() exists])
],[
AC_MSG_RESULT([no])
])
])
+9 -7
@@ -1,20 +1,22 @@
dnl #
dnl # 3.10 API change,
dnl # PDE is replaced by PDE_DATA
dnl # 5.17 API: PDE_DATA() renamed to pde_data(),
dnl # 359745d78351c6f5442435f81549f0207ece28aa ("proc: remove PDE_DATA() completely")
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_PDE_DATA], [
ZFS_LINUX_TEST_SRC([pde_data], [
#include <linux/proc_fs.h>
], [
PDE_DATA(NULL);
pde_data(NULL);
])
])
AC_DEFUN([ZFS_AC_KERNEL_PDE_DATA], [
AC_MSG_CHECKING([whether PDE_DATA() is available])
ZFS_LINUX_TEST_RESULT_SYMBOL([pde_data], [PDE_DATA], [], [
AC_MSG_CHECKING([whether pde_data() is lowercase])
ZFS_LINUX_TEST_RESULT([pde_data], [
AC_MSG_RESULT(yes)
],[
ZFS_LINUX_TEST_ERROR([PDE_DATA])
AC_DEFINE(SPL_PDE_DATA, pde_data, [pde_data() is pde_data()])
], [
AC_MSG_RESULT(no)
AC_DEFINE(SPL_PDE_DATA, PDE_DATA, [pde_data() is PDE_DATA()])
])
])
+32
@@ -0,0 +1,32 @@
dnl #
dnl # Linux 5.15 gets rid of -isystem and external <stdarg.h> inclusion
dnl # and ships its own <linux/stdarg.h>. Check if this header file does
dnl # exist and provide all necessary definitions for variable argument
dnl # functions. Adjust the inclusion of <stdarg.h> according to the
dnl # results.
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG], [
ZFS_LINUX_TEST_SRC([has_standalone_linux_stdarg], [
#include <linux/stdarg.h>
#if !defined(va_start) || !defined(va_end) || \
!defined(va_arg) || !defined(va_copy)
#error "<linux/stdarg.h> is invalid"
#endif
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG], [
dnl #
dnl # Linux 5.15 ships its own stdarg.h and doesn't allow including
dnl # compiler headers.
dnl #
AC_MSG_CHECKING([whether standalone <linux/stdarg.h> exists])
ZFS_LINUX_TEST_RESULT([has_standalone_linux_stdarg], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_STANDALONE_LINUX_STDARG, 1,
[standalone <linux/stdarg.h> exists])
],[
AC_MSG_RESULT([no])
])
])
+42 -2
@@ -41,6 +41,17 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_IOV_ITER], [
error = iov_iter_fault_in_readable(&iter, size);
])
ZFS_LINUX_TEST_SRC([fault_in_iov_iter_readable], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
size_t size = 512;
int error __attribute__ ((unused));
error = fault_in_iov_iter_readable(&iter, size);
])
ZFS_LINUX_TEST_SRC([iov_iter_count], [
#include <linux/fs.h>
#include <linux/uio.h>
@@ -74,6 +85,14 @@ AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_IOV_ITER], [
bytes = copy_from_iter((void *)&buf, size, &iter);
])
ZFS_LINUX_TEST_SRC([iov_iter_type], [
#include <linux/fs.h>
#include <linux/uio.h>
],[
struct iov_iter iter = { 0 };
__attribute__((unused)) enum iter_type i = iov_iter_type(&iter);
])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
@@ -115,8 +134,15 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
AC_DEFINE(HAVE_IOV_ITER_FAULT_IN_READABLE, 1,
[iov_iter_fault_in_readable() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
AC_MSG_CHECKING([whether fault_in_iov_iter_readable() is available])
ZFS_LINUX_TEST_RESULT([fault_in_iov_iter_readable], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_FAULT_IN_IOV_ITER_READABLE, 1,
[fault_in_iov_iter_readable() is available])
],[
AC_MSG_RESULT(no)
enable_vfs_iov_iter="no"
])
])
AC_MSG_CHECKING([whether iov_iter_count() is available])
@@ -149,6 +175,20 @@ AC_DEFUN([ZFS_AC_KERNEL_VFS_IOV_ITER], [
enable_vfs_iov_iter="no"
])
dnl #
dnl # This checks for iov_iter_type() in linux/uio.h. It is not
dnl # required, however, and the module will be compiled without it
dnl # using direct access of the member attribute
dnl #
AC_MSG_CHECKING([whether iov_iter_type() is available])
ZFS_LINUX_TEST_RESULT([iov_iter_type], [
AC_MSG_RESULT(yes)
AC_DEFINE(HAVE_IOV_ITER_TYPE, 1,
[iov_iter_type() is available])
],[
AC_MSG_RESULT(no)
])
dnl #
dnl # As of the 4.9 kernel support is provided for iovecs, kvecs,
dnl # bvecs and pipes in the iov_iter structure. As long as the
+34
@@ -0,0 +1,34 @@
dnl #
dnl # Linux 5.14 adds a change to require set_page_dirty to be manually
dnl # wired up in struct address_space_operations. Determine if this needs
dnl # to be done. This patch set also introduced __set_page_dirty_nobuffers
dnl # declaration in linux/pagemap.h, so these tests look for the presence
dnl # of that function to tell the compiler to assign set_page_dirty in
dnl # module/os/linux/zfs/zpl_file.c
dnl #
AC_DEFUN([ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
ZFS_LINUX_TEST_SRC([vfs_has_set_page_dirty_nobuffers], [
#include <linux/pagemap.h>
#include <linux/fs.h>
static const struct address_space_operations
aops __attribute__ ((unused)) = {
.set_page_dirty = __set_page_dirty_nobuffers,
};
],[])
])
AC_DEFUN([ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS], [
dnl #
dnl # Linux 5.14 change requires set_page_dirty() to be assigned
dnl # in address_space_operations()
dnl #
AC_MSG_CHECKING([whether __set_page_dirty_nobuffers exists])
ZFS_LINUX_TEST_RESULT([vfs_has_set_page_dirty_nobuffers], [
AC_MSG_RESULT([yes])
AC_DEFINE(HAVE_VFS_SET_PAGE_DIRTY_NOBUFFERS, 1,
[__set_page_dirty_nobuffers exists])
],[
AC_MSG_RESULT([no])
])
])
+92 -31
@@ -132,6 +132,11 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_SRC], [
ZFS_AC_KERNEL_SRC_SIGNAL_STOP
ZFS_AC_KERNEL_SRC_SIGINFO
ZFS_AC_KERNEL_SRC_SET_SPECIAL_STATE
ZFS_AC_KERNEL_SRC_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_SRC_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_SRC_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_SRC_ADD_DISK
ZFS_AC_KERNEL_SRC_KTHREAD
AC_MSG_CHECKING([for available kernel interfaces])
ZFS_LINUX_TEST_COMPILE_ALL([kabi])
@@ -237,6 +242,11 @@ AC_DEFUN([ZFS_AC_KERNEL_TEST_RESULT], [
ZFS_AC_KERNEL_SIGNAL_STOP
ZFS_AC_KERNEL_SIGINFO
ZFS_AC_KERNEL_SET_SPECIAL_STATE
ZFS_AC_KERNEL_VFS_SET_PAGE_DIRTY_NOBUFFERS
ZFS_AC_KERNEL_STANDALONE_LINUX_STDARG
ZFS_AC_KERNEL_PAGEMAP_FOLIO_WAIT_BIT
ZFS_AC_KERNEL_ADD_DISK
ZFS_AC_KERNEL_KTHREAD
])
dnl #
@@ -270,6 +280,35 @@ AC_DEFUN([ZFS_AC_MODULE_SYMVERS], [
dnl #
dnl # Detect the kernel to be built against
dnl #
dnl # Most modern Linux distributions have separate locations for bare
dnl # source (source) and prebuilt (build) files. Additionally, there are
dnl # `source` and `build` symlinks in `/lib/modules/$(KERNEL_VERSION)`
dnl # pointing to them. The directory search order is now:
dnl #
dnl # - `configure` command line values if both `--with-linux` and
dnl # `--with-linux-obj` were defined
dnl #
dnl # - If only `--with-linux` was defined, `--with-linux-obj` is assumed
dnl # to have the same value as `--with-linux`
dnl #
dnl # - If neither `--with-linux` nor `--with-linux-obj` were defined
dnl # autodetection is used:
dnl #
dnl # - `/lib/modules/$(uname -r)/{source,build}` respectively, if they exist.
dnl #
dnl # - If only `/lib/modules/$(uname -r)/build` exists, it is assumed
dnl # to be both source and build directory.
dnl #
dnl # - The first directory in `/lib/modules` with the highest version
dnl # number according to `sort -V` which contains both `source` and
dnl # `build` symlinks/directories. If a module directory contains only
dnl # the `build` component, it is assumed to be both the source and
dnl # build directory.
dnl #
dnl # - Last resort: the first directory matching `/usr/src/kernels/*`
dnl # and `/usr/src/linux-*` with the highest version number according
dnl # to `sort -V` is assumed to be both source and build directory.
dnl #
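The autodetection fallback described in the comment above can be sketched as a small standalone script. It covers only the version-sorted scan of the modules directory, not the `uname -r` or `/usr/src` steps. The function name `find_kernel_dirs` and its `modroot` parameter are hypothetical (the macro itself hardcodes `/lib/modules`), and `sort -V` is assumed to be available.

```shell
#!/bin/sh
# Sketch of the fallback scan: walk module directories from the highest
# version down and take the first one with a usable build tree.
# The modroot argument is hypothetical; the macro uses /lib/modules.
find_kernel_dirs() {
	modroot=$1
	src=
	build=
	for d in $(ls -1d "$modroot"/* 2>/dev/null | sort -Vr); do
		if [ -e "$d/source" ] && [ -e "$d/build" ]; then
			src="$d/source"; build="$d/build"; break
		fi
		# Only `build` present: treat it as both source and build.
		if [ -e "$d/build" ]; then
			src="$d/build"; build="$d/build"; break
		fi
	done
	[ -n "$src" ] || return 1
	printf 'src=%s\nbuild=%s\n' "$src" "$build"
}

# Demonstration with a throwaway tree instead of the real /lib/modules.
demo=$(mktemp -d)
mkdir -p "$demo/5.9.0/build" "$demo/5.10.0/build" "$demo/5.10.0/source"
find_kernel_dirs "$demo"
rm -rf "$demo"
```

With the demo tree, the 5.10.0 directory is chosen over 5.9.0 because `sort -Vr` orders by version rather than by string.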
AC_DEFUN([ZFS_AC_KERNEL], [
AC_ARG_WITH([linux],
AS_HELP_STRING([--with-linux=PATH],
@@ -281,25 +320,52 @@ AC_DEFUN([ZFS_AC_KERNEL], [
[Path to kernel build objects]),
[kernelbuild="$withval"])
AC_MSG_CHECKING([kernel source directory])
AS_IF([test -z "$kernelsrc"], [
AS_IF([test -e "/lib/modules/$(uname -r)/source"], [
headersdir="/lib/modules/$(uname -r)/source"
sourcelink=$(readlink -f "$headersdir")
AC_MSG_CHECKING([kernel source and build directories])
AS_IF([test -n "$kernelsrc" && test -z "$kernelbuild"], [
kernelbuild="$kernelsrc"
], [test -z "$kernelsrc"], [
AS_IF([test -e "/lib/modules/$(uname -r)/source" && \
test -e "/lib/modules/$(uname -r)/build"], [
src="/lib/modules/$(uname -r)/source"
build="/lib/modules/$(uname -r)/build"
], [test -e "/lib/modules/$(uname -r)/build"], [
headersdir="/lib/modules/$(uname -r)/build"
sourcelink=$(readlink -f "$headersdir")
build="/lib/modules/$(uname -r)/build"
src="$build"
], [
sourcelink=$(ls -1d /usr/src/kernels/* \
/usr/src/linux-* \
2>/dev/null | grep -v obj | tail -1)
src=
for d in $(ls -1d /lib/modules/* 2>/dev/null | sort -Vr); do
if test -e "$d/source" && test -e "$d/build"; then
src="$d/source"
build="$d/build"
break
fi
if test -e "$d/build"; then
src="$d/build"
build="$d/build"
break
fi
done
# the least reliable method
if test -z "$src"; then
src=$(ls -1d /usr/src/kernels/* /usr/src/linux-* \
2>/dev/null | grep -v obj | sort -Vr | head -1)
build="$src"
fi
])
AS_IF([test -n "$sourcelink" && test -e ${sourcelink}], [
kernelsrc=`readlink -f ${sourcelink}`
AS_IF([test -n "$src" && test -e "$src"], [
kernelsrc=$(readlink -e "$src")
], [
kernelsrc="[Not found]"
])
AS_IF([test -n "$build" && test -e "$build"], [
kernelbuild=$(readlink -e "$build")
], [
kernelbuild="[Not found]"
])
], [
AS_IF([test "$kernelsrc" = "NONE"], [
kernsrcver=NONE
@@ -307,30 +373,19 @@ AC_DEFUN([ZFS_AC_KERNEL], [
withlinux=yes
])
AC_MSG_RESULT([done])
AC_MSG_CHECKING([kernel source directory])
AC_MSG_RESULT([$kernelsrc])
AS_IF([test ! -d "$kernelsrc"], [
AC_MSG_CHECKING([kernel build directory])
AC_MSG_RESULT([$kernelbuild])
AS_IF([test ! -d "$kernelsrc" || test ! -d "$kernelbuild"], [
AC_MSG_ERROR([
*** Please make sure the kernel devel package for your distribution
*** is installed and then try again. If that fails, you can specify the
*** location of the kernel source with the '--with-linux=PATH' option.])
*** location of the kernel source and build with the '--with-linux=PATH' and
*** '--with-linux-obj=PATH' options respectively.])
])
AC_MSG_CHECKING([kernel build directory])
AS_IF([test -z "$kernelbuild"], [
AS_IF([test x$withlinux != xyes -a -e "/lib/modules/$(uname -r)/build"], [
kernelbuild=`readlink -f /lib/modules/$(uname -r)/build`
], [test -d ${kernelsrc}-obj/${target_cpu}/${target_cpu}], [
kernelbuild=${kernelsrc}-obj/${target_cpu}/${target_cpu}
], [test -d ${kernelsrc}-obj/${target_cpu}/default], [
kernelbuild=${kernelsrc}-obj/${target_cpu}/default
], [test -d `dirname ${kernelsrc}`/build-${target_cpu}], [
kernelbuild=`dirname ${kernelsrc}`/build-${target_cpu}
], [
kernelbuild=${kernelsrc}
])
])
AC_MSG_RESULT([$kernelbuild])
AC_MSG_CHECKING([kernel source version])
utsrelease1=$kernelbuild/include/linux/version.h
utsrelease2=$kernelbuild/include/linux/utsrelease.h
@@ -591,9 +646,15 @@ dnl #
dnl # Used internally by ZFS_LINUX_TEST_{COMPILE,MODPOST}
dnl #
AC_DEFUN([ZFS_LINUX_COMPILE], [
AC_ARG_VAR([KERNEL_CC], [C compiler for
building kernel modules])
AC_ARG_VAR([KERNEL_LD], [Linker for
building kernel modules])
AC_ARG_VAR([KERNEL_LLVM], [Binary option to
build kernel modules with LLVM/CLANG toolchain])
AC_TRY_COMMAND([
KBUILD_MODPOST_NOFINAL="$5" KBUILD_MODPOST_WARN="$6"
make modules -k -j$TEST_JOBS -C $LINUX_OBJ $ARCH_UM
make modules -k -j$TEST_JOBS ${KERNEL_CC:+CC=$KERNEL_CC} ${KERNEL_LD:+LD=$KERNEL_LD} ${KERNEL_LLVM:+LLVM=$KERNEL_LLVM} -C $LINUX_OBJ $ARCH_UM
M=$PWD/$1 >$1/build.log 2>&1])
AS_IF([AC_TRY_COMMAND([$2])], [$3], [$4])
])
+66
@@ -24,6 +24,9 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS_TOOLCHAIN_SIMD], [
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_AES
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_PCLMULQDQ
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVE
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVEOPT
ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVES
;;
esac
])
@@ -422,3 +425,66 @@ AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_MOVBE], [
AC_MSG_RESULT([no])
])
])
dnl #
dnl # ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVE
dnl #
AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVE], [
AC_MSG_CHECKING([whether host toolchain supports XSAVE])
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsave %[b]\n" : : [b] "m" (*b) : "memory");
}
]])], [
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_XSAVE], 1, [Define if host toolchain supports XSAVE])
], [
AC_MSG_RESULT([no])
])
])
dnl #
dnl # ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVEOPT
dnl #
AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVEOPT], [
AC_MSG_CHECKING([whether host toolchain supports XSAVEOPT])
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsaveopt %[b]\n" : : [b] "m" (*b) : "memory");
}
]])], [
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_XSAVEOPT], 1, [Define if host toolchain supports XSAVEOPT])
], [
AC_MSG_RESULT([no])
])
])
dnl #
dnl # ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVES
dnl #
AC_DEFUN([ZFS_AC_CONFIG_TOOLCHAIN_CAN_BUILD_XSAVES], [
AC_MSG_CHECKING([whether host toolchain supports XSAVES])
AC_LINK_IFELSE([AC_LANG_SOURCE([
[
void main()
{
char b[4096] __attribute__ ((aligned (64)));
__asm__ __volatile__("xsaves %[b]\n" : : [b] "m" (*b) : "memory");
}
]])], [
AC_MSG_RESULT([yes])
AC_DEFINE([HAVE_XSAVES], 1, [Define if host toolchain supports XSAVES])
], [
AC_MSG_RESULT([no])
])
])
+15 -21
@@ -1,34 +1,28 @@
dnl #
dnl # If -latomic exists, it's needed for __atomic intrinsics.
dnl #
dnl # Some systems (like FreeBSD 13) don't have a libatomic at all because
dnl # their toolchain doesn't ship it; they obviously don't need it.
dnl #
dnl # Others (like sufficiently ancient CentOS) have one,
dnl # but terminally broken or unlinkable (e.g. it's a dangling symlink,
dnl # or a linker script that points to a nonexistent file);
dnl # most arches affected by this don't actually need -latomic (and if they do,
dnl # then they should have libatomic that actually exists and links,
dnl # so don't fall into this category).
dnl #
dnl # Technically, we could check if the platform *actually* needs -latomic,
dnl # or if it has native support for all the intrinsics we use,
dnl # but it /really/ doesn't matter, and C11 recommends to always link it.
dnl # If -latomic exists and atomic.c doesn't link without it,
dnl # it's needed for __atomic intrinsics.
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBATOMIC], [
AC_MSG_CHECKING([whether -latomic is present])
AC_MSG_CHECKING([whether -latomic is required])
saved_libs="$LIBS"
LIBS="$LIBS -latomic"
LIBATOMIC_LIBS=""
AC_LINK_IFELSE([AC_LANG_PROGRAM([], [])], [
LIBATOMIC_LIBS="-latomic"
AC_MSG_RESULT([yes])
], [
LIBATOMIC_LIBS=""
AC_MSG_RESULT([no])
LIBS="$saved_libs"
saved_cflags="$CFLAGS"
CFLAGS="$CFLAGS -isystem lib/libspl/include"
AC_LINK_IFELSE([AC_LANG_PROGRAM([#include "lib/libspl/atomic.c"], [])], [], [LIBATOMIC_LIBS="-latomic"])
CFLAGS="$saved_cflags"
])
if test -n "$LIBATOMIC_LIBS"; then
AC_MSG_RESULT([yes])
else
AC_MSG_RESULT([no])
fi
LIBS="$saved_libs"
AC_SUBST([LIBATOMIC_LIBS])
])
+71
@@ -0,0 +1,71 @@
dnl #
dnl # Check for a libfetch - either fetch(3) or libcurl.
dnl #
dnl # There are two configuration dimensions:
dnl # * fetch(3) vs libcurl
dnl # * static vs dynamic
dnl #
dnl # fetch(3) is only dynamic.
dnl # We use sover 6, which first appeared in FreeBSD 8.0-RELEASE.
dnl #
dnl # libcurl development packages include curl-config(1); we want:
dnl # * HTTPS support
dnl # * version at least 7.16 (October 2006), for sover 4
dnl # * to decide if it's static or not
dnl #
AC_DEFUN([ZFS_AC_CONFIG_USER_LIBFETCH], [
AC_MSG_CHECKING([for libfetch])
LIBFETCH_LIBS=
LIBFETCH_IS_FETCH=0
LIBFETCH_IS_LIBCURL=0
LIBFETCH_DYNAMIC=0
LIBFETCH_SONAME=
have_libfetch=
saved_libs="$LIBS"
LIBS="$LIBS -lfetch"
AC_LINK_IFELSE([AC_LANG_PROGRAM([[
#include <sys/param.h>
#include <stdio.h>
#include <fetch.h>
]], [fetchGetURL("", "");])], [
have_libfetch=1
LIBFETCH_IS_FETCH=1
LIBFETCH_DYNAMIC=1
LIBFETCH_SONAME="libfetch.so.6"
LIBFETCH_LIBS="-ldl"
AC_MSG_RESULT([fetch(3)])
], [])
LIBS="$saved_libs"
if test -z "$have_libfetch"; then
if curl-config --protocols 2>/dev/null | grep -q HTTPS &&
test "$(printf "%u" "0x$(curl-config --vernum)")" -ge "$(printf "%u" "0x071000")"; then
have_libfetch=1
LIBFETCH_IS_LIBCURL=1
if test "$(curl-config --built-shared)" = "yes"; then
LIBFETCH_DYNAMIC=1
LIBFETCH_SONAME="libcurl.so.4"
LIBFETCH_LIBS="-ldl"
AC_MSG_RESULT([libcurl])
else
LIBFETCH_LIBS="$(curl-config --libs)"
AC_MSG_RESULT([libcurl (static)])
fi
CCFLAGS="$CCFLAGS $(curl-config --cflags)"
fi
fi
if test -z "$have_libfetch"; then
AC_MSG_RESULT([none])
fi
AC_SUBST([LIBFETCH_LIBS])
AC_SUBST([LIBFETCH_DYNAMIC])
AC_SUBST([LIBFETCH_SONAME])
AC_DEFINE_UNQUOTED([LIBFETCH_IS_FETCH], [$LIBFETCH_IS_FETCH], [libfetch is fetch(3)])
AC_DEFINE_UNQUOTED([LIBFETCH_IS_LIBCURL], [$LIBFETCH_IS_LIBCURL], [libfetch is libcurl])
AC_DEFINE_UNQUOTED([LIBFETCH_DYNAMIC], [$LIBFETCH_DYNAMIC], [whether the chosen libfetch is to be loaded at run-time])
AC_DEFINE_UNQUOTED([LIBFETCH_SONAME], ["$LIBFETCH_SONAME"], [soname of chosen libfetch])
])
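Review note: the libcurl branch above accepts a version only when `curl-config --vernum`, read as a hexadecimal number, is at least `0x071000`. The sketch below isolates that comparison so it can be tried without libcurl installed. `vernum_at_least` is a hypothetical helper name, and the sample version strings are hard-coded assumptions rather than real `curl-config` output.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed when hex version $1
# is greater than or equal to hex version $2. Each version component
# (major, minor, patch) occupies 8 bits, so 071000 stands for 7.16.0.
vernum_at_least() {
	have=$(printf '%u' "0x$1")
	want=$(printf '%u' "0x$2")
	test "$have" -ge "$want"
}

# Illustrative inputs only; no curl-config call is made here.
vernum_at_least 075800 071000 && echo "7.88.0: new enough"
vernum_at_least 070f05 071000 || echo "7.15.5: too old"
```

Converting to decimal first matters because `test -ge` only understands decimal integers, so a string such as `070f05` cannot be compared directly.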
+1
@@ -22,6 +22,7 @@ AC_DEFUN([ZFS_AC_CONFIG_USER], [
ZFS_AC_CONFIG_USER_LIBCRYPTO
ZFS_AC_CONFIG_USER_LIBAIO
ZFS_AC_CONFIG_USER_LIBATOMIC
ZFS_AC_CONFIG_USER_LIBFETCH
ZFS_AC_CONFIG_USER_CLOCK_GETTIME
ZFS_AC_CONFIG_USER_PAM
ZFS_AC_CONFIG_USER_RUNSTATEDIR
+4
@@ -211,6 +211,7 @@ AC_DEFUN([ZFS_AC_CONFIG_ALWAYS], [
ZFS_AC_CONFIG_ALWAYS_CC_NO_UNUSED_BUT_SET_VARIABLE
ZFS_AC_CONFIG_ALWAYS_CC_NO_BOOL_COMPARE
ZFS_AC_CONFIG_ALWAYS_CC_IMPLICIT_FALLTHROUGH
ZFS_AC_CONFIG_ALWAYS_CC_FRAME_LARGER_THAN
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_TRUNCATION
ZFS_AC_CONFIG_ALWAYS_CC_NO_FORMAT_ZERO_LENGTH
@@ -367,6 +368,9 @@ AC_DEFUN([ZFS_AC_RPM], [
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "kernels $(LINUX_VERSION)"'
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "ksrc $(LINUX)"'
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "kobj $(LINUX_OBJ)"'
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "kernel_cc KERNEL_CC=$(KERNEL_CC)"'
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "kernel_ld KERNEL_LD=$(KERNEL_LD)"'
RPM_DEFINE_KMOD=${RPM_DEFINE_KMOD}' --define "kernel_llvm KERNEL_LLVM=$(KERNEL_LLVM)"'
])
RPM_DEFINE_DKMS=''
+2 -2
@@ -73,14 +73,14 @@ AC_DEFUN([ZFS_AC_META], [
if test ! -f ".nogitrelease" && git rev-parse --git-dir > /dev/null 2>&1; then
_match="${ZFS_META_NAME}-${ZFS_META_VERSION}"
_alias=$(git describe --match=${_match} 2>/dev/null)
_release=$(echo ${_alias}|cut -f3- -d'-'|sed 's/-/_/g')
_release=$(echo ${_alias}|sed "s/${ZFS_META_NAME}//"|cut -f3- -d'-'|tr - _)
if test -n "${_release}"; then
ZFS_META_RELEASE=${_release}
_zfs_ac_meta_type="git describe"
else
_match="${ZFS_META_NAME}-${ZFS_META_VERSION}-${ZFS_META_RELEASE}"
_alias=$(git describe --match=${_match} 2>/dev/null)
_release=$(echo ${_alias}|cut -f3- -d'-'|sed 's/-/_/g')
_release=$(echo ${_alias}|sed "s/${ZFS_META_NAME}//"|cut -f3- -d'-'|tr - _)
if test -n "${_release}"; then
ZFS_META_RELEASE=${_release}
_zfs_ac_meta_type="git describe"
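Review note: the new `_release` pipeline removes the package name before splitting the `git describe` string on `-`. One plausible reason, stated here as an assumption, is that a package name containing a dash would otherwise shift the fields seen by `cut`. A small sketch with made-up inputs follows; `release_from_describe` is a hypothetical wrapper, not something defined in the patch.

```shell
#!/bin/sh
# Hypothetical wrapper around the new pipeline.
#   $1: package name (stands in for ZFS_META_NAME)
#   $2: a git-describe style string (invented for this example)
release_from_describe() {
	echo "$2" | sed "s/$1//" | cut -f3- -d'-' | tr - _
}

release_from_describe zfs      zfs-2.1.3-5-gabc1234        # prints 5_gabc1234
release_from_describe zfs-kmod zfs-kmod-2.1.3-5-gabc1234   # prints 5_gabc1234
```

Without the `sed` step, the second input would yield `2.1.3_5_gabc1234`, because the dash inside the name adds an extra field ahead of the version.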
+3
@@ -221,6 +221,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/cmd/mktree/Makefile
tests/zfs-tests/cmd/mmap_exec/Makefile
tests/zfs-tests/cmd/mmap_libaio/Makefile
tests/zfs-tests/cmd/mmap_seek/Makefile
tests/zfs-tests/cmd/mmapwrite/Makefile
tests/zfs-tests/cmd/nvlist_to_lua/Makefile
tests/zfs-tests/cmd/randfree_file/Makefile
@@ -327,6 +328,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/cli_user/zpool_status/Makefile
tests/zfs-tests/tests/functional/compression/Makefile
tests/zfs-tests/tests/functional/cp_files/Makefile
tests/zfs-tests/tests/functional/crtime/Makefile
tests/zfs-tests/tests/functional/ctime/Makefile
tests/zfs-tests/tests/functional/deadman/Makefile
tests/zfs-tests/tests/functional/delegate/Makefile
@@ -381,6 +383,7 @@ AC_CONFIG_FILES([
tests/zfs-tests/tests/functional/rootpool/Makefile
tests/zfs-tests/tests/functional/rsend/Makefile
tests/zfs-tests/tests/functional/scrub_mirror/Makefile
tests/zfs-tests/tests/functional/simd/Makefile
tests/zfs-tests/tests/functional/slog/Makefile
tests/zfs-tests/tests/functional/snapshot/Makefile
tests/zfs-tests/tests/functional/snapused/Makefile
+2 -1
@@ -1,6 +1,7 @@
#!/bin/sh
ZVER=$(cut -f 1 -d '-' /sys/module/zfs/version)
read -r ZVER < /sys/module/zfs/version
ZVER="${ZVER%%-*}"
KVER=$(uname -r)
exec bpftrace \
@@ -2,8 +2,8 @@
get_devtype() {
local typ
typ=$(udevadm info --query=property --name="$1" | grep "^ID_FS_TYPE=" | sed 's|^ID_FS_TYPE=||')
if [ "$typ" = "" ] ; then
typ=$(udevadm info --query=property --name="$1" | sed -n 's|^ID_FS_TYPE=||p')
if [ -z "$typ" ] ; then
typ=$(blkid -c /dev/null "$1" -o value -s TYPE)
fi
echo "$typ"
@@ -36,7 +36,6 @@ find_zfs_block_devices() {
local dev
local mp
local fstype
local pool
local _
numfields="$(awk '{print NF; exit}' /proc/self/mountinfo)"
if [ "$numfields" = "10" ] ; then
@@ -47,10 +46,7 @@ find_zfs_block_devices() {
# shellcheck disable=SC2086
while read -r ${fields?} ; do
[ "$fstype" = "zfs" ] || continue
if [ "$mp" = "$1" ]; then
pool=$(echo "$dev" | cut -d / -f 1)
get_pool_devices "$pool"
fi
[ "$mp" = "$1" ] && get_pool_devices "${dev%%/*}"
done < /proc/self/mountinfo
}
@@ -100,9 +96,9 @@ if [ -n "$hostonly" ]; then
majmin=$(get_maj_min "$dev")
if [ -d "/sys/dev/block/$majmin/slaves" ] ; then
for _depdev in "/sys/dev/block/$majmin/slaves"/*; do
[[ -f $_depdev/dev ]] || continue
_depdev=/dev/$(basename "$_depdev")
_depdevname=$(udevadm info --query=property --name="$_depdev" | grep "^DEVNAME=" | sed 's|^DEVNAME=||')
[ -f "$_depdev/dev" ] || continue
_depdev="/dev/${_depdev##*/}"
_depdevname=$(udevadm info --query=property --name="$_depdev" | sed -n 's|^DEVNAME=||p')
_depdevtype=$(get_devtype "$_depdevname")
dinfo "zfsexpandknowledge: underlying block device backing ZFS dataset $mp: ${_depdevname//$'\n'/ }"
array_contains "$_depdevname" "${host_devs[@]}" || host_devs+=("$_depdevname")
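Review note: several hunks in this file replace `cut` and `basename` subprocesses with POSIX parameter expansion. The sketch below shows the two forms used, `${var%%/*}` and `${var##*/}`, on invented sample values.

```shell
#!/bin/sh
# Sample values are invented for illustration.
dev="rpool/ROOT/debian"
depdev="/sys/dev/block/8:0/slaves/sda"

pool="${dev%%/*}"      # strip the longest suffix that starts at a '/'
name="${depdev##*/}"   # strip the longest prefix that ends at a '/'

echo "$pool"   # prints rpool
echo "$name"   # prints sda
```

Both expansions run inside the shell itself, so no extra process is forked per device.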
+7 -1
@@ -60,11 +60,17 @@ install() {
# Fallback: Guess the path and include all matches
dracut_install /usr/lib*/gcc/**/libgcc_s.so*
fi
# shellcheck disable=SC2050
if [ @LIBFETCH_DYNAMIC@ -gt 0 ]; then
for d in $libdirs; do
[ -e "$d/@LIBFETCH_SONAME@" ] && dracut_install "$d/@LIBFETCH_SONAME@"
done
fi
dracut_install @mounthelperdir@/mount.zfs
dracut_install @udevdir@/vdev_id
dracut_install awk
dracut_install basename
dracut_install cut
dracut_install tr
dracut_install head
dracut_install @udevdir@/zvol_id
inst_hook cmdline 95 "${moddir}/parse-zfs.sh"
+1 -1
@@ -43,7 +43,7 @@ case "${root}" in
root="${root#FILESYSTEM=}"
root="zfs:${root#ZFS=}"
# switch + with spaces because kernel cmdline does not allow us to quote parameters
root=$(printf '%s\n' "$root" | sed "s/+/ /g")
root=$(echo "$root" | tr '+' ' ')
rootok=1
wait_for_zfs=1
@@ -8,7 +8,7 @@ Before=zfs-import.target
[Service]
Type=oneshot
ExecStart=/bin/sh -c "systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -v '^-$')"
ExecStart=/bin/sh -c "exec systemctl set-environment BOOTFS=$(@sbindir@/zpool list -H -o bootfs | grep -m1 -v '^-$')"
[Install]
WantedBy=zfs-import.target
+1 -1
@@ -89,7 +89,7 @@ else
_zfs_generator_cb() {
dset="${1}"
mpnt="${2}"
unit="sysroot$(echo "$mpnt" | sed 's;/;-;g').mount"
unit="sysroot$(echo "$mpnt" | tr '/' '-').mount"
{
echo "[Unit]"
+27 -10
@@ -42,15 +42,32 @@ if [ "$(zpool list -H -o feature@encryption "${BOOTFS%%/*}")" = 'active' ]; then
[ "$KEYSTATUS" = "unavailable" ] || exit 0
KEYLOCATION="$(zfs get -H -o value keylocation "${ENCRYPTIONROOT}")"
if ! [ "${KEYLOCATION}" = "prompt" ]; then
zfs load-key "${ENCRYPTIONROOT}"
else
# decrypt them
TRY_COUNT=5
while [ $TRY_COUNT -gt 0 ]; do
systemd-ask-password "Encrypted ZFS password for ${BOOTFS}" --no-tty | zfs load-key "${ENCRYPTIONROOT}" && break
TRY_COUNT=$((TRY_COUNT - 1))
done
fi
case "${KEYLOCATION%%://*}" in
prompt)
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${BOOTFS}" | zfs load-key "${ENCRYPTIONROOT}" && break
done
;;
http*)
systemctl start network-online.target
zfs load-key "${ENCRYPTIONROOT}"
;;
file)
KEYFILE="${KEYLOCATION#file://}"
[ -r "${KEYFILE}" ] || udevadm settle
[ -r "${KEYFILE}" ] || {
info "Waiting for key ${KEYFILE} for ${ENCRYPTIONROOT}..."
for _ in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20; do
sleep 0.5s
[ -r "${KEYFILE}" ] && break
done
}
[ -r "${KEYFILE}" ] || warn "Key ${KEYFILE} for ${ENCRYPTIONROOT} hasn't appeared. Trying anyway."
zfs load-key "${ENCRYPTIONROOT}"
;;
*)
zfs load-key "${ENCRYPTIONROOT}"
;;
esac
fi
fi
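Review note: the new `case` statement dispatches on `${KEYLOCATION%%://*}`, that is, on everything before the first `://`. The sketch below reproduces only that classification step so the patterns can be checked in isolation. `keylocation_kind` is a hypothetical name, and the URLs and paths are examples, not values from the patch.

```shell
#!/bin/sh
# Hypothetical helper: classify a keylocation value the way the case
# statement above does, printing a label instead of loading a key.
keylocation_kind() {
	case "${1%%://*}" in
		prompt) echo prompt ;;
		http*)  echo network ;;   # matches both http and https
		file)   echo file ;;
		*)      echo other ;;
	esac
}

keylocation_kind prompt                    # prints prompt
keylocation_kind https://example.com/key   # prints network
keylocation_kind file:///etc/zfs/key       # prints file

# The key path is recovered by stripping the scheme prefix:
loc="file:///etc/zfs/key"
echo "${loc#file://}"                      # prints /etc/zfs/key
```

A value without `://`, such as `prompt`, passes through the expansion unchanged, which is why it can share the same `case` with the URL forms.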
@@ -10,5 +10,5 @@ ConditionKernelCommandLine=bootfs.rollback
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.rollback)"; @sbindir@/zfs rollback -Rf "${BOOTFS}@${SNAPNAME:-%v}"'
ExecStart=/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.rollback)"; exec @sbindir@/zfs rollback -Rf "${BOOTFS}@${SNAPNAME:-%v}"'
RemainAfterExit=yes
@@ -10,5 +10,5 @@ ConditionKernelCommandLine=bootfs.snapshot
# ${BOOTFS} should have been set by zfs-env-bootfs.service
Type=oneshot
ExecStartPre=/bin/sh -c 'test -n "${BOOTFS}"'
ExecStart=-/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.snapshot)"; @sbindir@/zfs snapshot "${BOOTFS}@${SNAPNAME:-%v}"'
ExecStart=-/bin/sh -c '. /lib/dracut-lib.sh; SNAPNAME="$(getarg bootfs.snapshot)"; exec @sbindir@/zfs snapshot "${BOOTFS}@${SNAPNAME:-%v}"'
RemainAfterExit=yes
+7
@@ -30,6 +30,13 @@ find /lib/ -type f -name "libgcc_s.so.[1-9]" | while read -r libgcc; do
copy_exec "$libgcc"
done
# shellcheck disable=SC2050
if [ @LIBFETCH_DYNAMIC@ -gt 0 ]; then
find /lib/ -name "@LIBFETCH_SONAME@" | while read -r libfetch; do
copy_exec "$libfetch"
done
fi
copy_file config "/etc/hostid"
copy_file cache "@sysconfdir@/zfs/zpool.cache"
copy_file config "@initconfdir@/zfs"
+7 -11
@@ -105,8 +105,7 @@ find_rootfs()
find_pools()
{
pools=$("$@" 2> /dev/null | \
grep -E "pool:|^[a-zA-Z0-9]" | \
sed 's@.*: @@' | \
sed -Ee '/pool:|^[a-zA-Z0-9]/!d' -e 's@.*: @@' | \
tr '\n' ';')
echo "${pools%%;}" # Return without the last ';'.
@@ -403,35 +402,32 @@ decrypt_fs()
KEYSTATUS="$(get_fs_value "${ENCRYPTIONROOT}" keystatus)"
# Continue only if the key needs to be loaded
[ "$KEYSTATUS" = "unavailable" ] || return 0
TRY_COUNT=3
# If key is stored in a file, do not prompt
# Do not prompt if key is stored noninteractively,
if ! [ "${KEYLOCATION}" = "prompt" ]; then
$ZFS load-key "${ENCRYPTIONROOT}"
# Prompt with plymouth, if active
elif [ -e /bin/plymouth ] && /bin/plymouth --ping 2>/dev/null; then
elif /bin/plymouth --ping 2>/dev/null; then
echo "plymouth" > /run/zfs_console_askpwd_cmd
while [ $TRY_COUNT -gt 0 ]; do
for _ in 1 2 3; do
plymouth ask-for-password --prompt "Encrypted ZFS password for ${ENCRYPTIONROOT}" | \
$ZFS load-key "${ENCRYPTIONROOT}" && break
TRY_COUNT=$((TRY_COUNT - 1))
done
# Prompt with systemd, if active
elif [ -e /run/systemd/system ]; then
echo "systemd-ask-password" > /run/zfs_console_askpwd_cmd
while [ $TRY_COUNT -gt 0 ]; do
systemd-ask-password "Encrypted ZFS password for ${ENCRYPTIONROOT}" --no-tty | \
for _ in 1 2 3; do
systemd-ask-password --no-tty "Encrypted ZFS password for ${ENCRYPTIONROOT}" | \
$ZFS load-key "${ENCRYPTIONROOT}" && break
TRY_COUNT=$((TRY_COUNT - 1))
done
# Prompt with ZFS tty, otherwise
else
# Temporarily setting "printk" to "7" allows the prompt to appear even when the "quiet" kernel option has been used
echo "load-key" > /run/zfs_console_askpwd_cmd
storeprintk="$(awk '{print $1}' /proc/sys/kernel/printk)"
read -r storeprintk _ < /proc/sys/kernel/printk
echo 7 > /proc/sys/kernel/printk
$ZFS load-key "${ENCRYPTIONROOT}"
echo "$storeprintk" > /proc/sys/kernel/printk
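Review note: the hunk above swaps `awk '{print $1}'` for the `read` builtin when saving the current console log level. A minimal sketch of the same idiom follows. It reads from a here-document rather than `/proc/sys/kernel/printk` so it runs anywhere; `first_field` is a hypothetical helper and the sample line is invented.

```shell
#!/bin/sh
# Hypothetical helper: print the first whitespace-separated field of $1.
# `read -r first _` puts field one in `first` and the remainder in `_`.
first_field() {
	read -r first _ <<EOF
$1
EOF
	echo "$first"
}

# /proc/sys/kernel/printk holds four integers; this line is a stand-in.
first_field "4 4 1 7"   # prints 4
```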
+7 -1
@@ -1,4 +1,4 @@
# ZoL userland configuration.
# OpenZFS userland configuration.
# NOTE: This file is intended for sysv init and initramfs.
# Changing some of these settings may not make any difference on
@@ -9,6 +9,12 @@
# To enable a boolean setting, set it to yes, on, true, or 1.
# Anything else will be interpreted as unset.
# Run `zfs load-key` during system start?
ZFS_LOAD_KEY='yes'
# Run `zfs unload-key` during system stop?
ZFS_UNLOAD_KEY='no'
# Run `zfs mount -a` during system start?
ZFS_MOUNT='yes'
+1
@@ -1,4 +1,5 @@
zfs-import
zfs-load-key
zfs-mount
zfs-share
zfs-zed
+1 -1
@@ -3,7 +3,7 @@ include $(top_srcdir)/config/Shellcheck.am
EXTRA_DIST += README.md
init_SCRIPTS = zfs-import zfs-mount zfs-share zfs-zed
init_SCRIPTS = zfs-import zfs-load-key zfs-mount zfs-share zfs-zed
SUBSTFILES += $(init_SCRIPTS)
+7 -4
@@ -42,14 +42,16 @@ INSTALLING INIT SCRIPT LINKS
To setup the init script links in /etc/rc?.d manually on a Debian GNU/Linux
(or derived) system, run the following commands (the order is important!):
update-rc.d zfs-import start 07 S . stop 07 0 1 6 .
update-rc.d zfs-mount start 02 2 3 4 5 . stop 06 0 1 6 .
update-rc.d zfs-zed start 07 2 3 4 5 . stop 08 0 1 6 .
update-rc.d zfs-share start 27 2 3 4 5 . stop 05 0 1 6 .
update-rc.d zfs-import start 07 S . stop 07 0 1 6 .
update-rc.d zfs-load-key start 02 2 3 4 5 . stop 06 0 1 6 .
update-rc.d zfs-mount start 02 2 3 4 5 . stop 06 0 1 6 .
update-rc.d zfs-zed start 07 2 3 4 5 . stop 08 0 1 6 .
update-rc.d zfs-share start 27 2 3 4 5 . stop 05 0 1 6 .
To do the same on RedHat, Fedora and/or CentOS:
chkconfig zfs-import
chkconfig zfs-load-key
chkconfig zfs-mount
chkconfig zfs-zed
chkconfig zfs-share
@@ -57,6 +59,7 @@ INSTALLING INIT SCRIPT LINKS
On Gentoo:
rc-update add zfs-import boot
rc-update add zfs-load-key boot
rc-update add zfs-mount boot
rc-update add zfs-zed default
rc-update add zfs-share default
+1 -2
@@ -57,8 +57,7 @@ find_pools()
local pools
pools=$("$@" 2> /dev/null | \
grep -E "pool:|^[a-zA-Z0-9]" | \
sed 's@.*: @@' | \
sed -Ee '/pool:|^[a-zA-Z0-9]/!d' -e 's@.*: @@' | \
sort | \
tr '\n' ';')
+131
@@ -0,0 +1,131 @@
#!@DEFAULT_INIT_SHELL@
#
# zfs-load-key This script will load/unload the zfs filesystems keys.
#
# chkconfig: 2345 06 99
# description: This script will load or unload the zfs filesystems keys during
# system boot/shutdown. Only filesystems with key path set
# in keylocation property. See the zfs(8) man page for details.
# probe: true
#
### BEGIN INIT INFO
# Provides: zfs-load-key
# Required-Start: $local_fs zfs-import
# Required-Stop: $local_fs zfs-import
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# X-Start-Before: zfs-mount
# X-Stop-After: zfs-zed
# Short-Description: Load ZFS keys for filesystems and volumes
# Description: Run the `zfs load-key` or `zfs unload-key` commands.
### END INIT INFO
#
# Released under the 2-clause BSD license.
#
# This script is based on debian/zfsutils.zfs.init from the
# Debian GNU/kFreeBSD zfsutils 8.1-3 package, written by Aurelien Jarno.
# Source the common init script
. @sysconfdir@/zfs/zfs-functions
# ----------------------------------------------------
do_depend()
{
# bootmisc will log to /var which may be a different zfs than root.
before bootmisc logger zfs-mount
after zfs-import sysfs
keyword -lxc -openvz -prefix -vserver
}
# Load keys for all datasets/filesystems
do_load_keys()
{
zfs_log_begin_msg "Load ZFS filesystem(s) keys"
"$ZFS" list -Ho name,encryptionroot,keystatus,keylocation |
while IFS=" " read -r name encryptionroot keystatus keylocation; do
if [ "$encryptionroot" != "-" ] &&
[ "$name" = "$encryptionroot" ] &&
[ "$keystatus" = "unavailable" ] &&
[ "$keylocation" != "prompt" ] &&
[ "$keylocation" != "none" ]
then
zfs_action "Load key for $encryptionroot" \
"$ZFS" load-key "$encryptionroot"
fi
done
zfs_log_end_msg 0
return 0
}
# Unload keys for all datasets/filesystems
do_unload_keys()
{
zfs_log_begin_msg "Unload ZFS filesystem(s) key"
"$ZFS" list -Ho name,encryptionroot,keystatus | sed '1!G;h;$!d' |
while IFS=" " read -r name encryptionroot keystatus; do
if [ "$encryptionroot" != "-" ] &&
[ "$name" = "$encryptionroot" ] &&
[ "$keystatus" = "available" ]
then
zfs_action "Unload key for $encryptionroot" \
"$ZFS" unload-key "$encryptionroot"
fi
done
zfs_log_end_msg 0
return 0
}
do_start()
{
check_boolean "$ZFS_LOAD_KEY" || exit 0
check_module_loaded "zfs" || exit 0
do_load_keys
}
do_stop()
{
check_boolean "$ZFS_UNLOAD_KEY" || exit 0
check_module_loaded "zfs" || exit 0
do_unload_keys
}
# ----------------------------------------------------
if [ ! -e /sbin/openrc-run ]
then
case "$1" in
start)
do_start
;;
stop)
do_stop
;;
force-reload|condrestart|reload|restart|status)
# no-op
;;
*)
[ -n "$1" ] && echo "Error: Unknown command $1."
echo "Usage: $0 {start|stop}"
exit 3
;;
esac
exit $?
else
# Create wrapper functions since Gentoo doesn't use the case part.
depend() { do_depend; }
start() { do_start; }
stop() { do_stop; }
fi
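Review note: `do_unload_keys` pipes the dataset list through `sed '1!G;h;$!d'`, which prints its input lines in reverse order (a portable stand-in for `tac`). Presumably this lets descendants be handled before their parents, though the patch does not say so. The sketch below shows the idiom on invented dataset names; `reverse_lines` is a hypothetical wrapper.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed one-liner used in do_unload_keys.
#   1!G  on every line but the first, append the hold space to the line
#   h    copy the accumulated text back into the hold space
#   $!d  delete (do not print) everything except the final accumulation
reverse_lines() {
	sed '1!G;h;$!d'
}

printf '%s\n' pool pool/a pool/a/b | reverse_lines
# prints:
#   pool/a/b
#   pool/a
#   pool
```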
+1
@@ -1,3 +1,4 @@
*.service
*.target
*.preset
*.timer
+4 -1
@@ -12,7 +12,10 @@ systemdunit_DATA = \
zfs-volume-wait.service \
zfs-import.target \
zfs-volumes.target \
zfs.target
zfs.target \
zfs-scrub-monthly@.timer \
zfs-scrub-weekly@.timer \
zfs-scrub@.service
SUBSTFILES += $(systemdpreset_DATA) $(systemdunit_DATA)
@@ -0,0 +1,12 @@
[Unit]
Description=Monthly zpool scrub timer for %i
Documentation=man:zpool-scrub(8)
[Timer]
OnCalendar=monthly
Persistent=true
RandomizedDelaySec=1h
Unit=zfs-scrub@%i.service
[Install]
WantedBy=timers.target
@@ -0,0 +1,12 @@
[Unit]
Description=Weekly zpool scrub timer for %i
Documentation=man:zpool-scrub(8)
[Timer]
OnCalendar=weekly
Persistent=true
RandomizedDelaySec=1h
Unit=zfs-scrub@%i.service
[Install]
WantedBy=timers.target
+14
@@ -0,0 +1,14 @@
[Unit]
Description=zpool scrub on %i
Documentation=man:zpool-scrub(8)
Requires=zfs.target
After=zfs.target
ConditionACPower=true
ConditionPathIsDirectory=/sys/module/zfs
[Service]
ExecStart=/bin/sh -c '\
if @sbindir@/zpool status %i | grep "scrub in progress"; then\
exec @sbindir@/zpool wait -t scrub %i;\
else exec @sbindir@/zpool scrub -w %i; fi'
ExecStop=-/bin/sh -c '@sbindir@/zpool scrub -p %i 2>/dev/null || true'
+12 -11
@@ -1,5 +1,5 @@
# This is a script with common functions etc used by zfs-import, zfs-mount,
# zfs-share and zfs-zed.
# This is a script with common functions etc used by zfs-import, zfs-load-key,
# zfs-mount, zfs-share and zfs-zed.
#
# It is _NOT_ to be called independently
#
@@ -92,6 +92,8 @@ ZPOOL="@sbindir@/zpool"
ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
# Sensible defaults
ZFS_LOAD_KEY='yes'
ZFS_UNLOAD_KEY='no'
ZFS_MOUNT='yes'
ZFS_UNMOUNT='yes'
ZFS_SHARE='yes'
@@ -104,7 +106,8 @@ fi
# ----------------------------------------------------
export ZFS ZED ZPOOL ZPOOL_CACHE ZFS_MOUNT ZFS_UNMOUNT ZFS_SHARE ZFS_UNSHARE
export ZFS ZED ZPOOL ZPOOL_CACHE ZFS_LOAD_KEY ZFS_UNLOAD_KEY ZFS_MOUNT ZFS_UNMOUNT \
ZFS_SHARE ZFS_UNSHARE
zfs_action()
{
@@ -345,7 +348,7 @@ read_mtab()
# Unset all MTAB_* variables
# shellcheck disable=SC2046
unset $(env | grep ^MTAB_ | sed 's,=.*,,')
unset $(env | sed -e '/^MTAB_/!d' -e 's,=.*,,')
while read -r fs mntpnt fstype opts rest; do
if echo "$fs $mntpnt $fstype $opts" | grep -qE "$match"; then
@@ -360,9 +363,8 @@ read_mtab()
fs=$(/bin/echo "$fs" | sed 's,\\0,\\00,')
# Remove 'unwanted' characters.
mntpnt=$(printf '%b\n' "$mntpnt" | sed -e 's,/,,g' \
-e 's,-,,g' -e 's,\.,,g' -e 's, ,,g')
fs=$(printf '%b\n' "$fs")
mntpnt=$(printf '%b' "$mntpnt" | tr -d '/. -')
fs=$(printf '%b' "$fs")
# Set the variable.
eval export "MTAB_$mntpnt=\"$fs\""
@@ -374,8 +376,7 @@ in_mtab()
{
local mntpnt="$1"
# Remove 'unwanted' characters.
mntpnt=$(printf '%b\n' "$mntpnt" | sed -e 's,/,,g' \
-e 's,-,,g' -e 's,\.,,g' -e 's, ,,g')
mntpnt=$(printf '%b' "$mntpnt" | tr -d '/. -')
local var
var="$(eval echo "MTAB_$mntpnt")"
@@ -391,7 +392,7 @@ read_fstab()
# Unset all FSTAB_* variables
# shellcheck disable=SC2046
unset $(env | grep ^FSTAB_ | sed 's,=.*,,')
unset $(env | sed -e '/^FSTAB_/!d' -e 's,=.*,,')
i=0
while read -r fs mntpnt fstype opts; do
@@ -401,7 +402,7 @@ read_fstab()
if echo "$fs $mntpnt $fstype $opts" | grep -qE "$match"; then
eval export "FSTAB_dev_$i=$fs"
fs=$(printf '%b\n' "$fs" | sed 's,/,_,g')
fs=$(printf '%b' "$fs" | tr '/' '_')
eval export "FSTAB_$i=$mntpnt"
i=$((i + 1))
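Review note: `read_mtab` and `in_mtab` now build their variable names with a single `tr -d '/. -'` instead of four chained `sed` substitutions. The sketch below shows what that sanitizing step produces. `mtab_var_name` is a hypothetical helper and the mount points are invented.

```shell
#!/bin/sh
# Hypothetical helper: derive the MTAB_* variable name for a mount point
# by deleting '/', '.', ' ' and '-' as the patched functions do.
mtab_var_name() {
	printf 'MTAB_%s\n' "$(printf '%b' "$1" | tr -d '/. -')"
}

mtab_var_name /var/lib/my-data   # prints MTAB_varlibmydata
mtab_var_name "/mnt/my disk"     # prints MTAB_mntmydisk
```

The `-` is placed last in the `tr` set so that it is taken literally rather than as a range operator.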
+2
@@ -72,6 +72,8 @@ struct libzfs_handle {
boolean_t libzfs_prop_debug;
regex_t libzfs_urire;
uint64_t libzfs_max_nvlist;
void *libfetch;
char *libfetch_load_error;
};
struct zfs_handle {
+10
@@ -159,6 +159,16 @@ void color_start(char *color);
void color_end(void);
int printf_color(char *color, char *format, ...);
/*
* These functions are used by the ZFS libraries and cmd/zpool code, but are
* not exported in the ABI.
*/
typedef int (*pool_vdev_iter_f)(void *, nvlist_t *, void *);
int for_each_vdev_cb(void *zhp, nvlist_t *nv, pool_vdev_iter_f func,
void *data);
int for_each_vdev_in_nvlist(nvlist_t *nvroot, pool_vdev_iter_f func,
void *data);
void update_vdevs_config_dev_sysfs_path(nvlist_t *config);
#ifdef __cplusplus
}
#endif
+1
@@ -67,6 +67,7 @@
#define __always_inline inline
#define noinline __noinline
#define ____cacheline_aligned __aligned(CACHE_LINE_SIZE)
#define fallthrough __attribute__((__fallthrough__))
#if !defined(_KERNEL) && !defined(_STANDALONE)
#define likely(x) __builtin_expect(!!(x), 1)
+6
@@ -62,6 +62,12 @@
#define param_set_arc_long_args(var) \
CTLTYPE_ULONG, &var, 0, param_set_arc_long, "LU"
#define param_set_arc_min_args(var) \
CTLTYPE_ULONG, &var, 0, param_set_arc_min, "LU"
#define param_set_arc_max_args(var) \
CTLTYPE_ULONG, &var, 0, param_set_arc_max, "LU"
#define param_set_arc_int_args(var) \
CTLTYPE_INT, &var, 0, param_set_arc_int, "I"
+22
@@ -30,6 +30,9 @@
#define _OPENSOLARIS_SYS_RANDOM_H_
#include_next <sys/random.h>
#if __FreeBSD_version >= 1300108
#include <sys/prng.h>
#endif
static inline int
random_get_bytes(uint8_t *p, size_t s)
@@ -45,4 +48,23 @@ random_get_pseudo_bytes(uint8_t *p, size_t s)
return (0);
}
static inline uint32_t
random_in_range(uint32_t range)
{
#if defined(_KERNEL) && __FreeBSD_version >= 1300108
return (prng32_bounded(range));
#else
uint32_t r;
ASSERT(range != 0);
if (range == 1)
return (0);
(void) random_get_pseudo_bytes((uint8_t *)&r, sizeof (r));
return (r % range);
#endif
}
#endif /* !_OPENSOLARIS_SYS_RANDOM_H_ */
+18 -1
@@ -59,6 +59,8 @@ enum symfollow { NO_FOLLOW = NOFOLLOW };
#include <sys/file.h>
#include <sys/filedesc.h>
#include <sys/syscallsubr.h>
#include <sys/vm.h>
#include <vm/vm_object.h>
typedef struct vop_vector vnodeops_t;
#define VOP_FID VOP_VPTOFH
@@ -83,6 +85,22 @@ vn_is_readonly(vnode_t *vp)
#define vn_has_cached_data(vp) \
((vp)->v_object != NULL && \
(vp)->v_object->resident_page_count > 0)
static __inline void
vn_flush_cached_data(vnode_t *vp, boolean_t sync)
{
#if __FreeBSD_version > 1300054
if (vm_object_mightbedirty(vp->v_object)) {
#else
if (vp->v_object->flags & OBJ_MIGHTBEDIRTY) {
#endif
int flags = sync ? OBJPC_SYNC : 0;
zfs_vmobject_wlock(vp->v_object);
vm_object_page_clean(vp->v_object, 0, 0, flags);
zfs_vmobject_wunlock(vp->v_object);
}
}
#define vn_exists(vp) do { } while (0)
#define vn_invalid(vp) do { } while (0)
#define vn_renamepath(tdvp, svp, tnm, lentnm) do { } while (0)
@@ -114,7 +132,6 @@ vn_is_readonly(vnode_t *vp)
/* TODO: This field needs conversion! */
#define va_nblocks va_bytes
#define va_blksize va_blocksize
#define va_seq va_gen
#define MAXOFFSET_T OFF_MAX
#define EXCL 0
@@ -41,6 +41,10 @@
#include <sys/ccompat.h>
#include <linux/types.h>
#if KSTACK_PAGES * PAGE_SIZE >= 16384
#define HAVE_LARGE_STACKS 1
#endif
#define cond_resched() kern_yield(PRI_USER)
#define taskq_create_sysdc(a, b, d, e, p, dc, f) \
+2 -1
@@ -118,7 +118,8 @@ extern minor_t zfsdev_minor_alloc(void);
#define Z_ISLNK(type) ((type) == VLNK)
#define Z_ISDIR(type) ((type) == VDIR)
#define zn_has_cached_data(zp) vn_has_cached_data(ZTOV(zp))
#define zn_has_cached_data(zp) vn_has_cached_data(ZTOV(zp))
#define zn_flush_cached_data(zp, sync) vn_flush_cached_data(ZTOV(zp), sync)
#define zn_rlimit_fsize(zp, uio) \
vn_rlimit_fsize(ZTOV(zp), GET_UIO_STRUCT(uio), zfs_uio_td(uio))
@@ -30,9 +30,9 @@
#define _ZFS_BLKDEV_H
#include <linux/blkdev.h>
#include <linux/elevator.h>
#include <linux/backing-dev.h>
#include <linux/hdreg.h>
#include <linux/major.h>
#include <linux/msdos_fs.h> /* for SECTOR_* */
#ifndef HAVE_BLK_QUEUE_FLAG_SET
@@ -92,11 +92,14 @@ blk_queue_set_write_cache(struct request_queue *q, bool wc, bool fua)
static inline void
blk_queue_set_read_ahead(struct request_queue *q, unsigned long ra_pages)
{
#if !defined(HAVE_BLK_QUEUE_UPDATE_READAHEAD) && \
!defined(HAVE_DISK_UPDATE_READAHEAD)
#ifdef HAVE_BLK_QUEUE_BDI_DYNAMIC
q->backing_dev_info->ra_pages = ra_pages;
#else
q->backing_dev_info.ra_pages = ra_pages;
#endif
#endif
}
#ifdef HAVE_BIO_BVEC_ITER
@@ -28,6 +28,14 @@
#include <linux/compiler.h>
#if !defined(fallthrough)
#if defined(HAVE_IMPLICIT_FALLTHROUGH)
#define fallthrough __attribute__((__fallthrough__))
#else
#define fallthrough ((void)0)
#endif
#endif
#if !defined(READ_ONCE)
#define READ_ONCE(x) ACCESS_ONCE(x)
#endif
