A plain rewrite of the shell version, and generates identical
units, save for replacing some empty lines with nothing, having fewer
meaningless spaces in After=s and different spacing in the lock scripts,
for a clean git diff -w
This is a gain of anywhere from 0m0.336s vs 0m0.022s (15.27x)
to 0m0.202s vs 0m0.006s (33.67x), depending on the hardware,
a.k.a. from "absolutely unusable" to "perfectly fine"
This also properly deals with canmount=noauto units across multiple
pools
See PR for detailed timings (of an early version) and diffs
Reviewed-by: Antonio Russo <aerusso@aerusso.net>
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: InsanePrawn <insane.prawny@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11915Closes#11917
zstreamdump(8) was in quite a bad state,
and the wrapper didn't work if invoked without /sbin in $PATH
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12015
This produces a leaner image, doesn't fail if zdb doesn't exist,
properly handles hostnameless systems, doesn't mention crypto modules
for no reason, doesn't add useless empty executable in hopes an
eight-year-old PR is merged, uses i-t builtins for all copies
Also optimize the checkbashisms filter to spawn one (or a few) awks
instead of one per regular file and remove initramfs/hooks therefrom due
to a command -v false positive
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12017
configure on s390x has a key check fail with an error about
a variable being used uninitialized. So let's initialize it.
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#12126
Just like #12087, the set_acl signature changed with all the bolted-on
*userns parameters, which disabled set_acl usage, and caused #12076.
Turn zpl_set_acl into zpl_set_acl and zpl_set_acl_impl, and add a
new configure test for the new version.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#12076Closes#12093
In case of AIO failure, we should probably fallback to the old
behavior and still work.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Alan Somers <asomers@gmail.com>
Reviewed-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#12032Closes#12040
No changes to the text itself
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12111
Linux man-pages' mount(8) points at fcntl(2), as does mount(2),
and support for it is little-used, deprecated, and configurable
since 4.5.
As far as I can tell, FreeBSD doesn't support nbmand at all ‒
mandatory locks are mostly dead
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#12111
Previous commit added accounting for geom mode, but not for dev.
In geom mode we actually have GEOM statistics, while in dev mode
additional accounting actually makes more sense.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes#12097
The change correctly handles BrokenPipeError and improves the
associated tests.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#12037Closes#12036
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes#12049
ZFS does not expect transient errors from crypto. For read they are
counted as checksum errors, while for write end up in panic. To not
panic on random low memory conditions retry ENOMEM errors in the OCF
wrapper function.
While there remove unneeded timeout and priority from msleep().
External-issue: https://reviews.freebsd.org/D30339
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes#12077
I looked for a bit, and couldn't find any documentation on
how to print all logged dbgmsg entries, just messages since
the DTrace probe started, until @allanjude kindly pointed me
toward the sysctl.
So let's add that note where the DTrace probe is mentioned for
FreeBSD, so other people can find it.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes#12113
Propagate vdev child state to parents on invalid label
Add VDEV_AUX_BAD_LABEL to print_import_config()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Co-authored-by: Srikanth N S <srikanth.nagasubbaraoseetharaman@hpe.com>
Signed-off-by: Vipin Kumar Verma <vipin.verma@hpe.com>
Closes#12088
Linux changed the tmpfile() signature again in torvalds/linux@6521f89,
which in turn broke our HAVE_TMPFILE detection in configure.
Update that macro to include the new case, and change the signature of
zpl_tmpfile as appropriate.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes: #12060Closes: #12087
This change addresses two distinct scenarios which are possible
when performing a sequential resilver to a dRAID pool with vdevs
that contain silent unknown damage. Which in this circumstance
took the form of the devices being intentionally overwritten with
zeros. However, it could also result from a device returning incorrect
data while a sequential resilver was in progress.
Scenario 1) A sequential resilver is performed while all of the
dRAID vdevs are ONLINE and there is silent damage present on the
vdev being resilvered. In this case, nothing will be repaired
by vdev_raidz_io_done_reconstruct_known_missing() because
rc->rc_error isn't set on any of the raid columns. To address
this vdev_draid_io_start_read() has been updated to always mark
the resilvering column as ESTALE for sequential resilver IO.
Scenario 2) Multiple columns contain silent damage for the same
block and a sequential resilver is performed. In this case it's
impossible to generate the correct data from parity unless all of
the damaged columns are being sequentially resilvered (and thus
only good data is used to generate parity). This is as expected
and there's nothing which can be done about it. However, we need
to be careful not to make to situation worse. Since we can't
verify the data is actually good without a checksum, we must
only repair the devices which are being sequentially resilvered.
Otherwise, an incorrect repair to a device which previously
contained good data could effectively lock in the damage and
make reconstruction impossible. A check for this was added to
vdev_raidz_io_done_verified() along with a new test case.
Lastly, this change updates the redundancy_draid_spare1 and
redundancy_draid_spare3 test cases to be more representative
of normal dRAID replacement operation. Specifically, what we
care about is that the scrub run after a sequential resilver
does not find additional blocks which need repair. This would
indicate the sequential resilver failed to rebuild a section of
one of the devices. Note also the tests were switched to using
the verify_pool() function which still checks for checksum errors.
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12061
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Lauri Tirkkonen <lauri@hacktheplanet.fi>
Closes#12064
While use of dynamic taskqs allows to reduce number of idle threads,
hardcoded 8 taskqs of each kind is a big overkill for small systems,
complicating CPU scheduling, increasing I/O reorder, etc, while
providing no real locking benefits, just not needed there.
On another side, 12*8 worker threads per kind are able to overload
almost any system nowadays. For example, pool of several fast SSDs
with SHA256 checksum makes system barely responsive during scrub, or
with dedup enabled barely responsive during large file deletion.
To address both problems this patch introduces ZTI_SCALE macro, alike
to ZTI_BATCH, but with multiple taskqs, depending on number of CPUs,
to be used in places where lock scalability is needed, while request
ordering is not so much. The code is made to create new taskq for
~6 worker threads (less for small systems, but more for very large)
up to 80% of CPU cores (previous 75% was not good for rounding down).
Both number of threads and threads per taskq are now tunable in case
somebody really wants to use all of system power for ZFS.
While obviously some benchmarks show small peak performance reduction
(not so big really, especially on systems with SMT, where use of the
second threads does not give as much performance as the first ones),
they also show dramatic latency reduction and much more smooth user-
space operation in case of high CPU usage by ZFS.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes#11966
The redundancy_draid.ksh and redundancy_raidz.ksh tests were updated
by commit 93c8e91fe to additionally verify self-healing. This
additional check increased the run time which can now occasionally
exceed the default maximum timeout in the CI environment. To prevent
this from causing failures increase the default timeout for the
redundancy test cases.
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12043
Use dsl_dataset_has_resume_receive_state()
not dsl_dataset_is_zapified() to check if
stream is resumable.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Paul Zuchowski <pzuchowski@datto.com>
Closes#12034
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#11997
The kernel will use the xattr property by default when not overridden
by a mount option.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ryan Moeller <ryan@iXsystems.com>
Closes#11997
Commit d1d4769 takes into account the encryption key version to
decide if the local_mac could be zeroed out. However, this could lead
to failure mounting encrypted datasets created with intermediate
versions of ZFS encryption available in master between major releases.
In order to prevent this situation revert d1d4769 pending a more
comprehensive fix which addresses the mount failure case.
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11294
Issue #12025
Issue #12300Closes#12033
mandoc: ./man/man8/zfs-mount-generator.8.in:188:2:
ERROR: skipping end of block that is not open: RE
mandoc: ./man/man8/zfs_ids_to_path.8:38:2:
ERROR: skipping unknown macro: .LP
mandoc: ./man/man8/zfs_ids_to_path.8:48:2:
ERROR: inserting missing end of block: Sh breaks Bl
mandoc: ./man/man8/zfs-wait.8:69:2:
ERROR: skipping end of block that is not open: El
mandoc: ./man/man8/zfs-program.8:460:2:
ERROR: inserting missing end of block: It breaks Bd
mandoc: ./man/man8/zfs-mount-generator.8:188:2:
ERROR: skipping end of block that is not open: RE
mandoc: ./man/man8/zstream.8:43:2:
ERROR: skipping unknown macro: .LP
mandoc: ./man/man8/zstream.8:107:2:
ERROR: inserting missing end of block: Sh breaks Bl
mandoc: ./man/man8/zstream.8:107:2:
ERROR: inserting missing end of block: Sh breaks Bl
make: *** [Makefile:1529: mancheck] Error 1
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #12017
The following seven tests been observed to occasionally fail during
CI testing. This commit adds them to the list of known somewhat
flaky test cases.
Reviewed-by: George Melikov <mail@gmelikov.ru>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12023
Linux kernel commit 0f00b82e5413571ed225ddbccad6882d7ea60bc7 removes the
revalidate_disk() handler from struct block_device_operations. This
caused a regression, and this commit eliminates the call to it and the
assignment in the block_device_operations static handler assignment
code, when configure identifies that the kernel doesn't support that
API handler.
Reviewed-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes#11967Closes#11977
zfs_zevent_console committed multiple printk()s per line without
properly continuing them ‒ a single event could easily be fragmented
across over thirty lines, making it useless for direct application
zfs_zevent_cols exists purely to wrap the output from zfs_zevent_console
The niche this was supposed to fill can be better served by something
akin to the all-syslog ZEDLET
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#7082Closes#11996
Also always free tmp2 at the end
Before:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==8947== Memcheck, a memory error detector
==8947== Using Valgrind-3.14.0 and LibVEX
==8947== Command: ./blergh
==8947==
(null)
==8947==
==8947== HEAP SUMMARY:
==8947== in use at exit: 23 bytes in 1 blocks
==8947== total heap usage: 3 allocs, 2 frees, 1,147 bytes allocated
==8947==
==8947== 23 bytes in 1 blocks are definitely lost in loss record 1 of 1
==8947== at 0x483577F: malloc (vg_replace_malloc.c:299)
==8947== by 0x48D74B7: vasprintf (vasprintf.c:73)
==8947== by 0x48B7833: asprintf (asprintf.c:35)
==8947== by 0x401258: zfs_get_enclosure_sysfs_path
(zutil_device_path_os.c:191)
==8947== by 0x401482: main (blergh.c:107)
==8947==
==8947== LEAK SUMMARY:
==8947== definitely lost: 23 bytes in 1 blocks
==8947== indirectly lost: 0 bytes in 0 blocks
==8947== possibly lost: 0 bytes in 0 blocks
==8947== still reachable: 0 bytes in 0 blocks
==8947== suppressed: 0 bytes in 0 blocks
==8947==
==8947== For counts of detected and suppressed errors, rerun with: -v
==8947== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
nabijaczleweli@tarta:~/uwu$ sed -n 191p zutil_device_path_os.c
tmpsize = asprintf(&tmp1, "/sys/block/%s/device", dev_name);
After:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==9512== Memcheck, a memory error detector
==9512== Using Valgrind-3.14.0 and LibVEX
==9512== Command: ./blergh
==9512==
(null)
==9512==
==9512== HEAP SUMMARY:
==9512== in use at exit: 0 bytes in 0 blocks
==9512== total heap usage: 3 allocs, 3 frees, 1,147 bytes allocated
==9512==
==9512== All heap blocks were freed -- no leaks are possible
==9512==
==9512== For counts of detected and suppressed errors, rerun with: -v
==9512== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11993
As a bonus, this also passes the open flags into the open flags instead
of the mode (it worked by accident because O_RDONLY is 0),
correctly detects a failed map,
and prefaults the entire file since we're always writing to every page
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes#11993
This commits contains changes to allow running `copy-builtin` without
bash + some minor improvements.
changed shebang to /bin/sh
added -f option to `set` to globally disable unneeded globbing
replaced all `echo` commands within add_after() with `printf`
alternative to avoid possible issues with options (-neE)
dropped non-portable superfluous `readlink` command
replaced superfluous `true` command with `:` builtin alternative
replaced non-portable `--recursive` option of `cp` command with `-R`
alternative
dropped non-portable `local` keyword
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: illiliti <illiliti@protonmail.com>
Closes#12004
When dRAID performs a normal read operation only the data columns
in the raid map are read from disk. This is enough information to
calculate the checksum, verify it, and return the needed data to the
application. It's only in the event of a checksum failure that the
additional parity and any empty columns must be read since they are
required for parity reconstruction.
Reading these additional columns is handled by vdev_raidz_read_all()
which calls vdev_draid_map_alloc_empty() to expand the raid_map_t
and submit IOs for the missing columns. This all works correctly,
but it fails to account for any "short" columns. These are data
columns which are padded with a empty skip sector at the end.
Since that empty sector is not needed for a normal read it's not
read when columns is first read from disk. However, like the parity
and empty columns the skip sector is needed to perform reconstruction.
The fix is to mark any "short" columns as never being read by clearing
the rc_tried flag when expanding the raid_map_t. This will cause
the entire column to re-read from disk in the event of a checksum
failure allowing the self-healing functionality to repair the block.
Note that this only effects the self-healing feature because when
scrubbing a pool the parity, data, and empty columns are all read
initially to verify their contents. Furthermore, only blocks which
contain "short" columns would be effected, and only when the memory
backing the skip sector wasn't already zeroed out.
This change extends the existing redundancy_raidz.ksh test case to
verify self-healing (as well as resilver and scrub). Then applies
the same test case to dRAID with a slightly modified version of
the test script called redundancy_draid.ksh. The unused variable
combrec was also removed from both test cases.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes#12010
Afterward, git grep ZoL matches:
* README.md: * [ZoL Site](https://zfsonlinux.org)
- Correct
* etc/default/zfs.in:# ZoL userland configuration.
- Changing this would induce a needless upgrade-check,
if the user has modified the configuration;
this can be updated the next time the defaults change
* module/zfs/dmu_send.c: * ZoL < 0.7 does not handle [...]
- Before 0.7 is ZoL, so fair enough
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Issue #11956